OpenAI has built a new version of GPT-3, its game-changing language model, that it says does away with some of the most toxic issues that plagued its predecessor. The San Francisco-based lab says the updated model, called InstructGPT, is better at following the instructions of people using it—known as “alignment” in AI jargon—and thus produces less offensive language, less misinformation, and fewer mistakes overall—unless explicitly told not to do so.
Large language models like GPT-3 are trained using vast bodies of text, much of it taken from the internet, in which they encounter the best and worst of what people put down in words. That is a problem for today’s chatbots and text-generation tools. The models soak up toxic language—from text that is racist and misogynistic or that contains more insidious, baked-in prejudices—as well as falsehoods.
OpenAI has made InstructGPT the default model for users of its application programming interface (API)—a service that gives access to the company’s language models for a fee. GPT-3 will still be available, but OpenAI does not recommend using it. “It’s the first time these alignment techniques are being applied to a real product,” says Jan Leike, who co-leads OpenAI’s alignment team.
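For readers curious what “default model for API users” looks like in practice, here is a minimal sketch using OpenAI’s legacy Python client. The engine name, prompt, and parameters are illustrative assumptions, not details taken from the article.

```python
# Hypothetical sketch of a completion request via OpenAI's legacy Python client.
# The engine identifier below is an assumption; check OpenAI's docs for current names.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, never hard-code real keys

completion = openai.Completion.create(
    engine="text-davinci-001",   # assumed InstructGPT-style instruction-following model
    prompt="Explain photosynthesis to a ten-year-old.",
    max_tokens=100,
)
print(completion.choices[0].text)
```

Switching back to the older base GPT-3 model would, under the same assumptions, just be a matter of passing a different engine name—which is why OpenAI can change the default without changing the interface.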
Previous attempts to tackle the problem included filtering out offensive language from the training set. But that can make models perform less well, especially in cases where the training data is already sparse, such as text from minority groups.
The OpenAI researchers have avoided this problem by starting with a fully trained GPT-3 model. They then added another round of training, using reinforcement learning to teach the model what it should say and when, based on the preferences of human users.
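The article does not give implementation details, but the general recipe it describes—fit a reward model to human preference judgments, then nudge the pretrained model toward higher-reward outputs—can be sketched in a few dozen lines. Everything below is a toy, hypothetical stand-in (random features instead of text, a made-up preference rule, a plain policy-gradient update rather than the more sophisticated optimization and safeguards a production system would use), meant only to illustrate the idea.

```python
# Toy sketch of preference-based fine-tuning ("learning from human preferences").
# Not OpenAI's actual training code: all features, labels, and sizes are made up.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

DIM = 16          # toy "embedding" size standing in for a response representation
N_RESPONSES = 8   # candidate responses the toy policy can choose between

# Fixed random features for each candidate response (stand-in for text encodings).
response_features = torch.randn(N_RESPONSES, DIM)

# --- Step 1: train a reward model from pairwise human preferences. ---
reward_model = nn.Linear(DIM, 1)
rm_opt = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

def human_prefers(a, b):
    # Hypothetical labeling rule standing in for a human rater's judgment.
    return response_features[a].sum() > response_features[b].sum()

pairs = [(i, j) for i in range(N_RESPONSES) for j in range(N_RESPONSES) if i != j]
for _ in range(50):
    for a, b in pairs:
        chosen, rejected = (a, b) if human_prefers(a, b) else (b, a)
        r_chosen = reward_model(response_features[chosen])
        r_rejected = reward_model(response_features[rejected])
        # Pairwise (Bradley-Terry style) loss: chosen response should score higher.
        loss = -F.logsigmoid(r_chosen - r_rejected).mean()
        rm_opt.zero_grad()
        loss.backward()
        rm_opt.step()

# --- Step 2: fine-tune the "policy" against the learned reward (REINFORCE). ---
# The policy here is just logits over candidate responses, standing in for GPT-3.
policy_logits = nn.Parameter(torch.zeros(N_RESPONSES))
pi_opt = torch.optim.Adam([policy_logits], lr=5e-2)

for _ in range(300):
    dist = torch.distributions.Categorical(logits=policy_logits)
    action = dist.sample()
    reward = reward_model(response_features[action]).detach().squeeze()
    # Policy-gradient update: raise the probability of high-reward responses.
    pg_loss = -dist.log_prob(action) * reward
    pi_opt.zero_grad()
    pg_loss.backward()
    pi_opt.step()

print("Most likely response after tuning:", policy_logits.argmax().item())
```

The key point the toy mirrors is that the base model is never retrained from scratch: the extra round of training only steers an already capable model toward outputs that human raters prefer.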
This article was originally published by MIT Technology Review.