You’ve probably heard us say this countless times: GPT-3, the gargantuan AI that spews uncannily human-like language, is a marvel. It’s also largely a mirage. You can tell with a simple trick: Ask it the color of sheep, and it will suggest “black” as often as “white”—reflecting the phrase “black sheep” in our vernacular.
That’s the problem with language models: because they’re only trained on text, they lack common sense. Now researchers from the University of North Carolina, Chapel Hill, have designed a new technique to change that. They call it “vokenization,” and it gives language models like GPT-3 the ability to “see.”
It’s not the first time people have sought to combine language models with computer vision. This is actually a rapidly growing area of AI research. The idea is that the two types of AI have different strengths. Language models like GPT-3 are trained through unsupervised learning, which requires no manual data labeling, making them easy to scale. Image models like object recognition systems, by contrast, learn more directly from reality. In other words, their understanding doesn’t rely on the kind of abstraction of the world that text provides. They can “see” from pictures of sheep that the animals are in fact white.
AI models that can parse both language and visual input also have very practical uses. If we want to build robotic assistants, for example, they need computer vision to navigate the world and language to communicate about it to humans.
The original article can be found here.
To support AI technology and development for social impact, the Michael Dukakis Institute for Leadership and Innovation (MDI) and the Artificial Intelligence World Society (AIWS.net) have developed the AIWS Ethics and Practice Index to measure ethical values and help people achieve well-being and happiness, as well as solve important issues such as the SDGs. In this effort, MDI invites participation and collaboration with think tanks, universities, non-profits, firms, and other entities that share its commitment to the constructive development of full-scale AI for world society.