Inside AI’s Identity Crisis: AGI or Just Smoke? | Image Source: www.sciencefriday.com
NEW YORK, March 31, 2025 – The race for AI has reached a fever pitch. Artificial intelligence laboratories, flush with funding and public fascination, boast of new advances and breakthroughs almost every week. But as the narrative escalates – from smart assistants to so-called reasoning models capable of solving complex problems – one question remains: are we really on the way to AGI, or are we simply caught in a well-funded illusion?
At the heart of this debate is a test, a grid-puzzle benchmark known as ARC-AGI, created by François Chollet, a prominent AI researcher formerly of Google. The ARC test does not ask AI models to recall facts or complete written prompts. It asks them to think. And for years, even the most advanced models flunked it. Then OpenAI’s new model, o3, shook the field by achieving a success rate of 87 percent, roughly matching human performance. The company hailed it as a monumental achievement. Chollet, however, remained unconvinced. For him, the question is not whether AI can perform a skill, but whether it is intelligent.
What is ARC-AGI and why does it matter?
ARC-AGI, short for Abstraction and Reasoning Corpus for Artificial General Intelligence, is not your average benchmark. According to Chollet, it is specifically designed to measure “fluid intelligence”: the human ability to reason through unfamiliar problems. Models are shown small grids of visual patterns and asked to infer the underlying rule and complete the pattern. It’s not about memorization, it’s about understanding.
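To make that format concrete, here is a minimal Python sketch of what an ARC-style task looks like. The grids, the dictionary layout, and the mirror rule below are invented for illustration; they are not drawn from the actual ARC-AGI dataset.

```python
# A hypothetical ARC-style task: grids are small 2D arrays of color indices (0-9).
# The solver sees a few input->output demonstrations, must infer the hidden rule,
# then apply it to a new test input. The rule here (mirror each row left-right)
# is an invented example, not a real ARC-AGI task.

Grid = list[list[int]]

task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 3, 0], [0, 5, 0]], "output": [[0, 3, 3], [0, 5, 0]]},
    ],
    "test": {"input": [[7, 0, 0], [0, 8, 0]]},
}

def mirror_left_right(grid: Grid) -> Grid:
    """The hidden rule for this toy task: flip each row horizontally."""
    return [list(reversed(row)) for row in grid]

# A solver that has inferred the rule can now complete the test pattern.
predicted = mirror_left_right(task["test"]["input"])
print(predicted)  # [[0, 0, 7], [0, 8, 0]]
```

The point is that nothing in the demonstration pairs can simply be looked up; the solver has to notice the transformation and reapply it to an input it has never seen.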
“You can’t brute-force your way through it,” Chollet explained. The test forces models to build new strategies on the fly, much as a person would when facing an unfamiliar kind of math or visual puzzle. And most AI models failed dramatically. GPT-3, for example, scored zero. GPT-4 barely improved. Claude and Gemini models did somewhat better but stayed in the single digits.
Only OpenAI’s o3 model, unveiled at the end of last year, finally cracked it: 87 percent. The result raised eyebrows across the industry and revived conversations about whether we are really close to AGI.
Is this AGI or just a smarter parrot?
The result looks impressive, but many experts remain unconvinced. “A high score does not mean the model understands anything,” said Melanie Mitchell of the Santa Fe Institute. “It could be using sophisticated pattern recognition, trial and error, or even pure luck.” In fact, o3’s success came at a heavy computational cost: it ran for about 14 minutes per puzzle and generated more than 1,000 candidate answers before choosing the best one. That is a far cry from a human’s rapid perception.
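OpenAI has not published the exact mechanism, but that description matches a well-known family of techniques: sample many candidate answers, score them, and keep the best. Here is a minimal, hypothetical sketch of that best-of-N idea; generate_candidate and score are stand-ins for a real model and verifier, not anything OpenAI has confirmed.

```python
import random

def generate_candidate(puzzle: str, rng: random.Random) -> str:
    """Stand-in for a model sampling one candidate answer (hypothetical)."""
    return f"answer-{rng.randint(0, 9999)} for {puzzle}"

def score(candidate: str) -> float:
    """Stand-in for a verifier or learned scorer that rates a candidate.
    A real system might check self-consistency or take a majority vote."""
    return (hash(candidate) % 1000) / 1000.0

def best_of_n(puzzle: str, n: int = 1000, seed: int = 0) -> str:
    """Sample n candidates and keep the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [generate_candidate(puzzle, rng) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("arc-task-001"))
```

The trade-off the critics point to is visible even in this toy version: quality comes from the volume of the search, not from a single act of insight.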
Chollet, the architect of ARC-AGI, called the performance a milestone, but warned against confusing brute force with brilliance. “The moment companies pass the test,” he said, “they love it. Before that, they claim it doesn’t matter.” And true to his contrarian form, he has raised the bar again. Last week, he introduced ARC-AGI-2, a more robust and demanding version of the test.
The new test has so far humbled the giants. OpenAI’s o1, which scored 32 percent on the original test, fell to 3 percent on the sequel. Public versions of o3 collapsed to less than 2 percent. Human testers still average about 60 percent. It’s a clear reminder: scoring high once does not mean you have mastered reasoning. You have merely solved one maze.
Why AI keeps struggling with facts and reasoning
Abstract puzzles are not the only thing that flummoxes AI. According to the Association for the Advancement of Artificial Intelligence (AAAI), factuality remains a major obstacle. A recent report revealed that even the most advanced OpenAI and Anthropic models failed to answer more than half of the questions correctly on a basic factual benchmark called SimpleQA. That is not a failure of fluid intelligence; it is a failure of basic factual grounding.
The AAAI surveyed 475 AI researchers, and 76 percent agreed that simply scaling up existing models, with more data and bigger servers, is unlikely to deliver AGI. Stuart Russell of UC Berkeley summed up the mood: the bet on scale without understanding is misplaced. AI needs more than training data; it needs reasoning. The field must move from surface-level statistics to deep cognition.
Researchers are testing new approaches, such as retrieval-augmented generation (RAG), chain-of-thought (CoT) prompting and automated reasoning checks. These methods aim to help AI reason more like humans do: step by step, logically and systematically. But even with these, 60 percent of the researchers surveyed remain pessimistic that the factuality gap will be closed soon.
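As a rough illustration of how two of those techniques fit together, here is a hypothetical Python sketch: a naive keyword retriever grounds the prompt in source documents (the RAG part), and the prompt instructs the model to reason step by step (the CoT part). Everything here, including the placeholder call_llm function and the toy document list, is invented for illustration and does not correspond to any real API.

```python
# Hypothetical sketch: retrieval-augmented generation (RAG) combined with a
# chain-of-thought (CoT) prompt. `call_llm` is a placeholder, not a real model.

DOCUMENTS = [
    "ARC-AGI was created by Francois Chollet to measure fluid intelligence.",
    "SimpleQA is a benchmark of short factual questions.",
    "o3 scored 87 percent on the original ARC-AGI test.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Naive keyword-overlap retrieval: rank docs by words shared with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Ground the model in retrieved facts, then ask it to reason step by step."""
    sources = "\n".join(f"- {c}" for c in context)
    return (f"Use only these sources:\n{sources}\n\n"
            f"Question: {question}\n"
            "Let's think step by step before answering.")

def call_llm(prompt: str) -> str:
    """Placeholder for an actual model call."""
    return "(model response would appear here)"

question = "Who created ARC-AGI and what does it measure?"
print(call_llm(build_prompt(question, retrieve(question, DOCUMENTS))))
```

The design intuition is that retrieval attacks the factuality problem by pinning answers to sources, while the step-by-step instruction attacks the reasoning problem; neither, the surveyed researchers caution, closes the gap on its own.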
What exactly is AGI and who decides?
That’s where things get murky. There is still no universally accepted definition of AGI. Is it passing ARC-AGI? Being expert in every human subject? Or simply generating billions in revenue?
According to a report by The Information, OpenAI and Microsoft have privately agreed that AGI could be defined as software capable of generating $100 billion in profits. That is a far cry from matching human cognition. It reflects a corporate focus on utility and market value, not scientific validation.
Mark Chen, who leads research at OpenAI, acknowledged this in January: “We are moving toward evaluations that reflect usefulness.” That means performance on tasks such as web browsing or customer support could matter more than abstract reasoning. In other words, a bot that is bad at grid puzzles could still be good for business.
What drives the gap between public perception and reality?
The AAAI report highlighted a worrying trend: hype is outpacing science. According to the survey, 79 percent of AI researchers believe the public misjudges AI’s true capabilities. Models seem intelligent, even human, in conversation. But behind the curtain, they often rely on statistical shortcuts and memorized patterns.
“The current generative AI cycle is many people’s first exposure to AI,” the report explains. And without the tools to evaluate the claims, people are swayed by dazzling demos. As Gartner puts it, we have passed the “peak of inflated expectations” and are sliding into the “trough of disillusionment.” That is where reality sets in, and trust gets shaken.
This gap has real consequences. Companies may invest in AI solutions that underdeliver. Teams may automate sensitive tasks, such as legal summaries or medical advice, without adequate oversight. And users may put too much faith in bots that still struggle with basic reasoning and truth.
How does all this affect digital marketing and SEO?
For marketers and SEO professionals, AI is both a goldmine and a minefield. AI tools can speed up keyword research, content production and customer support. But their limits, especially around factual accuracy, pose risks. According to the AAAI, AI-generated content must be verified, tested and reviewed by humans. Especially in regulated areas such as finance and health, errors can be costly.
Search engines are increasingly cracking down on low-quality or deceptive content. Google’s algorithms reward experience, expertise, authoritativeness and trustworthiness (E-E-A-T). AI-generated text, however fluent, may fall short unless it is accompanied by human oversight.
Smart marketers are adopting a human-plus-AI strategy: use bots to draft and brainstorm, but let editors refine and fact-check. This hybrid model combines efficiency with credibility and avoids the SEO penalties that come with unverified automation.
Can AI still help the planet?
Curiously, while AI struggles with reasoning and truth, it shows real promise in environmental applications. The AAAI report notes that AI is accelerating progress in battery design, climate modelling and carbon removal. It is also improving efficiency across industries, from energy grids to logistics, offering a path toward sustainability.
But it comes at a cost. AI data centres consume large amounts of electricity and produce electronic waste. According to the United Nations Environment Programme, these centres contribute significantly to global emissions. As large technology companies plan to invest billions in AI infrastructure, the environmental impact could be enormous.
Still, the AAAI remains optimistic. With smarter deployment, grid improvements and green innovation, AI could become a key ally in the fight against climate change. That may yet be its most important role: not an artificial Einstein, but a carbon-conscious helper.
Are we on the path to AGI? Maybe. But as Chollet and others have shown, intelligence is not just about solving puzzles or scaling up servers. It is about creativity, insight and adaptability. AI may get there yet. But first, it has to stop pretending it already has.