OpenAI’s o3 And Musk’s Grok 4 Presenting Us With A Beginner Game Of Chess
OpenAI’s o3 model delivered a clean sweep against Elon Musk’s Grok 4, winning 4-0 in the finals of Google’s Kaggle Game Arena AI Chess Exhibition last Thursday.
But if you expected a clash of AI titans matching wits like Grandmasters, you might be surprised—chess legend Magnus Carlsen quipped that the bots played more like “talented kids who don’t know how the pieces move.”
Over the three-day tournament, leading AI chatbots competed against one another without any specialised training. They were not allowed to use chess engines or look up moves, and had to play based solely on whatever chess knowledge they had absorbed from the internet.
The result was nothing short of comedic gold. Grandmaster Magnus Carlsen, regarded by the chess world as the greatest player of all time, co-commentated the final and estimated that both AIs played at the level of an amateur with about an 800 Elo rating.
During his commentary, Carlsen mocked both AI models for making very basic errors: "They oscillate between really, really good play and incomprehensible sequences."
At one point, after watching Grok walk its king directly into danger, he joked it might think they were playing King of the Hill instead of chess.
The games themselves were described as a masterclass in how not to play chess: in the first game, Grok essentially gave away one of its most important pieces for free, then, already behind, went on to trade off the rest of its pieces.
In the second game, Grok tried to execute an advanced strategy known as the "poisoned pawn", only for the plan to backfire, costing Grok its queen immediately.
In game three, Grok built what looked like a solid position in the opening, only to lose all of its pieces in the middlegame.
Not Ready for the Chess Big Leagues—Yet
Despite Grok’s strong run in earlier rounds—drawing praise from grandmasters like Hikaru Nakamura—o3’s victory showcased more consistent logic and fewer catastrophic mistakes.
Still, both bots displayed limited understanding of chess’s deeper tactics and positional play, sometimes failing to even deliver checkmate despite sizeable advantages.
Technical glitches abounded as well. Tournament rules gave each model four chances to generate a legal move—when they failed, sometimes trying to “teleport” pieces or make impossible moves, they were disqualified.
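The four-strikes rule can be sketched as a simple retry loop. This is a minimal illustration, not the tournament's actual harness: `generate_move` and `is_legal` are hypothetical placeholders standing in for the model query and the arena's legality check.

```python
MAX_ATTEMPTS = 4  # the tournament allowed four tries per move

def request_legal_move(generate_move, is_legal):
    """Ask the model for a move up to MAX_ATTEMPTS times.

    Returns the first legal move, or None if every attempt was
    illegal (e.g. "teleporting" a piece), signalling disqualification.
    """
    for _ in range(MAX_ATTEMPTS):
        move = generate_move()
        if is_legal(move):
            return move
    return None  # all four attempts were illegal

# Toy demonstration: a "model" that emits one illegal move, then a legal one.
attempts = iter(["Ke9", "e4"])          # "Ke9" is off the board; "e4" is fine
result = request_legal_move(lambda: next(attempts),
                            lambda m: m == "e4")
```

Here `result` is `"e4"`; a model that produced four illegal moves in a row would instead get back `None` and forfeit.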
AI Hype Meets Chess Reality
The spectacle was a stark reality check for those expecting AI to replace us in our jobs and revolutionise how we work.
After all, this competition showed that the models can't even play a board game that has existed for roughly 1,500 years without trying to cheat or forgetting the rules.
Carlsen even joked that the bots were sometimes better at counting captured pieces than actually playing winning chess.
As AI models continue to spark headlines about their potential to reshape work and society, these chess antics serve as a reminder: while AI can compose convincing emails or answer trivia, it may not be as smart as we think it is.
For now, humanity’s chess dominance is safe—but the lessons from this playful AI exhibition are invaluable for understanding both the limits and rapid evolution of modern AI.