OpenAI’s o3 And Musk’s Grok 4 Presenting Us With A Beginner Game Of Chess
OpenAI’s o3 model delivered a clean sweep against Elon Musk’s Grok 4, winning 4-0 in the finals of Google’s Kaggle Game Arena AI Chess Exhibition last Thursday.
But if you expected a clash of AI titans matching wits like Grandmasters, you might be surprised—chess legend Magnus Carlsen quipped that the bots played more like “talented kids who don’t know how the pieces move.”
Over the three-day tournament, leading AI chatbots competed against one another without any specialised training. They were not allowed to use chess engines or look up moves, and had to play based solely on whatever chess knowledge they had absorbed from the internet.
The result was nothing short of comedic gold. Grandmaster Magnus Carlsen, regarded by the chess world as the greatest player of all time, co-commentated the final and estimated that both AIs played at the level of an amateur with about an 800 Elo rating.
During his commentary, Carlsen mocked both AI models for making very basic errors: "They oscillate between really, really good play and incomprehensible sequences."
At one point, after watching Grok walk its king directly into danger, he joked it might think they were playing King of the Hill instead of chess.
The games themselves were described as a masterclass in how not to play chess: in the first game, Grok essentially gave away one of its most important pieces for free, then, already behind, went on to trade off the rest of its pieces.
In the second game, Grok tried to execute an advanced strategy known as the "poisoned pawn", only for the plan to backfire, costing Grok its queen immediately.
In game three, Grok built what looked like a solid position in the opening, only to lose all of its pieces in the middlegame.
Not Ready for the Chess Big Leagues—Yet
Despite Grok’s strong run in earlier rounds—drawing praise from grandmasters like Hikaru Nakamura—o3’s victory showcased more consistent logic and fewer catastrophic mistakes.
Still, both bots displayed limited understanding of chess’s deeper tactics and positional play, sometimes failing to even deliver checkmate despite sizeable advantages.
Technical glitches abounded as well. Tournament rules gave each model four chances to generate a legal move—when they failed, sometimes trying to “teleport” pieces or make impossible moves, they were disqualified.
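The four-strikes rule can be sketched as a simple retry loop. This is a minimal illustration, not the tournament's actual harness: `generate_move` and `is_legal` are hypothetical placeholders standing in for the model query and the arena's legality check.

```python
MAX_ATTEMPTS = 4  # the tournament allowed four tries per move

def request_legal_move(generate_move, is_legal):
    """Ask the model for a move up to MAX_ATTEMPTS times.

    Returns the first legal move, or None if every attempt was
    illegal (e.g. "teleporting" a piece), signalling disqualification.
    """
    for _ in range(MAX_ATTEMPTS):
        move = generate_move()
        if is_legal(move):
            return move
    return None  # all four attempts were illegal

# Toy demonstration: a "model" that emits one illegal move, then a legal one.
attempts = iter(["Ke9", "e4"])          # "Ke9" is off the board; "e4" is fine
result = request_legal_move(lambda: next(attempts),
                            lambda m: m == "e4")
```

Here `result` is `"e4"`; a model that produced four illegal moves in a row would instead get back `None` and forfeit.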
AI Hype Meets Chess Reality
The spectacle was a stark reality check for those expecting AI to replace us in our jobs and revolutionise how we work.
After all, this competition showed that the models can't even play a board game that has existed for roughly 1,500 years without trying to cheat or forgetting the rules.
Carlsen even joked that the bots were sometimes better at counting captured pieces than actually playing winning chess.
As AI models continue to spark headlines about their potential to reshape work and society, these chess antics serve as a reminder: while AI can compose convincing emails or answer trivia, it may not be as smart as we think it is.
For now, humanity’s chess dominance is safe—but the lessons from this playful AI exhibition are invaluable for understanding both the limits and rapid evolution of modern AI.