Dialogue with a16z: LLMs are lossy compression, and the world model is the real direction

2025/06/05 21:04

World Labs is a startup founded in 2024 by Fei-Fei Li, a renowned AI researcher and Stanford University professor, to develop the next generation of AI systems with "spatial intelligence".

Since its founding, World Labs has completed two rounds of financing, raising a total of approximately US$230 million. Major investors include a16z, Radical Ventures, NEA, NVIDIA NVentures, AMD Ventures and Intel Capital. The company's valuation exceeded US$1 billion within just three months, making it a new unicorn in the AI field.

Recently, Fei-Fei Li sat down with two a16z partners, Martin Casado and Erik Torenberg. For the first time, she spoke publicly about the ideas, research direction and broader vision behind the founding of World Labs.

Fei-Fei Li stated the core point of the conversation at the outset: "I don't need a large language model to convince me, the world model is the really important direction."

She emphasized that spatial intelligence, whether it concerns the three-dimensional physical world we live in or an imagined digital universe, is an indispensable part of intelligence. And today, we finally have the ability to generate and reconstruct these universes.

▍Intelligence older than language: spatial perception and 3D reconstruction

Fei-Fei Li pointed out that, compared with language, spatial perception is a far older and more instinctive ability in human evolution. She shared a personal experience: a few years ago, she temporarily lost her stereoscopic vision because of a corneal injury. During that time she did not dare to drive alone. Even on familiar streets, she found it hard to judge the distance to the cars beside her.

This first-hand experience made her keenly aware of the fundamental role that 3D perception plays in human action. An AI that cannot build a 3D world model cannot truly understand, operate in or reconstruct the real world.

Martin Casado added that the lack of this three-dimensional intelligence is the key reason robots and embodied intelligent systems have been slow to reach real-world deployment. He offered a simple example: take a person into an unfamiliar room, blindfold them, describe the space using language alone, and then ask them to complete a task. It is almost impossible. Once the blindfold comes off, the brain automatically reconstructs a spatial model and the person can act. This reconstruction ability is entirely missing from today's mainstream language models.

▍The technical tipping point: from NeRF to the world model

Asked why she chose to establish World Labs now, Fei-Fei Li said the timing is the result of years of accumulated academic research and industrial groundwork.

She recalled that as early as four years ago, a research breakthrough called NeRF (Neural Radiance Fields) opened up a new path for 3D visual modeling. NeRF was proposed by Ben Mildenhall, now one of the co-founders of World Labs.

Another co-founder, Christoph Lassner, did pioneering research on efficient 3D representations, helping bring volumetric 3D modeling back into the industry.

Together with Justin Johnson, known for his early work on neural image style transfer, these once-scattered lines of research are now gathered in a single team focused on one "North Star" goal: building world model capabilities for AI.

Martin described this goal as the deep integration of two systems: one is the AI model, data and architecture itself, and the other is the engineering stack for graphics rendering and spatial reconstruction. Getting experts from these two worlds to collaborate efficiently on a single platform is itself an important organizational innovation in the technology industry.

▍The language model is not the end, but the prologue

Fei-Fei Li emphasized that her belief in the world model does not come from disappointment with LLMs, but from a deeper understanding of the nature of intelligence.

She pointed out that language is a "lossy compression" of cognition: it abstracts the world, but in doing so it discards rich physical and perceptual information. The real world has no words, grammar or text, only physics, motion and three-dimensional structure.

This view also changed her sense of what form an AI company should take. She went from Stanford professor to entrepreneur because she realized that academic research alone is far from enough to model spatial intelligence. It requires industrial-scale compute, system-level architecture and scheduling, and close collaboration among top cross-disciplinary talent.

All of this can only be realized in a highly organized company with outstanding full-stack engineering collaboration.

▍Spatial intelligence applications go far beyond robots

For most people, "world model" is still an abstract research term. But Fei-Fei Li and Martin both pointed out that its applications go far beyond autonomous driving and robots.

Creativity is essentially visual. Industrial design, filmmaking, architectural composition and even game development all rely on constructing and manipulating three-dimensional forms. An AI with world model capabilities can not only "understand" the three-dimensional world, but also "generate" and "operate" virtual spaces.

Martin described how, given just a photo of a table, the model can infer the shape and material of the parts hidden from view and then build a complete spatial scene. On that basis, users can measure the space, add or delete objects, or redesign it. This is a more intuitive and freer form of human-computer interaction than text instructions, and it opens up a new dimension for design, creation and simulation.

Fei-Fei Li further argued that digital space is bringing an unprecedented opportunity for change: "Humans have only lived in a three-dimensional physical world so far. But the digital world will allow us to enter the 'multiverse' for the first time."

She gave several examples: some universes are built specifically for robots, some serve human creativity, and some are used for storytelling, communication and travel experiences. Spaces that once existed only in the imagination will now be generated, understood, used and transformed by machines.

▍The next battleground for foundation models: 3D panoramic modeling

Returning to the technology itself, Fei-Fei Li emphasized that World Labs aims not only to create an AI that "can see", but also to make AI understand the 3D structure, dynamics and compositional logic of the world. This is not only a harder engineering problem, but also a new philosophy of representation.

She believes that scientific discoveries such as the double helix structure of DNA and buckyballs are products of spatial intelligence. Such geometric structures cannot be deduced through language alone. This is why the world model can not only improve machine understanding, but also open up new creative paths for human science and art.

Martin concluded that the revolution brought about by LLMs proves one thing: when we find the right data structure and model representation, AI capabilities improve exponentially. They now believe the "world model" stands at a similar tipping point.

▍The key to understanding and building the world

"We are actually walking backwards on the road of evolution." When Martin made this point, the conversation turned philosophical.

Language is one of the most recent modules to evolve in the human brain, while spatial perception has existed since the arthropods, some 500 million years ago. An AI that only "learns language" cannot really be said to "understand the world". Only by building a human-like spatial model can AI truly cross the threshold into "embodied intelligence".

Fei-Fei Li concluded with her usual firm tone: "I have been waiting for this day. It's not because I don't believe in language models, but because I know very well that the real world is not made up of text."

And the world model is the key for AI to truly understand and build this world.
