Every week, sometimes every day, new cutting-edge AI models are introduced to the world. As we move into 2025, the pace at which new models are being released is dizzying, if not exhausting. The roller coaster’s curve keeps steepening, and fatigue and wonder have become constant companions. Each release is heralded with claims about why this one is better than all the others, and endless collections of benchmarks and bar charts fill our feeds as we struggle to keep up.
Eighteen months ago, the majority of developers and enterprises were using a single AI model. Today, the opposite is true. It is rare to find a business of significant scale confining itself to the capabilities of a single model. Companies are wary of vendor lock-in, especially for a technology that has become core to both long-term corporate strategy and short-term revenue. It is becoming increasingly risky for teams to place all their bets on a single large language model (LLM).
However, despite this fragmentation, many model providers still contend that AI will be a winner-takes-all market. They argue that the expertise and compute required to train best-in-class models are scarce, defensible and self-reinforcing. In their view, the hype bubble around building AI models will eventually burst, leaving behind a single, giant artificial general intelligence (AGI) model that will be used for anything and everything. To exclusively own such a model would be to become the most powerful company in the world. The size of that prize has set off an arms race for ever more GPUs, with new zeros added to parameter counts every few months.
We believe this view is incorrect. There will be no single model that rules them all, neither next year nor in the next decade. Instead, the future of AI will be multi-model.
Language models are fuzzy commodities
The Oxford Dictionary of Economics defines a commodity as “a standardized good that is bought and sold at scale and whose units are interchangeable.” Language models are commodities in two important senses:
- The models themselves are becoming more interchangeable across a wider range of tasks.
- The research expertise needed to create these models is becoming increasingly distributed and accessible, with leading labs barely outpacing one another and independent researchers in the open-source community catching up fast.
However, although language models are commoditizing, their progress is uneven. There is a large core set of capabilities, from GPT-4 all the way down to Mistral Small, that any of these models is perfectly suited to handle. At the same time, as we move to the margins and edge cases, we see ever greater differentiation, with some model providers specializing explicitly in code generation, reasoning, retrieval-augmented generation (RAG) or mathematics. This leads to endless hand-wringing, Reddit-searching, evaluating and fine-tuning to find the right model for each task.
So while language models are commodities, they are more accurately described as fuzzy commodities. For many use cases, AI models are nearly interchangeable, with metrics like price and latency determining which model to use. But at the edges of capability, the opposite happens: models will continue to specialize and become more differentiated. For example, DeepSeek-V2.5 is stronger than GPT-4o at C# coding, despite being smaller and 50 times cheaper.
These twin dynamics of commoditization and specialization uproot the thesis that a single model will be best suited to handle every possible use case. Rather, they point to an increasingly fragmented AI landscape.
Multi-model orchestration and routing
There is an apt analogy for the market dynamics of language models: the human brain. The structure of our brains has remained unchanged for 100,000 years, and brains are far more alike than they are different. For most of our time on Earth, nearly everyone learned the same things and had similar capabilities.
But then something changed. We developed the ability to communicate, first through speech, then through writing. Communication protocols facilitate networks, and as humans began to network with one another, we also began to specialize. We were freed from the burden of being self-sufficient islands, generalists across all domains. Paradoxically, the collective abundance created by specialization also means the average human today is a far stronger generalist than our ancestors were.
On a sufficiently broad input space, the universe always tends toward specialization. This is true in everything from molecular chemistry to biology to human society. Given enough diversity, a distributed system will always be more computationally efficient than a monolith. We believe the same will be true of AI. The more we can leverage the strengths of multiple models rather than relying on just one, the more those models can specialize, expanding the frontier of capabilities.
An increasingly important pattern for leveraging the strengths of different models is routing: dynamically sending each query to the best-suited model, while falling back to cheaper, faster models wherever doing so does not sacrifice quality. Routing lets us capture all the benefits of specialization without giving up the robustness of generalization: higher accuracy at lower cost and latency.
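To make the pattern concrete, here is a minimal sketch of a rule-based router in Python. Everything in it — the model names, the pricing, the `classify_query` heuristic — is a hypothetical illustration of the idea, not a real API or production routing logic:

```python
# Minimal sketch of an LLM router. Model names, prices and the
# keyword classifier are hypothetical illustrations only.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # illustrative pricing, USD
    strengths: set[str]        # task categories this model handles well

MODELS = [
    Model("small-fast-model", 0.0002, {"chit-chat", "summarization"}),
    Model("code-specialist", 0.0010, {"coding"}),
    Model("large-generalist", 0.0100, {"reasoning", "math", "coding"}),
]

def classify_query(query: str) -> str:
    """Toy classifier. In practice this step would itself be a trained
    model or an embedding-similarity lookup, not keyword matching."""
    lowered = query.lower()
    if any(kw in lowered for kw in ("def ", "class ", "bug", "compile")):
        return "coding"
    if any(kw in lowered for kw in ("prove", "solve", "integral")):
        return "math"
    return "chit-chat"

def route(query: str) -> Model:
    """Pick the cheapest model whose strengths cover the query,
    falling back to the large generalist."""
    category = classify_query(query)
    candidates = [m for m in MODELS if category in m.strengths]
    if not candidates:
        return MODELS[-1]  # fallback: large generalist
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)

print(route("Why does this Python class fail to compile?").name)
# -> code-specialist: specialized, and ~10x cheaper than the generalist
```

The design choice worth noting is that the router optimizes for cost only among models already known to be strong at the task, which is how specialization and generalization can be combined without trading one for the other.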
A simple demonstration of the power of routing is the fact that most of the world’s best models are themselves routers: they use mixture-of-experts architectures that route each next-token generation to dozens of specialized sub-models. If it is true that LLMs are proliferating as fuzzy commodities, then routing should become an essential part of every AI stack.
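As a loose illustration of that idea (and not any particular production model’s actual architecture), the core of mixture-of-experts routing can be sketched in a few lines of NumPy: a gate scores the experts for each token, keeps the top-k, and mixes their outputs with softmax weights:

```python
# Simplified sketch of top-k mixture-of-experts gating. Real MoE layers
# add load balancing, capacity limits and learned routing on top of this.

import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route a token embedding `x` to the top-k experts.

    x:       (d,) token embedding
    gate_w:  (d, n_experts) gating weights
    experts: list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w
    top_k = np.argsort(logits)[-k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()             # softmax over the selected experts only
    # Weighted sum of the selected experts' outputs
    return sum(w * experts[i](x) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Each "expert" here is just a random linear map, standing in for a sub-network
experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
out = moe_forward(rng.normal(size=d), gate_w, experts)
print(out.shape)  # (8,)
```

Only k of the experts run for any given token, which is the same economics as model-level routing: the full system’s capacity grows while the per-query compute stays small.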
There is a view that LLMs will plateau as they reach human-level intelligence; that once capabilities fully saturate, we will consolidate around a single general model, the same way we consolidated around AWS or the iPhone. Neither of those platforms (nor their competitors) has delivered a 10x improvement in performance in the past several years, so we have grown comfortable in their ecosystems. We believe, however, that AI will not stop at human-level intelligence; it will go far beyond any limit we can imagine. As it does, like the rest of the natural world, it will become increasingly fragmented and specialized.
We cannot stress enough how good a thing AI model fragmentation is. A fragmented market is an efficient market: it empowers buyers, maximizes innovation and minimizes costs. And to the extent that we can leverage networks of smaller, more specialized models rather than sending everything through the internals of a single giant model, we move toward a much safer, more interpretable and more steerable future for AI.
The greatest inventions have no masters. Ben Franklin’s heirs do not own electricity. Turing’s estate does not own all computers. AI is undoubtedly one of humanity’s greatest inventions. We believe its future will be, and should be, multi-model.
Zack Kass is the former head of go-to-market at OpenAI.
Tomás Hernando Kofman is the co-founder and CEO of Not Diamond.