AI Integration and Modularization – Stratechery by Ben Thompson

Satya Nadella, in last week’s Stratechery Interview, said in response to a question about Google and AI:

I look at it and say, look, I think there’s room always for somebody to vertically integrate. I always go back, there’s what is the Gates/Grove model, and then let’s call it the Apple or maybe the new Google model, which is the vertical integration model. I think both of them have plays.

One of the earliest economists to explore the question of integration versus modularization was Ronald Coase in his seminal paper The Nature of the Firm; Coase concluded:

When we are considering how large a firm will be the principle of marginalism works smoothly. The question always is, will it pay to bring an extra exchange transaction under the organising authority? At the margin, the costs of organising within the firm will be equal either to the costs of organising in another firm or to the costs involved in leaving the transaction to be “organised” by the price mechanism.

It was Professor Clayton Christensen who extended the analysis of integration versus modularization beyond the economists’ domain of measurable costs to the more ineffable realm of innovation. From The Innovator’s Solution:

The improvement of integrated versus modular systems over time according to Professor Christensen

The left side of figure 5-1 indicates that when there is a performance gap — when product functionality and reliability are not yet good enough to address the needs of customers in a given tier of the market — companies must compete by making the best possible products. In the race to do this, firms that build their products around proprietary, interdependent architectures enjoy an important competitive advantage against competitors whose product architectures are modular, because the standardization inherent in modularity takes too many degrees of design freedom away from engineers, and they cannot not optimize performance.

To close the performance gap with each new product generation, competitive forces compel engineers to fit the pieces of their systems together in ever-more-efficient ways in order to wring the most performance possible out of the technology that is available. When firms must compete by making the best possible products, they cannot simply assemble standardized components, because from an engineering point of view, standardization of interfaces (meaning fewer degrees of design freedom) would force them to hack away from the frontier of what is technologically possible. When the product is not good enough, backing off from the best that can be done means that you’ll fall behind.

Companies that compete with proprietary, interdependent architectures must be integrated: They must control the design and manufacture of every critical component of the system in order to make any piece of the system. As an illustration, during the early days of the mainframe computer industry, when functionality and reliability were not yet good enough to satisfy the needs of mainstream customers, you could not have existed as an independent contract manufacturer of mainframe computers because the way the machines were designed depended on the art that would be used in manufacturing, and vice versa. There was no clean interface between design and manufacturing. Similarly, you could not have existed as an independent supplier of operating systems, core memory, or logic circuitry to the mainframe industry because these key subsystems had to be interdependently and iteratively designed, too.

I made my own contribution to this literature in 2013’s What Clayton Christensen Got Wrong. My dispute wasn’t with the above excerpt, but rather the follow-on argument that integrated solutions would eventually overshoot customers and be disrupted by modular alternatives; it was on this basis that Christensen regularly predicted that Apple would lose its lead in smartphones, but I didn’t think that would happen in a consumer market where there were costs to modularization beyond those measured by economists:

The issue I have with this analysis of vertical integration — and this is exactly what I was taught at business school — is that the only considered costs are financial. But there are other, more difficult to quantify costs. Modularization incurs costs in the design and experience of using products that cannot be overcome, yet cannot be measured. Business buyers — and the analysts who study them — simply ignore them, but consumers don’t. Some consumers inherently know and value quality, look-and-feel, and attention to detail, and are willing to pay a premium that far exceeds the financial costs of being vertically integrated.

This ended up being correct as far as smartphones are concerned, and even computers: yes, Windows-based modular computers dominated the first 30 years of computing, but today the Mac is dominant amongst consumers, something Microsoft implicitly admitted in their framing of Copilot+ PCs. Both smartphones and PCs, though, are physical devices you hold in your hands; does the assumption that integration wins in the beginning — and sometimes even the end — hold in AI?

Integrated Versus Modular AI

The integrated versus modular dichotomy in PCs looked like this:

Integrated versus modular in PCs

Apple briefly experimented with modularization in the 1990s, and it nearly bankrupted them; eventually the company went in the opposite direction and integrated all the way down to the processor, following the path set by the iPhone:

Integrated versus modular in smartphones

The similarities between these two images should be striking; Mark Zuckerberg is counting on the same pattern repeating itself for headset computers, with Meta as the open alternative. When it comes to AI, though, Google is, as Nadella noted, the integrated player:

Google's integrated AI stack

Google trains and runs its Gemini family of models on its own TPU processors, which are only available on Google’s cloud infrastructure. Developers can access Gemini through Vertex AI, Google’s fully-managed AI development platform; and, to the extent Vertex AI is similar to Google’s internal development environment, that is the platform on which Google is building its own consumer-facing AI apps. It’s all Google, from top-to-bottom, and there is evidence that this integration is paying off: Gemini 1.5’s industry leading 2 million token context window almost certainly required joint innovation between Google’s infrastructure team and its model-building team.

On the other extreme is AWS, which doesn’t have any of its own models which, while it has its own Titan family of models, appears to be primarily focused on its Bedrock managed development platform, which lets you use any model. Amazon’s other focus has been on developing its own chips, although the vast majority a good portion of its AI business runs on Nvidia GPUs.

AWS's modular AI stack

Microsoft is in the middle, thanks to its close ties to OpenAI and its models. The company added Azure Models-as-a-Service last year, but its primary focus for both external customers and its own internal apps has been building on top of OpenAI’s GPT family of models; Microsoft has also launched its own chip for inference, but the vast majority of its workloads run on Nvidia.

Microsoft's somewhat integrated AI stack

Finally there is Meta, which only builds for itself; that means the most important point of integration is between the apps and the model; that’s why Llama 3, for example, was optimized for low inference costs, even at the expense of higher training costs. This also means that Meta can skip the managed service layer completely.

Meta's mostly integrated AI stack

One other company to highlight is Databricks (whose CEO I spoke to earlier this month). Databricks, thanks to its acquisition of MosaicML, helps customers train their own LLMs on their own data, which is, of course, housed on Databricks, which itself sits on top of the hyperscalers:

Databrick's customized model AI stack

Databricks is worth highlighting because of the primacy its approach places on data; data and model are the points of integration.

Big Tech Implications

Google

The first takeaway from this analysis is that Google’s strategy truly is unique: they are, as Nadella noted, the Apple of AI. The bigger question is if this matters: as I noted above, integration has proven to be a sustainable differentiation in (1) the consumer market, where the buyer is the user, and thus values the user experience benefits that come from integration, and when (2) those user experience benefits are manifested in devices.

Google is certainly building products for the consumer market, but those products are not devices; they are Internet services. And, as you might have noticed, the historical discussion didn’t really mention the Internet. Both Google and Meta, the two biggest winners of the Internet epoch, built their services on commodity hardware. Granted, those services scaled thanks to the deep infrastructure work undertaken by both companies, but even there Google’s more customized approach has been at least rivaled by Meta’s more open approach. What is notable is that both companies are integrating their models and their apps, as is OpenAI with ChatGPT.

The second question for Google is if they are even good at making products anymore; part of what makes Apple so remarkable is not only that the company is integrated, but also that it maintained its standard for excellence for so long even as it continued to release groundbreaking new products beyond the iPhone, like the Apple Watch and AirPods. It may be the case that selling hardware, which has to be perfect every year to justify a significant outlay of money by consumers, provides a much better incentive structure for maintaining excellence and execution than does being an Aggregator that users access for free.

What this analysis also highlights is the potential for Google’s True Moonshot: actually putting weight behind the company’s Pixel phones as a vertically-integrated iPhone rival. From that Article:

Google’s collection of moonshots — from Waymo to Google Fiber to Nest to Project Wing to Verily to Project Loon (and the list goes on) — have mostly been science projects that have, for the most part, served to divert profits from Google Search away from shareholders. Waymo is probably the most interesting, but even if it succeeds, it is ultimately a car service rather far afield from Google’s mission statement “to organize the world’s information and make it universally accessible and useful.”

What, though, if the mission statement were the moonshot all along? What if “I’m Feeling Lucky” were not a whimsical button on a spartan home page, but the default way of interacting with all of the world’s information? What if an AI Assistant were so good, and so natural, that anyone with seamless access to it simply used it all the time, without thought?

That, needless to say, is probably the only thing that truly scares Apple. Yes, Android has its advantages to iOS, but they aren’t particularly meaningful to most people, and even for those that care — like me — they are not large enough to give up on iOS’s overall superior user experience. The only thing that drives meaningful shifts in platform marketshare are paradigm shifts, and while I doubt the v1 version of Pixie [Google’s rumored Pixel-only AI assistant] would be good enough to drive switching from iPhone users, there is at least a path to where it does exactly that.

Of course Pixel would need to win in the Android space first, and that would mean massively more investment by Google in go-to-market activities in particular, from opening stores to subsidizing carriers to ramping up production capacity. It would not be cheap, which is why it’s no surprise that Google hasn’t truly invested to make Pixel a meaningful player in the smartphone space.

The potential payoff, though, is astronomical: a world with Pixie everywhere means a world where Google makes real money from selling hardware, in addition to services for enterprises and schools, and cloud services that leverage Google’s infrastructure to provide the same capabilities to businesses. Moreover, it’s a world where Google is truly integrated: the company already makes the chips, in both its phones and its data centers, it makes the models, and it does it all with the largest collection of data in the world.

As I noted in an Update last month, Google’s recent reorg points in this direction, although Google I/O didn’t provide any hints that this shift in strategy might be coming; instead, the big focus was a new AI-driven search experience, which, needless to say, has seen mixed results. Indeed, the fact that Google is being mocked mercilessly for messed-up AI answers gets at why consumer-facing AI may be disruptive for the company: the reason why incumbents find it hard to respond to disruptive technologies is because they are, at least at the beginning, not good enough for the incumbent’s core offering. Time will tell if this gives more fuel to a shift in smartphone strategies, or makes the company more reticent.

The enterprise space is a different question: while I was very impressed with Google’s enterprise pitch, which benefits from its integration with Google’s infrastructure without all of the overhead of potentially disrupting the company’s existing products, it’s going to be a heavy lift to overcome data gravity, i.e. the fact that many enterprise customers will simply find it easier to use AI services on the same clouds where they already store their data (Google does, of course, also support non-Gemini models and Nvidia GPUs for enterprise customers). To the extent Google wins in enterprise it may be by capturing the next generation of startups that are AI first and, by definition, data light; a new company has the freedom to base its decision on infrastructure and integration.

AWS

Amazon is certainly hoping that argument is correct: the company is operating as if everything in the AI value chain is modular and ultimately a commodity, which implies that it believes that data gravity will matter most. What is difficult to separate is to what extent this is the correct interpretation of the strategic landscape versus a convenient interpretation of the facts that happens to perfectly align with Amazon’s strengths and weaknesses, including infrastructure that is heavily optimized for commodity workloads.

Microsoft

Microsoft, meanwhile, is, as I noted above, in the middle, but not entirely by choice. Last October on the company’s earnings call Nadella talked extensively about how the company was optimizing its infrastructure around OpenAI:

It is true that the approach we have taken is a full stack approach all the way from whether it’s ChatGPT or Bing Chat or all our Copilots, all share the same model. So in some sense, one of the things that we do have is very, very high leverage of the one model that we used, which we trained, and then the one model that we are doing inferencing at scale. And that advantage sort of trickles down all the way to both utilization internally, utilization of third parties, and also over time, you can see the sort of stack optimization all the way to the silicon, because the abstraction layer to which the developers are riding is much higher up than low-level kernels, if you will.

So, therefore, I think there is a fundamental approach we took, which was a technical approach of saying we’ll have Copilots and Copilot stack all available. That doesn’t mean we don’t have people doing training for open source models or proprietary models. We also have a bunch of open source models. We have a bunch of fine-tuning happening, a bunch of RLHF happening. So there’s all kinds of ways people use it. But the thing is, we have scale leverage of one large model that was trained and one large model that’s being used for inference across all our first-party SaaS apps, as well as our API in our Azure AI service…

The lesson learned from the cloud side is — we’re not running a conglomerate of different businesses, it’s all one tech stack up and down Microsoft’s portfolio, and that, I think, is going to be very important because that discipline, given what the spend like — it will look like for this AI transition any business that’s not disciplined about their capital spend accruing across all their businesses could run into trouble.

Then, one month later, OpenAI nearly imploded and Microsoft had to face the reality that it is exceptionally risky to pin your strategy on integrating with a partner you don’t control; much of the company’s rhetoric — including the Nadella quote I opened this Article with — and actions since then have been focused on abstracting models away, particularly through the company’s own managed AI development platform, in an approach that looks more similar to Amazon’s. I suspect the company would actually like to lean more into integration, and perhaps still is (including acqui-hiring its own model and model-building team), but it has to hedge its bets.

Nvidia

All of this is, I think, good news for Nvidia. One underdiscussed implication of the rise of LLMs is that Nvidia’s CUDA moat has been diminished; the vast majority of development in AI is no longer happening with CUDA libraries, but rather on top of LLMs. That does, in theory, make it more likely that alternative GPU providers, whether that be AMD or hyperscalers’ internal efforts, put a dent in Nvidia’s dominance and margins.

Nvidia, though, is hardly resting on its moat: the company is making its GPUs more flexible over time, promising that its next generation of chips will ship in double the configurations of the current generation, including a renewed emphasis on Ethernet networking. This approach will maximize the Nvidia’s addressable market, driving more revenue which the company is funneling back into a one-year iteration cycle that promises to keep the chip-maker ahead of the alternatives.

I suspect that the only way to overcome this performance advantage, at least in the near term, will be through true vertical integration a la Google; to put it another way, while Google’s TPUs will remain a strong alternative, I am skeptical that hyperscaler internal chip efforts will be a major threat for the foreseeable future. Absent full stack integration those efforts are basically reduced to trying to make better chips than Nvidia, and good luck with that! Even AMD is discovering that a good portion of its GPU sales are a function of Nvidia scarcity.

Meta

This also explains Meta’s open source approach to Llama: the company is focused on products, which do benefit from integration, but there are also benefits that come from widespread usage, particularly in terms of optimization and complementary software. Open source accrues those benefits without imposing any incentives that detract from Meta’s product efforts (and don’t forget that Meta is receiving some portion of revenue from hyperscalers serving Llama models).

AI or AGI

The one company that I have not mentioned so far — at least in the context of AI — is Apple. The iPhone maker, like Amazon, appears to be betting that AI will be a feature or an app; like Amazon, it’s not clear to what extent this is strategic foresight versus motivated reasoning.

It does, though, get at the biggest question of all: LLMs are already incredible, and there is years of work to be done to fully productize the capabilities that exist today; are even better LLMs, though, capable of disrupting not just search but all of computing? To the extent that the answer is yes, the greater advantage I think that Google’s integrated approach will have, for the reasons Christensen laid out: achieving something approaching AGI, whatever that means, will require maximizing every efficiency and optimization, which rewards the integrated approach.

I am skeptical: I think that models will certainly differ, but not in a large enough way to not be treated as commodities; the most value will be derived from building platforms that treat models like processors, delivering performance improvements to developers who never need to know what is going on under the hood. This will mean the biggest benefits will accrue to horizontal reach — on the API layer, the model layer, and the GPU layer — as opposed to vertical integration; it is up to Google to prove me wrong.

I wrote a follow-up to this Article in this Daily Update.

Check Also

StriVectin Power Starters Set – CNET

Tighten & lift neck cream w/ face & eye serum.