NVIDIA Isn't the Company You Think It Is
Inside NVIDIA’s plan to turn compute into the new supply chain.
NVIDIA CEO Jensen Huang’s keynote at NVIDIA GTC DC last week may be remembered as a talk that changed the course of technology. It felt as important as Steve Jobs launching the iPhone in 2007.1 Except Jensen wasn’t launching a product; he was launching a new industrial revolution.
🎧 Prefer to listen? I read this article on the voiceover (above), on Spotify, and on Apple (Disclaimer: I own NVIDIA stock. I am not providing financial advice.)
The ~2,000 people in the audience weren’t gamers—they represented nearly every corner of modern industry. On my left, an executive selling on-chip cooling; on my right, a supply chain consultant. There were financial managers too, and logistics leaders, policy experts, communications execs, construction managers, robotics engineers, and IT folks—they were all there, and they all had something at stake.
More than once, I had goosebumps thinking about what comes next.
In this essay I focus on the ideas that didn’t make headlines but will matter most. I’ve organized it into three acts—Act I: The Shift in Identity, Act II: The Shift in Utility, and Act III: The Shift in Scale. But first, a short preamble to set the stage.
Preamble: The Engines of Intelligence
Most people know NVIDIA for its graphics cards—the chips that make games beautiful and animations realistic. But those same chips, called Graphics Processing Units or GPUs, have quickly become the engines of modern intelligence.
A GPU is built for parallel processing. While a Central Processing Unit or CPU handles a few tasks at a time, a GPU performs thousands at once—tiny math calculations that render 3D objects or simulate light and motion. The GPU’s design also happens to be the perfect solution for another frontier problem: teaching machines to learn.
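To make that contrast concrete, here’s a toy Python sketch of my own (not NVIDIA code). NumPy’s vectorized math still runs on a CPU, but it illustrates the same serial-versus-parallel idea:

```python
import time
import numpy as np

# One million tiny math operations: brighten every "pixel" by 10%.
pixels = np.random.rand(1_000_000)

# CPU-style: handle the values one at a time, in sequence.
start = time.perf_counter()
serial_result = [p * 1.1 for p in pixels]
serial_time = time.perf_counter() - start

# GPU-style: apply the same operation to every value at once.
# (NumPy vectorizes on the CPU, but this is the principle a GPU
# exploits across thousands of cores.)
start = time.perf_counter()
vector_result = pixels * 1.1
vector_time = time.perf_counter() - start

print(f"one at a time: {serial_time:.4f}s | all at once: {vector_time:.4f}s")
```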
The same geometry that renders realistic dragons and race cars can also map relationships between words, images, and ideas.
The turning point came in 2017, when a group of Google researchers published Attention Is All You Need and changed everything. This research paper introduced the concept of a transformer, a technique that turns text (or any input) into tokens, encodes those tokens into mathematical vectors, performs lots of linear algebra, then decodes the result into output—language, images, code, or whatever. The paper’s breakthrough was that the transformer’s math could be parallelized2, processing a whole sequence at once instead of one word at a time, which made training dramatically faster and more scalable. And parallelization was a task for which the GPU was purpose-built.
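For the curious, the heart of that paper boils down to a few lines of linear algebra. Below is a deliberately stripped-down Python sketch of mine (a single attention head, no learned weights, nothing like a production model). Notice there is no loop over tokens: every token is processed in one batch of matrix math, which is exactly the kind of work a GPU parallelizes.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X):
    """Scaled dot-product attention over token vectors X, minus the
    learned weight matrices and multiple heads of the real thing."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # how strongly each token relates to the others
    return softmax(scores) @ X     # blend token vectors by those relationships

rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))  # 4 tokens, each encoded as 8 numbers
print(self_attention(tokens).shape)   # (4, 8): every token updated at once
```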
So while the GPU began as a chip used to push pixels, after 2017 it quickly became the go-to chip for processing tokens. And once researchers realized the transformer/token technique could solve a variety of difficult (and previously impossible) problems, demand for GPUs exploded. NVIDIA, already the dominant GPU supplier, found itself at the center of an entirely new market.
In short: GPUs made intelligence scalable. And that is where this story begins.
Once intelligence became scalable, the question was no longer what can GPUs do? but what else do you need to make them useful? Or perhaps: now what?
Act I: The Shift in Identity
NVIDIA isn’t a hardware company anymore—it’s become the backbone of modern computing. Since 2017, the year Google’s transformer paper changed the trajectory of AI, NVIDIA has reshaped itself from a chip designer into the central infrastructure provider for the age of intelligence. It now occupies a role similar to the one AT&T once played for early 20th century telecommunications: it’s the connective tissue catalyzing growth across entire industries.
Like AT&T’s, NVIDIA’s power comes from alignment: mirroring its customers’ ambitions and evolving each time a new market emerges. Those shifts aren’t cosmetic; they mark new audiences, new business models, and new leverage.
NVIDIA’s real product is computation itself.
Across NVIDIA’s annual 10-K filings from 2011 to 20253, each year’s opening sentence showed the company’s evolving identity—from “GPU inventor” to “visual computing leader,” to “accelerated computing platform,” to today’s “full-stack computing infrastructure company.”
NVIDIA’s origin story may have been the GPU, but today the company’s real product is all of the infrastructure necessary to access computation at an unheard-of scale.
💎 This is the Treasure of Our Company
Early in the keynote (timecode 00:08:17), Jensen paused on a slide showing several of NVIDIA’s CUDA-X libraries—quantum, weather, imaging, and several more.
“Our software is the treasure of our company,” he said. “Each one of these libraries opened new markets for us.” - Jensen Huang
It was a quick moment, easy to miss, but it captured a decade-long transformation in a single line: NVIDIA, once a company that sold silicon, now sells capability.
When NVIDIA first released CUDA in 2007, developers finally had a powerful (albeit complex) way to program GPUs. Twelve years later, NVIDIA released CUDA-X, which abstracted away that complexity. Built on top of CUDA, CUDA-X transformed the GPU from a chip into a platform—one that almost anyone can use. Today there are ~400 CUDA-X libraries across a variety of fields.
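To see what “abstracting away complexity” feels like in practice, here’s a hedged sketch using CuPy, an open-source GPU array library built on CUDA (my example for illustration; CuPy is not itself a CUDA-X library). The same high-level code can run on a CPU or a GPU, with no hand-written kernels required:

```python
import numpy as np
# import cupy as cp  # drop-in GPU equivalent; requires an NVIDIA GPU + CUDA

def heavy_math(xp):
    """The same linear algebra, on whichever array library you pass in."""
    a = xp.random.rand(2048, 2048)
    b = xp.random.rand(2048, 2048)
    return float((a @ b).sum())

print(heavy_math(np))    # runs on the CPU
# print(heavy_math(cp))  # identical code, runs on the GPU
```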
Hardware is replaceable; software ecosystems aren’t.
This is the genius of Jensen’s “treasure” metaphor. Hardware is replaceable; software ecosystems aren’t. So when semiconductor leaders like TSMC, Samsung, and ASML adopt CUDA-X libraries like CuLitho, they’re wiring their businesses into NVIDIA’s software layer. And because that software stays compatible across GPU generations, the relationship compounds over time.
NVIDIA has achieved what Microsoft once did for PCs: its CUDA-X software libraries amplify the value of the NVIDIA hardware beneath them. Every industry that adopts CUDA-X becomes an NVIDIA customer, and needs NVIDIA hardware for its applications to run.
NVIDIA’s real asset isn’t chips or code; it’s the ecosystem itself.
Act II: The Shift in Utility
So what happens when we have an ecosystem of hardware and software powering advanced computing within every industry? Well, this modern computing infrastructure—once built to augment intelligence—can now be used to augment effort. We will no longer just teach machines to think; we will teach them to work.
But before we change the nature of a machine’s utility, we need new tools. In particular, we need digital worlds—digital twins—that mirror ours.
💡 It’s a Simulation, Not an Animation
Early in the keynote, Jensen played a three-minute montage (timecode 00:11:05) which, on the surface, seemed like a Pixar short. There was dramatic weather, intricate objects, realistic light, fabric fluttering, an engine exploding into its parts, and lots of robots doing things. Logos of Amazon, BMW, General Atomics and others flashed across the screen too. When the video ended, Jensen said something profound.
“Everything you saw was a simulation. There was no art. No animation. This is the beauty of mathematics.” - Jensen Huang
The audience was remarkably quiet. It looked like artistry, but it was physics and math. The miracle wasn’t how it looked, but how it was made.
Here is the difference: animations are designed; simulations are calculated. Models replace creative shortcuts with equations; they drive decisions and reduce risk. And they’re especially useful when you need to train a variety of robots.
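Here’s the distinction in miniature, in a toy physics sketch of my own (nothing to do with NVIDIA’s tooling). Nobody draws the ball’s path; the path falls out of the equations:

```python
# A bouncing ball, calculated rather than designed: no keyframes,
# no artist, just state updated step by step from physics equations.
GRAVITY = -9.81  # m/s^2
DT = 0.01        # seconds per simulation step
BOUNCE = 0.8     # fraction of speed kept after each bounce

height, velocity = 10.0, 0.0
for step in range(301):
    velocity += GRAVITY * DT  # gravity changes velocity
    height += velocity * DT   # velocity changes position
    if height <= 0.0:         # hit the floor: bounce, losing some energy
        height, velocity = 0.0, -velocity * BOUNCE
    if step % 60 == 0:
        print(f"t={step * DT:4.2f}s  height={height:5.2f} m")
```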
This video from March 2025 shows off some of NVIDIA’s modeling tech for exactly that kind of robot training. Essentially, the software enables companies to train simulated robots inside simulated worlds. Then, robotic prototypes work out of the box, as if they had already been trained to handle a variety of real-world situations. Because they had.
"We’re working with Disney research on an entirely new framework and simulation platform … for the robot to learn how to be a good robot inside a physically-aware, physically-based environment.” - Jensen Huang
Jensen even demonstrated this on stage. More than an hour after the “everything you saw was a simulation” comment, he showed something from Disney: essentially a pet Star Wars robot walking and interacting with the world (timecode 01:33:28). As you watch, notice how it stumbles on the uneven terrain and recovers—a skill it learned in simulation. If you were Disney, and planned to sell millions of pet robots, wouldn’t you want them to have experienced a huge variety of conditions first?
💡 AI is Work, Not Intelligence
Jensen was just wrapping up a discussion on AI as the New Industrial Revolution when he said (timecode 00:36:19):
“But AI is not a tool. AI is work. That is the profound difference. AI is in fact workers that can actually use tools.” - Jensen Huang
He wasn’t being poetic. He meant it literally. For decades, we’ve thought of AI as a tool that thinks—a system that analyzes, predicts, or advises. But the next generation of AI doesn’t just think; it acts. It designs, moves, builds, drives, and negotiates. It doesn’t simulate work—it does the work. And this isn’t just robots, either; it’s AI in general: AI as the commander; AI as the director; AI as the supervisor.
This is what Jensen meant by the “new industrial revolution.” AI is becoming an operating class within the economy—performing cognitive and physical labor side-by-side with humans. The GPU, once an engine of images, is now an engine of effort.
And with more workers, we’ll need to think about scale differently.
Act III: The Shift in Scale
We’re entering an era where scale multiplies work. What started as parallel computation has become parallel production. At this scale, AI isn’t just a tool or a worker—it’s infrastructure that builds more of itself.
💡 Robots Using Robots to Build Robots
Jensen showed a short video covering American reindustrialization (timecode 01:28:50). In the video, Foxconn and Siemens demonstrated a digital twin of an NVIDIA factory that allowed engineers to optimize the mechanical, electrical, and plumbing layout for robotic automation. The same digital twin of the factory was used to train and simulate robots (as I described in Act II). But here is what blew my mind:
“That’s the future of manufacturing,” he said. “The factory is essentially a robot that’s orchestrating robots to build things that are robotic.” - Jensen Huang
That’s right. The factory itself is a robot that uses robots to build robots. When the entire facility is a single, programmable organism, efficiency becomes a software problem, not a human one. The limiting factors are no longer hands or hours—they’re energy, bandwidth, and imagination. And maybe logistics.
This also means manufacturing can become mobile. When the factory is software, it can be replicated anywhere power and connectivity exist. Perhaps it could even replicate itself. Geography no longer defines production; compute does. This is true scale: production producing production.
💡 The AI Factory
There’s another kind of scale coming too. For decades, we’ve measured computing progress in floating point operations per second (FLOPS) or instructions per second (IPS), but we are soon likely to start measuring tokens per second. If you remember from the preamble, tokens connect inputs and outputs to the underlying math. And tokens are the new atoms of digital work. Jensen said (timecode 00:33:12),
“Tokens [are] … the computational unit … of artificial intelligence. You can tokenize almost anything … words, images, video, 3-D structures, chemicals, proteins, genes … anything with structure … anything with information content.” - Jensen Huang
Once you can tokenize something, AI can learn it: translating inputs to outputs, responding, generating. So what’s possible today with text will be possible with robotic motion, action, and behavior. Under the covers, it’s essentially the same math.
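Here’s what “tokenize almost anything” looks like at its absolute simplest. This is a toy of my own, nothing like a production tokenizer, but it shows why words and robot actions end up as the same kind of input:

```python
def tokenize(sequence, vocab):
    """Map any sequence of symbols to integer tokens the math can consume."""
    for symbol in sequence:
        vocab.setdefault(symbol, len(vocab))  # assign the next unused id
    return [vocab[symbol] for symbol in sequence]

vocab = {}
print(tokenize("to be or not to be".split(), vocab))        # [0, 1, 2, 3, 0, 1]
print(tokenize(["grip", "lift", "rotate", "place"], vocab)) # [4, 5, 6, 7]
# Words or robot actions: once they're tokens, it's all the same math.
```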
But now imagine a factory that accepts input and spits out tokens. This was the real aha moment for me. Jensen said (timecode 00:40:13),
“We need a new type of system, and I call it an AI factory,” he said. “Its purpose is designed to produce tokens that are as valuable as possible.” - Jensen Huang
He called cloud computing the general-purpose computer of the past. A traditional data center stores or retrieves information; an AI factory manufactures it. Every prompt, image, or simulation is an act of production. Compute becomes supply chain. And the measure of efficiency is no longer energy per bit; it’s energy per token.
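To make that metric concrete, here’s a back-of-the-envelope sketch. Every number below is hypothetical (mine, not NVIDIA’s), purely to show how energy per token works as a factory measure:

```python
facility_power_watts = 10_000_000  # hypothetical: a 10 MW AI factory
tokens_per_second = 50_000_000     # hypothetical aggregate token output

joules_per_token = facility_power_watts / tokens_per_second
print(f"{joules_per_token:.2f} joules per token")  # 0.20 J/token

# Halve the energy per token and the same factory ships twice the
# product: classic industrial-efficiency thinking, applied to compute.
```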
We have reached the industrialization of compute.
If data was the new oil, tokens are the refined fuel—and NVIDIA just cornered the market on refineries.
🐇🕳️ The Rabbit Hole
If you’d like to dig a bit deeper, I’ve included some of my favorite videos on large language models, transformers, and the attention mechanism in the footnotes4. They’re among the best explainers I’ve ever found on these three topics.
Jensen’s keynote from CES 2025 is pretty remarkable too, but I especially liked this short video at the end. If you watch it, you’ll see some of the things I described in this essay. There’s some stuff I didn’t cover too, like the modularity NVIDIA is building into the Blackwell chip, which makes it easier both to manufacture in a variety of configurations and to scale.
While writing this article, I discovered Google Gemini can summarize and extract content from YouTube video links. Here are a few fun prompts to try if you find yourself in a similar situation. (Note: they didn’t work as well on ChatGPT).
💡 Prompt Idea: Analyze this video
This prompt will analyze a video and provide you a list of major thematic elements. Feel free to adjust if you want a different analysis (e.g., major thesis).
Please analyze this video and summarize the content into major themes. https://youtu.be/dQw4w9WgXcQ
💡 Prompt Idea: Content timestamps
This prompt will provide you a list of timestamps where your topic is referenced.
Please provide me a list of timestamps as formatted URLs where the topic of clear communication is discussed. https://youtu.be/dQw4w9WgXcQ
Enjoy!
— Rob Allegar
I’m a lifelong builder and advisor exploring what happens when technology stops behaving like a tool and starts acting like a collaborator. In this newsletter, I explore the space between ideas and execution, and help people build things that matter. roballegar.com
Thanks for reading. If you enjoyed this piece, hit ♥ or share it with someone else who is already worried that robots are going to take over the world.
Here is Steve Jobs launching the iPhone in 2007. The tech looks so old. But it was less than 20 years ago! Can you imagine what NVIDIA’s tech will make possible by 2045?
Transformers are not the only viable AI model architecture, but they are really effective. Perhaps in future essays I can cover others, but the differences get a bit technical.
Across NVIDIA’s annual 10-K filings from 2011 to 2025, each year’s opening sentence quietly redefined what the company believed it was:
👾 Identity Baseline: GPUs (2010 – 2012) - This is how most people think of NVIDIA: as a chip company. Their identity in 2011 reflected that too: “NVIDIA Corporation invented the graphics processing unit, or GPU, in 1999.”
👁️ Shift #1: Visual Computing (2013 – 2016) - NVIDIA described itself as a leader in visual computing. This was a subtle shift, but the first acknowledgement that their chips were useful for compute too, even if it was only visual compute.
🧠 Shift #2: Computer Science (2017 – 2020) - The company’s scope widens to supporting computer science broadly. GPUs become scientific instruments for research and machine learning. 2017 is the first time the term AI appears.
⚡️ Shift #3: Computational Problems (2021 – 2023) - The scope widens again, this time to accelerated computing. They also hint at a “platform strategy” to create value by unifying hardware, software, algorithms, libraries, systems, and services.
🌐 Shift #4: End-to-End Provider (2024 – 2025) - NVIDIA now calls itself a “full-stack computing infrastructure company with data-center-scale offerings.” They believe they are an essential innovation partner to every industry.
Grant Sanderson, known on YouTube as 3Blue1Brown, has been producing some of the best technical videos on the Internet for over a decade. I’ve picked these three to give you a pretty good starting point on LLMs, transformers, and the attention mechanism that makes transformers so parallelizable. I’d recommend you watch them in order.