NVIDIA GTC Keynote 2026 ~ How it supports next-generation intelligent systems ~

2026.03.17

Hello.

I am Takahashi Tanaka from the Smart Factory Team in the Manufacturing Business Technology Department.

Introduction

https://www.youtube.com/watch?v=jw_o0xr8MWU

Watch NVIDIA Founder and CEO Jensen Huang's GTC keynote as he unveils the latest breakthroughs in AI and accelerated computing. See how agentic AI, AI factories, and physical AI are powering the next generation of intelligent systems.

This is a summary of NVIDIA CEO Jensen Huang's NVIDIA GTC 2026 keynote.

NVIDIA's 3 Platforms

NVIDIA has developed 3 platforms.

  • CUDA X: Algorithm library suite
  • Systems: Computing systems
  • AI Factory: Newly announced AI infrastructure

CUDA 20th Anniversary and Flywheel Effect

CUDA has reached its 20th anniversary. The SIMT (Single Instruction, Multiple Threads) architecture makes it easy to scale ordinary scalar code across thousands of parallel threads.
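
As a rough analogy for the SIMT model (pure Python, not CUDA code), the sketch below writes one scalar "kernel" function and maps it across many thread indices, each handling a single element, the way a CUDA thread would:

```python
# Toy analogy for SIMT: one scalar kernel, executed once per thread index.
# Illustrative only; real SIMT hardware runs thousands of threads in lockstep.
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(i, a, x, y):
    # Each "thread" computes exactly one output element.
    return a * x[i] + y[i]

def saxpy(a, x, y):
    # Launch the same scalar kernel over every index in parallel.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda i: saxpy_kernel(i, a, x, y), range(len(x))))

result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)  # [12.0, 24.0, 36.0]
```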

The core of NVIDIA's strategy is the "flywheel effect."

  • Large install base attracts developers
  • Developers create new algorithms and achieve breakthroughs
  • Breakthroughs create new markets
  • New markets expand the ecosystem, further increasing the install base

This cycle has even caused cloud rental prices for Ampere, which shipped six years ago, to rise.
Continuous software optimization has significantly extended the useful life of GPUs.

GeForce 25th Anniversary and the AI Big Bang

GeForce invented programmable shaders (pixel shaders) 25 years ago.
It was the world's first programmable accelerator.

As GeForce spread CUDA throughout the world, Alex Krizhevsky, Ilya Sutskever, Geoff Hinton, and Andrew Ng discovered they could accelerate deep learning with GPUs. This became the start of the AI big bang 10 years ago.

About 8 years ago, they introduced RTX, fusing hardware ray tracing with AI.

DLSS 5 and Neural Rendering

They announced the next generation graphics technology "neural rendering." This is the fusion of 3D graphics and AI.

  • Structured data: Controllable, predictable 3D graphics
  • Generative AI: Probabilistic but very realistic

By combining these two, we can generate content that is both beautiful and controllable. This concept will be repeated across all industries. Structured data forms the foundation of "trustworthy AI."

Data Processing Innovation: cuDF and cuVS

Structured Data

Data frames are processed by engines such as SQL, Spark, pandas, and Velox, and used in Snowflake, Databricks, Amazon EMR, Azure Fabric, and Google Cloud BigQuery. This is the "ground truth" of business.

AI agents consume this data far faster than humans do, so it must be thoroughly accelerated.
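
To make the "ground truth" idea concrete, here is a minimal stdlib example of the kind of structured query an agent might issue constantly. The table and figures are invented; in production this would run on an accelerated engine such as Spark, cuDF, or BigQuery:

```python
# Illustrative only: a structured "ground truth" query over business data.
# Table name, columns, and amounts are made up for this sketch.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("EU", 120.0), ("EU", 80.0), ("APAC", 200.0)])

# Aggregate revenue per region, the kind of query agents run continuously.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('APAC', 200.0), ('EU', 200.0)]
```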

Unstructured Data

About 90% of the data generated each year is unstructured: PDFs, video, audio. It was previously difficult to query or search, but multimodal AI perception can now understand its meaning, vectorize it, and make it searchable.

Two Foundation Libraries

  • cuDF: For data frames (structured data)
  • cuVS: For vector stores (unstructured data, semantic data)
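
A vector store's core operation can be sketched in a few lines. The tiny embeddings below are invented for illustration; cuVS performs the same nearest-neighbour search at billion-vector scale:

```python
# Minimal sketch of semantic search over a vector store: items are embedded
# as vectors, and a query returns the item with the highest cosine similarity.
# Embeddings here are toy 3-dimensional values, not real model outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

store = {
    "invoice.pdf": [0.9, 0.1, 0.0],
    "meeting.mp4": [0.1, 0.9, 0.2],
    "podcast.mp3": [0.0, 0.8, 0.6],
}

query = [0.85, 0.15, 0.05]  # embedding of e.g. "find billing documents"
best = max(store, key=lambda k: cosine(store[k], query))
print(best)  # invoice.pdf
```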

Cloud Partnerships

IBM

Collaborating with IBM, the inventor of SQL, to accelerate watsonx data with cuDF. Nestlé can now process supply chain data from 185 countries 5x faster and at 83% lower cost.

Dell

Announced Dell AI data platform integrating cuDF and cuVS. Collaboration with NTT Data has achieved significant acceleration.

Google Cloud

Accelerating Vertex AI and BigQuery. Reduced Snapchat's computing costs by about 80%.

AWS

Accelerating EMR, SageMaker, and Bedrock. AWS was NVIDIA's first cloud partner, and bringing OpenAI onto AWS this year is expected to significantly expand consumption.

Microsoft Azure

Azure was the first to deploy NVIDIA's A100 supercomputer, which led to success with OpenAI. Accelerating AI Foundry and Bing search.

Confidential computing enables deployment of protected OpenAI and Anthropic models in environments where even operators cannot see the data or models.

Oracle

NVIDIA was Oracle's first AI customer and also its first supplier. Deploying Cohere, Fireworks, OpenAI, and others.

Palantir + Dell

The three companies are building an AI platform that can be fully deployed in air-gapped environments or on-premises. Integration with Palantir Ontologies enables AI deployment in any country or field.

NVIDIA's Strategy: Vertical Integration × Horizontal Open

NVIDIA is the world's first vertically integrated yet horizontally open company.

Why Vertical Integration is Necessary

The essence of accelerated computing is application acceleration. Now that Moore's Law has reached its limits, only domain-specific acceleration can achieve significant speedup and cost reduction.

Therefore, NVIDIA needs to understand:

  • Applications
  • Domains
  • Algorithms
  • Deployment scenarios (data center, cloud, on-premises, edge, robotics)

Horizontally Open

NVIDIA's technology can be integrated into any platform. They provide software and libraries, integrating with partner technologies to deliver accelerated computing worldwide.

Industry Deployments

| Industry | Market size | NVIDIA's initiatives |
| --- | --- | --- |
| Self-driving cars | - | Broad reach and impact |
| Financial services | - | Algorithmic trading transitioning to deep learning/transformers. Largest share of GTC participants |
| Healthcare | - | AI physics/biology for drug discovery, diagnostic AI agents |
| Industrial | - | AI factories, chip plants, computer plant construction |
| Media/Gaming | - | Real-time AI platforms, translation, broadcast support. Holoscan |
| Quantum computing | - | Building cuQuantum GPU hybrid systems with 35 companies |
| Retail/CPG | $35 trillion | Supply chains, shopping systems, customer support AI agents |
| Robotics/Manufacturing | $50 trillion | Working in the field for 10 years; exhibiting 110 robots |
| Telecommunications | $2 trillion | Base stations evolving into AI infrastructure platforms. Collaborating with Nokia and T-Mobile on Aerial (AI RAN) |

NVIDIA's Treasure: CUDA X Libraries

NVIDIA is an algorithm company.
CUDA X libraries are the company's treasure, with computing platforms solving problems for each industry.
At this GTC, they announced approximately 100 libraries and 40 models.

Key Libraries

| Library | Purpose |
| --- | --- |
| cuDNN | Deep neural networks (cornerstone of the AI big bang) |
| cuOpt | Decision optimization |
| cuLitho | Computational lithography |
| cuDSS | Direct sparse solver |
| cuEquivariance | Geometry-aware neural networks |
| Aerial | AI RAN |
| Warp | Differentiable physics |
| Parabricks | Genomics |

3 Inflection Points in AI

Three critical inflection points have arrived for AI in the past two years.

  • ChatGPT (late 2022-2023): Beginning of the generative AI era. Computing fundamentally changed from "search-based" to "generation-based"
  • o1 (reasoning AI): Enabled reflection, planning, and problem decomposition, making generative AI trustworthy and grounded in truth
  • Claude Code (agentic AI): Capable of reading files, writing code, compiling, testing, and iterating. 100% of NVIDIA employees use Claude Code, Codex, or Cursor

As a result, AI has evolved from "perceiving AI" → "generating AI" → "reasoning AI" → "AI that actually does work."

Inference Inflection Point and Demand Explosion

As AI began doing productive work, the inference inflection point arrived.

  • Computational demand increased 1 million times in the past two years (10,000x increase in computation × 100x increase in usage)
  • Venture investments in AI startups reached $150 billion (largest in human history)
  • Investment scale shifted from millions to billions of dollars

By 2027, at least $1 trillion in demand is expected (doubled from $500 billion last year).

Token Factory Economics

Data centers have transformed from "data centers for files" to "factories that generate tokens."

Two Axes of the Token Economy

| Axis | Meaning | Business impact |
| --- | --- | --- |
| Throughput (vertical axis) | Tokens generated per watt | Production volume under power constraints |
| Token velocity (horizontal axis) | Speed of inference | AI intelligence; amount of context it can process |

Price Tier Differentiation

| Tier | Characteristics | Price example |
| --- | --- | --- |
| Free | High throughput, low velocity | $0 |
| Standard | Medium throughput, medium velocity | $3 / million tokens |
| High | Low throughput, high velocity | $6 / million tokens |
| Premium | Lowest throughput, highest velocity | $45-$150 / million tokens |

Going forward, CEOs worldwide will study token factory efficiency as it directly impacts revenue.
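
As a back-of-envelope illustration of token factory economics: the $/million-token price comes from the Standard tier above, while the 1 GW power budget matches the keynote's example facility. The energy efficiency figure (tokens per joule) is entirely hypothetical:

```python
# Back-of-envelope token factory revenue. Only the tier price and the 1 GW
# scale come from the keynote; tokens-per-joule is an assumed placeholder.
POWER_W = 1e9             # 1 GW facility
TOKENS_PER_JOULE = 1.0    # hypothetical energy efficiency
PRICE_PER_MTOK = 3.0      # Standard tier, $ per million tokens

tokens_per_sec = POWER_W * TOKENS_PER_JOULE       # tokens/s at full power
revenue_per_day = tokens_per_sec * 86_400 / 1e6 * PRICE_PER_MTOK

print(f"${revenue_per_day:,.0f} per day")  # $259,200,000 per day
```

This is why throughput per watt is the vertical axis of the token economy: under a fixed power budget, every efficiency gain converts directly into revenue.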

Grace Blackwell and Vera Rubin Performance

  • While Moore's Law would expect 1.5x performance improvement, they achieved 35-50x
  • According to Dylan Patel of SemiAnalysis, "Jensen sandbagged. It's actually 50x"
  • Token cost is the world's lowest and "basically untouchable"

Vera Rubin Platform

  • 7 chips, 5 rack-scale computers
  • 3.6 exaflops, 260TB/sec NVLink bandwidth
  • 40 million times compute improvement in 10 years
  • 100% liquid cooled, using 45 °C warm water
  • Installation time: 2 days → 2 hours

Revenue Impact (for a 1GW data center)

| Platform | Revenue vs. Hopper |
| --- | --- |
| Hopper | 1x |
| Grace Blackwell | 5x |
| Vera Rubin | 25x (35x with Groq integration) |

Groq Integration

NVIDIA acquired the Groq team and licensed their technology.
They integrated two extreme processors.

| | Vera Rubin | Groq |
| --- | --- | --- |
| Design philosophy | High throughput | Low latency |
| Memory | 288 GB/chip | 500 MB/chip (massive SRAM) |
| Usage | Prefill, attention | Decode, token generation |

They integrated the two processors with Dynamo software to optimize the inference pipeline, achieving both high throughput and low latency.
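
The split can be sketched conceptually: prefill processes the whole prompt in one compute-bound batch pass, while decode emits tokens one at a time from the cached context. This toy pipeline illustrates the idea only; nothing here is NVIDIA's actual Dynamo API:

```python
# Conceptual sketch of disaggregated inference: a high-throughput stage for
# prefill feeding a low-latency stage for decode. Toy code, not a real API.

def prefill(prompt):
    # Compute-bound: the whole prompt is processed at once, producing the
    # context ("KV cache") that the decode stage will reuse.
    return {"kv_cache": prompt.split()}

def decode(state, max_tokens=3):
    # Latency-bound: tokens are emitted strictly one at a time, each step
    # reading the cached context (here just its length, as a placeholder).
    context_len = len(state["kv_cache"])
    return [f"tok{context_len + i}" for i in range(max_tokens)]

state = prefill("summarize the GTC keynote")
tokens = decode(state)
print(tokens)  # ['tok4', 'tok5', 'tok6']
```

Because the two stages have opposite hardware needs, routing each to a processor built for it is what lets the combined system deliver both high throughput and low latency.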

Roadmap

| Generation | GPU | CPU | LPU | Scale-up |
| --- | --- | --- | --- | --- |
| Blackwell | Blackwell | Grace | - | NVLink 72 (copper) |
| Rubin | Rubin | Vera | LP 30 | NVLink 72 + optical → 576 |
| Rubin Ultra | Rubin Ultra | Vera | LP 35 | Kyber (NVLink 144) |
| Feynman | Next-gen | Rosa | LP 40 | Kyber + CPO |

NVIDIA Deepchex Platform

Digital twin platform optimizing AI factory design and operation.

  • DS World: Virtual design of AI factories on Omniverse
  • DS Flex: Dynamic power management with the grid
  • DS MaxQ: Dynamic maximization of token throughput

Partners: PTC, Dassault Systemes, Jacobs, Siemens, Cadence, Procore, etc.

AI factories have 2x optimization potential, and 2x at this scale is enormous.

Space Deployment

  • Thor: Radiation-approved, already mounted on satellites
  • Vera Rubin Space-1: Computer for space data centers (in development)
  • Challenge: in space there is no conduction or convection, so heat can only be rejected by radiation

OpenClaw

Open source project developed by Peter Steinberger.

  • The most popular open source project in human history (surpassed 30 years of Linux in a few weeks)
  • Download and build AI agents with a single command
  • Andrej Karpathy's "Research" feature: Give it a task, sleep, and it automatically runs 100 experiments overnight

NVIDIA announced support for OpenClaw.

OpenClaw is an operating system for AI agents: it provides resource management, scheduling, sub-agent invocation, and multimodal I/O.
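
The "OS for agents" idea can be sketched as a parent agent that schedules sub-agent tasks and collects their results. This is purely illustrative; no real OpenClaw APIs are used here:

```python
# Toy sketch of agent scheduling: a parent agent queues sub-agent tasks
# and dispatches them in order. Illustrative only; not OpenClaw code.
from queue import Queue

def sub_agent(task):
    # Stand-in for a real sub-agent that would read files, write code, etc.
    return f"done:{task}"

def run_agents(tasks):
    ready, results = Queue(), []
    for t in tasks:
        ready.put(t)  # scheduling: FIFO over pending sub-tasks
    while not ready.empty():
        results.append(sub_agent(ready.get()))
    return results

print(run_agents(["read file", "write code", "run tests"]))
# ['done:read file', 'done:write code', 'done:run tests']
```

A real agent OS adds what this sketch omits: resource limits, preemption, and multimodal input/output between sub-agents.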

Just as Windows enabled personal computers, OpenClaw enabled personal agents.

Like Linux, HTTP/HTML, and Kubernetes, it provided what the industry needed at the right time. Every company needs an "OpenClaw strategy."

Enterprise IT Transformation

All SaaS companies are becoming GaaS (Agentic as a Service) companies.

Addressing Security Challenges

Agents can access confidential information, execute code, and communicate externally, so security is essential.

NVIDIA developed NeMo Claw, which includes:

  • OpenShell (security integration)
  • Policy engine integration
  • Network guardrails
  • Privacy router

NVIDIA Open Models

Six model families at the frontier in all domains:

| Model | Purpose |
| --- | --- |
| Nemotron | Language, vision, RAG, voice |
| Cosmos | Physical AI, world generation |
| Alpamayo | Autonomous driving (world's first thinking and reasoning type) |
| Groot | General-purpose robotics |
| BioNeMo | Biology, molecular design |
| Earth Two | Weather and climate prediction |

Nemotron-3 Ultra has become the world's best base model, supporting the construction of Sovereign AI in each country.

Nemotron Coalition

Cursor, LangChain, Mistral, Perplexity, Sarvam, and many others are participating in joint development of Nemotron-4.

The Era of Token Budgets

In the future, all engineers will have annual token budgets. In addition to base salary, they will receive about half that amount as tokens. In Silicon Valley, "how many tokens come with the job" has become a condition of employment.
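
Rough arithmetic for the token budget idea: if half of a salary is granted as inference tokens at the Standard tier price from the earlier table, how many tokens is that? The salary figure below is hypothetical:

```python
# Token budget arithmetic. The salary is an assumed figure; the price per
# million tokens is the Standard tier from the token factory section.
salary = 200_000               # $/year, hypothetical
token_budget_usd = salary / 2  # "about half that amount as tokens"
price_per_mtok = 3.0           # Standard tier, $ per million tokens

tokens = token_budget_usd / price_per_mtok * 1_000_000
print(f"{tokens:,.0f} tokens/year")  # 33,333,333,333 tokens/year
```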

Physical AI

The "ChatGPT Moment" of Self-Driving Cars

Self-driving has been proven to work reliably.

New partners: BYD, Hyundai, Nissan, and Geely (18 million vehicles annually in total)
Partnering with Uber to deploy robotaxis in multiple cities.

Alpamayo: The world's first thinking and reasoning self-driving AI. Cars can explain their own actions.

Robot Development Ecosystem

| Tool | Purpose |
| --- | --- |
| Isaac Lab | Training and evaluation |
| Newton | Differentiable physics simulation |
| Cosmos | Neural simulation |
| Groot | Robot reasoning and action generation |

Disney Collaboration

Joint development with Disney Research: an Olaf robot appeared on stage, having learned to walk in Omniverse and adapted to the physical world with the Newton solver.

In future Disneyland parks, character robots will walk around.

Summary

This keynote was packed with content, and I've summarized some of the key points that caught my attention. AI has entered the era of reasoning and agents, with tokens becoming the new currency and value. I strongly felt that NVIDIA is becoming an entity that provides the foundation for this across hardware, software, and the entire ecosystem.

AI Transformation Points

  • Evolution from ChatGPT → o1 → Claude Code shifted AI's purpose from "generation" → "reasoning" → "execution"
  • Computational demand increased 1 million times in the past 2 years, with $1 trillion in demand by 2027
  • 100% of NVIDIA employees use Claude Code / Codex / Cursor

Token Factory

  • Data centers transformed from "file storage" to "token production factories"
  • We're entering an era where CEOs worldwide will pursue token efficiency
  • In the future, engineers will have annual token budgets, which will become a hiring condition

Vera Rubin Platform

  • 35-50x performance improvement compared to Hopper (vs. 1.5x expected from Moore's Law)
  • 40 million times compute improvement in 10 years
  • Integration with Groq combines high throughput and low latency

OpenClaw Revolution

  • OS for agent AI (equivalent impact to Linux, HTML, Kubernetes)
  • Most popular open source project in human history
  • All SaaS companies becoming GaaS (Agentic as a Service) companies

Physical AI

  • ChatGPT moment has arrived for self-driving
  • New partners: BYD, Hyundai, Nissan, Geely (18 million vehicles annually)
  • Developing Olaf robot with Disney for future Disneyland parks
