NVIDIA GTC Keynote 2026 ~ How it supports next-generation intelligent systems ~

2026.03.17

Hello.

I am Takahashi Tanaka from the Smart Factory Team in the Manufacturing Business Technology Department.

Introduction

https://www.youtube.com/watch?v=jw_o0xr8MWU

Watch NVIDIA Founder and CEO Jensen Huang's GTC keynote as he unveils the latest breakthroughs in AI and accelerated computing. See how agentic AI, AI factories, and physical AI are powering the next generation of intelligent systems.

This is a summary of NVIDIA CEO Jensen Huang's NVIDIA GTC 2026 keynote.

NVIDIA's 3 Platforms

NVIDIA has developed 3 platforms.

  • CUDA X: Algorithm library suite
  • Systems: Computing systems
  • AI Factory: Newly announced AI infrastructure

CUDA 20th Anniversary and Flywheel Effect

CUDA has reached its 20th anniversary. The SIMT (Single Instruction, Multiple Threads) architecture makes it easy to scale ordinary scalar code across thousands of parallel threads.
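
As a rough analogy for the SIMT model (pure Python, not CUDA code), the sketch below writes one scalar "kernel" function and maps it across many thread indices, each handling a single element, the way a CUDA thread would:

```python
# Toy analogy for SIMT: one scalar kernel, executed once per thread index.
# Illustrative only; real SIMT hardware runs thousands of threads in lockstep.
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(i, a, x, y):
    # Each "thread" computes exactly one output element.
    return a * x[i] + y[i]

def saxpy(a, x, y):
    # Launch the same scalar kernel over every index in parallel.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda i: saxpy_kernel(i, a, x, y), range(len(x))))

result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)  # [12.0, 24.0, 36.0]
```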

The core of NVIDIA's strategy is the "flywheel effect."

  • Large install base attracts developers
  • Developers create new algorithms and achieve breakthroughs
  • Breakthroughs create new markets
  • New markets expand the ecosystem, further increasing the install base

This cycle has even caused cloud rental prices for Ampere, which shipped six years ago, to rise.
Continuous software optimization has significantly extended the useful life of GPUs.

GeForce 25th Anniversary and the AI Big Bang

GeForce invented programmable shaders (pixel shaders) 25 years ago.
It was the world's first programmable accelerator.

As GeForce spread CUDA throughout the world, Alex Krizhevsky, Ilya Sutskever, Geoff Hinton, and Andrew Ng discovered they could accelerate deep learning with GPUs. This became the start of the AI big bang 10 years ago.

About 8 years ago, they introduced RTX, fusing hardware ray tracing with AI.

DLSS 5 and Neural Rendering

They announced the next generation graphics technology "neural rendering." This is the fusion of 3D graphics and AI.

  • Structured data: Controllable, predictable 3D graphics
  • Generative AI: Probabilistic but very realistic

By combining these two, we can generate content that is both beautiful and controllable. This concept will be repeated across all industries. Structured data forms the foundation of "trustworthy AI."

Data Processing Innovation: cuDF and cuVS

Structured Data

Data frames are processed by engines such as SQL, Spark, pandas, and Velox, and used in Snowflake, Databricks, Amazon EMR, Azure Fabric, and Google Cloud BigQuery. This is the "ground truth" of business.

AI agents consume this data far faster than humans do, so it must be thoroughly accelerated.
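
To make the "ground truth" idea concrete, here is a minimal stdlib example of the kind of structured query an agent might issue constantly. The table and figures are invented; in production this would run on an accelerated engine such as Spark, cuDF, or BigQuery:

```python
# Illustrative only: a structured "ground truth" query over business data.
# Table name, columns, and amounts are made up for this sketch.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("EU", 120.0), ("EU", 80.0), ("APAC", 200.0)])

# Aggregate revenue per region, the kind of query agents run continuously.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('APAC', 200.0), ('EU', 200.0)]
```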

Unstructured Data

About 90% of the data generated each year is unstructured: PDFs, video, audio. It was previously difficult to query or search, but multimodal AI perception can now understand its meaning, vectorize it, and make it searchable.

Two Foundation Libraries

  • cuDF: For data frames (structured data)
  • cuVS: For vector stores (unstructured data, semantic data)
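
A vector store's core operation can be sketched in a few lines. The tiny embeddings below are invented for illustration; cuVS performs the same nearest-neighbour search at billion-vector scale:

```python
# Minimal sketch of semantic search over a vector store: items are embedded
# as vectors, and a query returns the item with the highest cosine similarity.
# Embeddings here are toy 3-dimensional values, not real model outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

store = {
    "invoice.pdf": [0.9, 0.1, 0.0],
    "meeting.mp4": [0.1, 0.9, 0.2],
    "podcast.mp3": [0.0, 0.8, 0.6],
}

query = [0.85, 0.15, 0.05]  # embedding of e.g. "find billing documents"
best = max(store, key=lambda k: cosine(store[k], query))
print(best)  # invoice.pdf
```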

Cloud Partnerships

IBM

Collaborating with IBM, the inventor of SQL, to accelerate watsonx data with cuDF. Nestlé can now process supply chain data from 185 countries 5x faster and at 83% lower cost.

Dell

Announced Dell AI data platform integrating cuDF and cuVS. Collaboration with NTT Data has achieved significant acceleration.

Google Cloud

Accelerating Vertex AI and BigQuery. Reduced Snapchat's computing costs by about 80%.

AWS

Accelerating EMR, SageMaker, and Bedrock. AWS was NVIDIA's first cloud partner, and bringing OpenAI onto AWS this year is expected to significantly expand consumption.

Microsoft Azure

Azure was the first to deploy NVIDIA's A100 supercomputer, which led to success with OpenAI. Accelerating AI Foundry and Bing search.

Confidential computing enables deployment of protected OpenAI and Anthropic models in environments where even operators cannot see the data or models.

Oracle

NVIDIA was Oracle's first AI customer and also its first supplier. Deploying Cohere, Fireworks, OpenAI, and others.

Palantir + Dell

The three companies are building an AI platform that can be fully deployed in air-gapped environments or on-premises. Integration with Palantir Ontologies enables AI deployment in any country or field.

NVIDIA's Strategy: Vertical Integration × Horizontal Open

NVIDIA is the world's first vertically integrated yet horizontally open company.

Why Vertical Integration is Necessary

The essence of accelerated computing is application acceleration. Now that Moore's Law has reached its limits, only domain-specific acceleration can achieve significant speedup and cost reduction.

Therefore, NVIDIA needs to understand:

  • Applications
  • Domains
  • Algorithms
  • Deployment scenarios (data center, cloud, on-premises, edge, robotics)

Horizontally Open

NVIDIA's technology can be integrated into any platform. They provide software and libraries, integrating with partner technologies to deliver accelerated computing worldwide.

Industry Deployments

| Industry | Market size | NVIDIA's initiatives |
| --- | --- | --- |
| Self-driving cars | - | Broad reach and impact |
| Financial services | - | Algorithmic trading transitioning to deep learning/transformers. Largest share of GTC participants |
| Healthcare | - | AI physics/biology for drug discovery, diagnostic AI agents |
| Industrial | - | AI factories, chip plants, computer plant construction |
| Media/Gaming | - | Real-time AI platforms, translation, broadcast support. Holoscan |
| Quantum computing | - | Building cuQuantum GPU hybrid systems with 35 companies |
| Retail/CPG | $35 trillion | Supply chains, shopping systems, customer support AI agents |
| Robotics/Manufacturing | $50 trillion | Working in the field for 10 years; exhibiting 110 robots |
| Telecommunications | $2 trillion | Base stations evolving into AI infrastructure platforms. Collaborating with Nokia and T-Mobile on Aerial (AI RAN) |

NVIDIA's Treasure: CUDA X Libraries

NVIDIA is an algorithm company.
CUDA X libraries are the company's treasure, with computing platforms solving problems for each industry.
At this GTC, they announced approximately 100 libraries and 40 models.

Key Libraries

| Library | Purpose |
| --- | --- |
| cuDNN | Deep neural networks (cornerstone of the AI big bang) |
| cuOpt | Decision optimization |
| cuLitho | Computational lithography |
| cuDSS | Direct sparse solver |
| cuEquivariance | Geometry-aware neural networks |
| Aerial | AI RAN |
| Warp | Differentiable physics |
| Parabricks | Genomics |

3 Inflection Points in AI

Three critical inflection points have arrived for AI in the past two years.

  • ChatGPT (late 2022-2023): Beginning of the generative AI era. Computing fundamentally changed from "search-based" to "generation-based"
  • o1 (reasoning AI): Enabled reflection, planning, and problem decomposition, making generative AI trustworthy and grounded in truth
  • Claude Code (agentic AI): Capable of reading files, writing code, compiling, testing, and iterating. 100% of NVIDIA employees use Claude Code, Codex, or Cursor

As a result, AI has evolved from "perceiving AI" → "generating AI" → "reasoning AI" → "AI that actually does work."

Inference Inflection Point and Demand Explosion

As AI began doing productive work, the inference inflection point arrived.

  • Computational demand increased 1 million times in the past two years (10,000x increase in computation × 100x increase in usage)
  • Venture investments in AI startups reached $150 billion (largest in human history)
  • Investment scale shifted from millions to billions of dollars

By 2027, at least $1 trillion in demand is expected (doubled from $500 billion last year).

Token Factory Economics

Data centers have transformed from "data centers for files" to "factories that generate tokens."

Two Axes of the Token Economy

| Axis | Meaning | Business impact |
| --- | --- | --- |
| Throughput (vertical axis) | Tokens generated per watt | Production volume under power constraints |
| Token velocity (horizontal axis) | Speed of inference | AI intelligence; amount of context it can process |

Price Tier Differentiation

| Tier | Characteristics | Price example |
| --- | --- | --- |
| Free | High throughput, low velocity | $0 |
| Standard | Medium throughput, medium velocity | $3 / million tokens |
| High | Low throughput, high velocity | $6 / million tokens |
| Premium | Lowest throughput, highest velocity | $45-$150 / million tokens |

Going forward, CEOs worldwide will study token factory efficiency as it directly impacts revenue.
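
As a back-of-envelope illustration of token factory economics: the $/million-token price comes from the Standard tier above, while the 1 GW power budget matches the keynote's example facility. The energy efficiency figure (tokens per joule) is entirely hypothetical:

```python
# Back-of-envelope token factory revenue. Only the tier price and the 1 GW
# scale come from the keynote; tokens-per-joule is an assumed placeholder.
POWER_W = 1e9             # 1 GW facility
TOKENS_PER_JOULE = 1.0    # hypothetical energy efficiency
PRICE_PER_MTOK = 3.0      # Standard tier, $ per million tokens

tokens_per_sec = POWER_W * TOKENS_PER_JOULE       # tokens/s at full power
revenue_per_day = tokens_per_sec * 86_400 / 1e6 * PRICE_PER_MTOK

print(f"${revenue_per_day:,.0f} per day")  # $259,200,000 per day
```

This is why throughput per watt is the vertical axis of the token economy: under a fixed power budget, every efficiency gain converts directly into revenue.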

Grace Blackwell and Vera Rubin Performance

  • While Moore's Law would expect 1.5x performance improvement, they achieved 35-50x
  • According to Dylan Patel of SemiAnalysis, "Jensen sandbagged. It's actually 50x"
  • Token cost is the world's lowest and "basically untouchable"

Vera Rubin Platform

  • 7 chips, 5 rack-scale computers
  • 3.6 exaflops, 260TB/sec NVLink bandwidth
  • 40 million times compute improvement in 10 years
  • 100% liquid cooled, using 45 °C warm water
  • Installation time: 2 days → 2 hours

Revenue Impact (for a 1GW data center)

| Platform | Revenue vs. Hopper |
| --- | --- |
| Hopper | 1x |
| Grace Blackwell | 5x |
| Vera Rubin | 25x (35x with Groq integration) |

Groq Integration

NVIDIA acquired the Groq team and licensed their technology.
They integrated two extreme processors.

| | Vera Rubin | Groq |
| --- | --- | --- |
| Design philosophy | High throughput | Low latency |
| Memory | 288 GB/chip | 500 MB/chip (massive SRAM) |
| Usage | Prefill, attention | Decode, token generation |

They integrated the two processors with Dynamo software to optimize the inference pipeline, achieving both high throughput and low latency.
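
The split can be sketched conceptually: prefill processes the whole prompt in one compute-bound batch pass, while decode emits tokens one at a time from the cached context. This toy pipeline illustrates the idea only; nothing here is NVIDIA's actual Dynamo API:

```python
# Conceptual sketch of disaggregated inference: a high-throughput stage for
# prefill feeding a low-latency stage for decode. Toy code, not a real API.

def prefill(prompt):
    # Compute-bound: the whole prompt is processed at once, producing the
    # context ("KV cache") that the decode stage will reuse.
    return {"kv_cache": prompt.split()}

def decode(state, max_tokens=3):
    # Latency-bound: tokens are emitted strictly one at a time, each step
    # reading the cached context (here just its length, as a placeholder).
    context_len = len(state["kv_cache"])
    return [f"tok{context_len + i}" for i in range(max_tokens)]

state = prefill("summarize the GTC keynote")
tokens = decode(state)
print(tokens)  # ['tok4', 'tok5', 'tok6']
```

Because the two stages have opposite hardware needs, routing each to a processor built for it is what lets the combined system deliver both high throughput and low latency.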

Roadmap

| Generation | GPU | CPU | LPU | Scale-up |
| --- | --- | --- | --- | --- |
| Blackwell | Blackwell | Grace | - | NVLink 72 (copper) |
| Rubin | Rubin | Vera | LP 30 | NVLink 72 + optical → 576 |
| Rubin Ultra | Rubin Ultra | Vera | LP 35 | Kyber (NVLink 144) |
| Feynman | Next-gen | Rosa | LP 40 | Kyber + CPO |

NVIDIA Deepchex Platform

Digital twin platform optimizing AI factory design and operation.

  • DS World: Virtual design of AI factories on Omniverse
  • DS Flex: Dynamic power management with the grid
  • DS MaxQ: Dynamic maximization of token throughput

Partners: PTC, Dassault Systemes, Jacobs, Siemens, Cadence, Procore, etc.

AI factories have 2x optimization potential, and 2x at this scale is enormous.

Space Deployment

  • Thor: Radiation-approved, already mounted on satellites
  • Vera Rubin Space-1: Computer for space data centers (in development)
  • Challenge: in space there is no conduction or convection, so heat can only be rejected by radiation

OpenClaw

Open source project developed by Peter Steinberger.

  • The most popular open source project in human history (surpassed 30 years of Linux in a few weeks)
  • Download and build AI agents with a single command
  • Andrej Karpathy's "Research" feature: Give it a task, sleep, and it automatically runs 100 experiments overnight

NVIDIA announced support for OpenClaw.

OpenClaw is an operating system for AI agents: it provides resource management, scheduling, sub-agent invocation, and multimodal I/O.
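
The "OS for agents" idea can be sketched as a parent agent that schedules sub-agent tasks and collects their results. This is purely illustrative; no real OpenClaw APIs are used here:

```python
# Toy sketch of agent scheduling: a parent agent queues sub-agent tasks
# and dispatches them in order. Illustrative only; not OpenClaw code.
from queue import Queue

def sub_agent(task):
    # Stand-in for a real sub-agent that would read files, write code, etc.
    return f"done:{task}"

def run_agents(tasks):
    ready, results = Queue(), []
    for t in tasks:
        ready.put(t)  # scheduling: FIFO over pending sub-tasks
    while not ready.empty():
        results.append(sub_agent(ready.get()))
    return results

print(run_agents(["read file", "write code", "run tests"]))
# ['done:read file', 'done:write code', 'done:run tests']
```

A real agent OS adds what this sketch omits: resource limits, preemption, and multimodal input/output between sub-agents.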

Just as Windows enabled personal computers, OpenClaw enabled personal agents.

Like Linux, HTTP/HTML, and Kubernetes, it provided what the industry needed at the right time. Every company needs an "OpenClaw strategy."

Enterprise IT Transformation

All SaaS companies are becoming GaaS (Agentic as a Service) companies.

Addressing Security Challenges

Agents can access confidential information, execute code, and communicate externally, so security is essential.

NVIDIA developed NeMo Claw, which includes:

  • OpenShell (security integration)
  • Policy engine integration
  • Network guardrails
  • Privacy router

NVIDIA Open Models

Six model families at the frontier in all domains:

| Model | Purpose |
| --- | --- |
| Nemotron | Language, vision, RAG, voice |
| Cosmos | Physical AI, world generation |
| Alpamayo | Autonomous driving (world's first thinking and reasoning type) |
| Groot | General-purpose robotics |
| BioNeMo | Biology, molecular design |
| Earth Two | Weather and climate prediction |

Nemotron-3 Ultra has become the world's best base model, supporting the construction of Sovereign AI in each country.

Nemotron Coalition

Cursor, LangChain, Mistral, Perplexity, Sarvam, and many others are participating in joint development of Nemotron-4.

The Era of Token Budgets

In the future, all engineers will have annual token budgets. In addition to base salary, they will receive about half that amount as tokens. In Silicon Valley, "how many tokens come with the job" has become a condition of employment.
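
Rough arithmetic for the token budget idea: if half of a salary is granted as inference tokens at the Standard tier price from the earlier table, how many tokens is that? The salary figure below is hypothetical:

```python
# Token budget arithmetic. The salary is an assumed figure; the price per
# million tokens is the Standard tier from the token factory section.
salary = 200_000               # $/year, hypothetical
token_budget_usd = salary / 2  # "about half that amount as tokens"
price_per_mtok = 3.0           # Standard tier, $ per million tokens

tokens = token_budget_usd / price_per_mtok * 1_000_000
print(f"{tokens:,.0f} tokens/year")  # 33,333,333,333 tokens/year
```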

Physical AI

The "ChatGPT Moment" of Self-Driving Cars

Self-driving has been proven to work reliably.

New partners: BYD, Hyundai, Nissan, and Geely (18 million vehicles annually in total)
Partnering with Uber to deploy robotaxis in multiple cities.

Alpamayo: The world's first thinking and reasoning self-driving AI. Cars can explain their own actions.

Robot Development Ecosystem

| Tool | Purpose |
| --- | --- |
| Isaac Lab | Training and evaluation |
| Newton | Differentiable physics simulation |
| Cosmos | Neural simulation |
| Groot | Robot reasoning and action generation |

Disney Collaboration

Joint development with Disney Research: an Olaf robot appeared on stage, having learned to walk in Omniverse and adapted to the physical world with the Newton solver.

In future Disneyland parks, character robots will walk around.

Summary

This keynote was packed with content, and I've summarized some of the key points that caught my attention. AI has entered the era of reasoning and agents, with tokens becoming the new currency and value. I strongly felt that NVIDIA is becoming an entity that provides the foundation for this across hardware, software, and the entire ecosystem.

AI Transformation Points

  • Evolution from ChatGPT → o1 → Claude Code shifted AI's purpose from "generation" → "reasoning" → "execution"
  • Computational demand increased 1 million times in the past 2 years, with $1 trillion in demand by 2027
  • 100% of NVIDIA employees use Claude Code / Codex / Cursor

Token Factory

  • Data centers transformed from "file storage" to "token production factories"
  • We're entering an era where CEOs worldwide will pursue token efficiency
  • In the future, engineers will have annual token budgets, which will become a hiring condition

Vera Rubin Platform

  • 35-50x performance improvement compared to Hopper (vs. 1.5x expected from Moore's Law)
  • 40 million times compute improvement in 10 years
  • Integration with Groq combines high throughput and low latency

OpenClaw Revolution

  • OS for agent AI (equivalent impact to Linux, HTML, Kubernetes)
  • Most popular open source project in human history
  • All SaaS companies becoming GaaS (Agentic as a Service) companies

Physical AI

  • ChatGPT moment has arrived for self-driving
  • New partners: BYD, Hyundai, Nissan, Geely (18 million vehicles annually)
  • Developing Olaf robot with Disney for future Disneyland parks
