I investigated the current state of manufacturing VSS as seen at NVIDIA VSS 3.1.0 EA and Hannover Messe

I investigated the current state of manufacturing VSS as seen at NVIDIA VSS 3.1.0 EA and Hannover Messe

2026.05.10

This page has been translated by machine translation. View original

Introduction

Hello, I'm Morishige from Classmethod's Manufacturing Business Technology Department.

I published the article below in March 2026, and in the two months since then, there has been quite a lot of activity around VSS.

https://dev.classmethod.jp/articles/dgx-spark-vss-3-early-access/

VSS 3.1.0 EA was released on March 25, and the Warehouse Blueprint was also updated to 3.1.0 as of April 30. Furthermore, at Hannover Messe 2026 (4/20–24), there was a series of announcements of products incorporating VSS for the manufacturing industry.

In this article, which is a sequel to the previous one, I first organized how VSS is starting to be used on the manufacturing floor as seen at Hannover Messe, using four company case studies.

VSS 3.1.0 EA Release and Developments Since March

The 3.1.0 EA was announced on the NVIDIA Developer Forum.

https://forums.developer.nvidia.com/t/nvidia-ai-blueprint-for-video-search-and-summarization-vss-v3-1-0-early-access-is-publicly-available/364679

The Warehouse Blueprint documentation was also updated to 3.1.0 as of April 30. While GA has not yet been released, DGX Spark has been officially listed alongside AGX/IGX Thor in the "Validated Platforms."

Date Event
2026-03-25 VSS 3.1.0 EA released (Developer Forum)
2026-03-16–19 GTC 2026 (no mention of VSS in the keynote)
2026-04-20–24 Hannover Messe 2026 (VSS adoption in manufacturing surges into the spotlight)
2026-04-30 Warehouse Blueprint 3.1.0 documentation updated

What's interesting is that the evolution on the software side (3.0 → 3.1) and the movement on the field side (emergence of case studies) are happening at almost the same time. When I wrote the March article, VSS 3.0 EA was positioned as "architecturally flashy but still a painful EA to work with locally," but looking back now in May, it's clear that manufacturing sites around the world are beginning to enter the implementation phase.

Let's now look at four company case studies in order. Based on the NVIDIA official blog and each company's official information, I've organized what's visible regarding their connection to VSS.

Manufacturing VSS Case Study 1: Invisible AI Vision Execution System

This is where the case study section begins. Using the blog post NVIDIA published to coincide with Hannover Messe 2026 as a base, supplemented by each company's official information, I'll look at four companies in order.

https://blogs.nvidia.com/blog/ai-manufacturing-hannover-messe/

The first case study is Invisible AI's Vision Execution System. It is introduced in the above blog as a product that adopts the NVIDIA Metropolis VSS Blueprint and Cosmos Reason 2.

Invisible AI's concept is simple: lightweight cameras are installed at each station on the manufacturing line, and AI automatically converts "what happened" per work cycle into structured data. The three steps of Capture, Structure, and Act are listed on their official website.

Lightweight cameras see every cycle on every line. No wearables, no operator disruption.

A key feature is that it collects data without wearables and without disrupting workers' movements. Getting workers on a manufacturing line to wear sensors is difficult to negotiate with the floor, so a design that works entirely on the camera side helps lower the barrier to adoption.

The companies listed as adopters on the official website are Mercedes-Benz, Ford, Toyota, BMW, General Motors, and Nissan — essentially all the major players in the automotive industry. Toyota has provided the following comment:

Invisible AI has been a great partner for Toyota as we work toward building the manufacturing processes of the future.

The numbers are also specific: "10x faster than manual time studies," "3-5x ROI on each device," "Every minute of downtime we prevent saves us $1k," and "every workstation we can remove saves us $200k per year" — all linked directly to on-site labor cost calculations.

The connection to VSS lies in the part that converts video into "per-cycle structured data." The VLM (Cosmos Reason 2) reads the video and extracts events, and Behavior Analytics generates analysis events such as ROI, trip wires, and proximity violations — this corresponds to Invisible AI's Capture → Structure steps. The form is that VSS itself is not exposed directly to users, but is wrapped inside Invisible AI's platform for delivery.

Manufacturing VSS Case Study 2: Tulip Interfaces Factory Playback

Tulip Interfaces is a frontline operations platform that promotes "Composable AI for Frontline Operations." Factory Playback, announced at Hannover Messe 2026, is a new feature combining the VSS Blueprint with Cosmos Reason 2. The idea is to overlay video and production data onto a single timeline so you can rewind and drill down into it.

Combine video and production data into a single timeline you can rewind, explore, and analyze.

NVIDIA's official blog describes it as "synchronize machine telemetry, operator workflows, quality events and video." By aligning machine telemetry, operator workflows, quality events, and video on the same timeline, it enables a seamless way to confirm "what happened when, and what the video looked like at that moment."

The case study introduced as an adoption example is construction equipment manufacturer Terex.

estimated 3% increase in yield and 10% reduction in rework

The fact that the figures of 3% yield improvement and 10% rework reduction are presented with "estimated" is a careful way of phrasing things. This is because yield and rework have complex causes, making it difficult to isolate the contribution of AI alone.

Incidentally, Tulip itself has plenty of case studies that don't involve VSS: a 60% reduction in inspection and rework time at TICO Tractors, and a 30% faster clinical trial packaging process at Sharp, according to their official website. It's more accurate to read this as a company with substantial depth to its platform that has now added a VSS layer on top.

Manufacturing VSS Case Study 3: Fogsphere Vision Agent Platform

The third company, Fogsphere, is a Computer Vision AI company based in London, UK. At Hannover Messe 2026, they announced a platform that runs NVIDIA Cosmos Reason 2 + Metropolis VSS Blueprint on ARM-based edges. Unlike the previous two companies, which aggregate processing to an in-facility server, Fogsphere's distinctive design strongly emphasizes distributing AI at the edge, close to the CCTV cameras themselves.

Distribute AI at the 'Edges' of the city. Run AI over CCTV even with unstable 4g connections.

The claim that AI can run on CCTV even in unstable 4G environments hints at an ambition to operate in harsh field conditions. The configuration uses a three-tier Edge-to-Fog-to-Cloud architecture to process data hierarchically.

The target industries span nine sectors: Retail / Chemical / Construction / Smart City / Logistic / Oil & Gas / Power Generation / Healthcare / Manufacturing. The Saipem adoption case announced at Hannover Messe sits closest to the Oil & Gas position among these, and is used to detect safety and environmental risk events from video.

detect and respond in real time to high-risk safety and environmental events

The ARM-based edge deployment aspect seems well-suited to the Jetson Orin Nano Super line covered in the DGX Spark series. The fact that "VSS deployment on ARM edges" has emerged in the form of a commercial product feels like evidence that the ecosystem has matured.

Manufacturing VSS Case Study 4: Pegatron PCB Assembly

The final case goes back slightly in time: it's the case of Taiwanese electronics manufacturer Pegatron. Introduced in an NVIDIA official blog post from May 2025 as a representative case study of the VSS Blueprint, it's being used for SOP (Standard Operating Procedure) analysis and employee training in PCB (printed circuit board) assembly processes.

https://blogs.nvidia.com/blog/ai-blueprint-video-search-and-summarization/

the agents have reduced Pegatron's labor costs by 7% and defect rates by 67%

It's noteworthy that the figures of 7% labor cost reduction and 67% defect rate reduction are written using the expression "the agents." Rather than VSS itself, it's the agents built on top of it that are described as generating the results. This reveals a structure where VSS acts as the "foundation for reading video" behind the scenes, while agents take on the decision-making role on the floor.

A 67% reduction in defect rate changes in meaning depending on the baseline. If it's a line with originally high defect rates, that's a reduction by an order of magnitude, but if it's already a low-defect line, the absolute value would be small. Since the source article doesn't state an absolute value, it's safe to treat this as a reference figure.

The same blog also introduces Siemens' Industrial Copilot for Operations (30% productivity improvement) and Linker Vision in Kaohsiung, Taiwan (VSS adoption for smart city use), demonstrating that VSS is spreading vertically not just into manufacturing but also into public domains.

What Becomes Visible When You Line Up the Four Companies

Lining up the four companies reveals several common trends.

First, VSS does not appear on the surface as a standalone product. Invisible AI, Tulip, and Fogsphere all use VSS internally within their own platforms as a "plug-in point for video understanding," while their outward branding is built around their own product names. NVIDIA stays behind the scenes as a reference architecture provider, while solution providers handle the part that interfaces with the field — a two-layer structure.

Second, Cosmos Reason 2 is becoming the de facto default VLM. While VSS 3.0 EA offered a choice of Cosmos-Reason2-8B / Cosmos-Reason1-7B / Qwen3-VL, all three companies that announced at Hannover Messe adopted Cosmos Reason 2. Even though Qwen3-VL was added in VSS 3.1.0, the trend of Cosmos Reason 2 as the recommended default seems set to continue for some time.

However, there's a possibility of another shift from here. NVIDIA released Nemotron 3 Nano Omni (30B-A3B, supporting 4 modalities: text / image / audio / video) at the end of April 2026.

https://dev.classmethod.jp/articles/dgx-spark-nemotron3-nano-omni-multimodal-launch-bench/

https://dev.classmethod.jp/articles/dgx-spark-nemotron3-nano-omni-japanese-multimodal-bench/

The latter directly compared three models including Cosmos Reason 2 on Heron-Bench and JMMMU, revealing clearly differentiated strengths per benchmark. A future where it joins the VLM candidates for VSS doesn't seem far off. Furthermore, if the next generation of Cosmos Reason 2 (whether it will be called Cosmos Reason 3 is unknown) comes out, the recommended default should be replaced at that point. Personally, I think what's most worth watching is how the VLM lineup for VSS will shift in six months.

Third, ARM-based edge deployment is proving to be an even stronger need than expected. The Fogsphere example is emblematic, but manufacturing floors, CCTV networks, and on-site servers are not necessarily built on the premise of x86 + dGPU. NVIDIA has been addressing this need with its ARM-based lineup of Jetson Thor, IGX Thor, and DGX Spark, and the official listing of DGX Spark as a Validated Platform in VSS Warehouse Blueprint 3.1.0 is part of that trend.

Finally, from the perspective of adoption in Japanese manufacturing, "building VSS from scratch" is still a heavy option. The choice comes down to either adopting a VSS-embedded platform provided by overseas solution vendors like these, or pursuing a custom implementation based on the VSS Blueprint. The former allows faster movement but is weak in Japanese language support and handling process-specific characteristics, while the latter takes more effort but can be fitted to your own business processes — a trade-off. The kind of workaround I tried in previous verification articles, such as swapping in a Japanese LLM, is one example of the latter approach.

Connecting with the DevelopersIO On-Site Reports

Alongside the four case studies, I'd like to recommend reading the reports from Classmethod employees who covered Hannover Messe 2026 on the ground. While there are no articles specifically discussing VSS, the content is continuous with the movements of these four companies, reflecting a firsthand sense of the current of manufacturing AI.

https://dev.classmethod.jp/articles/hm26-aws/

Hamada (Hamako)'s report on visiting the AWS booth (2026-04-28) summarizes the exhibit featuring Isaac Sim / Cosmos WFM / GR00T / Jetson Thor as "agents bundling everything from above while preserving existing on-site systems." VSS didn't appear to be at the AWS booth, but the statement that "the AWS × NVIDIA collaboration is becoming three-dimensionally richer" resonates with the solution provider deployment of VSS covered in this article.

https://dev.classmethod.jp/articles/hm26_industrial_ai_factory/

Another article covering the T-Systems × Siemens × NVIDIA initiative (2026-04-27), in the context of Industrial AI Cloud, summarizes it as moving "from isolated pilots to AI-driven, interconnected factories." VSS is one piece handling the video layer of the factory, and it is only once embedded in the layer structure of sovereign AI, Industrial AI, and physical AI like this that it begins to function as a business. The fact that VSS is not a product that sells on its own is a structure shared with the four case studies in this article.

What Changed Significantly in 3.1.0

From here, this is the verification section. Of the differences from 3.0.0 EA to 3.1.0 EA, I picked up six changes that are meaningful from the perspective of running on DGX Spark. This is based on observations of the contents of the compose package downloaded from NGC.

Minimal Profile Now Enabled by Default

In the previous article, I wrote about how launching 3.0.0 EA brought up 42 services simultaneously, putting pressure on the DGX Spark's 128 GB UMA. In 3.1.0, the default in warehouse/.env is MINIMAL_PROFILE="true".

warehouse/.env
MINIMAL_PROFILE="true" # Minimal profile

With this Minimal Profile, ELK (Elasticsearch + Kibana), the Video Analytics API/UI, and the monitoring layer are excluded, and only the minimum necessary services are brought up, so the official announcement mentions a deployment time of "under 5 minutes, 4x faster." This is a clear answer to the situation I described in the previous article as "42 services crammed into one machine, which is heavy," and it makes things more convenient for use cases like just wanting to try Agentic Search or only needing the raw data from Behavior Analytics.

VLM-as-Verifier Promoted to an Independent Service

In the previous article, I wrote about how I had to patch four code bugs in Alert Bridge as a pain point. In 3.1.0, the target has been promoted to an independent service as vlm-as-verifier/compose.yml.

deployments/
├── vlm-as-verifier/
│   ├── README.md
│   ├── compose.yml
│   └── scripts/
└── warehouse/
    ├── vlm-as-verifier/  # warehouse-specific settings
    └── ...

It is incorporated into the compose.yml includes as a base service, and there is also a dedicated settings directory warehouse/vlm-as-verifier/ on the Warehouse side. Whether the four patches I applied in the previous article are directly reflected can't be verified strictly without running the actual machine, but at the design level, things are moving in the direction of "consolidating alert verification into a single service."

Perception Models + Data Directory Bundled in app-data

This is personally the most welcome change. Back in March, even after starting the Perception container, it would error with Cannot access ONNX file '/opt/storage/rtdetr_warehouse_v1.0.fp16.onnx', and I had to separately download the NGC nvidia/tao/rtdetr_2d_warehouse and manually place it in $MDX_DATA_DIR/models/mtmc/.

In 3.1.0, downloading vss-warehouse-app-data:3.1.0 from NGC includes the following files from the start.

vss-warehouse-app-data/
├── models/
│   ├── mtmc/rtdetr_warehouse_v1.0.1.fp16.onnx
│   └── sparse4d/ov/sparse4d_warehouse_v2.1.onnx
├── data_log/
│   ├── elastic/
│   ├── kafka/
│   ├── redis/
│   └── vss_video_analytics_api/
└── videos/
    ├── nv-warehouse-4cams/
    ├── warehouse-4cams-20mx20m-synthetic/
    └── warehouse-loading-dock-3cams-synthetic/

RT-DETR has been updated from v1.0 to v1.0.1, and Sparse4D has advanced from v2.0 to v2.1. Furthermore, the data directories that I used to create manually with mkdir -p data_log/{elastic,kafka,redis,...} in the previous article are now included as well. Simply pointing MDX_DATA_DIR to this directory is all that's needed to be ready to start, and the friction of the initial setup has been reduced by about half in terms of feel.

DGX Spark-Specific hw env Partially Added

In 3.0.0 EA, no hw-DGX-SPARK.env was provided for NIM hw environment files, and I had to manually create one by adapting hw-DGX-THOR.env. Checking 3.1.0, the support status is as follows.

NIM DGX Spark hw env Status
cosmos-reason2-8b Present Officially supported
nvidia-nemotron-nano-9b-v2-fp8 Present Officially supported (see below, uses vLLM internally)
nvidia-nemotron-nano-9b-v2 (BF16) Absent March pain point continues
cosmos-reason1-7b Absent Not supported on DGX Spark
nemotron-3-nano Absent Not supported on DGX Spark
qwen3-vl-8b-instruct Absent H100 series only
llama-3.3-nemotron-super-49b-v1.5 Absent H100 series only
gpt-oss-20b Absent H100 series only

With the combination of Cosmos Reason 2 8B (VLM) and Nemotron Nano 9B v2 FP8 (LLM), the March pain points have been addressed. On the other hand, BF16 Nemotron and large models still cannot run on DGX Spark's local NIM. The following note remains in the Known Limitations:

L4, RTX 6000 ADA, IGX-THOR and DGX-SPARK local NIMs are not supported

It's not that "DGX Spark now fully supports Local NIM," but rather that "a specific combination of FP8 + Cosmos Reason 2 has been addressed." The "alternative using NGC vLLM" adopted in the previous article remains a valid option.

FP8 LLM Internally Uses Official vLLM Container

Looking inside the compose for Nemotron Nano 9B v2 FP8, for which a DGX Spark hw env was prepared, there's another interesting discovery.

nim/nvidia-nemotron-nano-9b-v2-fp8/compose.yml
services:
  nvidia-nemotron-nano-9b-v2-fp8:
    image: nvcr.io/nvidia/vllm:25.12.post1-py3
    command:
      - python3
      - -m
      - vllm.entrypoints.openai.api_server
      - --model
      - nvidia/NVIDIA-Nemotron-Nano-9B-v2-FP8

Rather than a NIM container, it uses the NGC vLLM container. The workaround I described in the previous article — "LLM NIM returns exec format error on ARM64, so use NGC vLLM as an alternative" — has essentially been promoted to the official method. It's been built out to the point of fetching nemotron_toolcall_parser_no_streaming.py from the Hugging Face nvidia/NVIDIA-Nemotron-Nano-9B-v2 repository for the tool calling parser and bundling it in.

The decision to prioritize quickly delivering results based on vLLM rather than fully preparing an ARM64 NIM is transparent. Personally, this change was the most satisfying to see.

NIM Lineup Expanded

Finally, here is a summary of the NIM options added since 3.0.0 EA.

LLM option DGX Spark env
nvidia/nvidia-nemotron-nano-9b-v2 (existing) Absent
nvidia/nvidia-nemotron-nano-9b-v2-fp8 (new) Present
nvidia/nemotron-3-nano (new) Absent
nvidia/llama-3.3-nemotron-super-49b-v1.5 (new) Absent
openai/gpt-oss-20b (new) Absent
VLM option DGX Spark env
nvidia/cosmos-reason2-8b (existing) Present
nvidia/cosmos-reason1-7b (existing) Absent
Qwen/Qwen3-VL-8B-Instruct (new) Absent

If running on DGX Spark alone, the realistic choice is between Cosmos Reason 2 and Nemotron Nano FP8. However, by switching to a Remote NIM configuration (LLM_MODE=remote calling build.nvidia.com), it's also possible to incorporate Llama 3.3 Nemotron Super 49B or gpt-oss-20b via the cloud. A hybrid configuration of "DGX Spark for front-end analysis, cloud NIM for heavier inference" seems likely to become an option heading toward GA.

Summary

VSS 3.0.0 EA back in March was in a state of "architecturally flashy, but a continuous sequence of manual work to get running on DGX Spark." Catching up with 3.1.0 EA in May, here's my impression.

  • Setup is visibly improved: Minimal Profile is default, Perception models and sample videos are bundled in app-data, and pre-creating data directories is no longer necessary
  • The biggest change from the DGX Spark series perspective is official support for the Cosmos Reason 2 + Nemotron Nano FP8 combination: the exec format error pain from March can be avoided by taking this path
  • However, Local NIM for BF16 Nemotron and large models remains unsupported, and Remote NIM or self-managed vLLM remains the practical solution
  • In terms of features as well, there's a sense of one generational step forward between EAs, with VLM-as-Verifier becoming an independent service, Agentic Search in Alpha, and Multi-turn Agent Conversation improvements

And what the case study section revealed is the fact that VSS is beginning to permeate the world's manufacturing industry as "a video understanding engine embedded inside solution providers' platforms." Invisible AI, Tulip, Fogsphere, and Pegatron all run VSS behind the scenes of their own products, with the solution vendor side holding the touchpoint with the field. For Japanese manufacturing adoption, the choice comes down to either adopting such a built-in platform or pursuing a custom implementation for your own organization, and the Japanese LLM swap-in I tried in the previous article corresponds to the latter approach. The realistic route seems to be swapping in Nemotron 9B-v2-Japanese via vLLM into the DGX Spark + FP8 + Cosmos Reason 2 path shown by 3.1.0.

Whether to wait for GA depends on the use case. For safety monitoring or PoC-level use, 3.1.0 EA is already at a stage where it's usable enough. On the other hand, for production use, the security immaturity of EA (no TLS, no authentication, no rate limiting) is a concern, so waiting for GA is also a reasonable call. I'd like to follow up in a separate opportunity on the actual machine regarding how the four Alert Bridge patches from the March article were absorbed by the VLM-as-Verifier independent service, and whether Minimal Profile really comes up in 5 minutes.


生成AI活用はクラスメソッドにお任せ

過去に支援してきた生成AIの支援実績100+を元にホワイトペーパーを作成しました。御社が抱えている課題のうち、どれが解決できて、どのようなサービスが受けられるのか?4つのフェーズに分けてまとめています。どうぞお気軽にご覧ください。

生成AI資料イメージ

無料でダウンロードする

Share this article