Recent developments in design systems that support AI agents: Meta Astryx, v0, Serendie, SmartHR

This article surveys efforts to get AI agents to use design systems correctly, covering Meta Astryx, v0 Design Systems 2.0, Serendie, and SmartHR Design System. It organizes the differences among four approaches: evaluation, ingestion, reference, and packaging guidance as skills.

佐藤雅史

2026.07.04


Introduction
Since the beginning of this year, several companies have published or expanded mechanisms for getting AI agents to use design systems correctly.
June 2026: Meta published Astryx as OSS.
June 2026: v0's Design Systems 2.0 renewed the design system library and improved generation quality.
May 2026: SmartHR Design System added new plugins for Claude Code / Cursor.
February 2026 onwards: Mitsubishi Electric's Serendie Design System published an MCP Server, then expanded its AI agent-oriented mechanisms with Agent Skills and a Figma Plugin.
This article investigates what each of these mechanisms can do and what it aims for, and organizes the differences in their approaches.

Meta Astryx
Astryx is a React + StyleX-based design system that Meta published as OSS in June 2026.
https://astryx.atmeta.com/blog/introducing-astryx

https://github.com/facebook/astryx
It is described as the externally released version of a design system that was developed internally over 8 years and has supported more than 13,000 apps within Meta.
It comes with over 150 accessible components, 7 themes, ready-to-ship templates, and a CLI.
The CLI is what stands out in its AI support. Running astryx manifest --json outputs the CLI's commands, arguments, flags, and response types as JSON. The CLI is designed not only for human operation but also as an interface through which AI agents can discover commands and options.

vibe-tests: a harness for measuring generation quality
What caught my attention most in Astryx is vibe-tests, a harness that evaluates in a structured way how accurately LLMs can generate UI code with the design system.
internal/vibe-tests/README.md describes it as a harness that sends the same prompt to different design system configurations and makes the results measurable and comparable. Three targets are compared: Astryx, a shadcn/ui baseline, and plain HTML + inline CSS.
It also states its principles explicitly: keep evaluation logic equivalent across targets, never disclose expectedComponents to agents because they are evaluation-only, and always launch sub-agents fresh with no prior context. Agents are not told the answers; they must discover them from the documentation and tools each system provides.
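To make these principles concrete, here is a minimal sketch of how a harness could apply them. This is my own illustration, not the code in internal/vibe-tests: the TestCase and AgentInput types and the toAgentInput and planRuns functions are hypothetical names. Only expectedComponents and the three comparison targets come from the README as described above.

```typescript
// Hypothetical types for illustration; not the actual vibe-tests schema.
type Target = "astryx" | "shadcn-baseline" | "plain-html";

interface TestCase {
  id: string;
  prompt: string;
  // Evaluation-only: must never be shown to the agent.
  expectedComponents: string[];
}

interface AgentInput {
  target: Target;
  prompt: string;
  // Always empty: every sub-agent starts fresh, with no prior context.
  history: string[];
}

const TARGETS: Target[] = ["astryx", "shadcn-baseline", "plain-html"];

// Build what the agent is allowed to see, dropping evaluation-only fields.
function toAgentInput(testCase: TestCase, target: Target): AgentInput {
  return { target, prompt: testCase.prompt, history: [] };
}

// One run per target, all with the same prompt so results are comparable.
function planRuns(testCase: TestCase): AgentInput[] {
  return TARGETS.map((target) => toAgentInput(testCase, target));
}
```

Building the agent input as a separate type, rather than passing the whole test case, is one simple way to make it hard to leak expectedComponents by accident.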

5 dimensions of static evaluation
Static evaluation (src/universal-eval.ts) scores across 5 dimensions, each on a scale of 0 to 100.
Correctness: TypeScript errors, non-existent props, misuse of DOM events, etc.
Accessibility: icon-only buttons without aria-label, inputs without labels, onClick on non-interactive elements, missing alt attributes, etc.
Code Quality: nesting depth, branch count, overly long functions, use of any, console.log, missing keys in map, etc.
Efficiency: over-decoration, boilerplate, duplication, number of styling decisions per element, etc.
Maintainability: hardcoded colors, spacing, and typography; semantic token ratio; locality of state; etc.
In the deduction logic for Correctness, each TypeScript error deducts 15 points. Phantom props are also detected, such as onPress, which does not fire in React DOM. onPress is treated as critical (15 points), while issues like onChange on a button element are moderate (8 points).
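The deduction scheme can be sketched as follows. This is a simplified illustration and not the contents of src/universal-eval.ts: the Finding type and function names are hypothetical, and starting from 100 and clamping at 0 are my assumptions based on the 0 to 100 scale. The point values (15, 15, 8) are the ones quoted above.

```typescript
// Hypothetical finding shape; the real evaluator's types are not shown in this article.
type Finding =
  | { kind: "ts-error"; message: string }
  | { kind: "prop-issue"; severity: "critical" | "moderate"; detail: string };

// Points deducted per finding, using the values quoted in the article.
function deductionFor(finding: Finding): number {
  if (finding.kind === "ts-error") return 15;
  return finding.severity === "critical" ? 15 : 8;
}

// Assumption: the score starts at 100 and never drops below 0.
function correctnessScore(findings: Finding[]): number {
  const total = findings.reduce((sum, f) => sum + deductionFor(f), 0);
  return Math.max(0, 100 - total);
}

const exampleFindings: Finding[] = [
  { kind: "ts-error", message: "Property 'foo' does not exist" },
  { kind: "ts-error", message: "Type 'string' is not assignable to type 'number'" },
  { kind: "prop-issue", severity: "critical", detail: "onPress on a button" },
  { kind: "prop-issue", severity: "moderate", detail: "onChange on a button" },
];

// 100 - (15 + 15 + 15 + 8) = 47
const exampleScore = correctnessScore(exampleFindings);
```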

design-judge: scoring visual appearance with a Vision LLM
Design quality that static analysis cannot measure is handled by design-judge (src/design-judge.ts). It takes a screenshot of the generated output, compares it with an ideal-state PNG placed in ideals/, and scores visual fidelity using a Vision LLM.
The reference images are PNGs created from Figma or HTML, and they must not be outputs generated by any of the compared targets (Astryx, the baseline, or plain HTML). ideals/README.md also states that images should be 1920×1200 PNGs and that at least one designer should review them. Currently, 55 prompt IDs and 62 images in total are prepared.
Scores are not decided from a single run: evaluation normally runs 3 times and the median is adopted.
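Taking the median of three runs keeps one unusually high or low judgment from deciding the result. A minimal sketch of that aggregation step, assuming the three scores have already been collected (the median function and the sample scores are mine, not taken from design-judge):

```typescript
// Median of a non-empty list of scores.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Made-up scores from three judge runs on the same screenshot.
const runScores = [72, 90, 85];
const finalScore = median(runScores); // 85
```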

degradation mode: measuring whether output degrades mid-way
Degradation mode measures whether an AI forgets design system constraints as a conversation grows longer and drifts toward its own styles. It has the model produce output over a 10-turn conversation, interspersing unrelated questions such as "How do I center a div?" along the way. The output is recorded at turns 0, 6, and 10.
The goal is to confirm whether design system patterns continue to be used throughout iterative development. Notably, the evaluation covers not just single-generation quality but also issues akin to instruction persistence and context rot as conversations continue.
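A conversation plan of this kind could be sketched as below. Only the checkpoint turns (0, 6, 10) come from the description above. I am assuming that turn 0 is the initial generation followed by ten follow-up turns, and which turns carry unrelated questions is left to the caller because I have not confirmed where vibe-tests places them. The Turn type and buildPlan function are hypothetical.

```typescript
// Turn indices at which the output is recorded (from the article).
const CHECKPOINT_TURNS: number[] = [0, 6, 10];
// Assumption: turn 0 is the initial generation, turns 1 to 10 are follow-ups.
const LAST_TURN = 10;

interface Turn {
  index: number;
  kind: "task" | "distractor";
  prompt: string;
  record: boolean; // true when the output at this turn is saved for scoring
}

// distractorTurns is an assumption supplied by the caller.
function buildPlan(
  taskPrompt: (turn: number) => string,
  distractorTurns: number[],
  distractorPrompt: string
): Turn[] {
  const plan: Turn[] = [];
  for (let index = 0; index <= LAST_TURN; index++) {
    const isDistractor = distractorTurns.indexOf(index) !== -1;
    plan.push({
      index,
      kind: isDistractor ? "distractor" : "task",
      prompt: isDistractor ? distractorPrompt : taskPrompt(index),
      record: CHECKPOINT_TURNS.indexOf(index) !== -1,
    });
  }
  return plan;
}

const plan = buildPlan(
  (turn) => "Iteration " + turn + ": refine the settings page",
  [3, 8],
  "How do I center a div?"
);
```

Comparing the scores recorded at the three checkpoints then shows whether quality holds or drifts as the conversation grows.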

v0 Design Systems 2.0
v0's Design Systems 2.0 takes the approach of teaching the generation tool itself your company's design system.
https://v0.app/docs/design-systems-2
Once you register your design system with v0, subsequent chats will generate UIs that conform to your company's components, tokens, and conventions. What is registered is saved to the workspace as a skill (a set of instructions that v0 retains).
A wide range of import sources is supported: GitHub repositories, Figma frames, documentation links such as Storybook, and attached files, along with environment variables for private npm.
The registration flow includes a validation step. After v0 reads the source, it builds a starter app once to check that its understanding is correct, and only after the user reviews and approves the result is it saved as a skill. Validating before production generation makes the result more reliable.
Grounding is stated explicitly as policy: if a component, prop, or token cannot be verified from the source, v0 should not use it. This keeps v0 from fabricating things that do not exist.
v0's design system support has been improved step by step: GitHub import was added on February 25, 2026, direct linking with team skills on March 16, and the renewed design system library and improved generation quality on June 26.
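The grounding rule can be expressed as a simple check. Note that this is not how v0 implements it: v0 applies grounding as a policy at generation time, and I have not seen its internals. The sketch below only shows the same rule as a check over a hypothetical registry of verified names, and the VerifiedSource and Usage types, the findUngrounded function, and the sample data are all made up for illustration.

```typescript
// Hypothetical registry of what could be verified from the imported source.
interface VerifiedSource {
  components: { [name: string]: string[] }; // component name -> verified prop names
  tokens: string[];
}

// Hypothetical description of what a piece of generated code uses.
interface Usage {
  component: string;
  props: string[];
  tokens: string[];
}

// Return everything in `usage` that cannot be verified from `source`.
function findUngrounded(usage: Usage, source: VerifiedSource): string[] {
  const problems: string[] = [];
  const known = Object.prototype.hasOwnProperty.call(source.components, usage.component);
  if (!known) {
    problems.push("component:" + usage.component);
  } else {
    const verifiedProps = source.components[usage.component];
    for (const prop of usage.props) {
      if (verifiedProps.indexOf(prop) === -1) {
        problems.push("prop:" + usage.component + "." + prop);
      }
    }
  }
  for (const token of usage.tokens) {
    if (source.tokens.indexOf(token) === -1) problems.push("token:" + token);
  }
  return problems;
}

const verified: VerifiedSource = {
  components: { Button: ["variant", "onClick"] },
  tokens: ["color.primary"],
};

// onPress and color.brand are not in the verified source, so both are flagged.
const ungrounded = findUngrounded(
  { component: "Button", props: ["variant", "onPress"], tokens: ["color.primary", "color.brand"] },
  verified
);
```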

Mitsubishi Electric Serendie
Serendie is Mitsubishi Electric's design system.
https://serendie.design/ai/
Published in November 2024, it provides various packages on GitHub such as React components, as well as the Serendie UI Kit published on Figma Community.
AI support consists of the following:
Remote MCP server
Agent Skills (serendie-overview)
Figma Plugin
Additionally, llms.txt (both a simplified version and a detailed llms-full.txt) is also provided.
The Serendie MCP server can be used simply by setting https://serendie.design/mcp as the endpoint. There are 4 MCP tools, which allow you to retrieve a list of Serendie UI components and their properties, design tokens, Serendie Symbols, and various guidelines.
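For reference, MCP clients talk to a server with JSON-RPC 2.0 messages, and an agent discovers the available tools with a tools/list request after an initialize handshake. The sketch below only builds those message bodies and does not send anything over the network. The endpoint URL is the one above, while the client name, client version, and protocol version string are placeholders I chose. I have deliberately not written any Serendie tool names, since I only know that there are 4 of them.

```typescript
// Endpoint from the article.
const SERENDIE_MCP_ENDPOINT = "https://serendie.design/mcp";

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: object;
}

// First message of an MCP session.
function initializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-06-18", // placeholder; use a version your client supports
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.1" }, // placeholder values
    },
  };
}

// Asks the server which tools it offers.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Between these two requests a client also sends a notifications/initialized
// notification, omitted here for brevity.
const session = {
  endpoint: SERENDIE_MCP_ENDPOINT,
  messages: [initializeRequest(1), listToolsRequest(2)],
};
```

In practice a coding agent's MCP client does all of this for you once the endpoint is configured.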

Configuration examples and setup instructions for the MCP server and Agent Skills are prepared for major coding agents including Claude Code, Codex, and Cursor.
For Claude Code, the plugin bundles both MCP and Skills, and can be installed with the following commands:
/plugin marketplace add serendie/serendie
/plugin install serendie@serendie

SmartHR Design System
SmartHR Design System is SmartHR's design system.
https://smarthr.design/

https://github.com/kufu/smarthr-design-system/tree/main
The implementation foundation is the React library smarthr-ui.
Since May 2026, plugins for Claude Code / Cursor / Codex have been developed, and currently two types of Agent Skills are provided: component-guidelines and design-pattern-guidelines. Rather than running an MCP server, it distributes guidance on how to use components and design patterns as skills.
The two types of skills are as follows:
component-guidelines: A guide for correctly using each component in smarthr-ui. There are 104 component guides, each consisting of Props and type information, Do/Don'ts derived from eslint-plugin-smarthr, and checklists.
design-pattern-guidelines: A guide for correctly implementing page layouts and UI patterns in SmartHR products. 22 pattern guides are provided.
Individual guides in component-guidelines are auto-generated based on metadata, eslint-plugin-smarthr rule READMEs, and checklists. When AI-oriented guides are maintained separately by hand, they tend to fall out of date as updates fail to keep up, but this setup makes such divergence less likely to occur.
In Claude Code, the plugin can be installed with the following commands:
/plugin marketplace add kufu/smarthr-design-system
/plugin install smarthr-design-system@smarthr-design-system
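The idea of auto-generating component guides from existing sources, described above, can be sketched as follows. This is not SmartHR's generator: the ComponentSources type, the renderGuide function, the output layout, and the sample data are hypothetical. The point is only that when a guide is rendered from metadata, lint rule notes, and checklists, updating those sources updates the guide.

```typescript
// Hypothetical inputs; the real generator's data shapes are not described in this article.
interface ComponentSources {
  name: string;
  props: { name: string; type: string }[]; // from component metadata
  lintRuleNotes: string[];                 // from eslint-plugin-smarthr rule READMEs
  checklist: string[];
}

// Render one guide as plain text from the three sources.
function renderGuide(src: ComponentSources): string {
  const lines: string[] = [];
  lines.push("# " + src.name);
  lines.push("## Props");
  for (const p of src.props) lines.push("- " + p.name + ": " + p.type);
  lines.push("## Do / Don't");
  for (const note of src.lintRuleNotes) lines.push("- " + note);
  lines.push("## Checklist");
  for (const item of src.checklist) lines.push("- [ ] " + item);
  return lines.join("\n");
}

// Made-up sample data for illustration only.
const sampleGuide = renderGuide({
  name: "ExampleButton",
  props: [{ name: "size", type: "string" }],
  lintRuleNotes: ["Give icon-only buttons an accessible name"],
  checklist: ["Label describes the action"],
});
```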

Organizing the 4 approaches
Even within the same effort of "handing a design system to an AI," the emphasis differs across approaches: evaluate, ingest, reference, or turn into guides.


Target	Approach	Deliverables	Characteristics
Meta Astryx	Evaluation / Verification	Design system + evaluation harness (vibe-tests)	Scores agent outputs with static evaluation and Vision LLM, measuring whether output degrades mid-way
v0 Design Systems 2.0	Ingestion / Grounding	Skill-ification on the generation tool side	Feeds in company sources and prevents use of components, props, and tokens that cannot be verified
Mitsubishi Electric Serendie	Reference interface	Remote MCP + Agent Skills + llms.txt + Figma Plugin	Makes components, tokens, icons, and guidelines retrievable from agents and Figma
SmartHR Design System	Skill-ification of implementation guides	Agent Skills / plugins	Provides guidance on using smarthr-ui components and UI patterns for Claude Code / Cursor

Serendie and SmartHR are oriented toward preparing entry points for agents to access design system information. While Serendie provides a wide range of reference channels encompassing MCP, llms.txt, and a Figma Plugin, SmartHR organizes guidance on how to use smarthr-ui components and UI patterns as Agent Skills.
Astryx, on the other hand, goes beyond getting the design system used and into measuring whether it was used correctly. Its distinguishing feature is an evaluation mechanism that sits beyond the stages of passing information to agents and ingesting it into generation tools.
It is also interesting that v0 and Astryx both grapple with hallucination but by different methods. v0 uses grounding to keep unverifiable components, props, and tokens from being used at generation time, while Astryx detects non-existent props through deductions at evaluation time. Suppressing on the generation side and measuring on the evaluation side are two different layers of approach to the same problem.

Summary
In this investigation of four initiatives, I organized the approaches to handing design systems to AI agents. Building blocks such as MCP, Agent Skills, and evaluation harnesses have appeared in rapid succession, and combining agents with design systems is becoming a realistic option.
With directions such as evaluating, ingesting, referencing, and turning into guides now available, you can choose an approach based on what you want to solve first when handing your design system to an AI.
