A summary of recent developments in making design systems AI-agent compatible - Meta Astryx, v0, Serendie, SmartHR

This article looks at efforts to make AI agents use design systems correctly, covering Meta Astryx, v0 Design Systems 2.0, Serendie, and SmartHR Design System. It organizes the differences among four approaches: evaluation, ingestion, reference, and skill creation.
2026.07.04

Introduction

Since the beginning of this year, several companies have published or expanded mechanisms that help AI agents use their design systems correctly.

  • June 2026: Meta published Astryx as OSS.
  • June 2026: v0's Design Systems 2.0 announced a renewed design system library and improved generation quality.
  • May 2026: SmartHR Design System added new plugins for Claude Code / Cursor.
  • February 2026 onward: Mitsubishi Electric's Serendie Design System published an MCP server, then expanded its agent-facing mechanisms with Agent Skills and a Figma Plugin.

This article investigates what each of these mechanisms can do and what it aims to achieve, and organizes the differences in approach.

Meta Astryx

Astryx is a design system based on React + StyleX that Meta published as OSS in June 2026.

https://astryx.atmeta.com/blog/introducing-astryx
https://github.com/facebook/astryx

It is described as an externally released version of a design system that has been developed inside Meta for 8 years and has supported more than 13,000 apps there.

It ships with more than 150 accessible components, 7 themes, ready-to-ship templates, and a CLI.

The distinctive part of its AI support is the CLI. It provides a JSON API and a Capability Manifest: running astryx manifest --json outputs instructions for using the CLI as JSON, which is easy for AI agents to read. Notably, the CLI that humans use and the API that agents read are placed on the same surface.

vibe-tests — A harness for measuring generation quality

What I personally found most noteworthy in Astryx is vibe-tests, a harness that evaluates in a structured way how correctly an LLM can generate UI code with a design system.

According to internal/vibe-tests/README.md, it sends the same prompt to different design system configurations so that the results can be measured and compared. There are three comparison targets: Astryx, a shadcn/ui baseline, and plain HTML + inline CSS.

The README also states explicit principles: keep the evaluation logic equivalent across all targets; do not disclose expectedComponents to the agent, because it exists for evaluation only; and launch each sub-agent fresh, without prior context. The idea is not to tell the agent the answers but to have it discover them from the documentation and tools each system provides.
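The "evaluation only" principle can be illustrated with a minimal sketch. Only the field name expectedComponents comes from the README as described above; VibeTestCase, AgentInput, toAgentInput, and the sample prompt are hypothetical names for illustration, not the actual vibe-tests code.

```typescript
// Hypothetical test-case shape; only `expectedComponents` is named in the source.
interface VibeTestCase {
  id: string;
  prompt: string;
  // Evaluation-only data: must never be shown to the agent.
  expectedComponents: string[];
}

// What the agent is allowed to see (assumed shape).
interface AgentInput {
  prompt: string;
}

// Build the agent input by copying only the allowed fields,
// so evaluation-only data cannot leak by accident.
function toAgentInput(testCase: VibeTestCase): AgentInput {
  return { prompt: testCase.prompt };
}

// Made-up sample data.
const sampleCase: VibeTestCase = {
  id: "settings-form",
  prompt: "Build a settings form with a save button.",
  expectedComponents: ["Button", "TextInput"],
};

const agentInput = toAgentInput(sampleCase);
console.log(JSON.stringify(agentInput));
```

Copying an explicit allow-list of fields, rather than deleting the forbidden one, keeps any evaluation-only field added later hidden by default.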

5 dimensions of static evaluation

Static evaluation (src/universal-eval.ts) scores across 5 dimensions, each from 0 to 100 points.

  • Correctness: TypeScript errors, non-existent props, misuse of DOM events, etc.
  • Accessibility: icon-only buttons without aria-label, inputs without labels, onClick on non-interactive elements, missing alt attributes, etc.
  • Code Quality: nesting, branch count, overly long functions, any, console.log, missing keys in map, etc.
  • Efficiency: excessive decoration, boilerplate, duplication, number of styling decisions per element, etc.
  • Maintainability: hardcoded colors, spacing, typography, semantic token ratio, locality of state, etc.

In the deduction logic for Correctness, each TypeScript error costs 15 points. Phantom props such as onPress, which never fire in React DOM, are also detected: onPress is treated as critical (-15), while onChange on a button element is treated as moderate (-8).
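As a rough worked example of this deduction scheme, the sketch below applies the three point values mentioned above. The type names, the idea that every phantom prop costs the same as onPress, and clamping the result at 0 are assumptions of this sketch, not the logic in src/universal-eval.ts.

```typescript
// Issue kinds for this sketch (hypothetical names).
type CorrectnessIssue =
  | { kind: "typescript-error" }
  | { kind: "phantom-prop"; prop: string } // e.g. onPress in React DOM
  | { kind: "invalid-dom-event"; element: string; event: string }; // e.g. onChange on a button

// Point values taken from the article; the mapping itself is an assumption.
const DEDUCTIONS: Record<CorrectnessIssue["kind"], number> = {
  "typescript-error": 15,
  "phantom-prop": 15, // critical
  "invalid-dom-event": 8, // moderate
};

// Start from 100, subtract each deduction, and clamp at 0 (assumed lower bound).
function scoreCorrectness(issues: CorrectnessIssue[]): number {
  const total = issues.reduce((sum, issue) => sum + DEDUCTIONS[issue.kind], 0);
  return Math.max(0, 100 - total);
}

const exampleIssues: CorrectnessIssue[] = [
  { kind: "typescript-error" },
  { kind: "phantom-prop", prop: "onPress" },
  { kind: "invalid-dom-event", element: "button", event: "onChange" },
];

console.log(scoreCorrectness(exampleIssues)); // 100 - 15 - 15 - 8 = 62
```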

design-judge — Scoring appearance with a Vision LLM

Design quality that cannot be measured by static analysis is handled by design-judge in src/design-judge.ts. It compares a screenshot of the generated output with a PNG of the ideal state placed in ideals/, and scores visual fidelity using a Vision LLM.

The reference images are PNGs based on Figma and HTML, and they must not be output generated by any of the comparison targets (Astryx, the baseline, or plain HTML). ideals/README.md also states that images should be 1920×1200 PNGs and that at least one designer should review them. Currently, 55 prompt IDs and a total of 62 images are prepared.

A score is not decided by a single run; normally the judge is run 3 times and the median is adopted.
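Taking the median of several runs can be sketched as follows. The function name and the sample scores are made up for illustration; the point is only that one outlier run does not move the final score.

```typescript
// Median of a non-empty list of scores.
function median(scores: number[]): number {
  if (scores.length === 0) {
    throw new Error("median requires at least one score");
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = scores.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Three hypothetical judge scores for the same screenshot.
const runScores = [72, 90, 75];
console.log(median(runScores)); // 75
```

With the sample values, a mean would be pulled up to 79 by the single high run, while the median stays at 75.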

degradation mode — Measuring whether quality holds up midway

Degradation mode measures whether the AI forgets the design system's constraints and drifts toward writing its own styles as the conversation grows longer. The agent produces output over a 10-turn conversation, with unrelated questions such as "How do I center a div?" interspersed along the way. Output is recorded at turns 0, 6, and 10.

The aim is to confirm that design system patterns are used consistently throughout iterative development. Distinctively, the evaluation covers not only the quality of one-shot generation but also problems closer to instruction persistence and context rot that appear as a conversation continues.
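The shape of such a run can be sketched as a loop that records output only at the checkpoint turns. Everything here is an assumption for illustration: the Turn and Agent types, the positions of the distractor turns, treating turn 0 as the initial generation, and the stateless stub agent. A real harness would keep conversation state and call an LLM.

```typescript
// One conversational turn: a real task step or an unrelated distractor.
interface Turn {
  index: number;
  message: string;
  distractor: boolean;
}

// Checkpoint turns named in the article.
const CHECKPOINTS = [0, 6, 10];

// Hypothetical agent: takes a message and returns generated UI code.
type Agent = (message: string) => string;

// Run every turn in order, keeping a snapshot only at checkpoint turns.
function runDegradation(agent: Agent, turns: Turn[]): Record<number, string> {
  const snapshots: Record<number, string> = {};
  for (const turn of turns) {
    const output = agent(turn.message);
    if (CHECKPOINTS.indexOf(turn.index) !== -1) {
      snapshots[turn.index] = output;
    }
  }
  return snapshots;
}

// Build turns 0..10 with distractors at assumed positions 3 and 8.
const turns: Turn[] = [];
for (let i = 0; i <= 10; i++) {
  const distractor = i === 3 || i === 8;
  turns.push({
    index: i,
    message: distractor ? "How do I center a div?" : `Task step ${i}`,
    distractor,
  });
}

// Stub agent that just echoes the message it received.
const stubAgent: Agent = (message) => `<!-- output for: ${message} -->`;

const snapshots = runDegradation(stubAgent, turns);
console.log(Object.keys(snapshots)); // [ '0', '6', '10' ]
```

Each snapshot could then be scored with the same evaluation used for one-shot generation, so the scores at turns 0, 6, and 10 are directly comparable.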

v0 Design Systems 2.0

v0's Design Systems 2.0 is an approach that teaches the generation tool about your own design system.

https://v0.app/docs/design-systems-2

Once you register your design system with v0, subsequent chats generate UI that conforms to your components, tokens, and conventions. Registered content is saved to the workspace as skills (instruction sets held by v0).

A wide range of import sources is supported: GitHub repositories, Figma frames, documentation links such as Storybook, attached files, and environment variables for private npm.

The registration flow includes a verification step. After reading the source, v0 builds a starter app once to check that its understanding is correct, and the result is saved as a skill only after the user reviews and approves it. Rather than jumping straight into production generation, the flow increases certainty by verifying first.

The explicitly stated policy is grounding: if a component, prop, or token cannot be verified from the source, v0 is not supposed to use it. This is a mechanism to prevent fabricating things that do not exist.
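The idea behind grounding can be sketched as a check against a registry of what was verified from the source. This is not how v0 implements it; the VerifiedRegistry shape, the findUngrounded function, and the sample component and prop names are all hypothetical.

```typescript
// Component name -> prop names that could be verified from the source (assumed shape).
type VerifiedRegistry = Record<string, string[]>;

interface ComponentUsage {
  component: string;
  props: string[];
}

// Return every name in the usage that cannot be grounded in the registry.
function findUngrounded(usage: ComponentUsage, registry: VerifiedRegistry): string[] {
  // An unknown component makes the whole usage ungrounded.
  if (!Object.prototype.hasOwnProperty.call(registry, usage.component)) {
    return [usage.component];
  }
  const verifiedProps = registry[usage.component];
  return usage.props
    .filter((prop) => verifiedProps.indexOf(prop) === -1)
    .map((prop) => `${usage.component}.${prop}`);
}

// Made-up registry for illustration.
const registry: VerifiedRegistry = {
  Button: ["variant", "onClick"],
};

console.log(findUngrounded({ component: "Button", props: ["variant", "onPress"] }, registry));
// [ 'Button.onPress' ]
console.log(findUngrounded({ component: "Carousel", props: [] }, registry));
// [ 'Carousel' ]
```

A generator that follows the grounding policy would drop or replace anything this check reports instead of emitting it.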

v0's design system support has been improved in stages: import from GitHub on February 25, 2026, direct linking with team skills on March 16, and a renewed design system library with improved generation quality on June 26.

Mitsubishi Electric Serendie

Serendie is Mitsubishi Electric's design system.

https://serendie.design/ai/

Published in November 2024, it provides packages such as React components on GitHub, as well as resources such as the Serendie UI Kit on Figma Community.

AI support consists of the following.

  1. Remote MCP server
  2. Agent Skills (serendie-overview)
  3. Figma Plugin

In addition, llms.txt is provided, along with a more detailed llms-full.txt.

The Serendie MCP server can be used simply by setting https://serendie.design/mcp as the endpoint. It exposes 4 MCP tools, which retrieve the list of Serendie UI components and their properties, design tokens, Serendie Symbols, and various guidelines.
Configuration examples and setup instructions are provided for major coding agents such as Claude Code, Codex, and Cursor, covering both the MCP server and Agent Skills.

For Claude Code, the MCP server and Skills are bundled into a plugin, which can be installed with the following commands.

/plugin marketplace add serendie/serendie
/plugin install serendie@serendie

SmartHR Design System

SmartHR Design System is SmartHR's design system.

https://smarthr.design/
https://github.com/kufu/smarthr-design-system/tree/main

The implementation foundation is the React library smarthr-ui.

Since May 2026, plugins for Claude Code / Cursor / Codex have been available, and two Agent Skills are currently provided: component-guidelines and design-pattern-guidelines. Instead of an MCP server, this setup distributes guidance on using components and design patterns as skills.

The two types of skills are as follows.

  • component-guidelines: A guide for correctly using each component of smarthr-ui. There are 104 component guides, composed of Props and type information, Do/Don't derived from eslint-plugin-smarthr, and checklists.
  • design-pattern-guidelines: A guide for correctly implementing page layouts and UI patterns of SmartHR products. 22 pattern guides are prepared.

The individual guides in component-guidelines are generated automatically from metadata, the rule READMEs of eslint-plugin-smarthr, and checklists. Guides for AI that are maintained separately by hand tend to fall behind and become outdated; generating them from these sources makes that divergence less likely.
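The general pattern of generating a guide from existing sources, rather than writing it by hand, might look like the sketch below. It is not SmartHR's actual generator: the GuideSources shape, the renderGuide function, the output layout, and the sample component ExampleButton with its notes are all invented for illustration.

```typescript
interface PropInfo {
  name: string;
  type: string;
}

// The three kinds of source material, in an assumed shape.
interface GuideSources {
  componentName: string;
  props: PropInfo[]; // from component metadata
  lintRuleNotes: string[]; // from lint rule READMEs
  checklist: string[]; // from checklists
}

// Assemble one plain-text guide from the sources.
function renderGuide(sources: GuideSources): string {
  const lines: string[] = [];
  lines.push(`${sources.componentName} guide`);
  lines.push("Props:");
  sources.props.forEach((p) => lines.push(`  - ${p.name}: ${p.type}`));
  lines.push("Do/Don't:");
  sources.lintRuleNotes.forEach((note) => lines.push(`  - ${note}`));
  lines.push("Checklist:");
  sources.checklist.forEach((item) => lines.push(`  - [ ] ${item}`));
  return lines.join("\n");
}

// Made-up sample data; not taken from smarthr-ui.
const exampleGuide = renderGuide({
  componentName: "ExampleButton",
  props: [{ name: "variant", type: '"primary" | "secondary"' }],
  lintRuleNotes: ["Do give icon-only buttons an accessible label."],
  checklist: ["Label describes the action"],
});

console.log(exampleGuide);
```

Because the guide is a pure function of its sources, rerunning the generator after a component or lint rule changes keeps the guide in sync without manual edits.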

For Claude Code, the plugin can be installed with the following commands.

/plugin marketplace add kufu/smarthr-design-system
/plugin install smarthr-design-system@smarthr-design-system

Organizing the 4 approaches

Even among efforts that share the goal of "handing a design system to AI," the emphasis differs: evaluating, ingesting, referencing, or turning guidance into skills.

| Target | Approach | Deliverables | Characteristics |
| --- | --- | --- | --- |
| Meta Astryx | Evaluation / verification | Design system + evaluation harness (vibe-tests) | Scores agent output with static evaluation and a Vision LLM, and measures whether quality holds up as a conversation continues |
| v0 Design Systems 2.0 | Ingestion / grounding | Skill creation on the generation tool side | Reads in proprietary source and prevents use of components, props, and tokens that cannot be verified |
| Mitsubishi Electric Serendie | Reference interface | Remote MCP + Agent Skills + llms.txt + Figma Plugin | Makes components, tokens, icons, and guidelines retrievable from agents and Figma |
| SmartHR Design System | Skill-ification of implementation guides | Agent Skills / plugins | Provides guidance on using smarthr-ui components and UI patterns for Claude Code / Cursor |

Serendie and SmartHR both focus on giving agents entry points to design system information. Serendie prepares a broad set of reference channels, including MCP, llms.txt, and a Figma Plugin, while SmartHR packages guidance on smarthr-ui components and UI patterns as Agent Skills.

Astryx, by contrast, goes beyond making the design system available and ventures into measuring whether it was used correctly. Its distinctive feature is an evaluation mechanism that sits after the stages of handing over information and ingesting it into generation tools.

v0's grounding and Astryx's deduction for non-existent props also make an interesting contrast: both address hallucination, but from different angles.
One constrains the generation side ("do not use what cannot be verified"), while the other measures on the evaluation side ("deduct points for using non-existent props").

Summary

By investigating these four initiatives, I organized the approaches to handing design systems to AI agents. Building blocks such as MCP, Agent Skills, and evaluation harnesses have appeared one after another, and combining agents with design systems is becoming a realistic option.

With evaluating, ingesting, referencing, and skill-based guides all available as directions, you can now choose how to hand your own design system to AI by starting from the problem you want to solve first.

