Getting started with spec-driven development using GitHub's Spec Kit
Spec Kit is a toolkit for specification-driven development (SDD) provided by GitHub.
Since the emergence of Claude Code, AI coding has become more accessible. While many are amazed by its impressive coding capabilities, others struggle with getting the AI to converge on expected outputs rather than diverging into over-engineered implementations.
Let's look at @t-wada's "Typical 'AI collaboration' patterns (as of early summer 2025)".

Spec Kit is a framework focused on creating specifications (Design Docs, Architecture Decision Records/ADR, Product Requirements Documents/PRD) and task breakdown that AI assistants can easily code from, based on the idea that "AI should be able to generate correct code if there are clear specifications."
Therefore, development follows this flow:
- /specify: what to build (WHAT)
- /plan: how to build it (HOW)
- /tasks: task breakdown (TODO)
- /implement: implementation (DO)
If we map Spec Kit's slash commands to @t-wada's pattern, it would look something like this:

Claude Code also separates phases into planning and execution:
- Plan (Reasoning): reasoning and planning phase
- Act (Acting): action and execution phase
The final slash command Spec Kit executes is /implement.
Since the goal is to complete the implementation by executing the /implement command, Spec Kit (specification-driven development) could be said to focus on supporting the planning phase of the development process.
In this article, I'll implement a hello world-level API server of less than 20 lines using Spec Kit's specification-driven development process.
Tips for Using Spec Kit
After trying Spec Kit, here are some important points I've noticed:
Project memory is set up via /constitution
In Spec Kit, after initializing the project, you execute the /constitution command.
"Constitution" is the project memory instruction, similar to CLAUDE.md or AGENTS.md.
/specify is the unit of a pull request, /tasks is the checklist of tasks
When you execute the /specify command in Spec Kit, it creates a Git branch.
It's helpful to think of /specify as a pull request (PR) unit and the output of /tasks as a checklist within the PR.
Instruct not to over-engineer
Not specific to Spec Kit, but as a general principle in AI coding, AI loves to over-engineer.
Be sure to craft prompts that encourage simpler implementations.
What the Prompter Really Wanted
Specification-driven development seems mainly targeted at enterprise development environments.
However, since this is an introductory article, I'll use specification-driven development to create a simple Python & Flask API server of less than 20 lines as our target output:
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "healthy"}), 200


@app.route("/greeting", methods=["GET"])
def greeting():
    name = request.args.get("name", "world")
    return f"hello, {name}!"


if __name__ == "__main__":
    app.run()
It provides two GET endpoints:
- A health check
- A message that accepts GET parameters
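If you save this as app.py and run it, you can exercise both endpoints with cURL. (The file name app.py is my assumption, and app.run() with no arguments uses Flask's default port 5000.)
$ python app.py &
$ curl http://localhost:5000/health
{"status":"healthy"}
$ curl "http://localhost:5000/greeting?name=spec-kit"
hello, spec-kit!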
Prerequisites for Using Spec Kit
Setting up Spec Kit is relatively easy:
- OS: Mac, WSL2 on Windows, etc.
- AI agent: GitHub Copilot, Claude Code, Gemini CLI, etc.
It's particularly suitable for trying specification-driven development in environments where only GitHub Copilot is available.
Test Environment
For this verification, I used the following environment:
- Spec Kit: 0.0.54
- OS: Mac
- VS Code
- AI agent: GitHub Copilot
- Model: GPT-5
As of 2025/09/28, the premium request multipliers for the main models in GitHub Copilot's paid plans are:
- GPT-4.1: 0
- GPT-5: 1
- Claude Sonnet 3.5: 1
- Claude Sonnet 4: 1
- Claude Opus 4: 10
Spec Kit Development Flow
Here's the flow for developing a new API server:
- Install Spec Kit
- Initialize Spec Kit
- Use /constitution for project memory instructions
- Use /specify to communicate requirements (WHAT)
- Use /plan to communicate specifications (HOW)
- Use /tasks for task breakdown
- Use /implement for implementation
Focusing particularly on /specify (WHAT) and /plan (HOW) allows the AI agent to work more independently as it progresses through task division and implementation phases.
1. Install Spec Kit
Spec Kit is built with Python.
Let's install it using the uv package manager:
$ curl -LsSf https://astral.sh/uv/install.sh | sh
$ uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
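Spec Kit also ships a check command that verifies the required tools (Git, your chosen AI agent, and so on) are available:
$ specify check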
2. Initialize Spec Kit
$ specify init <project name>
Follow the wizard to select:
- AI assistant
- Shell type (I chose sh for this example)

Let's open VS Code:
$ code .
3. Project Memory Instructions with /constitution
The first thing to do when creating a new project is to create the project memory, similar to CLAUDE.md or AGENTS.md.
In Spec Kit, you create memory using /constitution:
/constitution
Flask API project development guidelines. Using Python 3.13, Flask 3.1, uv package management, ruff for static analysis and formatting, and hatchling for building.
Flask runs in CLI mode on localhost:8080. Using Blueprint pattern with one feature per file, server errors use only HTTP status 400/500.
No type annotations needed, extremely simple exception/input processing, focus on simplicity, no logging required.
Testing uses pytest 8.4, preparing only minimal test cases, with test files corresponding to Blueprint files.
When implementing, ensure passing pytest unit tests and ruff static analysis and formatting.
Main process should be called via uv:
$ uv run flask run
$ uv run ruff format .
$ uv run ruff check .
$ uv run pytest
In AI coding, limiting the scope of changes often leads to better results.
The Blueprint pattern is a Flask development approach that allows modularization of functionality and responsibilities.
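As a rough illustration of the pattern (my own minimal sketch, not Spec Kit output; the file layout and the create_app factory name are assumptions), each feature lives in its own Blueprint file and an application factory registers it:
# src/blueprints/greeting.py -- one feature per file (assumed layout)
from flask import Blueprint, request

greeting_bp = Blueprint("greeting", __name__)


@greeting_bp.route("/greeting", methods=["GET"])
def greeting():
    # Each Blueprint owns a single functional domain
    name = request.args.get("name", "world")
    return f"hello, {name}!"


# src/app.py -- the factory wires Blueprints into the app
from flask import Flask

from blueprints.greeting import greeting_bp


def create_app():
    app = Flask(__name__)
    app.register_blueprint(greeting_bp)
    return app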
After a while, you'll see a "Constitution Update Summary" message, and .specify/memory/constitution.md will be updated, along with templates under .specify/templates/ for the /plan and /tasks phases:
- .specify/templates/plan-template.md
- .specify/templates/tasks-template.md
The constitution is now ready to guide feature development with clear gates for Flask Blueprint architecture, simplified error handling, and mandatory TDD with pytest validation.
From .specify/memory/constitution.md:
Spec Kit Flask Constitution
Core Principles
I. Flask Blueprint Architecture
Every feature MUST be implemented as a separate Flask Blueprint in its own file. Blueprints MUST be modular, self-contained, and focused on a single responsibility. No monolithic route definitions - each Blueprint handles one functional domain (users, orders, auth, etc.).
4. Communicate Requirements (WHAT) with /specify
Now, let's implement an API server with just a health check endpoint from scratch.
First, use the /specify command to communicate requirements (user stories/WHAT):
/specify Simple health check API endpoint. GET /health checks service operational status. No DB connection or other processing needed for connectivity check. Response in JSON format
In the background, the following command executes and switches to the 001-simple-health-check branch. Note that Spec Kit maintains a 1:1 relationship between /specify units and branches:
$ .specify/scripts/bash/create-new-feature.sh --json "simple health check endpoint returning JSON status at GET /health"
If the AI agent is uncertain, you might have an interaction like this:
AI Assistant
Let me apply those edits now. Would you like me to proceed with fixing the Markdown formatting, or should I leave it as-is? Let me know and I'll continue (next step would be updating the file, then reporting readiness for /plan).
Developer
proceed with fixing them.
After resolving any uncertainties together, the /specify phase completes with a message like:
Specification Summary
I've created a comprehensive feature specification for the Health Check Endpoint that includes:
Core Requirements Captured:
✅ GET endpoint at /health for monitoring systems
✅ JSON response format for operational status
✅ Lightweight design bypassing heavy operations
✅ HTTP 200 status code for successful health checks
✅ Fast response time for reliable monitoring
...
Next Steps
The specification is now ready for the /plan command, which will:
- Generate technical implementation plan
- Define Flask Blueprint architecture
- Create test scenarios following TDD principles
- Ensure compliance with the project constitution
Checking specs/001-this-feature-will/spec.md, I found that eight functional requirements (FR) had been created. Here's an excerpt:
Requirements (mandatory)
Functional Requirements
- FR-001: System MUST provide a health check endpoint accessible via GET request to /health
- FR-002: System MUST return operational status information in JSON format
- FR-004: System MUST bypass heavy operations (database connections, file I/O) to ensure rapid response
- FR-005: System MUST provide consistent response format for reliable automated monitoring integration
- FR-006: System MUST indicate service availability status clearly in response payload
- FR-007: System MUST respond within [NEEDS CLARIFICATION: specific response time requirement not specified - under 100ms? 500ms?]
- FR-008: System MUST handle [NEEDS CLARIFICATION: error scenarios not specified - what HTTP status codes for degraded states?]
FR-007 is a performance requirement for response time.
Note that there's a comment indicating clarification is needed (NEEDS CLARIFICATION) regarding the threshold.
5. Communicate Specifications (HOW) with /plan
Next, use /plan to communicate implementation methods (HOW):
/plan Implement health check endpoint as Flask Blueprint in src/blueprints/health.py. Generate fixed response in {"status": "ok"} format using jsonify. Create pytest tests.
After interactively resolving any uncertainties as before, the design phase completes with an "Implementation Plan Execution Complete! 🎉" message:
✅ Phase 0: Research complete
✅ Phase 1: Design complete
✅ Phase 2: Task planning complete
✅ Constitutional Gates: All checks passed
✅ Clarifications: All resolved through user implementation details
Next Steps
The implementation plan is ready for the /tasks command, which will:
- Generate 8-10 specific, ordered tasks in tasks.md
- Include setup, TDD test creation, and Blueprint implementation
- Follow constitutional principles with parallel test execution
- Enable implementation phase with clear task dependencies
6. Task Breakdown with /tasks
GitHub issues and pull requests support task lists with checkboxes.
Since Spec Kit creates branches at /specify granularity, each eventually becoming a pull request, you can think of this /tasks phase as building the PR's task checklist.
Having already communicated specifications thoroughly, simply execute /tasks.
After interactively resolving any uncertainties, the task breakdown phase completes with a "Tasks Generation Complete! 🎉" message.
You can also check the specs/branch-name/tasks.md file:
Phase 3.1: Setup (4 tasks)
T001: Create project structure (directories)
T002: Initialize Python 3.13 project with uv and Flask 3.1 dependencies
T003 [P]: Configure ruff for linting/formatting
T004 [P]: Create Flask application factory
Phase 3.2: Tests First - TDD (2 tasks) ⚠️ MUST COMPLETE BEFORE 3.3
T005 [P]: Contract test for GET /health endpoint
T006 [P]: Unit test for health Blueprint
Phase 3.3: Core Implementation (3 tasks)
T007: Health Blueprint implementation with /health route
T008: Register health Blueprint in Flask app
T009: Create package init files
Phase 3.4: Integration (1 task)
T010: Flask application entry point configuration
Phase 3.5: Polish (3 tasks)
T011 [P]: Validate tests with pytest
T012 [P]: Validate code formatting with ruff
T013: Manual quickstart validation...
The task breakdown is ready for the /implement command or manual execution! 🚀
Check for any unintended tasks.
The output also scrutinizes task dependencies (TDD compliance: tests T005-T006 come before implementation T007) and parallelizable work (parallel execution: 6 tasks marked [P] can run independently).
7. Implementation with /implement
When you execute /implement, implementation begins. If you see questionable implementations starting, guide corrections until the implementation completion message appears:
🎉 Implementation Complete! Health Check Endpoint Successfully Deployed
Executive Summary
Feature: Health Check Endpoint
Branch: 001-this-feature-will
Status: ✅ FULLY IMPLEMENTED AND VALIDATED
Total Tasks Completed: 13/13 (100%)...
Implementation Success Factors
- 🎯 TDD Methodology: Tests written first, confirmed failures, then implemented
- 📐 Constitutional Adherence: All 5 core principles followed exactly
- ⚡ Performance Focus: Lightweight static response with no heavy operations
- 🧪 Comprehensive Testing: Both contract and unit test coverage
- 🔧 Modern Toolchain: Python 3.13, Flask 3.1, uv, ruff, pytest 8.4
- 📋 Task Discipline: All 13 tasks completed in dependency order
The health check endpoint is production-ready for monitoring integration, with sub-100ms response times, comprehensive test coverage, and full constitutional compliance. The implementation successfully demonstrates the Flask Blueprint pattern with TDD methodology in a modern Python development environment.
Ready for monitoring system integration! 🚀
Great job!
I noticed the AI made multiple cURL requests to the API server during implementation.
If there was a response time requirement in specs/001-this-feature-will/contracts/health-api.md (like Response time: < 100ms (target)), it diligently made multiple cURL requests to measure and ensure this contract was met.
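For reference, total response time can be measured with a cURL one-liner like this (my own example, not necessarily the exact command the agent used; localhost:8080 follows the constitution's setting):
$ curl -s -o /dev/null -w "%{time_total}s\n" http://localhost:8080/health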
What the AI Agent Implemented
The implementation results are published in this repository:
The main output from the branch specified with /specify is this code (src/blueprints/health.py):
from flask import Blueprint, jsonify

health_bp = Blueprint("health", __name__)


@health_bp.route("/health", methods=["GET"])
def get_health():
    """Health check endpoint for service monitoring."""
    return jsonify({"status": "ok"})
It's implemented using the Flask Blueprint approach.
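A matching unit test would look roughly like this (a sketch based on the constitution's rules; the application factory name create_app and the import path are my assumptions, and the actual file in the repository may differ):
# tests/test_health.py
from src.app import create_app


def test_get_health_returns_ok():
    # Flask's built-in test client exercises the route without a running server
    client = create_app().test_client()
    response = client.get("/health")
    assert response.status_code == 200
    assert response.get_json() == {"status": "ok"}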
However, I should mention that reaching this simple output required careful prompt engineering and multiple iterations of trial and error.
Conclusion
I introduced Spec Kit, GitHub's toolkit for specification-driven development (SDD).
Spec Kit divides the process into fine-grained phases, building documentation at each phase with reference to existing documents, ultimately guiding the AI agent to code autonomously.
Custom commands like /specify serve as the interface for this document and prompt management.
While Spec Kit's compatibility with many AI assistants is a benefit, it may also forgo the optimized development experience that targeting a single, specific assistant could provide.
It might be interesting to pick an AI assistant like Claude Code and observe what's happening behind the scenes.