GitHub Spec Kit: A Practitioner's Guide to Spec-Driven Development

A spec is not documentation. It is the operating contract between your intent and the agent executing it. GitHub Spec Kit makes that contract the center of your workflow.

GitHub Spec Kit is GitHub’s open-source toolkit that brings spec-driven development into your AI coding agent through seven slash commands and a structured .specify directory. You install the specify CLI, run specify init in your project, and your agent gains the commands it needs to turn a description of what you want into a verified implementation. It works with GitHub Copilot, Claude Code, Gemini CLI, Cursor, Codex, and more than thirty other agents.

The short answer: Spec Kit is an open-source CLI toolkit, published by GitHub under the MIT license, that scaffolds a spec-driven workflow inside any supported AI coding agent by generating slash commands, template files, and a structured project directory on initialization. The project has over 118,000 GitHub stars as of writing.

If you are new to the underlying method, start with what spec-driven development is. The tool is downstream of the method. Come back once the method clicks.

The problem Spec Kit solves

AI coding agents are fast. They are also stateless and literal. Describe a goal vaguely, and the agent pattern-matches toward the most common implementation of that description, not toward your specific intent. The result compiles. It is syntactically correct. It just does not do what you meant.

The fix is not a better prompt. It is a structured artifact, written and approved before implementation starts, that gives the agent the clarity it cannot infer from a chat message. That artifact is the spec.

Spec Kit does not change that idea. It operationalizes it. Instead of maintaining discipline manually, you get slash commands that walk the agent through each phase, templates that force you to write the right things, and a directory structure that keeps artifacts organized across features.

Install and initialize

Spec Kit requires Python 3.11 or later, uv, and git. Install uv first if you do not have it.

# Install the specify CLI, pinned to a release tag
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git@vX.Y.Z

# Or run it in one shot without a persistent install, good for a quick tryout
uvx --from git+https://github.com/github/spec-kit.git specify init my-project

Replace vX.Y.Z with the latest tag from the Spec Kit releases page. For persistent use, prefer uv tool install. Once installed:

specify init my-project --integration claude    # Claude Code
specify init my-project --integration copilot   # GitHub Copilot
specify init my-project --integration gemini    # Gemini CLI
specify init my-project --integration codex     # OpenAI Codex

The --integration flag tells specify which agent you use. It writes command files in the right format for that agent. Omit it in an interactive session and specify prompts you to choose. In CI or piped runs it defaults to Copilot.

After initialization, your project gains a .specify directory and an agent-specific context file. For Claude Code that file is CLAUDE.md. For Copilot it is .github/copilot-instructions.md.

What specify init scaffolds

.specify/
- memory/
  - constitution.md: project principles, tech standards, non-negotiables
- scripts/: bash helpers invoked by the constitution, specify, plan, and tasks commands
- specs/: one subfolder per feature, created during the workflow
  - 001-feature-name/: feature
    - spec.md: functional requirements and user stories
    - plan.md: technical architecture and choices
    - tasks.md: ordered task list with parallel markers
    - data-model.md: entities and schemas
    - research.md: library and API research gathered during planning
    - contracts/
      - api-spec.json: API contract for the feature
- templates/: spec, plan, and tasks templates that constrain what the agent writes
CLAUDE.md: agent context file, automatically updated by specify init

The templates inside .specify/templates/ are what actually do the work. They instruct the agent to mark ambiguities with [NEEDS CLARIFICATION] instead of guessing, to separate functional requirements from technical decisions, and to produce checklists that act as acceptance gates within each artifact. The discipline is baked into the templates, not into your willpower.
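Because the marker is plain text, you can check a spec for unresolved questions before you move on to planning. Here is a minimal Python sketch of that check. It is not part of Spec Kit: the function name find_open_questions and the sample spec are hypothetical, and the idea that a marker may carry a question after a colon is an assumption that may not hold for every template version.

```python
import re

# Assumption: markers look like "[NEEDS CLARIFICATION]" or
# "[NEEDS CLARIFICATION: some question]". Adjust the pattern to your templates.
MARKER = re.compile(r"\[NEEDS CLARIFICATION[^\]]*\]")


def find_open_questions(spec_text: str) -> list[tuple[int, str]]:
    """Return (line_number, marker_text) for every unresolved marker in a spec."""
    hits = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        for match in MARKER.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Hypothetical spec content, for illustration only.
sample_spec = """\
# Feature: export report
- Users can export a report as CSV.
- Export is limited to [NEEDS CLARIFICATION: max rows?] rows.
- Access requires [NEEDS CLARIFICATION] role.
"""

open_questions = find_open_questions(sample_spec)
for number, marker in open_questions:
    print(f"line {number}: {marker}")
```

An empty result does not mean the spec is good. It only means the agent did not flag anything, so you still read the spec yourself.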

The seven commands, in order

After specify init, your agent has these commands. The core commands run in the sequence below. Optional commands can be inserted where they add value.

The Spec Kit workflow: constitution to implementation

Input: a description of what you want to build and why, without a tech stack decision

01 /speckit.constitution (run once per project)
Establishes the governing principles for the project: code quality standards, architectural rules, testing requirements, and constraints. Writes to .specify/memory/constitution.md. Every subsequent command references this file.
02 /speckit.specify (per feature)
Takes your high-level description and produces spec.md with functional requirements and user stories. Creates the feature branch and directory automatically. Focus on what and why. No stack decisions here.
03 /speckit.clarify (optional, before plan)
Runs a structured clarification pass over the spec, asking sequential questions to surface gaps or contradictions. Records answers in a Clarifications section. Recommended before planning to reduce rework downstream.
04 /speckit.plan (per feature)
Takes the spec and your tech stack input, then produces plan.md with architecture decisions, plus data-model.md, contracts/, and research.md. Every technical choice traces back to a requirement in the spec.
05 /speckit.tasks (per feature)
Reads plan.md and supporting files, then produces tasks.md: an ordered list of implementable units with parallel markers ([P]) for tasks that can run concurrently. Each task is independently testable.
06 /speckit.analyze (optional, before implement)
Runs a cross-artifact consistency and coverage check: does every spec requirement appear in the plan? Does every plan decision map to at least one task? Catches misalignments before they become bugs.
07 /speckit.implement (per feature)
Validates that constitution, spec, plan, and tasks are all present, then executes tasks in dependency order, respecting parallel markers. Reports progress and surfaces errors for resolution.

Output: a verified implementation with a traceable audit trail from user intent to working code
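To make the parallel markers concrete, here is a small Python sketch that groups tasks into batches: consecutive [P] tasks share a batch, and every unmarked task runs alone. This is an illustration, not how Spec Kit or your agent schedules work. The task line shape ("- [ ] T001 [P] Title"), the function name batch_tasks, and the sample tasks are all assumptions, so check them against your own tasks.md before relying on the pattern.

```python
import re

# Assumed (hypothetical) line shape: "- [ ] T001 [P] Describe the task"
TASK_LINE = re.compile(
    r"^- \[[ xX]\] (?P<id>T\d+) (?P<parallel>\[P\] )?(?P<title>.+)$"
)


def batch_tasks(tasks_text: str) -> list[list[str]]:
    """Group task IDs into batches.

    Consecutive [P] tasks share a batch; every unmarked task gets its own.
    """
    batches: list[list[str]] = []
    previous_parallel = False
    for line in tasks_text.splitlines():
        match = TASK_LINE.match(line.strip())
        if match is None:
            continue  # headings, blank lines, prose
        is_parallel = match.group("parallel") is not None
        if is_parallel and previous_parallel:
            batches[-1].append(match.group("id"))
        else:
            batches.append([match.group("id")])
        previous_parallel = is_parallel
    return batches


# Hypothetical tasks.md content, for illustration only.
sample_tasks = """\
## Phase 1
- [ ] T001 Create project skeleton
- [ ] T002 [P] Add user model
- [ ] T003 [P] Add invoice model
- [ ] T004 Wire models into the API
"""

batches = batch_tasks(sample_tasks)
print(batches)
```

The sketch treats file order as dependency order, which is the simplest reading of an ordered task list. Real dependencies can be richer than that.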

Two additional commands exist for specific needs. /speckit.checklist generates quality checklists for any artifact, essentially unit tests for English: does the spec meet the criteria for being unambiguous and complete? /speckit.taskstoissues converts tasks.md into GitHub Issues for teams that track work there.

One thing to understand about the gate between tasks and implement: there is no machine-enforced lock. The agent validates prerequisites exist, but you decide when to call /speckit.implement. The workflow documentation is explicit about this: your role at each phase is to reflect and refine, not just to approve by omission. The agent generates artifacts; you verify them before advancing.
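The difference between "the files exist" and "the files are approved" is easy to see in code. The Python sketch below mimics a presence check over the four prerequisites, using the directory layout shown earlier. It is not Spec Kit's own validation logic, and the function name missing_prerequisites is hypothetical.

```python
import tempfile
from pathlib import Path


def missing_prerequisites(project_root: Path, feature: str) -> list[str]:
    """List prerequisite files that are absent, relative to the project root."""
    specify_dir = project_root / ".specify"
    feature_dir = specify_dir / "specs" / feature
    required = [
        specify_dir / "memory" / "constitution.md",
        feature_dir / "spec.md",
        feature_dir / "plan.md",
        feature_dir / "tasks.md",
    ]
    return [
        path.relative_to(project_root).as_posix()
        for path in required
        if not path.is_file()
    ]


# Build a throwaway project that has a spec and a plan but nothing else.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    feature_dir = root / ".specify" / "specs" / "001-feature-name"
    feature_dir.mkdir(parents=True)
    (feature_dir / "spec.md").write_text("# spec\n")
    (feature_dir / "plan.md").write_text("# plan\n")
    missing = missing_prerequisites(root, "001-feature-name")

print(missing)
```

A check like this passes as soon as the files are on disk. It cannot tell a draft from a reviewed artifact, which is why the decision to advance stays with you.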

The constitution is not optional

Every SDD tool needs a stable memory layer: the project-level context that every feature draws from. In Spec Kit that is the constitution.

The constitution lives at .specify/memory/constitution.md and contains the principles that govern all development: what abstractions are non-negotiable, testing requirements, simplicity rules, security policies, organizational constraints. It is established once with /speckit.constitution and referenced by every subsequent /speckit.plan and /speckit.implement call.

The spec-driven.md methodology document from GitHub describes the constitution as “immutable principles” and documents a formal amendment process for changing them: explicit rationale, maintainer review, backwards-compatibility assessment. That is a signal about how seriously the project treats architectural consistency. Write the constitution on day one and keep it alive.

The three use cases from GitHub

GitHub’s own framing describes three situations where Spec Kit earns its setup cost.

Greenfield development, which the announcement calls zero-to-one, is the most obvious case. You have an idea, not a codebase. Spec Kit’s workflow forces you to define what success looks like before the agent writes anything, which eliminates the most common failure mode: the agent builds a technically correct thing that does not match what you imagined.

Feature work in existing systems, the brownfield case, is where the tool is most practically useful day to day. Adding to a real codebase requires capturing how the new feature interacts with what is already there. The spec encodes those interaction constraints. The plan encodes architectural decisions that respect the existing system. The result is new code that feels native instead of bolted on. GitHub’s announcement notes that this use case may require additional context engineering to give the agent enough awareness of the existing codebase.

Legacy modernization is the hardest case and the most interesting. When the original intent of a system has been lost to time and undocumented code, Spec Kit gives you a process to capture the essential business logic in a modern spec before rebuilding. You are not just migrating code. You are reconstructing intent, which is the harder problem.

Honest weaknesses: what Böckeler found

Birgitta Böckeler at Thoughtworks published an independent evaluation of Spec Kit in October 2025 on martinfowler.com. Her critique is worth reading. The main observations:

One workflow for all sizes. Spec Kit has one opinionated workflow. For a three-day feature it creates significant artifacts that feel proportionate. For a small bug fix it produces user stories with sixteen acceptance criteria, including “as a developer, I want the transformation function to handle edge cases gracefully.” Böckeler’s phrase: using a sledgehammer to crack a nut. The tool does not yet provide a lightweight path for smaller changes.

Volume of markdown to review. Each feature produces multiple files: spec, plan, tasks, data model, research, contracts. Böckeler found them repetitive and verbose. Her point is fair: if reviewing the spec artifacts takes as long as implementing the feature manually would have taken, the economics do not work. An effective SDD workflow needs a good spec review experience, and a wall of markdown is not it.

Agents do not always follow all instructions. More context in the window is not the same as better compliance. Böckeler observed the agent treating research notes about existing classes as new specifications and regenerating them as duplicates. Larger context windows help. They do not guarantee the agent correctly prioritizes everything in the input.

Spec-first versus spec-anchored ambiguity. GitHub’s framing describes specs as living artifacts that evolve with the project. But Spec Kit creates a branch per spec, which implies the spec lives for the lifetime of a change request, not a feature. Böckeler noted that this creates confusion, and a community discussion in the repository confirms it. As of her evaluation, Spec Kit is spec-first in practice, not spec-anchored over time.

These are real constraints, not reasons to dismiss the tool. Spec Kit works best for features of meaningful size on greenfield or well-understood brownfield codebases. For exploratory work and small tasks, the overhead is not proportionate.

Spec Kit vs Kiro vs Claude Code with pi-sdd-kit

Three SDD tools have significant traction now. They are not the same tool with different names.

Different scope, different tradeoffs. Spec Kit wins on agent breadth. Kiro wins on IDE integration. pi-sdd-kit adds an explicit machine-readable approval gate.
Feature	Spec Kit	Kiro	Claude Code + pi-sdd-kit
Core workflow	7-command CLI, any agent	3-step IDE workflow in Kiro IDE	5-phase slash commands in Claude Code
Stable memory layer	constitution.md	steering/ (product, tech, structure)	steering/ with 4 context files
Machine-readable approval gate	None: human decides when to advance	None: human decides when to advance	.status file with explicit token
Agent support	30+ agents via specify init	Kiro IDE only	Claude Code primary
Per-feature artifacts	6+ files: spec, plan, tasks, data-model, research, contracts	3 files: requirements, design, tasks	Per-feature folder with .status gate
Customization	Extensions and presets system	Steering documents	Plain markdown, any structure
Cost	Free, MIT license	Part of AWS Kiro offering	Free, open source npm package

Here is the honest take from someone who uses Claude Code and pi-sdd-kit as a daily workflow.

Spec Kit is the better choice when you work across multiple agents, want a community-maintained open-source standard with an extensions and presets ecosystem, or work in teams where different developers use different tools. The constitution-based approach is genuinely good for encoding organizational constraints. The extensions system lets you add domain-specific workflows. The presets system lets you override templates without forking the core.

Spec Kit is less suited when you want a machine-readable gate that stops the agent from advancing without explicit approval. The .status token in pi-sdd-kit is not ceremony: it removes the ambiguity between “I wrote the spec” and “I approved the spec.” A completed tasks.md and an approved tasks.md look identical to an agent scanning the directory. Spec Kit leaves that distinction to human discipline.

For Kiro, the comparison is different. See the guide on what Kiro is for the full picture. The short version: Kiro is tightly integrated into a VS Code-based IDE and its three-step workflow is simpler in structure, but you are locked to that environment. Spec Kit trades IDE integration for agent breadth. Kiro’s three-file structure (requirements, design, tasks) is also notably lighter than Spec Kit’s six-plus files per feature, which makes Böckeler’s observation about artifact volume feel more pointed in comparison.

When this work deserves Spec Kit

Does this work benefit from Spec Kit? Each criterion is scored 1-5 (5 = best) and multiplied by the weight in parentheses. Winner: Greenfield project with a weighted score of 50.
Criterion (weight)	One-off script	Small bug fix	Multi-session feature	Greenfield project
Outlives one sitting (3)	1	1	4	5
Spans multiple sessions or team members (3)	1	2	4	5
Requires architectural decisions (2)	1	1	4	5
Benefits from a living spec document (2)	1	1	3	5
Weighted score	10	13	38	50

Low score: just prompt and go. High score: Spec Kit earns its setup cost. A bug fix that takes twenty minutes does not need a constitution.
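The weighted scores are plain arithmetic: multiply each 1-5 score by its criterion weight and add them up. Here is a short Python sketch that reproduces the table so you can swap in your own work item. The short criterion keys and the function name weighted_score are mine, chosen for illustration.

```python
# Criterion weights from the table (short keys are hypothetical labels).
WEIGHTS = {
    "outlives_one_sitting": 3,
    "spans_sessions_or_people": 3,
    "needs_architecture": 2,
    "benefits_from_living_spec": 2,
}

# Scores per work type, in the same criterion order as the table rows.
SCORES = {
    "One-off script": [1, 1, 1, 1],
    "Small bug fix": [1, 2, 1, 1],
    "Multi-session feature": [4, 4, 4, 3],
    "Greenfield project": [5, 5, 5, 5],
}


def weighted_score(scores: list[int], weights: dict[str, int]) -> int:
    """Sum of score * weight, pairing scores with weights in order."""
    return sum(score * weight for score, weight in zip(scores, weights.values()))


totals = {name: weighted_score(scores, WEIGHTS) for name, scores in SCORES.items()}
winner = max(totals, key=totals.get)

for name, total in totals.items():
    print(f"{name}: {total}")
print(f"Winner: {winner}")
```

The possible range is 10 to 50, so a small bug fix at 13 sits almost at the floor.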

Spec Kit has real overhead, and that overhead is not proportional for small work. The tool’s own documentation shows greenfield and feature-scale examples because that is where it is designed to shine. The Böckeler critique is correct: the tool is a sledgehammer for a bug fix. Do not use it for a bug fix.

Use it for features where the cost of building the wrong thing is higher than the cost of the spec. Use it for greenfield projects where you want to establish architectural consistency from the start and carry it forward across every feature. Use it for teams where you want shared spec artifacts in version control that serve as the source of truth across developers.

Extensions and presets: making Spec Kit your own

One part of Spec Kit that distinguishes it from simpler tools is the customization system. It has two components.

Extensions add new capabilities: new commands, new workflow phases, integrations with external tools like Jira, post-implementation code review steps, V-Model test traceability. An extension is installed once and adds command files to your agent’s directory.

Presets customize how existing capabilities work: enforcing a compliance-oriented spec format, applying organizational terminology, adding mandatory security review gates to plans. A preset overrides the templates that Spec Kit and its extensions use to generate artifacts.

Both are community-maintained and searchable via specify extension search and specify preset search. The resolution order is clear: project-local overrides win over presets, presets win over extensions, extensions win over the core defaults. Multiple presets can be stacked.
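That resolution order is a first-match lookup across four layers. The Python sketch below models it with plain dictionaries. It illustrates the precedence rule only and is not Spec Kit's implementation: the layer names, the function name resolve_template, and the file paths are hypothetical, and it does not model how stacked presets are ordered among themselves.

```python
from typing import Optional

# Highest precedence first, as described in the text.
LAYER_ORDER = ("project", "preset", "extension", "core")


def resolve_template(
    name: str, layers: dict[str, dict[str, str]]
) -> Optional[tuple[str, str]]:
    """Return (layer, path) from the first layer that provides the template."""
    for layer in LAYER_ORDER:
        candidates = layers.get(layer, {})
        if name in candidates:
            return layer, candidates[name]
    return None


# Hypothetical template locations, for illustration only.
layers = {
    "core": {"spec": "core/spec-template.md", "plan": "core/plan-template.md"},
    "extension": {"review": "ext/review-template.md"},
    "preset": {"spec": "preset/compliance-spec-template.md"},
    "project": {"plan": "local/plan-template.md"},
}

print(resolve_template("spec", layers))    # preset overrides core
print(resolve_template("plan", layers))    # project-local overrides core
print(resolve_template("review", layers))  # only the extension provides it
```

The practical consequence: a project-local override always wins, so check that layer first when an artifact does not look the way a preset says it should.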

The method is the point, not the tool

Spec Kit is an implementation of spec-driven development. The method exists independently of any tool. You can run SDD with plain markdown files, no CLI, no slash commands, and still get the benefit. The method is: write and approve a spec before implementation starts, keep the spec as the source of truth, and use clear gates to prevent the agent from racing ahead.

What Spec Kit gives you on top of that is consistency, a community, and an extensions ecosystem. The constitution template is well-designed. The separation of functional spec from technical plan is clear and enforced by the templates. The agent integrations handle the format differences across thirty-plus tools.

If you use Claude Code, the approach I developed during a solo, 70-day build of 13 crypto fintech apps is in Spec-Driven Development with Claude Code. The pi-sdd-kit approach adds the .status gate and the three-layer context system (CLAUDE.md, steering, specs) that give the agent durable memory and an explicit, machine-readable approval signal. These approaches solve adjacent parts of the same problem. You can take ideas from both.

FAQ

What is GitHub Spec Kit?

GitHub Spec Kit is an open-source toolkit published by GitHub under the MIT license. It adds spec-driven development to your AI coding agent by providing a CLI tool (specify) that initializes a project with slash commands, templates, and a structured .specify directory.

After running specify init, your agent gains commands for each phase of the workflow: constitution, specify, clarify, plan, tasks, analyze, and implement. It works with over 30 agents including Claude Code, GitHub Copilot, Gemini CLI, and Cursor.

Is Spec Kit free?

Yes. Spec Kit is published under the MIT license. There is no paid tier, subscription, or usage gate. Install the specify CLI from the public GitHub repository using uv or pipx.

How do I install Spec Kit?

First install uv (https://docs.astral.sh/uv/). Then run: uv tool install specify-cli --from git+https://github.com/github/spec-kit.git@vX.Y.Z, where vX.Y.Z is the latest release tag from the releases page. Then initialize your project: specify init my-project --integration claude for Claude Code, or --integration copilot for GitHub Copilot.

For a quick one-shot tryout without a persistent install: uvx --from git+https://github.com/github/spec-kit.git specify init my-project

Does Spec Kit work with Claude Code, Cursor, and Copilot?

Yes. Spec Kit works with over 30 AI coding agents: Claude Code, GitHub Copilot, Gemini CLI, Cursor, OpenAI Codex, Kiro CLI, Goose, Windsurf, and many others. The specify init command with the --integration flag writes command files in the right format for your agent.

Run specify integration list to see all available integrations in your installed version.

Spec Kit vs Kiro: which should I choose?

Spec Kit is agent-agnostic: it works with 30+ tools and gives you a community ecosystem of extensions and presets. Kiro is a VS Code-based IDE with a tighter three-step workflow (requirements, design, tasks) integrated directly into the editor UI.

If you want agent flexibility and an open-source standard that any teammate can use with their preferred tool, use Spec Kit. If you want a streamlined IDE experience and are willing to commit to the Kiro environment, read the dedicated guide on Kiro for the full comparison. The workflow logic is similar in both. The execution environments are very different.

What does specify init actually create?

It creates a .specify directory with a memory folder for the constitution, a scripts folder with bash helpers, a specs folder where feature subfolders will live, and a templates folder. It also creates or updates your agent context file with a Spec Kit section that points the agent to the workflow.

Feature subfolders (like .specify/specs/001-feature-name/) are created during the workflow by /speckit.specify, not by init. Init gives you the scaffold; the workflow populates it.

What are the main weaknesses of Spec Kit?

Three worth knowing before you invest time. First, one workflow for all sizes: the tool is designed for feature-scale work and feels like overkill for small tasks and bug fixes. Second, no machine-readable approval gate: you decide when to advance to the next phase, which requires discipline to prevent the agent from racing ahead on a draft spec. Third, the artifact volume per feature is high (six or more markdown files), which can be tedious to review carefully for smaller changes.

Birgitta Böckeler's October 2025 analysis on martinfowler.com is the most thorough independent evaluation of these tradeoffs and worth reading before committing to the workflow.

Where to go next

The method behind this tool is spec-driven development. Read that first. The tool is valuable; the method is what makes the tool work.

If you use Claude Code, the workflow I run on production systems is in Spec-Driven Development with Claude Code: the three-layer context system, the .status gate, and the three sub-agents. It is compatible with Spec Kit’s ideas and complements the approach.

Install Spec Kit, run specify init, establish your constitution, and build one feature all the way through to /speckit.implement. The first run is slow. The second is faster. By the third, the spec-first habit is set, and the overhead that seemed daunting at the start is just the cost of building with intent instead of guessing.