Building a Claude Code Plugin for Spec-Driven Development

I have been spending a lot of time with Claude Code over the past few weeks. It is an incredible tool for writing code, but I kept running into the same problem. Claude would start making changes without a clear plan. Files would get edited, features would drift from the original idea, and I would lose track of what changed and why. I needed a way to bring structure to AI-assisted development without slowing things down.

So I built delta-spec, a Claude Code plugin for spec-driven development.

The Problem with AI-Assisted Development

When you pair with an AI, things move fast. That speed is the whole point. But speed without direction leads to a mess. I would ask Claude to add a feature and it would start editing files immediately. There was no proposal, no design review, no way to look back and understand why a decision was made.

Traditional spec-driven development solves this for human teams, but it is far too heavy for the pace of AI coding. Writing full specification documents before every change would kill the workflow. I needed something lighter.

What is Delta-Spec

Delta-spec is a minimal system that uses delta specs instead of rewriting entire specification files. When you want to make a change, you describe what is changing rather than documenting the entire state of the system. Think of it like git diffs but for your specifications.
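To make the diff analogy concrete, here is a minimal sketch of what applying a delta to a spec could look like. The added/modified/removed shape and the apply_delta name are assumptions made up for this illustration. The plugin's real delta specs are Markdown files, and their exact layout is not shown here.

```python
from typing import Dict, List, Union

Spec = Dict[str, str]                              # requirement name -> requirement text
Delta = Dict[str, Union[Dict[str, str], List[str]]]  # hypothetical delta shape


def apply_delta(spec: Spec, delta: Delta) -> Spec:
    """Return a new spec with the delta applied; the input spec is left untouched."""
    merged = dict(spec)
    for name in delta.get("removed", []):
        merged.pop(name, None)                 # drop requirements the change retires
    merged.update(delta.get("modified", {}))   # replace text of changed requirements
    merged.update(delta.get("added", {}))      # add brand new requirements
    return merged


base = {"login": "Users sign in with a password."}
delta = {
    "modified": {"login": "Users sign in with a password or a passkey."},
    "added": {"logout": "Users can end their session."},
}
merged = apply_delta(base, delta)
```

The delta only mentions the two requirements it touches, which is the whole appeal: the rest of the spec never has to be restated.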

The plugin adds a set of skills to Claude Code that guide the workflow:

/ds:new <name> - Start a new change with a proposal. Define the problem, scope, and success criteria.
/ds:plan - Claude explores your actual codebase and creates a design spec that fits your existing patterns.
/ds:tasks - Generate implementation tasks from the design.
Implement - Do the work with clear guardrails.
/ds:archive - Merge the delta specs and archive the change for future reference.

The key insight is that /ds:plan does not just write a plan in isolation. It reads your code, understands your patterns, and designs an approach that fits. This makes AI-assisted changes feel like they belong in the codebase instead of being bolted on.

Dogfooding It to Build Itself

The best part of this project was using delta-spec to build delta-spec. Every feature I added went through the full workflow: proposal, design, tasks, implementation, archive. I have 18 completed changes in the archive, each with a clear record of what problem it solved and how.

Some of the changes I tracked:

file-based-tasks - Tasks were getting lost when sessions restarted. Moved from ephemeral native tasks to persistent tasks.md files.
plan-dependencies-fix - Planning was blocked by unsatisfied dependencies, which was unnecessarily restrictive. Made planning informational only and enforced dependencies at implementation time instead.
quick-workflow - Running three separate commands to start a change was tedious. Added /ds:quick to go from proposal to tasks with a single confirmation.
ds-batch - Creating multiple proposals one by one was slow. Added /ds:batch to describe all your features in free-form prose and let it parse them into proposals with dependency inference.
circular-dependency-resolution - Circular dependencies required manual untangling. Added cycle detection that automatically extracts the base concept and re-plans.

Each of these changes exists as a proposal, design, and task list in the archive. When I need to understand why something works the way it does, the answer is right there.
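The cycle detection mentioned under circular-dependency-resolution can be illustrated with a standard depth-first search over the dependency graph. This is only a sketch of the general technique: in the plugin the work is done by Claude following Markdown instructions, and the find_cycle function and the change names below are hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of change names, or None if acyclic."""
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        state[name] = ON_PATH
        path.append(name)
        for dep in deps.get(name, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == ON_PATH:
                # dep is already on the current path, so the loop is closed
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = DONE
        return None

    for name in deps:
        if state.get(name, UNVISITED) == UNVISITED:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical changes: auth and sessions depend on each other.
cycle = find_cycle({"auth": ["sessions"], "sessions": ["auth"], "logging": []})
```

Once a cycle like this is reported, the shared concept the two changes both need can be pulled out into its own change, which is the "extract the base concept" step described above.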

What I Learned

Claude Trusts Docs More Than Reality

This was the most frustrating lesson. During a code review of the plugin, Claude confidently declared that my ds-* skill naming convention was redundant because of how namespacing works in Claude Code plugins. It flagged it as a critical finding. The problem was that Claude was reasoning from documentation, not from actual installed behavior. The real behavior did not work that way at all.

I had to correct it manually. This taught me that when working with Claude on plugin development, you need to push it to verify assumptions by actually running the code rather than just reading the docs.

Dependencies Need Different Enforcement at Different Stages

Early on, I had dependencies enforced everywhere. If change B depended on change A, you could not even plan change B until A was done. This killed the batch planning workflow because you often want to plan multiple related changes in sequence before implementing any of them.

The fix was simple. Planning is safe, so dependencies are informational there. Tasks and archive are where order matters, so dependencies are enforced at those stages. This small change made batch workflows actually usable.
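The rule is easy to express in code. The sketch below assumes stage names of "plan", "tasks", and "archive" and a hypothetical check_dependencies helper. It illustrates the policy and is not code from the plugin, which has no runtime of its own.

```python
from typing import Dict, List, Set

# Stages where an unmet dependency blocks work (assumed from the rule above).
ENFORCED_STAGES = {"tasks", "archive"}


def check_dependencies(
    stage: str,
    change: str,
    deps: Dict[str, List[str]],
    completed: Set[str],
) -> List[str]:
    """Return unmet dependencies, raising only at stages where order matters."""
    missing = [d for d in deps.get(change, []) if d not in completed]
    if missing and stage in ENFORCED_STAGES:
        raise RuntimeError(
            f"{change} cannot reach '{stage}' until these are done: {', '.join(missing)}"
        )
    return missing  # during planning this is purely informational


deps = {"change-b": ["change-a"]}
warnings = check_dependencies("plan", "change-b", deps, completed=set())
```

Planning change-b returns the unmet dependency as a warning, while the same call with stage "tasks" raises until change-a is complete.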

Persistent State Matters More Than You Think

Claude Code sessions are ephemeral. When a session ends, everything in memory is gone. I was originally using Claude Code’s native task system, which meant my task lists disappeared on restart. Moving to persistent tasks.md files was an obvious fix in hindsight. Now tasks survive across sessions and you can pick up exactly where you left off.
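Part of why a plain file works so well is that its state is trivial to read back. As a rough sketch, assuming tasks.md uses Markdown checkbox lines such as "- [x] done" and "- [ ] open" (an assumption, since the real file layout is not shown here), resuming could look like this. The function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches "- [ ] title" and "- [x] title" (assumed checkbox format).
TASK_LINE = re.compile(r"^- \[( |x)\] (.+)$")


def parse_tasks(text: str) -> List[Tuple[bool, str]]:
    """Return (done, title) pairs for every checkbox line in the text."""
    tasks = []
    for line in text.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            tasks.append((match.group(1) == "x", match.group(2)))
    return tasks


def next_open_task(text: str) -> Optional[str]:
    """Return the first unfinished task title, or None when all are done."""
    for done, title in parse_tasks(text):
        if not done:
            return title
    return None


sample = """# Tasks
- [x] Write proposal
- [ ] Add cycle detection
- [ ] Update docs
"""
resume_at = next_open_task(sample)
```

In a new session, reading the file and finding the first unchecked line is all it takes to know where to continue.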

Batch Operations Need Checkpointing

I ran a lot of batch operations through /ds:batch, and they would regularly get cut short before finishing. Long-running operations in Claude Code need progress tracking and the ability to resume. This is something I am still working on improving.
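The general pattern I have in mind is a checkpoint file that records each finished item as soon as it completes. The sketch below shows that pattern in isolation. The run_batch function and the JSON checkpoint format are hypothetical and are not how /ds:batch works today.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def run_batch(
    items: Iterable[str],
    handle: Callable[[str], None],
    checkpoint: Path,
) -> List[str]:
    """Process items in order, skipping any already recorded in the checkpoint."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    processed = []
    for item in items:
        if item in done:
            continue  # finished in an earlier, interrupted run
        handle(item)  # may raise or be cut short
        done.add(item)
        # Persist after every item so an interruption loses at most one step.
        checkpoint.write_text(json.dumps(sorted(done)))
        processed.append(item)
    return processed
```

Calling run_batch again with the same checkpoint path after an interruption picks up at the first unfinished item instead of starting over.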

Invest in Your CLAUDE.md Early

Claude re-reads a lot of files across sessions because it does not retain context. I noticed it constantly re-exploring the same directories to orient itself. Adding a codebase map to CLAUDE.md cut down on redundant exploration significantly. If you are building a plugin, invest time in your project instructions early. It pays off on every single session.

Zero Dependencies, Just Markdown

One of the things I like most about Claude Code plugins is that you do not need to write TypeScript or build a server. Skills are just Markdown files with frontmatter that tell Claude what to do. The entire delta-spec plugin is a collection of .md files in a specific directory structure. No build step, no runtime dependencies, no package management.

How It Compares to Just Using Claude Code

You can absolutely use Claude Code without any structure and get a lot done. But for larger changes or anything you want to maintain long term, having a spec trail makes a difference.

Without delta-spec:

Changes happen immediately with no review step
There is no record of why decisions were made
Multiple related changes can step on each other
You lose context between sessions

With delta-spec:

Every change starts with a clear problem statement and scope
Designs are based on actual codebase exploration
Dependencies between changes are tracked
Archived changes create a searchable history

It adds a small amount of overhead, but that overhead has saved me from going down the wrong path more times than I can count.

What is Next

I plan to keep using delta-spec for all my projects. There are still improvements I want to make around batch operation resilience and cross-plan conflict detection. I also want to explore having Claude validate its own technical assumptions during the planning phase rather than just inferring from documentation.

If you want to try it out, you can install it locally with Claude Code:

claude --plugin-dir /path/to/delta-spec

The plugin is open source and I would love feedback from anyone who gives it a shot.