Intro
A few months ago I had to work on a complex application on AWS: a React frontend on Amplify, several Lambda functions, Bedrock with AgentCore, Knowledge Bases, and Prompt Management. I was in a hurry, and the temptation was overwhelming: open Claude Code, throw in a generic prompt, and hope it would “figure it out.” Instead, I did something different — I wrote specifications, reviewed them, spent an entire day on it — and that day it felt like I hadn’t accomplished anything. Two days later I had a working application. If I had improvised, I’d probably still be debugging.
This experience changed my perspective on what it truly means to use AI for software development. It’s not about “vibe coding” — writing a vague prompt and hoping for the best — but something far more structured, and paradoxically more demanding. But before diving into the details, let’s look around: the signs of a radical transformation are already everywhere.
- Last December, Boris Cherny, an Anthropic engineer and creator of Claude Code (essentially the company's flagship product), stated that in the previous 30 days, 100% of the work on the Claude Code repository had been done by Claude Code itself.
- The sharp slowdown in stock prices for some SaaS companies suggests that the market has already priced in the tendency for companies to build software in-house rather than buying it from the usual big players.
- Spotify declared that since December 2025 their best developers haven't written a single line of code: they send instructions via Slack to their internal system "Honk" (based on Claude Code), which implements the changes, while the engineers focus on review and architecture.

To tackle this discussion, however, we need to leave behind the "Vibe Coding" hype and understand from the outset that we're talking about a different way of conceiving the developer's profession and the software assembly line. This approach requires method and discipline, and might even be hard to swallow for some, because it risks compressing certain "creative" phases of the work — which can sometimes be the most rewarding.
Is Software Development a Dead Profession?
According to many analysts, we’re looking at a “transformation,” so in a sense the answer is no. However, I believe the transformation will be so radical that within 5 to 10 years, none of us will see a job posting that simply says “Developer.” The software developer of tomorrow requires highly varied expertise (architectures, networks, processes, languages, data, …). I don’t know what this role will be called in the future, but for simplicity I’ll call it the “DESIGNER” (in the sense of system designer/architect).
The same “Designer” pattern is already being applied in other less common domains, such as writing and journalism. To give you an extreme example, a few months ago Luciano Floridi, one of the leading figures in the philosophy of information and the digital age, published a book titled Distant Writing: Literary Production in the Age of Artificial Intelligence. In this work, Floridi pursues an ambitious project of interweaving stories of minor characters (mentioned but secondary) from classic English novels, from Jane Austen to Virginia Woolf, into short stories (1,500–2,000 words each) where they meet in narrative chains that are plausible given the era, location, and social status. In interviews he has given, Floridi stated that he essentially “designed” the book and had it in a drawer for many years, but was able to realize it only through the use of LLMs to expand and write the individual stories, and to ensure that characters would encounter one another in a way consistent with their characteristics and the overall plot.
Prerequisites
To prepare for this transformation, I see two main prerequisites — one technical and one mental. Neither is optional: without the right skills you can't produce quality specifications; without the right mindset you won't have the patience to write them.
Skills
Software specifications have existed forever — they're certainly not a 2026 novelty. Yet it's striking to watch the tech world rediscover the importance of this concept after having invented the transformer, cleared forests, and spooked stock markets with fears of an AI bubble.
Spec-Driven Development techniques are obviously based on the concept of “Specification,” which can be understood at various levels of abstraction (e.g., user story, technical specification, code template, …). The Designer must therefore be able to read and write specifications across the entire stack and must have a methodical and rigorous approach to industrialize the work.
But how do I write specifications for a solution with frontend components, backend, a message broker, various containers, and the need to deploy it on one hyperscaler rather than another?
In the past, you needed to know the basics of computing, CPUs, memory, data modeling, and telecommunications networks. Now, you need to raise the level of abstraction and broaden the perspective. This means knowing data platforms, hyperscalers, authentication patterns, deployment models and containerization, automation pipelines, and managing software across dozens or potentially hundreds of branches.
To evolve from “Developer” of 2020 to “Designer” of 2026, you need:
- knowledge of basic DevOps practices
- foundational competencies in Solution Design and the ability to navigate the most common development and deployment patterns (microservices, message brokers, containers, transport and application protocols, Security, IaC, …)
- strong technical expertise in a specific area (e.g., frontend, data engineering, …)
- understanding the basics of LLMs, particularly the role of context and context engineering techniques
Mindset
For many people, development is a passion as well as a job, as demonstrated by the countless open-source development communities. We need to get used to the idea that the Designer's work might be far less fun than today's Developer's work. This shift may not be an acceptable trade-off for everyone, but it's very likely that this is exactly where the battle over skills valued by the job market will be fought: the ability to read and write specifications will be fundamental.
We also need to fight the impulse to have “everything right now”: we can’t expect to write a prompt and have the software ready. We truly need to apply a certain level of effort and genuinely use the skills we mentioned above. As I described in the introduction, the day “wasted” writing specifications saved me about two weeks of work. But the initial feeling was exactly that: of wasting time. It’s a counterintuitive investment, and the right mindset consists precisely in accepting it.
Core Concepts
Context Window
Most people think: "the more I put into the context, the better." This idea can lead you astray, and understanding why requires at least a basic grasp of how the models work.
LLMs are autoregressive models based on the Transformer architecture. The heart of this architecture is the self-attention mechanism: for each generated token, the model calculates an “attention” score against all previous tokens in the context. This has two important practical implications:
- Quadratic complexity: the computational cost of attention grows as O(n^2) with context length — doubling the context quadruples the cost. This is not just a latency and cost issue (though both are significant): longer contexts also degrade the quality of the output itself.
- "Lost in the middle": several studies (including the well-known paper by Liu et al., 2023) have shown that LLMs tend to pay more attention to information at the beginning and end of the context, "forgetting" what's in the middle. In practice, if the crucial specification of your API sits in the middle of an 80,000-token conversation, the model may simply not take it into account.
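To make the quadratic growth concrete, here is a toy calculation in Python. The numbers are invented purely for illustration; only the O(n^2) shape matters:

# Toy illustration of attention's quadratic scaling.
# The absolute numbers are meaningless; only the O(n^2) shape matters.
for tokens in (1_000, 10_000, 100_000):
    pairs = tokens * tokens  # each token attends to (roughly) every other token
    print(f"{tokens:,} tokens -> {pairs:,} attention pairs")

# Output: 1,000 -> 1,000,000; 10,000 -> 100,000,000; 100,000 -> 10,000,000,000.
# A context 100x longer costs 10,000x more attention computation.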
Although the LLM interface presents itself as a chat, we should treat each interaction as an isolated task: the conversational history carried along is often useless, pollutes the context window, and leads the model astray. This phenomenon is called Context Bloat.
Context Engineering
The term “Context Engineering” is often confused with prompt engineering, but they are distinct concepts. Prompt engineering concerns the formulation of a single request to the LLM. Context Engineering is something broader: it’s the systematic control of everything that enters the model’s context window — system prompt, persistent instructions (like CLAUDE.md), tool results, loaded code files, memory of previous interactions, and only lastly the user’s prompt.
Think of the context as a program: every element you insert is an instruction that the model will execute (or attempt to execute). The more contradictory or irrelevant instructions you insert, the more unpredictable the “program” becomes.
With this perspective, SDD techniques are essentially Context Engineering techniques: they maximize the effectiveness of the context window by making development modular (SPEC -> CLARIFY -> PLAN -> IMPLEMENTATION) and above all by minimizing noise. Each phase operates in a clean, dedicated context, with only the information relevant to that specific task.
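A minimal sketch of what this looks like in code, assuming a hypothetical call_llm wrapper (stubbed below so the snippet runs): every phase is a fresh, isolated call that receives only the artifacts it needs, never the full conversational history.

# Minimal sketch of a phase-isolated SDD pipeline. call_llm is a hypothetical
# stand-in for any real LLM API call; each invocation starts a fresh context.

def call_llm(phase: str, **artifacts: str) -> str:
    """Stub: in reality, one isolated LLM call with a dedicated prompt."""
    return f"<{phase} output derived from {sorted(artifacts)}>"

def run_pipeline(idea: str) -> str:
    spec = call_llm("specify", idea=idea)               # clean context: just the idea
    answers = call_llm("clarify", spec=spec)            # clean context: just the spec
    plan = call_llm("plan", spec=spec, answers=answers)
    return call_llm("implement", plan=plan)             # sees the plan, not the chat

print(run_pipeline("REST API for a book library"))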
Divide and Conquer
In the realm of software development, many people (including many of today’s developers) think that the LLM is only useful for writing code, while SDD techniques are based on the assumption that the Designer uses the LLM across the entire pipeline of software work.
| Phase | Objective | LLM Usage |
|---|---|---|
| Ideation | Exploration of the solution space | For a given problem, there are potentially infinite solutions, and LLMs are a formidable tool for exploring them |
| Specifications | Detailed definition of user requirements | Beyond defining detailed specifications, this phase also identifies any gaps and areas of ambiguity |
| Design | Having a solid baseline to write code without “improvising” | Research, expansion, deep-dive, and selection of software components to create/modify, plus the development plan and testing approach |
| Implementation | Translating the design into code | Writing code and tests |
| Testing | Software verification | Running tests and identifying bugs |
- For each phase, it's worth evaluating the most suitable LLM on a case-by-case basis (a toy routing sketch follows this list). For example, at the time of writing, Claude Opus 4.6 is among the top performers in pure coding, but models like the latest versions of ChatGPT, Gemini 3, or Kimi k2.5 can be more effective and creative in the solution exploration phases.
- Reusing the same context for an entire development cycle is strongly discouraged, even if the model supports millions of tokens. The reason is the Context Bloat discussed above: the architectural decisions from the Planning phase, the user story details from the Specification phase, and the code from the Implementation phase all compete for the model’s attention. The result is a progressive degradation in quality across all phases, not just the last one.
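In code, per-phase routing can be as simple as a lookup table. The labels below reuse the model names mentioned above purely as illustrations (they are not real API identifiers), and the mapping is a point-in-time opinion rather than a benchmark result:

# Toy per-phase model routing. Labels are illustrative, not real API model IDs,
# and the mapping is an opinion that will age quickly.
MODEL_BY_PHASE = {
    "ideation": "Gemini 3",               # exploration and creativity
    "specification": "ChatGPT",           # requirements elaboration
    "implementation": "Claude Opus 4.6",  # pure coding
}

def pick_model(phase: str) -> str:
    return MODEL_BY_PHASE.get(phase, "Claude Opus 4.6")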
What Is SDD
Spec-Driven Development (SDD) is a paradigm that treats specifications as the primary source of truth for a software system. Code becomes a secondary artifact, generated or verified against the specification. Instead of the classic approach “write the code first, document later,” SDD inverts the flow: you write clear, structured specifications of the expected behavior and then generate, implement, or verify the code against them.
In other words: the specification is the product, the code is a byproduct.
This concept is not entirely new. API-first development with OpenAPI, BDD (Behavior-Driven Development), and contract-driven testing have existed for years. What changes today is that LLMs make it possible to automate the entire flow: from specification to technical plan, from plan to tasks, from tasks to code, from code to tests. The specification becomes a true control plane that orchestrates AI agents and human developers.
A recent paper on arXiv formalizes SDD as follows: “Specifications are the source of truth; code derives from them. The specification is the authoritative description that humans and machines use to understand, build, and evolve the system.”
Levels of SDD
There isn’t a single way to apply SDD. Three levels of rigor can be identified:
- Spec-first: the specification is written before implementation and guides initial development. Ideal for new services, APIs, or features with multiple consumers.
- Spec-anchored: specification and code evolve together, kept in sync through tests and validation. This is the most practical level for most teams in production.
- Spec-as-source: humans only edit specifications; code is generated from them. Suitable for highly regulated or structured domains where traceability from requirement to code must be rigorous.
Most teams will find spec-anchored to be the right compromise between rigor and agility.
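To make "kept in sync through tests and validation" concrete, here is a minimal sketch of a spec-anchored contract test in Python. The application module and the spec fragment are hypothetical; the idea is that the assertions are derived from the specification itself, so code and spec cannot silently drift apart:

# Sketch of a spec-anchored contract test (pytest + FastAPI's TestClient).
# The app module is hypothetical; the required-fields list would ideally be
# loaded from the versioned spec file itself.
import pytest
from fastapi.testclient import TestClient

from myapp.main import app  # hypothetical application under test

BOOK_REQUIRED_FIELDS = ["id", "title", "author", "isbn"]  # fragment of the spec

client = TestClient(app)

@pytest.mark.parametrize("field", BOOK_REQUIRED_FIELDS)
def test_book_payload_matches_spec(field: str):
    response = client.get("/books?title=dune")
    assert response.status_code == 200
    for book in response.json()["items"]:
        assert field in book  # every field promised by the spec must be present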
Overview of Key Frameworks
The SDD tooling ecosystem is developing rapidly. Here are the three most relevant frameworks:
Spec Kit (GitHub)
Spec Kit is GitHub’s open-source toolkit for SDD. It proposes a multi-phase workflow (Specify -> Plan -> Tasks -> Implement) and generates versioned Markdown artifacts in the repository. It’s compatible with GitHub Copilot, Claude Code, Cursor, and Gemini CLI. We’ll dive deeper into Spec Kit in the next chapter.
OpenSpec (Fission AI)
OpenSpec is a lightweight, open-source framework (TypeScript) designed to bring determinism to AI development. Its distinctive features:
- Delta Specs: captures incremental changes in requirements, rather than rewriting the entire specification
- Brownfield-first: designed to evolve existing codebases, not just greenfield projects
- No API key or complex installation: specifications live in the repository alongside the code
- Supports over 20 tools, including Claude Code, Cursor, and GitHub Copilot
BMAD Method
The BMAD Method (Breakthrough Method for Agile AI-Driven Development) is a more ambitious open-source framework, with:
- 21 specialized AI agents (Analyst, Product Manager, Architect, Developer, QA, Scrum Master, …) each with defined roles and responsibilities
- 50+ guided workflows for different project types and phases
- Multi-agent architecture: agents collaborate from ideation to implementation
- Compatible with Claude Code, Cursor, Windsurf, and other AI IDEs
Here’s a quick guide to help you choose:
| Criterion | Spec Kit | OpenSpec | BMAD |
|---|---|---|---|
| Setup complexity | Low (CLI + Markdown) | Very low (files in repo) | Medium-high (21 agents to configure) |
| Ideal for | Greenfield projects with GitHub | Evolving existing codebases | Enterprise projects with structured teams |
| Learning curve | ~1 hour | ~30 minutes | ~1 day |
| Lock-in | Low (Markdown + Git) | None (files in repo) | Medium (framework dependency) |
| Brownfield support | Limited | Excellent (Delta Specs) | Good |
In general: start with Spec Kit if you use GitHub and want a structured but lightweight workflow. Choose OpenSpec if you need to evolve an existing codebase without disrupting your workflow. Consider BMAD only if your project requires multi-role coordination and you have the time budget to configure the entire orchestra of agents.
Deep Dive into Spec Kit
Spec Kit deserves a deeper look because it represents the state of the art in SDD applied to coding agents and is directly backed by GitHub and Microsoft.
The Spec Kit workflow is organized into well-defined steps, each with a dedicated command:
1. Project Constitution (/speckit.constitution)
You define the non-negotiable principles of the project: coding standards, testing requirements, security rules, UX principles, performance targets. The constitution is automatically consulted at every subsequent phase as a constraint.
2. Functional Specification (/speckit.specify)
An idea is transformed into a structured functional specification: user stories, functional requirements, acceptance criteria. No technical details here — only the what and the why. Spec Kit automatically creates a dedicated Git branch for the feature.
3. Clarification (/speckit.clarify)
The AI agent asks structured questions to eliminate ambiguities from the specification: edge cases, constraints, error handling, permissions. This phase is critical: an ambiguous specification produces ambiguous code.
4. Technical Plan (/speckit.plan)
The validated specification is translated into a detailed technical plan: architectural decisions, data models, APIs, integrations. This is where you choose the stack, patterns, and interfaces. Generated artifacts include plan.md, data-model.md, and a contracts/ folder with API specifications.
5. Validation (/speckit.checklist, /speckit.analyze)
Quality control and consistency checking across all artifacts before writing code. Inconsistencies, gaps, and quality issues are identified.
6. Task Decomposition (/speckit.tasks)
The plan is decomposed into small, reviewable work units: each task has explicit inputs, outputs, and success criteria tied to the specification. Tasks are ordered by dependencies, and parallelizable ones are marked.
7. Implementation (/speckit.implement)
The AI agent executes the tasks, generating and modifying code, tests, and configurations according to the plan. Code is produced in small diffs, easily reviewable.
A Practical Example
Let’s imagine we want to develop a simple API for managing a book library. Here’s how the flow would unfold with Spec Kit:
Phase 1 - Constitution:
/speckit.constitution
The project follows an API-first approach. We use Python with FastAPI.
Every endpoint must have unit tests. Security: JWT authentication.
PostgreSQL database with Alembic for migrations.
Phase 2 - Specify:
/speckit.specify
Build a REST API to manage a book library.
Users can search books by title, author, or ISBN.
Administrators can add, modify, and remove books.
Each book has: title, author, ISBN, publication year, genre.
Include user stories and acceptance criteria.
At this point, Spec Kit generates a structured spec.md file with user stories like:
- As a user, I want to search books by title, so I can quickly find the book I’m interested in
- As an administrator, I want to add a new book to the catalog, specifying all metadata
Phase 3 - Clarify: The agent asks, for example: “Are there limits on the number of results per page? What happens if someone tries to insert a duplicate ISBN? Which fields are required?”
Phase 4 - Plan: A technical plan is generated. Here’s a realistic excerpt from the generated plan.md:
# Technical Plan - Library API
## Architecture
- Framework: FastAPI with Pydantic v2 for validation
- Database: PostgreSQL 16 with SQLAlchemy 2.0 (async)
- Migrations: Alembic with autogenerate
- Auth: JWT (access token 15min + refresh token 7d)
## Data Model
### Book
| Field | Type | Constraints |
|-------------|-------------|---------------------------|
| id | UUID | PK, auto-generated |
| title | VARCHAR(255)| NOT NULL, INDEX |
| author | VARCHAR(255)| NOT NULL, INDEX |
| isbn | VARCHAR(13) | UNIQUE, NOT NULL |
| year | INTEGER | CHECK (year >= 1450) |
| genre | VARCHAR(100)| NULL |
| created_at | TIMESTAMP | DEFAULT now() |
## REST Endpoints
- `GET /books?title=&author=&isbn=&page=1&size=20` -> 200 + pagination
- `GET /books/{id}` -> 200 | 404
- `POST /books` -> 201 | 400 (validation) | 409 (duplicate ISBN)
- `PUT /books/{id}` -> 200 | 404
- `DELETE /books/{id}` -> 204 | 404
- All POST/PUT/DELETE methods require header `Authorization: Bearer <token>`
Note the level of detail: types, constraints, response codes, authentication rules. There’s no ambiguity, and the AI agent receiving this plan won’t have to “guess” anything.
Phase 5-6 - Checklist & Tasks: Tasks are generated such as:
- Create SQLAlchemy models (Book, User)
- Configure Alembic and create the initial migration
- Implement the `GET /books` endpoint with filters and pagination
- Implement the `POST /books` endpoint with validation
- Add JWT authentication
- Write tests for each endpoint
Each task has explicit inputs, outputs, and success criteria. Parallelizable tasks are marked with [P], sequential ones are ordered by dependencies.
Phase 7 - Implement: The agent executes each task, producing verifiable code and tests.
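To give a sense of the output, here is a plausible excerpt of what the implementation phase could produce for the search endpoint. This is an illustration written for this article, not actual Spec Kit output, but it follows the plan's constraints (FastAPI, Pydantic v2, UUID primary key, year >= 1450, pagination):

# Plausible excerpt of generated code; illustrative, not actual Spec Kit output.
from uuid import UUID, uuid4
from fastapi import FastAPI, Query
from pydantic import BaseModel, Field

app = FastAPI(title="Library API")

class Book(BaseModel):
    id: UUID = Field(default_factory=uuid4)
    title: str = Field(max_length=255)
    author: str = Field(max_length=255)
    isbn: str = Field(min_length=10, max_length=13)
    year: int = Field(ge=1450)  # mirrors CHECK (year >= 1450) from the plan
    genre: str | None = Field(default=None, max_length=100)

class Page(BaseModel):
    items: list[Book]
    page: int
    size: int
    total: int

@app.get("/books", response_model=Page)
async def search_books(
    title: str | None = None,
    author: str | None = None,
    isbn: str | None = None,
    page: int = Query(1, ge=1),
    size: int = Query(20, ge=1, le=100),
) -> Page:
    # The real task would query PostgreSQL via SQLAlchemy 2.0 (async);
    # stubbed here to keep the sketch self-contained.
    return Page(items=[], page=page, size=size, total=0)

Every constraint in the snippet traces back to a line of the plan, which is exactly the property SDD is after.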
The key point is that every phase produces versioned Markdown artifacts in the repository, creating complete traceability from idea to code. If six months from now someone asks “why does this API work this way?”, the answer is in the specification.
Coding Agents
SDD is the methodology, but putting it into practice requires the right tools. Coding agents are the operational component of this new paradigm: AI agents that go beyond auto-completion — they plan tasks, modify codebases, run tests, and collaborate through existing DevOps workflows.
How a Coding Agent Works (Under the Hood)
Before surveying the tools, it’s worth understanding what distinguishes an “agent” from a simple chatbot. A coding agent operates according to a continuous agentic loop, which in pseudocode can be represented as:
while not task.is_complete():
    context = gather(specs, code, test_results, errors)  # rebuild context each cycle
    plan = reason(context)             # the LLM decides what to do
    action = select_tool(plan)         # tool selection: edit, bash, search...
    result = execute(action)           # real execution on filesystem/terminal
    feedback = verify(result)          # tests, lint, command output
    if feedback.has_errors:
        errors.append(feedback)        # the error feeds the next gather()
The key mechanism is tool use (or function calling): the LLM doesn’t just generate text — it emits structured calls to external tools: file editors, bash terminal, browser, APIs. This allows it to interact with the real world: read a file, modify it, run tests, and react to results.
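For example, with the Anthropic Messages API (Python SDK), a tool is declared as a JSON schema and the model can respond with a structured call instead of plain text. In the sketch below, the read_file tool is a hypothetical example and the model name is a placeholder to update:

# Minimal sketch of tool use with the Anthropic Messages API (Python SDK).
# Assumes ANTHROPIC_API_KEY in the environment; the read_file tool is a
# hypothetical example, and the model name may need updating.
import anthropic

client = anthropic.Anthropic()

tools = [{
    "name": "read_file",
    "description": "Read a text file from the project and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "Relative path"}},
        "required": ["path"],
    },
}]

response = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder: use a current tool-capable model
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What does README.md say about setup?"}],
)

for block in response.content:
    if block.type == "tool_use":  # the model emitted a structured call, not text
        print(block.name, block.input)  # e.g. read_file {'path': 'README.md'}

The agent loop then executes the requested tool and feeds the result back to the model as a new message, closing the cycle shown in the pseudocode above.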
The fundamental difference between the various agents on the market lies in which tools they have available and in which environment they operate:
- Agents with direct filesystem access (Claude Code, Cursor): operate on your machine, with full access to terminal and files. Maximum flexibility, but require supervision.
- Agents in isolated sandbox (Devin, GitHub Copilot coding agent): operate in a dedicated cloud environment. Safer for full autonomy, but less flexible for custom workflows.
The coding agent landscape has evolved rapidly, and today we can distinguish several categories:
Ecosystem-Integrated Agents
- GitHub Copilot coding agent: works directly within the Pull Request workflow. You can assign an issue to `@copilot` and the agent plans, modifies code, runs tests, and opens a PR autonomously. It's the native target for Spec Kit.
- Amazon Q Developer: AWS's AI assistant, particularly strong for cloud-native development, IaC, and application transformations (e.g., Java 8 -> 17 migration).
- Google Gemini Code Assist: strong integration with Google Cloud services (BigQuery, Firebase, Apigee). Explicitly supported by Spec Kit as an SDD target.
Editor-First Agents
- Cursor: a fork of VS Code that’s natively AI-first. The Agentic + Composer mode allows planning multi-step tasks, modifying multiple files, executing terminal commands, and iterating until tests pass.
- JetBrains AI Assistant & Junie: integrated across all JetBrains IDEs, Junie offers agentic programming for implementing fixes, refactoring, and tests.
Agent Platforms
- Claude Code / Claude Agent SDK: Anthropic’s platform based on the principle “give the agent a computer.” Claude Code has access to terminal and file system and operates with a continuous cycle: gather context -> act -> verify -> repeat. The Agent SDK allows building custom agents.
- Devin (Cognition): a fully autonomous agent with its own integrated development environment (shell, editor, browser). Still experimental and not very “enterprise-ready.”
Which Agent Should You Choose?
For a company looking to adopt SDD today, a pragmatic approach is:
- GitHub Copilot or Amazon Q for issue/PR-driven work on core services
- Gemini Code Assist for SDD workflows on analytics and GCP integrations
- Cursor or JetBrains in the IDE for high-fidelity implementation from specifications
- Claude Code / Agent SDK for custom SDD pipelines where standard tools are too rigid
The Added Value of Instruction Files: CLAUDE.md
One of the most powerful concepts to emerge with coding agents is that of persistent instruction files: Markdown files that the agent reads automatically at the start of every session to understand the project context. Each agent has its own format (.github/copilot-instructions.md for Copilot, .cursorrules for Cursor, etc.), but the most well-known and mature is Claude Code’s CLAUDE.md.
What Is CLAUDE.md
CLAUDE.md is a project-specific instruction file that Claude Code reads automatically when starting in a directory. Its purpose is to:
- Give Claude the minimum context it cannot infer from the code
- Codify critical rules and caveats that must be respected in every task
- Improve reliability and speed by avoiding repeated explanations
Think of it as a carefully curated system prompt, not a wiki. It’s a living contract between the codebase and the AI agents.
How to Set Up a Good CLAUDE.md
Best practices, confirmed by both Anthropic documentation and empirical community experience, converge on several key principles. These principles are not specific to Claude Code but derive from the general characteristics of LLMs, and therefore apply to any assistant or coding agent, even if the specific format may vary depending on the tool.
1. Less is more
Every additional line can reduce the overall quality of instruction adherence. LLMs can follow only a limited number of distinct instructions with high fidelity. When there are too many, adherence to all rules degrades — it’s not that the last ones are ignored; all of them get worse.
2. High signal, low noise
Only include information that is:
- Hard for Claude to infer by reading the code
- Relevant to the vast majority of daily tasks
3. The minimum effective structure
A good CLAUDE.md typically contains three blocks:
# CLAUDE.md
## Project
This is a Next.js + TypeScript e-commerce portal that communicates
with our internal payment and catalog APIs.
## Key Commands
- Install dependencies: `pnpm install`
- Dev server: `pnpm dev`
- Build: `pnpm build`
- Test: `pnpm test`
- Lint: `pnpm lint`
## IMPORTANT Caveats
- IMPORTANT: Do not modify `prisma/schema.prisma` directly.
Use `pnpm db:migrate` and `pnpm db:generate`.
- IMPORTANT: The `/api/webhooks/stripe` endpoint expects the raw
request body. DO NOT use a body parser.
- Images in `public/` must be optimized before committing;
files > 200KB will fail CI.
4. Don’t include style rules
Rules like "use two spaces for indentation" or "use single quotes" waste instruction budget: Claude infers them from existing code, and linters and formatters handle them better anyway.
5. Progressive disclosure
For detailed but rarely needed information, don’t weigh down the main file. Instead:
## Additional Documentation
- Database schema and migrations: read `docs/schema.md` when
modifying models.
Claude will open docs/schema.md only when necessary, instead of loading it on every task.
6. Path-specific rules with .claude/rules/
Claude Code supports path-specific instruction files:
# .claude/rules/tests.md
paths: ["**/*.spec.ts", "**/*.test.ts"]
## Testing Rules
- Use Vitest, not Jest.
- Use the helpers in `test-utils/` for component rendering.
This file is loaded only when Claude works on test files, keeping the global CLAUDE.md leaner.
7. Continuous maintenance
Treat CLAUDE.md as a living document: update it when you notice Claude repeating avoidable mistakes, remove obsolete instructions, reorder by importance. The most important rules should always be at the top of the file.
Limitations and Risks of SDD
It would be dishonest to present SDD as a solution without issues. There are concrete limitations worth knowing before adopting it:
- Non-determinism. LLMs are not deterministic: the same specification, given to the same model at two different times, can produce structurally different code. SDD therefore doesn't guarantee reproducibility: the specification drastically reduces variance compared to a generic prompt, but doesn't eliminate it. That's why contract tests and automated validation are indispensable — they're the "deterministic guardrail" that compensates for the probabilistic nature of the model. That said, the same applies to humans: the same developer, reading the same specification at two different times, can write different code.
- Garbage in, garbage out (shifted up a level). If I use an LLM to generate the specifications themselves, who validates them? The risk is automating the production of plausible but incorrect specifications — for example, a data model that seems reasonable but violates an unstated business rule. Human review of specifications is not optional: it’s the critical control point of the entire workflow.
- Scalability. The book library example works well, but what happens with a distributed system of 200 microservices? SDD scales well as long as specifications remain modularizable — one service at a time, one feature at a time. When cross-service dependencies become too intricate, the specifications themselves risk becoming a maintenance problem. This isn’t a reason not to adopt SDD, but it is a reason not to think of it as a magic wand.
- Costs. A complete SDD workflow (specify -> clarify -> plan -> tasks -> implement) consumes significantly more tokens than a single prompt. Each phase involves one or more calls to the LLM, each with its own context. On top-tier models like Claude Opus or GPT-4, a complete cycle for a medium-complexity feature can cost between 5 and 20 dollars in tokens (a toy calculation follows this list). It's an investment that pays for itself amply in terms of time saved, but it needs to be planned — especially for teams working on dozens of features in parallel.
- Overhead for simple projects. A complete SDD workflow for a 50-line script is over-engineering. SDD performs best on features with medium-to-high complexity, where ambiguities are the real cost. For trivial tasks, a good direct prompt remains the best choice.
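As a back-of-the-envelope check on the cost item above, here is a toy calculation. Every number is invented for illustration; substitute your model's real prices and your observed token usage:

# Back-of-the-envelope cost of one SDD cycle. All numbers are invented;
# plug in your model's actual prices and your observed token counts.
PRICE_IN, PRICE_OUT = 15 / 1e6, 75 / 1e6  # $/token, hypothetical top-tier model

phases = {  # (input tokens, output tokens) per phase, hypothetical
    "specify": (8_000, 3_000),
    "clarify": (12_000, 2_000),
    "plan": (20_000, 6_000),
    "tasks": (25_000, 4_000),
    "implement": (150_000, 40_000),
}

total = sum(i * PRICE_IN + o * PRICE_OUT for i, o in phases.values())
print(f"~${total:.2f} per feature")  # ~$7.35 with these made-up numbers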
All these limitations are, however, manageable through a disciplined and aware approach. In other words, a naive “vibe-coding” approach may work fine for a small prototype, but as project complexity increases, it becomes necessary to apply SDD techniques with ever-greater rigor to avoid running into these problems.
Conclusions
Spec-Driven Development is neither a passing fad nor an academic exercise.
It is a methodological and disciplined approach to software development through AI Agents, leveraging the capabilities of agents across the entire development stack — from exploring the solution space, to writing detailed specifications, to technical planning, through to implementation and testing.
But as we’ve seen, it’s not without limitations: the non-determinism of LLMs, the risk of incorrect specifications, token costs, and overhead for simple projects are all factors to consider. SDD works best when applied with judgment, not as dogma.
For those working in software development today, the message is clear:
- Invest in cross-cutting skills: architecture, DevOps, data models, security. The Designer of the future isn’t the one who writes the fastest code, but the one who writes the most precise specifications.
- Adopt gradually: start with API-first and contract tests. Then add an SDD framework like Spec Kit on a new feature. Measure the results.
- Abandon Vibe Coding: writing vague prompts and hoping for the best doesn’t scale. Investing a day in specifications to save two weeks of work isn’t “wasting time” — it’s the Designer’s craft.
- Prepare for the mindset shift: it will be less “fun” in the traditional sense, but the satisfaction of orchestrating a complex system through specifications that produce working software is, in its own way, equally rewarding.
The future of software development isn’t writing code. It’s designing systems and letting the code write itself — but under the rigorous control of someone who knows what they want to achieve.