Specification-Centric AI Development (SCD)
The “Fix the Factory” Paradigm
Process Improvement of Prompt Generators for Deterministic AI Code Production
Version: 1.1 (Research Draft)
Date: [TBD]
Author: [TBD]
1. Executive Summary
Most AI-assisted development today follows an implicit pattern:
Generate code.
Discover defect.
Patch the code.
Move on.
This paper proposes a different model:
Do not fix the product. Fix the factory.
Specification-Centric Development (SCD) treats AI not as an autonomous agent, but as a deterministic compiler whose behavior is governed by explicit prompt generators.
When a defect is discovered:
The implementation is deleted.
A diagnostic AI classifies the root cause.
The specification or prompt generator is amended.
The entire system is regenerated.
No manual coding is permitted.
The hypothesis:
Software quality in AI-driven systems is a function of specification and generator clarity, not iterative patching.
This research evaluates whether structured prompt generators can be improved through a formal process improvement loop, resulting in a measurable reduction in defect recurrence.
2. The Problem: The Auditability Gap in AI Development
AI coding workflows suffer from three structural weaknesses:
Opaque reasoning — agent decisions are internal.
Code patching bias — developers fix outputs instead of causes.
Lack of versioned instruction discipline — prompts evolve informally.
When a defect occurs, the natural tendency is to adjust the code.
This creates:
Non-reproducible changes
Hidden instruction drift
No systematic improvement of the generation mechanism
SCD reverses this logic.
The generator, not the code, becomes the unit of improvement.
3. Research Objective
This research aims to determine:
Can prompt generators be treated as production systems subject to process improvement?
Can defects be consistently classified as instruction gaps?
Does amending generators reduce recurrence across regenerations?
Can AI diagnose and amend its own generation instructions?
The focus is not on AI creativity, but on AI repeatability.
4. Experimental Constraint
To isolate variables, the experiment is constrained to:
Stack: Node.js + Express
Persistence: SQLite
API Style: REST
Validation: Supertest integration tests
These constraints:
Reduce architectural variance
Allow deterministic test validation
Focus the research on specification clarity
This is not a limitation of the methodology, but an experimental boundary.
5. The Multi-Layer Specification Model
SCD divides instruction into three explicit layers:
| Layer | Type | Purpose |
| --- | --- | --- |
| Essential Functional Requirements | Domain | What the system must do |
| Domain Quirks / NFRs | Domain | Safety, ordering, privacy nuances |
| Architectural Standards | Technical | Stack-specific implementation constraints |
Each layer is versioned and externalized.
No implementation code is considered authoritative.
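As a concrete sketch of "versioned and externalized," the three layers could be kept as plain data documents that generators consume. The file shape, field names, and rule IDs below are illustrative assumptions, not a format prescribed by the paper:

```javascript
// Hypothetical externalized specification layers; field names and rule IDs
// are illustrative. Each layer carries its own version string.
const functionalRequirements = {
  layer: "functional",
  version: "1.3.0",
  rules: [
    { id: "FR-01", text: "POST /orders creates an order and returns 201" },
  ],
};

const domainQuirks = {
  layer: "quirks",
  version: "1.1.0",
  rules: [
    { id: "Q-01", text: "Order items must be returned in insertion order" },
  ],
};

const architecturalStandards = {
  layer: "architecture",
  version: "2.0.0",
  rules: [
    { id: "S-01", text: "All persistence goes through a single SQLite module" },
  ],
};

// No implementation code is authoritative: generators read only these layers.
const specification = {
  functionalRequirements,
  domainQuirks,
  architecturalStandards,
};
console.log(Object.keys(specification).length); // 3
```

Keeping each layer as data (rather than prose buried in a prompt) is what makes the later diagnostic step able to point at a specific versioned rule.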
6. Prompt Generators as Factories
Rather than writing a single prompt, SCD uses structured generators:
Requirements Generator
Project Setup Generator
OpenAPI Generator
Database Generator
Test Suite Generator
Diagnostic Generator
Each generator:
Is versioned
Is deterministic
Is externalized
Produces artifacts from inputs
The generators collectively constitute the factory.
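One minimal way to model a generator with these four properties is as a pure, versioned function from specification inputs to a prompt artifact. The interface below is a sketch under that assumption; the paper does not prescribe an API:

```javascript
// Hypothetical generator interface: versioned, deterministic, externalized.
// A generator maps specification inputs to a prompt artifact with no hidden state.
function makeGenerator(name, version, template) {
  return {
    name,
    version,
    generate(spec) {
      // Deterministic: same spec + same template => same prompt text.
      return {
        generator: `${name}@${version}`,
        prompt: template(spec),
      };
    },
  };
}

// Illustrative instance standing in for the paper's OpenAPI Generator.
const openApiGenerator = makeGenerator(
  "openapi-generator",
  "0.2.0",
  (spec) =>
    `Produce an OpenAPI 3 document covering: ${spec.rules
      .map((r) => r.id)
      .join(", ")}`
);

const artifact = openApiGenerator.generate({
  rules: [{ id: "FR-01" }, { id: "FR-02" }],
});
console.log(artifact.generator); // "openapi-generator@0.2.0"
```

Because every artifact records the generator name and version that produced it, defect provenance can be traced back to a specific factory revision.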
7. The Diagnostic Generator (The “Coroner”)
The Diagnostic Generator is the core research instrument.
Inputs:
Functional Requirements
NFR / Quirks
Architectural Standards
Generator versions
Generated artifacts
Failing test
Failure log
Process:
Map failing test to requirement rule.
Identify violated constraint.
Classify defect:
R1 — Missing Functional Requirement
R2 — Ambiguous Requirement
Q1 — Missing Domain Quirk
S1 — Missing Architectural Standard
G1 — Generator Omission
G2 — Generator Misinterpretation
Propose amendment to:
Specification layer, or
Generator instruction
Prohibit direct code modification.
This enforces factory-level correction.
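The taxonomy above can be encoded as a routing table that maps each defect class to the instruction layer its amendment must target. The class codes come from the paper; the routing assignments below are an illustrative assumption:

```javascript
// Defect classes from the paper, each routed to the layer that receives
// the amendment. The routing table itself is an illustrative assumption.
const DEFECT_CLASSES = {
  R1: { meaning: "Missing Functional Requirement", amend: "functional-requirements" },
  R2: { meaning: "Ambiguous Requirement", amend: "functional-requirements" },
  Q1: { meaning: "Missing Domain Quirk", amend: "domain-quirks" },
  S1: { meaning: "Missing Architectural Standard", amend: "architectural-standards" },
  G1: { meaning: "Generator Omission", amend: "generator-instructions" },
  G2: { meaning: "Generator Misinterpretation", amend: "generator-instructions" },
};

// The coroner's verdict never targets code: every classification resolves
// to a specification layer or a generator instruction.
function amendmentTarget(classification) {
  const entry = DEFECT_CLASSES[classification];
  if (!entry) throw new Error(`Unknown defect class: ${classification}`);
  return entry.amend;
}

console.log(amendmentTarget("Q1")); // "domain-quirks"
```

A classification that cannot be routed to an instruction layer is rejected outright, which is how the "prohibit direct code modification" rule becomes mechanical rather than advisory.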
8. The Regeneration Loop
The SCD loop operates as follows:
Generate full codebase from generators.
Run Supertest suite.
If failure:
Invoke Diagnostic Generator.
Classify defect.
Amend appropriate instruction layer.
Delete generated code.
Regenerate entire system.
Repeat until 100% of tests pass.
No manual code edits are allowed.
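The loop above can be sketched as a single driver function. Here `generateSystem`, `runTests`, `diagnose`, and `amend` are stand-ins for the real generators, the Supertest suite, and the Diagnostic Generator; none of these names or signatures are prescribed by the paper:

```javascript
// Illustrative SCD regeneration loop. All four callbacks are hypothetical
// stand-ins for the real generators, test suite, and coroner.
function regenerationLoop({ generateSystem, runTests, diagnose, amend }, spec, maxCycles = 10) {
  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    const system = generateSystem(spec); // full regeneration, never a patch
    const result = runTests(system);
    if (result.passed) return { cycle, spec }; // all tests green: done
    const finding = diagnose(spec, system, result.failure);
    spec = amend(spec, finding); // amend instructions, not code
    // The generated code is discarded: the next iteration regenerates it.
  }
  throw new Error("Did not reach green within maxCycles");
}

// Toy run: the fake suite fails until the spec contains rule "Q-01".
const outcome = regenerationLoop(
  {
    generateSystem: (spec) => ({ rules: [...spec.rules] }),
    runTests: (sys) =>
      sys.rules.includes("Q-01")
        ? { passed: true }
        : { passed: false, failure: "ordering test failed" },
    diagnose: () => ({ class: "Q1", addRule: "Q-01" }),
    amend: (spec, finding) => ({ rules: [...spec.rules, finding.addRule] }),
  },
  { rules: ["FR-01"] }
);
console.log(outcome.cycle); // 2
```

Note that only the specification object survives between cycles; the generated system is a throwaway value, which is the loop's enforcement of "no manual code edits."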
9. Data Collection
Each defect is logged:
Defect ID
Generator versions
Requirement version
Classification
Root cause explanation
Amendment applied
Regeneration result
Recurrence status
Planned metrics (values TBD):
% defects by layer
Mean regeneration cycles to green
Recurrence rate after amendment
Time comparison: spec fix vs manual patch
This data forms the empirical backbone of the study.
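A defect-log entry covering the fields listed above might look like the record below. The field names, validation helper, and sample values are illustrative assumptions, not the study's actual schema or data:

```javascript
// Hypothetical defect-log entry matching the fields listed above;
// names and sample values are illustrative, not real study data.
function logDefect(entry) {
  const required = [
    "defectId", "generatorVersions", "requirementVersion", "classification",
    "rootCause", "amendment", "regenerationResult", "recurred",
  ];
  for (const field of required) {
    if (!(field in entry)) throw new Error(`Missing field: ${field}`);
  }
  // Freeze the record so a logged defect cannot be silently rewritten.
  return Object.freeze({ ...entry, loggedAt: new Date().toISOString() });
}

const record = logDefect({
  defectId: "D-0007",
  generatorVersions: { "test-suite-generator": "0.3.1" },
  requirementVersion: "1.3.0",
  classification: "Q1",
  rootCause: "Spec never stated that order items preserve insertion order",
  amendment: "Added quirk Q-01 to the Domain Quirks layer",
  regenerationResult: "green",
  recurred: false,
});
console.log(record.classification); // "Q1"
```

Requiring every field up front keeps the log analyzable: recurrence rate and defects-per-layer fall directly out of `classification` and `recurred` columns.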
10. Hypothesis
Primary Hypothesis:
In AI-driven backend generation, most defects are instruction gaps, not stochastic generation errors.
Secondary Hypothesis:
Process improvement of prompt generators reduces recurrence of defect classes over successive regenerations.
11. Deliberate Non-Goals
This research does not attempt to:
Eliminate regeneration cost.
Optimize generation speed.
Compare models (GPT vs Claude vs others).
Replace traditional software engineering.
The focus is narrow:
Can structured prompt generators be hardened like production systems?
12. Contribution to the AI Landscape
Current AI discourse focuses on:
Agent autonomy
Multi-step reasoning
Tool use
This research shifts attention to:
Deterministic instruction design
Versioned specification discipline
Factory-level improvement
If successful, SCD offers:
Auditability
Reproducibility
Clear defect provenance
Structured AI governance
13. Current Status
Implemented:
Requirements Generator
Diagnostic Generator (v1)
Structured defect classification
Controlled backend domain
In Progress:
OpenAPI Generator
Database Generator
Test Suite Generator
Empirical defect logging
TBD:
Dataset size
Statistical outcomes
Recurrence analysis
Comparative studies
14. Conclusion
SCD reframes AI coding from:
“Generate and patch”
to:
“Specify, generate, diagnose, regenerate.”
The research does not ask whether AI can write code.
It asks:
Can AI systems be improved by refining the instructions that generate them?
If so, prompt generators become production assets subject to continuous improvement.
And AI becomes not an agent, but a compiler.