SENAR
Supervised Engineering & Normative AI Regulation
The methodology for producing verified software when AI writes the code and humans supervise.
The Problem
AI coding tools are everywhere. Governance for them is nowhere.
What happens without a standard
- Plausible wrong code. You give AI a vague prompt. It produces code that compiles, runs, and does the wrong thing. You don't notice because it looks right.
- Self-validating bugs. AI writes code AND the tests. The tests pass because they test the AI's misunderstanding, not your requirement.
- Knowledge evaporates. You spend 40 minutes discovering that approach X doesn't work. Next session, the AI tries approach X again. Nobody wrote it down.
- No way to measure. Is AI making you faster? Or just making bugs faster? You have no data either way.
What SENAR changes
- Requirements before code. AI receives structured context: goal, acceptance criteria, scope boundaries. Not vibes.
- Human verifies against criteria. Not "looks right": each acceptance criterion is checked independently against the requirement, not against the AI's output.
- Dead ends documented. One sentence: "Approach X fails because Y." Prevents hours of repeated failure across sessions.
- Ten metrics tell the truth. First-Pass Success Rate, Defect Escape Rate, Throughput, Lead Time, Adversarial Detection Rate, and five more. You know exactly what AI is doing for you.
"AI does not ask clarifying questions — it picks an interpretation silently."
The resulting code compiles, passes the tests the AI wrote (which test the wrong behavior), and looks correct on review. This is the self-consistent artifact problem: AI generates code, tests, and documentation that are perfectly consistent with each other — and all wrong.
Every methodology before SENAR — Scrum, SAFe, Kanban — was designed for teams of humans who can ask each other "did you mean X or Y?" AI doesn't ask.
SENAR is the missing layer between "I use AI" and "I produce verified software with AI."
What Makes SENAR Different
This is not "write good prompts in ISO language." These are structural problems that require structural solutions.
Self-Consistent Artifact Problem
AI generates code, tests, and docs that agree with each other — but not with your requirement. Standard QA catches nothing because the tests validate the code against the AI's interpretation, not yours. SENAR breaks this loop: the Supervisor verifies against acceptance criteria written before the AI saw the task.
Quality at Input
A defect in a requirement cascades to code, tests, and documentation. AI amplifies this because it doesn't push back, doesn't ask "did you mean X or Y?" and doesn't flag ambiguity. SENAR's cascade principle: fix inputs, or every output is contaminated. The 3-level requirement hierarchy (BR → SR → TR) is not bureaucracy — it's defect prevention.
Documentation as Active Context
Code docs are not post-hoc bureaucracy. They are working context that the AI reads on every task. Good docs reduce per-task overhead because the AI starts with correct assumptions instead of guessing. Dead ends, architecture decisions, scope boundaries — they exist for the AI's benefit, not for a filing cabinet.
AI Model as Supplier
When your model version changes, your production unit changes. This is equivalent to a compiler version change: baselines must be recalibrated, quality gates re-validated, and efficiency metrics re-baselined. SENAR is the first methodology to treat AI model governance as a supply chain management problem, not just "pick the best model."
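The supplier principle can be made concrete as a guard in CI or a session script: metrics measured under one model version are never compared against another without re-baselining. A minimal sketch — the file name, field names, and model identifiers here are illustrative assumptions, not SENAR-normative artifacts:

```python
import json
from pathlib import Path

# Hypothetical baseline file recording which model version the metrics were measured on.
BASELINE = Path("metrics_baseline.json")

def baseline_valid_for(current_model: str) -> bool:
    """A metrics baseline is only valid for the exact model version it was measured on."""
    if not BASELINE.exists():
        return False
    return json.loads(BASELINE.read_text())["model"] == current_model

# Baseline measured on one model version...
BASELINE.write_text(json.dumps({"model": "model-x-2025-01", "fpsr": 0.85}))
assert baseline_valid_for("model-x-2025-01")

# ...is invalid once the supplier ships a new version.
if not baseline_valid_for("model-x-2025-06"):
    print("Model changed: re-baseline FPSR and re-validate quality gates.")
```

The point of the check is that a model upgrade fails loudly instead of silently contaminating your trend lines.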
Quick Start: 6 Habits, 5 Minutes
No meetings. No certifications. No tools to buy. Start today.
Write the WHAT
One sentence: what will be true when this task is done? If you can't write it, you don't know what you're building.
Action: Write a goal sentence before you open the AI tool.
Write the DONE
Numbered acceptance criteria. Testable. "User can X" not "system should be good." Set scope boundaries — tell the AI what NOT to touch.
Action: Write 2-5 acceptance criteria before code starts.
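The first two habits fit in one small structured record handed to the AI before any code is written. A minimal sketch — field names and the example task are illustrative, not normative SENAR terms:

```python
from dataclasses import dataclass, field

@dataclass
class TaskContext:
    """Structured context the AI receives before it sees the task.
    Field names are illustrative, not part of the SENAR spec."""
    goal: str                                              # the WHAT: one sentence
    acceptance_criteria: list[str]                         # the DONE: 2-5 testable items
    do_not_touch: list[str] = field(default_factory=list)  # scope boundaries

    def is_well_formed(self) -> bool:
        # Ready for the AI only with a goal and 2-5 testable criteria.
        return bool(self.goal.strip()) and 2 <= len(self.acceptance_criteria) <= 5

task = TaskContext(
    goal="User can reset their password via an emailed one-time link",
    acceptance_criteria=[
        "Reset email is sent within 60 seconds of request",
        "Link expires after 30 minutes",
        "Old password no longer works after reset",
    ],
    do_not_touch=["auth/session middleware", "user DB schema"],
)
assert task.is_well_formed()
```

A plain markdown file works just as well; the record only has to exist before the AI tool opens.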
Verify each criterion
Check each AC independently. Not "looks right" but "criterion #3 is satisfied because I tested X." The AI's tests don't count — they test the AI's interpretation.
Action: Go through AC one by one. Check, don't glance.
Record dead ends
When an approach fails, write one sentence: "X fails because Y." Costs 10 seconds. Saves 40 minutes next session.
Action: Keep a dead-ends file. One line per failure.
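The dead-ends file needs no tooling, but it can also be scripted so the check happens before a retry. A minimal sketch, assuming a plain-text log named `DEAD_ENDS.md` (the filename is an illustrative choice):

```python
from pathlib import Path

DEAD_ENDS = Path("DEAD_ENDS.md")  # hypothetical filename; any plain-text file works

def record_dead_end(approach: str, reason: str) -> None:
    """Append one line per failed approach: 'X fails because Y'."""
    with DEAD_ENDS.open("a", encoding="utf-8") as f:
        f.write(f"- {approach} fails because {reason}\n")

def known_dead_end(approach: str) -> bool:
    """Check the log before letting the AI retry an approach."""
    if not DEAD_ENDS.exists():
        return False
    return any(approach in line
               for line in DEAD_ENDS.read_text(encoding="utf-8").splitlines())

record_dead_end("caching tokens in localStorage", "the session cookie is httpOnly")
assert known_dead_end("caching tokens in localStorage")
```

Ten seconds to write the line; the check is what stops the next session from paying for the same failure twice.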
Tests pass + AC met = done
Not "I think it works." CI green, types clean, every acceptance criterion checked. The task is done or it isn't.
Action: Don't close a task until CI passes and AC are checked off.
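The done gate is a strict conjunction — no partial credit. A sketch with illustrative names (the gate logic is the point, not the function signature):

```python
def task_done(ci_green: bool, types_clean: bool, ac_checked: dict[str, bool]) -> bool:
    """A task is done only when CI passes, types are clean, and every
    acceptance criterion has been individually verified."""
    return ci_green and types_clean and bool(ac_checked) and all(ac_checked.values())

# Two of three criteria verified: not done, regardless of green CI.
assert not task_done(True, True, {"AC1": True, "AC2": True, "AC3": False})
assert task_done(True, True, {"AC1": True, "AC2": True, "AC3": True})
```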
Capture knowledge
Write down anything non-obvious you learned. Architecture decisions, gotchas, workarounds. The AI reads these next time.
Action: Spend 60 seconds writing what you learned.
SENAR Core
8 rules. 2 gates. 2 metrics. Adopt in 1 hour.
The minimal, self-contained subset of SENAR. No tooling required. No meetings. No roles to assign. Just the essential discipline for AI-assisted development.
Choose Your Level
Start with Core. Scale when you need to.
| Feature | Core | Foundation | Team | Enterprise |
|---|---|---|---|---|
| Rules | 8 | 11 | 15 | 15 |
| Quality Gates | 2 (Start/Done) | 3 (QG-0..QG-2) | 5 (QG-0..QG-4) | 5 + custom |
| Metrics | 2 (FPSR, Dead End Rate) | 4 | 10 | 10+ |
| Roles | 1 | 3 (combined) | 5 (dedicated) | 5 + portfolio |
| Ceremonies | 0 | 3 | 7 | 7 + portfolio |
| Tooling required | None | Recommended | Required | Required |
Before and After
Real data from 552 tasks across 38 sessions. Single-team project (5 services, 2 frontends). Honest caveats included.
Without SENAR
- Tasks without acceptance criteria → AI produces plausible wrong code that passes review
- No dead ends → same failed approaches retried session after session
- Marathon sessions (6+ hours) → 2.7x efficiency drop vs. focused sessions
- No metrics → no way to tell if AI is helping or hurting
- Cost: ~$105 per escaped defect in rework time
With SENAR
- Every task has goal + acceptance criteria → 85%+ First-Pass Success Rate
- Dead ends documented → 100% knowledge reuse across sessions
- Sessions capped at natural breakpoints → consistent, predictable output
- Ten metrics tracked → you know exactly where AI helps and where it doesn't
- Cost: ~5 min per session + 1-3 min per task overhead
Honest caveats
- Data is from one team, one stack, one AI tool (Claude Code). Your results will vary.
- The 85% FPSR was achieved at maturity. Early sessions were lower (~60-70%) as practices were being established.
- SENAR doesn't make bad requirements good. It makes bad requirements visible before they become bad code.
- Overhead numbers assume you're already writing some form of task description. If you currently use no process at all, initial overhead will feel higher.
Four Documents
Everything you need. Nothing you don't.
Core
The essential rules. Adopt in under 1 hour.
- 8 rules for disciplined AI development
- 2 quality gates (Start + Done)
- 2 metrics (FPSR + Dead End Rate)
- 27-item verification checklist
- No tooling required
- Self-contained, tool-agnostic
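The two Core metrics can be computed from a plain task log. A minimal sketch — these exact formulas are assumptions for illustration; the Standard defines the normative versions:

```python
def first_pass_success_rate(tasks: list[dict]) -> float:
    """Fraction of tasks whose acceptance criteria all passed on the
    first verification, with no rework."""
    if not tasks:
        return 0.0
    return sum(t["first_pass"] for t in tasks) / len(tasks)

def dead_end_rate(tasks: list[dict]) -> float:
    """Documented dead ends per task."""
    if not tasks:
        return 0.0
    return sum(t["dead_ends"] for t in tasks) / len(tasks)

log = [
    {"first_pass": True,  "dead_ends": 0},
    {"first_pass": True,  "dead_ends": 1},
    {"first_pass": False, "dead_ends": 2},
    {"first_pass": True,  "dead_ends": 0},
]
print(f"FPSR: {first_pass_success_rate(log):.0%}")        # 75%
print(f"Dead End Rate: {dead_end_rate(log):.2f}/task")    # 0.75/task
```

Two booleans and a counter per task are enough; the log can live in the same file as your task descriptions.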
Standard
The full normative document. Looking for a simpler starting point? See Core.
- 5 roles (Supervisor, Context Architect, Knowledge Engineer, Flow Manager, Verification Engineer)
- 5 units of work (Epic → Story → Task → Subtask → Session)
- 7 ceremonies (Planning, Review, Retro...)
- 5 quality gates with entry/exit criteria
- 10 metrics with formulas and thresholds
- 15 normative rules
- 4 configurations (Core / Foundation / Team / Enterprise)
- Requirement hierarchy: BR → SR → TR
- Agent Instrumentation: profiles, scripts, interface
Guide
The "why" and "how." Philosophy, patterns, practical training.
- Quick Start (5 minutes, 6 habits)
- 6 pillars of AI-native development
- Requirements engineering for AI
- 23-item AI review checklist
- Failure modes and anti-patterns
- Scaling patterns (Core → Foundation → Team → Enterprise)
- Onboarding new team members
Reference
Lookup tables, models, compliance mapping.
- 44-term glossary with formal definitions
- Efficiency model and cost formulas
- AI model governance framework
- Scaling ratios and team sizing
- Tooling requirements checklist
- ISO 9001 / SAFe compatibility notes
Who Is It For
Same methodology. Four levels of ceremony.
1 developer
Solo developer using Claude Code, Cursor, Copilot, or any AI coding tool. Start with the Quick Start guide. No meetings required.
- 2 quality gates
- 2 metrics (FPSR, Dead End Rate)
- 8 rules
- ~5 min session overhead
- ~1-3 min per task overhead
1-3 developers
Combined roles, session management, monthly quality sweep. The bridge between solo practice and team coordination.
- 11 rules
- 4 metrics
- 3 ceremonies
- QG-0..QG-2
- 3 combined roles
3-10 developers
Full requirement hierarchy (BR → SR → TR). Dedicated Supervisor and Reviewer roles. Requirement library. Federation across projects.
- 5 quality gates
- 10 metrics
- All 7 ceremonies
- Cross-project coordination
- Knowledge base with reuse tracking
10+ developers
Requirements-as-code with CI validation. Audit trails for compliance. Portfolio-level AI cost tracking and budget governance.
- Full governance model
- AI model supplier management
- ISO 9001 compatible
- SAFe integration patterns
- Efficiency model with cost formulas
TAUSIK
Task Agent Unified Supervision, Inspection & Knowledge
Open-source framework that enforces SENAR rules automatically. Quality gates that hard-block agents from skipping steps — not recommendations, but enforcement. Works with Claude Code, Cursor, and Windsurf.
- 15 automated checks — pytest, ruff, tsc, eslint, cargo, go vet
- 33 structured skills — /plan, /ship, /review, /audit, /debug
- 80 MCP tools — programmatic access to project memory
- 6 automatic metrics — FPSR, DER, throughput, lead time
- Zero core dependencies — Python 3.11+ standard library only
AI writes the code.
You set the standard.
Start with Core. Scale to full governance. Open standard, free forever.
SENAR v1.3 · 25.03.2026 · CC BY-SA 4.0 · Normative language per RFC 2119