SENAR Guide: AI Output Review Checklist
NOTE — Relationship to SENAR Core
SENAR Core includes an updated Verification Checklist (26 items, 3 tiers: Standard / High / Critical) that supersedes this checklist for Core adopters.
Tier mapping: Tier 1 (this guide) ≈ Standard (Core) | Tier 2 ≈ High | Tier 3 ≈ Critical
This Guide checklist is retained for teams adopting SENAR through the Standard (team-level entry point) who have not yet moved to Core. If your team uses SENAR Core, use the Core Verification Checklist instead.
When reviewing AI-generated output, check for these AI-specific issues.
The Checklist
| # | Check | What to Look For |
|---|---|---|
| 1 | Scope | Did AI modify files outside the task scope? (most common AI error) |
| 2 | Deletions | Did AI silently remove or replace existing working code? |
| 3 | Phantom imports | Are all imported packages declared in the dependency file? |
| 4 | Dependency versions | Are specified versions real and published? |
| 5 | Hardcoded values | Magic numbers, URLs, credentials, API keys in code? |
| 6 | Over-engineering | Unnecessary abstractions, patterns, or generalization? |
| 7 | Duplication | New code that duplicates existing utilities? |
| 8 | Test quality | Do tests verify behavior, or just mirror implementation? |
| 9 | Test tampering | Did AI modify tests to pass instead of fixing the code? |
| 10 | Security | Open CORS, hardcoded tokens, SQL without parameterization? |
| 11 | Edge cases | Happy path works — what about null, empty, boundary, concurrent? |
| 12 | Naming | Does AI follow project naming conventions? |
| 13 | Commit scope | Is the commit atomic and focused, or a kitchen sink? |
| 14 | Null guard before comparison | Is an access check bypassed when both sides are null? None == None is True in Python (JS: null === null is true; most languages have equivalent null-equality traps). |
| 15 | Empty config bypass | Is a security check skipped when the config value is an empty string? (if secret and ... fails open) |
| 16 | Header trust | X-Forwarded-For, X-Partner-ID, Content-Length used for security without proxy validation? |
| 17 | IDOR | Resource accessed by ID without verifying user’s access to that resource? Auth ≠ authorization. |
| 18 | Return True shortcut | Access control function returns True/grants access without explicit ownership validation? |
| 19 | Format string injection | str.format(**untrusted_dict) — Python format supports attribute access, enables injection (JS: template literals with eval(); C/C++: printf format strings; any language: string interpolation with untrusted input) |
| 20 | God functions/files | Functions >50 lines, files >400 lines — strongest signal of unrefactored AI output |
| 21 | Unreachable safety code | return False after exhaustive exception handling: code the AI added “just in case” that can never execute |
| 22 | Swallowed exceptions | catch/except blocks that discard errors (except Exception: pass, catch(e) {}, or logging at debug level) — hides real failures. Check for null/None/nil returns that silently mask error conditions |
| 23 | Unsafe deserialization | pickle.loads, yaml.load without SafeLoader, eval/exec on untrusted input, JSON prototype pollution? (Java: ObjectInputStream; JS: eval(JSON); any language: deserializing untrusted data without validation) |
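
To illustrate item 19, here is a minimal sketch of the variant where the format template itself comes from untrusted input. The `Settings` and `User` classes, the helper names, and the secret value are all hypothetical. `string.Template` is shown as one possible mitigation, not as the fix this guide prescribes.

```python
from string import Template


class Settings:
    """Hypothetical settings object that happens to hold a secret."""

    def __init__(self) -> None:
        self.secret_key = "demo-secret"  # placeholder value for illustration


class User:
    """Hypothetical user object that keeps a reference to the settings."""

    def __init__(self, name: str, settings: Settings) -> None:
        self.name = name
        self.settings = settings


def render_unsafe(template: str, user: User) -> str:
    # str.format resolves attribute access inside replacement fields,
    # so an attacker-controlled template can walk the object graph.
    return template.format(user=user)


def render_safer(template: str, user: User) -> str:
    # string.Template only substitutes plain $name placeholders;
    # it does not support attribute or index access.
    return Template(template).safe_substitute(name=user.name)


user = User("alice", Settings())
leaked = render_unsafe("{user.settings.secret_key}", user)
harmless = render_safer("Hello $name {user.settings.secret_key}", user)
```

In this sketch `leaked` contains the secret, while `harmless` leaves the brace expression as literal text. During review, a useful question is whether any string reaching `.format()` could have been written by a user.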
Priority Tiers
Tier 1 — Always check (every task): Items 1–3 (scope, deletions, phantom imports) and 8–9 (test quality, test tampering). These catch the most frequent and most dangerous AI defects. A 5-item check takes under 2 minutes.
Tier 2 — Security-sensitive tasks (auth, payment, data, API): Items 10 (security) and 14–18 (null guard, empty config, header trust, IDOR, return True). These catch latent defect patterns: AI output that looks correct but fails under adversarial conditions. The patterns were identified through adversarial audits of production AI-generated code.
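
Two of the Tier 2 patterns, the null guard (item 14) and the empty config bypass (item 15), are easy to miss because the code reads naturally. The sketch below shows one possible shape of each defect next to a fail-closed alternative. All function and parameter names are hypothetical, and the unsafe variants are illustrations, not code taken from any audited project.

```python
import hmac
from typing import Optional


def same_owner_unsafe(resource_owner_id: Optional[int], user_id: Optional[int]) -> bool:
    # Defect: if both IDs are missing, None == None evaluates to True.
    return resource_owner_id == user_id


def same_owner_safe(resource_owner_id: Optional[int], user_id: Optional[int]) -> bool:
    # Guard first: a missing ID on either side never grants access.
    if resource_owner_id is None or user_id is None:
        return False
    return resource_owner_id == user_id


def verify_token_unsafe(provided: Optional[str], expected: Optional[str]) -> bool:
    # Defect: when the configured secret is empty or unset, the
    # comparison is skipped and the function falls through to True.
    if expected and provided != expected:
        return False
    return True


def verify_token_safe(provided: Optional[str], expected: Optional[str]) -> bool:
    # Fail closed: missing configuration or a missing token denies access.
    if not expected or not provided:
        return False
    # Constant-time comparison avoids leaking match length via timing.
    return hmac.compare_digest(provided.encode(), expected.encode())
```

With these definitions, `same_owner_unsafe(None, None)` and `verify_token_unsafe("anything", "")` both grant access, while the safe variants deny it. When reviewing, try each access check with both inputs missing and with the secret unset.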
Tier 3 — Deep review (complex tasks, agent dispatch output): All remaining items (4–7, 11–13, 19–23). Apply when reviewing complex features, refactors, or any output from dispatched agents (Rule 15 L3).
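
Some Tier 3 items are mechanical enough to pre-screen with a script before the human review. The sketch below, assuming Python sources and Python 3.8 or later (for `end_lineno`), flags functions over the 50-line threshold from item 20 and `except` blocks whose body is only `pass` (one form of item 22). The function names are hypothetical, and the script does not detect the other forms of swallowed exceptions listed in the table.

```python
import ast

MAX_FUNCTION_LINES = 50  # threshold taken from item 20


def long_functions(source: str, max_lines: int = MAX_FUNCTION_LINES) -> list:
    """Return (name, line_count) for each function longer than max_lines."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count from the def line to the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                findings.append((node.name, length))
    return findings


def swallowed_handlers(source: str) -> list:
    """Return line numbers of except blocks whose body is only `pass`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            if all(isinstance(stmt, ast.Pass) for stmt in node.body):
                findings.append(node.lineno)
    return findings


# Synthetic input: a short function that swallows an exception,
# followed by a 61-line function.
example_source = (
    "def small():\n"
    "    try:\n"
    "        risky()\n"
    "    except Exception:\n"
    "        pass\n"
    "\n"
    "def big():\n" + "    x = 1\n" * 60
)
too_long = long_functions(example_source)
swallowed = swallowed_handlers(example_source)
```

A hit from either function is a place to look, not a verdict: a long function may be justified, and a bare `pass` may be deliberate if the reason is documented.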
Usage
Print this list. Use it at QG-2 (Implementation Gate) for every Task. Over time, some checks become automatic habits — but keep the list visible for new Supervisors.