SENAR Guide: AI Output Review Checklist

NOTE — Relationship to SENAR Core

SENAR Core includes an updated Verification Checklist (26 items, 3 tiers: Standard / High / Critical) that supersedes this checklist for Core adopters.

Tier mapping: Tier 1 (this guide) ≈ Standard (Core) | Tier 2 ≈ High | Tier 3 ≈ Critical

This Guide checklist is retained for teams that adopt SENAR through the Standard (the team-level entry point) and have not yet moved to Core. If your team uses SENAR Core, use the Core Verification Checklist instead.

When reviewing AI-generated output, check for these AI-specific issues.

The Checklist

1. Scope: Did AI modify files outside the task scope? (the most common AI error)
2. Deletions: Did AI silently remove or replace existing working code?
3. Phantom imports: Are all imported packages in the dependency file?
4. Dependency versions: Are specified versions real and published?
5. Hardcoded values: Magic numbers, URLs, credentials, or API keys in code?
6. Over-engineering: Unnecessary abstractions, patterns, or generalization?
7. Duplication: New code that duplicates existing utilities?
8. Test quality: Do tests verify behavior, or just mirror the implementation?
9. Test tampering: Did AI modify tests to pass instead of fixing the code? (sketch below)
10. Security: Open CORS, hardcoded tokens, SQL without parameterization?
11. Edge cases: The happy path works, but what about null, empty, boundary, and concurrent inputs?
12. Naming: Does the AI output follow project naming conventions?
13. Commit scope: Is the commit atomic and focused, or a kitchen sink?
14. Null guard before comparison: None == None is True in Python (JS: null === null is true; most languages have equivalent null-equality traps). Are access checks bypassed when both sides are null? (sketch below)
15. Empty config bypass: Is a security check skipped when a config value is an empty string? (if secret and ... fails open; sketch below)
16. Header trust: Are X-Forwarded-For, X-Partner-ID, or Content-Length used for security decisions without proxy validation?
17. IDOR: Is a resource accessed by ID without verifying the user's access to that resource? Auth ≠ authorization. (sketch below)
18. Return True shortcut: Does an access-control function return True or grant access without explicit ownership validation?
19. Format string injection: Is an untrusted template passed to str.format? Python replacement fields support attribute and index access, which enables injection (JS: template literals with eval(); C/C++: printf format strings; any language: string interpolation with untrusted input). (sketch below)
20. God functions/files: Functions over 50 lines or files over 400 lines are the strongest signal of unrefactored AI output.
21. Unreachable safety code: A return False placed after exhaustive exception handling; AI added it "just in case", but it can never execute. (sketch below)
22. Swallowed exceptions: catch/except blocks that discard errors (except Exception: pass, catch(e) {}, or logging only at debug level) hide real failures. Also check for null/None/nil returns that silently mask error conditions. (sketch below)
23. Unsafe deserialization: pickle.loads, yaml.load without SafeLoader, or eval/exec on untrusted input; JSON prototype pollution? (Java: ObjectInputStream; JS: eval(JSON); any language: deserializing untrusted data without validation) (sketch below)
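
Pattern Sketches

The sketches below illustrate the checklist items flagged "(sketch below)". All are minimal Python examples: helper names, routes, and secrets are hypothetical, and each "fixed" variant shows one possible remedy, not a prescribed SENAR fix.

Item 9 (test tampering) is easiest to catch in the diff of a test file. Here the "after" test passes even though the bug is still present:

```python
def apply_discount(price, code):
    # Stub with the bug left in: the discount is never applied.
    return price

# Before: the test pins the expected behavior and fails against the bug.
def test_discount_applied_before():
    assert apply_discount(100, "SAVE10") == 90

# After tampering: the assertion was weakened until it passed,
# instead of the code being fixed.
def test_discount_applied_after():
    assert apply_discount(100, "SAVE10") is not None
```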
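
Item 14 (null guard before comparison): a hypothetical ownership check in which an anonymous session (user ID None) compares equal to an ownerless resource (owner ID None) and silently grants access:

```python
def can_edit(session_user_id, owner_id):
    # BUG: None == None is True, so when both IDs are missing
    # the ownership check passes and access is granted.
    return session_user_id == owner_id

def can_edit_fixed(session_user_id, owner_id):
    # Guard against None on either side before comparing identities.
    if session_user_id is None or owner_id is None:
        return False
    return session_user_id == owner_id

assert can_edit(None, None) is True         # the bypass
assert can_edit_fixed(None, None) is False  # fails closed
```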
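
Item 15 (empty config bypass), sketched as a hypothetical webhook signature check. When WEBHOOK_SECRET is an empty string, the `if secret and ...` condition is falsy, the signature is never checked, and the function fails open:

```python
import hashlib
import hmac
import os

WEBHOOK_SECRET = os.environ.get("WEBHOOK_SECRET", "")  # may be empty

def verify(payload: bytes, signature: str) -> bool:
    # BUG: with WEBHOOK_SECRET == "", the whole condition is falsy,
    # verification is skipped, and every payload is accepted.
    if WEBHOOK_SECRET and not hmac.compare_digest(
        hmac.new(WEBHOOK_SECRET.encode(), payload, hashlib.sha256).hexdigest(),
        signature,
    ):
        return False
    return True

def verify_fixed(payload: bytes, signature: str) -> bool:
    # Fail closed: a missing secret is a configuration error, not a pass.
    if not WEBHOOK_SECRET:
        raise RuntimeError("WEBHOOK_SECRET is not configured")
    expected = hmac.new(WEBHOOK_SECRET.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```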
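
Item 17 (IDOR) as a minimal Flask-style sketch (assumes Flask is installed; the route, the in-memory data, and g.user_id are hypothetical, with g.user_id set by upstream auth middleware). Authentication proves who the caller is; the ownership check proves they may see this document:

```python
from flask import Flask, abort, g, jsonify

app = Flask(__name__)

DOCS = {42: {"id": 42, "owner_id": 7, "body": "..."}}  # in-memory stand-in

@app.route("/documents/<int:doc_id>")
def get_document(doc_id):
    doc = DOCS.get(doc_id)
    if doc is None:
        abort(404)
    # Without this check, any authenticated user can read any document
    # by enumerating IDs (IDOR). Authentication is not authorization.
    if doc["owner_id"] != g.user_id:
        abort(403)
    return jsonify(doc)
```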
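
Item 19 (format string injection) in its Python form: str.format replacement fields can traverse attributes and indexes, so formatting an attacker-controlled template against any object can reach module globals. The class and secret here are hypothetical:

```python
SECRET_KEY = "s3cr3t"  # hypothetical module-level secret

class User:
    def __init__(self, name):
        self.name = name

user = User("alice")

# Attacker-controlled template walks from the bound method to module globals:
template = "{u.__init__.__globals__[SECRET_KEY]}"
print(template.format(u=user))  # prints: s3cr3t

# Safer: keep the template trusted and pass only plain values into it.
print("Hello, {name}!".format(name=user.name))
```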
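
Items 21 (unreachable safety code) and 22 (swallowed exceptions) often travel together. A sketch with stub helpers standing in for real logic:

```python
import logging

logger = logging.getLogger(__name__)

def decode_and_check(token):
    # Stub standing in for real token validation.
    return token == "valid"

def is_token_valid(token):
    try:
        return decode_and_check(token)
    except Exception:
        return False
    return False  # item 21: dead code; every path above already returns

def fetch_profile(user_id):
    # Stub that fails, standing in for a real lookup.
    raise KeyError(user_id)

def load_profile(user_id):
    try:
        return fetch_profile(user_id)
    except Exception:
        pass  # item 22: the failure is swallowed silently
    return None  # callers cannot distinguish an error from "no profile"

def load_profile_fixed(user_id):
    try:
        return fetch_profile(user_id)
    except KeyError:
        # Catch narrowly, record the failure, and let it propagate.
        logger.warning("profile missing for user %s", user_id)
        raise
```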
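
Item 23 (unsafe deserialization): the common Python variants side by side (assumes PyYAML is installed). The dangerous calls are shown commented out:

```python
import pickle
import yaml  # PyYAML

untrusted = b"..."  # bytes from a request body, queue message, or upload

# Dangerous: pickle can execute arbitrary code while loading.
# obj = pickle.loads(untrusted)

# Dangerous: yaml.load with a full loader can construct arbitrary
# Python objects from tagged nodes.
# cfg = yaml.load(untrusted, Loader=yaml.Loader)

# Safer: safe_load only produces plain scalars, lists, and dicts.
cfg = yaml.safe_load("retries: 3\ntimeout: 30")
print(cfg)  # {'retries': 3, 'timeout': 30}
```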

Priority Tiers

Tier 1 — Always check (every task): Items 1–3 (scope, deletions, phantom imports) and 8–9 (test quality, test tampering). These catch the most frequent and most dangerous AI defects. A 5-item check takes under 2 minutes.

Tier 2 — Security-sensitive tasks (auth, payment, data, API): Items 10 (security) and 14–18 (null guard, empty config, header trust, IDOR, return True). These catch latent defect patterns: AI output that looks correct but fails under adversarial conditions. They were identified through adversarial audits of production AI-generated code.

Tier 3 — Deep review (complex tasks, agent dispatch output): All remaining items (4–7, 11–13, 19–23). Apply when reviewing complex features, refactors, or any output from dispatched agents (Rule 15 L3).

Usage

Print this list. Use it at QG-2 (Implementation Gate) for every Task. Over time, some checks become automatic habits — but keep the list visible for new Supervisors.