SENAR Guide: Agent Configuration in Practice
This chapter bridges the normative requirements of Standard Section 5 (Agent Instrumentation) with practical implementation. It covers how to write operational scripts, define agent profiles, manage script changes, and apply these concepts across different AI development tools.
Anatomy of an Operational Script
An Operational Script is a structured natural-language instruction. Unlike code, scripts are interpreted by an AI agent — which means they must be unambiguous, self-contained, and testable.
Minimal Script Structure
Script: commit
Trigger: /commit command or explicit user request
Preconditions:
- Changes exist in the working tree
- No failing tests
Algorithm:
1. Run git status to see all changes
2. Run git diff to review staged and unstaged changes
3. Analyze changes and draft a commit message:
- Summarize the nature of changes (feature, fix, refactor...)
- Keep message under 72 characters for the subject line
4. Stage relevant files (prefer specific files over git add -A)
5. Create the commit
6. Run git status to verify success
Postconditions:
- Commit exists in local repository
- Working tree is clean for committed files
Outputs:
- Commit hash
- Summary of what was committed
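The structure above translates naturally into data, which lets a Supervisor lint a script before handing it to an agent. A minimal Python sketch; the class shape and validation rules are illustrative, not mandated by SENAR:

```python
from dataclasses import dataclass, field

@dataclass
class OperationalScript:
    # Hypothetical model of the script anatomy shown above.
    name: str
    trigger: str
    preconditions: list[str] = field(default_factory=list)
    algorithm: list[str] = field(default_factory=list)
    postconditions: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return structural problems; an empty list means well-formed."""
        problems = []
        if not self.preconditions:
            problems.append("no explicit preconditions")
        if not self.algorithm:
            problems.append("no numbered steps")
        if not self.outputs:
            problems.append("no defined outputs")
        return problems

commit = OperationalScript(
    name="commit",
    trigger="/commit command or explicit user request",
    preconditions=["Changes exist in the working tree", "No failing tests"],
    algorithm=["Run git status", "Review the diff", "Draft a commit message",
               "Stage relevant files", "Create the commit", "Verify with git status"],
    postconditions=["Commit exists in local repository"],
    outputs=["Commit hash", "Summary of what was committed"],
)
```

A script that fails `validate()` is missing one of the properties the next section calls out as essential.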
What Makes a Good Script
- Explicit preconditions. If a script assumes something is true, state it. “Tests pass” is a precondition, not an assumption.
- Numbered steps with decision points. When step 3 depends on the outcome of step 2, say so. “If tests fail, stop and report. If tests pass, continue to step 4.”
- Defined outputs. What does the script produce? A commit? A report? A status change? Be specific.
- Edge cases documented. “If no changes exist, report this and exit without creating an empty commit.”
- No platform-specific calls. Write “Run the test suite”, not “Run npm test.” The script should work regardless of the project’s tech stack.
Anti-Patterns
- Vague instructions: “Make sure everything is good” — not verifiable
- Implicit knowledge: “Use the standard approach” — which standard?
- Missing error handling: No guidance for when steps fail
- Overcoupled scripts: One script that does planning, implementation, and review
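A crude check for the first two anti-patterns can be automated by scanning script text for tell-tale phrases. A sketch, with an assumed phrase list that a real team would tune to its own scripts:

```python
# Illustrative phrase lists; extend them as new vague wording slips in.
VAGUE_PHRASES = ["make sure", "everything is good", "as appropriate", "properly"]
IMPLICIT_PHRASES = ["standard approach", "the usual way", "best practices"]

def lint_script(text: str) -> list[str]:
    """Flag lines that match the anti-patterns above (vague or implicit instructions)."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in VAGUE_PHRASES:
            if phrase in lowered:
                findings.append(f"line {lineno}: vague instruction ({phrase!r})")
        for phrase in IMPLICIT_PHRASES:
            if phrase in lowered:
                findings.append(f"line {lineno}: implicit knowledge ({phrase!r})")
    return findings
```

This catches wording problems only; missing error handling and overcoupling still need human review.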
Defining Agent Profiles
An Agent Profile restricts what an AI agent can do. The restriction is the point — it prevents the agent from performing actions outside its designated function.
Generator Profile (Primary Development)
Profile: Generator
Purpose: Implement code changes for a specific Task
Scripts available:
- implement: Write code to satisfy Task acceptance criteria
- debug: Investigate and fix failures
- commit: Create version-controlled commits
- test: Run and verify test results
Access:
- Read/write: source code, tests, configuration
- Read: task definitions, knowledge base
- Write: task status updates, knowledge entries
- NO access to: deployment, infrastructure, other projects' code
Constraints:
- Changes must stay within Task scope boundaries
- Commits must be atomic (one logical change)
- Must not modify files outside defined scope
Reviewer Profile (Independent Verification)
Profile: Reviewer
Purpose: Independently verify code changes against acceptance criteria
Scripts available:
- review: Check code against acceptance criteria and standards
- security-audit: Scan for security issues
- report: Generate review findings
Access:
- Read-only: source code, tests, configuration, task definitions
- Write: review comments, findings
- NO access to: code modification, commits, deployment
Constraints:
- Cannot modify any code being reviewed
- Findings must reference specific acceptance criteria
- Must check for AI-specific issues (hallucinated APIs, self-validating tests)
Minimum Viable Profiles
At SENAR Core level, you need at minimum:
- Generator — for development work
- Reviewer — for independent verification
The key insight: even if the same AI agent switches between profiles, the profile switch enforces a cognitive boundary. The Reviewer profile cannot write code, so it evaluates rather than fixes.
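The cognitive boundary can also be made mechanical when script invocations pass through a thin dispatch layer. A Python sketch in which the profile fields and the `run_script` gate are assumptions, not part of SENAR or any specific tool:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentProfile:
    # Hypothetical enforcement model; names mirror the profiles above.
    name: str
    scripts: frozenset[str]
    can_write_code: bool

GENERATOR = AgentProfile("Generator",
                         frozenset({"implement", "debug", "commit", "test"}), True)
REVIEWER = AgentProfile("Reviewer",
                        frozenset({"review", "security-audit", "report"}), False)

def run_script(profile: AgentProfile, script: str, writes_code: bool = False) -> str:
    """Refuse any action outside the profile's designated function."""
    if script not in profile.scripts:
        raise PermissionError(f"{profile.name} may not run {script!r}")
    if writes_code and not profile.can_write_code:
        raise PermissionError(f"{profile.name} may not modify code")
    return f"{profile.name} ran {script}"
```

Switching profiles is then a matter of swapping which `AgentProfile` object the dispatcher holds, and the Reviewer physically cannot reach `implement` or `commit`.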
Managing Script Changes
Changing a script changes how your AI agent behaves in production. Treat script changes with the same rigor as production code changes.
The Script Change Workflow
1. Identify need for change
↓
2. Draft the change with rationale
↓
3. Review (peer or self, depending on configuration)
↓
4. Test on isolated project (Team+)
↓
5. Deploy to target project
↓
6. Monitor effectiveness (FPSR, error rate)
↓
7. Record decision in knowledge base
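The seven steps above can be enforced as an ordered state machine so no stage is silently skipped. A minimal sketch; the stage names are taken from the workflow, and the list should be adapted to your level (Core teams may omit the isolated-test stage, which is Team+):

```python
# Required stages, in order, mirroring the numbered workflow steps.
STAGES = ["identified", "drafted", "reviewed", "tested",
          "deployed", "monitored", "recorded"]

class ScriptChange:
    def __init__(self, description: str):
        self.description = description
        self.completed = -1  # index of the last completed stage

    def advance(self, stage: str) -> None:
        """Mark the next stage complete; refuse out-of-order transitions."""
        expected = STAGES[self.completed + 1]
        if stage != expected:
            raise ValueError(f"cannot enter {stage!r}; next stage is {expected!r}")
        self.completed += 1

    @property
    def done(self) -> bool:
        return self.completed == len(STAGES) - 1
```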
Testing Script Changes
Before deploying a script change across all projects:
- Select a representative task — pick a task similar to what the script will handle
- Run the task with the old script — record FPSR and any issues
- Run a similar task with the new script — compare results
- Check for regressions — did the change break any existing behavior?
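Assuming FPSR is a simple first-pass success rate over recent tasks, the old-versus-new comparison reduces to a few lines. A sketch; adapt `fpsr` to however your team actually defines the metric:

```python
def fpsr(outcomes: list[bool]) -> float:
    """Fraction of tasks that succeeded on the first pass (assumed FPSR definition)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def compare_scripts(old_outcomes: list[bool], new_outcomes: list[bool]) -> str:
    """Summarize whether the new script improved or regressed FPSR."""
    old, new = fpsr(old_outcomes), fpsr(new_outcomes)
    if new > old:
        return f"improvement: {old:.0%} -> {new:.0%}"
    if new < old:
        return f"regression: {old:.0%} -> {new:.0%}"
    return f"no change at {old:.0%}"
```

With the handful of tasks typical of a pilot, treat the numbers as a signal, not statistical proof.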
Rollback
Every script change should be reversible:
- Keep scripts in version control so git revert works
- Maintain a version identifier in each script (date or semver)
- Document what to watch for after rollback (tasks in progress may be affected)
Tool-Specific Examples
Claude Code
Claude Code uses CLAUDE.md for the Behavioral Contract and .claude/skills/ for Operational Scripts.
Behavioral Contract (CLAUDE.md):
# Project Rules
- NEVER commit without explicit permission
- NEVER modify files outside src/ and tests/
- Run tests before every commit
- Ask before installing new dependencies
Operational Script (.claude/skills/review.md):
# Review Skill
Trigger: /review command
Steps:
1. Read the current git diff
2. For each changed file, check:
a. Does the change match the stated Task goal?
b. Are there any hardcoded values that should be configurable?
c. Are error cases handled?
d. Do tests cover the acceptance criteria, not just code paths?
3. Check for AI-specific issues:
a. Hallucinated APIs or methods
b. Non-existent package references
c. Self-validating test patterns
4. Report findings with specific line references
Profile switching: Claude Code does not natively enforce profile permissions. Implement via CLAUDE.md rules: “When performing a review, you are in Reviewer mode. Do NOT modify any code. Only report findings.”
OpenAI ChatGPT / GPT-4
Behavioral Contract (system prompt or custom instructions):
You are working in Generator mode. You may:
- Read and write source code and tests
- Create commits with descriptive messages
- Search the codebase
You may NOT:
- Modify deployment configuration
- Access external APIs
- Create files outside the project directory
Operational Scripts (structured prompts or saved workflows): ChatGPT / GPT-4 does not have a native skill/script mechanism. Scripts are implemented as structured prompt templates that the Supervisor provides at the start of each task.
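Since there is no native script mechanism, the prompt template itself becomes the artifact kept in version control. A sketch of rendering a script into a system prompt; the template wording is illustrative:

```python
# Hypothetical template; the phrasing is one example, not an OpenAI feature.
SCRIPT_TEMPLATE = """You are operating under the '{name}' script (version {version}).

Preconditions (verify before starting):
{preconditions}

Steps (follow in order; stop and report if any step fails):
{steps}

Report these outputs when done:
{outputs}"""

def render_script_prompt(name: str, version: str, preconditions: list[str],
                         steps: list[str], outputs: list[str]) -> str:
    """Fill the template: bullet the lists and number the steps."""
    bullets = lambda items: "\n".join(f"- {item}" for item in items)
    numbered = "\n".join(f"{n}. {step}" for n, step in enumerate(steps, 1))
    return SCRIPT_TEMPLATE.format(name=name, version=version,
                                  preconditions=bullets(preconditions),
                                  steps=numbered, outputs=bullets(outputs))
```

The Supervisor pastes the rendered prompt at the start of each task, so every conversation runs the same version-controlled script.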
Cursor
Behavioral Contract (.cursorrules):
Rules:
- Always run type checking before suggesting changes as complete
- Never modify files not mentioned in the current task
- Prefer editing existing files over creating new ones
Operational Scripts (.cursor/prompts/ or manual invocation):
Cursor supports custom prompts that function similarly to operational scripts. Store these in version control alongside the project.
Practical Workflow: Script Change Lifecycle
Scenario: Your “implement” script produces code that frequently has type errors. You want to add a “run type checking” step before marking the implementation as complete.
Step 1: Document the Problem
In your knowledge base:
Type: observation
Title: Implementation script produces type errors
Body: In the last 5 tasks, 3 had type checking failures caught at CI
(QG-2). Adding a type check step to the implement script should
catch these earlier and improve FPSR.
Step 2: Draft the Change
Add to the implement script, after the “write code” step:
5. Run the project's type checker
6. If type errors exist:
a. Fix them
b. Re-run type checker
c. Repeat until clean
7. Continue to tests
Step 3: Review
For Core: self-review — does the addition make sense? Does it conflict with other steps?
For Team+: peer review — another Supervisor reviews the script change. Check: is the new step clear? Are edge cases handled (what if the project has no type checker)?
Step 4: Test (Team+)
Run one task with the new script on a non-critical project. Did the type check step execute correctly? Did it catch type errors before CI? Did it add unreasonable time?
Step 5: Deploy
Commit the script change. Update the version identifier.
Step 6: Monitor
Over the next 5–10 tasks, track:
- Did type checking failures at CI (QG-2) decrease?
- Did task completion time change significantly?
- Any unexpected side effects?
Step 7: Record
Type: decision
Title: Added type checking step to implement script
Body: After 3/5 tasks had type errors at CI, added explicit type check
step to implement script. First 5 tasks after change: 0/5 type
errors at CI. ~2 min added per task, saves ~10 min rework.
Summary
| Concept | Core | Team+ |
|---|---|---|
| Agent Profiles | Generator + Reviewer (SHOULD) | All 5 with permission enforcement (SHALL) |
| Operational Scripts | In version control, self-reviewed | Reviewed, tested, registry maintained |
| Script Changes | Treated as process changes | Tested on isolated project first |
| Rollback | Version control provides | Registry + explicit rollback procedure |
| Audit Trail | Knowledge entries | Formal change log |