1.4.2 AI Coding Agent Workflow

AI coding agents are no longer only autocomplete tools. Modern tools such as Codex and Google Antigravity can inspect a repository, edit files, run tests, and leave a task trail. That power is useful only when you give the agent a narrow job, a permission boundary, and a way to prove what changed.
This lesson teaches the workflow before the magic. You should finish with a small run card that another developer can review.
Why This Appeared
Section titled “Why This Appeared”Older AI coding help mostly answered “how do I write this function?” Agentic coding answers a different question:
Can a model move through a real repository, change code, verify it, and explain the result without hiding the risk?
That shift happened because model reasoning, tool calling, long context, terminal access, and code-review interfaces became good enough to form a loop:
- Read the repo and constraints.
- Plan a small edit.
- Patch files.
- Run tests or checks.
- Summarize evidence.
- Ask for human review when risk is high.
The problem it solves is not “write code faster.” The deeper problem is reducing handoff cost: the agent can collect context, make a scoped change, and package proof for the human.
What Problem It Solves
Section titled “What Problem It Solves”| Problem in coding work | Agent role | Human role | Evidence to keep |
|---|---|---|---|
| Large repo is hard to navigate | Search files, map entry points, identify likely owners | Confirm the requested scope | Search notes and touched files |
| Small bug requires many mechanical edits | Apply consistent edits and formatters | Check intent and product behavior | Diff, tests, screenshots |
| Tests fail for unclear reasons | Read logs, isolate failing layer, propose repair | Decide whether repair is in scope | Failing command and fixed command |
| Refactor risk is hidden | Produce a risk card before editing | Approve, narrow, or reject | Risk level and rollback note |
| Review takes too long | Summarize why each change exists | Review behavior and edge cases | Commit message and QA notes |
Decision Table
Section titled “Decision Table”Use this table before giving the agent a task.
| Situation | Good agent task | Not a good first task | Required gate |
|---|---|---|---|
| One failing unit test | ”Fix this test failure and explain the root cause" | "Rewrite the whole module” | Run the failing test and one nearby test |
| UI copy or layout issue | ”Adjust this section and verify screenshot" | "Redesign the whole app” | Browser screenshot |
| New lesson page | ”Add a page matching existing template" | "Invent a new curriculum structure” | Link check and course QA |
| Security or data deletion | ”Inspect and propose a patch plan" | "Run destructive cleanup” | Human approval |
| Dependency upgrade | ”Assess breaking changes and update one package" | "Upgrade everything” | Lockfile diff and build |
Runnable Lab: Build an Agent Run Card
Section titled “Runnable Lab: Build an Agent Run Card”Create agent_run_card.py and run it with Python 3.10 or later.
import jsonfrom pathlib import Path
task = { "request": "fix a broken course sidebar link", "files_likely_touched": ["src/content/docs", "astro.config.mjs"], "can_run_tests": True, "touches_user_data": False, "changes_public_behavior": True,}
def classify_risk(info): if info["touches_user_data"]: return "high" if info["changes_public_behavior"]: return "medium" return "low"
def choose_gates(info): gates = ["read surrounding files", "make minimal patch", "record diff"] if info["can_run_tests"]: gates.append("run relevant QA command") if info["changes_public_behavior"]: gates.append("capture before/after behavior") return gates
run_card = { "task": task["request"], "agent_scope": "one narrow bug or content fix", "risk": classify_risk(task), "permissions": { "read": True, "edit": True, "network": False, "destructive_commands": False, }, "gates": choose_gates(task), "evidence_file": "agent_evidence.md",}
Path("agent_run_card.json").write_text(json.dumps(run_card, indent=2), encoding="utf-8")print(json.dumps(run_card, indent=2))Expected output:
{ "task": "fix a broken course sidebar link", "agent_scope": "one narrow bug or content fix", "risk": "medium", "permissions": { "read": true, "edit": true, "network": false, "destructive_commands": false }, "gates": [ "read surrounding files", "make minimal patch", "record diff", "run relevant QA command", "capture before/after behavior" ], "evidence_file": "agent_evidence.md"}Read the Code Line by Line
Section titled “Read the Code Line by Line”task is the input contract. It says what the human wants, what files are likely involved, and which risks exist.
classify_risk() is the permission gate. The agent should not treat a content typo and a user-data migration as the same type of work.
choose_gates() turns the task into proof steps. It is the difference between “I changed it” and “I changed it, checked it, and can show what happened.”
run_card is the handoff artifact. In a real repository, attach it to the PR, commit message, or task note.
Mini Exercise
Section titled “Mini Exercise”Change the request to one of these tasks and rerun the script:
- “update the homepage meta description”
- “delete old user accounts”
- “add a new API endpoint”
For each run, answer:
| Question | What to decide |
|---|---|
| Is the risk low, medium, or high? | Explain the reason, not only the label |
| What extra gate is needed? | Test, screenshot, security review, migration backup, or human approval |
| What evidence would convince a reviewer? | Diff, log, screenshot, request/response, or rollback note |
Evidence to Keep
Section titled “Evidence to Keep”Before you finish any AI coding task, leave this minimum packet:
- Request
- what the human asked for
- Scope
- files or behavior intentionally touched
- Risk
- low, medium, or high
- Commands
- checks that were run
- Result
- pass, fail, or not run with reason
- Diff Summary
- what changed and why
- Rollback
- how to undo or where the commit is
Small Summary
Section titled “Small Summary”AI coding agents are most useful when they are treated as junior engineers with fast hands and a strict notebook. Give them a narrow task, make permissions explicit, run checks, and require evidence.
Check reasoning and explanation
You pass this lesson when you can turn a vague request into a scoped run card, name the risk, choose the verification gates, and explain what evidence a reviewer should inspect.