The Problem: Your Agent Has Amnesia

Imagine hiring a brilliant contractor who forgets everything about your company every morning. Monday, you explain that tests go in tests/integration/, not tests/unit/. Tuesday, you explain it again. Wednesday, you explain it again, slower this time, wondering if you are the problem.

That is what working with an AI agent feels like without context engineering.

Every time you open a new Claude Code session, your agent starts from zero. It does not know your test runner is Vitest, not Jest. It does not know your team deploys via GitHub Actions, not Vercel. It does not know that src/legacy/ is a minefield nobody should touch, or that every PR needs a changelog entry.

So you explain. Again. Every session. The agent still gets it wrong, because natural language instructions in a chat window are ambiguous, incomplete, and forgotten the moment the session ends. It is like writing configuration on sticky notes and hoping someone reads them.

Claude's official use case page lists "Workflow improvement planner" -- you describe your process, the model suggests optimizations. That works for a one-off brainstorm. But it falls apart the moment you need the agent to execute within your workflow repeatedly, because the agent starts from zero every time.

Context engineering solves this. Instead of re-explaining your workflow in chat, you encode it into structured files -- CLAUDE.md, AGENTS.md, Skills, hooks, and memory -- that load automatically on every session. Think of it as infrastructure, not instructions. The agent understands your workflow on the first message. Every time.

This article shows you how to build that system from scratch, with a concrete before/after comparison on a real workflow.

Before vs After: The Same Task, Two Realities

Here is the task: create a new API endpoint that follows your team's conventions, write the integration test, and open a draft PR.

Before Context Engineering

Session 1 (Monday):

You: Create a POST /api/v2/invoices endpoint.
Agent: *Creates the file in src/routes/invoices.ts*
You: No, we use src/api/v2/ for v2 routes.
Agent: *Moves it*
You: We use Zod for request validation, not inline checks.
Agent: *Rewrites with Zod*
You: The test needs to go in tests/integration/, not tests/unit/.
Agent: *Moves the test*
You: We use pnpm test:integration, not npm test.
Agent: *Runs pnpm test:integration*

Four corrections. Ten minutes of back-and-forth. The agent did the work, but you micromanaged every structural decision. It is like pair programming with someone who has never seen your codebase -- except this someone forgets everything overnight.

Session 2 (Tuesday, new session):

You: Create a POST /api/v2/payments endpoint.
Agent: *Creates the file in src/routes/payments.ts*
You: ...same corrections, from the top.

The agent forgot everything. You are the context now.

After Context Engineering

You spend about half an hour setting up a handful of files. Now, every session:

You: Create a POST /api/v2/invoices endpoint.
Agent: I'll create the endpoint following your v2 API conventions.
  - Created src/api/v2/invoices/route.ts with Zod validation
  - Created tests/integration/invoices.test.ts
  - Running pnpm test:integration... all tests pass
  - Opening draft PR with changelog entry

Zero corrections. Under two minutes. The agent made every structural decision correctly because the knowledge was already loaded before you typed a single character.

The difference is not a smarter model. It is a smarter context. The same engine, but with the right fuel.

The Three-Layer System

Context engineering for workflows uses three layers, each loaded at a different time and scope. Think of it like CSS specificity: there is a base layer that applies everywhere, a middle layer for tool-specific behavior, and an on-demand layer for specific tasks.

If you want the full comparison of these file types, see SKILL.md vs CLAUDE.md vs AGENTS.md. Here, we focus on how to use them specifically for workflow optimization.

Layer	File	When It Loads	What It Encodes
Foundation	AGENTS.md	Every session	Project architecture, conventions, constraints
Tool-specific	CLAUDE.md	Every session	Claude Code behavior, memory hints, subagent preferences
On-demand	SKILL.md	When task matches	Step-by-step workflow procedures
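
As a rough mental model of the table, the sketch below picks which files enter the context for a given task. Everything in it is illustrative: the `ContextFile` type, the `selectContext` function, and the trigger lists are assumptions made for this example. In particular, keyword matching only stands in for the real decision, which the model makes by reading each skill's description.

```typescript
// Hypothetical model of the three layers; not Claude Code's real loader.
type Layer = "foundation" | "tool-specific" | "on-demand";

interface ContextFile {
  path: string;
  layer: Layer;
  // Only on-demand files carry trigger phrases (borrowed from the skill description).
  triggers?: string[];
}

// Returns the paths that would be in context for a given task description.
function selectContext(files: ContextFile[], task: string): string[] {
  const needle = task.toLowerCase();
  return files
    .filter((file) => {
      if (file.layer !== "on-demand") return true; // layers 1 and 2 always load
      const triggers = file.triggers ?? [];
      return triggers.some((trigger) => needle.includes(trigger.toLowerCase()));
    })
    .map((file) => file.path);
}

const projectFiles: ContextFile[] = [
  { path: "AGENTS.md", layer: "foundation" },
  { path: "CLAUDE.md", layer: "tool-specific" },
  {
    path: ".claude/skills/create-api-endpoint/SKILL.md",
    layer: "on-demand",
    triggers: ["endpoint", "route", "handler"],
  },
];

const forEndpointTask = selectContext(projectFiles, "Create a POST /api/v2/invoices endpoint");
const forTypoTask = selectContext(projectFiles, "Fix a typo in the README");
console.log(forEndpointTask.length, forTypoTask.length); // prints: 3 2
```

The endpoint task pulls in all three layers; the typo fix only pays for the two always-loaded files.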

Layer 1: AGENTS.md -- Your Workflow's Ground Truth

AGENTS.md is the cross-tool context file. Claude Code, Codex CLI, Gemini CLI, Copilot CLI, and Cursor all read it. It is always loaded, so every token in it counts against every task. That makes it expensive real estate: like a kernel's hot path, every byte matters.

For workflow optimization, AGENTS.md encodes the structural decisions that apply to all tasks: where files go, what tools you use, what is off-limits.

# AGENTS.md

## Architecture
- Node.js 22, TypeScript strict, Fastify 5
- Database: PostgreSQL 16 via Drizzle ORM
- API routes: src/api/v1/ (legacy, read-only) and src/api/v2/ (active)
- Auth: Custom JWT middleware in src/middleware/auth.ts

## Conventions
- Request validation: Zod schemas in src/schemas/
- Error handling: Result<T, E> pattern (src/lib/result.ts)
- DB queries: src/repositories/ only, never in route handlers
- Tests: Vitest for unit (tests/unit/), Supertest for integration (tests/integration/)
- Every PR must include a CHANGELOG.md entry

## Constraints
- Never modify src/api/v1/ routes (legacy clients depend on exact signatures)
- Never write raw SQL -- use Drizzle query builder
- Never skip integration tests for new endpoints

## Commands
- Dev: pnpm dev | Test: pnpm test:integration | Lint: pnpm lint
- Build: pnpm build | Migrate: pnpm db:migrate

That is under 25 lines. It captures every structural decision the agent got wrong in the "before" scenario. The CLAUDE.md writing guide covers how to write this well; the key principle is: only include what the agent cannot infer from reading your code. If it can figure it out from package.json, do not duplicate it here.
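
Because AGENTS.md is paid for on every task, it helps to put a rough number on it. The sketch below assumes about four characters per token, a common rule of thumb for English text and not the model's real tokenizer. `estimateTokens` and `alwaysLoadedCost` are hypothetical helper names.

```typescript
// Rough heuristic: ~4 characters per token. An assumption, not a real tokenizer.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

interface LoadedFile {
  path: string;
  body: string;
}

// Sums the estimated cost of every file that loads on every session.
function alwaysLoadedCost(files: LoadedFile[]): number {
  return files.reduce((sum, file) => sum + estimateTokens(file.body), 0);
}

const alwaysLoaded: LoadedFile[] = [
  { path: "AGENTS.md", body: "- Never write raw SQL -- use Drizzle query builder\n" },
  { path: "CLAUDE.md", body: "Read AGENTS.md for project architecture and conventions.\n" },
];

console.log(`~${alwaysLoadedCost(alwaysLoaded)} tokens paid on every task`);
```

The absolute number is imprecise. What matters is the trend: if the estimate keeps growing, content is probably drifting into AGENTS.md that belongs in a skill.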

Layer 2: CLAUDE.md -- Agent-Specific Behavior

CLAUDE.md adds Claude Code-specific instructions on top of AGENTS.md. Keep it minimal -- under 20 lines. This is where you encode how the agent should operate, not what the project contains. Think of AGENTS.md as README.md and CLAUDE.md as .editorconfig -- one describes the project, the other configures the tool.

# CLAUDE.md

Read AGENTS.md for project architecture and conventions.

## Behavior
- When compacting, preserve the full list of modified files and test results
- Prefer subagents for research tasks (exploring unfamiliar code, reading docs)
- After completing a task, run the relevant test suite before reporting done

## Memory
- Check ~/.claude/memory/ for cross-session notes on recent decisions
- When I make an architectural decision during a session, save it to memory

The memory section is particularly powerful for workflows. Claude Code's memory system persists knowledge across sessions -- architectural decisions, user preferences, project-specific patterns you discussed. When the agent checks memory at session start, it picks up where the last session left off. It is the difference between a contractor who keeps a notebook and one who does not.

Layer 3: Skills -- Your Workflow Recipes

This is where the real workflow optimization happens. AGENTS.md and CLAUDE.md tell the agent what your project looks like. Skills tell the agent how to do specific jobs. If layers 1 and 2 are the map, layer 3 is the turn-by-turn directions.

A skill's body loads only when the task matches its description. Apart from the short name and description the agent needs in order to make that match, it costs no tokens when you are doing unrelated work. That means skills can be detailed -- up to 500 lines -- without bloating every session.

Here is the API endpoint skill that eliminates the four-correction problem from the "before" scenario:

---
name: create-api-endpoint
description: >
  Use this skill when creating new API endpoints, route handlers,
  or REST resources. Triggers on: new endpoint, new route, API creation,
  POST/GET/PUT/DELETE handler.
---

## Create API Endpoint

### Step 1: Scaffold
1. Create route file in src/api/v2/[resource]/route.ts
2. Create Zod schema in src/schemas/[resource].ts
3. Create repository in src/repositories/[resource].ts (if new resource)

### Step 2: Implementation
1. Define request/response Zod schemas first
2. Implement repository method with Drizzle query builder
3. Wire route handler: validate -> repository -> Result pattern -> response
4. Add auth middleware if endpoint requires authentication

### Step 3: Testing
1. Create integration test in tests/integration/[resource].test.ts
2. Test happy path, validation errors, auth failures, and edge cases
3. Run: pnpm test:integration -- [resource]

### Step 4: Finalize
1. Add CHANGELOG.md entry under "Added"
2. Run full lint: pnpm lint
3. Open draft PR with description following PR template

### Anti-patterns
- Never put DB queries in the route handler -- always go through a repository
- Never skip Zod validation -- even for internal endpoints
- Never create v1 routes -- all new endpoints are v2

This skill is 35 lines of workflow-specific knowledge. Without it, you explain these steps manually every time. With it, the agent executes the full workflow autonomously. It is the difference between giving someone a recipe and explaining cooking from first principles every meal.
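
To make Step 2's "validate -> repository -> Result pattern -> response" concrete, here is a minimal sketch of what that flow might look like. The article names src/lib/result.ts but never shows it, so the `Result` shape below is an assumption. The hand-written `validateInvoice` stands in for a Zod schema, the in-memory `invoiceStore` stands in for a Drizzle-backed repository, and the `InvoiceInput` fields are invented for the example.

```typescript
// Hypothetical Result shape; the project's real src/lib/result.ts may differ.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

interface InvoiceInput {
  customerId: string;
  amountCents: number;
}

// Stand-in for a Zod schema: checks an unknown request body by hand.
function validateInvoice(body: unknown): Result<InvoiceInput, string> {
  if (typeof body !== "object" || body === null) return err("body must be an object");
  const { customerId, amountCents } = body as Record<string, unknown>;
  if (typeof customerId !== "string" || customerId === "") return err("customerId is required");
  if (typeof amountCents !== "number" || !Number.isInteger(amountCents) || amountCents <= 0) {
    return err("amountCents must be a positive integer");
  }
  return ok({ customerId, amountCents });
}

// Stand-in for src/repositories/invoices.ts: the only code that touches storage.
const invoiceStore: (InvoiceInput & { id: number })[] = [];

function insertInvoice(input: InvoiceInput): Result<{ id: number }, string> {
  const id = invoiceStore.length + 1;
  invoiceStore.push({ id, ...input });
  return ok({ id });
}

// Route handler: validate -> repository -> Result -> response.
function handleCreateInvoice(body: unknown): { status: number; json: unknown } {
  const parsed = validateInvoice(body);
  if (!parsed.ok) return { status: 400, json: { error: parsed.error } };
  const saved = insertInvoice(parsed.value);
  if (!saved.ok) return { status: 500, json: { error: saved.error } };
  return { status: 201, json: saved.value };
}

console.log(handleCreateInvoice({ customerId: "c_1", amountCents: 1200 }).status); // 201
console.log(handleCreateInvoice({ customerId: "c_1" }).status); // 400
```

The handler never touches storage directly and never throws for expected failures, which is exactly what the anti-patterns list asks the agent to preserve.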

For deeper coverage of building skills like this, see context engineering with skill layering and the principles behind good skill design.

Hooks: Automated Context Loading

Claude Code hooks let you run scripts automatically at specific points in the agent lifecycle. For workflow optimization, hooks serve one critical function: ensuring context is loaded without manual prompts. They are the init.d of your agent setup.

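{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "cat .workflow-status.md 2>/dev/null || echo 'No active workflow'"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "if jq -r '.tool_input.command' | grep -q 'git commit'; then pnpm lint --quiet >&2 || exit 2; fi"
          }
        ]
      }
    ]
  }
}

The SessionStart hook is the most useful for workflows. Whatever it prints is added to the agent's context, so it can load a status file that tracks where you left off: which tickets are in progress, what was deployed last, which tests are failing. The agent starts every session with situational awareness, not just structural knowledge. It is the difference between walking into an office and seeing the whiteboard versus walking in and seeing blank walls.

Claude Code has no dedicated commit event, so the lint step hangs off PreToolUse with a Bash matcher. The hook receives the pending tool call as JSON on stdin, reads the command with jq (which must be installed), and exits with code 2 to block the call when linting fails on a git commit. The configuration lives in .claude/settings.json.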
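
The status file itself can be generated instead of hand-edited. The sketch below is one possible shape for it: the `WorkflowStatus` fields, the `renderWorkflowStatus` function, and the sample ticket names are all hypothetical, chosen to mirror the three things mentioned above (tickets in progress, last deploy, failing tests).

```typescript
// Hypothetical shape for the data behind .workflow-status.md.
interface WorkflowStatus {
  inProgress: string[]; // ticket ids or short titles
  lastDeploy: string; // free-form, for example a version tag and a date
  failingTests: string[]; // test file paths
}

function bulletList(items: string[]): string {
  return items.length === 0 ? "- (none)" : items.map((item) => `- ${item}`).join("\n");
}

// Renders a short Markdown file that the session-start hook can print with `cat`.
function renderWorkflowStatus(current: WorkflowStatus): string {
  return [
    "# Workflow status",
    "",
    "## In progress",
    bulletList(current.inProgress),
    "",
    "## Last deploy",
    current.lastDeploy,
    "",
    "## Failing tests",
    bulletList(current.failingTests),
    "",
  ].join("\n");
}

// Sample values are invented for illustration.
const statusText = renderWorkflowStatus({
  inProgress: ["Invoices endpoint (draft PR open)"],
  lastDeploy: "v2 API, deployed via GitHub Actions",
  failingTests: [],
});
console.log(statusText);
```

Keep the rendered file short. It loads on every session, so it competes for the same budget as AGENTS.md.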

Memory: Cross-Session Knowledge

Claude Code's memory system (~/.claude/memory/) stores notes that persist across sessions. If AGENTS.md is the project's constitution and skills are its standard operating procedures, memory is the institutional knowledge that accumulates over time -- the stuff nobody writes down but everyone knows.

Workflow-relevant memory includes:

Architectural decisions. "We decided on 2026-03-15 to migrate from REST to tRPC for internal services. External API stays REST."
Blockers. "Integration tests for payments module are flaky due to Stripe sandbox rate limits. Run them individually, not in parallel."
In-progress work. "Refactoring auth middleware. New version is in src/middleware/auth-v2.ts. Old version still active. Do not delete until migration complete."

The memory system turns the agent from a stateless tool into something that accumulates knowledge over time. Each session leaves the agent slightly smarter for the next one. It is compound interest applied to AI assistance.
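
Memory stays useful only if stale notes get pruned. One way to think about it is as a small list of dated, typed notes, sketched below. The `MemoryNote` type and `renderOpenNotes` function are hypothetical and are not how Claude Code stores memory internally; the three kinds simply mirror the categories above. Only the 2026-03-15 date comes from the examples; the other dates are made up.

```typescript
type NoteKind = "decision" | "blocker" | "in-progress";

interface MemoryNote {
  kind: NoteKind;
  date: string; // ISO date (YYYY-MM-DD), so string comparison sorts chronologically
  text: string;
  resolved?: boolean;
}

// Keeps open notes only, newest first, one line per note.
function renderOpenNotes(notes: MemoryNote[]): string[] {
  return notes
    .filter((note) => !note.resolved) // filter() copies, so the input is not reordered
    .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0))
    .map((note) => `[${note.date}] ${note.kind}: ${note.text}`);
}

const memoryNotes: MemoryNote[] = [
  { kind: "decision", date: "2026-03-15", text: "Internal services move to tRPC; external API stays REST." },
  // Dates below are invented for the example.
  { kind: "blocker", date: "2026-03-20", text: "Payments integration tests are flaky; run them individually." },
  { kind: "in-progress", date: "2026-03-18", text: "Auth middleware refactor.", resolved: true },
];

for (const line of renderOpenNotes(memoryNotes)) console.log(line);
```

Marking a note resolved instead of deleting it keeps the history around while keeping it out of the agent's context.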

Building Your System: A 30-Minute Setup

Here is the exact sequence to go from zero to a working context engineering setup.

Minutes 1-10: AGENTS.md

Open your project. Create AGENTS.md at the root. Write four sections: Architecture, Conventions, Constraints, Commands. Keep it under 30 lines. Only include what a new developer would get wrong on their first PR -- that is the signal for what the agent needs to know. See the CLAUDE.md writing guide for the full template -- the structure is identical for AGENTS.md.
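
If you want a guardrail for this step, a tiny check can confirm that the four sections are present and the file stays short. This is a sketch under assumptions: `lintAgentsMd` is a hypothetical helper, it expects section headings written as `## Name`, and it counts non-blank lines against the 30-line guideline.

```typescript
const REQUIRED_SECTIONS = ["Architecture", "Conventions", "Constraints", "Commands"];
const MAX_LINES = 30; // guideline from this article: keep AGENTS.md under 30 lines

// Returns a list of problems; an empty list means the file passes.
function lintAgentsMd(text: string): string[] {
  const allLines = text.split("\n");
  const headings = allLines
    .filter((line) => line.startsWith("## "))
    .map((line) => line.slice(3).trim());

  const problems: string[] = [];
  for (const section of REQUIRED_SECTIONS) {
    if (headings.indexOf(section) === -1) problems.push(`missing section: ${section}`);
  }

  const nonBlank = allLines.filter((line) => line.trim() !== "").length;
  if (nonBlank >= MAX_LINES) {
    problems.push(`too long: ${nonBlank} non-blank lines (keep it under ${MAX_LINES})`);
  }
  return problems;
}

const draft = ["# AGENTS.md", "", "## Architecture", "- Node.js 22", "", "## Conventions", "- Zod schemas"].join("\n");
console.log(lintAgentsMd(draft));
```

For the draft above, the check reports the two sections that have not been written yet: Constraints and Commands.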

Minutes 10-15: CLAUDE.md

Create CLAUDE.md at the root. Three lines minimum: reference to AGENTS.md, compaction behavior, subagent preference. Under 20 lines total. Resist the urge to over-specify. If you find yourself writing a paragraph, it probably belongs in a skill.

Minutes 15-25: Your First Two Skills

Identify your two most common tasks, the ones you do at least twice a week. Create a skill for each under .claude/skills/, one directory per skill with a SKILL.md inside. Each skill should be a step-by-step recipe with anti-patterns listed. Reference the 10 common CLAUDE.md mistakes to avoid the same pitfalls in skill writing.

Minutes 25-30: Test

Open a new Claude Code session. Give the agent one of the tasks your skill covers. Watch whether it follows every step without correction. If it deviates, sharpen the skill wording. If it follows perfectly, your setup is working. This is your integration test.
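
Watching by eye works for the first run. For repeat runs, part of the check can be automated against the list of files the agent changed, for example the output of `git diff --name-only`. The sketch below encodes a few rules from the AGENTS.md example in this article; `checkChangedFiles` is a hypothetical helper, and a real project would use its own rules.

```typescript
// Rules mirror the example AGENTS.md above; adapt them to your own conventions.
function checkChangedFiles(paths: string[]): string[] {
  const violations: string[] = [];

  for (const path of paths) {
    if (path.startsWith("src/api/v1/")) violations.push(`${path}: v1 routes are read-only`);
    if (path.startsWith("src/routes/")) violations.push(`${path}: routes belong in src/api/v2/`);
  }

  const touchesV2Route = paths.some((path) => path.startsWith("src/api/v2/"));
  const hasIntegrationTest = paths.some(
    (path) => path.startsWith("tests/integration/") && path.endsWith(".test.ts"),
  );
  if (touchesV2Route && !hasIntegrationTest) {
    violations.push("v2 route changed without an integration test");
  }

  if (paths.indexOf("CHANGELOG.md") === -1) violations.push("CHANGELOG.md was not updated");
  return violations;
}

const beforeRun = checkChangedFiles(["src/routes/invoices.ts", "tests/unit/invoices.test.ts"]);
const afterRun = checkChangedFiles([
  "src/api/v2/invoices/route.ts",
  "src/schemas/invoices.ts",
  "tests/integration/invoices.test.ts",
  "CHANGELOG.md",
]);
console.log(beforeRun.length, afterRun.length); // prints: 2 0
```

A non-empty result tells you which convention the skill wording failed to get across, which is where to sharpen it.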

The Compounding Effect

Context engineering is not a one-time setup. It compounds like a well-maintained codebase.

Week 1: You encode your basic project structure and two workflows. The agent stops asking where files go.

Week 4: You have added skills for deployment, code review, database migrations, and PR creation. Memory has accumulated architectural decisions and known blockers. The agent handles 80% of your routine work without correction.

Week 12: New team members onboard by reading your AGENTS.md and skills. The agent teaches them your conventions by following them. The context files become living documentation that is always accurate because the agent enforces it daily. Your documentation is never stale because it is also your configuration.

The workflow improvement planner use case from Claude's docs gives you a one-time conversation about process pain points. Context engineering gives you a system that gets permanently smarter about your specific processes. It is the difference between advice and infrastructure. Advice fades. Infrastructure compounds.

Workspace-Level Persistence with Termdock

The context engineering system described above -- CLAUDE.md, AGENTS.md, skills, hooks, memory -- lives in your project directory and your home directory. It works in any terminal. But terminal sessions are fragile. Close a tab, lose your scroll history. Switch projects, lose your layout. Restart your machine, start from scratch. The context is persistent. The environment is not.

Termdock solves this at the workspace level. Your terminal sessions, split-pane layouts, environment variables, and working directories persist across restarts. When you reopen a workspace, every pane is exactly where you left it -- including the Claude Code sessions that were mid-task.

For context engineering workflows specifically, this means:

Multi-pane editing. Edit CLAUDE.md in one pane, test the agent in another, view logs in a third. The layout persists.
Project switching. Each project has its own workspace with its own terminal layout. Switch between them without losing state.
Session continuity. Close your laptop on Friday, open it Monday. Your agent sessions, file edits, and terminal history are all intact.

Context engineering makes the agent remember your workflow. Termdock makes your terminal remember your workspace. Together, nothing is forgotten.

Danny Huang



Optimize Any Workflow with Context Engineering: CLAUDE.md + AGENTS.md in Practice