
You're Using Coding Agents Wrong

Most developers never get past beginner habits with AI coding agents. Here's the playbook power users don't share, with real techniques from heavy Claude Code usage.

Hej Team

Most developers try a coding agent, get mediocre results, and conclude the tool isn't that useful. But the gap between a beginner and a power user is enormous. The same model that gives you broken code can also ship production-ready 15-file PRs. The difference is entirely in how you use it.

This guide is focused on Claude Code, Anthropic's CLI agent, because it's what we use daily. But most principles apply to any coding agent.

First-try success rate by user skill level

Same model. Same capabilities. Wildly different results. Here's how to get to the right side of that chart.

Your CLAUDE.md is your highest-leverage file

CLAUDE.md sits at the root of your project and gets loaded into every Claude Code session automatically. Think of it as your project's instruction manual for the agent. Most people either skip it or auto-generate one with /init and never touch it again. Both are mistakes.

Research from teams like HumanLayer shows that frontier LLMs can reliably follow about 150-200 instructions. Claude Code's own system prompt already eats ~50 of those. That means your CLAUDE.md needs to be ruthlessly focused, not a dump of your entire codebase's documentation.

A bad CLAUDE.md:

This is a Next.js project. Use TypeScript.

A good CLAUDE.md:

# Project: Acme Dashboard
Next.js 14 App Router, Prisma ORM, Stripe billing.

## Commands
- `bun run dev` (port 3000)
- `bun run test`: Vitest
- `bun run lint`: ESLint + Prettier

## Code Patterns
- Named exports, not default exports
- Zod schemas use .optional(), defaults via destructuring
- File ops return null on failure: `await fs.readFile(p).catch(() => null)`

## Important
- NEVER commit .env files
- Use `safeJoin()` for all user-input file paths
- For complex auth flows, see docs/auth-architecture.md

Notice that last line. Instead of pasting your entire auth documentation into CLAUDE.md, you reference the file. Claude pulls it only when relevant, saving context for the stuff that matters. This technique (progressive disclosure) is what separates a 300-line bloated CLAUDE.md from a lean 60-line one that actually works.

The hierarchy you should know: Claude Code reads CLAUDE.md files at multiple levels, in order of priority:

  1. ~/.claude/CLAUDE.md: your personal global defaults (all projects)
  2. ./CLAUDE.md: project root, committed and shared with your team
  3. ./CLAUDE.local.md: personal project overrides, gitignored
  4. ./src/components/CLAUDE.md: subdirectory-specific context, loaded when working in that folder
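For instance, a gitignored ./CLAUDE.local.md might carry only personal quirks that don't belong in the shared file (contents are purely illustrative):

```markdown
# My local overrides
- Use my seeded test database: DATABASE_URL is in .env.local
- Explain changes verbosely; I'm still new to this codebase
```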

Replace negatives with alternatives. Instead of "Never use npm," write "Use bun instead of npm." The model responds measurably better to positive instructions.

Review it quarterly. Instructions accumulate and drift. Periodically delete anything that isn't actively preventing Claude from making mistakes.

Context management is the #1 skill separator

Most developers hit rate limits by noon. Power users code all day on the same quota. The difference is context window management.

Here's what most people don't realize: as your conversation grows, quality degrades. The agent starts forgetting your instructions from 20 messages ago. It repeats mistakes you already corrected. This isn't a bug. It's how attention works over long sequences.

Impact of context management on output quality

The green line represents what happens when you actively manage your context. The spikes are where you /clear and start fresh or /compact to compress history.

Three strategies that actually work:

  1. New task, new session. Don't reuse a conversation about your auth system to work on billing. Type /clear or just start a fresh session.

  2. The handoff pattern. For complex multi-session work, dump your progress to a markdown file before clearing: "Write a summary of what we've done and what remains to progress.md." Then start a new session: "Read progress.md and continue from where we left off." This preserves intent without the bloat.

  3. Monitor with /cost. Run it mid-session to see how much context you're using. If you're past 70% capacity, it's time to compact or restart.

Watch your MCP servers. Each MCP server's tool definitions eat context tokens. Power users have found that too many active servers can shrink your usable context from 200K to 70K. Run /cost to check, and disable unused servers in your project settings.

The plan-first workflow

This is Anthropic's own recommendation from their internal engineering teams. It's also the single most underused feature.

Hit Shift+Tab to cycle to plan mode, or start Claude with claude --permission-mode plan. In this mode, the agent explores your codebase, identifies relevant files, and proposes an implementation strategy before writing a single line of code. You review the plan, push back on bad ideas, then approve.

Here's the workflow Anthropic's own teams follow:

  1. Start with "plan this, don't write code yet"
  2. Review the plan: question assumptions, check edge cases
  3. Give the green light
  4. Claude implements, runs tests, iterates
  5. You review the ~80% complete result and handle the final 20%

Press Ctrl+G to open the plan in your default editor. You can directly edit the plan file (add constraints, remove unnecessary steps, reorder priorities), then save and let Claude continue. This is way faster than going back and forth in chat.

When to use plan mode: any change touching more than 2-3 files, refactors, new features, architectural decisions. When to skip it: single-file fixes, boilerplate, things where you've given very specific instructions.

Custom slash commands are a cheat code

Create .claude/commands/your-command.md and you get a custom /your-command slash command. Most people don't know these exist. They're incredibly powerful.

/catchup: resume context after clearing:

Read all files changed on the current git branch compared to main.
Summarize what has been done and what remains to be implemented.

/gh-issue: work directly from GitHub issues:

Read GitHub issue $ARGUMENTS using `gh issue view`.
Analyze the requirements, identify relevant files, and create an implementation plan.

/review: self-review before you look:

Review all uncommitted changes. Check for: bugs, edge cases,
security issues, unnecessary complexity. Be harsh.

The $ARGUMENTS variable gets replaced with whatever you type after the command. So /gh-issue 142 reads issue #142 and starts planning. You can also use $1, $2 for positional arguments.
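For example, a hypothetical .claude/commands/compare.md using positional arguments:

```markdown
Compare the implementation of $1 with $2.
List the behavioral differences and recommend which approach to keep.
```

Invoked as /compare UserService AccountService, $1 and $2 are substituted in order.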

Restrict tool access with frontmatter:

---
allowed-tools: Read, Grep, Glob
---

This makes the command read-only, useful for review commands where you don't want Claude accidentally editing files.
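These commands are just markdown files, so creating one takes seconds. A sketch that sets up the read-only /review command from above (the path follows the convention already described):

```shell
# Create the commands directory and drop in a read-only /review command
mkdir -p .claude/commands
cat > .claude/commands/review.md <<'EOF'
---
allowed-tools: Read, Grep, Glob
---
Review all uncommitted changes. Check for: bugs, edge cases,
security issues, unnecessary complexity. Be harsh.
EOF
```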

Hooks: automated quality gates

Hooks are shell commands that fire automatically at specific lifecycle points. Configure them in .claude/settings.json. This is where Claude Code goes from "code writer" to "code writer with guardrails."

Auto-format every edit:

{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Edit|Write",
      "hooks": [{
        "type": "command",
        "command": "prettier --write \"$CLAUDE_FILE_PATH\""
      }]
    }]
  }
}

Now you never review unformatted AI code. The agent writes, Prettier fixes, you review clean output.

The killer pattern: block commits until tests pass. Set up a PreToolUse hook on Bash(git commit) that runs your test suite first. If tests fail, the hook exits with code 2, which blocks the commit and feeds the error output back to Claude. The agent then fixes the issue and tries again. It creates a fully automated write → test → fix → commit loop.

Available hook events: PreToolUse, PostToolUse, UserPromptSubmit, Notification, Stop, SessionStart. The exit code protocol: 0 = proceed, 2 = block and feed error message back to Claude.
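One way to sketch that commit gate as a hook script. The file name, the `bun run test` command, and the detail that the pending tool call arrives as JSON on stdin are all assumptions to adapt to your setup:

```shell
# Write a PreToolUse hook script that gates `git commit` on passing tests
mkdir -p .claude/hooks
cat > .claude/hooks/block-untested-commits.sh <<'EOF'
#!/bin/sh
# The pending tool call arrives as JSON on stdin (assumption; adjust parsing).
input=$(cat)
if printf '%s' "$input" | grep -q 'git commit'; then
  # Run the test suite before letting the commit through.
  if ! bun run test >&2; then
    echo "Tests failed; fix them before committing." >&2
    exit 2   # 2 = block the tool call; stderr is fed back to Claude
  fi
fi
exit 0       # 0 = let the call proceed
EOF
chmod +x .claude/hooks/block-untested-commits.sh
```

Register the script in .claude/settings.json as a PreToolUse hook with a "Bash" matcher, mirroring the PostToolUse example above.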

Pro tip: Don't put code style rules in CLAUDE.md. Use a formatter hook instead. Style rules in CLAUDE.md waste instruction slots and the model follows them inconsistently. A PostToolUse formatter hook enforces style perfectly, every time.

Keyboard shortcuts that change how you work

| Shortcut | What it does |
| --- | --- |
| Escape | Stop Claude mid-response. Use this aggressively. |
| Escape twice | Browse and resume previous sessions |
| Shift+Tab | Cycle permission modes: Normal → Auto-Accept → Plan |
| Tab | Accept the agent's suggested action |
| Ctrl+G | Open plan in your editor for direct editing |
| Ctrl+V | Paste images from clipboard (screenshots, diagrams) |

The interrupt habit is what separates beginners from power users. If you see Claude going in the wrong direction, hit Escape immediately. Don't wait for it to finish writing 200 lines you're going to reject. Steer constantly: "No, not like that. Use the existing UserService class." Treat it as a conversation, not a vending machine.

CLI flags worth memorizing:

  • claude -c: continue your most recent conversation
  • claude -r: resume from a conversation picker
  • claude --from-pr 123: resume the session linked to a PR
  • claude -p "prompt": headless mode, no interaction needed

Use git worktrees for parallel agents

This is one of the most powerful techniques that almost nobody uses. Git worktrees let you run multiple Claude Code instances simultaneously on different branches, each with its own working directory and context:

git worktree add ../project-feature-a -b feature-a
cd ../project-feature-a && claude

In another terminal:

git worktree add ../project-bugfix -b bugfix-123
cd ../project-bugfix && claude

Two agents, two branches, zero context clashes. While one agent works on your feature, the other is fixing a bug. You review both when they're done.
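The full lifecycle, as a self-contained sketch in a throwaway repo (branch and directory names are placeholders):

```shell
set -e
# Toy repo in a temp directory so the demo touches nothing real
repo=$(mktemp -d)/project
git init -q "$repo" && cd "$repo"
git -c user.email=demo@example.com -c user.name=demo commit -q --allow-empty -m "init"

# Spin up a parallel checkout on its own branch
git worktree add ../project-feature-a -b feature-a

# ...run `claude` inside ../project-feature-a while this checkout stays free...

# Clean up once the branch has landed
git worktree remove ../project-feature-a
git branch -d feature-a
```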

MCP servers: be strategic, not greedy

MCP (Model Context Protocol) servers extend what Claude can do. But power users have learned an important lesson: less is more.

Each MCP server's tool definitions eat context window tokens. Enable too many and your 200K context window shrinks to 70K before you've even started working. The move is to enable only what you need per project.

The servers worth having:

  • Playwright MCP (claude mcp add playwright npx @playwright/mcp@latest): browser automation via accessibility tree. Way more reliable than coordinate-based clicking. Essential for testing web apps.
  • Database MCP: lets Claude query your DB directly to understand schemas and test queries instead of you pasting them in.

The contrarian take from power users: For stateless tools like Jira, GitHub, and AWS, skip MCP entirely and just use the CLI. gh issue view 142 works fine from Claude's Bash tool and doesn't eat context tokens with tool definitions. Reserve MCP for tools that genuinely need persistent connections or complex state.

Enable lazy loading by setting "ENABLE_TOOL_SEARCH": "true" in your settings. This loads MCP tool definitions on-demand instead of stuffing them all into context upfront.
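Assuming the flag is read from the env block of .claude/settings.json (adjust to wherever your settings actually live), that looks like:

```json
{
  "env": {
    "ENABLE_TOOL_SEARCH": "true"
  }
}
```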

The headless mode nobody talks about

Claude Code works as a Unix utility. Pipe data in, get structured output out:

cat build-error.txt | claude -p "explain the root cause" > explanation.txt

Add it to your package.json scripts:

{
  "scripts": {
    "lint:ai": "claude -p 'review changes vs main, report issues as filename:line followed by description'"
  }
}

Parallel batch refactoring:

claude -p "in src/api/ rename all refs from userId to accountId" &
claude -p "in src/client/ rename all refs from userId to accountId" &
wait

Two agents, working different directories, same refactor. This also works in CI/CD: trigger Claude Code in GitHub Actions to generate fully tested PRs from Jira tickets, Slack alerts, or monitoring events.
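A minimal GitHub Actions sketch of that idea. The workflow shape and secret name are assumptions; the claude CLI ships as the @anthropic-ai/claude-code npm package:

```yaml
name: ai-review
on: pull_request
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history so the agent can diff against main
      - run: npm install -g @anthropic-ai/claude-code
      - run: claude -p "review changes vs main, report issues as filename:line followed by description"
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```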

The anti-patterns: what kills your results

How prompt specificity affects output quality

Vague prompts. "Fix the bug" fails. "Fix the calculation error in cartTotal() within src/checkout.ts when discount codes apply to already-reduced items" succeeds. The more specific, the better.

Stuffing CLAUDE.md with everything. Over-instruction causes uniform degradation across all instructions. If everything is important, nothing is. Keep it under 100 lines.

Never restarting. Long conversations rot. Quality drops. Don't cling to a 40-message thread. Clear it, hand off context to a file, start fresh.

Using /init without editing. The auto-generated CLAUDE.md is bloated and generic. Always rewrite it by hand.

Enabling all MCP servers. Your context window is not infinite. Each server eats tokens. Only enable what you need for the current project.

Skipping review. Anthropic's own engineering teams treat all AI output as untrusted until verified. They gate every merge with tests and human review. If Anthropic doesn't trust raw AI output, neither should you.

Review like it's a junior dev's PR

What to look for when reviewing AI-generated code

Over-engineering is the #1 issue. The agent loves adding abstractions, config options, error handling, and "helpful" comments you didn't ask for. Get comfortable deleting code from AI output.

The review checklist:

  • Read the actual diff, not just Claude's summary of what it did
  • Check edge cases: AI gravitates toward the happy path
  • Verify it didn't touch files outside your request scope
  • Strip unnecessary abstractions and over-defensive error handling
  • Run git diff before approving anything

The developers getting the most out of coding agents aren't the ones asking for miracles. They're the ones who've built a rhythm: focused prompts, aggressive steering, context management, and rigorous review.

The agent handles the throughput. You handle the taste. That combination is hard to beat.