Per Anthropic's June 2026 cost management documentation, the average Claude Code developer spends about $13 per active day, and 90% of users stay under $30. That cost isn't random. It comes from accumulated context, verbose output, and exploratory prompts that could have been avoided. These 7 habits address exactly that.
Hack 1: /compact Is Your Biggest Single Lever
In 2026, Anthropic's Claude Code docs list /compact as the primary tool for managing context window growth in long sessions. It compresses your full conversation history into a tight summary, dropping accumulated token count without losing your working state.
Use it when sessions run long — after an hour of active work, or when you notice responses getting slower. The savings compound over a multi-hour session faster than any other change you can make. Type /compact in the session. That's the whole hack.
For even deeper cuts on repeated API calls outside Claude Code, prompt caching reduces input token costs by 90% on cached content.
Hack 2: CLAUDE.md Loads Context Once, Not on Every Prompt
Per Anthropic's Claude Code documentation (retrieved June 2026), placing a CLAUDE.md file at your project root lets Claude read project-specific instructions automatically at session start. You write them once. You stop repeating them at the top of every conversation.
The official docs recommend keeping CLAUDE.md under 200 lines. Every line loads into context on every message you send. A bloated CLAUDE.md increases your per-message cost across every session on that project.
Keep it tight. Only what Claude needs every session: project structure, stack, constraints, naming conventions. Move specialized workflow instructions to separate skill files that load on demand.
# CLAUDE.md example (keep it ruthlessly short) ## Stack Next.js 16 App Router, TypeScript, Tailwind CSS v4, Supabase ## Project structure src/app/ — routes, src/components/ — UI, src/lib/ — utils ## Rules - No new npm packages without flagging first - Server components only in page.tsx files - All animations use IntersectionObserver pattern
Hack 3: Vague Prompts Are the Hidden Cost
"Fix my app" is not a prompt. It's an invitation for Claude to explore, guess context, and burn tokens before doing real work. "Fix the auth bug in src/auth/login.ts line 42 — JWT not validated" goes straight to the fix.
The difference isn't style. It's token count. Exploratory prompts front-load context gathering. Surgical prompts front-load the answer.
Precision is the cheapest optimization available. No tooling. No setup. No API changes.
Before sending any prompt, ask: does Claude need to figure out what you mean, or is that already in the prompt? If Claude has to guess, rewrite it.
Hack 4: Kill the Explanation
Claude Code defaults to explaining its reasoning. Useful when learning. Expensive when you just need the output.
End your prompts with "no explanation needed" or "just output the code." This cuts output tokens on every single request. On a day with fifty prompts, that adds up to real money on your bill. It costs you nothing in output quality on routine tasks.
Where to skip it: novel architecture decisions or complex debugging where seeing Claude's reasoning chain helps you catch wrong assumptions. For anything routine — refactoring, simple fixes, code generation with clear specs — default to skipping the explanation.
Hack 5: /clear Between Unrelated Tasks
Carrying context from one task into an unrelated one doesn't help Claude. It adds tokens that have no bearing on the new problem.
Use /clear whenever you shift to a genuinely different task. Reviewing a PR, then switching to build a new feature, then debugging an unrelated module — each is a clear candidate for a fresh session.
One command. Zero baggage. The session that was helping you debug authentication has no business being present while you write a data migration.
Hack 6: Reference Files, Not Directories
"Update the button component" forces Claude to scan the directory, identify candidates, and load context for multiple files before settling on the one you meant. "Update /components/Button.tsx" goes directly there.
The extra context from a directory scan isn't neutral. It costs tokens, it costs latency, and it occasionally introduces noise from adjacent files that Claude then has to reason around.
Exact paths remove all of that. Make it a habit: always give Claude the exact file path. Every time.
Hack 7: Forced-Choice Questions Over Open Questions
"What should I do about the database schema?" opens an exploration. Claude will weigh options, cover trade-offs, and produce a multi-paragraph response covering angles you didn't ask about.
"Should I normalize this to 3NF or keep it denormalized for read performance?" gets a direct answer.
Forced-choice questions define the decision space before Claude starts generating. You're not limiting answer quality. You're removing the exploratory overhead that doesn't produce value for you.
When you know the options, put them in the prompt. This applies to architecture decisions, debugging strategies, code style, anything where you already have the candidates in mind. The batch and model routing guide uses the same principle at the API level: constrain the problem, reduce the cost.
What These Hacks Save in Practice
In 2026, Anthropic's cost docs show that prompt cache reads cost 10% of standard input price — $0.30/MTok vs $3.00/MTok for Sonnet 4.6 (Anthropic pricing, June 2026). Anything that reduces context reprocessing (like /compact and tight CLAUDE.md files) compounds at that same 90% discount rate.
Agent team mode is the other extreme. Per Anthropic's cost docs, each agent runs its own full context window as a separate Claude instance, using approximately 7x more tokens than single-agent sessions. Every bad habit above gets multiplied by 7 in agent mode. These hacks matter most there.
None of the seven changes above require you to change what you build. They change what you send. That's a meaningfully different ask.
For a full breakdown of API-level cost reduction including the Batch API and model routing, see the Claude API cost reduction guide.
This is part of an ongoing token saving series. The next post covers better alternatives to some of the commands above, including smarter replacements for /compact and "kill the explanation" that work in more situations. Subscribe below to get it when it drops.
Frequently Asked Questions
When should I use /compact vs /clear?
Use /compact when you want to continue the current task but reduce context bloat from earlier in the session. Use /clear when switching to a completely unrelated task where the existing context has no value. /compact preserves working state. /clear discards it.
How long should a CLAUDE.md file be?
Anthropic's official docs recommend under 200 lines. In practice, 50-100 lines covers most projects well. Only include information Claude needs on every session: stack, structure, constraints, conventions. Move specialized instructions to separate skill files that load on demand rather than on every message.
Does "no explanation needed" ever hurt output quality?
On routine tasks like refactoring, simple fixes, or code generation with clear specs, no. Where it can backfire: novel architecture decisions or complex debugging where Claude's reasoning chain helps you spot wrong assumptions. Default to it on anything routine. Skip it when you need to see the thinking.
Do these hacks work with Claude Code agent mode?
Yes, and they matter more there. Per Anthropic's cost documentation (June 2026), agent teams use approximately 7x more tokens than single-agent sessions because each sub-agent runs its own full context window. Tight prompts, exact file paths, and /compact on the orchestrator session all reduce the base cost that gets multiplied across every agent.
What's the fastest win from this list?
Adding "no explanation needed" to prompts you send more than five times a day. It takes two seconds, cuts output tokens on each one, and requires no change to how you work. Start there, then tackle CLAUDE.md once you see the bill drop.
Sources
- Anthropic, "Manage costs effectively — Claude Code," retrieved June 2026 - code.claude.com/docs/en/costs
- Anthropic, "Models Overview and Pricing," retrieved June 2026 - platform.claude.com/docs/en/about-claude/pricing