If you read my earlier post on the Skill Vault pattern, you know I cut Claude Code's per-session overhead by 96%. After living with it for a few weeks, I went looking for what was still eating tokens β and found three more sinks worth killing.
This is the follow-up: smaller wins individually, but together they shaved another ~5,000 tokens off every single session with zero capability lost.
Where the Tokens Were Still Hiding
After the vault, my baseline was around 51K tokens per session. I dug into the system prompt to see what remained:
- Skill list (still): ~3K tokens
- Project
CLAUDE.md: ~2K tokens - claude-mem auto-injected timeline: ~2K tokens
- Plugin hook reminders: ~1K tokens (some recurring per turn)
Three of these were either unnecessary or way overweight. Here's how I killed each.
Sink #1: A Bloated Root CLAUDE.md (~1.5K tokens)
CLAUDE.md files auto-load into every session that touches the repo. Mine had grown to 326 lines / 8 KB with quickstart instructions for every component:
- Flutter dev commands
- Backend
npmscripts - React build steps
- ML service Python setup
- A duplicate gstack skill listing
The problem: I was loading every component's instructions on every turn, even when I was only working in one of them.
Fix: Hierarchical CLAUDE.md Files
Claude Code loads CLAUDE.md hierarchically β only the ones in your current working tree get pulled in. So instead of one fat root file, I split it:
khetisahayak/
βββ CLAUDE.md # 55 lines β overview, ports, creds only
βββ kheti_sahayak_app/CLAUDE.md # Flutter details
βββ frontend/CLAUDE.md # React details
βββ ml/CLAUDE.md # ML service details
βββ kheti_sahayak_backend/CLAUDE.md # already existed
The root now contains only:
- Project overview (5 lines)
- Service ports table
- Test credentials
- Cross-cutting auth + DB notes
- Troubleshooting
Component-specific details only load when I'm actually working in that subdir.
Result: root CLAUDE.md trimmed from 326 β 55 lines (8 KB β 1.6 KB). ~1.5K tokens saved per session.
Sink #2: Plugin SessionStart Hooks Injecting "Helpful" Context
I use claude-mem for persistent memory across sessions. Genuinely useful. But it has a SessionStart hook that auto-injects a timeline of recent observations at the top of every conversation β about 50 entries, ~2K tokens:
S499 Indeed Auto-Apply β User asked how to automate job applications
S498 Indeed MCP Integration Query β clarifying JobSpy vs Apify
2337 12:33p β
systemd/install.sh β Backend Services Enabled
... 47 more lines
I almost never needed this auto-recall. When I want past context, I call mem-search explicitly.
Fix: Disable the Auto-Inject, Keep the Memory
The hook config lives at:
~/.claude/plugins/cache/thedotmack/claude-mem/<version>/hooks/hooks.json
The SessionStart array has three hooks. The third is the timeline injection:
{
"type": "command",
"command": "... node bun-runner.js worker-service.cjs hook claude-code context ..."
}
I removed just that one entry. The other two SessionStart hooks (install + worker-start) and the recording hooks (PostToolUse, Stop, SessionEnd) stay intact, so memory is still being captured. The MCP search server still works β I just have to ask for it.
# Always back up before editing plugin internals
cp hooks.json hooks.json.bak
# Remove the 3rd SessionStart hook (jq or manual edit)
Result: ~2K tokens saved per session. Memory still works on demand.
β οΈ Caveat: editing a file in ~/.claude/plugins/cache/ will be overwritten on plugin upgrade. For durability, mirror the change in your user-level ~/.claude/settings.json hooks block, or add a small post-upgrade re-patch script.
Sink #3: Round 2 of the Skill Vault (~1.4K tokens)
The original vault was a one-time bulk move. After several weeks of actual usage, I saw which skills I'd installed and never touched. The vault was overdue for a second pass.
Audit script (same as the first post, run again):
for f in ~/.claude/skills/*/SKILL.md; do
dir=$(dirname "$f")
name=$(basename "$dir")
size=$(wc -c < "$f")
echo "$size $name"
done | sort -rn | head -30
I ended up vaulting 27 more skills out of 73 actively loaded:
-
8 marketing skills (
mkt-content,mkt-seo,mkt-social, etc.) β I do real marketing in a separate context -
6 research skills (
research,research-deep,research-report, etc.) β episodic, not daily -
5 niche tools (
obsidian-vault,make-pdf,pair-agent,setup-browser-cookies,open-gstack-browser) -
4 design-heavy (
design-consultation,design-html,design-shotgun,devex-review) β restored only when designing -
3 plan reviews (
plan-design-review,plan-devex-review,plan-tune) -
1 interview prep (
staff-engineer-interview) β used a few times last quarter, not weekly
Bulk move:
for s in mkt-content mkt-email mkt-growth mkt-pr mkt-review mkt-seo mkt-social cmo \
research research-add-fields research-add-items research-deep research-report edit-article \
staff-engineer-interview \
obsidian-vault make-pdf pair-agent setup-browser-cookies open-gstack-browser \
plan-design-review plan-devex-review plan-tune \
design-consultation design-html design-shotgun devex-review; do
mv ~/.claude/skills/$s ~/.claude/skills-vault/ 2>/dev/null
done
The skill-vault index skill from the original post still bridges everything β Claude knows where each skill lives and restores it on demand.
Result: 73 β 46 active skills. ~1.4K tokens saved.
The Combined Result
| Fix | Tokens Saved (per session) |
|---|---|
| Sink #1 β CLAUDE.md trim + per-component split | ~1.5K |
| Sink #2 β claude-mem timeline disabled | ~2K |
| Sink #3 β Round 2 skill vault | ~1.4K |
| Total | ~4.9K |
On top of the original 96% reduction from the Skill Vault, this is another solid bite. But honestly, the dollar value isn't the point.
Why This Matters Beyond Cost
Every token in your context is a token Claude has to attend over before generating its response. The bigger your prompt, the more diluted attention becomes on the actual task.
Big context β better answers. Frequently it's the opposite. Targeted context wins.
The pattern across all three fixes is the same:
- Audit what's auto-loaded vs. what's actually useful. You'll be surprised.
- Move episodic content out of the always-on path and into on-demand access.
- Trust the model to pull what it needs β when it does need the vaulted skill, the subdir's CLAUDE.md, or the memory search, it will reach for it.
Less ambient noise. Sharper signal.
Tips for Your Own Audit
-
Inspect, don't guess.
wc -cyourCLAUDE.mdfiles.ls ~/.claude/skills/. Read your pluginhooks.jsonfiles line by line. -
Hierarchy is free. Per-directory
CLAUDE.mdfiles cost nothing when you're not in that directory. -
Plugin hooks are debt. Every
SessionStartorUserPromptSubmithook is a tax. Audit them by hand. Some are essential (auth, telemetry); some inject "context" that's just clutter. - Re-vault every few weeks. Usage patterns shift. Skills that were daily three months ago may be quarterly now. Yesterday's must-have is tomorrow's vault candidate.
-
Watch for per-turn taxes. A
UserPromptSubmithook costs N tokens every single turn. Even a small reminder block adds up fast in a long session.
Conclusion
The Skill Vault pattern is still the heaviest hitter. These three follow-ups are the long tail:
- CLAUDE.md β split per component, slim the root.
- Plugin hooks β audit auto-injected context. Disable what you don't need.
- Skill vault β revisit it. Vault more.
Together: another ~5K tokens per session, zero capability lost.
When your context is tight, your model is sharp.
I'm Prakash Ponali, a Staff Engineer with 16+ years in enterprise eCommerce. Currently building Khetisahayak β a farming helper app for Telugu-speaking farmers in Andhra Pradesh. Find me on LinkedIn.
United States
NORTH AMERICA
Related News
How Brazeβs CTO is rethinking engineering for the agentic area
10h ago
Amazon Employees Are 'Tokenmaxxing' Due To Pressure To Use AI Tools
21h ago

Implementing Multicloud Data Sharding with Hexagonal Storage Adapters
15h ago

DeepMindβs CEO Says AGI May Be ~4 Years Away. The Last Three Missing Pieces Are Not What Most People Think.
15h ago

CCSnapshot - A Claude Code Configs Transfer Tool
21h ago