Claude Code is not a simple LLM. Here is a breakdown of every piece:
Configuration (active)
- settings.json — your permissions, allowed tools, MCP server config, hooks
- settings.local.json — machine-local overrides (not committed)
- settings.json.backup / .OPTIMIZED — snapshots from past config changes
- CLAUDE.md — your global instructions injected into every conversation
- .credentials.json — your Anthropic API auth token
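For reference, a minimal settings.json might look like the sketch below. The specific permission rules and model value are illustrative, not a recommended configuration:

```json
{
  "permissions": {
    "allow": ["Read", "Bash(git diff:*)"],
    "deny": ["Read(./.credentials.json)"]
  },
  "hooks": {},
  "model": "claude-sonnet-4-5"
}
```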
Conversation history
- history.jsonl — full log of every prompt/response across all sessions (used for /cost, search, replay)
- projects//*.jsonl — per-session conversation transcripts for this working directory; the UUID-named folders alongside them hold task state for that session
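Because each line of a .jsonl file is an independent JSON object, these transcripts can be processed line by line. A minimal sketch (the record fields shown are assumptions; the real schema carries more metadata such as timestamps and session ids):

```python
import json
import tempfile
from pathlib import Path

# Build a tiny stand-in transcript; the {"role", "content"} shape is assumed
# for illustration, not taken from the real history.jsonl schema.
sample = "\n".join([
    json.dumps({"role": "user", "content": "build the index"}),
    json.dumps({"role": "assistant", "content": "Running the pipeline..."}),
])

path = Path(tempfile.mkdtemp()) / "history.jsonl"
path.write_text(sample)

# JSONL means one JSON object per line: parse line by line, never as one document.
records = [json.loads(line) for line in path.read_text().splitlines() if line.strip()]
print(len(records))  # 2
```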
Active session state
- sessions/.json — one file per running Claude Code process (identified by OS process ID); tracks which session is live
- session-env/ — captures the shell environment at session start so Claude inherits your PATH etc.
- shell-snapshots/ — snapshots of the working shell state (cwd, env) for context restoration
- paste-cache/ — temporary storage for large text pastes you make mid-conversation
Planning & tasks
- plans/.md — when you enter plan mode, the plan is written here and shown to you for approval before implementation
- tasks// — per-session task lists (the TaskCreate/TaskUpdate tool state); they disappear when the session ends
Memory (persistent across sessions)
- projects//memory/MEMORY.md + *.md files — the memory system described earlier
- projects//memory/ is per-project; global memories would live at root level
IDE integration
- ide/ — socket/config files for VS Code and JetBrains extensions to connect to the running Claude Code process
Plugins / MCP
- plugins/ — installed MCP plugin metadata (config.json, blocklist.json, download cache)
- mcp-needs-auth-cache.json — tracks which MCP servers need OAuth so Claude doesn’t re-prompt every turn
Telemetry & caching
- telemetry/ — usage events sent to Anthropic (opt-out via settings)
- statsig/ — feature flag evaluation cache (controls which Claude Code features are enabled for your account)
- stats-cache.json — cached token/cost stats for /cost display
- cache/ — general HTTP response cache
- debug/ — diagnostic dumps written when Claude Code crashes or you run /bug
Misc
- file-history/ — tracks which files Claude has read/edited for the permission system
- todos/ — legacy todo list storage (superseded by tasks/)
- archived-projects/ — old project conversation archives
- backups/ — periodic backups of settings.json and other config before destructive changes
The harness is everything outside the LLM itself that makes it useful as a coding agent:
- Permission system (settings.json allowlists) — lets the LLM touch the filesystem and run commands safely
- Hook system — shell commands that fire on events (pre/post tool call), letting you automate behaviors the LLM alone can’t enforce
- Session/state management (sessions/, shell-snapshots/, session-env/) — keeps the agent grounded in your actual working environment
- MCP plugin system (plugins/) — extends the LLM’s tool repertoire without retraining
- Context injection (CLAUDE.md, memory files) — shapes behavior persistently without being in every prompt
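The components above can be sketched as a toy dispatch loop. This is an illustrative model, not Claude Code's actual implementation: real hooks are shell commands matched against tool names, modeled here as Python callables returning an exit code to keep the sketch self-contained:

```python
# Toy model of the harness's hook dispatch, not Claude Code's real implementation.
def run_tool(name: str, tool_input: dict, pre_hooks: dict, post_hooks: dict) -> str:
    for hook in pre_hooks.get(name, []):
        if hook(tool_input) != 0:      # a nonzero exit code blocks the tool call
            return "blocked"
    result = f"ran {name}"             # stand-in for real tool execution
    for hook in post_hooks.get(name, []):
        hook(tool_input)               # post hooks observe; they cannot block
    return result

# A pre-hook that blocks Edit calls touching _FINAL files.
pre = {"Edit": [lambda d: 1 if "_FINAL" in d.get("file_path", "") else 0]}
print(run_tool("Edit", {"file_path": "out_FINAL.xlsx"}, pre, {}))  # blocked
```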
The core IP is really the interplay between three things:
- The tool definitions (Read, Edit, Bash, Grep, etc.) — structured interfaces so the LLM can act on the world, not just describe actions
- The permission/hook infrastructure — the safety layer that makes autonomous tool execution viable
- The memory + context system — what makes it a persistent agent rather than a stateless chatbot
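To make the first point concrete, here is a minimal tool definition in the Anthropic tool-use style. The exact schemas Claude Code uses internally are not published, so every field value here is an assumption chosen to show the shape of the interface:

```python
# Illustrative only: a structured tool interface the model can call against,
# rather than free-text descriptions of actions.
read_tool = {
    "name": "Read",
    "description": "Read a file from the local filesystem.",
    "input_schema": {
        "type": "object",
        "properties": {
            "file_path": {
                "type": "string",
                "description": "Absolute path of the file to read",
            }
        },
        "required": ["file_path"],
    },
}
```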
So, following the spirit of the Claude Code harness, we would do the same for the index construction agent:

I have not used hooks yet. Hooks run in the harness, not in the LLM. Here are potential examples:
Example 1 — Log every Python script Claude runs
"hooks": {
  "PostToolUse": [
    {
      "matcher": "Bash",
      "hooks": [
        {
          "type": "command",
          "command": "echo \"[$(date)] $CLAUDE_TOOL_INPUT\" >> C:/Users/ncarucci/.claude/bash_audit.log"
        }
      ]
    }
  ]
}
Every Bash call Claude makes gets appended to a log. Useful for auditing what ran during an index build.
---
Example 2 — Block accidental writes to production Excel outputs
"hooks": {
  "PreToolUse": [
    {
      "matcher": "Edit",
      "hooks": [
        {
          "type": "command",
          "command": "echo \"$CLAUDE_TOOL_INPUT\" | python -c \"import sys,json; d=json.load(sys.stdin); exit(1 if '_FINAL' in d.get('file_path','') else 0)\""
        }
      ]
    }
  ]
}
If Claude tries to edit a file with _FINAL in the name, the hook exits with code 1, blocking the tool call before it
executes.
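The inline python -c one-liner gets hard to maintain. An equivalent standalone script (hypothetical, not part of Claude Code) could read the tool input JSON from stdin instead:

```python
import json
import sys

def should_block(raw: str) -> bool:
    """Return True when the Edit payload targets a protected _FINAL file."""
    d = json.loads(raw)
    return "_FINAL" in d.get("file_path", "")

def main() -> int:
    # The hook command pipes $CLAUDE_TOOL_INPUT into this script's stdin;
    # exit code 1 blocks the tool call, exit code 0 lets it proceed.
    return 1 if should_block(sys.stdin.read()) else 0
```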
---
Example 3 — Auto-validate universe size after each pipeline step
"hooks": {
  "PostToolUse": [
    {
      "matcher": "Bash",
      "hooks": [
        {
          "type": "command",
          "command": "python C:/Users/ncarucci/Documents/Gitfolder/Working/shared/validate_universe.py"
        }
      ]
    }
  ]
}
After each matched command runs, the validation script checks that the output CSV row count is within expected bounds and prints a warning if not — Claude sees that output and can react.
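A validate_universe.py along these lines might look like the following sketch; the CSV layout (header row plus one row per constituent) and the row bounds are assumptions, not the real script:

```python
import csv

def check_universe_size(path: str, lo: int = 400, hi: int = 600) -> bool:
    """Warn if the universe CSV's data row count falls outside [lo, hi]."""
    with open(path, newline="") as f:
        n_rows = sum(1 for _ in csv.reader(f)) - 1  # subtract the header row
    if not (lo <= n_rows <= hi):
        print(f"WARNING: universe has {n_rows} rows, expected {lo}-{hi}")
        return False
    return True
```

Because the hook prints to stdout, the warning lands in the transcript where Claude can see it and decide to rerun or investigate the step.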