Chapter 9: Context Management

Coding agents run long conversations with many tool calls, so the context window fills up quickly. Context management (tracking history, estimating tokens, and compacting when necessary) is therefore critical infrastructure.

The Problem

A typical coding session might involve:

Codex: ContextManager

History Tracking

Codex maintains an ordered transcript of ResponseItem variants:

| Type | Description |
| --- | --- |
| Message | User messages |
| Reasoning | Model reasoning/thinking |
| LocalShellCall | Local shell invocations |
| FunctionCall | Tool invocations |
| FunctionCallOutput | Tool results |
| ToolSearchCall / ToolSearchOutput | Tool discovery |
| CustomToolCall / CustomToolCallOutput | Custom tool invocations |
| WebSearchCall | Web search invocations |
| ImageGenerationCall | Image generation requests |
| GhostSnapshot | Compacted history marker |
| Compaction | Encrypted compaction content |
| Other | Catch-all for unrecognized types |

Token Accounting
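
The Comparison section of this chapter characterizes Codex's token estimation as byte-based heuristics. The sketch below illustrates the general idea only: the 4-bytes-per-token ratio and the function names `estimate_tokens` and `estimate_history_tokens` are assumptions made for illustration, not values or APIs taken from Codex.

```python
import math

# Assumption: roughly 4 bytes of UTF-8 text per token. This is a common rule
# of thumb, not a constant taken from the Codex source.
BYTES_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate a token count from the UTF-8 byte length of the text."""
    return math.ceil(len(text.encode("utf-8")) / BYTES_PER_TOKEN)


def estimate_history_tokens(items: list[str]) -> int:
    """Sum the per-item estimates over an ordered transcript."""
    return sum(estimate_tokens(item) for item in items)
```

A heuristic like this is cheap enough to run on every turn, at the cost of drifting from the real tokenizer's count, so a compaction threshold built on it would need some safety margin.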

Compaction

When context exceeds the threshold, Codex:

  1. Creates a GhostSnapshot marker at the compaction boundary
  2. Summarizes old history (optionally via a remote API call)
  3. Replaces old messages with the summary
  4. Continues with fresh context after the boundary
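
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Codex implementation: the `Item` class, the `maybe_compact` function, the `keep_recent` parameter, and the choice to represent the summary as a `Message` item are all hypothetical, and the summarizer is passed in as a plain callable standing in for the (possibly remote) summarization call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Item:
    kind: str  # e.g. "Message", "FunctionCall", "GhostSnapshot"
    text: str


def maybe_compact(
    history: list[Item],
    estimated_tokens: int,
    threshold: int,
    summarize: Callable[[list[Item]], str],
    keep_recent: int = 2,  # hypothetical: number of newest items left untouched
) -> list[Item]:
    """Return a compacted copy of history if the token estimate exceeds threshold."""
    if estimated_tokens <= threshold:
        return list(history)

    # Step 1: pick the compaction boundary and create the marker for it.
    boundary = max(0, len(history) - keep_recent)
    if boundary == 0:
        return list(history)  # nothing old enough to compact
    old, recent = history[:boundary], history[boundary:]
    marker = Item("GhostSnapshot", f"compacted {len(old)} items")

    # Step 2: summarize everything before the boundary.
    summary = Item("Message", summarize(old))

    # Steps 3 and 4: the summary replaces the old items, and the
    # conversation continues with the items after the boundary.
    return [summary, marker] + recent
```

Returning a new list rather than mutating `history` keeps the original transcript available, for example for session persistence.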

History Normalization

Before sending to the API:
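
The Comparison section summarizes this step for Codex as stripping unsupported types and validating pairs. A minimal sketch of such a pass is shown below, with several assumptions: items are plain dictionaries with `type` and `call_id` keys, the `SUPPORTED` set is purely illustrative, and unpaired calls or outputs are simply dropped (a real implementation might repair them instead).

```python
# Assumption: an illustrative allow-list, not the set Codex actually uses.
SUPPORTED = {"Message", "Reasoning", "FunctionCall", "FunctionCallOutput"}
PAIRED_TYPES = ("FunctionCall", "FunctionCallOutput")


def normalize(items: list[dict]) -> list[dict]:
    """Drop unsupported item types, then drop calls/outputs with no partner."""
    kept = [it for it in items if it["type"] in SUPPORTED]

    # A call_id is valid only if both a call and an output carry it.
    call_ids = {it["call_id"] for it in kept if it["type"] == "FunctionCall"}
    output_ids = {it["call_id"] for it in kept if it["type"] == "FunctionCallOutput"}
    paired = call_ids & output_ids

    return [
        it
        for it in kept
        if it["type"] not in PAIRED_TYPES or it["call_id"] in paired
    ]
```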

Session Persistence

Code Location

Claude Code: Multi-Strategy Compaction

Message Types

Claude Code tracks the following 8 message types:

| Type | Description |
| --- | --- |
| UserMessage | User input and tool_results |
| AssistantMessage | Model output with tool_use blocks |
| SystemMessage | Internal control (UI-only, not sent to API) |
| AttachmentMessage | Memory files, code context |
| ProgressMessage | Tool execution progress |
| TombstoneMessage | Placeholder for deleted content |
| CompactBoundaryMessage | Context compaction marker |
| ToolUseSummaryMessage | Compressed tool results |

Token Tracking

Compaction Strategies

Claude Code has multiple compaction strategies:

  1. Auto-compact: Triggered when tokens exceed threshold. Forks a summarization agent to compress old messages.

  2. Micro-compact: Cached message compression — individual large tool results are compressed into summaries.

  3. Snip-based: (Feature-gated HISTORY_SNIP) Selective removal of less-important history.

  4. Reactive: (Feature-gated CACHED_MICROCOMPACT) Reactive updates to cached compact data.
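
To make the micro-compact idea concrete, here is a minimal sketch of compressing individual large tool results while caching the summaries. It is not Claude Code's implementation: the dictionary message shape, the `micro_compact` function, the `max_chars` cutoff, and the content-hash cache key are all assumptions chosen for illustration.

```python
import hashlib
from typing import Callable


def micro_compact(
    messages: list[dict],
    max_chars: int,                   # hypothetical size cutoff for "large"
    summarize: Callable[[str], str],  # stand-in for the real summarizer
    cache: dict[str, str],            # summaries keyed by content hash
) -> list[dict]:
    """Replace oversized tool results with cached summaries; leave the rest as-is."""
    out = []
    for msg in messages:
        content = msg["content"]
        if msg["type"] == "tool_result" and len(content) > max_chars:
            key = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if key not in cache:
                cache[key] = summarize(content)  # only summarize unseen content
            out.append({**msg, "content": cache[key], "compacted": True})
        else:
            out.append(msg)
    return out
```

Because the cache is keyed by content, running the pass again on later turns would not re-summarize results it has already compressed.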

Compaction Flow

Token count exceeds threshold
    ↓
Insert CompactBoundaryMessage
    ↓
Fork summarization agent (separate API call)
    ↓
Agent summarizes messages before boundary
    ↓
Replace old messages with ToolUseSummaryMessage
    ↓
Continue conversation with reduced context

Message Normalization

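Before sending to the API, normalizeMessagesForAPI() strips UI fields, validates pairs, and truncates content, as summarized in the Comparison section.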

Session Persistence

Code Location

Comparison

| Aspect | Codex | Claude Code |
| --- | --- | --- |
| Message types | 14 (ResponseItem variants) | 7+ (typed message classes) |
| Token estimation | Byte-based heuristics | API response counts + estimation |
| Compaction trigger | Token threshold | Token threshold |
| Compaction method | GhostSnapshot + remote summarization | Multi-strategy (auto, micro, snip, reactive) |
| Budget enforcement | Token tracking | max_budget_usd cost limit |
| Session resume | codex resume / codex fork | --resume / --fork-session / --resume-session-at |
| History normalization | Strip unsupported types, validate pairs | Strip UI fields, validate pairs, truncate content |

Key Difference

Codex uses a single compaction strategy (ghost snapshots). Claude Code has multiple strategies that can be combined: auto-compact for the overall conversation, micro-compact for individual large results, and feature-gated experimental strategies. This suggests a more elaborate approach to context management in Claude Code, possibly driven by the need to handle more diverse real-world conversations.