Claude Agent SDK: Subagents and Session Store

> The Claude Agent SDK is the library form of the Claude Code harness. Built-in tools, subagents for context isolation, hooks, W3C trace propagation, session store parity. Claude Managed Agents is the hosted alternative for long-running async work.

Type: Learn + Build

Languages: Python (stdlib)

Prerequisites: Phase 14 · 01 (Agent Loop), Phase 14 · 10 (Skill Libraries)

Time: ~75 minutes

Learning Objectives

The Problem

A raw LLM API gets you one round-trip. A production agent needs tool execution, MCP servers, lifecycle hooks, subagent spawning, session persistence, trace propagation. Claude Agent SDK ships this shape as a library — the same harness Claude Code uses, exposed for custom agents.

The Concept

Client SDK vs Agent SDK

Built-in tools

The SDK ships 10+ tools out of the box: file read/write, shell, grep, glob, web fetch, more. Custom tools register via the standard tool-schema interface.

Subagents

Two purposes documented by Anthropic:

  1. Parallelization. Run independent work concurrently. "Find the test file for each of these 20 modules" is 20 parallel subagent tasks.
  2. Context isolation. Subagents use their own context window; only results return to the orchestrator. The orchestrator's budget is preserved.

Python SDK recent additions: list_subagents(), get_subagent_messages() for reading subagent transcripts.

Session store

Protocol parity with TypeScript:

--session-mirror (CLI flag) mirrors the transcript to an external file as it streams, for debugging.

Hooks

Lifecycle hooks you can register:

Hooks are how pro-workflow (Phase 14 curriculum reference) and similar systems add cross-cutting behavior.

W3C trace context

OTel spans active on the caller propagate into the CLI subprocess via W3C trace context headers. The whole multi-process trace shows up as one trace in your backend.

Claude Managed Agents

The hosted alternative (beta header managed-agents-2026-04-01). Long-running async work, built-in prompt caching, built-in compaction. Trade control for managed infrastructure.

Where this pattern goes wrong

Build It

code/main.py implements the SDK shape in stdlib:

Run it:

python3 code/main.py

The trace shows subagent context isolation (orchestrator context size stays bounded), hook execution, and session persistence.

Use It

Ship It

outputs/skill-claude-agent-scaffold.md scaffolds a Claude Agent SDK app with subagents, hooks, session store, MCP server attachment, and W3C trace propagation.

Exercises

  1. Add a subagent spawner that batches 20 tasks into groups of 5 parallel subagents. Measure orchestrator context size vs one-per-task.
  2. Implement a PreToolUse hook that rate-limits write_file calls (5 per minute per session). Trace the behavior.
  3. Wire list_subkeys to render a subagent tree. What does deep nesting look like?
  4. Port the toy to the real claude-agent-sdk Python package. What changes about tool registration?
  5. Read the Claude Managed Agents docs. When would you switch from self-hosted to managed?

Key Terms

Term What people say What it actually means
Agent SDK "Claude Code as a library" Harness shape: tools, MCP, hooks, subagents, session store
Subagent "Child agent" Separate context, own budget; results bubble up
Session store "Conversation DB" Persist, load, list, delete turns with subagent cascade
Hook "Lifecycle callback" Pre/post tool, session, prompt submit, compact, stop
W3C trace context "Cross-process trace" Parent span propagates into CLI subprocess
Managed Agents "Hosted harness" Anthropic-hosted long-running async work
--session-mirror "Transcript mirror" Writes session turns to an external file as they stream
MCP server "Tool surface" External tool/resource source attached to the agent

Further Reading