Production Runtimes: Queue, Event, Cron

> Production agents run on six runtime shapes: request-response, streaming, durable execution, queue-based background, event-driven, and scheduled. Pick the shape before you pick the framework. Observability is load-bearing at every shape.

Type: Learn

Languages: Python (stdlib)

Prerequisites: Phase 14 · 13 (LangGraph), Phase 14 · 22 (Voice)

Time: ~60 minutes

Learning Objectives

The Problem

Production agents fail in ways a Jupyter notebook never surfaces: a network timeout at step 37, a user hanging up mid-call, a cron job dying on machine reboot, a background worker running out of memory. The runtime shape determines which of these failures are survivable.

The Concept

Request-response

Streaming

Durable execution

Queue-based / background

Event-driven

Scheduled

2026 deployment patterns

Observability is load-bearing

Without OpenTelemetry GenAI spans (Lesson 23) plus a Langfuse/Phoenix/Opik backend (Lesson 24), you cannot debug a multi-step agent that failed at step 40. This is not optional for production. It's the difference between "we debug fast" and "we replay from scratch with more logging."
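To make "failed at step 40" concrete, here is a minimal stdlib sketch of per-step spans. It is not the OpenTelemetry API: `span`, `SPANS`, `run_agent`, and `first_failure` are hypothetical names, and the in-memory list stands in for a real trace backend. It only illustrates why recording one span per step lets you find the failing step without re-running the agent.

```python
import time
import uuid
from contextlib import contextmanager

SPANS = []  # in-memory stand-in for a trace backend (assumption, not a real exporter)


@contextmanager
def span(name, trace_id, **attrs):
    """Record one span: name, attributes, status, and duration."""
    record = {"trace_id": trace_id, "name": name, "attrs": attrs,
              "status": "ok", "error": None}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Mark the span as failed, then let the error propagate.
        record["status"] = "error"
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        SPANS.append(record)


def run_agent(steps, fail_at=None):
    """Hypothetical multi-step agent; optionally fails at one step."""
    trace_id = uuid.uuid4().hex
    try:
        for i in range(1, steps + 1):
            with span("agent.step", trace_id, step=i):
                if i == fail_at:
                    raise TimeoutError(f"tool call timed out at step {i}")
    except TimeoutError:
        pass  # the run is over; the failing span is already recorded
    return trace_id


def first_failure(trace_id):
    """Return the first failed span of a trace, or None."""
    for s in SPANS:
        if s["trace_id"] == trace_id and s["status"] == "error":
            return s
    return None


if __name__ == "__main__":
    tid = run_agent(steps=50, fail_at=40)
    failed = first_failure(tid)
    print(failed["attrs"]["step"], failed["error"])
```

With spans like these, the question "where did it break?" becomes a lookup by trace ID instead of a rerun with extra logging.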

Where production runtimes fail

Build It

code/main.py is a stdlib multi-shape demo:

Run it:

python3 code/main.py

Output: five traces showing each shape's behavior on the same task. Same agent logic, different outer shells. Durable execution (the sixth shape) is intentionally covered in Lesson 13 with LangGraph checkpointing.
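The "same agent logic, different outer shells" idea can be sketched as follows. This is not the contents of `code/main.py`; it is an illustrative stdlib sketch in which `agent`, `request_response`, `streaming`, `EventBus`, and `scheduled` are hypothetical names, and `sched.scheduler` stands in for a real cron trigger.

```python
import sched
import time


def agent(task):
    """Hypothetical agent core: yields one result string per step."""
    for i, word in enumerate(task.split(), start=1):
        yield f"step {i}: handled {word!r}"


def request_response(task):
    # Shell 1: the caller blocks until every step is done, then gets one payload.
    return list(agent(task))


def streaming(task):
    # Shell 2: the caller receives each step as soon as it is produced.
    yield from agent(task)


class EventBus:
    """Shell 3: the agent runs only when a matching external event arrives."""

    def __init__(self):
        self.handlers = {}

    def on(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def emit(self, event_type, payload):
        return [handler(payload) for handler in self.handlers.get(event_type, [])]


def scheduled(task, delay_s=0.01):
    # Shell 4: a timer, not a caller, starts the run (stand-in for cron).
    results = []
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    scheduler.enter(delay_s, 1, lambda: results.extend(agent(task)))
    scheduler.run()  # blocks until the scheduled run has fired
    return results


if __name__ == "__main__":
    task = "fetch parse summarize"
    print(request_response(task))
    for chunk in streaming(task):
        print(chunk)
    bus = EventBus()
    bus.on("ticket.created", request_response)
    print(bus.emit("ticket.created", task))
    print(scheduled(task))
```

Only the shell decides who waits, who triggers the run, and when output becomes visible; `agent` itself does not change.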

Use It

Ship It

outputs/skill-runtime-shape.md picks a runtime shape for a task and wires the observability requirements.

Exercises

  1. Port your Lesson 01 ReAct loop to all six shapes in your stack. Which shape fits which product surface?
  2. Add a DLQ to the queue-based demo. Simulate 10% job failure; surface DLQ size.
  3. Write a cron-triggered eval agent that runs nightly against your top 20 traces from the day.
  4. Implement streaming with backpressure: if the client is slow, pause the agent. How does this interact with a turn budget?
  5. Read Claude Managed Agents docs. When would you move a self-hosted long-horizon agent to managed?
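As a possible starting point for exercise 2, here is a minimal single-threaded sketch of a queue with retries and a dead-letter queue. It assumes nothing about the lesson's actual demo: `process`, `run_worker`, `failure_rate`, and `max_attempts` are hypothetical names, and the failure is simulated with a seeded random number generator so runs are repeatable.

```python
import queue
import random


def process(job, rng, failure_rate):
    """Hypothetical job handler that fails with probability `failure_rate`."""
    if rng.random() < failure_rate:
        raise RuntimeError(f"job {job['id']} failed")
    return f"job {job['id']} done"


def run_worker(job_ids, failure_rate=0.1, max_attempts=3, seed=0):
    """Drain the queue; retry failed jobs, then park them in the DLQ."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    work = queue.Queue()
    dlq = []   # parking lot for jobs that exhausted their attempts
    done = []
    for job_id in job_ids:
        work.put({"id": job_id, "attempts": 0})

    while not work.empty():
        job = work.get()
        job["attempts"] += 1
        try:
            done.append(process(job, rng, failure_rate))
        except RuntimeError as exc:
            if job["attempts"] >= max_attempts:
                job["error"] = str(exc)
                dlq.append(job)   # give up: surface it instead of dropping it
            else:
                work.put(job)     # re-enqueue for another attempt
    return done, dlq


if __name__ == "__main__":
    done, dlq = run_worker(range(100), failure_rate=0.1)
    print(f"done={len(done)} dlq_size={len(dlq)}")
```

A real worker pool would add concurrency, retry delays, and a persistent queue; the sketch only shows the bookkeeping that makes DLQ size a metric you can surface.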

Key Terms

| Term | What people say | What it actually means |
|---|---|---|
| Request-response | "Synchronous" | User waits; short tasks only |
| Streaming | "SSE / WS" | Progressive output; better UX; latency observable per chunk |
| Durable execution | "Resume from failure" | Checkpointed state; restart at last step |
| Queue-based | "Background jobs" | Producer / worker pool / DLQ |
| Event-driven | "Trigger-based" | Agent reacts to external events |
| DLQ | "Dead-letter queue" | Parking lot for failed jobs |
| Claude Managed Agents | "Hosted harness" | Anthropic-hosted long-running async with caching + compaction |

Further Reading