CrewAI: Role-Based Crews and Flows

> CrewAI is the 2026 role-based multi-agent framework — Agents, Tasks, Crews, Processes as the four primitives. Production guidance from the docs: "for any production-ready application, start with a Flow."

Type: Learn + Build

Languages: Python (stdlib)

Prerequisites: Phase 14 · 12 (Workflow Patterns), Phase 14 · 14 (Actor Model)

Time: ~60 minutes

Learning Objectives

The Problem

Teams adopting multi-agent frameworks hit the same wall: "autonomous collaboration" sounds great, but when customers file a bug you need deterministic replay. CrewAI splits this explicitly — Crews for creative collaboration, Flows for event-driven, auditable, production-shaped workflows.

The Concept

Four primitives
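
As a rough illustration, the four primitives can be modeled with stdlib dataclasses, using the field breakdown given in the Key Terms section. This is a toy sketch, not the crewai API: the class layout, the `work` and `kickoff` method names, and the string-formatting stand-in for an LLM call are all assumptions made for the example, and only the sequential process is implemented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Process(Enum):
    SEQUENTIAL = "sequential"      # tasks run in listed order
    HIERARCHICAL = "hierarchical"  # a manager picks the next task (not sketched here)


@dataclass
class Agent:
    role: str
    goal: str
    backstory: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def work(self, description: str, context: str) -> str:
        # Stand-in for an LLM call (assumption): a real agent would prompt a model here.
        return f"[{self.role}] {description} (context: {context or 'none'})"


@dataclass
class Task:
    description: str
    expected_output: str
    agent: Agent


@dataclass
class Crew:
    agents: list[Agent]
    tasks: list[Task]
    process: Process = Process.SEQUENTIAL

    def kickoff(self) -> list[str]:
        if self.process is not Process.SEQUENTIAL:
            raise NotImplementedError("only the sequential process is sketched")
        outputs: list[str] = []
        context = ""
        for task in self.tasks:
            result = task.agent.work(task.description, context)
            outputs.append(result)
            context = result  # each task sees the previous task's output
        return outputs


researcher = Agent("Researcher", "Find facts", "Careful and skeptical")
writer = Agent("Writer", "Draft the summary", "Plain-spoken")
crew = Crew(
    agents=[researcher, writer],
    tasks=[
        Task("collect sources", "a list of sources", researcher),
        Task("write summary", "one paragraph", writer),
    ],
)
results = crew.kickoff()
for line in results:
    print(line)
```

Because the stand-in `work` method is a pure function, this toy crew is deterministic. With a real model behind each Agent, the outputs and the context passed between tasks would vary from run to run.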

Crews vs Flows

CrewAI 2026 docs say: start production apps with Flows; fold Crews in as sub-steps when autonomy earns its cost.
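
One way to picture "Flow first, Crew as a sub-step" is a small event-driven runner in which each step fires when the step it listens to finishes. The sketch below is a stdlib toy under stated assumptions: the `Flow` class, its `start`, `listen`, and `run` methods, and the `trace` attribute are hypothetical names for this example and should not be read as the crewai API. A plain function stands in for the place where a Crew would be invoked.

```python
from typing import Callable

Step = Callable[[dict], str]


class Flow:
    """Toy event-driven flow: each step fires when the step it listens to finishes."""

    def __init__(self) -> None:
        self.start_step: str | None = None
        self.steps: dict[str, Step] = {}
        self.listeners: dict[str, list[str]] = {}  # finished step -> steps to run next
        self.trace: list[str] = []

    def start(self, fn: Step) -> Step:
        self.start_step = fn.__name__
        self.steps[fn.__name__] = fn
        return fn

    def listen(self, upstream: str) -> Callable[[Step], Step]:
        def register(fn: Step) -> Step:
            self.steps[fn.__name__] = fn
            self.listeners.setdefault(upstream, []).append(fn.__name__)
            return fn
        return register

    def run(self, state: dict) -> dict:
        if self.start_step is None:
            raise RuntimeError("no start step registered")
        queue = [self.start_step]
        while queue:
            name = queue.pop(0)
            state[name] = self.steps[name](state)  # store each step's output in state
            self.trace.append(name)                # record execution order for audit
            queue.extend(self.listeners.get(name, []))
        return state


flow = Flow()


@flow.start
def intake(state: dict) -> str:
    return state["ticket"].strip().lower()


@flow.listen("intake")
def draft_reply(state: dict) -> str:
    # A Crew could be invoked here as a sub-step; a plain function stands in for it.
    return f"draft for: {state['intake']}"


@flow.listen("draft_reply")
def review(state: dict) -> str:
    return "approved" if "refund" in state["draft_reply"] else "escalate"


final = flow.run({"ticket": "  Refund request #12  "})
print(flow.trace)
print(final["review"])
```

The step order lives in code rather than in model output, so the trace is the same on every run. Only the body of `draft_reply` would become variable if a Crew replaced the stand-in.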

Memory system

CrewAI ships four memory types out of the box: short-term (within run), long-term (across runs), entity (per-entity facts), contextual (retrieval-time assembly). Integrations with vector stores are first-party.

AWS Bedrock integration

CrewAI has documented AWS Bedrock integration with CloudWatch, AgentOps, and Langfuse observability hooks. AWS docs cite a 5.76x speedup vs LangGraph on QA tasks in their benchmarks — take framework-specific numbers as directional, not absolute.

Dependency shape

CrewAI is independent of LangChain, supports Python 3.10–3.13, and uses uv for dependency management. The repository had 30k+ GitHub stars as of early 2026.

Where this pattern goes wrong

Build It

code/main.py implements stdlib versions of both a Crew and a Flow.

Run it:

python3 code/main.py

The Crew trace is fluid and variable; the Flow trace is fixed and observable. That is the choice.

Use It

Ship It

outputs/skill-crew-or-flow.md picks Crew vs Flow for a task and scaffolds the minimal implementation.

Exercises

  1. Convert a Crew-based demo to a Flow. Count the touchpoints where variability drops.
  2. Add entity memory to the Crew: facts about a customer persist across tasks.
  3. Implement a Hierarchical process: a manager Agent picks which specialist runs next based on the prior output.
  4. Read CrewAI's docs intro. Port your toy to the real crewai API. What changes about testability?
  5. Wire AgentOps or Langfuse to one of your runs. Which traces did you miss in the stdlib version?

Key Terms

| Term | What people say | What it actually means |
|------|-----------------|------------------------|
| Agent | "Persona" | Role + goal + backstory + tools |
| Task | "Unit of work" | Description + expected output + assignee |
| Crew | "Agent team" | Container for Agents + Tasks + Process |
| Process | "Execution strategy" | Sequential / Hierarchical / Consensual |
| Flow | "Deterministic workflow" | Event-driven, code-owned, testable |
| Backstory | "Persona prompt" | Tone and judgment shaper for the Agent |
| Entity memory | "Per-entity facts" | Memory scoped to a customer/account/issue |

Further Reading