Agno and Mastra: Production Runtimes
> Agno (Python) and Mastra (TypeScript) are the 2026 production-runtime pairing. Agno aims at microsecond agent instantiation and stateless FastAPI backends. Mastra ships agents, tools, workflows, unified model routing, and composite storage on the Vercel AI SDK substrate.
Type: Learn
Languages: Python, TypeScript
Prerequisites: Phase 14 · 01 (Agent Loop), Phase 14 · 13 (LangGraph)
Time: ~45 minutes
Learning Objectives
- Identify Agno's performance targets and when they matter.
- Name Mastra's three primitives — Agents, Tools, Workflows — and the supported server adapters.
- Explain why a stateless session-scoped FastAPI backend is the recommended Agno production path.
- Pick Agno vs Mastra for a given stack (Python-first vs TypeScript-first).
The Problem
LangGraph, AutoGen, CrewAI are framework-heavy. Teams that want "just the agent loop, fast, in my runtime" reach for Agno (Python) or Mastra (TypeScript). Both trade some of the framework-owned primitives for raw speed and a tighter fit to the surrounding stack.
The Concept
Agno
- Python runtime, formerly Phi-data.
- "No graphs, chains, or convoluted patterns — just pure python."
- Performance targets from their docs: ~2μs agent instantiation, ~3.75 KiB memory per agent, ~23 model providers.
- Production path: stateless session-scoped FastAPI backend. Each request starts a fresh agent; session state lives in a DB.
- Native multimodal (text, image, audio, video, file) and agentic RAG.
The speed targets matter when you have thousands of short-lived agents per second (chat fan-in, evaluation pipelines). They matter less when one agent runs for 10 minutes.
Mastra
- TypeScript, built on Vercel AI SDK.
- Three primitives: Agents, Tools (Zod-typed), Workflows.
- Unified Model Router — 3,300+ models across 94 providers (March 2026).
- Composite storage: memory, workflows, observability to different backends; ClickHouse recommended for observability at scale.
- Apache 2.0 with
ee/directories under source-available enterprise license. - Server adapters for Express, Hono, Fastify, Koa; first-class Next.js and Astro integration.
- Ships Mastra Studio (localhost:4111) for debugging.
- 22k+ GitHub stars, 300k+ weekly npm downloads at 1.0 (Jan 2026).
Positioning
Neither is trying to be LangGraph. They compete on:
- Language fit. Agno for Python-first teams; Mastra for TypeScript-first.
- Runtime ergonomics. Agno = near-zero overhead; Mastra = integrated with the Vercel ecosystem.
- Observability. Both integrate with Langfuse/Phoenix/Opik (Lesson 24) but Mastra Studio is first-party.
When to pick each
- Agno — Python backend, many short-lived agents, strong perf requirements, FastAPI shop.
- Mastra — TypeScript backend, Next.js / Vercel deploy, unified multi-provider model routing, Zod-typed tools.
- LangGraph (Lesson 13) — when durable state and explicit graph reasoning matter more than raw speed.
- OpenAI / Claude Agent SDK — when you want the provider's productized shape (Lessons 16–17).
Where this pattern goes wrong
- Perf-for-perf's-sake. Picking Agno because "2μs" sounds good when the workload is one slow agent call per request. Overhead is not the bottleneck.
- Ecosystem lock-in. Mastra's Vercel-flavored integration is a plus on Vercel, a minus elsewhere.
- Enterprise license confusion. Mastra's
ee/directories are source-available, not Apache 2.0. Read the licenses if you're planning to fork.
Build It
This lesson is primarily comparative — no single code artifact would do both frameworks justice. See code/main.py for a side-by-side toy: a minimal "run an agent, stream the output, persist session" flow implemented twice (once Agno-shaped, once Mastra-shaped).
Run it:
python3 code/main.py
Two structurally different but functionally equivalent traces.
Use It
- Agno — Python backend that needs speed and FastAPI shape.
- Mastra — TypeScript backend with many providers and workflow primitives.
- Both ship first-party observability hooks. Both integrate with Langfuse.
Ship It
outputs/skill-runtime-picker.md picks Agno, Mastra, LangGraph, or a provider SDK based on stack, latency budget, and operational shape.
Exercises
- Read Agno's docs. Port the stdlib ReAct loop (Lesson 01) to Agno. What disappeared? What stayed?
- Read Mastra's docs. Port the same loop to Mastra. What changed in tool typing (Zod vs nothing)?
- Benchmark: measure agent instantiation latency on your stack. Does Agno's 2μs matter to your workload?
- Design a migration: if you've been running CrewAI in Python, what breaks if you move to Agno?
- Read Mastra's
ee/license terms. What restrictions would affect an open-source fork?
Key Terms
| Term | What people say | What it actually means |
|---|---|---|
| Agno | "Fast Python agents" | Stateless session-scoped agent runtime |
| Mastra | "TypeScript agents on Vercel AI SDK" | Agents + Tools + Workflows + Model Router |
| Unified Model Router | "Multi-provider access" | Single client for 3,300+ models across 94 providers |
| Composite storage | "Multiple backends" | Memory/workflows/observability each to a different store |
| Mastra Studio | "Local debugger" | localhost:4111 UI for introspecting agents |
| Source-available | "Not OSS" | License permits source reading but restricts commercial use |
Further Reading
- Agno Agent Framework docs — performance targets, FastAPI integration
- Mastra docs — primitives, server adapters, Model Router
- LangGraph overview — the stateful-graph alternative
- Comet Opik — observability comparisons cited by Mastra integrations