Why agents do not call models directly
The operational reason for routing spec-driven agents through AgentRuntime.
Why this mattered
Direct model calls are tempting until you need telemetry, cost caps, guardrails, and reproducibility.
This belongs in the development timeline because Agentic Quant Platform is not a single feature. It is a local-first quant research and trading platform with FastAPI, Celery, Postgres, Iceberg, DuckDB, MLflow, Redis-backed RAG, strategy factories, agents, bots, streaming, and paper trading. The project only became useful once its infrastructure decisions were written down well enough to be repeated.
Design decision
AgentRuntime turns a YAML spec into an auditable run with model choice, context, tools, and outputs in one place.
The practical stack around this decision includes Python, FastAPI, Celery, Redis, Postgres, SQLAlchemy, Alembic, Iceberg, DuckDB, MLflow, LiteLLM, CrewAI, LangGraph, vectorbt-pro, Kafka, Flink, Next.js. I try to keep the interfaces small: configuration describes intent, runtime code owns behavior, and operational notes explain what a future maintainer should check first.
What I would repeat
That is the difference between a clever agent and an operated agent.
The repeatable pattern is to make the boring path explicit. For this project that means clear repository boundaries, documented setup, predictable deployment commands, and enough observability to know whether the system is healthy or merely quiet.
Reader takeaway
If you are building something similar, start with the workflow you need to repeat every week. Then add only the platform pieces that make that workflow easier to recover, explain, and extend.