Agentic Stack

The layers that show up when building an agent application.


An "agent" is a loop around a model that can call tools and decide what to do next. Underneath that loop, the same handful of layers shows up in nearly every serious agent project.

The layers

Layer          What it does                          Typical choice
Model          Reasoning, generation                 Claude, GPT, Gemini, or an open model
Tools          Side effects, retrieval, computation  Custom functions, MCP servers, code execution
Context        What the model sees on each turn      System prompt, retrieved docs, tool results, history
Orchestration  Loop control, branching, retries      A while loop, LangGraph, your own state machine
Memory         Persistence across turns or sessions  DB rows, vector store, summaries
Evals          How you know it works                 Golden set + LLM judge + production traces
Observability  Tracing, replay, cost monitoring      Langfuse, LangSmith, Braintrust, Helicone
Deployment     Where the loop runs                   Edge function, long-running worker, queue
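
As one illustration of the Evals row, a golden set can start as a plain list of inputs, each paired with a check on the answer. The sketch below is a minimal harness under stated assumptions: `goldenSet`, `runGoldenSet`, and the `agent` callback are hypothetical names, and the checks are simple string matches rather than an LLM judge.

```javascript
// Hypothetical golden set: each case pairs an input with a check on the answer.
const goldenSet = [
  { input: "What is 2 + 2?", check: (answer) => answer.includes("4") },
  { input: "Name the capital of France.", check: (answer) => /paris/i.test(answer) },
];

// Run every case through `agent` (assumed: an async function from an input
// string to an answer string) and report the pass rate and failing inputs.
async function runGoldenSet(agent, cases) {
  const failures = [];
  for (const { input, check } of cases) {
    let ok = false;
    try {
      ok = Boolean(check(await agent(input)));
    } catch {
      ok = false; // a thrown error counts as a failed case
    }
    if (!ok) failures.push(input);
  }
  const passRate = cases.length ? (cases.length - failures.length) / cases.length : 0;
  return { passRate, failures };
}
```

Running the same set after every prompt or tool change gives a rough regression signal. Judge models and production traces can be layered on later.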

The loop, concretely

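import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

// Runs the agent loop until the model stops asking for tools.
// `messages` is mutated in place, so the caller keeps the full transcript.
// `runTool(name, input)` is application code, assumed to be defined elsewhere
// and to return a string usable as tool_result content.
async function runAgent(messages, tools) {
  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools,
      messages,
    });
    // Record the assistant turn, including any tool_use blocks.
    messages.push({ role: "assistant", content: response.content });

    // Any stop reason other than "tool_use" ends the loop.
    if (response.stop_reason !== "tool_use") return response;

    // Execute every requested tool call in parallel.
    const results = await Promise.all(
      response.content
        .filter((b) => b.type === "tool_use")
        .map(async (b) => ({
          type: "tool_result",
          tool_use_id: b.id,
          content: await runTool(b.name, b.input),
        }))
    );
    // Tool results go back to the model as a user turn.
    messages.push({ role: "user", content: results });
  }
}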
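
The loop above calls `runTool` without defining it. One possible shape, sketched here as an assumption rather than a prescribed design, is a registry that keeps each tool's schema next to its handler. The `toolRegistry` name and the `add` tool are purely illustrative.

```javascript
// Hypothetical registry: each entry pairs the schema sent to the model with
// the handler that runs locally. The "add" tool is only an example.
const toolRegistry = new Map([
  [
    "add",
    {
      description: "Add two numbers.",
      input_schema: {
        type: "object",
        properties: { a: { type: "number" }, b: { type: "number" } },
        required: ["a", "b"],
      },
      handler: async ({ a, b }) => a + b,
    },
  ],
]);

// The `tools` array for the model: schema fields only, no handler.
const tools = [...toolRegistry].map(([name, tool]) => ({
  name,
  description: tool.description,
  input_schema: tool.input_schema,
}));

// Dispatch a tool call by name. Always resolves to a string so the result can
// be used directly as tool_result content. Failures are returned as text
// instead of thrown, so the model can read the error and try something else.
async function runTool(name, input) {
  const tool = toolRegistry.get(name);
  if (!tool) return `Error: unknown tool "${name}"`;
  try {
    const output = await tool.handler(input);
    return typeof output === "string" ? output : JSON.stringify(output ?? null);
  } catch (err) {
    return `Error: ${err.message}`;
  }
}
```

Keeping schema and handler in one place makes it harder for the description the model sees to drift from what the function actually does.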

Findings

  • The loop is small. The core of most agents is a while loop of about 20 lines. The framework is rarely the hard part.
  • Tools are the product. The value of an agent is its tools. Spend time on tool design, not orchestration.
  • Context is the bottleneck. What you put in the prompt matters more than which model you pick. See Context.
  • Long horizons are hard. Agents that take 10+ steps fail in ways short ones never do. Test long traces explicitly.
  • State leaks everywhere. Tool results, scratchpads, and errors all end up in context. Plan for it.
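
Two of these findings lend themselves to small guards: a step budget for long horizons, and a size cap on whatever flows into context. The sketch below shows one way to write each. The names `runWithBudget`, `step`, and `clipToolResult`, along with the default limits, are assumptions made for illustration.

```javascript
// Run `step` until it reports done or the budget runs out. `step` is a
// hypothetical async callback that performs one model call plus its tool
// execution and resolves to { done: boolean, value?: any }.
async function runWithBudget(step, maxSteps = 10) {
  for (let i = 1; i <= maxSteps; i++) {
    const result = await step(i);
    if (result.done) return { status: "finished", steps: i, value: result.value };
  }
  return { status: "budget_exhausted", steps: maxSteps };
}

// Clip a tool result before it enters context. Keeps the head and the tail,
// since errors often show up at the end of long output. The result is
// slightly longer than maxChars because of the marker line.
function clipToolResult(text, maxChars = 2000) {
  if (text.length <= maxChars) return text;
  const half = Math.floor(maxChars / 2);
  const omitted = text.length - 2 * half;
  const head = text.slice(0, half);
  const tail = text.slice(text.length - half);
  return `${head}\n[... ${omitted} characters omitted ...]\n${tail}`;
}
```

Returning an explicit `budget_exhausted` status, rather than throwing, lets long traces that hit the limit be logged and counted separately from ordinary failures.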

Reading