Senior Architect Interview Series

LangGraph & Agentic AI
Complete Interview Prep Guide

10 chapters · From ReAct patterns to production agents · 2026 Edition

📅 March 28, 2026 · ⏱ ~90 min read · 🎯 Senior Engineer / Architect level
Chapter 7 of 10 · Multi-Agent: Supervisor Pattern & Handoffs



7.0 What This Chapter Covers

Single-agent systems hit limits: one agent can't be an expert at everything. Multi-agent systems solve this by decomposing work across specialized agents, each with their own tools and prompts. This chapter covers:

  1. Why multi-agent architectures matter
  2. The supervisor pattern — your project's approach
  3. supervisor.py deep dive and LangGraph integration
  4. Sub-agent worker patterns
  5. data_agent.py as a specialized worker
  6. Handoff strategies and communication patterns
  7. Production considerations for multi-agent systems

7.1 Why Multi-Agent?

A single agent with many tools becomes difficult to optimize:

Problem              Symptom
Too many tools       LLM gets confused about which tool to use
Competing concerns   Same agent for code, data, conversation
Performance          All requests hit the same (expensive) model
Reliability          One faulty tool can destabilize the whole agent
Specialization       No room for domain-specific prompting

Solution: Decompose into specialized agents, each with focused tools, tailored prompts, and appropriate models.

                    ┌─────────────┐
    User Question──►│  Supervisor │
                    │ (Classifier)│
                    └──────┬──────┘
                           │
           ┌───────────────┼───────────────┐
           │               │               │
           ▼               ▼               ▼
    ┌────────────┐  ┌────────────┐  ┌────────────┐
    │  RAG Agent │  │ Data Agent │  │General Agent│
    │            │  │            │  │            │
    │ ChromaDB   │  │  SQLite/   │  │  Direct    │
    │ retrieval  │  │  Postgres  │  │  LLM call  │
    └────────────┘  └────────────┘  └────────────┘
           │               │               │
           └───────────────┴───────────────┘
                           │
                    Final Answer

7.2 Your Project's Supervisor — supervisor.py

# agent/supervisor.py
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

def route_question(question: str) -> str:
    """
    Classify the user's question into one of three categories:
    - RAG:     Questions about Agent Factory (knowledge base)
    - DATA:    Questions requiring database queries/metrics
    - GENERAL: General conversational questions
    
    Returns: "RAG", "DATA", or "GENERAL"
    """
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    
    prompt = f"""You are a routing assistant. Classify the following question.
    
    Categories:
    - RAG: Questions about Agent Factory features, architecture, capabilities, documentation
    - DATA: Questions requiring data analysis, metrics, statistics, database queries
    - GENERAL: All other conversational questions
    
    Respond with exactly one word: RAG, DATA, or GENERAL
    
    Question: {question}"""
    
    response = llm.invoke([HumanMessage(content=prompt)])
    return response.content.strip().upper()

Key design decisions:

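  • temperature=0: makes routing as repeatable as the API allows (outputs can still vary slightly between calls)
  • One-word response: forces clean output that is easy to parse, though the caller should still validate it
  • Separate LLM call: the supervisor is decoupled from the worker agents
  • Fast, cheap model (gpt-4o-mini): three-way classification doesn't need a larger model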
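
Even with a one-word instruction, a model can occasionally reply with trailing punctuation or a short sentence. The following is a minimal sketch of a defensive parser for that case. `normalize_route` and `VALID_ROUTES` are hypothetical names that do not appear in supervisor.py, and falling back to GENERAL is an assumption chosen to match the default used by the routing function later in this chapter.

```python
import re

VALID_ROUTES = ("RAG", "DATA", "GENERAL")  # hypothetical constant for this sketch

def normalize_route(raw: str, default: str = "GENERAL") -> str:
    """Map free-form classifier output onto one known route label."""
    # Split on anything that is not a letter, so "DATA." or "Route: rag" still parse
    words = re.findall(r"[A-Za-z]+", raw.upper())
    matches = [w for w in words if w in VALID_ROUTES]
    # Accept only an unambiguous answer; anything else falls back to the default
    if len(set(matches)) == 1:
        return matches[0]
    return default
```

A reply that names two categories (for example "RAG or DATA") is treated as ambiguous and sent to the default rather than guessed.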

7.3 Integrating the Supervisor into LangGraph

Here is the full multi-agent LangGraph graph that integrates your supervisor.py with specialized workers:

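from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_openai import ChatOpenAI

# Assumed module layout; adjust the import paths to match your project
from agent.supervisor import route_question
from agent.agent import agent            # compiled ReAct graph from agent.py
from agent.data_agent import data_agent  # compiled graph from data_agent.py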

# Shared state for the multi-agent graph
class MultiAgentState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    route:    str        # set by supervisor, read by router
    session_id: str

# --- Supervisor Node ---
def supervisor_node(state: MultiAgentState) -> dict:
    question = state["messages"][-1].content
    route = route_question(question)   # from supervisor.py
    return {"route": route}           # stores in state: "RAG", "DATA", "GENERAL"

# --- RAG Agent Node ---
def rag_agent_node(state: MultiAgentState) -> dict:
    """Delegate to the full ReAct agent with RAG tools."""
    # Run the full agent graph from agent.py
    result = agent.invoke({"messages": state["messages"]})
    final_answer = result["messages"][-1].content
    return {"messages": [AIMessage(content=final_answer)]}

# --- Data Agent Node ---
def data_agent_node(state: MultiAgentState) -> dict:
    """Delegate to the data analysis agent."""
    result = data_agent.invoke({"messages": state["messages"]})
    final_answer = result["messages"][-1].content
    return {"messages": [AIMessage(content=final_answer)]}

# --- General Agent Node ---
def general_agent_node(state: MultiAgentState) -> dict:
    """Handle general questions with direct LLM call."""
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

# --- Routing Function ---
def route_after_supervisor(state: MultiAgentState) -> str:
    route = state["route"]
    mapping = {"RAG": "rag_agent", "DATA": "data_agent", "GENERAL": "general_agent"}
    return mapping.get(route, "general_agent")  # default to general

# --- Assemble the Multi-Agent Graph ---
multi_graph = StateGraph(MultiAgentState)

multi_graph.add_node("supervisor",    supervisor_node)
multi_graph.add_node("rag_agent",     rag_agent_node)
multi_graph.add_node("data_agent",    data_agent_node)
multi_graph.add_node("general_agent", general_agent_node)

multi_graph.set_entry_point("supervisor")
multi_graph.add_conditional_edges("supervisor", route_after_supervisor)
multi_graph.add_edge("rag_agent",     END)
multi_graph.add_edge("data_agent",    END)
multi_graph.add_edge("general_agent", END)

multi_agent = multi_graph.compile()

7.4 The Data Agent — data_agent.py

Your data_agent.py is a specialized agent with SQL-focused tools:

# agent/data_agent.py (conceptual implementation)
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from sqlalchemy import text
from database import engine

@tool
def execute_sql(query: str) -> str:
    """Execute a read-only SQL query against the application database.
    Use for: counting records, aggregating metrics, fetching user statistics.
    Only SELECT queries are allowed.
    
    Args:
        query: A valid SQL SELECT statement
    
    Returns:
        Query results as a formatted string
    """
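    # Guardrail, not a security boundary: accept a single SELECT statement only.
    # The real protection should be a read-only database role or connection.
    statement = query.strip().rstrip(";")
    if not statement.upper().startswith("SELECT") or ";" in statement:
        return "Error: Only single SELECT queries are permitted."
    
    # The query text is LLM-generated and runs as-is (it is not parameterized)
    with engine.connect() as conn:
        result = conn.execute(text(statement))
        rows = result.fetchmany(100)  # limit rows
        columns = list(result.keys())  # capture names before the connection closes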
        
    if not rows:
        return "No results found."
    
    # Format as readable table
    header = " | ".join(columns)
    separator = "-" * len(header)
    data_rows = [" | ".join(str(v) for v in row) for row in rows]
    return f"{header}\n{separator}\n" + "\n".join(data_rows)

# Build a dedicated agent for data tasks
data_tools = [execute_sql]
data_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
data_llm_with_tools = data_llm.bind_tools(data_tools)
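data_tool_map = {t.name: t for t in data_tools}  # avoid reusing the name of the @tool decorator

# The data agent graph (same ReAct pattern as agent.py)
from typing import Annotated, TypedDict
from langchain_core.messages import BaseMessage
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages

class DataAgentState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]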

def data_call_llm(state): ...
def data_call_tools(state): ...
def data_should_call_tools(state): ...

data_graph = StateGraph(DataAgentState)
# ... (same assembly pattern as agent.py)
data_agent = data_graph.compile()

Key differences from the RAG agent:

  • Different tools (execute_sql vs rag_search)
  • Different system prompt (focused on data analysis)
  • Potentially a different model (for example, one that is stronger at SQL generation)
  • Additional security constraints (SELECT-only)
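
A prefix check on the query string is only a first line of defense; enforcing read-only access at the connection level is sturdier. The sketch below assumes the SQLite option from the architecture diagram and uses the standard library's `sqlite3` module with a read-only URI. `make_demo_db` and `run_readonly` are hypothetical helpers, and the `users` table is demo data. With Postgres, a database role that has only SELECT privileges would play the same part.

```python
import os
import sqlite3
import tempfile

def make_demo_db() -> str:
    """Create a throwaway SQLite file with one table (demo data only)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("lin",)])
    conn.commit()
    conn.close()
    return path

def run_readonly(path: str, query: str, max_rows: int = 100) -> list:
    """Run a query on a connection opened in read-only mode."""
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        return conn.execute(query).fetchmany(max_rows)
    finally:
        conn.close()

DB_PATH = make_demo_db()
rows = run_readonly(DB_PATH, "SELECT COUNT(*) FROM users")

try:
    run_readonly(DB_PATH, "DELETE FROM users")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses writes on a connection opened with mode=ro
    write_blocked = True

os.remove(DB_PATH)  # clean up the demo file
```

Because the database itself rejects the write, the guarantee no longer depends on how the LLM phrases the query.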

7.5 Sub-Agent Patterns

Pattern 1 — Nested Graphs (Current Approach)

Worker agents are independent compiled graphs (the objects returned by compile()), invoked from parent nodes:

def rag_agent_node(state: MultiAgentState) -> dict:
    result = rag_agent.invoke({"messages": state["messages"]})  # sub-graph call
    return {"messages": [AIMessage(content=result["messages"][-1].content)]}

Pros: Clean separation, independent state schemas, composable
Cons: State isn't shared — parent passes messages, gets answer back

Pattern 2 — Subgraph Integration

LangGraph supports embedding subgraphs directly:

# Compile the sub-agent
rag_compiled = graph.compile()

# Add as a node in the parent graph
multi_graph.add_node("rag_agent", rag_compiled)
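# This works when the subgraph shares state keys with the parent (here, "messages"):
# LangGraph passes those keys in and merges the subgraph's updates back into parent state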

Pattern 3 — Tool-Based Sub-Agents

Wrap sub-agents as tools that the supervisor agent can call:

@tool
def ask_rag_agent(question: str) -> str:
    """Ask the RAG agent a question about Agent Factory."""
    result = rag_agent.invoke({"messages": [HumanMessage(content=question)]})
    return result["messages"][-1].content

@tool
def ask_data_agent(question: str) -> str:
    """Ask the data agent to query the database."""
    result = data_agent.invoke({"messages": [HumanMessage(content=question)]})
    return result["messages"][-1].content

# Supervisor becomes an LLM agent that routes via tool calling
supervisor_tools = [ask_rag_agent, ask_data_agent]
supervisor_llm = ChatOpenAI(model="gpt-4o").bind_tools(supervisor_tools)

This is the "meta-agent" pattern — the supervisor itself uses the ReAct loop to decide when to invoke workers.


7.6 Communication Patterns Between Agents

Pattern A — Message Passing (Your Project)

Parent → passes messages list → Worker
Worker → returns single answer → Parent

Clear interface, easy to reason about, but worker can't access parent's full state.

Pattern B — Shared State

All agents work on the same shared state object (LangGraph handles this with subgraphs and input/output schema mapping):

# Parent state
class MainState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    rag_results: list[str]
    sql_results: str

# Sub-agent reads subset, writes to specific fields
def rag_node(state: MainState) -> dict:
    # read messages, write to rag_results
    results = retrieve(state["messages"][-1].content)
    return {"rag_results": results}

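Pattern C — Handoffs (LangGraph's Command)

LangGraph expresses explicit agent-to-agent handoffs with the Command type: a node returns the next node to run together with a state update.

from typing import Literal
from langgraph.types import Command

ROUTE_TO_NODE = {"RAG": "rag_agent", "DATA": "data_agent", "GENERAL": "general_agent"}

def supervisor_node(state) -> Command[Literal["rag_agent", "data_agent", "general_agent"]]:
    route = route_question(state["messages"][-1].content)
    # Use Command to hand off AND update state in one return value
    return Command(
        goto=ROUTE_TO_NODE.get(route, "general_agent"),  # next node, with a safe default
        update={"route": route}                           # state update
    )

Command combines routing and the state update in one return value, so the supervisor no longer needs a separate conditional edge. The return annotation lists the nodes the supervisor can hand off to, which LangGraph uses when rendering the graph.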


7.7 Orchestration vs. Collaboration

Two distinct multi-agent topologies:

Orchestrator → Workers (Hierarchical)

One orchestrator coordinates multiple specialized workers:

Orchestrator
    ├── calls RAG Agent → gets answer
    ├── calls Data Agent → gets metrics
    └── synthesizes both into final answer

Your project uses a simplified form of this pattern: the supervisor routes each question to exactly one worker and returns that worker's answer, with no synthesis step.

Collaborative (Peer-to-Peer)

Agents collaborate on subtasks, passing work back and forth:

Agent A → "I need historical context" → Agent B
Agent B → retrieves context → passes to Agent A
Agent A → "I need current metrics" → Agent C
Agent C → queries DB → passes to Agent A
Agent A → synthesizes → final answer

LangGraph supports both topologies through conditional routing.
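
To make the hierarchical flow concrete, here is a library-free sketch in which an orchestrator calls several workers and combines their answers. The worker functions are stubs standing in for real agents, `orchestrate` is a hypothetical name, and joining the answers with newlines is a placeholder for what would more likely be an LLM synthesis step.

```python
from typing import Callable, Dict

Worker = Callable[[str], str]

def rag_worker(question: str) -> str:
    # Stub standing in for the RAG agent
    return f"[docs] context for: {question}"

def data_worker(question: str) -> str:
    # Stub standing in for the data agent
    return f"[metrics] numbers for: {question}"

def orchestrate(question: str, workers: Dict[str, Worker]) -> str:
    """Hierarchical flow: call every worker, then combine their answers."""
    parts = [f"{name}: {worker(question)}" for name, worker in workers.items()]
    # Placeholder synthesis: a real orchestrator would likely ask an LLM to merge these
    return "\n".join(parts)

answer = orchestrate(
    "How is adoption trending?",
    {"rag": rag_worker, "data": data_worker},
)
```

Control never leaves the orchestrator here, which is what keeps the hierarchical topology easy to trace compared with peer-to-peer handoffs.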


7.8 Production Multi-Agent Considerations

Agent Registry Pattern

# agent_registry.py
from typing import Protocol

class AgentProtocol(Protocol):
    def invoke(self, state: dict) -> dict: ...

AGENT_REGISTRY: dict[str, AgentProtocol] = {
    "rag":     rag_agent,
    "data":    data_agent,
    "general": general_agent,
}

def dispatch_to_agent(route: str, state: dict) -> dict:
    """Central dispatch with fallback."""
    agent = AGENT_REGISTRY.get(route, AGENT_REGISTRY["general"])
    return agent.invoke(state)

Observability — Trace Which Agent Was Used

import logging

logger = logging.getLogger(__name__)

def supervisor_node(state: MultiAgentState) -> dict:
    route = route_question(state["messages"][-1].content)
    logger.info(
        "supervisor_routing",
        extra={
            "route": route,
            "session_id": state["session_id"],
            "question_preview": state["messages"][-1].content[:100]
        }
    )
    return {"route": route}
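
Logged routing decisions become more useful once they are aggregated. The sketch below keeps an in-memory tally and flags a route that takes an outsized share of traffic. `RouteStats` is a hypothetical helper, and the 0.95 threshold is an illustrative assumption; a production system would more likely export these counts to a metrics backend.

```python
from collections import Counter
from typing import Dict, Optional

class RouteStats:
    """Hypothetical in-memory tally of routing decisions."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, route: str) -> None:
        self.counts[route] += 1

    def shares(self) -> Dict[str, float]:
        """Return each route's fraction of all recorded decisions."""
        total = sum(self.counts.values())
        if total == 0:
            return {}
        return {route: n / total for route, n in self.counts.items()}

    def dominant(self, threshold: float = 0.95) -> Optional[str]:
        """Return the route whose share meets the threshold, if any."""
        for route, share in self.shares().items():
            if share >= threshold:
                return route
        return None

stats = RouteStats()
for route in ["RAG"] * 96 + ["DATA"] * 4:
    stats.record(route)
flagged = stats.dominant()  # "RAG" holds 96% of the traffic in this demo data
```

A flagged route is a prompt to investigate, not proof of a bug: the traffic may simply be skewed toward one kind of question.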

Circuit Breaker for Sub-Agents

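from datetime import datetime, timedelta

class AgentCircuitBreaker:
    def __init__(self, max_failures=5, reset_after=timedelta(minutes=5)):
        self.failures = {}     # agent name -> consecutive failure count
        self.open_until = {}   # agent name -> time the circuit closes again
        self.max_failures = max_failures
        self.reset_after = reset_after
    
    def is_open(self, agent_name: str) -> bool:
        until = self.open_until.get(agent_name)
        if until is not None and datetime.now() >= until:
            # Cool-down elapsed: close the circuit and start counting afresh
            del self.open_until[agent_name]
            self.failures[agent_name] = 0
            return False
        return until is not None
    
    def record_success(self, agent_name: str) -> None:
        self.failures[agent_name] = 0
    
    def record_failure(self, agent_name: str) -> None:
        self.failures[agent_name] = self.failures.get(agent_name, 0) + 1
        if self.failures[agent_name] >= self.max_failures:
            self.open_until[agent_name] = datetime.now() + self.reset_after
    
    def call(self, agent_name: str, agent_fn, *args, **kwargs):
        if self.is_open(agent_name):
            raise RuntimeError(f"Agent {agent_name} circuit breaker is open")
        try:
            result = agent_fn(*args, **kwargs)
        except Exception:
            self.record_failure(agent_name)
            raise
        self.record_success(agent_name)
        return result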

7.9 Interview Q&A

Q: Explain the supervisor pattern in your multi-agent architecture.

The supervisor is a routing agent that classifies each incoming question and dispatches it to the appropriate specialized worker. In our project, supervisor.py's route_question() makes a fast LLM call with temperature=0 to classify the question into RAG (knowledge base), DATA (analytics), or GENERAL (conversational). In the LangGraph graph, the supervisor is a node that stores the route in state, and the routing function reads that route to dispatch to the correct sub-agent node. Each sub-agent is an independent compiled graph (or directly invoked function) with its own tools and system prompt.


Q: How does a worker agent return results to the supervisor/parent?

In our architecture, worker agent nodes call the agent graph directly (nested invocation) and return only the final answer as an AIMessage: return {"messages": [AIMessage(content=result)]}. The parent's add_messages reducer appends this to the shared message history. The parent then reaches END and returns the final state. This "message passing" approach keeps the interface clean — the parent doesn't need to know how the worker produced its answer.


Q: How would you prevent the supervisor from routing everything to the same agent?

Three approaches: (1) Improve the routing prompt: make the categories mutually exclusive, with clear examples of each. (2) Add a confidence check: have the LLM return a route plus a confidence score, and fall back to the general agent when confidence is low. (3) Add monitoring: log routing decisions and track the distribution over time. If 95% of traffic goes to one agent and that doesn't match the real mix of questions, the routing prompt needs tuning. In extreme cases, replace the general-purpose LLM with a fine-tuned classifier for routing.
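
The confidence check in approach (2) could be sketched as follows, assuming the routing prompt is changed to request a JSON reply such as {"route": "DATA", "confidence": 0.9}. `parse_routed_reply` is a hypothetical helper and the 0.7 threshold is an arbitrary illustration. Self-reported LLM confidence is often poorly calibrated, so the threshold would need tuning against logged routing decisions.

```python
import json

VALID_ROUTES = {"RAG", "DATA", "GENERAL"}

def parse_routed_reply(raw: str, min_confidence: float = 0.7) -> str:
    """Parse a JSON routing reply; fall back to GENERAL when in doubt."""
    try:
        payload = json.loads(raw)
        route = str(payload["route"]).strip().upper()
        confidence = float(payload["confidence"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing keys, or a non-numeric confidence value
        return "GENERAL"
    # "not >=" also rejects NaN, which would slip past a plain "<" comparison
    if route not in VALID_ROUTES or not confidence >= min_confidence:
        return "GENERAL"
    return route
```

Every failure mode collapses to the same safe answer, so the graph's routing function never sees an unexpected label.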


Q: What's the difference between supervisor-based and collaborative multi-agent architectures?

A supervisor architecture is hierarchical: one orchestrator makes routing decisions, workers execute and return results. It's simpler to reason about and debug because control flow is centralized. A collaborative architecture is peer-to-peer: agents can call each other and pass work back and forth. It's more flexible for complex tasks requiring multiple perspectives but harder to debug and reason about because control flow is distributed. For our use case (question routing to specialized answerers), the supervisor pattern is the right choice — clear assignment of responsibility, easy to add new workers.


Q: How do you handle partial failures in a multi-agent system?

A worker agent failure should not crash the entire system. Implement error handling at the worker node level: wrap the sub-agent call in a try/except and return a graceful error message as an AIMessage if the worker fails. The supervisor can optionally re-route to a fallback agent (e.g., route data failures to the general agent with a note). Log all failures with enough context to diagnose the issue. In a circuit breaker pattern, a repeatedly failing agent can be temporarily skipped with automatic fallback.
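
The try/except wrapping described above can be sketched as a small higher-order function. To stay library-free, plain dicts and strings stand in for LangGraph state and AIMessage objects. `with_fallback`, the stub nodes, and the `degraded` flag are all hypothetical; in a real graph, `degraded` would have to be declared in the state schema.

```python
import logging
from typing import Callable

logger = logging.getLogger("multi_agent")

Node = Callable[[dict], dict]

def with_fallback(name: str, worker: Node, fallback: Node) -> Node:
    """Wrap a worker node so that an exception reroutes to a fallback node."""
    def guarded(state: dict) -> dict:
        try:
            return worker(state)
        except Exception:
            # Log with traceback, then let the fallback produce the reply
            logger.exception("worker_failed", extra={"worker": name})
            result = fallback(state)
            return {**result, "degraded": True}  # hypothetical flag for downstream nodes
    return guarded

def failing_data_node(state: dict) -> dict:
    # Stub worker that simulates an outage
    raise ConnectionError("database unavailable")

def general_node(state: dict) -> dict:
    # Stub fallback that answers without touching the database
    return {"messages": ["I can't reach the data service right now."]}

safe_data_node = with_fallback("data_agent", failing_data_node, general_node)
outcome = safe_data_node({"messages": ["How many users signed up?"]})
```

Wrapping at the node level keeps the failure local: the parent graph still reaches END with a well-formed state update.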


7.10 Key One-Liners to Memorize

"Supervisor pattern: one classifier routes to specialized workers — separation of concerns."

"temperature=0 for routing LLM calls — you need determinism, not creativity."

"Workers are independent graphs: they receive messages, process, return one answer."

"Multi-agent = decompose by capability: RAG for knowledge, DATA for analytics, GENERAL for chat."

"Observability is critical: log which agent handled each request, and why."

"Circuit breaker: if RAG agent is down, fall back to general agent gracefully."

Next: Chapter 8 — Human-in-the-Loop & Interrupts
