Senior Architect Interview Series

LangGraph & Agentic AI
Complete Interview Prep Guide

10 chapters · From ReAct patterns to production agents · 2026 Edition

📅 March 28, 2026⏱ ~90 min read🎯 Senior Engineer / Architect level
Chapter 1 of 10Why Agents? ReAct Pattern vs Chains

LangGraph Chapter 1 — Why Agents? ReAct Pattern vs Chains

Senior Architect Interview Series — LangGraph & Agentic AI


Chapter Map (All 10 Chapters)

ChapterTopicInterview Weight
Ch 1 ← YOU ARE HEREWhy Agents? ReAct Pattern vs Chains⭐⭐⭐⭐⭐ Always asked first
Ch 2LangGraph Fundamentals — StateGraph, Nodes, Edges⭐⭐⭐⭐⭐
Ch 3AgentState & Reducers — add_messages Deep Dive⭐⭐⭐⭐⭐
Ch 4Tool Calling — How It Works End-to-End⭐⭐⭐⭐⭐
Ch 5Conditional Routing & Graph Control Flow⭐⭐⭐⭐
Ch 6Memory — In-Context, Session, Long-Term⭐⭐⭐⭐⭐
Ch 7Multi-Agent — Supervisor Pattern & Handoffs⭐⭐⭐⭐⭐
Ch 8Human-in-the-Loop & Interrupts⭐⭐⭐⭐
Ch 9Error Handling, Retries & Fallback Agents⭐⭐⭐⭐
Ch 10Production Agents — Streaming, Tracing, Scaling⭐⭐⭐⭐⭐

1.0 Why This Is Always the First Interview Question

Before any interviewer asks you about LangGraph specifics, they will ask:

"Why use an agent instead of a simple chain or a direct API call?"

Your answer to this question tells them everything about whether you understand the fundamental architectural shift that agents represent. A weak answer describes what an agent does. A strong answer explains why the problem requires it.


1.1 The Problem With Chains

A chain is a fixed, linear sequence of operations:

Input → Step 1 → Step 2 → Step 3 → Output

Example: A simple RAG chain

User Question
     │
     ▼
Embed question
     │
     ▼
Search ChromaDB
     │
     ▼
Build prompt
     │
     ▼
Call GPT-4
     │
     ▼
Return answer

This works perfectly when:

  • The steps are always the same
  • You always need exactly one retrieval
  • The question always maps to one document source
  • No reasoning is required about WHAT to do next

Where Chains Break Down

Scenario 1 — Multi-step reasoning:

User: "Compare the refund policies of our US and EU divisions,
       then tell me which one gives customers more time."

Chain approach:
  → Retrieves US policy ✓
  → Builds prompt ✓
  → GPT-4 generates answer
  → BUT: never retrieved the EU policy — it wasn't hardcoded!
  → Answer is incomplete or fabricated

Scenario 2 — Dynamic tool choice:

User: "What are the top-selling items AND what does our product manual say about them?"

Chain: hardcoded to ONE retrieval path
  → Cannot dynamically decide: "I need both SQL data AND document retrieval"
  → Calls the wrong tool or only one source

Scenario 3 — Error recovery:

Chain step 2 fails (ChromaDB timeout)
  → Chain crashes or returns error
  → No ability to retry with different strategy
  → No fallback to a different tool

Scenario 4 — Iterative refinement:

User: "Summarize all policies related to international shipping"

Chain: calls retrieval once, gets 3 chunks
  → GPT-4: "I need more information about customs duties specifically"
  → Chain: cannot retrieve again — no second trip back to the tool

The fundamental limitation of chains: The control flow is determined at design time by the developer. It cannot adapt at runtime based on what the LLM learns.


1.2 What Is an Agent?

An agent is a system where the LLM itself controls the execution flow.

                    ┌─────────────┐
                    │    LLM      │  ← The LLM is the CONTROLLER
                    │  (Reasoner) │
                    └──────┬──────┘
                           │
               ┌───────────┼───────────┐
               │           │           │
               ▼           ▼           ▼
          [Tool A]     [Tool B]    [End /
          rag_search   sql_query    Answer]
               │           │
               └─────┬─────┘
                     │
              Observations fed
              back to LLM
                     │
                     ▼
              LLM reasons again:
              "Do I have enough to answer?"
              → YES → produce final answer
              → NO  → call another tool

Key difference: In a chain, the developer codes if retrieval_needed → retrieve. In an agent, the LLM decides if retrieval_needed → retrieve.


1.3 The ReAct Pattern — The Foundation of All Agents

ReAct = Reasoning + Acting

Introduced in the paper: "ReAct: Synergizing Reasoning and Acting in Language Models" (Yao et al., 2022, Google Brain).

The ReAct Loop

┌──────────────────────────────────────────┐
│                                          │
│   THOUGHT (reasoning)                    │
│   "I need to find the refund policy      │
│    to answer this question"              │
│                                          │
│         ↓                                │
│   ACTION (tool call)                     │
│   rag_search("refund policy")            │
│                                          │
│         ↓                                │
│   OBSERVATION (tool result)              │
│   "Refunds accepted within 30 days..."   │
│                                          │
│         ↓                                │
│   THOUGHT (reasoning again)              │
│   "I now have the refund policy.         │
│    I can answer the question."           │
│                                          │
│         ↓                                │
│   FINAL ANSWER                           │
│   "The refund window is 30 days."        │
│                                          │
└──────────────────────────────────────────┘
         ↑___________________________|
              (loop until done)

Each cycle consists of:

  1. Thought — LLM reasons about current state and what to do
  2. Action — LLM calls a tool (or decides to stop)
  3. Observation — Tool result is returned to LLM
  4. Repeat — Until LLM determines it has enough to answer

Why This Is Powerful

The LLM gets to see its own tool results and reason about them before deciding the next step. This enables:

  • Multi-hop retrieval (retrieve → read → retrieve again with refined query)
  • Dynamic tool selection (choose the right tool based on the question)
  • Self-correction (if one tool gives insufficient info, try another)
  • Early termination (if the first retrieval is enough, don't call more tools)

1.4 ReAct in Your Project — Line by Line

Your agent/agent.py implements ReAct exactly. Let's trace through it:

# agent/agent.py

# ── STEP 1: Define the THOUGHT capability ──────────────────────────────────
# The LLM (reasoner) is bound with tools it CAN call
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
llm_with_tools = llm.bind_tools(tools)
# bind_tools() tells the LLM: "You may call these functions.
# When you need information, emit a tool_call instead of a text answer."

# ── STEP 2: The THOUGHT node ───────────────────────────────────────────────
def call_llm(state: AgentState) -> AgentState:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}
# LLM reads ALL messages (conversation history + tool results so far)
# and decides: produce an answer, OR request a tool call

# ── STEP 3: The ACTION + OBSERVATION node ─────────────────────────────────
def call_tools(state: AgentState) -> AgentState:
    last_message = state["messages"][-1]  # the AIMessage with tool_calls
    results = []
    for tool_call in last_message.tool_calls:
        tool_fn = tool_map[tool_call["name"]]
        output = tool_fn.invoke(tool_call["args"])           # ACTION
        results.append(ToolMessage(                           # OBSERVATION
            content=str(output),
            tool_call_id=tool_call["id"]
        ))
    return {"messages": results}
# Executes the tool, wraps result in ToolMessage → fed back to LLM

# ── STEP 4: The DECISION (loop or stop?) ──────────────────────────────────
def should_call_tools(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "call_tools"   # → loop back, execute tools
    return END                # → done, return final answer
# This IS the ReAct loop controller.
# If LLM asked for tools → execute them and loop back to call_llm
# If LLM gave a text answer → END (no more tool calls)

The complete ReAct execution trace for: "What is Agent Factory?"

Turn 1 — call_llm:
  Input:  [HumanMessage("What is Agent Factory?")]
  LLM:    "I should search the knowledge base for this."
  Output: AIMessage(tool_calls=[rag_search("Agent Factory")])

Turn 1 — should_call_tools:
  → sees tool_calls → returns "call_tools"

Turn 1 — call_tools:
  → executes rag_search("Agent Factory")
  → Output: ToolMessage("Agent Factory is a platform that...")

Turn 2 — call_llm:
  Input:  [HumanMessage, AIMessage(tool_call), ToolMessage(result)]
  LLM:    "I now have the information. I can answer."
  Output: AIMessage("Agent Factory is PepsiCo's platform for...")

Turn 2 — should_call_tools:
  → no tool_calls in last message → returns END

Final answer: "Agent Factory is PepsiCo's platform for..."

1.5 Why LangGraph Instead of Plain LangChain Agents?

This is a must-know interview question.

LangChain AgentExecutor (Old Way)

# Old LangChain agent — black box
from langchain.agents import AgentExecutor, create_openai_tools_agent

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)
result = executor.invoke({"input": "What is Agent Factory?"})

Problems:

  • The loop runs entirely inside AgentExecutor — you cannot see or control intermediate steps
  • No ability to pause mid-execution (no human-in-the-loop)
  • Cannot inject custom logic between steps (e.g., guardrails after each tool call)
  • Hard to add custom state beyond just the messages
  • Debugging is painful — it's a black box

LangGraph (New Way — What Your Project Uses)

# LangGraph — explicit, visible, controllable
graph = StateGraph(AgentState)
graph.add_node("call_llm", call_llm)
graph.add_node("call_tools", call_tools)
graph.set_entry_point("call_llm")
graph.add_conditional_edges("call_llm", should_call_tools)
graph.add_edge("call_tools", "call_llm")
agent = graph.compile()

Advantages:

LangChain AgentExecutorLangGraph
Implicit loop (hidden)Explicit graph (visible)
Fixed flowConfigurable nodes + edges
No mid-execution controlPause/resume with interrupt_before
Hard to add custom stateFull TypedDict state, any fields
Debugging: guessDebugging: trace each node
Single agent onlyMulti-agent natively
No persistenceState can be checkpointed

Interview one-liner: "LangGraph makes the agent loop explicit — every node, every edge, every state transition is visible and controllable. AgentExecutor hides all of that."


1.6 Chains vs Agents vs Agentic RAG — Decision Framework

Use this to answer "when would you USE an agent vs a chain?"

DECISION TREE:

Is the task always the same steps in the same order?
  YES → Use a chain (LangChain LCEL, simple pipeline)
  NO  ↓

Does the solution require choosing between multiple tools?
  NO  → Use a chain with one tool
  YES ↓

Does the solution require seeing a tool result before deciding the next step?
  NO  → Use a chain (all steps predetermined)
  YES ↓

Does the solution require iteration (loop until done)?
  NO  → Use a chain with multiple fixed tools
  YES ↓

                      USE AN AGENT

Concrete examples:

Use CaseChain or Agent?Why
Simple Q&A from one documentChainFixed path: retrieve → answer
Customer chatbot with memoryAgentMust reason about multi-turn, tool choice
"Compare policies from 3 docs"AgentMust retrieve 3× iteratively
ETL pipelineChainFixed transformation steps
Code debugging assistantAgentIterates: read error → fix → run → check result
Data analysis with SQL + docsAgentDynamic tool choice: SQL or RAG or both
Your project's multi-agent systemAgent + SupervisorMultiple specialized capabilities

1.7 The Three Levels of LLM Application Architecture

Senior architects categorize LLM applications into three levels:

Level 1 — LLM-Powered Functions

Direct API calls. No framework needed.

response = openai.chat.completions.create(model="gpt-4o-mini", messages=[...])

Use when: Single prompt, simple transformation, no tools needed.

Level 2 — Chains

Fixed multi-step pipelines. LangChain LCEL or similar.

chain = retriever | prompt | llm | output_parser
result = chain.invoke({"question": "..."})

Use when: Steps are known upfront, no branching, no loops.

Level 3 — Agents (LangGraph)

LLM controls the flow. Dynamic, iterative, multi-tool.

graph = StateGraph(AgentState)
# nodes + edges + conditional routing
agent = graph.compile()
result = agent.invoke({"messages": [HumanMessage("...")]})

Use when: Complex reasoning, tool choice, iteration, or multi-agent needed.

Your project implements Level 3 — and that's exactly what makes it interview-worthy.


1.8 What "Agentic" Really Means

The word "agentic" is overused. Here is the precise definition for senior interviews:

An agentic system has:

PropertyMeaningYour Project
AutonomyLLM decides what to do without explicit instructions per step✓ LLM chooses when to call rag_search
Tool useCan take actions beyond text generationrag_search, sql_query tools
PerceptionReads its own action results✓ ToolMessages fed back to LLM
MemoryRetains information across turns✓ Session history in PostgreSQL
Goal-directedPursues a goal across multiple steps✓ Loops until question is answered
AdaptabilityChanges approach based on what it learns✓ Decides next tool based on current state

A system that just calls GPT-4 with a fixed prompt is NOT agentic — it's a Level 1 function. A system where GPT-4 decides what to do next based on tool results IS agentic.


1.9 The Message Accumulation Pattern

Every ReAct agent shares one core pattern: messages accumulate.

Initial state:
  messages: [HumanMessage("What is Agent Factory?")]

After call_llm (Turn 1):
  messages: [
    HumanMessage("What is Agent Factory?"),
    AIMessage(tool_calls=[rag_search("Agent Factory")])    ← appended
  ]

After call_tools:
  messages: [
    HumanMessage("What is Agent Factory?"),
    AIMessage(tool_calls=[rag_search("Agent Factory")]),
    ToolMessage("Agent Factory is a platform...")          ← appended
  ]

After call_llm (Turn 2):
  messages: [
    HumanMessage("What is Agent Factory?"),
    AIMessage(tool_calls=[rag_search("Agent Factory")]),
    ToolMessage("Agent Factory is a platform..."),
    AIMessage("Agent Factory is PepsiCo's platform...")    ← appended
  ]

Why this matters:

  • The LLM in Turn 2 sees the FULL history — including its own tool call and the result
  • This is what enables reasoning ABOUT tool results
  • The add_messages reducer in AgentState handles this append behavior (Chapter 3)

1.10 Code Reference — Complete agent.py Annotated

import os
from typing import Annotated, TypedDict
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage, BaseMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from rag.retrieve import retrieve, build_prompt
from .memory import load_history, save_history
from sqlalchemy.orm import Session
from .guardrails import check_input
from .logger import log_guardrail, log_agent_start, log_agent_end, log_tool_call

# ── TOOL DEFINITION ────────────────────────────────────────────────────────
@tool
def rag_search(query: str) -> str:
    """Search the knowledge base for relevant information about the query."""
    # The docstring IS the tool description the LLM reads to decide when to use it
    chunks = retrieve(query, top_k=3)
    if not chunks:
        return "No relevant information found in the knowledge base."
    return build_prompt(query, chunks)

# ── STATE DEFINITION ───────────────────────────────────────────────────────
class AgentState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    # Annotated[..., add_messages] means:
    # "When this field is updated, APPEND new messages, don't replace the list"
    # This is the reducer that makes message accumulation automatic

# ── LLM SETUP ──────────────────────────────────────────────────────────────
tools = [rag_search]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
llm_with_tools = llm.bind_tools(tools)
# bind_tools registers tool schemas with the model
# The model receives JSON schema of each tool and can emit structured tool_calls
tool_map = {t.name: t for t in tools}  # name → callable lookup

# ── GRAPH NODES ────────────────────────────────────────────────────────────
def call_llm(state: AgentState) -> AgentState:
    response = llm_with_tools.invoke(state["messages"])
    # response is either:
    #   AIMessage(content="Final answer...")            → done
    #   AIMessage(content="", tool_calls=[...])         → wants to call tools
    return {"messages": [response]}

def call_tools(state: AgentState) -> AgentState:
    last_message = state["messages"][-1]  # AIMessage with tool_calls
    results = []
    for tool_call in last_message.tool_calls:
        log_tool_call("agent", tool_call["name"], tool_call["args"])
        tool_fn = tool_map[tool_call["name"]]
        output = tool_fn.invoke(tool_call["args"])
        results.append(ToolMessage(
            content=str(output),
            tool_call_id=tool_call["id"]  # must match the AIMessage tool_call id
        ))
    return {"messages": results}

# ── CONDITIONAL EDGE ───────────────────────────────────────────────────────
def should_call_tools(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "call_tools"   # → route to call_tools node
    return END                # → exit graph, return final state

# ── GRAPH ASSEMBLY ─────────────────────────────────────────────────────────
graph = StateGraph(AgentState)
graph.add_node("call_llm", call_llm)
graph.add_node("call_tools", call_tools)
graph.set_entry_point("call_llm")           # first node to execute
graph.add_conditional_edges(               # from call_llm, branch:
    "call_llm",
    should_call_tools                       # routing function
    # Return "call_tools" → go to call_tools node
    # Return END           → exit graph
)
graph.add_edge("call_tools", "call_llm")    # always go back to LLM after tools
agent = graph.compile()                     # compile into executable

# ── ENTRY POINT ────────────────────────────────────────────────────────────
def run_agent(question: str, session_id: str, db: Session) -> str:
    # 1. Input guardrail — check BEFORE anything
    allowed = check_input(question)
    log_guardrail(session_id, question, allowed)
    if not allowed:
        return "I can only answer questions relevant to the knowledge base."

    # 2. Load conversation history from PostgreSQL
    start_time = log_agent_start(session_id, question)
    history = load_history(session_id, db)

    # 3. Add the new question to history
    history.append(HumanMessage(content=question))

    # 4. Run the LangGraph agent (ReAct loop)
    final_state = agent.invoke({"messages": history})

    # 5. Extract the final answer (last message)
    answer = final_state["messages"][-1].content

    # 6. Persist session history
    save_history(session_id, question, answer, db)
    log_agent_end(session_id, start_time)
    return answer

1.11 Interview Q&A

Q: What is the ReAct pattern and why does it matter?

ReAct (Reasoning + Acting) is the foundational pattern for LLM agents. The LLM alternates between reasoning about the current state (Thought), taking an action via a tool call (Act), and reading the result (Observe). This loop continues until the LLM determines it has enough information to produce a final answer. It matters because it enables dynamic, multi-step problem solving that static chains cannot do — the LLM can retrieve information, read the result, decide it needs more, retrieve again with a refined query, and so on.


Q: Why did you use LangGraph instead of LangChain's AgentExecutor?

AgentExecutor runs the loop as a black box — you can't see or control intermediate steps, can't pause execution, can't inject custom logic between steps, and can't add custom state fields. LangGraph makes the agent an explicit directed graph: every node, every edge, every conditional branch is visible and controllable. This means I can add guardrails between steps, implement human-in-the-loop interrupts, add checkpointing for fault tolerance, or route to different subgraphs based on state. For a production system at PepsiCo's scale, that control and visibility is non-negotiable.


Q: When would you use a chain instead of an agent?

When the steps are known upfront and fixed — a chain is simpler, faster, cheaper, and more predictable. The overhead of an agent (iterative LLM calls, tool execution loop) is only justified when you need: dynamic tool selection based on the question, iterative retrieval (retrieve → reason → retrieve again), or the ability to recover from tool failures with a different strategy. A simple Q&A over a single document collection is a chain. A system that might query documents, SQL, or external APIs depending on the question is an agent.


Q: How does your agent know when to stop?

The should_call_tools function checks the last message in the state. If the LLM's response contains tool_calls, the graph routes to the call_tools node and loops back. If the LLM's response is a plain text AIMessage with no tool calls, the graph returns END and execution stops. The LLM inherently knows to stop generating tool calls when it has sufficient information — because its training teaches it to emit tool results only when needed, and to emit a final text answer when the question is fully answerable.


Q: What is bind_tools() and what does it actually do?

llm.bind_tools(tools) registers the JSON schema of each tool with the LLM. At call time, OpenAI receives the tool schemas alongside the messages. The model is then capable of producing structured tool_calls in its response — a JSON object with the function name and arguments — instead of plain text. The LLM never directly executes Python functions; it emits a structured request, and the call_tools node in the graph dispatches to the actual Python function and feeds the result back as a ToolMessage.


Q: What happens if a tool call fails mid-agent execution?

In the current implementation, an exception in tool_fn.invoke() would propagate up and crash the agent invocation. In production, I'd wrap each tool call in a try/except and return an error ToolMessage (e.g., "Tool failed: timeout"). The LLM then reads this failure as an observation and can reason about it — either retrying with a different query, calling a different tool, or returning a graceful answer explaining it couldn't retrieve the information. LangGraph also supports retry policies at the graph level via compiled graph settings.


1.12 Key One-Liners to Memorize

"A chain is a developer-defined path. An agent is an LLM-defined path."

"ReAct = Thought → Action → Observation → repeat until done."

"LangGraph makes the agent loop explicit. AgentExecutor hides it."

"bind_tools() doesn't let the LLM call Python. It lets the LLM REQUEST a call —
 the framework executes it."

"Messages accumulate across the loop — every tool call and result becomes context
 for the next reasoning step."

"Use a chain when steps are known. Use an agent when the LLM needs to decide the steps."

1.13 Mental Model Summary

CHAIN (fixed flow):
  User Input → [Step 1] → [Step 2] → [Step 3] → Answer
  Developer decides every step at design time.

AGENT (dynamic flow):
  User Input
       │
       ▼
  [LLM decides: do I need a tool?]
       │
       ├── YES → [Execute Tool] → [LLM reads result] → loop
       │
       └── NO  → [Final Answer]

LANGGRAPH AGENT (your project):
  StateGraph with explicit nodes (call_llm, call_tools)
  and conditional routing (should_call_tools)
  ↓
  Full visibility, full control, production-ready
  ↓
  run_agent():
    guardrails → load history → ReAct loop → save history → return

Next: Chapter 2 — LangGraph Fundamentals: StateGraph, Nodes & Edges

Header Logo