Beginner · 12 min read · Module 9, Lesson 1

🤖 What Are AI Agents?

Understand the difference between chatbots and autonomous agents


You have probably used a chatbot before — you type a question, it gives you an answer, and the conversation is done. An AI agent is something fundamentally different. It is an AI system that can observe its environment, make decisions, take actions, and repeat until a goal is achieved — all with minimal human intervention.

Think of it this way: a chatbot is like a librarian who answers your question. An agent is like a research assistant who goes out, finds papers, reads them, takes notes, cross-references sources, and delivers a finished report back to you.


Chatbot vs Agent: A Clear Comparison

Understanding the distinction between chatbots and agents is critical because it changes how you design, build, and deploy AI systems.

Feature | Chatbot | AI Agent
Interaction model | Single turn: question in, answer out | Multi-step: works autonomously toward a goal
Memory | Limited to conversation context | Maintains state across steps and sessions
Tool use | None or minimal | Uses tools (search, code execution, APIs, file I/O)
Decision-making | Responds to explicit instructions | Decides what to do next on its own
Error handling | Returns an error message | Detects errors, retries, and adapts its approach
Scope | Answers one question at a time | Solves complex, multi-step problems
Autonomy | None — waits for user input | High — can plan and execute without human input
Feedback loop | No self-correction | Observes results and adjusts behavior

A chatbot is reactive — it waits for you. An agent is proactive — it pursues a goal.


The Agent Loop: Observe, Think, Act, Repeat

Every AI agent follows a fundamental loop, regardless of how complex it is. This loop is the heartbeat of agentic behavior:

Step 1: Observe

The agent gathers information from its environment. This could mean:

  • Reading a user's request
  • Checking the output of a previous action
  • Inspecting a file, database, or API response
  • Reviewing error messages from a failed step

Step 2: Think

The agent reasons about what it has observed. It asks itself:

  • What is my goal?
  • What do I know so far?
  • What should I do next?
  • Have I encountered an error I need to handle?

This "thinking" step is powered by the LLM's reasoning capabilities. In advanced agents, this step may involve explicit chain-of-thought or extended thinking.

Step 3: Act

The agent takes an action. This is where tool use comes in:

  • Call a search API to find information
  • Execute a code snippet to process data
  • Write to a file or database
  • Send an API request to an external service
  • Ask the user for clarification (when truly needed)

Step 4: Repeat

After acting, the agent returns to Step 1. It observes the result of its action and decides whether it has achieved its goal or needs to take another step.

+-----------+
|  Observe  |<-------------------+
+-----+-----+                    |
      |                          |
      v                          |
+-----------+                    |
|   Think   |                    |
+-----+-----+                    |
      |                          |
      v                          |
+-----------+     +------------+ |
|    Act    +---->|  Evaluate  +-+  (goal not met)
+-----------+     +-----+------+
                        |
                   Goal met? ---> Done

This loop is what gives agents their power. A chatbot stops after one response. An agent keeps going until the job is done.
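The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a real implementation: `llm_decide` and `run_tool` are stubs standing in for an actual model call and an actual search tool, and only the loop structure is the point.

```python
def llm_decide(observation):
    # Stubbed "think" step: a real agent would call an LLM here.
    if "ANSWER" in observation:
        return "done"
    return "search"

def run_tool(tool, query):
    # Stubbed "act" step: pretend the search tool succeeded.
    return f"ANSWER: stub result for {query!r}"

def agent_loop(goal, max_steps=10):
    observation = f"user request: {goal}"      # Observe
    for _ in range(max_steps):
        action = llm_decide(observation)       # Think
        if action == "done":
            return observation                 # Goal met: stop
        observation = run_tool(action, goal)   # Act; the result becomes the next observation
    raise RuntimeError("step budget exhausted")

result = agent_loop("current population of Tokyo")
```

Note the `max_steps` cap: even in a toy loop, an unconditional `while True` is how agents run away (more on that in the risks section).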


Tool Use: The Key Differentiator

The single most important capability that separates an agent from a chatbot is tool use. Without tools, an LLM can only generate text. With tools, it can interact with the real world.

What are tools?

Tools are functions or APIs that the agent can call during its reasoning loop. They extend the agent's capabilities beyond text generation.

Common tool categories

Information retrieval:

  • Web search (Google, Bing)
  • Database queries (SQL, vector search)
  • File reading (read a document, parse a CSV)
  • API calls (fetch weather, stock prices, user data)

Code execution:

  • Run Python, JavaScript, or bash scripts
  • Install packages and dependencies
  • Execute tests and check results

File manipulation:

  • Create, edit, and delete files
  • Navigate directory structures
  • Manage version control (git)

Communication:

  • Send emails or messages
  • Create pull requests
  • Post to Slack or other platforms

How tool use works in practice

When an agent decides to use a tool, the flow looks like this:

  1. The LLM generates a tool call — a structured request specifying which tool to use and with what parameters
  2. The runtime environment executes the tool call
  3. The tool returns a result to the agent
  4. The agent observes the result and continues reasoning
Example tool call (JSON):

{
  "tool": "web_search",
  "parameters": {
    "query": "current population of Tokyo 2026"
  }
}

The agent receives the search results, extracts the relevant information, and decides its next step. This tool-use loop is what makes agents genuinely useful for real-world tasks.
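The runtime side of steps 1-4 can be sketched as a small dispatcher: parse the model's structured tool call, look up the named tool, execute it, and serialize the result back. The `web_search` function here is a stub, not a real search API.

```python
import json

def web_search(query):
    # Stub tool: a real implementation would call a search API.
    return [f"stub result for {query!r}"]

TOOLS = {"web_search": web_search}

def execute_tool_call(raw_call):
    call = json.loads(raw_call)             # 1. the model emitted this JSON
    tool_fn = TOOLS[call["tool"]]           # 2. the runtime looks up the tool
    result = tool_fn(**call["parameters"])  # 3. the tool runs and returns a result
    return json.dumps({"result": result})   # 4. the result goes back to the agent

reply = execute_tool_call(
    '{"tool": "web_search", "parameters": {"query": "current population of Tokyo 2026"}}'
)
```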


Types of AI Agents

Not all agents are the same. They vary in complexity, autonomy, and architecture.

1. Single-Step Agents

The simplest form of agent. It receives a task, uses one tool, and returns a result.

Example: A summarization agent that:

  1. Receives a URL
  2. Fetches the web page content (tool use)
  3. Summarizes it and returns the summary

When to use: Simple, well-defined tasks where one action is sufficient.

2. Multi-Step Agents

These agents chain multiple actions together to solve a more complex problem. They follow the observe-think-act loop for several iterations.

Example: A research agent that:

  1. Receives a topic
  2. Searches the web for relevant articles (tool use)
  3. Reads the top 5 articles (tool use)
  4. Cross-references facts across sources (reasoning)
  5. Writes a structured report (generation)
  6. Saves the report to a file (tool use)

When to use: Tasks that require gathering information from multiple sources, processing data, or completing multi-step workflows.

3. Multi-Agent Systems

Multiple specialized agents work together, each handling a different aspect of a larger task. A coordinator agent delegates subtasks to worker agents.

Example: A software development system with:

  • Planner agent — breaks down a feature request into tasks
  • Coder agent — writes the code for each task
  • Reviewer agent — reviews the code for bugs and style
  • Tester agent — writes and runs tests
  • Deployer agent — handles deployment

When to use: Large-scale tasks that benefit from specialization, tasks requiring different expertise areas, or systems that need parallel processing.


Agent Architectures

As the field has matured, several standard architectures have emerged for building agents.

ReAct (Reasoning + Acting)

ReAct is the most widely used agent architecture. The agent alternates between reasoning (thinking out loud about what to do) and acting (using tools).

How it works:

  1. Thought: "I need to find the current stock price of Apple."
  2. Action: Call the stock_price tool with symbol "AAPL"
  3. Observation: The tool returns $198.50
  4. Thought: "Now I need to compare this to the price 30 days ago."
  5. Action: Call the historical_price tool with symbol "AAPL" and period "30d"
  6. Observation: The tool returns $185.20
  7. Thought: "Apple stock is up 7.2% over 30 days. I have enough information to answer."
  8. Final Answer: "Apple (AAPL) is currently trading at $198.50, up 7.2% from $185.20 thirty days ago."
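One ReAct turn can be sketched as: parse the "Action:" line out of the model's text, execute the named tool, and append the observation to the transcript. The `stock_price` tool, its return value, and the scripted model output are all stubs for illustration.

```python
import re

def stock_price(symbol):
    # Stub quote; a real tool would hit a market-data API.
    return "198.50"

TOOLS = {"stock_price": stock_price}

def react_step(model_output, transcript):
    transcript.append(model_output)
    match = re.search(r"Action: (\w+)\[(\w+)\]", model_output)
    if match is None:
        return transcript                  # a "Final Answer" turn: nothing to execute
    tool_name, arg = match.groups()
    observation = TOOLS[tool_name](arg)    # run the requested tool
    transcript.append(f"Observation: {observation}")
    return transcript

trace = react_step(
    "Thought: I need Apple's current price.\nAction: stock_price[AAPL]",
    [],
)
```

In a full agent, the transcript (thoughts, actions, observations) is fed back into the model for the next turn, which is exactly what makes the reasoning trace so easy to debug.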

Strengths:

  • Simple to implement
  • Easy to debug (you can read the reasoning trace)
  • Works well for most use cases

Weaknesses:

  • Can get stuck in loops
  • May not plan far enough ahead for complex tasks
  • Each step depends on the previous one (sequential)

Plan-and-Execute

This architecture separates planning from execution. The agent first creates a complete plan, then executes each step.

How it works:

  1. Planning phase: Given the task "Build a weather dashboard," create a plan:

    • Step 1: Research weather APIs
    • Step 2: Choose an API and get an API key
    • Step 3: Design the UI layout
    • Step 4: Write the HTML/CSS
    • Step 5: Write the JavaScript to fetch weather data
    • Step 6: Test the application
    • Step 7: Deploy to a hosting service
  2. Execution phase: Execute each step sequentially, potentially re-planning if something goes wrong.
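The two phases reduce to a simple skeleton. The planner and executor below are stubs in place of real LLM calls, and this sketch omits re-planning for brevity.

```python
def plan(task):
    # Planner stub: a real agent would ask the model for this list.
    return ["research weather APIs", "design the UI", "write the code", "test"]

def execute(step):
    # Executor stub: pretend every step succeeds.
    return f"done: {step}"

def plan_and_execute(task):
    steps = plan(task)                        # planning phase: one up-front plan
    return [execute(step) for step in steps]  # execution phase: run steps in order

results = plan_and_execute("Build a weather dashboard")
```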

Strengths:

  • Better for complex, long-horizon tasks
  • Creates a clear roadmap before acting
  • Can re-plan when encountering obstacles

Weaknesses:

  • Initial plan may be wrong or incomplete
  • Re-planning adds latency
  • More complex to implement

Other Notable Architectures

Reflexion: The agent reflects on its own performance after completing a task and uses that reflection to improve future attempts.

LATS (Language Agent Tree Search): The agent explores multiple possible action paths in a tree structure, evaluating each one before committing to the best path.

Toolformer-style: The agent learns when and how to use tools from training data, rather than being explicitly programmed with tool definitions.


Real-World Examples of AI Agents

AI agents are already being used across industries. Here are concrete examples:

Software Development

  • Claude Code (Anthropic): An agentic coding assistant that can read your codebase, write code, run tests, fix bugs, and manage git workflows autonomously.
  • GitHub Copilot Workspace: Plans, implements, and tests code changes based on issue descriptions.
  • Cursor / Windsurf: IDE-integrated agents that can edit multiple files and run commands.

Data Analysis

  • Automated data pipelines: Agents that fetch data from APIs, clean it, run statistical analyses, generate visualizations, and produce reports.
  • Anomaly detection agents: Continuously monitor data streams and alert humans when unusual patterns emerge.

Customer Service

  • Tier-2 support agents: Handle complex customer issues by looking up account information, checking order status, processing refunds, and escalating when necessary.
  • Onboarding agents: Guide new users through product setup, configuring settings and troubleshooting issues in real time.

Research and Knowledge Work

  • Literature review agents: Search academic databases, read papers, extract key findings, and compile annotated bibliographies.
  • Legal document analysis: Review contracts, flag risky clauses, and suggest amendments.

DevOps and Infrastructure

  • Incident response agents: Detect system failures, gather logs, diagnose root causes, and apply fixes or roll back deployments.
  • Infrastructure management: Monitor cloud resources, scale services up or down, and optimize costs.

When to Use Agents vs Simple API Calls

Not every task needs an agent. Here is a practical decision framework:

Use a simple API call when:

  • The task has a single, well-defined step (e.g., "translate this sentence")
  • You do not need tool use — the LLM's knowledge is sufficient
  • Latency matters — you need a response in under a second
  • The task is deterministic — there is one right answer
  • You can fully specify the input and expected output format

Use an agent when:

  • The task requires multiple steps that depend on each other
  • You need tool use — searching the web, running code, reading files
  • The task is open-ended — you cannot predict all the steps in advance
  • Error recovery is important — the system should retry and adapt
  • The task involves exploration — the agent needs to discover information
  • You need autonomous operation — the system should work without constant human input

The decision matrix

Scenario | Approach
Summarize a paragraph | Simple API call
Translate a document | Simple API call
Research a topic and write a report | Agent
Fix a bug in a codebase | Agent
Generate JSON from a template | Simple API call
Build a feature based on a spec | Agent
Answer a factual question | Simple API call
Monitor a system and respond to incidents | Agent
Classify an email as spam or not | Simple API call
Process a multi-step refund workflow | Agent

Limitations and Risks of Autonomous Agents

AI agents are powerful, but they come with real risks that you must understand before deploying them.

1. Hallucination and Confabulation

Agents can "make up" information, tool calls, or results. An agent might:

  • Claim it searched for something when it did not
  • Fabricate data that looks plausible but is incorrect
  • Misinterpret tool outputs and draw wrong conclusions

Mitigation: Always validate agent outputs. Use structured tool calls with strict schemas. Implement output verification checks.
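The strict-schema check can be as simple as rejecting any tool call that names an unknown tool or passes unexpected parameters. The schema table below is illustrative.

```python
TOOL_SCHEMAS = {
    "web_search": {"query"},          # allowed parameter names per tool
}

def validate_tool_call(call):
    tool = call.get("tool")
    if tool not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {tool!r}")
    extra = set(call.get("parameters", {})) - TOOL_SCHEMAS[tool]
    if extra:
        raise ValueError(f"unexpected parameters: {sorted(extra)}")
    return True

ok = validate_tool_call({"tool": "web_search", "parameters": {"query": "Tokyo"}})
```

Rejecting malformed calls outright, rather than guessing at intent, also surfaces hallucinated tool calls early instead of letting them propagate.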

2. Runaway Execution

An agent in a loop can consume unlimited resources:

  • Making hundreds of API calls
  • Running expensive computations
  • Creating or deleting files without restraint

Mitigation: Set maximum iteration limits. Implement cost budgets. Require human approval for destructive actions.
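Both caps can live in one small guard object charged on every loop step. The limits and per-step cost below are illustrative numbers, not recommendations.

```python
class BudgetExceeded(Exception):
    pass

class AgentBudget:
    def __init__(self, max_steps=25, max_cost_usd=1.00):
        self.max_steps, self.max_cost = max_steps, max_cost_usd
        self.steps, self.cost = 0, 0.0

    def charge(self, step_cost_usd):
        # Called once per loop iteration, before the step runs.
        self.steps += 1
        self.cost += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded("iteration limit reached")
        if self.cost > self.max_cost:
            raise BudgetExceeded("cost budget exhausted")

budget = AgentBudget(max_steps=3, max_cost_usd=0.10)
stopped_at = 0
try:
    for i in range(100):       # a loop that would otherwise run 100 times
        budget.charge(0.01)
        stopped_at = i + 1
except BudgetExceeded:
    pass
```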

3. Security Risks

Agents that can execute code or access external systems are potential attack vectors:

  • Prompt injection: A malicious document could trick the agent into executing harmful commands
  • Data exfiltration: An agent with internet access could be tricked into sending sensitive data to an attacker
  • Privilege escalation: An agent might access resources beyond its intended scope

Mitigation: Run agents in sandboxed environments. Apply the principle of least privilege. Never give agents access to production databases without safeguards.

4. Unpredictable Behavior

Because agents make autonomous decisions, their behavior is harder to predict than a simple API call:

  • They may choose unexpected approaches to solve a problem
  • They may get stuck in loops, retrying the same failed approach
  • They may misunderstand the goal and pursue the wrong objective

Mitigation: Implement comprehensive logging. Use guardrails and safety checks. Start with limited autonomy and expand gradually.

5. Cost

Agents are expensive. Each iteration of the observe-think-act loop costs tokens. A complex agent task might require:

  • 10-50 LLM calls
  • Multiple tool executions
  • Extended thinking time

Mitigation: Set token budgets. Cache intermediate results. Use cheaper models for simple sub-tasks and more capable models for complex reasoning steps.
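Caching intermediate results can be as lightweight as memoizing the tool layer, so the agent never pays for the same lookup twice. `expensive_lookup` is a stub; the call counter exists only to show the cache working.

```python
from functools import lru_cache

call_count = {"n": 0}

@lru_cache(maxsize=256)
def expensive_lookup(query):
    call_count["n"] += 1                # real work happens only on a cache miss
    return f"result for {query!r}"

first = expensive_lookup("population of Tokyo")
second = expensive_lookup("population of Tokyo")   # served from cache
```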

6. Latency

Agents are slow compared to single API calls. Each step involves:

  • An LLM inference call (seconds)
  • Tool execution (variable, could be seconds to minutes)
  • Result processing (milliseconds)

A 10-step agent task might take 30 seconds to 5 minutes.

Mitigation: Use streaming to show progress. Parallelize independent steps when possible. Set user expectations about response times.


Key Takeaways

  1. An agent is not just a chatbot. It is an autonomous system that can observe, think, act, and repeat until a goal is achieved.

  2. Tool use is the key differentiator. Without tools, an LLM can only generate text. With tools, it can interact with the real world.

  3. The agent loop (observe-think-act-repeat) is universal. Every agent architecture is built on this fundamental cycle.

  4. Choose the right level of complexity. Single-step agents for simple tasks, multi-step for complex workflows, multi-agent for large-scale systems.

  5. ReAct and Plan-and-Execute are the two primary architectures. ReAct is simpler and works for most tasks. Plan-and-Execute is better for long-horizon, complex tasks.

  6. Not everything needs an agent. Use simple API calls when the task is well-defined and single-step. Use agents when the task is open-ended, multi-step, or requires tool use.

  7. Agents come with real risks. Hallucination, runaway execution, security vulnerabilities, unpredictability, cost, and latency are all concerns you must address.

  8. Start simple, add complexity gradually. Begin with a single-step agent, add tools one at a time, and expand autonomy as you build confidence in the system.


What is Next?

In the next lesson, we will move from theory to practice. You will learn how to build your first AI agent using Claude's tool use capabilities — starting with a simple single-step agent and progressively adding more tools and autonomy.