Beginner · 12 min read · Module 9, Lesson 1

🤖 What Are AI Agents?

Understand the difference between chatbots and autonomous agents


You have probably used a chatbot before — you type a question, it gives you an answer, and the conversation is done. An AI agent is something fundamentally different. It is an AI system that can observe its environment, make decisions, take actions, and repeat until a goal is achieved — all with minimal human intervention.

Think of it this way: a chatbot is like a librarian who answers your question. An agent is like a research assistant who goes out, finds papers, reads them, takes notes, cross-references sources, and delivers a finished report back to you.


Chatbot vs Agent: A Clear Comparison

Understanding the distinction between chatbots and agents is critical because it changes how you design, build, and deploy AI systems.

Feature | Chatbot | AI Agent
Interaction model | Single turn: question in, answer out | Multi-step: works autonomously toward a goal
Memory | Limited to conversation context | Maintains state across steps and sessions
Tool use | None or minimal | Uses tools (search, code execution, APIs, file I/O)
Decision-making | Responds to explicit instructions | Decides what to do next on its own
Error handling | Returns an error message | Detects errors, retries, and adapts its approach
Scope | Answers one question at a time | Solves complex, multi-step problems
Autonomy | None — waits for user input | High — can plan and execute without human input
Feedback loop | No self-correction | Observes results and adjusts behavior

A chatbot is reactive — it waits for you. An agent is proactive — it pursues a goal.


The Agent Loop: Observe, Think, Act, Repeat

Every AI agent follows a fundamental loop, regardless of how complex it is. This loop is the heartbeat of agentic behavior:

Step 1: Observe

The agent gathers information from its environment. This could mean:

  • Reading a user's request
  • Checking the output of a previous action
  • Inspecting a file, database, or API response
  • Reviewing error messages from a failed step

Step 2: Think

The agent reasons about what it has observed. It asks itself:

  • What is my goal?
  • What do I know so far?
  • What should I do next?
  • Have I encountered an error I need to handle?

This "thinking" step is powered by the LLM's reasoning capabilities. In advanced agents, this step may involve explicit chain-of-thought or extended thinking.

Step 3: Act

The agent takes an action. This is where tool use comes in:

  • Call a search API to find information
  • Execute a code snippet to process data
  • Write to a file or database
  • Send an API request to an external service
  • Ask the user for clarification (when truly needed)

Step 4: Repeat

After acting, the agent returns to Step 1. It observes the result of its action and decides whether it has achieved its goal or needs to take another step.

+-----------+
|  Observe  |<-------------------+
+-----+-----+                    |
      |                          |
      v                          |
+-----------+                    |
|   Think   |                    |
+-----+-----+                    |
      |                          |
      v                          |
+-----------+     +------------+ |
|    Act    +---->|  Evaluate  +-+  (goal not met)
+-----------+     +-----+------+
                        |
                   Goal met? ---> Done

This loop is what gives agents their power. A chatbot stops after one response. An agent keeps going until the job is done.
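The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a real implementation: `llm_decide` and `run_tool` are stubs standing in for an actual model call and an actual search tool, and only the loop structure is the point.

```python
def llm_decide(observation):
    # Stubbed "think" step: a real agent would call an LLM here.
    if "ANSWER" in observation:
        return "done"
    return "search"

def run_tool(tool, query):
    # Stubbed "act" step: pretend the search tool succeeded.
    return f"ANSWER: stub result for {query!r}"

def agent_loop(goal, max_steps=10):
    observation = f"user request: {goal}"      # Observe
    for _ in range(max_steps):
        action = llm_decide(observation)       # Think
        if action == "done":
            return observation                 # Goal met: stop
        observation = run_tool(action, goal)   # Act; the result becomes the next observation
    raise RuntimeError("step budget exhausted")

result = agent_loop("current population of Tokyo")
```

Note the `max_steps` cap: even in a toy loop, an unconditional `while True` is how agents run away (more on that in the risks section).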


Tool Use: The Key Differentiator

The single most important capability that separates an agent from a chatbot is tool use. Without tools, an LLM can only generate text. With tools, it can interact with the real world.

What are tools?

Tools are functions or APIs that the agent can call during its reasoning loop. They extend the agent's capabilities beyond text generation.

Common tool categories

Information retrieval:

  • Web search (Google, Bing)
  • Database queries (SQL, vector search)
  • File reading (read a document, parse a CSV)
  • API calls (fetch weather, stock prices, user data)

Code execution:

  • Run Python, JavaScript, or bash scripts
  • Install packages and dependencies
  • Execute tests and check results

File manipulation:

  • Create, edit, and delete files
  • Navigate directory structures
  • Manage version control (git)

Communication:

  • Send emails or messages
  • Create pull requests
  • Post to Slack or other platforms

How tool use works in practice

When an agent decides to use a tool, the flow looks like this:

  1. The LLM generates a tool call — a structured request specifying which tool to use and with what parameters
  2. The runtime environment executes the tool call
  3. The tool returns a result to the agent
  4. The agent observes the result and continues reasoning
Example tool call (JSON):

{
  "tool": "web_search",
  "parameters": {
    "query": "current population of Tokyo 2026"
  }
}

The agent receives the search results, extracts the relevant information, and decides its next step. This tool-use loop is what makes agents genuinely useful for real-world tasks.
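The runtime side of steps 1-4 can be sketched as a small dispatcher: parse the model's structured tool call, look up the named tool, execute it, and serialize the result back. The `web_search` function here is a stub, not a real search API.

```python
import json

def web_search(query):
    # Stub tool: a real implementation would call a search API.
    return [f"stub result for {query!r}"]

TOOLS = {"web_search": web_search}

def execute_tool_call(raw_call):
    call = json.loads(raw_call)             # 1. the model emitted this JSON
    tool_fn = TOOLS[call["tool"]]           # 2. the runtime looks up the tool
    result = tool_fn(**call["parameters"])  # 3. the tool runs and returns a result
    return json.dumps({"result": result})   # 4. the result goes back to the agent

reply = execute_tool_call(
    '{"tool": "web_search", "parameters": {"query": "current population of Tokyo 2026"}}'
)
```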


Types of AI Agents

Not all agents are the same. They vary in complexity, autonomy, and architecture.

1. Single-Step Agents

The simplest form of agent. It receives a task, uses one tool, and returns a result.

Example: A summarization agent that:

  1. Receives a URL
  2. Fetches the web page content (tool use)
  3. Summarizes it and returns the summary

When to use: Simple, well-defined tasks where one action is sufficient.

2. Multi-Step Agents

These agents chain multiple actions together to solve a more complex problem. They follow the observe-think-act loop for several iterations.

Example: A research agent that:

  1. Receives a topic
  2. Searches the web for relevant articles (tool use)
  3. Reads the top 5 articles (tool use)
  4. Cross-references facts across sources (reasoning)
  5. Writes a structured report (generation)
  6. Saves the report to a file (tool use)

When to use: Tasks that require gathering information from multiple sources, processing data, or completing multi-step workflows.

3. Multi-Agent Systems

Multiple specialized agents work together, each handling a different aspect of a larger task. A coordinator agent delegates subtasks to worker agents.

Example: A software development system with:

  • Planner agent — breaks down a feature request into tasks
  • Coder agent — writes the code for each task
  • Reviewer agent — reviews the code for bugs and style
  • Tester agent — writes and runs tests
  • Deployer agent — handles deployment

When to use: Large-scale tasks that benefit from specialization, tasks requiring different expertise areas, or systems that need parallel processing.


Agent Architectures

As the field has matured, several standard architectures have emerged for building agents.

ReAct (Reasoning + Acting)

ReAct is the most widely used agent architecture. The agent alternates between reasoning (thinking out loud about what to do) and acting (using tools).

How it works:

  1. Thought: "I need to find the current stock price of Apple."
  2. Action: Call the stock_price tool with symbol "AAPL"
  3. Observation: The tool returns $198.50
  4. Thought: "Now I need to compare this to the price 30 days ago."
  5. Action: Call the historical_price tool with symbol "AAPL" and period "30d"
  6. Observation: The tool returns $185.20
  7. Thought: "Apple stock is up 7.2% over 30 days. I have enough information to answer."
  8. Final Answer: "Apple (AAPL) is currently trading at $198.50, up 7.2% from $185.20 thirty days ago."
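One ReAct turn can be sketched as: parse the "Action:" line out of the model's text, execute the named tool, and append the observation to the transcript. The `stock_price` tool, its return value, and the scripted model output are all stubs for illustration.

```python
import re

def stock_price(symbol):
    # Stub quote; a real tool would hit a market-data API.
    return "198.50"

TOOLS = {"stock_price": stock_price}

def react_step(model_output, transcript):
    transcript.append(model_output)
    match = re.search(r"Action: (\w+)\[(\w+)\]", model_output)
    if match is None:
        return transcript                  # a "Final Answer" turn: nothing to execute
    tool_name, arg = match.groups()
    observation = TOOLS[tool_name](arg)    # run the requested tool
    transcript.append(f"Observation: {observation}")
    return transcript

trace = react_step(
    "Thought: I need Apple's current price.\nAction: stock_price[AAPL]",
    [],
)
```

In a full agent, the transcript (thoughts, actions, observations) is fed back into the model for the next turn, which is exactly what makes the reasoning trace so easy to debug.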

Strengths:

  • Simple to implement
  • Easy to debug (you can read the reasoning trace)
  • Works well for most use cases

Weaknesses:

  • Can get stuck in loops
  • May not plan far enough ahead for complex tasks
  • Each step depends on the previous one (sequential)

Plan-and-Execute

This architecture separates planning from execution. The agent first creates a complete plan, then executes each step.

How it works:

  1. Planning phase: Given the task "Build a weather dashboard," create a plan:

    • Step 1: Research weather APIs
    • Step 2: Choose an API and get an API key
    • Step 3: Design the UI layout
    • Step 4: Write the HTML/CSS
    • Step 5: Write the JavaScript to fetch weather data
    • Step 6: Test the application
    • Step 7: Deploy to a hosting service
  2. Execution phase: Execute each step sequentially, potentially re-planning if something goes wrong.
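The two phases reduce to a simple skeleton. The planner and executor below are stubs in place of real LLM calls, and this sketch omits re-planning for brevity.

```python
def plan(task):
    # Planner stub: a real agent would ask the model for this list.
    return ["research weather APIs", "design the UI", "write the code", "test"]

def execute(step):
    # Executor stub: pretend every step succeeds.
    return f"done: {step}"

def plan_and_execute(task):
    steps = plan(task)                        # planning phase: one up-front plan
    return [execute(step) for step in steps]  # execution phase: run steps in order

results = plan_and_execute("Build a weather dashboard")
```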

Strengths:

  • Better for complex, long-horizon tasks
  • Creates a clear roadmap before acting
  • Can re-plan when encountering obstacles

Weaknesses:

  • Initial plan may be wrong or incomplete
  • Re-planning adds latency
  • More complex to implement

Other Notable Architectures

Reflexion: The agent reflects on its own performance after completing a task and uses that reflection to improve future attempts.

LATS (Language Agent Tree Search): The agent explores multiple possible action paths in a tree structure, evaluating each one before committing to the best path.

Toolformer-style: The agent learns when and how to use tools from training data, rather than being explicitly programmed with tool definitions.


Real-World Examples of AI Agents

AI agents are already being used across industries. Here are concrete examples:

Software Development

  • Claude Code (Anthropic): An agentic coding assistant that can read your codebase, write code, run tests, fix bugs, and manage git workflows autonomously.
  • GitHub Copilot Workspace: Plans, implements, and tests code changes based on issue descriptions.
  • Cursor / Windsurf: IDE-integrated agents that can edit multiple files and run commands.

Data Analysis

  • Automated data pipelines: Agents that fetch data from APIs, clean it, run statistical analyses, generate visualizations, and produce reports.
  • Anomaly detection agents: Continuously monitor data streams and alert humans when unusual patterns emerge.

Customer Service

  • Tier-2 support agents: Handle complex customer issues by looking up account information, checking order status, processing refunds, and escalating when necessary.
  • Onboarding agents: Guide new users through product setup, configuring settings and troubleshooting issues in real time.

Research and Knowledge Work

  • Literature review agents: Search academic databases, read papers, extract key findings, and compile annotated bibliographies.
  • Legal document analysis: Review contracts, flag risky clauses, and suggest amendments.

DevOps and Infrastructure

  • Incident response agents: Detect system failures, gather logs, diagnose root causes, and apply fixes or roll back deployments.
  • Infrastructure management: Monitor cloud resources, scale services up or down, and optimize costs.

When to Use Agents vs Simple API Calls

Not every task needs an agent. Here is a practical decision framework:

Use a simple API call when:

  • The task has a single, well-defined step (e.g., "translate this sentence")
  • You do not need tool use — the LLM's knowledge is sufficient
  • Latency matters — you need a response in under a second
  • The task is deterministic — there is one right answer
  • You can fully specify the input and expected output format

Use an agent when:

  • The task requires multiple steps that depend on each other
  • You need tool use — searching the web, running code, reading files
  • The task is open-ended — you cannot predict all the steps in advance
  • Error recovery is important — the system should retry and adapt
  • The task involves exploration — the agent needs to discover information
  • You need autonomous operation — the system should work without constant human input

The decision matrix

Scenario | Approach
Summarize a paragraph | Simple API call
Translate a document | Simple API call
Research a topic and write a report | Agent
Fix a bug in a codebase | Agent
Generate JSON from a template | Simple API call
Build a feature based on a spec | Agent
Answer a factual question | Simple API call
Monitor a system and respond to incidents | Agent
Classify an email as spam or not | Simple API call
Process a multi-step refund workflow | Agent

Limitations and Risks of Autonomous Agents

AI agents are powerful, but they come with real risks that you must understand before deploying them.

1. Hallucination and Confabulation

Agents can "make up" information, tool calls, or results. An agent might:

  • Claim it searched for something when it did not
  • Fabricate data that looks plausible but is incorrect
  • Misinterpret tool outputs and draw wrong conclusions

Mitigation: Always validate agent outputs. Use structured tool calls with strict schemas. Implement output verification checks.
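The strict-schema check can be as simple as rejecting any tool call that names an unknown tool or passes unexpected parameters. The schema table below is illustrative.

```python
TOOL_SCHEMAS = {
    "web_search": {"query"},          # allowed parameter names per tool
}

def validate_tool_call(call):
    tool = call.get("tool")
    if tool not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {tool!r}")
    extra = set(call.get("parameters", {})) - TOOL_SCHEMAS[tool]
    if extra:
        raise ValueError(f"unexpected parameters: {sorted(extra)}")
    return True

ok = validate_tool_call({"tool": "web_search", "parameters": {"query": "Tokyo"}})
```

Rejecting malformed calls outright, rather than guessing at intent, also surfaces hallucinated tool calls early instead of letting them propagate.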

2. Runaway Execution

An agent in a loop can consume unlimited resources:

  • Making hundreds of API calls
  • Running expensive computations
  • Creating or deleting files without restraint

Mitigation: Set maximum iteration limits. Implement cost budgets. Require human approval for destructive actions.
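Both caps can live in one small guard object charged on every loop step. The limits and per-step cost below are illustrative numbers, not recommendations.

```python
class BudgetExceeded(Exception):
    pass

class AgentBudget:
    def __init__(self, max_steps=25, max_cost_usd=1.00):
        self.max_steps, self.max_cost = max_steps, max_cost_usd
        self.steps, self.cost = 0, 0.0

    def charge(self, step_cost_usd):
        # Called once per loop iteration, before the step runs.
        self.steps += 1
        self.cost += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded("iteration limit reached")
        if self.cost > self.max_cost:
            raise BudgetExceeded("cost budget exhausted")

budget = AgentBudget(max_steps=3, max_cost_usd=0.10)
stopped_at = 0
try:
    for i in range(100):       # a loop that would otherwise run 100 times
        budget.charge(0.01)
        stopped_at = i + 1
except BudgetExceeded:
    pass
```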

3. Security Risks

Agents that can execute code or access external systems are potential attack vectors:

  • Prompt injection: A malicious document could trick the agent into executing harmful commands
  • Data exfiltration: An agent with internet access could be tricked into sending sensitive data to an attacker
  • Privilege escalation: An agent might access resources beyond its intended scope

Mitigation: Run agents in sandboxed environments. Apply the principle of least privilege. Never give agents access to production databases without safeguards.

4. Unpredictable Behavior

Because agents make autonomous decisions, their behavior is harder to predict than a simple API call:

  • They may choose unexpected approaches to solve a problem
  • They may get stuck in loops, retrying the same failed approach
  • They may misunderstand the goal and pursue the wrong objective

Mitigation: Implement comprehensive logging. Use guardrails and safety checks. Start with limited autonomy and expand gradually.

5. Cost

Agents are expensive. Each iteration of the observe-think-act loop costs tokens. A complex agent task might require:

  • 10-50 LLM calls
  • Multiple tool executions
  • Extended thinking time

Mitigation: Set token budgets. Cache intermediate results. Use cheaper models for simple sub-tasks and more capable models for complex reasoning steps.
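Caching intermediate results can be as lightweight as memoizing the tool layer, so the agent never pays for the same lookup twice. `expensive_lookup` is a stub; the call counter exists only to show the cache working.

```python
from functools import lru_cache

call_count = {"n": 0}

@lru_cache(maxsize=256)
def expensive_lookup(query):
    call_count["n"] += 1                # real work happens only on a cache miss
    return f"result for {query!r}"

first = expensive_lookup("population of Tokyo")
second = expensive_lookup("population of Tokyo")   # served from cache
```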

6. Latency

Agents are slow compared to single API calls. Each step involves:

  • An LLM inference call (seconds)
  • Tool execution (variable, could be seconds to minutes)
  • Result processing (milliseconds)

A 10-step agent task might take 30 seconds to 5 minutes.

Mitigation: Use streaming to show progress. Parallelize independent steps when possible. Set user expectations about response times.


Key Takeaways

  1. An agent is not just a chatbot. It is an autonomous system that can observe, think, act, and repeat until a goal is achieved.

  2. Tool use is the key differentiator. Without tools, an LLM can only generate text. With tools, it can interact with the real world.

  3. The agent loop (observe-think-act-repeat) is universal. Every agent architecture is built on this fundamental cycle.

  4. Choose the right level of complexity. Single-step agents for simple tasks, multi-step for complex workflows, multi-agent for large-scale systems.

  5. ReAct and Plan-and-Execute are the two primary architectures. ReAct is simpler and works for most tasks. Plan-and-Execute is better for long-horizon, complex tasks.

  6. Not everything needs an agent. Use simple API calls when the task is well-defined and single-step. Use agents when the task is open-ended, multi-step, or requires tool use.

  7. Agents come with real risks. Hallucination, runaway execution, security vulnerabilities, unpredictability, cost, and latency are all concerns you must address.

  8. Start simple, add complexity gradually. Begin with a single-step agent, add tools one at a time, and expand autonomy as you build confidence in the system.


What is Next?

In the next lesson, we will move from theory to practice. You will learn how to build your first AI agent using Claude's tool use capabilities — starting with a simple single-step agent and progressively adding more tools and autonomy.