LLM Agents & Tools

What's in this lesson: Explore how large language models evolve from passive text generators into active agents capable of planning, using tools, and executing complex tasks.
Why this matters: Mastering agents is the key to building AI systems that can actually *do* things in the real world.

Attention Activity: The Illusion of Action

What happens when you ask an AI to interact with the real world, like booking a flight? Let's observe the difference between passive and active AI.

Passive vs Active AI

Passive LLM

Agent LLM

The passive model only generates words. The Agent plans a sequence of steps, executes an external tool, and observes the result.

From Generators to Agents

Traditional Large Language Models (LLMs) take an input prompt and predict the next word. They are closed boxes, relying entirely on static training data.

LLM Agents break out of the box. By giving the LLM access to external tools, it evolves into an active participant. The model becomes the "cognitive engine" that routes logic and data.

Generator to Agent Diagram

The Execution Loop (ReAct)

How do agents actually "think"? They rely on cognitive architectures, the most famous being the ReAct (Reason + Act) loop. The agent iterates through a cycle until its overarching goal is met.

1. Reason: The LLM analyzes the goal and decides on a plan.
2. Act: The agent issues a call to an external tool (like searching the web).
3. Observe: The LLM reads the result from the tool and feeds it back into the reasoning step.
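The three steps above can be sketched as a loop. This is a minimal toy, with the LLM's reasoning step stubbed out as a hard-coded policy and a made-up `search_web` tool; a real agent would replace `reason()` with a model call.

```python
def search_web(query: str) -> str:
    """Hypothetical tool: pretend to search the web."""
    return f"Top result for '{query}': ReAct interleaves reasoning and acting."

TOOLS = {"search_web": search_web}

def reason(goal: str, scratchpad: list[str]):
    """Stub for the LLM's reasoning step: decide the next action."""
    if not scratchpad:                    # nothing observed yet -> go act
        return ("search_web", goal)
    return ("finish", scratchpad[-1])     # we have an observation -> answer

def react_loop(goal: str, max_steps: int = 5) -> str:
    scratchpad: list[str] = []            # short-term memory for this task
    for _ in range(max_steps):
        action, arg = reason(goal, scratchpad)   # 1. Reason
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)         # 2. Act
        scratchpad.append(observation)           # 3. Observe
    return "Gave up after max_steps."

print(react_loop("What is ReAct?"))
```

The loop terminates either when the reasoning step decides the goal is met or when a step budget runs out, which is a common safeguard against agents that never converge.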

Knowledge Check

In the ReAct execution loop, what does the agent do during the "Observe" phase?

Function & Tool Calling Mechanics

How does a text-generation model actually "use" a tool? Through a mechanism called Function Calling.

Developers provide the LLM with a list of available tools and their required data formats (schemas). When the agent decides to act, it doesn't run code itself; it generates a highly structured text object (usually JSON).

"What's the weather in Paris?"
{ "function_name": "get_weather", "arguments": { "location": "Paris, FR", "unit": "celsius" } }

The surrounding software architecture catches this JSON, executes the actual API request or database query, and hands the raw data back to the LLM as an observation.
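A sketch of that surrounding harness, assuming the `get_weather` tool from the example above. The "model output" here is a canned string standing in for what an LLM would generate after seeing the tool schema:

```python
import json

def get_weather(location: str, unit: str = "celsius") -> dict:
    """Hypothetical tool implementation (a real one would call a weather API)."""
    return {"location": location, "temp": 18, "unit": unit}

REGISTRY = {"get_weather": get_weather}

# Stand-in for the LLM's structured output.
model_output = '{"function_name": "get_weather", "arguments": {"location": "Paris, FR", "unit": "celsius"}}'

call = json.loads(model_output)                                 # parse the structured call
result = REGISTRY[call["function_name"]](**call["arguments"])   # the harness runs the code
observation = json.dumps(result)                                # raw data goes back to the LLM
print(observation)
```

Note the division of labor: the model only emits JSON, and the registry lookup plus `**arguments` unpacking is what actually executes the tool.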

Task Decomposition

For complex goals, agents perform Task Decomposition. They break a massive problem down into smaller, sequential steps, handling one micro-task at a time.

Task Decomposition
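A minimal sketch of decomposition, with the planner stubbed out as a fixed breakdown (a real agent would ask the LLM to produce the plan):

```python
def plan(goal: str) -> list[str]:
    """Stub planner: returns a fixed breakdown for illustration."""
    return [f"research: {goal}", f"draft: {goal}", f"review: {goal}"]

def execute(subtask: str) -> str:
    """Stub executor standing in for a tool call or sub-agent."""
    return f"done({subtask})"

def run(goal: str) -> list[str]:
    results = []
    for subtask in plan(goal):   # handle one micro-task at a time, in order
        results.append(execute(subtask))
    return results

for r in run("write a blog post"):
    print(r)
```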

State Management & Memory

Agents need memory to function over multiple steps. This is divided into Short-term Memory (the scratchpad of the current task) and Long-term Memory (a database of past interactions and facts).

State management allows an agent to pause, wait for a slow tool or user approval, and resume exactly where it left off without starting over.

Agent Memory Diagram
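The pause/resume behavior falls out naturally if the agent's state is plain data that can be serialized. A sketch, with illustrative field names:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class AgentState:
    goal: str
    scratchpad: list = field(default_factory=list)   # short-term memory
    long_term: dict = field(default_factory=dict)    # persistent facts

    def save(self) -> str:
        """Pause: serialize everything, e.g. while awaiting user approval."""
        return json.dumps(asdict(self))

    @classmethod
    def load(cls, blob: str) -> "AgentState":
        """Resume exactly where the agent left off."""
        return cls(**json.loads(blob))

state = AgentState(goal="book a flight")
state.scratchpad.append("found 3 options")
blob = state.save()                  # store in a file or database
resumed = AgentState.load(blob)
print(resumed.scratchpad)
```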

Multi-Agent Orchestration

One agent cannot do everything perfectly. Multi-Agent Orchestration involves a network of specialized agents.

A "Supervisor" agent breaks down the task and routes pieces to specialized "Worker" agents (e.g., a Coder agent, a Researcher agent, a Reviewer agent). They collaborate, critique each other, and merge their outputs.

Multi-Agent Orchestration
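A toy supervisor/worker sketch. Each "agent" here is a plain function standing in for a specialized LLM, and the routing decision is hard-coded; a real supervisor would ask a model which workers to dispatch:

```python
def coder(task: str) -> str:
    return f"[code for: {task}]"

def researcher(task: str) -> str:
    return f"[notes on: {task}]"

def reviewer(draft: str) -> str:
    return f"[approved: {draft}]"

WORKERS = {"code": coder, "research": researcher}

def supervisor(goal: str) -> str:
    # Stubbed routing decision; a real supervisor would plan this via an LLM.
    subtasks = [("research", goal), ("code", goal)]
    outputs = [WORKERS[kind](task) for kind, task in subtasks]
    merged = " + ".join(outputs)
    return reviewer(merged)          # a third agent critiques the merged result

print(supervisor("parse CSV files"))
```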

Handling Failure (Self-Correction)

The real world is messy. Tools fail, APIs time out, and queries return empty results. Self-Correction is a hallmark of advanced agents.

When an agent observes an error from a tool, it uses its reasoning engine to analyze the failure, adjust its parameters, and try a different approach instead of crashing.

Self-Correction Loop
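A sketch of the retry pattern, with a made-up flaky tool and a stubbed "fix" policy (a real agent would reason about the error message to choose the adjustment):

```python
def flaky_search(query: str) -> str:
    """Hypothetical tool: rejects overly broad one-word queries."""
    if len(query.split()) < 2:
        raise ValueError("query too broad")
    return f"results for '{query}'"

def self_correct(query: str, max_retries: int = 3) -> str:
    for _ in range(max_retries):
        try:
            return flaky_search(query)     # Act
        except ValueError:                 # Observe the failure
            # Reason about the error and adjust parameters (stub policy:
            # narrow the query instead of crashing).
            query = query + " site:docs"
    return "failed after retries"

print(self_correct("agents"))
```

The key difference from naive retry logic is that the agent changes something between attempts rather than repeating the identical failing call.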

Knowledge Check

Which concept allows an agent to recover from API failures or empty tool results?

Assessment Starts Now

You are about to begin the final assessment. You will be tested on the concepts of ReAct loops, function calling, state management, and multi-agent systems.

Ensure you are ready. You must score at least 80% to earn your certificate.

Assessment

1. What is the primary difference between a traditional LLM and an Agent LLM?

2. In the ReAct architecture, which phase involves interpreting the output of a tool?

3. What is the purpose of a "Supervisor" in a multi-agent orchestration setup?

4. How does an LLM typically interface with an external API?

Assessment Complete