%%capture
%pip install -U anthropic python-dotenv
What’s the best way to get started building AI agent systems? There are countless frameworks for building AI agents, such as CrewAI, LangGraph, and the OpenAI Agents SDK, and it can be overwhelming to choose one. Anthropic, however, recommends starting with direct LLM API calls to understand the fundamentals before relying on framework abstractions.
This tutorial takes that approach: it explores how to implement an AI agent from scratch in Python, calling an LLM API directly, to give you a better understanding of what’s happening under the hood. The tutorial focuses on implementing a single agent before advancing to more complex topics, such as agentic workflows or multi-agent systems.
Implementing an AI Agent from scratch
This section implements an Agent class by incorporating each of the following core components of an AI agent step by step:
- LLM and instructions: The LLM powering the agent’s reasoning and decision-making capabilities with explicit guidelines defining how the agent should behave.
- Memory: Conversation history (short-term memory) the agent uses to understand the current interaction.
- Tools: External functions or APIs the agent can call.
And finally, we will put everything together in a loop.
Component 1: LLM and Instructions
At the core of every AI agent, you have a Large Language Model (LLM) with tool use capabilities, such as Anthropic’s Claude 4 Sonnet, OpenAI’s GPT-4o, or Google’s Gemini 2.5 Pro.
This tutorial uses Claude 4 Sonnet through the Anthropic API, but you can easily adapt the code to any other LLM API of your choice.
To use the Anthropic API, you will need an ANTHROPIC_API_KEY, which you can obtain by creating an Anthropic account and navigating to the “API Keys” tab in your dashboard. Once you have your API key, store it as an environment variable, in an .env file, or in the Google Colab secrets, depending on the environment you’re using.
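For example, a local .env file could look like this (the value shown is a placeholder, not a real key):
```
# .env
ANTHROPIC_API_KEY=your-api-key-here
```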
Let’s install and import the required libraries.
import os

import anthropic
from dotenv import load_dotenv

load_dotenv()  # loads ANTHROPIC_API_KEY from a local .env file, if present
# In Google Colab, read the key from the Colab secrets instead:
# from google.colab import userdata
# os.environ["ANTHROPIC_API_KEY"] = userdata.get("ANTHROPIC_API_KEY")

print(anthropic.__version__)
0.69.0
Now, we will implement a simple Agent class with the following components:
- Initialization: Sets up the LLM client and configures the model with a system prompt that contains instructions for the agent on how to act. (You could also turn this into a parameter you pass to the agent, but we will use a fixed one for simplicity.)
- chat method: Processes user messages by sending them to the LLM API and returning the response.
class Agent:
    """A simple AI agent that can answer questions"""

    def __init__(self):
        self.client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-20250514"
        self.system_message = "You are a helpful assistant that breaks down problems into steps and solves them systematically."

    def chat(self, message):
        """Process a user message and return a response"""
        response = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            system=self.system_message,
            messages=[
                {"role": "user", "content": message}
            ],
            temperature=0.1,
        )
        return response
The agent now has simple query-response capabilities. Let’s test it.
agent = Agent()

response = agent.chat("I have 4 apples. How many do you have?")
print(response.content[0].text)
I don't have any apples - as an AI, I don't have a physical form, so I can't possess physical objects like apples. Only you have apples in this scenario (4 of them).
Is there something you'd like to do with this information, like a math problem involving your apples?
Great. Let’s follow up with a second message.
response = agent.chat("I ate 1 apple. How many are left?")
print(response.content[0].text)
I don't have enough information to answer how many apples are left. To solve this, I would need to know:
**What I need:**
- How many apples you started with
**The calculation would be:**
Starting number of apples - 1 apple eaten = Apples remaining
Could you tell me how many apples you had before eating one?
As you can see, the agent lacks the information from the first message. That’s why we need to give the agent access to the conversation history.
Component 2: (Conversation) Memory
Memory in agents can take many different forms, such as short-term and long-term memory, and memory management can become a complex topic. For the sake of this tutorial, let’s keep it simple and start with a basic short-term memory implementation.
Short-term memory gives the agent access to the conversation history so it can understand the current interaction. In its simplest form, short-term memory is just a list of past messages between the user and the assistant. (Note that as the conversation history grows, you will run into context window limitations and will need a more sophisticated solution.)
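As a rough illustration of such a mitigation (a sketch only, not part of this tutorial’s implementation), you could truncate the history to the most recent messages:
```python
def trim_history(messages, max_messages=20):
    """Naive mitigation: keep only the most recent messages.

    A production agent would count tokens instead of messages and keep
    related blocks (e.g., tool_use/tool_result pairs) together.
    """
    return messages[-max_messages:]
```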
We implement short-term memory by adding a messages property where we store both:
- the user inputs with {"role": "user", "content": message}
- the responses with {"role": "assistant", "content": response.content}
class Agent:
    """A simple AI agent that can answer questions in a multi-turn conversation"""

    def __init__(self):
        self.client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-20250514"
        self.system_message = "You are a helpful assistant that breaks down problems into steps and solves them systematically."
        self.messages = []

    def chat(self, message):
        """Process a user message and return a response"""
        # Store user input in short-term memory
        self.messages.append({"role": "user", "content": message})

        response = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            system=self.system_message,
            messages=self.messages,
            temperature=0.1,
        )

        # Store assistant's response in short-term memory
        self.messages.append({"role": "assistant", "content": response.content})

        return response
Now, let’s test the agent again with the previous example conversation.
agent = Agent()

response = agent.chat("I have 4 apples. How many do you have?")
print(response.content[0].text)

response = agent.chat("I ate 1 apple. How many are left?")
print(response.content[0].text)
I don't have any apples - as an AI, I don't have a physical form and can't possess physical objects like apples. You have 4 apples, and I have 0 apples.
Is there something you'd like to do with your 4 apples, like a math problem or recipe suggestion?
Let me solve this step by step:
**Step 1:** Identify the starting amount
- You started with 4 apples
**Step 2:** Identify what was consumed
- You ate 1 apple
**Step 3:** Calculate the remaining amount
- Apples left = Starting amount - Apples eaten
- Apples left = 4 - 1 = 3
**Answer:** You have 3 apples left.
As you can see, the agent is now able to hold a conversation and to reference previous information.
But what happens if you task the agent with a slightly more complex math problem?
agent = Agent()

response = agent.chat("What is 157.09 * 493.89?")

print(response.content[0].text)
I'll solve this step by step using the standard multiplication algorithm.
157.09 × 493.89
First, let me multiply 157.09 by each digit of 493.89:
**Step 1:** 157.09 × 9 (ones place)
157.09 × 9 = 1,413.81
**Step 2:** 157.09 × 80 (tens place)
157.09 × 8 = 1,256.72
1,256.72 × 10 = 12,567.2
**Step 3:** 157.09 × 300 (hundreds place)
157.09 × 3 = 471.27
471.27 × 100 = 47,127
**Step 4:** 157.09 × 90,000 (ten-thousands place)
157.09 × 9 = 1,413.81
1,413.81 × 10,000 = 14,138,100
**Step 5:** 157.09 × 400,000 (hundred-thousands place)
157.09 × 4 = 628.36
628.36 × 100,000 = 62,836,000
**Step 6:** Add all partial products:
```
1,413.81
12,567.2
47,127
14,138,100
62,836,000
-----------
77,035,208.01
```
Therefore, **157.09 × 493.89 = 77,035.2081**
The agent’s answer sounds perfectly believable, but if you validate it, you can see that even powerful LLMs like Claude 4 Sonnet can still make arithmetic errors without tools.
157.09 * 493.89
77585.1801
Component 3: Tool Use
To extend the agent’s capabilities, you can provide it with tools, which can range from simple functions to external APIs. For this tutorial, we will implement a simple CalculatorTool class that can handle math problems.
The exact implementation of tool use differs across providers, but at its core it always requires two key components:
- Function implementation: This is the actual function that executes the tool’s logic, such as performing a calculation, or making an API call.
- Tool schema: A structured description of the tool. The tool description is important because it tells the LLM what the tool does, when to use it, and what parameters it takes.
This tutorial follows the Anthropic documentation on tool use. If you’re using a different LLM API, I recommend checking out your LLM provider’s documentation on tool use.
class CalculatorTool():
    """A tool for performing mathematical calculations"""

    def get_schema(self):
        return {
            "name": "calculator",
            "description": "Performs basic mathematical calculations, use also for simple additions",
            "input_schema": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression to evaluate (e.g., '2+2', '10*5')"
                    }
                },
                "required": ["expression"]
            }
        }

    def execute(self, expression):
        """
        Evaluate mathematical expressions.

        WARNING: This tutorial uses eval() for simplicity but it is not recommended for production use.

        Args:
            expression (str): The mathematical expression to evaluate

        Returns:
            float: The result of the evaluation
        """
        try:
            result = eval(expression)
            return {"result": result}
        except:
            return {"error": "Invalid mathematical expression"}
Note that in this tutorial, we are just implementing a single tool. In production code, you’d typically use an abstract base class to ensure a consistent interface across tools, as sketched below.
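A minimal sketch of that pattern might look like the following (the Tool base class is hypothetical and not used in the rest of this tutorial):
```python
from abc import ABC, abstractmethod

class Tool(ABC):
    """Abstract interface that every tool is expected to implement."""

    @abstractmethod
    def get_schema(self):
        """Return the JSON schema describing the tool to the LLM."""

    @abstractmethod
    def execute(self, **kwargs):
        """Run the tool and return a JSON-serializable result."""

# CalculatorTool would then be declared as `class CalculatorTool(Tool):`,
# and a tool missing get_schema() or execute() would fail at instantiation.
```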
Let’s test if the calculator function works.
calculator_tool = CalculatorTool()

calculator_tool.execute("157.09 * 493.89")
{'result': 77585.1801}
Now that we have a CalculatorTool, let’s add tool use capabilities to our agent in three steps:
- Add tools and tool_map attributes to store the available tools
- Add a private _get_tool_schemas() method to extract the tool schemas
- Pass the tool schemas to the messages.create() call in the chat method so the model can decide when to use a tool
class Agent:
    """A simple AI agent that can use tools to answer questions in a multi-turn conversation"""

    def __init__(self, tools):
        self.client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-20250514"
        self.system_message = "You are a helpful assistant that breaks down problems into steps and solves them systematically."
        self.messages = []
        self.tools = tools
        self.tool_map = {tool.get_schema()["name"]: tool for tool in tools}

    def _get_tool_schemas(self):
        """Get tool schemas for all registered tools"""
        return [tool.get_schema() for tool in self.tools]

    def chat(self, message):
        """Process a user message and return a response"""
        # Store user input in short-term memory
        self.messages.append({"role": "user", "content": message})

        response = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            system=self.system_message,
            tools=self._get_tool_schemas() if self.tools else None,
            messages=self.messages,
            temperature=0.1,
        )

        # Store assistant's response in short-term memory
        self.messages.append({"role": "assistant", "content": response.content})

        return response
Let’s give it a try.
calculator_tool = CalculatorTool()
agent = Agent(tools=[calculator_tool])

response = agent.chat("What is 157.09 * 493.89?")

for block in response:
    print(block)
('id', 'msg_01BzC2FerKEr8rC1wGfaMiNK')
('content', [TextBlock(citations=None, text="I'll calculate 157.09 * 493.89 for you.", type='text'), ToolUseBlock(id='toolu_017NhVhd5wYWdEw7fFRPHyXL', input={'expression': '157.09 * 493.89'}, name='calculator', type='tool_use')])
('model', 'claude-sonnet-4-20250514')
('role', 'assistant')
('stop_reason', 'tool_use')
('stop_sequence', None)
('type', 'message')
('usage', Usage(cache_creation=CacheCreation(ephemeral_1h_input_tokens=0, ephemeral_5m_input_tokens=0), cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=433, output_tokens=77, server_tool_use=None, service_tier='standard'))
As you can see in the response, the agent answers with “I’ll calculate 157.09 * 493.89 for you.” but instead of calculating the expression itself, it stops with stop_reason set to tool_use. This means the agent is waiting for the user to execute the tool and return the result. In other words, the agent has responded that it needs help executing the tool and is now waiting. This is where the final component, the agent loop, comes into play.
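Before automating this in a loop, here is a sketch of what that single handoff looks like when done manually (assuming the agent, calculator_tool, and response from the cell above):
```python
import json

# Find the tool_use block the agent returned and execute the requested tool
tool_use_block = next(b for b in response.content if b.type == "tool_use")
tool_result = calculator_tool.execute(**tool_use_block.input)

# Send the result back to the agent as a tool_result content block
follow_up = agent.chat([
    {
        "type": "tool_result",
        "tool_use_id": tool_use_block.id,
        "content": json.dumps(tool_result),
    }
])
print(follow_up.content[0].text)
```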
Component 4: Agent Loop
You might have already heard people say that “Agents are models using tools in a loop”. Without the loop, the agent can only handle a single turn and cannot carry a tool call through to a final answer.
I really like this pseudocode from Barry Zhang of Anthropic, showing that agents are just LLMs making decisions in a loop, observing results, and deciding what to do next.
env = Environment()
tools = Tools(env)
system_prompt = "Goals, constraints, and how to act"

while True:
    action = llm.run(system_prompt + env.state)
    env.state = tools.run(action)
For this simple agent implementation, that means we have the following flow:
- The user sends a message to the agent
- The agent decides it needs a tool and responds with a stop_reason of tool_use and a tool_use block containing the tool name and parameters. It is effectively saying “I’m pausing for you to execute this tool with these parameters.”
- The user executes the tool and sends the tool result back to the agent in a follow-up message
- The agent continues and gives the final response
import json

def run_agent(user_input, max_turns=10):
    calculator_tool = CalculatorTool()
    agent = Agent(tools=[calculator_tool])

    i = 0

    while i < max_turns:  # It's safer to use max_turns rather than while True
        i += 1
        print(f"\nIteration {i}:")
        print(f"User input: {user_input}")

        response = agent.chat(user_input)
        print(f"Agent output: {response.content[0].text}")

        # Handle tool use if present
        if response.stop_reason == "tool_use":
            # Process all tool uses in the response
            tool_results = []
            for content_block in response.content:
                if content_block.type == "tool_use":
                    tool_name = content_block.name
                    tool_input = content_block.input

                    print(f"Using tool {tool_name} with input {tool_input}")

                    # Execute the tool
                    tool = agent.tool_map[tool_name]
                    tool_result = tool.execute(**tool_input)

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": content_block.id,
                        "content": json.dumps(tool_result)
                    })
                    print(f"Tool result: {tool_result}")

            # Add tool results to conversation
            user_input = tool_results
        else:
            return response.content[0].text

    return
Testing the Implemented AI Agent
Let’s test the implemented AI agent with a few example test cases.
Test 1: General question (no tool use)
This test demonstrates the agent’s ability to answer a simple, general question that does not require the use of any external tools.
response = run_agent("I have 4 apples. How many do you have?")
Iteration 1:
User input: I have 4 apples. How many do you have?
Agent output: I don't have any apples since I'm an AI assistant - I don't have a physical form or possessions. But I can help you with calculations involving your 4 apples if you need!
Is there something specific you'd like to calculate or figure out with your 4 apples?
Test 2: Tool Use
This test demonstrates how the agent understands that it needs to use a tool to solve a specific task and uses the CalculatorTool to get the correct result.
response = run_agent("What is 157.09 * 493.89?")
Iteration 1:
User input: What is 157.09 * 493.89?
Agent output: I'll calculate 157.09 * 493.89 for you.
Using tool calculator with input {'expression': '157.09 * 493.89'}
Tool result: {'result': 77585.1801}
Iteration 2:
User input: [{'type': 'tool_result', 'tool_use_id': 'toolu_01FC9yLWt2Cf6a8zLGhj7ZJz', 'content': '{"result": 77585.1801}'}]
Agent output: The result of 157.09 * 493.89 is **77,585.1801**.
Test 3: Step-by-step tool use
This test demonstrates the agent’s ability to break down a more complex problem into smaller steps and use the CalculatorTool
multiple times within a single conversation to arrive at the final answer.
response = run_agent("If my brother is 32 years younger than my mother and my mother is 30 years older than me and I am 20, how old is my brother?")
Iteration 1:
User input: If my brother is 32 years younger than my mother and my mother is 30 years older than me and I am 20, how old is my brother?
Agent output: I'll solve this step by step using the given information.
Given:
- You are 20 years old
- Your mother is 30 years older than you
- Your brother is 32 years younger than your mother
Let me calculate your mother's age first:
Using tool calculator with input {'expression': '20 + 30'}
Tool result: {'result': 50}
Iteration 2:
User input: [{'type': 'tool_result', 'tool_use_id': 'toolu_01WPMQRzCi4roua9vQ7qXeCR', 'content': '{"result": 50}'}]
Agent output: So your mother is 50 years old.
Now I'll calculate your brother's age:
Using tool calculator with input {'expression': '50 - 32'}
Tool result: {'result': 18}
Iteration 3:
User input: [{'type': 'tool_result', 'tool_use_id': 'toolu_01UL7n7a85XJUn7Tgk8kiHhX', 'content': '{"result": 18}'}]
Agent output: Your brother is 18 years old.
To summarize:
- You: 20 years old
- Your mother: 50 years old (30 years older than you)
- Your brother: 18 years old (32 years younger than your mother)
Summary
This tutorial showed you how to implement a minimal AI agent from scratch using just an LLM API, without any frameworks. Hopefully, you now understand the fundamentals of what happens under the hood of an AI agent and what people mean when they say “Agents are models using tools in a loop”.
You can find this notebook in this GitHub repository.
As a next step, you can refer to the following resources to learn more about how to implement different agent workflows.