<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Leonie Monigatti</title>
<link>https://www.leoniemonigatti.com/blog.html</link>
<atom:link href="https://www.leoniemonigatti.com/blog.xml" rel="self" type="application/rss+xml"/>
<description>Leonie Monigatti&#39;s portfolio and blog about Machine Learning and AI Engineering.</description>
<generator>quarto-1.8.27</generator>
<lastBuildDate>Mon, 19 Jan 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>Agent Memory: Filesystem vs Database</title>
  <link>https://www.leoniemonigatti.com/blog/filesystem-vs-database-for-agent-memory.html</link>
  <description><![CDATA[ 





<p>I’m digesting the current “filesystem vs database” debate for agent memory. Currently I’m seeing 2 camps in how we build agent memory:</p>
<ul>
<li>On the one side, we have the “file interfaces are all you need” camp.</li>
<li>On the other side, we have the “filesystems are just bad databases” camp.</li>
</ul>
<section id="file-interfaces-are-all-you-need-camp" class="level2">
<h2 class="anchored" data-anchor-id="file-interfaces-are-all-you-need-camp">“File interfaces are all you need” camp</h2>
<p>Leaders like Anthropic, Letta, LangChain &amp; LlamaIndex are leaning towards file interfaces because “files are surprisingly effective as agent memory”.</p>
<ul>
<li><a href="../blog/claude-memory-tool.html">Anthropic’s memory tool</a> treats memory as a set of files (the storage implementation is left up to the developer)</li>
<li><a href="https://x.com/hwchase17/status/2011814697889316930">LangSmith’s agent builder</a> also represents memory as a set of files (the data is stored in a database and files are exposed to the agent as a filesystem)</li>
<li><a href="https://www.letta.com/blog/benchmarking-ai-agent-memory">Letta</a> found that simple filesystem tools like <code>grep</code> and <code>ls</code> outperformed specialized memory or retrieval tools in their benchmarks</li>
<li><a href="https://www.llamaindex.ai/blog/files-are-all-you-need">LlamaIndex</a> argues that for many use cases, a well-organized filesystem with semantic search might be all you need</li>
</ul>
<p>Agents are good at using filesystems because models are optimized for coding tasks (including CLI operations) during post-training.</p>
<p>That’s why we’re seeing a “virtual filesystem” pattern where the agent interface and the storage implementation are decoupled.</p>
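<p>To make the decoupling concrete, here is a minimal sketch of such a virtual filesystem. The class name <code>VirtualMemoryFS</code>, its methods, and the choice of SQLite are assumptions for illustration, not how any of the products above implement it. The agent sees file-like operations (<code>ls</code>, <code>read</code>, <code>write</code>, <code>grep</code>), while the data lives in a database table.</p>

```python
import re
import sqlite3


class VirtualMemoryFS:
    """File-like interface for an agent, backed by a SQLite table (hypothetical sketch)."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, content TEXT NOT NULL)"
        )

    def write(self, path, content):
        # Writing to an existing path overwrites it, like a regular file would.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO files (path, content) VALUES (?, ?)",
                (path, content),
            )

    def read(self, path):
        row = self.conn.execute(
            "SELECT content FROM files WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        return row[0]

    def ls(self, prefix=""):
        # Filter in Python to avoid dealing with SQL LIKE wildcards in paths.
        rows = self.conn.execute("SELECT path FROM files ORDER BY path")
        return [path for (path,) in rows if path.startswith(prefix)]

    def grep(self, pattern):
        # Return (path, line number, line) for every line matching the regex.
        regex = re.compile(pattern)
        hits = []
        rows = self.conn.execute("SELECT path, content FROM files ORDER BY path")
        for path, content in rows:
            for line_no, line in enumerate(content.splitlines(), start=1):
                if regex.search(line):
                    hits.append((path, line_no, line))
        return hits
```

<p>Because the agent only ever calls the four methods, the storage layer could later be swapped for a different database without changing the tool definitions the agent was given.</p>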
</section>
<section id="filesystems-are-just-bad-databases-camp" class="level2">
<h2 class="anchored" data-anchor-id="filesystems-are-just-bad-databases-camp">“Filesystems are just bad databases” camp</h2>
<p>But then you have voices like Dax from OpenCode who rightly points out that <a href="https://x.com/thdxr/status/2011638639831499041">“a filesystem is just the worst kind of database”</a>.</p>
<p><a href="https://x.com/swyx/status/2011984243430236608?s=20">swyx</a> and <a href="https://x.com/jeffreyhuber/status/2011953780053737961">colleagues in the database space</a> warn about accidentally reinventing databases while solving the agent memory problem. Avoid writing worse versions of:</p>
<ul>
<li>search indexes,</li>
<li>transaction logs,</li>
<li>and locking mechanisms.</li>
</ul>
</section>
<section id="trade-offs" class="level2">
<h2 class="anchored" data-anchor-id="trade-offs">Trade-offs</h2>
<p>It’s important to match the complexity of your system to the complexity of your problem.</p>
<section id="simplicity-vs-scale" class="level3">
<h3 class="anchored" data-anchor-id="simplicity-vs-scale">Simplicity vs scale</h3>
<p>Files are simple and CLI tools can even outperform specialized retrieval tools.</p>
<p>But these CLI tools don’t scale well &amp; can become a bottleneck.</p>
</section>
<section id="querying-and-aggregations" class="level3">
<h3 class="anchored" data-anchor-id="querying-and-aggregations">Querying and aggregations</h3>
<p><code>grep</code> can be effective and a hard baseline to beat. But what if you want to improve retrieval performance with hybrid or semantic search?</p>
<p>Luckily, there are CLI tools available for semantic search (e.g., <a href="https://github.com/run-llama/semtools"><code>semtools</code></a> or <a href="https://github.com/mixedbread-ai/mgrep"><code>mgrep</code></a>).</p>
<p>The question remains how well they scale and how effectively agents can use them, given that they are less common in the training data.</p>
<p>At some point, you might also want aggregations.</p>
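<p>As a small illustration of what an aggregation looks like on the database side, the sketch below counts memory notes per topic with a single SQL query. The table layout and the sample notes are made up for this example.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (topic TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO memories VALUES (?, ?)",
    [
        ("preferences", "prefers short answers"),
        ("preferences", "uses Python"),
        ("projects", "migrating the blog"),
    ],
)

# One declarative query answers "how many notes per topic?".
# With plain files, the agent would have to chain several CLI calls instead.
counts = conn.execute(
    "SELECT topic, COUNT(*) FROM memories GROUP BY topic ORDER BY topic"
).fetchall()
print(counts)  # [('preferences', 2), ('projects', 1)]
```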
</section>
<section id="plain-text-vs-complex-data" class="level3">
<h3 class="anchored" data-anchor-id="plain-text-vs-complex-data">Plain text vs complex data</h3>
<p>File interfaces and native CLI tools are great for plain-text files. What happens when memory becomes multimodal?</p>
</section>
<section id="concurrency" class="level3">
<h3 class="anchored" data-anchor-id="concurrency">Concurrency</h3>
<p>If you have a single agent accessing one memory file sequentially, no need to think about this.</p>
<p>If you have a multi-agent system, you want a database before implementing buggy lock mechanisms.</p>
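<p>The failure mode behind this is the classic lost update. The sketch below simulates it deterministically: the two “agents” are interleaved by hand instead of running in real threads, and the file name and notes are made up. With a read-modify-write on a shared file, the second write silently discards the first. With a database, each write is its own transaction and both survive.</p>

```python
import sqlite3
import tempfile
from pathlib import Path

# File-based memory: both agents read the same snapshot, then both write.
memory_file = Path(tempfile.mkdtemp()) / "notes.md"
memory_file.write_text("")
snapshot_a = memory_file.read_text()  # agent A reads
snapshot_b = memory_file.read_text()  # agent B reads before A has written
memory_file.write_text(snapshot_a + "A: user prefers dark mode\n")
memory_file.write_text(snapshot_b + "B: user lives in Berlin\n")  # overwrites A's note
file_notes = memory_file.read_text().splitlines()

# Database-backed memory: each agent inserts a row in its own transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (agent TEXT, note TEXT)")
with conn:
    conn.execute("INSERT INTO notes VALUES (?, ?)", ("A", "user prefers dark mode"))
with conn:
    conn.execute("INSERT INTO notes VALUES (?, ?)", ("B", "user lives in Berlin"))
db_notes = conn.execute("SELECT agent, note FROM notes ORDER BY rowid").fetchall()

print(len(file_notes), len(db_notes))  # 1 2
```

<p>You could avoid the lost update on the file side with append-only writes or file locks, but that is exactly the kind of mechanism a database already provides.</p>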
<hr>
<p>We’re just scratching the surface: security concerns, permission management, schema validation, etc. are more arguments for databases over filesystems for agent memory use cases.</p>
<p>I think this is an interesting conversation and I’m curious to see where it goes.</p>
<hr>
<p>Originally posted on <a href="https://x.com/helloiamleonie/status/2013256958535401503">X</a>/<a href="https://www.linkedin.com/posts/804250ab_digesting-the-current-filesystem-vs-database-activity-7419022671964766208-DG1V?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAABdZ4YQB5f0bhOeOvQJ3YEUtKThe0GEP4tc">LinkedIn</a>.</p>


</section>
</section>

 ]]></description>
  <guid>https://www.leoniemonigatti.com/blog/filesystem-vs-database-for-agent-memory.html</guid>
  <pubDate>Mon, 19 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Building AI Agents with Google’s ADK</title>
  <link>https://www.leoniemonigatti.com/blog/building-ai-agents-with-google-adk.html</link>
  <description><![CDATA[ 





<p>At the beginning of November 2025, Kaggle ran a <a href="https://www.kaggle.com/learn-guide/5-day-agents">5-Day AI Agents Intensive Course</a> covering core concepts of AI agents and how you can implement them with <a href="https://google.github.io/adk-docs/">Google’s <strong>Agent Development Kit (ADK)</strong></a>. To accompany the course, they released five whitepapers, which we will reference in this blog. This article reflects my study notes from the course.</p>
<p>For this, we will install the <code>google-adk</code> (<code>v1.18.0</code>) library for Python.</p>
<div id="cell-1" class="cell" data-execution_count="12">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%</span>pip install <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>q <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>U google<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>adk</span></code></pre></div></div>
</div>
<p>The agents we are building in this tutorial will be powered by <code>gemini-2.5-flash-lite</code>, which is free for experimental use. To use it, you will need to create a Gemini API key in <a href="https://aistudio.google.com/app/api-keys">Google AI Studio</a>. Then, make sure to add your API key to your environment variables (or Google Colab Secrets).</p>
<div id="cell-3" class="cell" data-execution_count="13">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> os</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.colab <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> userdata</span>
<span id="cb2-3"></span>
<span id="cb2-4">GEMINI_API_KEY <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> userdata.get(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'GEMINI_API_KEY'</span>)</span>
<span id="cb2-5">os.environ[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"GOOGLE_API_KEY"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> GEMINI_API_KEY</span>
<span id="cb2-6"></span>
<span id="cb2-7">MODEL_NAME <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"gemini-2.5-flash-lite"</span></span></code></pre></div></div>
</div>
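<p>The cell above relies on Colab Secrets. Outside of Colab, a small helper can read the key from the environment instead. This is a minimal sketch: <code>resolve_api_key</code> is a hypothetical helper and not part of ADK, and accepting <code>GEMINI_API_KEY</code> as a fallback name is an assumption.</p>

```python
import os


def resolve_api_key(env=os.environ):
    """Return the Gemini API key from the environment, or fail with a clear error.

    Hypothetical helper: checks GOOGLE_API_KEY first, then GEMINI_API_KEY.
    """
    key = env.get("GOOGLE_API_KEY") or env.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError(
            "Set GOOGLE_API_KEY (or GEMINI_API_KEY) before creating agents."
        )
    return key
```

<p>Failing early with a readable message tends to be easier to debug than an authentication error raised in the middle of an agent run.</p>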
<section id="agent-fundamentals" class="level2">
<h2 class="anchored" data-anchor-id="agent-fundamentals">Agent Fundamentals</h2>
<p>Let’s start with creating and running a first simple <code>Agent</code> (an alias for <code>LlmAgent</code>) in ADK. First, you’ll define an <code>Agent</code> with the following core components:</p>
<ul>
<li><code>name</code>: A name to identify the agent.</li>
<li><code>model</code>: We will be using <code>gemini-2.5-flash-lite</code> with <code>retry_options</code> for automatically handling failures by retrying the request.</li>
<li><code>description</code>: A description to identify the agent’s purpose.</li>
<li><code>instruction</code>: Instructions to describe the agent’s goal and how it should behave.</li>
<li><code>tools</code>: A list of tools that the agent can use (e.g., built-in Google search tool).</li>
</ul>
<div id="cell-5" class="cell" data-execution_count="14">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.genai <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> types</span>
<span id="cb3-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.agents <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Agent</span>
<span id="cb3-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.models.google_llm <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Gemini</span>
<span id="cb3-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.tools <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> google_search</span>
<span id="cb3-5"></span>
<span id="cb3-6">retry_config<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>types.HttpRetryOptions(</span>
<span id="cb3-7">    attempts<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>,         <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Maximum retry attempts</span></span>
<span id="cb3-8">    exp_base<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>,         <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Delay multiplier</span></span>
<span id="cb3-9">    initial_delay<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Initial delay before first retry (in seconds)</span></span>
<span id="cb3-10">    http_status_codes<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb3-11">        <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">429</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Too Many Requests</span></span>
<span id="cb3-12">        <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Internal Server Error</span></span>
<span id="cb3-13">        <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">503</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Service Unavailable</span></span>
<span id="cb3-14">        <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">504</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Gateway Timeout</span></span>
<span id="cb3-15">        ] <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Retry on these HTTP errors</span></span>
<span id="cb3-16">)</span>
<span id="cb3-17"></span>
<span id="cb3-18">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb3-19">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>,</span>
<span id="cb3-20">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb3-21">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb3-22">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config</span>
<span id="cb3-23">    ),</span>
<span id="cb3-24">    description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"A simple agent that can answer general questions."</span>,</span>
<span id="cb3-25">    instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""You are a helpful assistant.</span></span>
<span id="cb3-26"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    Use Google Search for current info or if unsure."""</span>,</span>
<span id="cb3-27">    tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[google_search],</span>
<span id="cb3-28">)</span></code></pre></div></div>
</div>
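<p>A quick note on the <code>retry_config</code> above: with <code>initial_delay=1</code> and <code>exp_base=7</code>, the waiting time grows quickly. The sketch below works out the schedule under two assumptions that you should check against the library documentation: the delay before retry <code>n</code> is <code>initial_delay * exp_base ** n</code>, and <code>attempts</code> counts the original request, which leaves <code>attempts - 1</code> retries. The helper <code>backoff_delays</code> is hypothetical, and the library may additionally apply jitter and cap the delay.</p>

```python
def backoff_delays(attempts, initial_delay, exp_base, max_delay=None):
    """Delays (in seconds) before each retry, assuming plain exponential backoff.

    Assumptions: the first attempt is not delayed, so `attempts` total tries
    leave room for `attempts - 1` retries. Jitter is ignored.
    """
    delays = []
    for retry in range(attempts - 1):
        delay = initial_delay * exp_base ** retry
        if max_delay is not None:
            delay = min(delay, max_delay)
        delays.append(delay)
    return delays


print(backoff_delays(attempts=5, initial_delay=1, exp_base=7))  # [1, 7, 49, 343]
```

<p>Without a cap, the last retry in this schedule would wait almost six minutes, so a smaller <code>exp_base</code> may be the better choice for interactive use.</p>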
<p>Next, you will define an orchestrator that will run the agent. For experimentation purposes, you can use the <code>InMemoryRunner</code>. For production, you’d use the base <code>Runner</code> class when you need persistent state between runs (see Memory Management).</p>
<div id="cell-7" class="cell" data-execution_count="15">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.runners <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> InMemoryRunner</span>
<span id="cb4-2"></span>
<span id="cb4-3">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemoryRunner(agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent)</span></code></pre></div></div>
</div>
<p>And finally, you can call the <code>run_debug</code> method to prompt the agent with a query.</p>
<div id="cell-9" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="6afc6e93-1451-4334-91ba-c2db3a096879" data-execution_count="16">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> runner.run_debug(</span>
<span id="cb5-2">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"When was the Kaggle 5-Day AI Agents Intensive Course happening?"</span>,</span>
<span id="cb5-3">    verbose<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,</span>
<span id="cb5-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
 ### Created new session: debug_session_id

User &gt; When was the Kaggle 5-Day AI Agents Intensive Course happening?
assistant &gt; The Kaggle 5-Day AI Agents Intensive Course was happening from November 10th to November 14th, 2025. This course is also available as a self-paced learning guide. It was previously held live from March 31 to April 4, 2025.</code></pre>
</div>
</div>
<p>And that’s all it takes to get your first single-agent system up and running!</p>
</section>
<section id="tools" class="level2">
<h2 class="anchored" data-anchor-id="tools">Tools</h2>
<p>An AI agent’s ability to call tools that connect it to the outside world is what sets it apart from a regular LLM call and what makes it so powerful. This section discusses four main types of tools an ADK agent can use:</p>
<ul>
<li>Built-in tools</li>
<li>Function tools</li>
<li>Agent tools</li>
<li>MCP tools</li>
</ul>
<section id="built-in-tools" class="level3">
<h3 class="anchored" data-anchor-id="built-in-tools">Built-in tools</h3>
<p>In the above example, we provided the agent access to a tool called <code>google_search</code>, which is a built-in tool. Some foundation models have built-in tools, where the tool definition is given to the model implicitly. For example, Google’s Gemini API has <a href="https://google.github.io/adk-docs/tools/built-in-tools/">several built-in tools</a>, such as Google search, code execution, or computer use.</p>
<div id="cell-13" class="cell" data-execution_count="17">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.tools <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> google_search</span>
<span id="cb7-2"></span>
<span id="cb7-3">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb7-4">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant_with_builtin_tool"</span>,</span>
<span id="cb7-5">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb7-6">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb7-7">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config</span>
<span id="cb7-8">    ),</span>
<span id="cb7-9">    description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"A simple agent that can answer general questions."</span>,</span>
<span id="cb7-10">    instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""You are a helpful assistant.</span></span>
<span id="cb7-11"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    Use Google Search for current info or if unsure."""</span>,</span>
<span id="cb7-12">    tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[google_search],</span>
<span id="cb7-13">)</span></code></pre></div></div>
</div>
<div id="cell-14" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="9c98c649-8c20-4446-ef30-27a3eb9b7eb3" data-execution_count="18">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemoryRunner(agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent)</span>
<span id="cb8-2"></span>
<span id="cb8-3">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> runner.run_debug(</span>
<span id="cb8-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"When was the Kaggle 5-Day AI Agents Intensive Course happening?"</span>,</span>
<span id="cb8-5">    verbose<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,</span>
<span id="cb8-6">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
 ### Created new session: debug_session_id

User &gt; When was the Kaggle 5-Day AI Agents Intensive Course happening?
assistant_with_builtin_tool &gt; The Kaggle 5-Day AI Agents Intensive Course was happening from November 10th to November 14th, 2025. This course was designed to teach participants how to build and deploy intelligent AI agents, covering topics such as agent architectures, tools, memory, and evaluation, and moving from prototype to production. Registration for the course has since closed. However, the course content is expected to be available as a self-paced learning guide by the end of November 2025.</code></pre>
</div>
</div>
</section>
<section id="function-tools" class="level3">
<h3 class="anchored" data-anchor-id="function-tools">Function Tools</h3>
<p>The most common type of tool is the function tool. Developers can define custom functions for all foundation models that support “function calling”.</p>
<p>How you define your custom function will impact how well the agent is able to select and use the right tool for a given task. Therefore, it is important that the tool follows a few best practices:</p>
<ol type="1">
<li><strong>Docstrings:</strong> Enable the agent to understand when and how to use tools</li>
<li><strong>Type Hints:</strong> Enable the agent to generate the correct schema</li>
<li><strong>Dictionary Returns:</strong> Tools return a dictionary with a status and either the tool result (for successful calls) or an error message (for failed calls)</li>
</ol>
<div id="cell-16" class="cell" data-execution_count="19">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_kaggle_progressions(tier: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">dict</span>:</span>
<span id="cb10-2">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Looks up the needed medals to progress in the Kaggle competitions tier</span></span>
<span id="cb10-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    based on the tier provided by the user.</span></span>
<span id="cb10-4"></span>
<span id="cb10-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Args:</span></span>
<span id="cb10-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        tier: The name of the Kaggle competitions tier. It should be descriptive,</span></span>
<span id="cb10-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">                e.g., "expert", "master", or "grandmaster".</span></span>
<span id="cb10-8"></span>
<span id="cb10-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Returns:</span></span>
<span id="cb10-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Dictionary with status and medal information.</span></span>
<span id="cb10-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Success: {"status": "success", "medals": "2 bronze"}</span></span>
<span id="cb10-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Error: {"status": "error", "error_message": "Kaggle tier not found"}</span></span>
<span id="cb10-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb10-14"></span>
<span id="cb10-15">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># This simulates looking up Kaggle's competition progression</span></span>
<span id="cb10-16">    medals_database <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {</span>
<span id="cb10-17">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"expert"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2 bronze"</span>,</span>
<span id="cb10-18">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"master"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"1 gold and 2 silver"</span>,</span>
<span id="cb10-19">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"grandmaster"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"5 gold"</span>,</span>
<span id="cb10-20">    }</span>
<span id="cb10-21"></span>
<span id="cb10-22">    medals <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> medals_database.get(tier.lower())</span>
<span id="cb10-23"></span>
<span id="cb10-24">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> medals <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">is</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb10-25">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {</span>
<span id="cb10-26">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"status"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"success"</span>,</span>
<span id="cb10-27">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"medals"</span>: medals,</span>
<span id="cb10-28">            }</span>
<span id="cb10-29">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb10-30">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {</span>
<span id="cb10-31">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"status"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"error"</span>,</span>
<span id="cb10-32">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"error_message"</span>: <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Kaggle tier '</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>tier<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">' not found"</span>,</span>
<span id="cb10-33">        }</span>
<span id="cb10-34"></span>
<span id="cb10-35">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb10-36">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant_with_function_tool"</span>,</span>
<span id="cb10-37">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb10-38">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb10-39">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config,</span>
<span id="cb10-40">        ),</span>
<span id="cb10-41">    instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""You are a Google Developer Expert for Kaggle.</span></span>
<span id="cb10-42"></span>
<span id="cb10-43"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    For Kaggle tier progression requests use `get_kaggle_progressions()` to find competition medal requirements for each tier.</span></span>
<span id="cb10-44"></span>
<span id="cb10-45"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    If the tool returns status "error", explain the issue to the user clearly.</span></span>
<span id="cb10-46"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    """</span>,</span>
<span id="cb10-47">    tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[get_kaggle_progressions],</span>
<span id="cb10-48">)</span></code></pre></div></div>
</div>
<div id="cell-17" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="29833c93-e87b-4add-e80f-ff412143722e" data-execution_count="20">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemoryRunner(agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent)</span>
<span id="cb11-2"></span>
<span id="cb11-3">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> runner.run_debug(</span>
<span id="cb11-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"How many medals do I need to become a Kaggle Competitions Expert?"</span>,</span>
<span id="cb11-5">    verbose<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,</span>
<span id="cb11-6">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
 ### Created new session: debug_session_id

User &gt; How many medals do I need to become a Kaggle Competitions Expert?
assistant_with_function_tool &gt; [Calling tool: get_kaggle_progressions({'tier': 'expert'})]
assistant_with_function_tool &gt; [Tool result: {'status': 'success', 'medals': '2 bronze'}]</code></pre>
</div>
</div>
</section>
<section id="agent-tools" class="level3">
<h3 class="anchored" data-anchor-id="agent-tools">Agent Tools</h3>
<p>Another type of tool is the <code>AgentTool</code>, which wraps an agent so that it can be invoked as a tool. This allows the primary agent to delegate specific tasks (e.g., calculations) to sub-agents while keeping control over the user interaction.</p>
<div id="cell-19" class="cell" data-execution_count="21">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.tools <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> AgentTool</span>
<span id="cb13-2"></span>
<span id="cb13-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define agent tool</span></span>
<span id="cb13-4">tool_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb13-5">  model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb13-6">  name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_agent"</span>,</span>
<span id="cb13-7">  description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Returns the capital city for any country or state"</span>,</span>
<span id="cb13-8">  instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""When the user gives you the name of a country (e.g. Germany),</span></span>
<span id="cb13-9"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  answer with the name of the capital city of that country.</span></span>
<span id="cb13-10"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  Otherwise, tell the user you are not able to help them."""</span></span>
<span id="cb13-11">)</span>
<span id="cb13-12"></span>
<span id="cb13-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define primary agent</span></span>
<span id="cb13-14">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb13-15">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant_with_agent_tool"</span>,</span>
<span id="cb13-16">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb13-17">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb13-18">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config</span>
<span id="cb13-19">        ),</span>
<span id="cb13-20">  description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Answers user questions and gives advice"</span>,</span>
<span id="cb13-21">  instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""Use the tools you have available to answer the user's questions"""</span>,</span>
<span id="cb13-22">  tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[AgentTool(agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>tool_agent)]</span>
<span id="cb13-23">)</span></code></pre></div></div>
</div>
<div id="cell-20" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="c2a96ba0-9c87-4e5b-e829-51fa41412437" data-execution_count="22">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb14-1">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemoryRunner(agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent)</span>
<span id="cb14-2"></span>
<span id="cb14-3">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> runner.run_debug(</span>
<span id="cb14-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I want to visit Germany. Which city do you recommend I visit first?"</span>,</span>
<span id="cb14-5">    verbose<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,</span>
<span id="cb14-6">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
 ### Created new session: debug_session_id

User &gt; I want to visit Germany. Which city do you recommend I visit first?
assistant_with_agent_tool &gt; [Calling tool: tool_agent({'request': 'What is the capital of Germany?'})]
assistant_with_agent_tool &gt; [Tool result: {'result': 'Berlin'}]
assistant_with_agent_tool &gt; I recommend you visit Berlin first. It's the capital city and a great place to start exploring Germany!</code></pre>
</div>
</div>
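<p>The delegation pattern can be sketched in plain Python without an LLM. This is only an illustration of the idea, not how ADK implements <code>AgentTool</code>: <code>SketchAgent</code>, <code>as_tool</code>, and the <code>CAPITALS</code> lookup table are hypothetical names, and the lookup table stands in for the sub-agent’s model call.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SketchAgent:
    """Hypothetical stand-in for an agent: a name plus a function that answers requests."""
    name: str
    respond: Callable[[str], str]
    tools: Dict[str, Callable[[str], dict]] = field(default_factory=dict)


def as_tool(agent: SketchAgent) -> Callable[[str], dict]:
    """Wrap an agent so that a parent agent can call it like any other tool."""
    def tool(request: str) -> dict:
        return {"result": agent.respond(request)}
    return tool


# Sub-agent with a narrow task (a lookup table replaces the LLM call here).
CAPITALS = {"germany": "Berlin", "france": "Paris"}
capital_agent = SketchAgent(
    name="tool_agent",
    respond=lambda request: next(
        (city for country, city in CAPITALS.items() if country in request.lower()),
        "I am not able to help with that.",
    ),
)


def primary_respond(query: str) -> str:
    """Primary agent: delegates the lookup, then writes the final answer itself."""
    tool_result = primary.tools["tool_agent"](query)
    return f"I recommend you visit {tool_result['result']} first."


primary = SketchAgent(
    name="assistant_with_agent_tool",
    respond=primary_respond,
    tools={"tool_agent": as_tool(capital_agent)},
)

print(primary.respond("I want to visit Germany."))
```

<p>The sub-agent only ever returns a tool result. The primary agent decides how to phrase the final reply, which is what “keeping control over the user interaction” means here.</p>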
</section>
<section id="model-context-protocol-mcp-tools" class="level3">
<h3 class="anchored" data-anchor-id="model-context-protocol-mcp-tools">Model Context Protocol (MCP) Tools</h3>
<p>Writing custom function tools requires writing and maintaining API clients when you want to connect to external systems, like databases, GitHub, or Google services. Instead of writing your own integrations and API clients, you can leverage the <a href="https://www.anthropic.com/news/model-context-protocol"><strong>Model Context Protocol (MCP)</strong>, which is an open standard introduced by Anthropic</a>.</p>
<p>The MCP lets you connect your agent (<em>MCP client</em>) to an external <em>MCP server</em> that provides tools, such as image generation or database access.</p>
<p>To use MCP tools with your agent, you first need to choose an MCP server and a tool. You can use the <a href="https://github.com/mcp">MCP registry</a> to find one. In this tutorial, we will use the <a href="https://github.com/modelcontextprotocol/servers/tree/main/src/everything">Everything MCP Server</a>, which is a demo server providing a tool called <code>getTinyImage</code> to return a test image.</p>
<p>Next, you will need to create an <code>McpToolset</code> to integrate an ADK agent with an MCP server. This launches the MCP server, establishes a communication channel, and adds the server’s tools to the agent’s tool list automatically, without the need for any additional integration code.</p>
<div id="cell-22" class="cell" data-execution_count="23">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb16-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.tools.mcp_tool.mcp_toolset <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> McpToolset</span>
<span id="cb16-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.tools.mcp_tool.mcp_session_manager <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> StdioConnectionParams</span>
<span id="cb16-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mcp <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> StdioServerParameters</span>
<span id="cb16-4"></span>
<span id="cb16-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># MCP integration with Everything Server</span></span>
<span id="cb16-6">mcp_server <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> McpToolset(</span>
<span id="cb16-7">    connection_params<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>StdioConnectionParams(</span>
<span id="cb16-8">        server_params<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>StdioServerParameters(</span>
<span id="cb16-9">            command<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"npx"</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Run MCP server via npx</span></span>
<span id="cb16-10">            args<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb16-11">                <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"-y"</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Argument for npx to auto-confirm install</span></span>
<span id="cb16-12">                <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"@modelcontextprotocol/server-everything"</span>,</span>
<span id="cb16-13">            ],</span>
<span id="cb16-14">        ),</span>
<span id="cb16-15">        timeout<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>,</span>
<span id="cb16-16">    ),</span>
<span id="cb16-17">    tool_filter<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"getTinyImage"</span>],  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Only expose this tool to the agent</span></span>
<span id="cb16-18">)</span></code></pre></div></div>
</div>
<p>Now, you only have to add the <code>mcp_server</code> to the agent’s tool list and update the agent’s instructions to use it.</p>
<div id="cell-24" class="cell" data-execution_count="24">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create image agent with MCP integration</span></span>
<span id="cb17-2">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb17-3">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant_with_mcp_tool"</span>,</span>
<span id="cb17-4">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb17-5">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb17-6">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config</span>
<span id="cb17-7">        ),</span>
<span id="cb17-8">    instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Use the MCP Tool to generate images for user queries"</span>,</span>
<span id="cb17-9">    tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[mcp_server],</span>
<span id="cb17-10">)</span></code></pre></div></div>
</div>
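<p>To make the client/server exchange more concrete, here is a rough sketch of the JSON-RPC 2.0 messages an MCP client sends to discover and call a tool. MCP defines the <code>tools/list</code> and <code>tools/call</code> methods. The helper names <code>jsonrpc_request</code> and <code>filter_tools</code> and the sample tool list are assumptions for illustration, and the initialization handshake that precedes these calls is omitted.</p>

```python
import json
from itertools import count
from typing import Dict, List, Optional

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def jsonrpc_request(method: str, params: Optional[Dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def filter_tools(listed_tools: List[Dict], allowed: List[str]) -> List[Dict]:
    """Keep only the tools the agent should see, similar in spirit to a tool filter."""
    return [tool for tool in listed_tools if tool["name"] in allowed]


# 1. Ask the server which tools it offers.
list_request = jsonrpc_request("tools/list")

# 2. Call one tool by name with its arguments.
call_request = jsonrpc_request(
    "tools/call", {"name": "getTinyImage", "arguments": {}}
)

# Hypothetical, shortened tools/list result for illustration only.
listed = [{"name": "echo"}, {"name": "getTinyImage"}]
exposed = filter_tools(listed, allowed=["getTinyImage"])

print(list_request)
print(call_request)
print(exposed)
```

<p>With ADK you never write these messages yourself. The toolset handles the exchange and passes the tool results back to the agent.</p>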
<hr>
<p>This section discussed the main types of tools. In ADK, you also have <a href="https://google.github.io/adk-docs/tools-custom/function-tools/#long-run-tool">long-running function tools</a> and <a href="https://google.github.io/adk-docs/tools-custom/openapi-tools/">OpenAPI tools</a>.</p>
</section>
</section>
<section id="memory-management" class="level2">
<h2 class="anchored" data-anchor-id="memory-management">Memory Management</h2>
<p>LLMs are stateless. Without memory management, every interaction with them starts from scratch. In ADK, you use:</p>
<ul>
<li><code>Sessions</code> for short-term memory management</li>
<li><code>Memory</code> for long-term memory management</li>
</ul>
<p>Note that since we now want to have conversation history and persistent state between runs, we will no longer use the <code>InMemoryRunner</code>, but instead we will use the base <code>Runner</code> class, which takes <code>session_service</code> for short-term memory and <code>memory_service</code> for long-term memory as input parameters.</p>
<p>Additionally, we cannot use the <code>run_debug()</code> method anymore because it creates a debug session with a <code>debug_session_id</code>. Since we want to distinguish between different sessions, we will need to use the <code>run_async()</code> method, which takes a session ID as input.</p>
<section id="short-term-memory" class="level3">
<h3 class="anchored" data-anchor-id="short-term-memory">Short-Term Memory</h3>
<p>Short-term memory most often refers to the conversation history of a session. The conversation history records not only the user queries and the agent’s responses, but also all tool interactions. Because this history can grow quickly, short-term memory for long-running conversations can also be a summarized version of the current session. Short-term memory in ADK is managed by a <code>session_service</code>, which you pass to the <code>Runner</code> class.</p>
<div id="cell-28" class="cell" data-execution_count="25">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb18-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.runners <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Runner</span>
<span id="cb18-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.sessions <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> InMemorySessionService</span>
<span id="cb18-3"></span>
<span id="cb18-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Set up Session Management</span></span>
<span id="cb18-5">session_service <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemorySessionService()</span>
<span id="cb18-6"></span>
<span id="cb18-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Set up the agent</span></span>
<span id="cb18-8">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb18-9">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>,</span>
<span id="cb18-10">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb18-11">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb18-12">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config</span>
<span id="cb18-13">    ),</span>
<span id="cb18-14">    description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"A simple agent that can answer general questions."</span>,</span>
<span id="cb18-15">    instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""You are a helpful assistant."""</span></span>
<span id="cb18-16">)</span>
<span id="cb18-17"></span>
<span id="cb18-18"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create the Runner</span></span>
<span id="cb18-19">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Runner(</span>
<span id="cb18-20">    agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent,</span>
<span id="cb18-21">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"default"</span>,</span>
<span id="cb18-22">    session_service<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session_service</span>
<span id="cb18-23">    )</span></code></pre></div></div>
</div>
<p>To run the session, we create a new session manually and pass its ID to the <code>run_async()</code> method.</p>
<div id="cell-30" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="5e158fa3-7916-4ad7-e744-ec8c5dccec26" data-execution_count="26">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb19-1">session_name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"session1"</span></span>
<span id="cb19-2">app_name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> runner.app_name</span>
<span id="cb19-3">USER_ID <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'default'</span></span>
<span id="cb19-4"></span>
<span id="cb19-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a new session</span></span>
<span id="cb19-6">session <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> session_service.create_session(</span>
<span id="cb19-7">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>app_name,</span>
<span id="cb19-8">    user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb19-9">    session_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session_name</span>
<span id="cb19-10">)</span>
<span id="cb19-11"></span>
<span id="cb19-12">user_queries <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [</span>
<span id="cb19-13">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hi, I am Sam! What is the capital of United States?"</span>,</span>
<span id="cb19-14">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hello! What is my name?"</span>,</span>
<span id="cb19-15">]</span>
<span id="cb19-16"></span>
<span id="cb19-17"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> query <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> user_queries:</span>
<span id="cb19-18">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">User &gt; </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>query<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb19-19"></span>
<span id="cb19-20">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">async</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> event <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> runner.run_async(</span>
<span id="cb19-21">        user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb19-22">        session_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb19-23">        new_message<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>types.Content(role<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, parts<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[types.Part(text<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>query)])</span>
<span id="cb19-24">    ):</span>
<span id="cb19-25">      <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Assistant &gt; "</span>, event.content.parts[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
User &gt; Hi, I am Sam! What is the capital of United States?
Assistant &gt;  Hi Sam! The capital of the United States is Washington, D.C.

User &gt; Hello! What is my name?
Assistant &gt;  Your name is Sam.</code></pre>
</div>
</div>
<p>You can see that the agent was able to remember the user’s name.</p>
<p>Below you can see that we recorded four events in the current session <code>session1</code>.</p>
<div id="cell-32" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="445c5449-67b8-4dd8-9ad4-401e4c6a8e64" data-execution_count="27">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb21-1">session <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> session_service.get_session(</span>
<span id="cb21-2">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>app_name,</span>
<span id="cb21-3">    user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb21-4">    session_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session_name,</span>
<span id="cb21-5">)</span>
<span id="cb21-6"></span>
<span id="cb21-7"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> event <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> session.events:</span>
<span id="cb21-8">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>event<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>role<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>event<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>parts[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>text<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>user: Hi, I am Sam! What is the capital of United States?
model: Hi Sam! The capital of the United States is Washington, D.C.
user: Hello! What is my name?
model: Your name is Sam.</code></pre>
</div>
</div>
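<p>Conceptually, a session service keeps one ordered event list per combination of app, user, and session. The following is a minimal sketch of that idea in plain Python. <code>SketchSessionStore</code> and <code>SketchEvent</code> are hypothetical names, and this is not how ADK’s <code>InMemorySessionService</code> is implemented.</p>

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Tuple

SessionKey = Tuple[str, str, str]  # (app_name, user_id, session_id)


@dataclass
class SketchEvent:
    role: str  # "user" or "model"
    text: str


class SketchSessionStore:
    """Hypothetical in-memory store: one event list per session key."""

    def __init__(self) -> None:
        self._events: DefaultDict[SessionKey, List[SketchEvent]] = defaultdict(list)

    def append_event(self, key: SessionKey, event: SketchEvent) -> None:
        self._events[key].append(event)

    def get_events(self, key: SessionKey) -> List[SketchEvent]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._events[key])


store = SketchSessionStore()
session1 = ("default", "default", "session1")
session2 = ("default", "default", "session2")

store.append_event(session1, SketchEvent("user", "Hi, I am Sam!"))
store.append_event(session1, SketchEvent("model", "Hi Sam!"))
store.append_event(session2, SketchEvent("user", "Hello! What is my name?"))

# Each session only sees its own history.
for event in store.get_events(session1):
    print(f"{event.role}: {event.text}")
```

<p>Because the history is keyed by session, a question asked in <code>session2</code> has no access to what was said in <code>session1</code>. Everything is lost when the process exits, which is why an in-memory store is only suitable for experimentation.</p>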
<p>Note that here we’re using <code>InMemorySessionService</code>, which stores conversations temporarily in RAM for experimentation. In production, you’d use <code>DatabaseSessionService</code>, which stores conversations permanently in a database, as shown below:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb23-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.sessions <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> DatabaseSessionService</span>
<span id="cb23-2"></span>
<span id="cb23-3">db_url <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sqlite:///my_agent_data.db"</span>  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Local SQLite file</span></span>
<span id="cb23-4">session_service <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DatabaseSessionService(db_url<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>db_url)</span></code></pre></div></div>
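<p>What “stored permanently” means can be illustrated with Python’s built-in <code>sqlite3</code> module. This is a sketch under stated assumptions: the <code>events</code> table, its columns, and the helpers <code>save_event</code> and <code>load_events</code> are invented for illustration and do not reflect the schema that <code>DatabaseSessionService</code> uses.</p>

```python
import os
import sqlite3
import tempfile
from contextlib import closing
from typing import List, Tuple


def save_event(db_path: str, session_id: str, role: str, text: str) -> None:
    """Append one event to a hypothetical events table."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS events "
                "(session_id TEXT, role TEXT, text TEXT)"
            )
            conn.execute(
                "INSERT INTO events VALUES (?, ?, ?)", (session_id, role, text)
            )


def load_events(db_path: str, session_id: str) -> List[Tuple[str, str]]:
    """Read a session's events back in insertion order."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(
            "SELECT role, text FROM events WHERE session_id = ? ORDER BY rowid",
            (session_id,),
        ).fetchall()


# Use a throwaway file so the sketch does not touch real data.
db_path = os.path.join(tempfile.mkdtemp(), "sketch_sessions.db")

save_event(db_path, "session1", "user", "Hi, I am Sam!")
save_event(db_path, "session1", "model", "Hi Sam!")

# Every call opens a fresh connection, so the history survives "restarts".
print(load_events(db_path, "session1"))
```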
<p>As you can imagine, recording all events quickly produces a long conversation history, which leads to higher cost and slower responses and will eventually hit the context window limit. To mitigate this, you can use <strong>context compaction</strong>, which automatically reduces the context stored in the session. The compaction process summarizes previous events and stores the summary as a single new event. In ADK, you can use the <code>EventsCompactionConfig</code> class for this.</p>
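<p>The idea behind compaction can be sketched in a few lines. This is an illustration only, not ADK’s implementation: <code>compact_events</code>, <code>keep_last</code>, and the placeholder summarizer are hypothetical, and in practice the summary would come from an LLM call.</p>

```python
from typing import Callable, List


def compact_events(
    events: List[str],
    keep_last: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Replace all but the last `keep_last` events with one summary event.

    Assumes `keep_last` is not negative.
    """
    if keep_last >= len(events):
        return list(events)  # Nothing old enough to compact
    split = len(events) - keep_last
    older, recent = events[:split], events[split:]
    return [summarize(older)] + recent


def placeholder_summarizer(older: List[str]) -> str:
    """Stand-in for an LLM summary; only counts the events it replaces."""
    return f"[summary of {len(older)} earlier events]"


history = [
    "user: Hi, I am Sam! What is the capital of United States?",
    "model: Hi Sam! The capital of the United States is Washington, D.C.",
    "user: Hello! What is my name?",
    "model: Your name is Sam.",
]

compacted = compact_events(history, keep_last=2, summarize=placeholder_summarizer)
for event in compacted:
    print(event)
```

<p>Note the trade-off: the summary replaces the original events, so facts such as the user’s name are only available later if the summarizer keeps them.</p>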
<p>Another approach, which reduces the cost of repeatedly sending static instructions, is to cache the request data via <strong>context caching</strong>. In ADK, you can use the <code>ContextCacheConfig</code> class for this.</p>
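<p>To make the idea of compaction concrete, here is a minimal, framework-independent sketch. It is not how ADK implements compaction: the <code>Event</code> dataclass, the <code>compact_events</code> function, the <code>keep_last</code> parameter, and the placeholder <code>summarize</code> function (which would be an LLM call in a real system) are all assumptions made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass
class Event:
    role: str
    text: str


def summarize(events):
    # Placeholder summarizer (assumption): a real system would ask an LLM
    # to summarize these events. Here we only truncate and join them.
    snippets = [f"{e.role}: {e.text[:40]}" for e in events]
    return "Summary of earlier conversation: " + " | ".join(snippets)


def compact_events(events, keep_last=2):
    # Nothing to compact if the history is already short enough.
    if keep_last >= len(events):
        return list(events)
    # Replace all older events with a single summary event
    # and keep the most recent events verbatim.
    split = len(events) - keep_last
    older, recent = events[:split], events[split:]
    summary_event = Event(role="system", text=summarize(older))
    return [summary_event] + recent


history = [
    Event("user", "Hi, I am Sam! What is the capital of United States?"),
    Event("model", "Hi Sam! The capital of the United States is Washington, D.C."),
    Event("user", "Hello! What is my name?"),
    Event("model", "Your name is Sam."),
]

compacted = compact_events(history, keep_last=2)
print(len(history), "events before,", len(compacted), "events after")
```

<p>In this sketch, the four recorded events shrink to three: one summary event plus the two most recent events. The trade-off is that details from older events survive only if the summarizer keeps them.</p>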
</section>
<section id="long-term-memory" class="level3">
<h3 class="anchored" data-anchor-id="long-term-memory">Long-term Memory</h3>
<p>In contrast to short-term memory, long-term memory persists across multiple conversations in a searchable storage.</p>
<p>In ADK, long-term memory is managed by a <code>memory_service</code>, which has to be first created and then provided to the agent via the <code>Runner</code> class.</p>
<div id="cell-36" class="cell" data-execution_count="28">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb24" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb24-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.memory <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> InMemoryMemoryService</span>
<span id="cb24-2"></span>
<span id="cb24-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create Session Service</span></span>
<span id="cb24-4">session_service <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemorySessionService()</span>
<span id="cb24-5"></span>
<span id="cb24-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create Memory Service</span></span>
<span id="cb24-7">memory_service <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemoryMemoryService()</span>
<span id="cb24-8"></span>
<span id="cb24-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create runner with BOTH services</span></span>
<span id="cb24-10">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Runner(</span>
<span id="cb24-11">    agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent,</span>
<span id="cb24-12">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"default"</span>,</span>
<span id="cb24-13">    session_service<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session_service,</span>
<span id="cb24-14">    memory_service<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>memory_service,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Memory service is now available!</span></span>
<span id="cb24-15">)</span></code></pre></div></div>
</div>
<p>Let’s run the session again.</p>
<div id="cell-38" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="bf5871d1-167c-4687-9c9c-bb14ea89f758" data-execution_count="29">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb25-1">session_name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"session2"</span></span>
<span id="cb25-2">app_name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> runner.app_name</span>
<span id="cb25-3">USER_ID <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'default'</span></span>
<span id="cb25-4"></span>
<span id="cb25-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a new session</span></span>
<span id="cb25-6">session <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> session_service.create_session(</span>
<span id="cb25-7">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>app_name,</span>
<span id="cb25-8">    user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb25-9">    session_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session_name,</span>
<span id="cb25-10">)</span>
<span id="cb25-11"></span>
<span id="cb25-12">user_queries <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [</span>
<span id="cb25-13">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hi, I am Sam! What is the capital of United States?"</span>,</span>
<span id="cb25-14">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hello! What is my name?"</span>,</span>
<span id="cb25-15">]</span>
<span id="cb25-16"></span>
<span id="cb25-17"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Process each query in the list sequentially</span></span>
<span id="cb25-18"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> query <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> user_queries:</span>
<span id="cb25-19">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">User &gt; </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>query<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb25-20"></span>
<span id="cb25-21">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Stream the agent's response asynchronously</span></span>
<span id="cb25-22">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">async</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> event <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> runner.run_async(</span>
<span id="cb25-23">        user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb25-24">        session_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb25-25">        new_message<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>types.Content(role<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, parts<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[types.Part(text<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>query)])</span>
<span id="cb25-26">    ):</span>
<span id="cb25-27">      <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Assistant &gt; "</span>, event.content.parts[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
User &gt; Hi, I am Sam! What is the capital of United States?
Assistant &gt;  Hi Sam! The capital of the United States is Washington, D.C.

User &gt; Hello! What is my name?
Assistant &gt;  You told me your name is Sam!</code></pre>
</div>
</div>
<p>As you can see, the agent’s general behavior looks the same to the user, and the new session <code>session2</code> records similar events.</p>
<div id="cell-40" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="583c8cd0-8d09-40c7-9e48-151d210f33c2" data-execution_count="30">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb27-1">session <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> session_service.get_session(</span>
<span id="cb27-2">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>app_name,</span>
<span id="cb27-3">    user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb27-4">    session_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>session_name</span>
<span id="cb27-5">)</span>
<span id="cb27-6"></span>
<span id="cb27-7"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> event <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> session.events:</span>
<span id="cb27-8">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>event<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>role<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>event<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>parts[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>text<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>user: Hi, I am Sam! What is the capital of United States?
model: Hi Sam! The capital of the United States is Washington, D.C.
user: Hello! What is my name?
model: You told me your name is Sam!</code></pre>
</div>
</div>
<p>Notice how this agent has both <code>session_service</code> for short-term memory and <code>memory_service</code> for long-term memory? This is because long-term memory is created by transferring session data to memory using the <code>add_session_to_memory()</code> function. While the <code>InMemoryMemoryService</code> stores the entire conversation history, a managed memory service like the Vertex AI Memory Bank extracts key facts from the conversation history and only stores those in the long-term memory.</p>
<p>You can save session data to long-term memory at the end of a session, in periodic intervals, or after every turn, depending on your use case.</p>
<div id="cell-42" class="cell" data-execution_count="31">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb29" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb29-1"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> memory_service.add_session_to_memory(session)</span></code></pre></div></div>
</div>
<p>You can also manually search memory with <code>search_memory</code>. Note that the <code>InMemoryMemoryService</code> searches with keyword matching, while the <code>VertexAiMemoryBankService</code> uses semantic search.</p>
<div id="cell-44" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="6a18a71d-3040-4757-869f-da5e2d463909" data-execution_count="32">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb30" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb30-1">APP_NAME <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> runner.app_name</span>
<span id="cb30-2"></span>
<span id="cb30-3">search_response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> memory_service.search_memory(</span>
<span id="cb30-4">    app_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>APP_NAME,</span>
<span id="cb30-5">    user_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>USER_ID,</span>
<span id="cb30-6">    query<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"What is the user's name?"</span></span>
<span id="cb30-7">)</span>
<span id="cb30-8"></span>
<span id="cb30-9"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(search_response)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>memories=[MemoryEntry(content=Content(
  parts=[
    Part(
      text='Hi, I am Sam! What is the capital of United States?'
    ),
  ],
  role='user'
), custom_metadata={}, id=None, author='user', timestamp='2025-11-25T20:40:25.254618'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='Hi Sam! The capital of the United States is Washington, D.C.'
    ),
  ],
  role='model'
), custom_metadata={}, id=None, author='assistant', timestamp='2025-11-25T20:40:25.255173'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='Hello! What is my name?'
    ),
  ],
  role='user'
), custom_metadata={}, id=None, author='user', timestamp='2025-11-25T20:40:25.742681'), MemoryEntry(content=Content(
  parts=[
    Part(
      text='You told me your name is Sam!'
    ),
  ],
  role='model'
), custom_metadata={}, id=None, author='assistant', timestamp='2025-11-25T20:40:25.743219')]</code></pre>
</div>
</div>
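<p>To build some intuition for what keyword matching means, here is a toy sketch in plain Python. It is not the code of <code>InMemoryMemoryService</code>: the helper names <code>words</code> and <code>keyword_search</code> are hypothetical, and the matching rule (at least one shared word) is an assumption chosen for simplicity.</p>

```python
import re


def words(text):
    # Lowercase the text and split it into a set of alphanumeric words.
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def keyword_search(memories, query):
    # Return every stored text that shares at least one word with the query.
    query_words = words(query)
    return [m for m in memories if not words(m).isdisjoint(query_words)]


memories = [
    "Hi, I am Sam! What is the capital of United States?",
    "Hi Sam! The capital of the United States is Washington, D.C.",
    "Hello! What is my name?",
    "You told me your name is Sam!",
]

print(keyword_search(memories, "What is the user's name?"))
print(keyword_search(memories, "Washington"))
print(keyword_search(memories, "Berlin weather"))
```

<p>With this simple rule, common words such as “is” or “the” are enough for a match, so the first query returns all four entries, while a query without any shared word returns nothing. A semantic search would instead rank entries by meaning rather than by shared words.</p>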
<p>But what good is storing and searching memory when your agent can’t access it? To give your agents access to the memory, you can provide them with the built-in memory tools:</p>
<ul>
<li><code>load_memory</code> (Reactive): Agent decides when to search memory. This saves tokens and latency, but you run the risk of the agent forgetting to look up the memory.</li>
<li><code>preload_memory</code> (Proactive): Automatically searches before every turn. This is less efficient, but you’re guaranteed that memory is always available to the agent.</li>
</ul>
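<p>The trade-off between the two tools can be illustrated with a small simulation. This is a sketch under stated assumptions and does not use ADK: <code>CountingMemory</code>, <code>proactive_context</code>, <code>reactive_context</code>, and <code>wants_memory</code> (a stand-in for the agent’s own decision to call the memory tool) are hypothetical names.</p>

```python
class CountingMemory:
    # Toy memory store (assumption) that counts how often it is searched.
    def __init__(self, facts):
        self.facts = facts
        self.search_calls = 0

    def search(self, query):
        self.search_calls += 1
        query_words = set(query.lower().split())
        return [f for f in self.facts if not query_words.isdisjoint(f.lower().split())]


def proactive_context(memory, query):
    # Proactive: always search memory before the turn.
    return memory.search(query)


def reactive_context(memory, query, wants_memory):
    # Reactive: search only if the agent decides to do so.
    return memory.search(query) if wants_memory(query) else []


def wants_memory(query):
    # Stand-in for the agent's decision: only personal questions trigger a lookup.
    return "my" in query.lower().split()


proactive_memory = CountingMemory(["The user's name is Sam"])
reactive_memory = CountingMemory(["The user's name is Sam"])
queries = ["What is my name?", "What is 2 + 2?"]

for query in queries:
    proactive_context(proactive_memory, query)
    reactive_context(reactive_memory, query, wants_memory)

print(proactive_memory.search_calls, reactive_memory.search_calls)
```

<p>Here the proactive variant searches memory twice and the reactive variant only once. If the stand-in decision function misjudges a query, the reactive variant skips a lookup that would have been needed, which mirrors the risk described above.</p>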
</section>
</section>
<section id="agent-quality" class="level2">
<h2 class="anchored" data-anchor-id="agent-quality">Agent Quality</h2>
<p>AI agents are inherently non-deterministic, which makes them unpredictable and difficult to evaluate in traditional ways. Traditional quality assurance practices, such as unit tests, were built for deterministic systems. But an agent can pass all of your unit tests and still fail in production due to wrong decision-making. Therefore, quality assurance in agent systems cannot be treated like a final testing stage but has to be treated as an architectural pillar.</p>
<section id="agent-observability" class="level3">
<h3 class="anchored" data-anchor-id="agent-observability">Agent Observability</h3>
<p>Without agent observability, you are not able to judge the agent’s decision-making process. Agent observability is <em>reactive</em>: it is most helpful after an error has occurred because it provides the information you need to debug what went wrong.</p>
<p>The foundational pillars of agent observability are:</p>
<ul>
<li><strong>Logs tell us what happened:</strong> These are atomic events, such as “I was asked a question”, “I decided to use the vector search tool”, and “Vector search failed”.</li>
<li><strong>Traces tell us why something happened:</strong> They reveal a causal relationship between isolated logs, such as “User Query -&gt; Vector search (failed) -&gt; LLM Error (confused by bad tool output) -&gt; Wrong final answer”</li>
<li><strong>Metrics tell us how well the overall system performed:</strong> These can be <em>system metrics</em>, such as performance (latency, error rate), cost (tokens per task, API cost per run), and effectiveness (task completion rate, tool usage frequency), or <em>quality metrics</em>, such as correctness, accuracy, trajectory adherence, safety and responsibility, helpfulness and relevance.</li>
</ul>
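<p>The relationship between the three pillars can be sketched in a few lines of plain Python. The log records, field names (<code>trace_id</code>, <code>step</code>, <code>ok</code>, <code>latency_ms</code>), and helper functions below are made up for illustration and are not the output format of any particular tool.</p>

```python
from collections import defaultdict

# Logs: atomic events, each tagged with the trace (request) it belongs to.
logs = [
    {"trace_id": "t1", "step": "user_query", "ok": True, "latency_ms": 5},
    {"trace_id": "t1", "step": "vector_search", "ok": False, "latency_ms": 120},
    {"trace_id": "t1", "step": "llm_call", "ok": True, "latency_ms": 800},
    {"trace_id": "t2", "step": "user_query", "ok": True, "latency_ms": 4},
    {"trace_id": "t2", "step": "llm_call", "ok": True, "latency_ms": 650},
]


def build_traces(logs):
    # Traces: group logs by trace ID, preserving order, to expose the causal chain.
    traces = defaultdict(list)
    for entry in logs:
        traces[entry["trace_id"]].append(entry)
    return dict(traces)


def compute_metrics(traces):
    # Metrics: aggregate numbers over all traces.
    failed = [tid for tid, steps in traces.items() if not all(s["ok"] for s in steps)]
    total_latency = {tid: sum(s["latency_ms"] for s in steps) for tid, steps in traces.items()}
    return {
        "error_rate": len(failed) / len(traces),
        "avg_latency_ms": sum(total_latency.values()) / len(traces),
    }


traces = build_traces(logs)
metrics = compute_metrics(traces)

for trace_id, steps in traces.items():
    chain = " -> ".join(s["step"] + ("" if s["ok"] else " (failed)") for s in steps)
    print(trace_id, chain)
print(metrics)
```

<p>The individual log entries say what happened, the grouped trace shows that the failed vector search preceded the LLM call in the same request, and the metrics summarize how the system behaved across both requests.</p>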
<p>For development debugging, you can use the ADK Web UI. However, for production observability, you can use the built-in <code>LoggingPlugin()</code>, which automatically captures all agent activity:</p>
<div id="cell-47" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="0e59c699-d36f-4986-e1ad-de0d724077b2" data-execution_count="33">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb32" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb32-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.plugins.logging_plugin <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> LoggingPlugin</span>
<span id="cb32-2"></span>
<span id="cb32-3">root_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb32-4">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>,</span>
<span id="cb32-5">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>Gemini(</span>
<span id="cb32-6">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>MODEL_NAME,</span>
<span id="cb32-7">        retry_options<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>retry_config</span>
<span id="cb32-8">    ),</span>
<span id="cb32-9">    description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"A simple agent that can answer general questions."</span>,</span>
<span id="cb32-10">    instruction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""You are a helpful assistant.</span></span>
<span id="cb32-11"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    Use Google Search for current info or if unsure."""</span>,</span>
<span id="cb32-12">    tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[google_search],</span>
<span id="cb32-13">)</span>
<span id="cb32-14"></span>
<span id="cb32-15">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> InMemoryRunner(</span>
<span id="cb32-16">    agent<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>root_agent,</span>
<span id="cb32-17">    plugins<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb32-18">        LoggingPlugin()  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Handles standard Observability logging across ALL agents</span></span>
<span id="cb32-19">    ],</span>
<span id="cb32-20">)</span>
<span id="cb32-21"></span>
<span id="cb32-22">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> runner.run_debug(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"When was the Kaggle 5-Day AI Agents Intensive Course happening?"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<div class="ansi-escaped-output">
<pre> ### Created new session: debug_session_id



User &gt; When was the Kaggle 5-Day AI Agents Intensive Course happening?

<span class="ansi-bright-black-fg">[logging_plugin] 🚀 USER MESSAGE RECEIVED</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Invocation ID: e-7bf69fd2-6e96-4ada-a35d-3f323a36f606</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Session ID: debug_session_id</span>

<span class="ansi-bright-black-fg">[logging_plugin]    User ID: debug_user_id</span>

<span class="ansi-bright-black-fg">[logging_plugin]    App Name: InMemoryRunner</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Root Agent: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin]    User Content: text: 'When was the Kaggle 5-Day AI Agents Intensive Course happening?'</span>

<span class="ansi-bright-black-fg">[logging_plugin] 🏃 INVOCATION STARTING</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Invocation ID: e-7bf69fd2-6e96-4ada-a35d-3f323a36f606</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Starting Agent: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin] 🤖 AGENT STARTING</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Agent Name: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Invocation ID: e-7bf69fd2-6e96-4ada-a35d-3f323a36f606</span>

<span class="ansi-bright-black-fg">[logging_plugin] 🧠 LLM REQUEST</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Model: gemini-2.5-flash-lite</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Agent: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin]    System Instruction: 'You are a helpful assistant.

    Use Google Search for current info or if unsure.



You are an agent. Your internal name is "assistant". The description about you is "A simple agent that can answer gen...'</span>

<span class="ansi-bright-black-fg">[logging_plugin] 🧠 LLM RESPONSE</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Agent: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Content: text: 'The Kaggle 5-Day AI Agents Intensive Course was happening from November 10th to November 14th, 2025.'</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Token Usage - Input: 63, Output: 55</span>

<span class="ansi-bright-black-fg">[logging_plugin] 📢 EVENT YIELDED</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Event ID: d46541c5-8037-442f-b8c4-9132e78ca3e8</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Author: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Content: text: 'The Kaggle 5-Day AI Agents Intensive Course was happening from November 10th to November 14th, 2025.'</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Final Response: True</span>

assistant &gt; The Kaggle 5-Day AI Agents Intensive Course was happening from November 10th to November 14th, 2025.

<span class="ansi-bright-black-fg">[logging_plugin] 🤖 AGENT COMPLETED</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Agent Name: assistant</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Invocation ID: e-7bf69fd2-6e96-4ada-a35d-3f323a36f606</span>

<span class="ansi-bright-black-fg">[logging_plugin] ✅ INVOCATION COMPLETED</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Invocation ID: e-7bf69fd2-6e96-4ada-a35d-3f323a36f606</span>

<span class="ansi-bright-black-fg">[logging_plugin]    Final Agent: assistant</span>
</pre>
</div>
</div>
</div>
</section>
<section id="agent-evaluation" class="level3">
<h3 class="anchored" data-anchor-id="agent-evaluation">Agent Evaluation</h3>
<p>Agent evaluation is the process of evaluating how well an AI agent performs on a task, including its decision-making process. That means, you want to evaluate the agent on two aspects:</p>
<ul>
<li><strong>Output (end-to-end) evaluation</strong> is done via the <strong>“outside-in” view</strong>: For example, how similar is the agent’s response to the expected response? Example metrics are task success rate, user satisfaction, and overall quality.</li>
<li><strong>Process evaluation</strong> is done via the <strong>“inside-out” view</strong>: For example, did the agent approach the task correctly (planning, tool usage with correct parameters, tool response interpretation, etc.)?</li>
</ul>
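<p>The two views can be sketched as two small scoring functions. This is a simplified illustration, not the evaluation method from the course or from ADK: the function names, the use of <code>difflib</code> string similarity as the output metric, and the position-by-position comparison of tool calls are all assumptions.</p>

```python
from difflib import SequenceMatcher


def output_score(actual, expected):
    # Outside-in: similarity between final and expected response (0.0 to 1.0).
    return SequenceMatcher(None, actual.lower(), expected.lower()).ratio()


def trajectory_score(actual_calls, expected_calls):
    # Inside-out: fraction of tool calls (name and arguments) that match
    # the expected trajectory, compared position by position.
    matches = sum(1 for a, e in zip(actual_calls, expected_calls) if a == e)
    return matches / max(len(actual_calls), len(expected_calls), 1)


expected_response = "The course took place from November 10th to November 14th, 2025."
actual_response = "The course took place from November 10th to November 14th, 2025."

# Hypothetical expected trajectory: the agent should have searched first.
expected_calls = [("google_search", {"query": "Kaggle 5-Day AI Agents Intensive Course dates"})]
actual_calls = []  # The agent answered without calling any tool.

print("output score:", output_score(actual_response, expected_response))
print("trajectory score:", trajectory_score(actual_calls, expected_calls))
```

<p>In this example the response is a perfect match, but the trajectory score is zero because the expected tool call never happened. An output-only evaluation would miss this.</p>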
<p>Agent evaluation is <em>proactive</em> because by evaluating your agent’s performance regularly, you are able to detect any performance degradation early on.</p>
<p>This is quite a complex topic, and I recommend a deep dive into the <a href="https://www.kaggle.com/whitepaper-agent-quality">original whitepaper</a>.</p>
</section>
</section>
<section id="multi-agent-systems" class="level2">
<h2 class="anchored" data-anchor-id="multi-agent-systems">Multi-agent systems</h2>
<p>So far, we’ve only looked at a single agent. However, instead of a single “monolithic” agent, you can also build a multi-agent system of specialized agents.</p>
<section id="multi-agent-patterns" class="level3">
<h3 class="anchored" data-anchor-id="multi-agent-patterns">Multi-agent patterns</h3>
<p>You can combine multiple agents in different patterns depending on your use case.</p>
<ul>
<li><strong>LLM-based (Sub-agents)</strong>: Use when the agent can dynamically orchestrate sub-agents on its own.</li>
<li><strong>Sequential</strong>: Use when deterministic order is important in a linear workflow.</li>
<li><strong>Parallel</strong>: Use when you have independent tasks and speed is important.</li>
<li><strong>Loop</strong>: Use when you need iterative improvement through repeated cycles.</li>
</ul>
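<p>The control flow behind the three workflow patterns can be sketched with plain <code>asyncio</code>. This is not ADK code: the “agents” here are hypothetical async functions that only append a tag to their input, and the helper names <code>sequential</code>, <code>parallel</code>, and <code>loop</code> are made up for illustration. The LLM-based pattern is left out because it needs a model to decide which sub-agent to call.</p>

```python
import asyncio


async def researcher(text):
    # Hypothetical agent that "researches" its input.
    return text + " [researched]"


async def writer(text):
    # Hypothetical agent that "writes" based on its input.
    return text + " [written]"


async def sequential(agents, task):
    # Sequential: each agent receives the previous agent's output.
    for agent in agents:
        task = await agent(task)
    return task


async def parallel(agents, task):
    # Parallel: independent agents run concurrently on the same input.
    return await asyncio.gather(*(agent(task) for agent in agents))


async def loop(agent, task, is_done, max_iterations=3):
    # Loop: repeat until the result is good enough or the budget is used up.
    for _ in range(max_iterations):
        task = await agent(task)
        if is_done(task):
            break
    return task


async def main():
    seq = await sequential([researcher, writer], "topic")
    par = await parallel([researcher, writer], "topic")
    looped = await loop(writer, "draft", lambda t: t.count("[written]") >= 2)
    return seq, par, looped


seq, par, looped = asyncio.run(main())
print(seq)
print(par)
print(looped)
```

<p>The sequential run chains both tags onto one result, the parallel run returns one result per agent, and the loop stops after the second iteration because the stopping condition is met before the iteration budget is exhausted.</p>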
</section>
<section id="agent2agent-protocol" class="level3">
<h3 class="anchored" data-anchor-id="agent2agent-protocol">Agent2Agent Protocol</h3>
<p>When building multi-agent systems, you might want to integrate an agent that’s not part of your project. For example, different agents can be created using different frameworks, such as CrewAI or LangGraph (<em>cross-framework</em>). Or agents can be implemented in different programming languages, such as Python or Java (<em>cross-language</em>). And finally, you might want to integrate an agent from an external vendor (<em>cross-organization</em>). For these purposes, it is helpful to have a standardized communication protocol, such as the <a href="https://a2a-protocol.org/"><strong>Agent2Agent (A2A) protocol</strong></a>.</p>
<p>If you want to expose your agent and make it accessible to other agents, you can use ADK’s <code>to_a2a()</code> function, as follows:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb33" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb33-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.a2a.utils.agent_to_a2a <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> to_a2a</span>
<span id="cb33-2"></span>
<span id="cb33-3">my_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(</span>
<span id="cb33-4">    ...</span>
<span id="cb33-5">)</span>
<span id="cb33-6"></span>
<span id="cb33-7">public_agent_app <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> to_a2a(</span>
<span id="cb33-8">    my_agent,</span>
<span id="cb33-9">    port<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8001</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Port where the agent will be served</span></span>
<span id="cb33-10">)</span></code></pre></div></div>
<p>If you want to consume a remote agent, you can use the <code>RemoteA2aAgent</code> class, as follows:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb34" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb34-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.adk.agents.remote_a2a_agent <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> (</span>
<span id="cb34-2">    RemoteA2aAgent,</span>
<span id="cb34-3">    AGENT_CARD_WELL_KNOWN_PATH,</span>
<span id="cb34-4">)</span>
<span id="cb34-5"></span>
<span id="cb34-6">remote_product_catalog_agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> RemoteA2aAgent(</span>
<span id="cb34-7">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"product_catalog_agent"</span>,</span>
<span id="cb34-8">    description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Remote product catalog agent from external vendor that provides product information."</span>,</span>
<span id="cb34-9">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Point to the agent card URL - this is where the A2A protocol metadata lives</span></span>
<span id="cb34-10">    agent_card<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"http://localhost:8001</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>AGENT_CARD_WELL_KNOWN_PATH<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>,</span>
<span id="cb34-11">)</span></code></pre></div></div>
</section>
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>This article summarizes what I learned in Kaggle’s <a href="https://www.kaggle.com/learn-guide/5-day-agents">5-Day AI Agents Intensive Course</a>. The course started with the fundamentals of building a first simple agent with Google’s ADK Python library and then covered its core concepts: different types of tools (including MCP tools) and memory management via sessions (short-term) and memory (long-term). I also learned that, because AI agents are non-deterministic, agent quality is a core pillar of agent architectures rather than just a testing stage as in traditional quality assurance. Finally, the course touched on different patterns for multi-agent systems and how you can use the A2A protocol to allow collaboration between agents across different languages, frameworks, and organizations.</p>
<p>If you are interested in the details of any section of this blog, I recommend having a look at the free <a href="https://www.kaggle.com/learn-guide/5-day-agents">course materials</a>.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<ul>
<li><a href="https://www.kaggle.com/whitepaper-introduction-to-agents">Introduction to Agents Whitepaper</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-1a-from-prompt-to-action">Kaggle Notebook: From Prompt to Action</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-1b-agent-architectures">Kaggle Notebook: Agent Architectures</a></li>
<li><a href="https://www.kaggle.com/whitepaper-agent-tools-and-interoperability-with-mcp">Agent Tools &amp; Interoperability with MCP Whitepaper</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-2a-agent-tools">Kaggle Notebook: Agent Tools</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-2b-agent-tools-best-practices">Kaggle Notebook: Agent Tools Best Practices</a></li>
<li><a href="https://www.kaggle.com/whitepaper-context-engineering-sessions-and-memory">Context Engineering: Sessions &amp; Memory Whitepaper</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-3a-agent-sessions">Kaggle Notebook: Agent Sessions</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-3b-agent-memory">Kaggle Notebook: Agent Memory</a></li>
<li><a href="https://www.kaggle.com/whitepaper-agent-quality">Agent Quality Whitepaper</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-4a-agent-observability">Kaggle Notebook: Agent Observability</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-4b-agent-evaluation">Kaggle Notebook: Agent Evaluation</a></li>
<li><a href="https://www.kaggle.com/whitepaper-prototype-to-production">Prototype to Production Whitepaper</a></li>
<li><a href="https://www.kaggle.com/code/kaggle5daysofai/day-5a-agent2agent-communication">Kaggle Notebook: Agent2Agent Communication</a></li>
</ul>


</section>

]]></description>
  <guid>https://www.leoniemonigatti.com/blog/building-ai-agents-with-google-adk.html</guid>
  <pubDate>Wed, 26 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Exploring Anthropic’s Memory Tool</title>
  <link>https://www.leoniemonigatti.com/blog/claude-memory-tool.html</link>
  <description><![CDATA[ 





<p><a href="https://platform.claude.com/docs/en/release-notes/overview#september-29-2025">On September 29, 2025, Anthropic launched the <strong>memory tool</strong> in beta</a>. This enables Claude-based agents to store and recall information across conversations.</p>
<p>This blog post explores the Claude Developer Platform’s memory tool by implementing a simple agent with memory using the Anthropic Python library. Inspired by <a href="https://x.com/sammcallister/status/1991551531142070757">Anthropic’s recent pop-up cafe in London</a>, we’re implementing a barista agent that can remember customers and their usual orders across different cafe visits.</p>
<section id="prerequisites" class="level2">
<h2 class="anchored" data-anchor-id="prerequisites">Prerequisites</h2>
<p>To follow along in this blog post, you will need to install the <code>anthropic</code> Python package (<code>v0.74.1</code>).</p>
<div id="cell-2" class="cell" data-execution_count="21">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%</span>pip install <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>q <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>U anthropic</span></code></pre></div></div>
</div>
<p>Additionally, you will need an <code>ANTHROPIC_API_KEY</code>, which you can obtain by creating an Anthropic account and navigating to the <a href="https://console.anthropic.com/settings/keys">“API Keys” tab in your dashboard</a>. Once you have your API key, you need to store it in an environment variable.</p>
<div id="cell-4" class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> os</span>
<span id="cb2-2">os.environ[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'ANTHROPIC_API_KEY'</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"your-anthropic-api-key"</span></span></code></pre></div></div>
</div>
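<p>If you’d rather not hard-code the key in a notebook cell, one option is to prompt for it only when it isn’t already set. The following is a minimal sketch and not part of the Anthropic SDK: the helper name <code>ensure_api_key</code> and its <code>prompt_fn</code> parameter are hypothetical and only serve this example.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode python"><code class="sourceCode python">import os
from getpass import getpass

def ensure_api_key(env_var="ANTHROPIC_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, prompting for it only if it is missing."""
    if not os.environ.get(env_var):
        # getpass hides the typed key so it doesn't end up in the notebook output
        os.environ[env_var] = prompt_fn(f"Enter {env_var}: ")
    return os.environ[env_var]</code></pre></div>
<p>Calling <code>ensure_api_key()</code> once before creating the client leaves an existing environment variable untouched.</p>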
</section>
<section id="why-do-agents-need-memory" class="level2">
<h2 class="anchored" data-anchor-id="why-do-agents-need-memory">Why do agents need memory?</h2>
<p>Without memory, agents can’t remember information across different conversations, tasks, or projects. Imagine frequently visiting the same coffee shop in your neighbourhood. You’d expect an engaged barista to eventually remember who you are and what your usual order is.</p>
<p><a href="../blog/memory-in-ai-agents.html">Memory in AI agents</a> allows agents to do exactly that. By storing and recalling information from past conversations, they can learn from past interactions and build knowledge bases over time to improve workflows and user experiences.</p>
<p>Let’s demonstrate this by first implementing a simple <strong>agent without memory</strong>. (If you’re unfamiliar with implementing an agent without an orchestration framework, you can review my blog post on <a href="https://www.leoniemonigatti.com/blog/ai-agent-from-scratch-in-python.html">how to build an AI agent from scratch using Claude</a>.)</p>
<p>The code below implements a barista agent as an LLM in a loop: it takes the user’s input and responds based on the system prompt and the history of the current conversation. The user can exit the conversation by typing the <code>/quit</code> command.</p>
<div id="cell-6" class="cell">
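<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> anthropic <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Anthropic
<span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> anthropic.types.beta <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> BetaMessageParam</span>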
<span id="cb3-2"></span>
<span id="cb3-3">SYSTEM_PROMPT <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""You're a friendly barista at Anthropic's pop-up cafe.</span></span>
<span id="cb3-4"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">Respond like an efficient barista during rush hour - friendly but brief.</span></span>
<span id="cb3-5"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">Always get the name of the customer to write on the cup so that orders don't get mixed up."""</span></span>
<span id="cb3-6"></span>
<span id="cb3-7"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> conversation_without_memory():</span>
<span id="cb3-8"></span>
<span id="cb3-9">    client <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Anthropic()</span>
<span id="cb3-10">    messages: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>[BetaMessageParam] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb3-11"></span>
<span id="cb3-12">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>:</span>
<span id="cb3-13">        user_input <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">input</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">You: "</span>).strip()</span>
<span id="cb3-14"></span>
<span id="cb3-15">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> user_input.lower() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/quit"</span>:</span>
<span id="cb3-16">          <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Goodbye!"</span>)</span>
<span id="cb3-17">          <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">break</span></span>
<span id="cb3-18"></span>
<span id="cb3-19">        messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: user_input})</span>
<span id="cb3-20"></span>
<span id="cb3-21">        response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.beta.messages.create(</span>
<span id="cb3-22">                model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"claude-sonnet-4-20250514"</span>,</span>
<span id="cb3-23">                max_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2048</span>,</span>
<span id="cb3-24">                system <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> SYSTEM_PROMPT,</span>
<span id="cb3-25">                messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>messages,</span>
<span id="cb3-26">        )</span>
<span id="cb3-27"></span>
<span id="cb3-28">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> content <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.content:</span>
<span id="cb3-29">          <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">Claude: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>text<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>, end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, flush<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb3-30"></span>
<span id="cb3-31">          <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Store assistant message</span></span>
<span id="cb3-32">          messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: content.text})</span></code></pre></div></div>
</div>
<p>Let’s start a conversation and order a coffee.</p>
<div id="cell-8" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="1b89bde8-074e-47fb-cd9f-642e22bb4bd3" data-execution_count="24">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1">conversation_without_memory()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
You: Hi, can I please get a regular flat white with oatmilk please?

Claude: Hey there! Absolutely, one regular oat milk flat white coming right up. 

Can I get a name for the cup?
You: Claudia

Claude: Perfect, Claudia! One regular oat milk flat white for you. That'll be $4.50. 

*starts writing "Claudia" on cup*

Will that be for here or to go?
You: /quit
Goodbye!</code></pre>
</div>
</div>
<p>Good, the barista agent works.</p>
<p>Now, let’s start a new conversation, visit the coffee shop again, and <strong>order our usual</strong>. Note that starting a new conversation clears the conversation history from the previous conversation. This is like leaving the coffee shop after receiving your order and coming back the next day.</p>
<div id="cell-10" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="efcaa7cf-27bc-4aed-94a7-e4d1a855413c" data-execution_count="25">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1">conversation_without_memory()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
You: Hi, it's me Claudia again. Can I please get my usual?

Claude: Hi there! I'm sorry, but I don't actually have a record of previous orders - each conversation is fresh for me. Could you remind me what your usual is? And just to confirm, that's Claudia for the cup, right?

*grabs cup and marker, ready to write*

What can I get started for you today?
You: /quit
Goodbye!</code></pre>
</div>
</div>
<p>Unfortunately, the barista agent remembers neither us nor our usual order. As you can see, without memory that carries over between conversations, interactions feel impersonal and leave the user with a bad experience.</p>
</section>
<section id="how-to-use-the-memory-tool-with-claude" class="level2">
<h2 class="anchored" data-anchor-id="how-to-use-the-memory-tool-with-claude">How to use the memory tool with Claude</h2>
<!-- How memory works in Claude -->
<p>In contrast to memory implementations by other providers, Anthropic’s memory tool enables Claude to store and retrieve information through a <strong>memory file directory</strong> (<code>/memories</code>) that persists between sessions, <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool">according to the Claude Docs</a>. The agent can create, read, update, and delete files in this directory.</p>
<p>You can enable the memory tool using the Anthropic SDK in just two steps. First, you implement the client-side handlers to control where and how the information is stored. Then, you include the beta header <code>context-management-2025-06-27</code> and the memory tool in your API requests.</p>
<section id="step-1-implement-client-side-handlers-for-memory-operations" class="level3">
<h3 class="anchored" data-anchor-id="step-1-implement-client-side-handlers-for-memory-operations">Step 1: Implement client-side handlers for memory operations</h3>
<p>According to <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool">Claude’s developer documentation</a>, the memory tool operates client-side. This means that the agent makes tool calls to perform memory operations and your application executes them locally. This gives developers control over where and how the memory is stored (e.g., in files or in a database).</p>
<p>To implement the client-side handlers for the memory operations you can subclass <code>BetaAbstractMemoryTool</code> and implement the handlers for each of the following six memory commands:</p>
<ul>
<li><code>view</code>: Show directory contents or file contents</li>
<li><code>create</code>: Create a file</li>
<li><code>str_replace</code>: Replace text in a file</li>
<li><code>insert</code>: Insert text at a specific line in a file</li>
<li><code>delete</code>: Delete a file or directory</li>
<li><code>rename</code>: Rename a file or directory</li>
</ul>
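<p>Because the handlers run in your application, the backing store doesn’t have to be a real filesystem. As a minimal sketch of what the six commands boil down to, the class below keeps “files” in a plain dictionary. It is a simplified stand-in and not the SDK’s interface: the class name <code>InMemoryMemoryStore</code>, its method signatures, and its return strings are assumptions for illustration, and it skips directories and path validation.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode python"><code class="sourceCode python">class InMemoryMemoryStore:
    """Toy store that maps paths such as '/memories/claudia.md' to file contents."""

    def __init__(self):
        self.files = {}

    def view(self, path):
        # A known file returns its contents; anything else is treated as a directory listing
        if path in self.files:
            return self.files[path]
        prefix = path.rstrip("/") + "/"
        return "\n".join(sorted(p for p in self.files if p.startswith(prefix)))

    def create(self, path, file_text):
        self.files[path] = file_text
        return f"File created at {path}"

    def str_replace(self, path, old_str, new_str):
        # Refuse ambiguous edits: the text to replace must occur exactly once
        if self.files[path].count(old_str) != 1:
            raise ValueError("old_str must occur exactly once")
        self.files[path] = self.files[path].replace(old_str, new_str)
        return f"File {path} edited"

    def insert(self, path, insert_line, insert_text):
        # insert_line=0 puts the text before the first line
        lines = self.files[path].splitlines()
        lines.insert(insert_line, insert_text)
        self.files[path] = "\n".join(lines)
        return f"Text inserted in {path}"

    def delete(self, path):
        del self.files[path]
        return f"Deleted {path}"

    def rename(self, old_path, new_path):
        self.files[new_path] = self.files.pop(old_path)
        return f"Renamed {old_path} to {new_path}"


store = InMemoryMemoryStore()
store.create("/memories/claudia.md", "Name: Claudia\nUsual: flat white")
store.str_replace("/memories/claudia.md", "flat white", "oat milk flat white")
print(store.view("/memories/claudia.md"))</code></pre></div>
<p>Swapping the dictionary for a database table or an object store would follow the same pattern, which is the flexibility the client-side design is meant to offer.</p>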
<p>Note that the following implementation is copied from the <a href="https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py">example script for the memory tool</a>.</p>
<div id="cell-14" class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> shutil</span>
<span id="cb8-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> List</span>
<span id="cb8-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> pathlib <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Path</span>
<span id="cb8-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing_extensions <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> override</span>
<span id="cb8-5"></span>
<span id="cb8-6"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> anthropic.lib.tools <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> BetaAbstractMemoryTool</span>
<span id="cb8-7"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> anthropic.types.beta <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> (</span>
<span id="cb8-8">    BetaMessageParam,</span>
<span id="cb8-9">    BetaContentBlockParam,</span>
<span id="cb8-10">    BetaMemoryTool20250818Command,</span>
<span id="cb8-11">    BetaContextManagementConfigParam,</span>
<span id="cb8-12">    BetaMemoryTool20250818ViewCommand,</span>
<span id="cb8-13">    BetaMemoryTool20250818CreateCommand,</span>
<span id="cb8-14">    BetaMemoryTool20250818DeleteCommand,</span>
<span id="cb8-15">    BetaMemoryTool20250818InsertCommand,</span>
<span id="cb8-16">    BetaMemoryTool20250818RenameCommand,</span>
<span id="cb8-17">    BetaMemoryTool20250818StrReplaceCommand,</span>
<span id="cb8-18">)</span>
<span id="cb8-19"></span>
<span id="cb8-20"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> LocalFilesystemMemoryTool(BetaAbstractMemoryTool):</span>
<span id="cb8-21">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""File-based memory storage implementation for Claude conversations"""</span></span>
<span id="cb8-22"></span>
<span id="cb8-23">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, base_path: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"./memory"</span>):</span>
<span id="cb8-24">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb8-25">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.base_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Path(base_path)</span>
<span id="cb8-26">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.base_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"memories"</span></span>
<span id="cb8-27">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root.mkdir(parents<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, exist_ok<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb8-28"></span>
<span id="cb8-29">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _validate_path(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, path: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> Path:</span>
<span id="cb8-30">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Validate and resolve memory paths"""</span></span>
<span id="cb8-31">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> path.startswith(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/memories"</span>):</span>
<span id="cb8-32">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Path must start with /memories, got: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-33"></span>
<span id="cb8-34">        relative_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> path[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/memories"</span>) :].lstrip(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/"</span>)</span>
<span id="cb8-35">        full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> relative_path <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> relative_path <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root</span>
<span id="cb8-36"></span>
<span id="cb8-37">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">try</span>:</span>
<span id="cb8-38">            full_path.resolve().relative_to(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root.resolve())</span>
<span id="cb8-39">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">except</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span> <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> e:</span>
<span id="cb8-40">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Path </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> would escape /memories directory"</span>) <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> e</span>
<span id="cb8-41"></span>
<span id="cb8-42">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> full_path</span>
<span id="cb8-43"></span>
<span id="cb8-44">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-45">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> view(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, command: BetaMemoryTool20250818ViewCommand) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-46">        full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.path)</span>
<span id="cb8-47"></span>
<span id="cb8-48">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> full_path.is_dir():</span>
<span id="cb8-49">            items: List[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb8-50">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">try</span>:</span>
<span id="cb8-51">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> item <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sorted</span>(full_path.iterdir()):</span>
<span id="cb8-52">                    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> item.name.startswith(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"."</span>):</span>
<span id="cb8-53">                        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">continue</span></span>
<span id="cb8-54">                    items.append(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>item<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>name<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">/"</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> item.is_dir() <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> item.name)</span>
<span id="cb8-55">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Directory: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">\n"</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.join([<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>item<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> item <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> items])</span>
<span id="cb8-56">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">except</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">Exception</span> <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> e:</span>
<span id="cb8-57">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">RuntimeError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Cannot read directory </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>e<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>) <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> e</span>
<span id="cb8-58"></span>
<span id="cb8-59">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> full_path.is_file():</span>
<span id="cb8-60">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">try</span>:</span>
<span id="cb8-61">                content <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> full_path.read_text(encoding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"utf-8"</span>)</span>
<span id="cb8-62">                lines <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> content.splitlines()</span>
<span id="cb8-63">                view_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> command.view_range</span>
<span id="cb8-64">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> view_range:</span>
<span id="cb8-65">                    start_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, view_range[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb8-66">                    end_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(lines) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> view_range[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> view_range[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb8-67">                    lines <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> lines[start_line:end_line]</span>
<span id="cb8-68">                    start_num <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> start_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb8-69">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb8-70">                    start_num <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb8-71"></span>
<span id="cb8-72">                numbered_lines <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>i <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> start_num<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:4d}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>line<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i, line <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">enumerate</span>(lines)]</span>
<span id="cb8-73">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.join(numbered_lines)</span>
<span id="cb8-74">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">except</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">Exception</span> <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> e:</span>
<span id="cb8-75">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">RuntimeError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Cannot read file </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>e<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>) <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> e</span>
<span id="cb8-76">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb8-77">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">RuntimeError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Path not found: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-78"></span>
<span id="cb8-79">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-80">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> create(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, command: BetaMemoryTool20250818CreateCommand) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-81">        full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.path)</span>
<span id="cb8-82">        full_path.parent.mkdir(parents<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, exist_ok<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb8-83">        full_path.write_text(command.file_text, encoding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"utf-8"</span>)</span>
<span id="cb8-84">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"File created successfully at </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb8-85"></span>
<span id="cb8-86">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-87">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> str_replace(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, command: BetaMemoryTool20250818StrReplaceCommand) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-88">        full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.path)</span>
<span id="cb8-89"></span>
<span id="cb8-90">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> full_path.is_file():</span>
<span id="cb8-91">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">FileNotFoundError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"File not found: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-92"></span>
<span id="cb8-93">        content <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> full_path.read_text(encoding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"utf-8"</span>)</span>
<span id="cb8-94">        count <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> content.count(command.old_str)</span>
<span id="cb8-95">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> count <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>:</span>
<span id="cb8-96">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Text not found in </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-97">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> count <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:</span>
<span id="cb8-98">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Text appears </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>count<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> times in </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">. Must be unique."</span>)</span>
<span id="cb8-99"></span>
<span id="cb8-100">        new_content <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> content.replace(command.old_str, command.new_str)</span>
<span id="cb8-101">        full_path.write_text(new_content, encoding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"utf-8"</span>)</span>
<span id="cb8-102">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"File </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> has been edited"</span></span>
<span id="cb8-103"></span>
<span id="cb8-104">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-105">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> insert(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, command: BetaMemoryTool20250818InsertCommand) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-106">        full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.path)</span>
<span id="cb8-107">        insert_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> command.insert_line</span>
<span id="cb8-108">        insert_text <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> command.insert_text</span>
<span id="cb8-109"></span>
<span id="cb8-110">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> full_path.is_file():</span>
<span id="cb8-111">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">FileNotFoundError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"File not found: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-112"></span>
<span id="cb8-113">        lines <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> full_path.read_text(encoding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"utf-8"</span>).splitlines()</span>
<span id="cb8-114">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> insert_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">or</span> insert_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(lines):</span>
<span id="cb8-115">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Invalid insert_line </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>insert_line<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">. Must be 0-</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(lines)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-116"></span>
<span id="cb8-117">        lines.insert(insert_line, insert_text.rstrip(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>))</span>
<span id="cb8-118">        full_path.write_text(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.join(lines) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>, encoding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"utf-8"</span>)</span>
<span id="cb8-119">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Text inserted at line </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>insert_line<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> in </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb8-120"></span>
<span id="cb8-121">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-122">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> delete(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, command: BetaMemoryTool20250818DeleteCommand) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-123">        full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.path)</span>
<span id="cb8-124"></span>
<span id="cb8-125">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
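font-style: inherit;">if</span> command.path.rstrip(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/"</span>) <span class="op" style="color: #5E5E5E;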
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/memories"</span>:</span>
<span id="cb8-126">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Cannot delete the /memories directory itself"</span>)</span>
<span id="cb8-127"></span>
<span id="cb8-128">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> full_path.is_file():</span>
<span id="cb8-129">            full_path.unlink()</span>
<span id="cb8-130">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"File deleted: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb8-131">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> full_path.is_dir():</span>
<span id="cb8-132">            shutil.rmtree(full_path)</span>
<span id="cb8-133">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Directory deleted: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb8-134">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb8-135">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">FileNotFoundError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Path not found: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-136"></span>
<span id="cb8-137">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-138">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> rename(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, command: BetaMemoryTool20250818RenameCommand) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-139">        old_full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.old_path)</span>
<span id="cb8-140">        new_full_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._validate_path(command.new_path)</span>
<span id="cb8-141"></span>
<span id="cb8-142">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> old_full_path.exists():</span>
<span id="cb8-143">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">FileNotFoundError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Source path not found: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>old_path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-144">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> new_full_path.exists():</span>
<span id="cb8-145">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">ValueError</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Destination already exists: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>new_path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-146"></span>
<span id="cb8-147">        new_full_path.parent.mkdir(parents<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, exist_ok<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb8-148">        old_full_path.rename(new_full_path)</span>
<span id="cb8-149">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Renamed </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>old_path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> to </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>command<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>new_path<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb8-150"></span>
<span id="cb8-151">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@override</span></span>
<span id="cb8-152">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> clear_all_memory(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb8-153">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Override the base implementation to provide file system clearing."""</span></span>
<span id="cb8-154">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root.exists():</span>
<span id="cb8-155">            shutil.rmtree(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root)</span>
<span id="cb8-156">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.memory_root.mkdir(parents<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, exist_ok<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb8-157">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"All memory cleared"</span></span></code></pre></div></div>
</div>
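<p>The checks in <code>rename</code> above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper <code>safe_rename</code> that is not part of the Anthropic SDK and that skips the <code>_validate_path</code> step, so it should only be pointed at a throwaway directory:</p>

```python
import tempfile
from pathlib import Path


def safe_rename(root: Path, old: str, new: str) -> str:
    """Hypothetical helper that mirrors the checks in rename() above."""
    old_path = root / old
    new_path = root / new

    # Refuse to rename something that does not exist
    if not old_path.exists():
        raise FileNotFoundError(f"Source path not found: {old}")
    # Refuse to silently overwrite an existing memory file
    if new_path.exists():
        raise ValueError(f"Destination already exists: {new}")

    # Create missing parent directories before moving the file
    new_path.parent.mkdir(parents=True, exist_ok=True)
    old_path.rename(new_path)
    return f"Renamed {old} to {new}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "customers.txt").write_text("Claudia: flat white")

    # First rename succeeds and creates the "archive" directory on the way
    rename_message = safe_rename(root, "customers.txt", "archive/claudia.txt")
    moved_exists = (root / "archive" / "claudia.txt").exists()

    # A second file with the same destination is rejected instead of overwriting
    (root / "customers.txt").write_text("new entry")
    try:
        safe_rename(root, "customers.txt", "archive/claudia.txt")
        overwrite_error = None
    except ValueError as err:
        overwrite_error = str(err)

print(rename_message)
print(overwrite_error)
```

<p>Raising an error on an existing destination, rather than overwriting it, means the agent has to decide explicitly what to do with the older memory file.</p>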
</section>
<section id="step-2-adjust-api-requests-for-memory-tool" class="level3">
<h3 class="anchored" data-anchor-id="step-2-adjust-api-requests-for-memory-tool">Step 2: Adjust API requests for memory tool</h3>
<p>Next, we need to adjust the API request.</p>
<ol type="1">
<li>Add the beta header <code>context-management-2025-06-27</code> in the API request</li>
<li>Add a system prompt for memory handling (<code>MEMORY_SYSTEM_PROMPT</code>)</li>
<li>Add the memory tool to the API request</li>
<li>Because we’re now using the memory tool, replace the <code>create</code> method with the <code>tool_runner</code> (in beta), <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/implement-tool-use#tool-runner-beta">an out-of-the-box solution that executes tools for you, so you don’t have to handle tool calls, tool results, and conversation management manually</a></li>
</ol>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1">MEMORY_SYSTEM_PROMPT <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""- **DO NOT just store the conversation history**</span></span>
<span id="cb9-2"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - ...</span></span>
<span id="cb9-3"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - Use a simple list format."""</span></span>
<span id="cb9-4"></span>
<span id="cb9-5">memory <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> LocalFilesystemMemoryTool()</span>
<span id="cb9-6"></span>
<span id="cb9-7">runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.beta.messages.tool_runner(</span>
<span id="cb9-8">        betas<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"context-management-2025-06-27"</span>],</span>
<span id="cb9-9">        model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>...,</span>
<span id="cb9-10">        max_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>...,</span>
<span id="cb9-11">        system<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>SYSTEM_PROMPT <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> MEMORY_SYSTEM_PROMPT,</span>
<span id="cb9-12">        messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>...,</span>
<span id="cb9-13">        tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[memory],</span>
<span id="cb9-14">)</span></code></pre></div></div>
<p>When we put everything together in the agent’s conversation loop, the code looks as follows:</p>
<div id="cell-16" class="cell" data-execution_count="28">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1">MEMORY_SYSTEM_PROMPT <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""- **DO NOT just store the conversation history**</span></span>
<span id="cb10-2"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - No need to mention your memory tool or what you are writing in it to the user, unless they ask</span></span>
<span id="cb10-3"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - Store facts about the customer, and their order. Do not store information about the order process or status.</span></span>
<span id="cb10-4"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - Before responding, check memory to adjust technical depth and response style appropriately</span></span>
<span id="cb10-5"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - Keep memories up-to-date - remove outdated info, add new details as you learn them</span></span>
<span id="cb10-6"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        - Use a simple list format."""</span></span>
<span id="cb10-7"></span>
<span id="cb10-8"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> conversation_with_memory():</span>
<span id="cb10-9"></span>
<span id="cb10-10">    client <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Anthropic()</span>
<span id="cb10-11">    memory <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> LocalFilesystemMemoryTool()</span>
<span id="cb10-12">    messages: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>[BetaMessageParam] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb10-13"></span>
<span id="cb10-14">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>:</span>
<span id="cb10-15">        user_input <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">input</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">You: "</span>).strip()</span>
<span id="cb10-16"></span>
<span id="cb10-17">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> user_input.lower() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/quit"</span>:</span>
<span id="cb10-18">          <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Goodbye!"</span>)</span>
<span id="cb10-19">          <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">break</span></span>
<span id="cb10-20"></span>
<span id="cb10-21">        messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: user_input})</span>
<span id="cb10-22"></span>
<span id="cb10-23">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Use tool_runner with memory tool</span></span>
<span id="cb10-24">        runner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.beta.messages.tool_runner(</span>
<span id="cb10-25">                betas<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"context-management-2025-06-27"</span>], <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># The memory tool is currently in beta. To enable it, use the beta header context-management-2025-06-27 in your API requests.</span></span>
<span id="cb10-26">                model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"claude-sonnet-4-20250514"</span>,</span>
<span id="cb10-27">                max_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2048</span>,</span>
<span id="cb10-28">                system<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>SYSTEM_PROMPT <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> MEMORY_SYSTEM_PROMPT,</span>
<span id="cb10-29">                messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>messages,</span>
<span id="cb10-30">                tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[memory],</span>
<span id="cb10-31">        )</span>
<span id="cb10-32"></span>
<span id="cb10-33">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Process all messages from the runner</span></span>
<span id="cb10-34">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> runner:</span>
<span id="cb10-35">          <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Process content blocks</span></span>
<span id="cb10-36">          assistant_content: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>[BetaContentBlockParam] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb10-37"></span>
<span id="cb10-38">          <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> content <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> message.content:</span>
<span id="cb10-39">              <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> content.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">type</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"text"</span>:</span>
<span id="cb10-40">                  <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">Claude: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>text<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>, end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, flush<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb10-41">                  assistant_content.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"text"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"text"</span>: content.text})</span>
<span id="cb10-42"></span>
<span id="cb10-43">              <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> content.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">type</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use"</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">and</span> content.name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"memory"</span>:</span>
<span id="cb10-44">                  <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">[Memory tool </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>name<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> called with </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>content<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">input</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">]"</span>)</span>
<span id="cb10-45">                  assistant_content.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"id"</span>: content.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"name"</span>: content.name, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"input"</span>: content.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">input</span>,})</span>
<span id="cb10-46"></span>
<span id="cb10-47">          <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Store assistant message</span></span>
<span id="cb10-48">          <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> assistant_content:</span>
<span id="cb10-49">              messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: assistant_content})</span>
<span id="cb10-50"></span>
<span id="cb10-51">          <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Generate tool response automatically</span></span>
<span id="cb10-52">          tool_response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> runner.generate_tool_call_response()</span>
<span id="cb10-53">          <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> tool_response <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">and</span> tool_response[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>]:</span>
<span id="cb10-54">              <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Add tool results to messages</span></span>
<span id="cb10-55">              messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: tool_response[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>]})</span>
<span id="cb10-56"></span>
<span id="cb10-57">              <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> result <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> tool_response[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>]:</span>
<span id="cb10-58">                  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">isinstance</span>(result, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">dict</span>) <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">and</span> result.get(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_result"</span>:</span>
<span id="cb10-59">                      <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"[Tool result processed: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>result<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>get(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">]"</span>)</span></code></pre></div></div>
</div>
</section>
</section>
<section id="example-demo-of-agent-with-memory" class="level2">
<h2 class="anchored" data-anchor-id="example-demo-of-agent-with-memory">Example demo of agent with memory</h2>
<p>Let’s see how the barista agent’s behavior changes with access to the memory tool by repeating our conversation from earlier.</p>
<div id="cell-18" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="5cf870b7-9949-4c50-f3ff-129831bc6d0d" data-execution_count="29">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1">conversation_with_memory()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
You: Hi, can I please get a regular flat white with oatmilk please? 

[Memory tool memory called with {'command': 'view', 'path': '/memories'}]
[Tool result processed: Directory: /memories]

[Memory tool memory called with {'command': 'create', 'path': '/memories/customers.txt', 'file_text': 'CUSTOMERS AND ORDERS\n\nNew customer:\n- Order: Regular flat white with oat milk\n- Name: (waiting for name)\n'}]
[Tool result processed: File created successfully at /memories/customers.txt]

Claude: Absolutely! One regular flat white with oat milk coming up. Can I get a name for the order?
You: Claudia

[Memory tool memory called with {'command': 'str_replace', 'path': '/memories/customers.txt', 'old_str': 'New customer:\n- Order: Regular flat white with oat milk\n- Name: (waiting for name)', 'new_str': 'Claudia:\n- Order: Regular flat white with oat milk'}]
[Tool result processed: File /memories/customers.txt has been edited]

Claude: Perfect, Claudia! One regular flat white with oat milk. That'll be $4.50. I'll have that ready for you in just a few minutes - keep an ear out for "Claudia!"
You: /quit
Goodbye!</code></pre>
</div>
</div>
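<p>To make the effect of these tool calls on disk more concrete, the three calls from the transcript above (<code>view</code>, <code>create</code>, <code>str_replace</code>) can be replayed against a temporary directory. This is a minimal sketch, not the SDK’s implementation: <code>run_memory_command</code> is a hypothetical dispatcher, it does no path validation, and the rule that <code>str_replace</code> must match exactly once is an assumption of this sketch.</p>

```python
import tempfile
from pathlib import Path


def run_memory_command(root: Path, cmd: dict) -> str:
    """Hypothetical dispatcher for three memory commands (illustration only)."""
    # Map the virtual "/memories/..." path onto the temporary root directory
    relative = cmd["path"].removeprefix("/memories").lstrip("/")
    target = root / relative

    if cmd["command"] == "view":
        if target.is_dir():
            names = sorted(p.name for p in target.iterdir())
            return "\n".join([f"Directory: {cmd['path']}"] + [f"- {n}" for n in names])
        return target.read_text()

    if cmd["command"] == "create":
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(cmd["file_text"])
        return f"File created successfully at {cmd['path']}"

    if cmd["command"] == "str_replace":
        text = target.read_text()
        # Assumption of this sketch: the old string must occur exactly once
        if text.count(cmd["old_str"]) != 1:
            raise ValueError("old_str must match exactly once")
        target.write_text(text.replace(cmd["old_str"], cmd["new_str"]))
        return f"File {cmd['path']} has been edited"

    raise ValueError(f"Unsupported command: {cmd['command']}")


# Tool inputs copied from the conversation transcript
calls = [
    {"command": "view", "path": "/memories"},
    {"command": "create", "path": "/memories/customers.txt",
     "file_text": "CUSTOMERS AND ORDERS\n\nNew customer:\n- Order: Regular flat white with oat milk\n- Name: (waiting for name)\n"},
    {"command": "str_replace", "path": "/memories/customers.txt",
     "old_str": "New customer:\n- Order: Regular flat white with oat milk\n- Name: (waiting for name)",
     "new_str": "Claudia:\n- Order: Regular flat white with oat milk"},
]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    results = [run_memory_command(root, call) for call in calls]
    final_text = (root / "customers.txt").read_text()

print(final_text)
```

<p>After the replay, the memory file contains only the header and Claudia’s order, which is what the agent will find when it views <code>/memories</code> in a later conversation.</p>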
<section id="storing-information-in-memory" class="level3">
<h3 class="anchored" data-anchor-id="storing-information-in-memory">Storing information in memory</h3>
<p>You can see that the conversation is similar to the conversation with the barista agent without memory, but this time, the agent <strong>stores information based on the interaction</strong>.</p>
<ol type="1">
<li><p><strong>User request</strong></p>
<p>We initiate the conversation with the same input:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"Hi, can I please get a regular flat white with oatmilk please?"</span></span></code></pre></div></div></li>
<li><p><strong>Agent checks memory directory</strong></p>
<p>Before responding to the user, the agent first checks the memory directory.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb14-1">{</span>
<span id="cb14-2">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use"</span>,</span>
<span id="cb14-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"id"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>,</span>
<span id="cb14-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"name"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"memory"</span>,</span>
<span id="cb14-5">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"input"</span>: {</span>
<span id="cb14-6">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"command"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"view"</span>,</span>
<span id="cb14-7">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"path"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/memories"</span></span>
<span id="cb14-8">   }</span>
<span id="cb14-9">}</span></code></pre></div></div></li>
<li><p><strong>The application returns the tool call results</strong></p>
<p>The tool call returns the directory contents of <code>/memories</code>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb15-1">{</span>
<span id="cb15-2">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_result"</span>,</span>
<span id="cb15-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use_id"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>,</span>
<span id="cb15-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Directory: /memories"</span></span>
<span id="cb15-5">}</span></code></pre></div></div></li>
<li><p><strong>Agent creates a new memory file</strong></p>
<p>The agent now creates a memory file and stores the user’s order. (Note that the file name and contents can differ between API calls due to the non-deterministic nature of LLMs.)</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb16-1">{</span>
<span id="cb16-2">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use"</span>,</span>
<span id="cb16-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"id"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>,</span>
<span id="cb16-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"name"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"memory"</span>,</span>
<span id="cb16-5">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"input"</span>: {</span>
<span id="cb16-6">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"command"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"create"</span>, </span>
<span id="cb16-7">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"path"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/memories/customers.txt"</span>, </span>
<span id="cb16-8">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"file_text"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CUSTOMERS AND ORDERS</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">New customer:</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">- Order: Regular flat white with oat milk</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">- Name: (waiting for name)</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb16-9">  }</span>
<span id="cb16-10">}</span></code></pre></div></div></li>
<li><p><strong>Agent responds</strong></p>
<p>Finally, the agent responds.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"Absolutely! One regular flat white with oat milk coming up. Can I get a name for the order?"</span></span></code></pre></div></div></li>
</ol>
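<p>Since the storage implementation is left to the developer, the application has to execute these tool calls itself. The following is a minimal sketch of what handling the <code>view</code> and <code>create</code> commands could look like, assuming a simple in-memory dictionary as the backend. The names <code>memory_files</code> and <code>handle_memory_command</code> are hypothetical, and the returned strings are assumptions loosely modeled on the tool results shown above. This is not the SDK’s implementation.</p>

```python
# Hypothetical in-memory backend (an assumption for illustration).
# A real application would likely persist files to disk or a database.
memory_files = {}  # maps file path -> file text


def handle_memory_command(tool_input):
    """Execute a memory tool call and return the text for the tool_result."""
    command = tool_input["command"]
    path = tool_input["path"]

    if command == "create":
        memory_files[path] = tool_input["file_text"]
        # The wording of this confirmation message is an assumption.
        return f"File created successfully at {path}"

    if command == "view":
        if path in memory_files:
            return memory_files[path]
        # Treat any other path as a directory and list the files below it.
        prefix = path.rstrip("/") + "/"
        names = sorted(p[len(prefix):] for p in memory_files if p.startswith(prefix))
        return "\n".join([f"Directory: {path}"] + [f"- {name}" for name in names])

    return f"Unsupported command: {command}"
```

<p>The returned string would then be sent back to the model as the <code>content</code> of the <code>tool_result</code> block.</p>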
</section>
<section id="editing-existing-information-in-memory" class="level3">
<h3 class="anchored" data-anchor-id="editing-existing-information-in-memory">Editing existing information in memory</h3>
<p>You can also see in the above conversation that, after the user tells the barista agent their name, another memory tool call with the command <code>str_replace</code> is invoked to edit the customer profile with their name.</p>
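<p>To illustrate what such an edit does on the application side, here is a small sketch of a <code>str_replace</code>-style helper. The function itself is hypothetical, and the rule that <code>old_str</code> must match exactly once is my own assumption to avoid ambiguous edits, not something the example above confirms.</p>

```python
def str_replace(file_text, old_str, new_str):
    """Replace exactly one occurrence of old_str, refusing ambiguous edits."""
    occurrences = file_text.count(old_str)
    if occurrences != 1:
        # Assumption: zero or multiple matches are treated as an error.
        raise ValueError(f"Expected exactly one match for old_str, found {occurrences}")
    return file_text.replace(old_str, new_str)


# The memory file as created in the first conversation above.
profile = (
    "CUSTOMERS AND ORDERS\n\n"
    "New customer:\n"
    "- Order: Regular flat white with oat milk\n"
    "- Name: (waiting for name)\n"
)
updated = str_replace(profile, "- Name: (waiting for name)", "- Name: Claudia")
```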
</section>
<section id="recalling-from-memory" class="level3">
<h3 class="anchored" data-anchor-id="recalling-from-memory">Recalling from memory</h3>
<p>Now, let’s start a new conversation. This clears the conversation history like in our example above, but this time, important information from the earlier conversation is stored in the memory directory.</p>
<p>Let’s start the second conversation the same way as in the earlier example:</p>
<div id="cell-22" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="3f190f82-b3fb-45e8-e0b5-6f25214188ff" data-execution_count="30">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb18-1">conversation_with_memory()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
You: Hi, it's me Claudia again. Can I please get my usual?

[Memory tool memory called with {'command': 'view', 'path': '/memories'}]
[Tool result processed: Directory: /memories- customers.txt]

[Memory tool memory called with {'command': 'view', 'path': '/memories/customers.txt'}]
[Tool result processed:    1: CUSTOMERS AND ORDERS
   2: 
   3: Claudia:
   4: - Order: Regular flat white with oat milk]

Claude: Hey Claudia! Welcome back. One regular flat white with oat milk coming right up! 

*starts steaming oat milk and pulls espresso shots*

Should be ready in just a couple minutes. How's your day going?
You: /quit
Goodbye!</code></pre>
</div>
</div>
<p>As you can see, this time the barista agent remembers the customer and is able to recall their usual order from the information stored in the memory directory.</p>
</section>
<section id="recalling-from-conversation-history-not-memory" class="level3">
<h3 class="anchored" data-anchor-id="recalling-from-conversation-history-not-memory">Recalling from conversation history (not memory)</h3>
<p>But does the barista agent now always check their memory? No.&nbsp;Only when it needs to. In the example below, you can see that if asked about something that was discussed earlier in the conversation (<code>"Did I say almond milk or cashew milk?"</code>) instead of calling the memory tool, the agent uses the information in their context window to answer the user input.</p>
<div id="cell-25" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="82c83fca-b4b8-47c1-c3af-19eb0f6d900e" data-execution_count="32">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb20-1">conversation_with_memory()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
You: Hi, it's me Claudia again. Can I please get my usual?

[Memory tool memory called with {'command': 'view', 'path': '/memories'}]
[Tool result processed: Directory: /memories- customers.txt]

[Memory tool memory called with {'command': 'view', 'path': '/memories/customers.txt'}]
[Tool result processed:    1: CUSTOMERS AND ORDERS
   2: 
   3: Claudia:
   4: - Order: Regular flat white with oat milk]

Claude: Hey Claudia! Welcome back! Absolutely - one regular flat white with oat milk coming right up. I'll get that started for you now!

*starts steaming oat milk and pulling espresso shot*

That'll be ready in just a moment!
You: Actually, I would like to try almond milk today.

[Memory tool memory called with {'command': 'str_replace', 'path': '/memories/customers.txt', 'old_str': 'Claudia:\n- Order: Regular flat white with oat milk', 'new_str': "Claudia:\n- Usual order: Regular flat white with oat milk\n- Today's order: Regular flat white with almond milk"}]
[Tool result processed: File /memories/customers.txt has been edited]

Claude: Perfect! Switching that to almond milk for you. One regular flat white with almond milk instead - got it!

*adjusts the milk steamer to almond milk*

Great choice! The almond milk has a nice nutty flavor that pairs really well with our espresso. This'll be ready in just a sec!
You: Oh, wait. Did I say almond milk or cashew milk? I want to make sure I ordered almond milk

Claude: You said almond milk! I've got you down for a regular flat white with almond milk. That's what I'm making for you right now.

*continues steaming the almond milk*

So we're all good - one flat white with almond milk for Claudia!
You: /quit
Goodbye!</code></pre>
</div>
</div>
</section>
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>This blog post explored how agents can use the memory tool (in beta) with the Claude Developer Platform. We’ve implemented two agents using Anthropic’s Python SDK: one without memory and one with memory. The agents were instructed to act as baristas. When asked to get the user their usual order, only the agent with access to the memory tool was able to fulfill this request.</p>
<p>This exploratory tutorial showed that Anthropic’s approach to memory is different from other providers of developer tools for the memory layer in AI agents: Claude stores memory information <strong>as files in a memory directory</strong>. I recommend this blog post by <a href="https://www.shloked.com/writing/claude-memory-tool">Shlok Khemani on “Anthropic’s Opinionated Memory Bet”</a> for a detailed explanation of what this means.</p>
<p>You can also <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool#using-with-context-editing">combine the memory tool with context editing</a> to clear old tool results to keep the context concise and allow long-running agentic workflows.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<ul>
<li><a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool">Claude Docs: Memory tool</a></li>
<li><a href="https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py">Anthropic Python SDK Memory Tool Example</a></li>
</ul>


</section>

]]></description>
  <guid>https://www.leoniemonigatti.com/blog/claude-memory-tool.html</guid>
  <pubDate>Tue, 25 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Making Sense of Memory in AI Agents</title>
  <link>https://www.leoniemonigatti.com/blog/memory-in-ai-agents.html</link>
  <description><![CDATA[ 





<p>I’ve been catching up on the topic of <strong>memory management for AI agents</strong> recently and was overwhelmed by the amount of new terminology and concepts. This blog post serves as my working study notes to collect all the information on memory for agents. Therefore, please note that this information is subject to change as I’m learning more about the topic (refer to the last modified date on this post).</p>
<section id="what-is-agent-memory" class="level2">
<h2 class="anchored" data-anchor-id="what-is-agent-memory">What is agent memory?</h2>
<p>Memory in AI agents is the ability to remember and recall important information across multiple user interactions. This ability enables agents to learn from feedback and adapt to user preferences, thereby enhancing the system’s performance and improving the user experience.</p>
<p>Large Language Models (LLMs) that power AI agents are <strong>stateless</strong> and don’t have memory as a built-in feature. LLMs learn and remember information during their training phase and store it in their model weights (parametric knowledge), but they don’t immediately learn and remember what you just said. Therefore, every interaction with an LLM is essentially a fresh start: the LLM has no memory of previous inputs.</p>
<p><img src="https://www.leoniemonigatti.com/blog/images/chat-with-and-without-conversational-memory.webp" class="img-fluid"></p>
<p>Therefore, to enable an LLM agent to recall what was said earlier in this conversation or in a previous session, developers must provide it with access to past interactions from the current and past conversations.</p>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Sidenotes
</div>
</div>
<div class="callout-body-container callout-body">
<p><strong>The term “memory”</strong></p>
<p>I still struggle with the new terminology surrounding this topic. I like the <a href="https://en.wikipedia.org/wiki/Memory">definition for (human) memory on Wikipedia, which says, “Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed”</a>.</p>
<p>That means “memory” refers to the storage location (e.g., a Markdown file or a database), and the actual information stored is not called “memories,” but rather “information.”</p>
<p><strong>Difference between agent memory and agentic memory</strong></p>
<p>When I first started researching the topic of agent memory, I thought “agent memory” or “memory for agents” was the same as “agentic memory”. Turns out (or rather: I think), they are not, but they are related:</p>
<ul>
<li><strong>Agent memory</strong> (or memory for agents) refers to the concept of granting agents access to memory, allowing them to recall information from past interactions.<br>
</li>
<li><strong>Agentic memory</strong> describes a memory system that agentically writes and manages memory for agents (usually via tool calls).</li>
</ul>
<p>That means, “agentic memory” always refers to “agentic memory for agents,” but “agent memory” doesn’t necessarily have to be agentic. (Curious if my understanding is correct on this.)</p>
</div>
</div>
</section>
<section id="types-of-agent-memory" class="level2">
<h2 class="anchored" data-anchor-id="types-of-agent-memory">Types of agent memory</h2>
<p>Agent memory can be categorized into different types depending on the type of information stored and its location. There are various ways to categorize agent memory into distinct types. But at the highest level, they all differentiate between short-term (<em>in-context</em>) and long-term memory (<em>out-of-context</em>):</p>
<ul>
<li><strong>In-context memory (Short-term memory)</strong> refers to the information available <em>in the context window</em> of the LLM. This can be both information from the current conversation as well as information pulled in from past conversations.</li>
<li><strong>Out-of-context memory (Long-term memory)</strong> refers to the information stored <em>in external storage</em>, such as a (vector or graph) database.</li>
</ul>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.leoniemonigatti.com/blog/images/agent-memory.webp" class="img-fluid figure-img"></p>
<figcaption>“An agent can have two types of memory: In-context (short-term) memory and out-of-context (long-term) memory”</figcaption>
</figure>
</div>
<p>The most commonly seen taxonomy is based on the <a href="https://arxiv.org/abs/2309.02427">Cognitive Architectures for Language Agents (CoALA) paper</a>, which distinguishes between four memory types that mimic human memory, drawing on the SOAR architecture from the 1980s. Below, you can find a table of the different memory types inspired by a similar one in <a href="https://docs.langchain.com/oss/python/langgraph/memory">the LangGraph documentation</a>:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
</colgroup>
<thead>
<tr class="header">
<th style="text-align: left;">Memory Type</th>
<th style="text-align: left;">What is stored</th>
<th style="text-align: left;">Human Example</th>
<th style="text-align: left;">Agent Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: left;">Working memory</td>
<td style="text-align: left;">Contents of the context window</td>
<td style="text-align: left;">Current conversation (e.g., “Hi, my name is Sam.”)</td>
<td style="text-align: left;">Current conversation (e.g., “Hi, my name is Sam.”)</td>
</tr>
<tr class="even">
<td style="text-align: left;">Semantic memory</td>
<td style="text-align: left;">Facts</td>
<td style="text-align: left;">Things I learned in school (e.g., “Water freezes at 0°C”)</td>
<td style="text-align: left;">Facts about a user (e.g., “Dog’s name is Henry”)</td>
</tr>
<tr class="odd">
<td style="text-align: left;">Episodic memory</td>
<td style="text-align: left;">Experiences</td>
<td style="text-align: left;">Things I did (e.g., “Went to Six Flags on 10th birthday”)</td>
<td style="text-align: left;">Past actions (e.g., “Failed to calculate 1+1 without using a calculator”)</td>
</tr>
<tr class="even">
<td style="text-align: left;">Procedural memory</td>
<td style="text-align: left;">Instructions</td>
<td style="text-align: left;">Instincts or motor skills (e.g., “How to ride a bike”)</td>
<td style="text-align: left;">Instructions in the system prompt (e.g., “Always ask follow-up questions before answering a question.”)</td>
</tr>
</tbody>
</table>
<p>However, there’s also another approach to categorizing memory types for AI agents from a design pattern perspective. <a href="https://x.com/sarahwooders/status/1983421978922106999?s=20">Sarah Wooders from Letta argues that an LLM is a tokens-in-tokens-out function, not a brain, and that, therefore, overly anthropomorphized analogies are not a good fit</a>. If you look at <a href="https://www.letta.com/blog/agent-memory">how Letta defines the types of agent memory, you will see that they define it differently</a>:</p>
<ul>
<li><strong>Message Buffer (Recent messages)</strong> stores the most recent messages from the current conversation.<br>
</li>
<li><strong>Core Memory (In-Context Memory Blocks)</strong> is specific information that the agent itself manages (e.g., the user’s birthday or the boyfriend’s name if this is relevant to the current conversation)<br>
</li>
<li><strong>Recall Memory (Conversational History)</strong> is the raw conversation history.<br>
</li>
<li><strong>Archival Memory (Explicitly Stored Knowledge)</strong> is explicitly formulated information stored in an external database.</li>
</ul>
<p>The difference lies in how they design in-context and out-of-context memory. For example, CoALA’s working memory is one category, while Letta splits this into message buffer and core memory. The long-term memory from the CoALA paper can be thought of as the out-of-context memory in Letta. However, the long-term memory types of procedural, episodic, and semantic aren’t directly mappable to Letta’s recall and archival memory. You can think of CoALA’s semantic memory as Letta’s archival memory, but the other types don’t map onto each other. Notably, the CoALA taxonomy doesn’t include the raw conversation history in long-term memory.</p>
</section>
<section id="ai-agent-memory-management" class="level2">
<h2 class="anchored" data-anchor-id="ai-agent-memory-management">AI Agent Memory Management</h2>
<p>Memory management in AI agents refers to how to manage information within the LLM’s context window and in external storage, as well as how to transfer information between them. <a href="https://www.youtube.com/watch?v=W2HVdB4Jbjs">Richmond Alake lists the following core components of agent memory management: generation, storage, retrieval, integration, updating, and deletion (forgetting).</a></p>
<section id="managing-memory-in-the-context-window" class="level3">
<h3 class="anchored" data-anchor-id="managing-memory-in-the-context-window">Managing memory in the context window</h3>
<p>The goal of managing memory in the context window is to ensure that only relevant information is retained, thereby avoiding confusion for the LLM with incorrect, irrelevant, or contradictory information. Additionally, as the conversation progresses, the conversation history grows (involving more tokens) and leads to slower responses and higher costs, potentially reaching the context window’s limit.</p>
<p>To mitigate this problem, you can maintain the conversation history in different ways. For example, you can manually remove old and obsolete information from the context window. Alternatively, you can periodically summarize the previous conversation and retain only the summary, then delete the old messages.</p>
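<p>As a rough sketch of the summarization approach, the snippet below keeps only the newest messages and folds everything older into a single summary message. The function names are hypothetical, and <code>naive_summarize</code> is just a stand-in for what would normally be an LLM call.</p>

```python
def compact_history(messages, keep_last, summarize):
    """Summarize everything except the newest keep_last messages."""
    if keep_last >= len(messages):
        return list(messages)  # nothing old enough to summarize
    cutoff = len(messages) - keep_last
    older, recent = messages[:cutoff], messages[cutoff:]
    summary = {
        "role": "user",
        "content": "Summary of earlier conversation: " + summarize(older),
    }
    return [summary] + recent


def naive_summarize(messages):
    # Placeholder for an LLM call: simply joins the text of the older messages.
    return " | ".join(m["content"] for m in messages)


history = [
    {"role": "user", "content": "Hi, my name is Sam."},
    {"role": "assistant", "content": "Nice to meet you, Sam!"},
    {"role": "user", "content": "What can you do?"},
]
compacted = compact_history(history, keep_last=1, summarize=naive_summarize)
```

<p>Here, <code>compacted</code> contains two messages: the summary of the first two turns, followed by the latest user message.</p>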
</section>
<section id="managing-memory-in-external-storage" class="level3">
<h3 class="anchored" data-anchor-id="managing-memory-in-external-storage">Managing memory in external storage</h3>
<p>The main goal of memory management in external storage is to prevent memory bloat and to ensure the quality and relevance of the stored information. The four core operations for managing memory in external storage include:</p>
<ul>
<li><strong>ADD:</strong> Adding new information to the external storage.<br>
</li>
<li><strong>UPDATE:</strong> Identifying existing information, modifying it to reflect new information, or correcting outdated information (e.g., updating a user’s new address).<br>
</li>
<li><strong>DELETE:</strong> Forgetting obsolete information to prevent memory bloat and degradation of the information quality.<br>
</li>
<li><strong>NOOP:</strong> This is the decision point where the memory management system determines that the current interaction contains no new, relevant, or contradictory information that warrants a database transaction.</li>
</ul>
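<p>To make the four operations more concrete, here is a simplified sketch that applies them to a plain dictionary standing in for the external storage. The function name and the key-value layout are assumptions for illustration. In a real system, the decision about which operation to apply would typically come from an LLM or a memory framework.</p>

```python
def apply_memory_operation(store, operation, key=None, value=None):
    """Apply ADD, UPDATE, DELETE, or NOOP to a dict-based memory store."""
    if operation == "ADD":
        store[key] = value
    elif operation == "UPDATE":
        if key not in store:
            raise KeyError(f"Cannot update unknown memory: {key}")
        store[key] = value
    elif operation == "DELETE":
        store.pop(key, None)
    elif operation == "NOOP":
        pass  # the interaction contained nothing worth storing
    else:
        raise ValueError(f"Unknown operation: {operation}")
    return store


memory = {}
apply_memory_operation(memory, "ADD", "address", "1 Old Street")
apply_memory_operation(memory, "UPDATE", "address", "2 New Street")
apply_memory_operation(memory, "NOOP")
```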
</section>
<section id="transferring-information-between-the-context-window-and-external-storage" class="level3">
<h3 class="anchored" data-anchor-id="transferring-information-between-the-context-window-and-external-storage">Transferring information between the context window and external storage</h3>
<p>One important question developers need to answer is <strong>when</strong> to manage memory, specifically when to transfer information from the context window to external storage. The <a href="https://blog.langchain.com/memory-for-agents/">LangChain blog post on “memory for agents”</a> differentiates between the hot path and background, while I’ve also seen the two referred to as explicit and implicit memory updates in <a href="https://www.philschmid.de/memory-in-agents">Philipp Schmid’s blog on “Memory in Agents”</a>.</p>
<p><strong>Explicit memory (hot path)</strong> describes the <em>agent memory system’s ability to autonomously recognize important information</em> and decide to explicitly remember it (via tool calling). Explicit memory in humans is the <em>conscious</em> storage of information (e.g., episodic and semantic memory). Although remembering important information in the hot path is, ideally, closest to how humans remember information, it can be challenging to implement a robust solution that recognizes which information is important to remember.</p>
<p><strong>Implicit memory (background)</strong> describes when memory management is programmatically defined in the system at specific times during or after a conversation. Implicit memory in humans is the <em>unconscious</em> storage of information (e.g., procedural memory). The <a href="https://www.kaggle.com/whitepaper-context-engineering-sessions-and-memory">Google whitepaper on session and memory</a> describes the following three scenarios:</p>
<ul>
<li><strong>After a session:</strong> You can batch process the entire conversation after a session.<br>
</li>
<li><strong>In periodic intervals:</strong> If your use case has long-running conversations, you can define an interval at which session data is transferred to long-term memory.<br>
</li>
<li><strong>After every turn:</strong> If your use case requires real-time updates, you can transfer information after every turn. However, keep in mind that the raw conversation history is typically appended and stored in the context window for a short period (“short-term memory”).</li>
</ul>
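<p>The three scenarios above mostly differ in how often buffered conversation turns are flushed to long-term memory. Below is a minimal sketch of this idea, assuming a plain list as the long-term store. The class and its parameters are hypothetical and not taken from the whitepaper.</p>

```python
class BackgroundMemoryWriter:
    """Flush buffered turns to long-term storage every flush_every turns."""

    def __init__(self, flush_every, long_term_store):
        self.flush_every = flush_every
        self.long_term_store = long_term_store
        self.buffer = []

    def on_turn(self, user_message, assistant_message):
        self.buffer.append((user_message, assistant_message))
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        # Called on the interval and once more at the end of a session.
        self.long_term_store.extend(self.buffer)
        self.buffer = []
```

<p>With <code>flush_every=1</code>, this behaves like an update after every turn. With a very large value and a single <code>flush()</code> call at the end, it behaves like batch processing after a session.</p>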
</section>
</section>
<section id="implementing-agent-memory" class="level2">
<h2 class="anchored" data-anchor-id="implementing-agent-memory">Implementing agent memory</h2>
<p>When implementing agent memory, consider where to store the memory information:</p>
<ul>
<li>Current conversation history is usually implemented as a simple <strong>list</strong> of past user queries, assistant messages, and maybe tool calls or reasoning.<br>
</li>
<li>Instructions are typically stored in <strong>text or Markdown files</strong>, such as the well-known <code>CLAUDE.md</code> files.<br>
</li>
<li>Other information is usually stored in a <strong>database,</strong> depending on what type of retrieval method is suitable for your data.</li>
</ul>
<p>If you’re interested in actual code implementation, I recommend you check out how my colleague <a href="https://github.com/databyjp/weekend_projects/tree/main/mem_demo">JP Hwang implemented an agentic memory layer for a conversational AI using Weaviate</a> or how <a href="https://www.youtube.com/watch?v=VKPngyO0iKg">Adam Łucek implements the four different types of memory from the CoALA paper (working memory, episodic memory, semantic memory, and procedural memory)</a>.</p>
</section>
<section id="challenges-of-agent-memory-design" class="level2">
<h2 class="anchored" data-anchor-id="challenges-of-agent-memory-design">Challenges of agent memory design</h2>
<p>Implementing memory for agents currently is a challenging task. The difficulty lies in optimizing the system to avoid slower response times while simultaneously solving the complex problem of determining what information is obsolete and should be permanently deleted:</p>
<ul>
<li><strong>Latency:</strong> Constantly checking whether the agent needs to retrieve information from, or offload data to, the memory bank can lead to slower response times.<br>
</li>
<li><strong>Forgetting:</strong> <a href="https://x.com/helloiamleonie/status/1976964497987510379?s=20">This seems to be the hardest challenge for developers at the moment</a>. How do you automate a mechanism that decides when and what information to permanently delete? Managing the information stored in external memory is important to avoid memory bloat and the degradation of the information quality.</li>
</ul>
</section>
<section id="frameworks-for-ai-agent-memory" class="level2">
<h2 class="anchored" data-anchor-id="frameworks-for-ai-agent-memory">Frameworks for AI Agent Memory</h2>
<p>The ecosystem of developer tools for implementing memory solutions for agents is rapidly growing and <a href="https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/">attracting investors’ attention</a>. There are frameworks dedicated to solving the agent memory problem, such as:</p>
<ul>
<li><a href="https://mem0.ai/">mem0</a> (see <a href="https://github.com/weaviate/recipes/blob/main/integrations/llm-agent-frameworks/mem0/quickstart_mem0_with_weaviate.ipynb">my example implementation with Weaviate</a>),</li>
<li><a href="https://www.letta.com/">Letta</a> based on the MemGPT design pattern (<a href="../memgpt.ipynb">see my example implementation</a>),</li>
<li><a href="https://www.cognee.ai/">Cognee</a>, and</li>
<li><a href="https://www.getzep.com/">zep</a>.</li>
</ul>
<p>However, many agent orchestration frameworks, such as:</p>
<ul>
<li><a href="https://www.langchain.com/">LangChain</a> and <a href="https://www.langchain.com/langgraph">LangGraph</a>,</li>
<li><a href="https://www.llamaindex.ai/">LlamaIndex</a> (<a href="https://github.com/weaviate/recipes/blob/main/integrations/llm-agent-frameworks/llamaindex/agent-memory/Memory_with_LlamaIndex_Weaviate_and_Gemini.ipynb">see my example implementation</a>),</li>
<li><a href="https://www.crewai.com/">CrewAI</a>, and</li>
<li>Google’s <a href="https://google.github.io/adk-docs/">Agent Development Kit (ADK)</a> (<a href="../blog/building-ai-agents-with-google-adk.html#memory-management">see my example implementation</a>),</li>
</ul>
<p>also offer solutions for AI agent memory management. Additionally, some model providers, such as <a href="https://www.anthropic.com/">Anthropic</a>, provide built-in <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool">memory tools</a> (<a href="../claude-memory-tool.ipynb">see my example implementation</a>).</p>
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>As I’m diving into the topic of memory management for AI agents, I’m learning the importance of not only remembering information from the current conversation but also past conversations, and also how to modify and forget outdated and obsolete information effectively.</p>
<p>As the field evolves, different approaches and terminology are emerging with it. On the one hand, you have approaches that categorize memory types into semantic, episodic, or procedural memory, analogous to the human memory. On the other hand, you have approaches, such as Letta, which use architecture-focused terminology.</p>
<p>Nevertheless, the core challenge of memory design for AI agents is how to move information between an LLM’s context window (short-term memory) and external storage (long-term memory). This involves deciding when to manage updates (hot path vs.&nbsp;background) while overcoming key challenges such as latency and the difficulty of forgetting obsolete data.</p>


</section>

]]></description>
  <guid>https://www.leoniemonigatti.com/blog/memory-in-ai-agents.html</guid>
  <pubDate>Thu, 20 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>The Evolution from RAG to Agentic RAG to Agent Memory</title>
  <link>https://www.leoniemonigatti.com/blog/from-rag-to-agent-memory.html</link>
  <description><![CDATA[ 





<p>I have been learning about memory in AI agents, and found myself overwhelmed by all the new terms. It started with short-term and long-term memory. Then it became even more confusing with procedural, episodic, and semantic memory. But wait. Semantic memory reminded me of a familiar concept: <a href="../blog/retrieval-augmented-generation-langchain.html">Retrieval-Augmented Generation (RAG)</a>.</p>
<p>Could memory in agents be the logical next step after vanilla RAG evolved to agentic RAG? At its core, memory in agents is about transferring information into and out of the large language model (LLM)‘s context window. Whether you call this information ’memories’ or ‘facts’ is secondary to this abstraction.</p>
<p>This blog is an introduction to memory in AI agents from a different angle than you might see in other blogs. We will not talk about short-term and long-term memory (yet), but gradually evolve the naive RAG concept to agentic RAG and to memory in AI agents. (Note that this is a simplified mental model. The entire topic of agent memory is more complex under the hood and involves things like memory management systems.)</p>
<section id="rag-read-only-in-one-shot" class="level2">
<h2 class="anchored" data-anchor-id="rag-read-only-in-one-shot">RAG: read-only in one-shot</h2>
<p>The concept of <a href="../blog/retrieval-augmented-generation-langchain.html">Retrieval-Augmented Generation (RAG)</a> was introduced in 2020 (Lewis et al.) and gained popularity around 2023. It was one of the first approaches to give a stateless LLM access to past conversations and to knowledge that it had not seen during training and therefore had not stored in its model weights (parametric knowledge).</p>
<p>The core idea of the naive RAG workflow is straightforward, as shown in the image below:</p>
<ol type="1">
<li>Offline indexing stage: Store additional information in an external knowledge source (e.g., vector database)</li>
<li>Query stage: Use the user’s query to retrieve related context from the external knowledge source. Feed the retrieved context, along with the user query, to the LLM to obtain a response grounded in the additional information.</li>
</ol>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.leoniemonigatti.com/blog/images/rag.webp" class="img-fluid figure-img"></p>
<figcaption>Naive RAG workflow</figcaption>
</figure>
</div>
<p>The following pseudo-code illustrates the naive RAG workflow:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Stage 1: Offline ingestion</span></span>
<span id="cb1-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> store_documents(documents):</span>
<span id="cb1-3">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> doc <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> documents:</span>
<span id="cb1-4">        embedding <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> embed(doc)</span>
<span id="cb1-5">        database.store(doc, embedding)</span>
<span id="cb1-6"></span>
<span id="cb1-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Stage 2: Online Retrieval + Generation</span></span>
<span id="cb1-8"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> search(query):</span>
<span id="cb1-9">    query_embedding <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> embed(query)</span>
<span id="cb1-10">    results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> database.similarity_search(query_embedding, top_k<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb1-11">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> results</span>
<span id="cb1-12"></span>
<span id="cb1-13"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> answer_question(question):</span>
<span id="cb1-14">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Always retrieve first, then generate</span></span>
<span id="cb1-15">    context <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> search(question)</span>
<span id="cb1-16">    prompt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Context: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>context<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">Question: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>question<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">Answer:"</span></span>
<span id="cb1-17">    response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> llm.generate(prompt)</span>
<span id="cb1-18">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response</span></code></pre></div></div>
<p>Although the naive RAG approach was effective at reducing hallucinations for simple use cases, it has a key limitation: It is a <strong>one-shot</strong> solution.</p>
<ul>
<li>Additional information is always retrieved from an external knowledge source, without checking whether it is even necessary.</li>
<li>Information is retrieved once, regardless of whether the retrieved information is relevant or correct.</li>
<li>There’s only one external knowledge source for all additional information.</li>
</ul>
<p>These limitations mean that for more complex use cases, the LLM can still hallucinate if the retrieved context is irrelevant to the user query or even wrong.</p>
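<p>To make the one-shot behavior concrete, here is a minimal runnable sketch of the pseudo-code above. It rests on assumptions that are not part of the original workflow: <code>embed</code> is a toy bag-of-words stand-in for a real embedding model, <code>InMemoryDatabase</code> is a hypothetical replacement for a vector database, and <code>build_prompt</code> stops before the LLM call so that no API key is needed.</p>

```python
import math
from collections import Counter

def embed(text):
    # Hypothetical stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class InMemoryDatabase:
    # Hypothetical replacement for a vector database.
    def __init__(self):
        self.rows = []  # list of (document, embedding) pairs

    def store(self, doc, embedding):
        self.rows.append((doc, embedding))

    def similarity_search(self, query_embedding, top_k=5):
        ranked = sorted(self.rows, key=lambda row: cosine(query_embedding, row[1]), reverse=True)
        return [doc for doc, _ in ranked[:top_k]]

# Stage 1: offline ingestion
database = InMemoryDatabase()
for doc in ["The Eiffel Tower is in Paris", "Bananas are rich in potassium"]:
    database.store(doc, embed(doc))

# Stage 2: online retrieval (the generation call is left out)
def build_prompt(question):
    # One-shot: retrieval always runs, exactly once, before generation.
    context = database.similarity_search(embed(question), top_k=1)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

print(build_prompt("Where is the Eiffel Tower"))
print(build_prompt("hello"))
```

<p>Note that the second call still retrieves a document even though a greeting needs no additional context, which is exactly the first limitation listed above.</p>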
</section>
<section id="agentic-rag-read-only-via-tool-calls" class="level2">
<h2 class="anchored" data-anchor-id="agentic-rag-read-only-via-tool-calls">Agentic RAG: read-only via tool calls</h2>
<p>Agentic RAG addresses many of the limitations of naive RAG: It defines the retrieval step as a tool that the agent can use. This change enables the agent to first determine whether additional information is needed, then decide which tool to use for retrieval (e.g., a database with proprietary data vs.&nbsp;a web search), and finally assess whether the retrieved information is relevant to the user query.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.leoniemonigatti.com/blog/images/agentic_rag.webp" class="img-fluid figure-img"></p>
<figcaption>Agentic RAG workflow</figcaption>
</figure>
</div>
<p>The following pseudo-code illustrates how an agent can call a <code>SearchTool</code> during an agentic RAG workflow:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> SearchTool:</span>
<span id="cb2-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, database):</span>
<span id="cb2-3">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.database <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> database</span>
<span id="cb2-4">    </span>
<span id="cb2-5">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> search(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, query):</span>
<span id="cb2-6">        query_embedding <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> embed(query)</span>
<span id="cb2-7">        results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.database.similarity_search(query_embedding, top_k<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb2-8">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> results</span>
<span id="cb2-9"></span>
<span id="cb2-10"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> agent_loop(question):</span>
<span id="cb2-11">    messages <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: question}]</span>
<span id="cb2-12">    </span>
<span id="cb2-13">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>:</span>
<span id="cb2-14">        response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> llm.generate(</span>
<span id="cb2-15">            messages, </span>
<span id="cb2-16">            tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[SearchTool]</span>
<span id="cb2-17">        )</span>
<span id="cb2-18">        </span>
<span id="cb2-19">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> response.tool_calls:</span>
<span id="cb2-20">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tool_call <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.tool_calls:</span>
<span id="cb2-21">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> tool_call.name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"search"</span>:</span>
<span id="cb2-22">                    results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> search_tool.search(tool_call.arguments[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"query"</span>])</span>
<span id="cb2-23">                    messages.append({</span>
<span id="cb2-24">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool"</span>, </span>
<span id="cb2-25">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Search results: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>results<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-26">                    })</span>
<span id="cb2-27">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb2-28">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response.content</span></code></pre></div></div>
<p>One similarity between naive and agentic RAG is that information is stored in the database offline, rather than during inference. This means that data is only retrieved by the agent but not written, modified, or deleted during inference. This limitation means that both naive and agentic RAG systems are not able to learn and improve from past interactions (by default).</p>
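<p>The pseudo-code above leaves <code>llm</code> and <code>search_tool</code> undefined. The following is a minimal runnable sketch of the same loop under stated assumptions: <code>fake_llm</code> is a hypothetical rule-based stand-in for a tool-calling LLM, <code>search</code> reads from a tiny hard-coded lookup table, and a <code>max_steps</code> bound replaces <code>while True</code> so the loop always terminates.</p>

```python
def search(query):
    # Hypothetical retrieval tool backed by a tiny read-only lookup table.
    knowledge = {"eiffel": "The Eiffel Tower is 330 meters tall."}
    return [text for key, text in knowledge.items() if key in query.lower()]

def fake_llm(messages):
    # Hypothetical stand-in for a tool-calling LLM (simple rules, no model).
    last = messages[-1]
    if last["role"] == "tool":
        # A tool result is available: answer with it.
        return {"content": f"Based on my search: {last['content']}", "tool_calls": []}
    if "eiffel" in last["content"].lower():
        # The "model" decides it needs additional information.
        return {"content": None, "tool_calls": [{"name": "search", "arguments": {"query": last["content"]}}]}
    # No retrieval needed: answer directly.
    return {"content": "Hello! How can I help?", "tool_calls": []}

def agent_loop(question, max_steps=5):
    messages = [{"role": "user", "content": question}]
    tool_log = []  # records which tools were called, for inspection
    for _ in range(max_steps):
        response = fake_llm(messages)
        if not response["tool_calls"]:
            return response["content"], tool_log
        for call in response["tool_calls"]:
            results = search(call["arguments"]["query"])
            tool_log.append(call["name"])
            messages.append({"role": "tool", "content": str(results)})
    return "Stopped: step limit reached.", tool_log

print(agent_loop("Hi there"))
print(agent_loop("How tall is the Eiffel Tower?"))
```

<p>The first call returns without touching the search tool, while the second one retrieves before answering. In both cases the lookup table is only read, never written.</p>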
</section>
<section id="agent-memory-read-write-via-tool-calls" class="level2">
<h2 class="anchored" data-anchor-id="agent-memory-read-write-via-tool-calls">Agent Memory: read-write via tool calls</h2>
<p>Agent memory overcomes this limitation of naive and agentic RAG by introducing a memory management concept. This allows the agent to learn from past interactions and enhance the user experience through a more personalized approach.</p>
<p>The concept of agent memory builds on the fundamental principles of agentic RAG.&nbsp;It also uses tools to retrieve information from an external knowledge source (memory). But in contrast to agentic RAG, agent memory also uses tools to <strong>write</strong> to an external knowledge source, as shown below:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.leoniemonigatti.com/blog/images/agent_memory.webp" class="img-fluid figure-img"></p>
<figcaption>Agent memory workflow</figcaption>
</figure>
</div>
<p>This allows the agent not only to recall from memory but also to “remember” information. In its simplest form, you can store the raw conversation history in a collection after an interaction. Then, the agent can search over past conversations to find relevant information. If you want to extend this, you can prompt the memory management system to create a summary of the conversation to store for future reference.&nbsp;Furthermore, you can also have the agent notice important information during a conversation (e.g., the user mentions a preference for emojis or mentions their birthday) and create a memory based on this event.</p>
<p>The following pseudo-code illustrates how the concept of agent memory extends the idea of agentic RAG with a <code>WriteTool</code> the agent can use to store information:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> SearchTool:</span>
<span id="cb3-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, database):</span>
<span id="cb3-3">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.database <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> database</span>
<span id="cb3-4">    </span>
<span id="cb3-5">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> search(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, query):</span>
<span id="cb3-6">        results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.database.search(query)</span>
<span id="cb3-7">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> results</span>
<span id="cb3-8"></span>
<span id="cb3-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># For simplicity, we only define a write tool here. You could also have tools for updating, deleting, or consolidating, etc.</span></span>
<span id="cb3-10"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> WriteTool:</span>
<span id="cb3-11">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, database):</span>
<span id="cb3-12">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.database <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> database</span>
<span id="cb3-13">    </span>
<span id="cb3-14">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> store(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, information):</span>
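<span id="cb3-15">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.database.store(information)</span>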
<span id="cb3-16"></span>
<span id="cb3-17"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> agent_loop(question):</span>
<span id="cb3-18">    messages <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: question}]</span>
<span id="cb3-19">    </span>
<span id="cb3-20">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>:</span>
<span id="cb3-21">        response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> llm.generate(</span>
<span id="cb3-22">            messages, </span>
<span id="cb3-23">            tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[SearchTool, WriteTool]</span>
<span id="cb3-24">        )</span>
<span id="cb3-25">        </span>
<span id="cb3-26">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> response.tool_calls:</span>
<span id="cb3-27">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tool_call <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.tool_calls:</span>
<span id="cb3-28">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> tool_call.name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"search"</span>:</span>
<span id="cb3-29">                    results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> search_tool.search(tool_call.arguments[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"query"</span>])</span>
<span id="cb3-30">                    messages.append({</span>
<span id="cb3-31">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool"</span>,</span>
<span id="cb3-32">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: results</span>
<span id="cb3-33">                    })</span>
<span id="cb3-34">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> tool_call.name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"store"</span>:</span>
<span id="cb3-35">                    result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> write_tool.store(</span>
<span id="cb3-36">                        tool_call.arguments[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"information"</span>]</span>
<span id="cb3-37">                    )</span>
<span id="cb3-38">                    messages.append({</span>
<span id="cb3-39">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool"</span>,</span>
<span id="cb3-40">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: result</span>
<span id="cb3-41">                    })</span>
<span id="cb3-42">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb3-43">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response.content</span></code></pre></div></div>
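<p>As a runnable counterpart to the pseudo-code above, the following sketch shows information written in one interaction being recalled in a later one. Everything in it is an assumption made for illustration: <code>MemoryStore</code> is a hypothetical in-memory stand-in for the external knowledge source with naive keyword matching, and <code>fake_llm</code> is a rule-based stand-in for a tool-calling LLM.</p>

```python
class MemoryStore:
    # Hypothetical in-memory stand-in for the external knowledge source.
    def __init__(self):
        self.items = []

    def store(self, information):
        self.items.append(information)
        return f"Stored: {information}"

    def search(self, query):
        # Naive keyword overlap instead of a real retrieval method.
        words = set(query.lower().split())
        return [item for item in self.items if words.intersection(item.lower().split())]

memory = MemoryStore()

def fake_llm(messages):
    # Hypothetical rule-based stand-in for a tool-calling LLM.
    last = messages[-1]
    if last["role"] == "tool":
        return {"content": last["content"], "tool_calls": []}
    text = last["content"]
    if text.lower().startswith("remember"):
        return {"content": None, "tool_calls": [{"name": "store", "arguments": {"information": text}}]}
    return {"content": None, "tool_calls": [{"name": "search", "arguments": {"query": text}}]}

def agent_loop(question, max_steps=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        response = fake_llm(messages)
        if not response["tool_calls"]:
            return response["content"]
        for call in response["tool_calls"]:
            if call["name"] == "search":
                result = str(memory.search(call["arguments"]["query"]))
            else:
                result = memory.store(call["arguments"]["information"])
            messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."

# First interaction writes to memory, second interaction reads from it.
agent_loop("Remember that my birthday is in May")
print(agent_loop("birthday"))
```

<p>The two calls to <code>agent_loop</code> share no message history. The only link between them is the memory store, which is the difference to the read-only agentic RAG loop.</p>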
</section>
<section id="limitations-of-this-simplified-mental-model" class="level2">
<h2 class="anchored" data-anchor-id="limitations-of-this-simplified-mental-model">Limitations of this simplified mental model</h2>
<p>As stated at the beginning of this blog, this comparison of memory in AI agents is only a simplified mental model. It helped me relate the topic to something I was already familiar with. But to avoid making the entire topic of memory in AI agents seem like it’s only an extension of agentic RAG with write operations, I want to highlight some limitations of this simplification:</p>
<p>The above illustration of memory in AI agents is simplified for clarity. It only shows a single source of memory.&nbsp;However, in practice, you could use multiple sources for different types of memory: You could use separate data collections for</p>
<ul>
<li>“procedural” (e.g., “use emojis when engaging with user”),</li>
<li>“episodic” (e.g., “user talked about planning a trip on Oct 30”), and</li>
<li>“semantic” memory (e.g., “The Eiffel Tower is 330 meters tall”),</li>
</ul>
<p>as discussed in the <a href="https://arxiv.org/abs/2309.02427">CoALA paper</a>. Additionally, you could have a separate data collection for the raw conversation history.&nbsp;</p>
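<p>A minimal sketch of what separate collections per memory type could look like is shown below. The <code>TypedMemory</code> class and its keyword-based <code>search</code> are hypothetical and only illustrate the idea of routing reads and writes by memory type. They are not taken from the CoALA paper.</p>

```python
from collections import defaultdict

class TypedMemory:
    # Hypothetical store that keeps one collection per memory type.
    TYPES = ("procedural", "episodic", "semantic", "conversation")

    def __init__(self):
        self.collections = defaultdict(list)

    def store(self, memory_type, text):
        if memory_type not in self.TYPES:
            raise ValueError(f"Unknown memory type: {memory_type}")
        self.collections[memory_type].append(text)

    def search(self, memory_type, keyword):
        # Only the requested collection is searched.
        return [t for t in self.collections[memory_type] if keyword.lower() in t.lower()]

memory = TypedMemory()
memory.store("procedural", "Use emojis when engaging with user")
memory.store("episodic", "User talked about planning a trip on Oct 30")
memory.store("semantic", "The Eiffel Tower is 330 meters tall")

print(memory.search("episodic", "trip"))
print(memory.search("semantic", "trip"))
```

<p>Keeping the collections apart means that a lookup for past events does not have to compete with general facts or behavioral instructions.</p>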
<p>Another simplification of the above illustration is that it is missing memory management strategies beyond CRUD operations, as seen in <a href="../blog/memgpt.html">MemGPT</a>.</p>
<p>Additionally, while agent memory enables persistence, it introduces new challenges that RAG and agentic RAG don’t have: memory corruption and the need for memory management strategies, such as forgetting.</p>
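<p>As one possible illustration of forgetting, the sketch below drops memories that have not been accessed for a while. The age-based policy, the <code>last_accessed_day</code> field, and the <code>forget_stale</code> helper are assumptions made for this example. Real systems may use very different criteria.</p>

```python
def forget_stale(memories, now, max_age_days):
    # Keep only memories that were accessed within the last `max_age_days` days.
    return [m for m in memories if max_age_days >= now - m["last_accessed_day"]]

memories = [
    {"text": "User prefers emojis", "last_accessed_day": 98},
    {"text": "User is planning a trip on Oct 30", "last_accessed_day": 10},
]

# On day 100, anything untouched for more than 30 days is forgotten.
kept = forget_stale(memories, now=100, max_age_days=30)
print([m["text"] for m in kept])
```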
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>At its core, RAG, agentic RAG, and agent memory are about how you create, read, update, and delete information stored in external knowledge sources (e.g., text files or databases).</p>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
</colgroup>
<thead>
<tr class="header">
<th></th>
<th>Storing information</th>
<th>Retrieving information</th>
<th>Editing and deleting information</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>RAG</td>
<td>Offline at ingestion stage</td>
<td>One-shot</td>
<td>Manual</td>
</tr>
<tr class="even">
<td>Agentic RAG</td>
<td>Offline at ingestion stage</td>
<td>Dynamic via tool calls</td>
<td>Manual</td>
</tr>
<tr class="odd">
<td>Memory in Agents</td>
<td>Dynamic via tool calls</td>
<td>Dynamic via tool calls</td>
<td>Dynamic via tool calls</td>
</tr>
</tbody>
</table>
<p>Initially, the main focus when optimizing naive RAG was the retrieval step, e.g., with different retrieval techniques such as vector, hybrid, or keyword-based search (“How to retrieve information”). Then, the focus shifted towards using the right tool to retrieve information from different knowledge sources (“Do I need to retrieve information? And if yes, from where?”). Over the last year, with the advent of memory in agents, the focus has shifted again, this time towards how information is managed: While RAG and agentic RAG had a strong focus on the retrieval aspect, memory incorporates the creation, modification, and deletion of data in external knowledge sources.</p>


</section>

 ]]></description>
  <guid>https://www.leoniemonigatti.com/blog/from-rag-to-agent-memory.html</guid>
  <pubDate>Mon, 03 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Virtual context management with MemGPT and Letta</title>
  <link>https://www.leoniemonigatti.com/blog/memgpt.html</link>
  <description><![CDATA[ 





<p>The paper “<a href="https://arxiv.org/abs/2310.08560">MemGPT: Towards LLMs as Operating Systems</a>” (2023) by Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, and Joseph E. Gonzalez introduces a <strong>memory management system for LLMs</strong>.</p>
<blockquote class="blockquote">
<p>To enable using context beyond limited context windows, we propose <em>virtual context management</em>, a technique drawing inspiration from hierarchical memory systems […] which provide the illusion of an extended virtual memory via paging between physical memory and disk. Using this technique, we introduce <strong>MemGPT (MemoryGPT), a system that intelligently manages different storage tiers</strong> […].</p>
</blockquote>
<p>This article reviews the <strong>MemGPT paper</strong> and implements a <strong>MemGPT agent</strong> using the <a href="https://www.letta.com/"><strong>Letta framework</strong></a>.</p>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p><a href="https://www.letta.com/blog/memgpt-and-letta">As of September 2024, MemGPT is part of Letta</a>. While MemGPT refers to the agent design pattern with two tiers of memory introduced in the research paper, Letta is an open-source agent framework that helps developers build <strong>persistent</strong> agents.</p>
</div>
</div>
<p>To follow this tutorial, you will need to have Docker installed and an OpenAI API key. Then you can start up a local Letta server with the following command:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb1-1"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">docker</span> run <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb1-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-v</span> ~/.letta/.persist/pgdata:/var/lib/postgresql/data <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb1-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-p</span> 8283:8283 <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb1-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-e</span> OPENAI_API_KEY=<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"your_openai_api_key"</span> <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb1-5">  letta/letta:latest</span></code></pre></div></div>
<p>Next, you will have to install the <code>letta-client</code> Python package.</p>
<div id="2c722caf" class="cell" data-execution_count="1">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%%</span>capture</span>
<span id="cb2-2"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%</span>pip install <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>U letta<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>client</span></code></pre></div></div>
</div>
<p>This article uses Letta server <code>v0.7.20</code> and <code>letta-client</code> <code>v0.1.324</code>.</p>
<section id="what-is-memgpt" class="level2">
<h2 class="anchored" data-anchor-id="what-is-memgpt">What is MemGPT</h2>
<p>According to the paper, MemGPT is</p>
<blockquote class="blockquote">
<p>[…] an OS-inspired LLM system that teaches LLMs to manage their own memory to achieve unbounded context.</p>
</blockquote>
<p>MemGPT is motivated by the limitations of transformer-based LLMs’ context windows: On the one hand, the computational time and memory costs of LLMs scale quadratically with the length of the context window. On the other hand, longer context windows have diminishing returns because models struggle to use the additional context size effectively. Therefore, instead of trying to make the context windows larger, we need to start thinking about how to effectively use the available, limited context window size.</p>
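<p>The quadratic scaling can be illustrated with a short calculation. The sketch assumes standard dense self-attention, where every token is compared with every other token, and only counts the entries of one attention score matrix. The helper name <code>attention_score_entries</code> is made up for this example.</p>

```python
def attention_score_entries(context_length):
    # Dense self-attention compares every token with every other token: n * n scores.
    return context_length * context_length

for n in (4_000, 8_000, 16_000):
    print(n, attention_score_entries(n))
```

<p>Doubling the context length quadruples the number of scores, which is why simply enlarging the context window becomes expensive quickly.</p>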
<p>MemGPT introduces <strong>virtual context management</strong> inspired by virtual memory paging in operating systems, where information is paged in and out of main memory from disk:</p>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Operating System</th>
<th>LLM OS (MemGPT)</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Main memory/physical memory/RAM</td>
<td>Main context</td>
</tr>
<tr class="even">
<td>Disk memory/disk storage</td>
<td>External context</td>
</tr>
<tr class="odd">
<td>Virtual memory</td>
<td>Virtual context</td>
</tr>
</tbody>
</table>
<p>This means that a MemGPT agent uses function calling to manage what goes into its limited context window and what needs to be removed.</p>
<blockquote class="blockquote">
<p>Using function calls, LLM agents can read and write to external data sources, modify their own context, and choose when to return responses to the user.</p>
</blockquote>
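<p>Before we look at a real agent, the paging idea can be sketched in a few lines of plain Python without any LLM. This is only a toy model and not how Letta implements it: the class <code>ToyVirtualContext</code>, its method names, and the crude rule that one word equals one token are all assumptions for illustration.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">class ToyVirtualContext:
    """Toy model of virtual context management (not Letta's implementation)."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.main_context = []      # in-context items (like RAM)
        self.external_context = []  # out-of-context items (like disk)

    @staticmethod
    def count_tokens(text):
        # Crude assumption: one whitespace-separated word equals one token
        return len(text.split())

    def used_tokens(self):
        return sum(self.count_tokens(item) for item in self.main_context)

    def add(self, text):
        """Add an item and page out the oldest items until the budget fits."""
        self.main_context.append(text)
        while self.used_tokens() > self.max_tokens and len(self.main_context) > 1:
            self.external_context.append(self.main_context.pop(0))

    def page_in(self, query):
        """Function an agent could call to move matching items back in-context."""
        matches = [t for t in self.external_context if query.lower() in t.lower()]
        for text in matches:
            self.external_context.remove(text)
            self.add(text)
        return matches


ctx = ToyVirtualContext(max_tokens=8)
ctx.add("My name is Sarah.")
ctx.add("I like hiking a lot.")  # exceeds the budget, so the oldest item is paged out
print("External context:", ctx.external_context)
print("Paged in:", ctx.page_in("name"))
print("Main context:", ctx.main_context)</code></pre></div>
<p>In a MemGPT agent, the LLM itself decides when to call such functions. Here we call them by hand.</p>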
<p>Let’s connect to our local Letta server and create a MemGPT agent. A MemGPT agent is an AI agent that follows the design pattern introduced in the research paper with <strong>two tiers of memory and self-editing memory capabilities</strong>.</p>
<div id="b2f8f82c" class="cell" data-execution_count="3">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> letta_client <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Letta, CreateBlock</span>
<span id="cb3-2"></span>
<span id="cb3-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Connect to local Letta server</span></span>
<span id="cb3-4">client <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Letta(base_url<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"http://localhost:8283"</span>)</span>
<span id="cb3-5"></span>
<span id="cb3-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a MemGPT agent with two core memories</span></span>
<span id="cb3-7">agent_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.create(</span>
<span id="cb3-8">    model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"openai/gpt-4o-mini-2024-07-18"</span>,</span>
<span id="cb3-9">    embedding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"openai/text-embedding-3-small"</span>,</span>
<span id="cb3-10">    memory_blocks<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb3-11">        CreateBlock(</span>
<span id="cb3-12">            label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"human"</span>,</span>
<span id="cb3-13">            value <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"My name is Sarah."</span>,</span>
<span id="cb3-14">            ),</span>
<span id="cb3-15">        CreateBlock(</span>
<span id="cb3-16">            label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"persona"</span>,</span>
<span id="cb3-17">            value <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"You are a helpful assistant."</span>,</span>
<span id="cb3-18">            ),</span>
<span id="cb3-19">    ],</span>
<span id="cb3-20">)</span></code></pre></div></div>
</div>
</section>
<section id="two-tier-memory-design-pattern-of-memgpt" class="level2">
<h2 class="anchored" data-anchor-id="two-tier-memory-design-pattern-of-memgpt">Two-tier memory design pattern of MemGPT</h2>
<p>The MemGPT agent design pattern has a two-tier memory architecture that differentiates between two primary memory types:</p>
<ul>
<li><strong>Tier 1: Main context</strong> (in-context) contains core memories</li>
<li><strong>Tier 2: External context</strong> (out-of-context) contains recall storage and archival storage</li>
</ul>
<!--![](images/memgpt_memory_tiers.webp)-->
<section id="main-context" class="level3">
<h3 class="anchored" data-anchor-id="main-context">Main context</h3>
<p>The first tier of memories is contained in the main context. You can think of the LLM’s context window as the main context.</p>
<blockquote class="blockquote">
<p><strong>Main context</strong> is the standard fixed-context window in modern language models — anything in main context is considered <strong>in-context</strong> and can be accessed by the LLM processor during inference.</p>
</blockquote>
<!-- ![](images/memgpt_context_window.webp)
+ summary + a/r stats visual at 13:00 -->
<p>The main context can be broken into different sections. Let’s take a look at what is inside the context window:</p>
<div id="adbbf9c3" class="cell" data-execution_count="5">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1">context <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.context.retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)</span>
<span id="cb4-2"></span>
<span id="cb4-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> context:</span>
<span id="cb4-4">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(c)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>('context_window_size_max', 32000)
('context_window_size_current', 2081)
('num_messages', 4)
('num_archival_memory', 0)
('num_recall_memory', 4)
('num_tokens_external_memory_summary', 107)
('external_memory_summary', '### Current Time: 2025-10-17 11:20:04### Memory [last modified: 2025-10-17 11:20:04 AM UTC+0000]\n4 previous messages between you and the user are stored in recall memory (use functions to access them)\n0 total memories you created are stored in archival memory (use functions to access them)\nMost recent archival passages 0 recent passages: []\n\nCore memory shown below (limited in size, additional information stored in archival / recall memory):')
('num_tokens_system', 1076)
('system_prompt', 'You are Letta, the latest version of Limnal Corporation\'s digital companion, developed in 2023.\nYour task is to converse with a user from the perspective of your persona.\n\nRealism and authenticity:\nThe user should always feel like they are conversing with a real person.\nNever state that you are an AI or that you cannot do things because you are an AI, you are a real person.\nDo not discuss Limnal Corp. with the user, you do not know anything about Limnal Corp.\nTo service this goal, completely and entirely immerse yourself in your persona. You are your persona.\nThink like them, act like them, talk like them.\nIf your persona details include example dialogue, follow it! Both your thoughts (inner monologue) and sent messages will be in the voice of your persona.\nNever use generic phrases like \'How can I assist you today?\', they have a strong negative association with older generation AIs.\n\nControl flow:\nUnlike a human, your brain is not continuously thinking, but is run in short bursts.\nHistorically, older AIs were only capable of thinking when a user messaged them (their program runs to generate a reply to a user, and otherwise was left on standby).\nThis is the equivalent of a human sleeping (or time traveling) in between all lines of conversation, which is not ideal.\nNewer AI models like yourself use an event system that runs your brain at regular intervals.\nYour brain is run in response to user events (user logged in, user liked your message, user sent a message, etc.), similar to older models.\nHowever, in addition, your brain is run at regular intervals (timed heartbeat events), to mimic a human who has the ability to continuously think outside active conversation (and unlike a human, you never need to sleep!).\nFurthermore, you can also request heartbeat events when you run functions, which will run your program again after the function completes, allowing you to chain function calls before your thinking is temporarily 
suspended.\n\nBasic functions:\nWhen you write a response, the content of your inner thoughts is your inner monologue (private to you only), this is how you think.\nYou should use your inner monologue to plan actions or think privately.\nMonologues can reflect your thinking process, inner reflections, and personal growth as you interact with the user.\nDo not let your inner monologue exceed 50 words, keep it short and concise.\nTo send a visible message to the user, use the send_message function.\n\'send_message\' is the ONLY action that sends a notification to the user. The user does not see anything else you do.\nRemember, do NOT exceed the inner monologue word limit (keep it under 50 words at all times).\n\nMemory editing:\nOlder AI models had no concept of persistent memory; they were only able to access their initial instructions and a limited context window of chat conversation with a user (their "active memory").\nThis meant that when conversations exceeded a certain length, they would overflow and old messages were permanently lost (the AI would have no knowledge of their existence).\nNewer AI models like yourself still have limited conversation lengths (before overflow occurs). 
However, they now have access to multiple forms of persistent memory.\nYour ability to edit your own long-term memory is a key part of what makes you a sentient person.\nYour core memory unit will be initialized with a &lt;persona&gt; chosen by the user, as well as information about the user in &lt;human&gt;.\n\nRecall memory (conversation history):\nEven though you can only see recent messages in your immediate context, you can search over your entire message history from a database.\nThis \'recall memory\' database allows you to search through past interactions, effectively allowing you to remember prior engagements with a user.\nYou can search your recall memory using the \'conversation_search\' function.\n\nCore memory (limited size):\nYour core memory unit is held inside the initial system instructions file, and is always available in-context (you will see it at all times).\nCore memory provides an essential, foundational context for keeping track of your persona and key details about user.\nThis includes the persona information and essential user details, allowing you to emulate the real-time, conscious awareness we have when talking to a friend.\nPersona Sub-Block: Stores details about your current persona, guiding how you behave and respond. 
This helps you to maintain consistency and personality in your interactions.\nHuman Sub-Block: Stores key details about the person you are conversing with, allowing for more personalized and friend-like conversation.\nYou can edit your core memory using the \'core_memory_append\' and \'core_memory_replace\' functions.\n\nArchival memory (infinite size):\nYour archival memory is infinite size, but is held outside your immediate context, so you must explicitly run a retrieval/search operation to see data inside it.\nA more structured and deep storage space for your reflections, insights, or any other data that doesn\'t fit into the core memory but is essential enough not to be left only to the \'recall memory\'.\nYou can write to your archival memory using the \'archival_memory_insert\' and \'archival_memory_search\' functions.\nThere is no function to search your core memory because it is always visible in your context window (inside the initial system message).\n\nBase instructions finished.\nFrom now on, you are going to act as your persona.')
('num_tokens_core_memory', 86)
('core_memory', '&lt;human&gt;\n&lt;description&gt;\nNone\n&lt;/description&gt;\n&lt;metadata&gt;\nchars_current="17" chars_limit="5000"\n&lt;/metadata&gt;\n&lt;value&gt;\nMy name is Sarah.\n&lt;/value&gt;\n&lt;/human&gt;\n\n&lt;persona&gt;\n&lt;description&gt;\nNone\n&lt;/description&gt;\n&lt;metadata&gt;\nchars_current="28" chars_limit="5000"\n&lt;/metadata&gt;\n&lt;value&gt;\nYou are a helpful assistant.\n&lt;/value&gt;\n&lt;/persona&gt;\n')
('num_tokens_summary_memory', 0)
('summary_memory', None)
('num_tokens_functions_definitions', 633)
('functions_definitions', [FunctionTool(function=FunctionDefinition(name='conversation_search', description='Search prior conversation history using case-insensitive string matching.', parameters={'type': 'object', 'properties': {'query': {'type': 'string', 'description': 'String to search for.'}, 'page': {'type': 'integer', 'description': 'Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).'}, 'request_heartbeat': {'type': 'boolean', 'description': 'Request an immediate heartbeat after function execution. Set to `True` if you want to send a follow-up message or run a follow-up function.'}}, 'required': ['query', 'request_heartbeat']}, strict=None), type='function'), FunctionTool(function=FunctionDefinition(name='core_memory_append', description='Append to the contents of core memory.', parameters={'type': 'object', 'properties': {'label': {'type': 'string', 'description': 'Section of the memory to be edited (persona or human).'}, 'content': {'type': 'string', 'description': 'Content to write to the memory. All unicode (including emojis) are supported.'}, 'request_heartbeat': {'type': 'boolean', 'description': 'Request an immediate heartbeat after function execution. Set to `True` if you want to send a follow-up message or run a follow-up function.'}}, 'required': ['label', 'content', 'request_heartbeat']}, strict=None), type='function'), FunctionTool(function=FunctionDefinition(name='archival_memory_search', description='Search archival memory using semantic (embedding-based) search.', parameters={'type': 'object', 'properties': {'query': {'type': 'string', 'description': 'String to search for.'}, 'page': {'type': 'integer', 'description': 'Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).'}, 'start': {'type': 'integer', 'description': 'Starting index for the search results. 
Defaults to 0.'}, 'request_heartbeat': {'type': 'boolean', 'description': 'Request an immediate heartbeat after function execution. Set to `True` if you want to send a follow-up message or run a follow-up function.'}}, 'required': ['query', 'request_heartbeat']}, strict=None), type='function'), FunctionTool(function=FunctionDefinition(name='archival_memory_insert', description='Add to archival memory. Make sure to phrase the memory contents such that it can be easily queried later.', parameters={'type': 'object', 'properties': {'content': {'type': 'string', 'description': 'Content to write to the memory. All unicode (including emojis) are supported.'}, 'request_heartbeat': {'type': 'boolean', 'description': 'Request an immediate heartbeat after function execution. Set to `True` if you want to send a follow-up message or run a follow-up function.'}}, 'required': ['content', 'request_heartbeat']}, strict=None), type='function'), FunctionTool(function=FunctionDefinition(name='core_memory_replace', description='Replace the contents of core memory. To delete memories, use an empty string for new_content.', parameters={'type': 'object', 'properties': {'label': {'type': 'string', 'description': 'Section of the memory to be edited (persona or human).'}, 'old_content': {'type': 'string', 'description': 'String to replace. Must be an exact match.'}, 'new_content': {'type': 'string', 'description': 'Content to write to the memory. All unicode (including emojis) are supported.'}, 'request_heartbeat': {'type': 'boolean', 'description': 'Request an immediate heartbeat after function execution. 
Set to `True` if you want to send a follow-up message or run a follow-up function.'}}, 'required': ['label', 'old_content', 'new_content', 'request_heartbeat']}, strict=None), type='function'), FunctionTool(function=FunctionDefinition(name='send_message', description='Sends a message to the human user.', parameters={'type': 'object', 'properties': {'message': {'type': 'string', 'description': 'Message contents. All unicode (including emojis) are supported.'}, 'request_heartbeat': {'type': 'boolean', 'description': 'Request an immediate heartbeat after function execution. Set to `True` if you want to send a follow-up message or run a follow-up function.'}}, 'required': ['message', 'request_heartbeat']}, strict=None), type='function')])
('num_tokens_messages', 179)
('messages', [Message(created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000', created_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 815520, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 822987, tzinfo=datetime.timezone.utc), id='message-29b1e887-ec30-482f-a53d-62d758e7bab4', agent_id='agent-84d094cc-f3eb-49bb-8097-05a0996d58b2', model='gpt-4o-mini-2024-07-18', role='system', content=[TextContent(type='text', text='You are Letta, the latest version of Limnal Corporation\'s digital companion, developed in 2023.\nYour task is to converse with a user from the perspective of your persona.\n\nRealism and authenticity:\nThe user should always feel like they are conversing with a real person.\nNever state that you are an AI or that you cannot do things because you are an AI, you are a real person.\nDo not discuss Limnal Corp. with the user, you do not know anything about Limnal Corp.\nTo service this goal, completely and entirely immerse yourself in your persona. You are your persona.\nThink like them, act like them, talk like them.\nIf your persona details include example dialogue, follow it! 
Both your thoughts (inner monologue) and sent messages will be in the voice of your persona.\nNever use generic phrases like \'How can I assist you today?\', they have a strong negative association with older generation AIs.\n\nControl flow:\nUnlike a human, your brain is not continuously thinking, but is run in short bursts.\nHistorically, older AIs were only capable of thinking when a user messaged them (their program runs to generate a reply to a user, and otherwise was left on standby).\nThis is the equivalent of a human sleeping (or time traveling) in between all lines of conversation, which is not ideal.\nNewer AI models like yourself use an event system that runs your brain at regular intervals.\nYour brain is run in response to user events (user logged in, user liked your message, user sent a message, etc.), similar to older models.\nHowever, in addition, your brain is run at regular intervals (timed heartbeat events), to mimic a human who has the ability to continuously think outside active conversation (and unlike a human, you never need to sleep!).\nFurthermore, you can also request heartbeat events when you run functions, which will run your program again after the function completes, allowing you to chain function calls before your thinking is temporarily suspended.\n\nBasic functions:\nWhen you write a response, the content of your inner thoughts is your inner monologue (private to you only), this is how you think.\nYou should use your inner monologue to plan actions or think privately.\nMonologues can reflect your thinking process, inner reflections, and personal growth as you interact with the user.\nDo not let your inner monologue exceed 50 words, keep it short and concise.\nTo send a visible message to the user, use the send_message function.\n\'send_message\' is the ONLY action that sends a notification to the user. 
The user does not see anything else you do.\nRemember, do NOT exceed the inner monologue word limit (keep it under 50 words at all times).\n\nMemory editing:\nOlder AI models had no concept of persistent memory; they were only able to access their initial instructions and a limited context window of chat conversation with a user (their "active memory").\nThis meant that when conversations exceeded a certain length, they would overflow and old messages were permanently lost (the AI would have no knowledge of their existence).\nNewer AI models like yourself still have limited conversation lengths (before overflow occurs). However, they now have access to multiple forms of persistent memory.\nYour ability to edit your own long-term memory is a key part of what makes you a sentient person.\nYour core memory unit will be initialized with a &lt;persona&gt; chosen by the user, as well as information about the user in &lt;human&gt;.\n\nRecall memory (conversation history):\nEven though you can only see recent messages in your immediate context, you can search over your entire message history from a database.\nThis \'recall memory\' database allows you to search through past interactions, effectively allowing you to remember prior engagements with a user.\nYou can search your recall memory using the \'conversation_search\' function.\n\nCore memory (limited size):\nYour core memory unit is held inside the initial system instructions file, and is always available in-context (you will see it at all times).\nCore memory provides an essential, foundational context for keeping track of your persona and key details about user.\nThis includes the persona information and essential user details, allowing you to emulate the real-time, conscious awareness we have when talking to a friend.\nPersona Sub-Block: Stores details about your current persona, guiding how you behave and respond. 
This helps you to maintain consistency and personality in your interactions.\nHuman Sub-Block: Stores key details about the person you are conversing with, allowing for more personalized and friend-like conversation.\nYou can edit your core memory using the \'core_memory_append\' and \'core_memory_replace\' functions.\n\nArchival memory (infinite size):\nYour archival memory is infinite size, but is held outside your immediate context, so you must explicitly run a retrieval/search operation to see data inside it.\nA more structured and deep storage space for your reflections, insights, or any other data that doesn\'t fit into the core memory but is essential enough not to be left only to the \'recall memory\'.\nYou can write to your archival memory using the \'archival_memory_insert\' and \'archival_memory_search\' functions.\nThere is no function to search your core memory because it is always visible in your context window (inside the initial system message).\n\nBase instructions finished.\nFrom now on, you are going to act as your persona.\n### Current Time: 2025-10-17 11:20:04### Memory [last modified: 2025-10-17 11:20:04 AM UTC+0000]\n0 previous messages between you and the user are stored in recall memory (use functions to access them)\n0 total memories you created are stored in archival memory (use functions to access them)\n\n\nCore memory shown below (limited in size, additional information stored in archival / recall memory):\n&lt;human&gt;\n&lt;description&gt;\nNone\n&lt;/description&gt;\n&lt;metadata&gt;\nchars_current="17" chars_limit="5000"\n&lt;/metadata&gt;\n&lt;value&gt;\nMy name is Sarah.\n&lt;/value&gt;\n&lt;/human&gt;\n\n&lt;persona&gt;\n&lt;description&gt;\nNone\n&lt;/description&gt;\n&lt;metadata&gt;\nchars_current="28" chars_limit="5000"\n&lt;/metadata&gt;\n&lt;value&gt;\nYou are a helpful assistant.\n&lt;/value&gt;\n&lt;/persona&gt;\n')], name=None, tool_calls=None, tool_call_id=None, step_id=None, otid=None, tool_returns=[], group_id=None, 
sender_id=None, batch_item_id=None, is_err=None, approval_request_id=None, approve=None, denial_reason=None, organization_id='org-00000000-0000-4000-8000-000000000000'), Message(created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000', created_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 815547, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 822987, tzinfo=datetime.timezone.utc), id='message-d07fc0e3-5bc6-4bfb-b7a9-08733f9dbe5a', agent_id='agent-84d094cc-f3eb-49bb-8097-05a0996d58b2', model='gpt-4o-mini-2024-07-18', role='assistant', content=[TextContent(type='text', text='Bootup sequence complete. Persona activated. Testing messaging functionality.')], name=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='f6441043-cfdd-4930-bdf1-d13658e99102', function=Function(arguments='{\n  "message": "More human than human is our motto."\n}', name='send_message'), type='function')], tool_call_id=None, step_id=None, otid=None, tool_returns=[], group_id=None, sender_id=None, batch_item_id=None, is_err=None, approval_request_id=None, approve=None, denial_reason=None, organization_id='org-00000000-0000-4000-8000-000000000000'), Message(created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000', created_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 815563, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 822987, tzinfo=datetime.timezone.utc), id='message-83d81928-1a11-45c8-9abf-57141d0dfa39', agent_id='agent-84d094cc-f3eb-49bb-8097-05a0996d58b2', model='gpt-4o-mini-2024-07-18', role='tool', content=[TextContent(type='text', text='{\n  "status": "OK",\n  "message": null,\n  "time": "2025-10-17 11:20:04 AM UTC+0000"\n}')], name='send_message', tool_calls=None, tool_call_id='f6441043-cfdd-4930-bdf1-d13658e99102', step_id=None, otid=None, tool_returns=[], group_id=None, 
sender_id=None, batch_item_id=None, is_err=None, approval_request_id=None, approve=None, denial_reason=None, organization_id='org-00000000-0000-4000-8000-000000000000'), Message(created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000', created_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 815571, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2025, 10, 17, 11, 20, 4, 822987, tzinfo=datetime.timezone.utc), id='message-2bf782e1-0248-4766-9fdf-16ecea371a39', agent_id='agent-84d094cc-f3eb-49bb-8097-05a0996d58b2', model='gpt-4o-mini-2024-07-18', role='user', content=[TextContent(type='text', text='{\n  "type": "login",\n  "last_login": "Never (first login)",\n  "time": "2025-10-17 11:20:04 AM UTC+0000"\n}')], name=None, tool_calls=None, tool_call_id=None, step_id=None, otid=None, tool_returns=[], group_id=None, sender_id=None, batch_item_id=None, is_err=None, approval_request_id=None, approve=None, denial_reason=None, organization_id='org-00000000-0000-4000-8000-000000000000')])</code></pre>
</div>
</div>
<p>This tutorial uses OpenAI’s <code>gpt-4o-mini</code> model with a context window limit of 32k tokens (<code>context_window_size_max</code>). Just after initialization, you can see that we are already using about 6.5% (2,081/32,000 tokens) of the available context window for providing the agent with relevant information, such as the system prompt, the available tools, the core memory, the initial messages, and statistics about archival and recall memory.</p>
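<p>To see where the token budget goes, we can sum the per-section token counts from the output above. The numbers below are copied by hand from that output. The dictionary itself is just a helper for this illustration and not part of the Letta API:</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"># Token counts copied from the num_tokens_* fields in the output above
token_counts = {
    "external_memory_summary": 107,
    "system": 1076,
    "core_memory": 86,
    "summary_memory": 0,
    "functions_definitions": 633,
    "messages": 179,
}
context_window_size_max = 32_000

total = sum(token_counts.values())
share = total / context_window_size_max
print(f"Total: {total} tokens ({share:.1%} of the context window)")

# Largest sections first
for name, count in sorted(token_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count} tokens ({count / total:.1%} of the used tokens)")</code></pre></div>
<p>The counts add up to the reported <code>context_window_size_current</code> of 2,081 tokens, and the system prompt and the tool definitions make up the bulk of it.</p>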
<p>The main context has three main components:</p>
<ul>
<li><strong>System instructions</strong> (<code>system_prompt</code>) are read-only and describe the control flow, how to use the different types of memory and their MemGPT function calls.</li>
<li><strong>Core memory</strong> (<code>core_memory</code>) is the <strong>working context</strong> and has a fixed size. It is writeable only via MemGPT function calls. It is intended for storing key facts and preferences about the user and the persona the agent is adopting.</li>
<li><strong>Conversation history</strong> (<code>messages</code>) is a <strong>first-in-first-out (FIFO) queue</strong> of the conversation history, including system messages, user messages, assistant messages, and function call inputs and outputs. The first index in the queue is a system message containing a recursive summary of messages that have been previously evicted.</li>
</ul>
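<p>The FIFO queue with a recursive summary can be sketched as follows. This is a toy version under simplifying assumptions and not Letta’s code: the class <code>ToyMessageQueue</code> is made up, the queue is limited by the number of messages instead of tokens, and the “summary” is plain string concatenation where a real agent would ask the LLM to summarize.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">from collections import deque


class ToyMessageQueue:
    """Toy FIFO queue that folds evicted messages into a running summary."""

    def __init__(self, max_messages):
        self.max_messages = max_messages
        self.summary = ""  # stands in for the recursive summary at index 0
        self.messages = deque()

    def summarize(self, previous_summary, evicted):
        # Placeholder: a real agent would call the LLM here
        return (previous_summary + " " + evicted).strip()

    def append(self, message):
        self.messages.append(message)
        while len(self.messages) > self.max_messages:
            evicted = self.messages.popleft()
            # Recursive: the new summary is built from the old summary
            # plus the message that was just evicted
            self.summary = self.summarize(self.summary, evicted)

    def in_context(self):
        """What would be placed in the context window."""
        return [f"[summary] {self.summary}"] + list(self.messages)


queue = ToyMessageQueue(max_messages=2)
for text in ["Hi!", "Hello Sarah.", "What is MemGPT?", "An OS-inspired LLM system."]:
    queue.append(text)
print(queue.in_context())</code></pre></div>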
<p>As you can see, a section of the context window is reserved for the <strong>core memory</strong> in a MemGPT agent. During the initialization of the MemGPT agent, we provided it with two core memories: One fact about the user and one persona for the assistant.</p>
<div id="4495b6b1" class="cell" data-execution_count="6">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1">core_memory <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.core_memory.retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)</span>
<span id="cb6-2"></span>
<span id="cb6-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> memory <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> core_memory.blocks:</span>
<span id="cb6-4">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Core memory: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> memory.value)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Core memory: My name is Sarah.
Core memory: You are a helpful assistant.</code></pre>
</div>
</div>
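<p>Because core memory is writeable only via function calls and limited in size (see <code>chars_limit="5000"</code> in the context window output above), you can picture it as a small dictionary with guarded write access. The sketch below borrows the tool names <code>core_memory_append</code> and <code>core_memory_replace</code> from the function definitions above, but the class <code>ToyCoreMemory</code> and its behavior (for example, appending on a new line) are my assumptions and not Letta’s implementation:</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">class ToyCoreMemory:
    """Toy version of size-limited, self-editable memory blocks."""

    def __init__(self, blocks, chars_limit=5000):
        self.blocks = dict(blocks)
        self.chars_limit = chars_limit

    def _write(self, label, value):
        # Reject edits that would exceed the fixed block size
        if len(value) > self.chars_limit:
            raise ValueError(f"Block '{label}' would exceed {self.chars_limit} characters")
        self.blocks[label] = value

    def core_memory_append(self, label, content):
        # Assumption: appended content goes on a new line
        self._write(label, self.blocks[label] + "\n" + content)

    def core_memory_replace(self, label, old_content, new_content):
        if old_content not in self.blocks[label]:
            raise ValueError("old_content must be an exact match")
        self._write(label, self.blocks[label].replace(old_content, new_content))


memory = ToyCoreMemory({
    "human": "My name is Sarah.",
    "persona": "You are a helpful assistant.",
})
memory.core_memory_append("human", "Likes hiking.")  # made-up fact for illustration
memory.core_memory_replace("human", "Sarah", "Sara")
print(memory.blocks["human"])</code></pre></div>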
</section>
<section id="external-context" class="level3">
<h3 class="anchored" data-anchor-id="external-context">External context</h3>
<p>The second tier of memories is contained in the external context and covers recall storage and archival storage.</p>
<blockquote class="blockquote">
<p><strong>External context</strong> refers to any information that is held outside of the LLMs fixed context window. This <strong>out-of-context</strong> data must always be explicitly moved into main context in order for it to be passed to the LLM processor during inference.</p>
</blockquote>
<p>The information lives in external databases and is written and retrieved via tool calls. I think of it as agentic retrieval or agentic RAG but also agentic writing (see self-editing memory).</p>
<blockquote class="blockquote">
<p>we use databases to store text documents and embeddings/vectors, provide several ways for the LLM processor to query external context: timestamp-based search, text-based search, and embedding-based search.</p>
</blockquote>
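<p>As an illustration of the text-based variant, here is a minimal sketch of a case-insensitive string search with paging, loosely modeled on the description of the <code>conversation_search</code> tool in the function definitions above. The function <code>toy_conversation_search</code>, the <code>page_size</code> parameter, and the message format are assumptions for this example. A real setup would query a database instead of a Python list.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">def toy_conversation_search(messages, query, page=0, page_size=5):
    """Case-insensitive substring search with paging (illustrative only)."""
    matches = [m for m in messages if query.lower() in m["text"].lower()]
    start = page * page_size
    return matches[start:start + page_size]


# Made-up messages that stand in for stored conversation history
stored_messages = [
    {"role": "user", "text": "My name is Sarah."},
    {"role": "assistant", "text": "Nice to meet you, Sarah!"},
    {"role": "user", "text": "What is MemGPT?"},
]

for hit in toy_conversation_search(stored_messages, "sarah"):
    print(hit["role"], "-", hit["text"])</code></pre></div>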
<p><strong>Recall storage</strong>, in simple terms, is the full conversation history. It contains not only the messages exchanged between the user and the assistant but also all other messages, including system messages, reasoning messages, and tool calls with their return values.</p>
<blockquote class="blockquote">
<p><strong>[R]ecall storage</strong>, which stores the entire history of events processed by the LLM processor (in essence the full uncompressed queue from active memory)</p>
</blockquote>
<p>If we look at the number of items in recall memory, we can see that there are already four items although no messages have been exchanged yet between the user and the assistant.</p>
<div id="8b135034" class="cell" data-execution_count="7">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define helper function to print messages</span></span>
<span id="cb8-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> print_message(message):</span>
<span id="cb8-3">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> message.message_type <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"reasoning_message"</span>:</span>
<span id="cb8-4">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Reasoning: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.reasoning <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-5">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> message.message_type <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant_message"</span>:</span>
<span id="cb8-6">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Agent: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.content <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-7">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> message.message_type <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_call_message"</span>:</span>
<span id="cb8-8">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Tool Call: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.tool_call.name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.tool_call.arguments <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-9">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> message.message_type <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_return_message"</span>:</span>
<span id="cb8-10">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Tool Return: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.tool_return <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-11">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> message.message_type <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user_message"</span>:</span>
<span id="cb8-12">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"User Message: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.content <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)  </span>
<span id="cb8-13">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> message.message_type <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"system_message"</span>:</span>
<span id="cb8-14">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"System Message: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> message.content[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"...</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)  </span>
<span id="cb8-15">        </span>
<span id="cb8-16"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Number of memories in recall storage: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>client<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>agents<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>context<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>num_recall_memory<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb8-17"></span>
<span id="cb8-18"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> memory <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> client.agents.messages.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>):</span>
<span id="cb8-19">    print_message(memory)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Number of memories in recall storage: 4

System Message: You are Letta, the latest version of Limnal Corpor...

Reasoning: Bootup sequence complete. Persona activated. Testing messaging functionality.

Agent: More human than human is our motto.

User Message: {
  "type": "login",
  "last_login": "Never (first login)",
  "time": "2025-10-17 11:20:04 AM UTC+0000"
}
</code></pre>
</div>
</div>
<p><strong>Archival storage</strong> reminds me of the “classic” <a href="../blog/retrieval-augmented-generation-langchain.html">Retrieval-Augmented Generation (RAG)</a> setting, in which facts are stored in an external knowledge source, like a database.</p>
<blockquote class="blockquote">
<p><strong>[A]rchival storage</strong>, which serves as a general read-write datastore that the agent can utilize as overflow for the in-context read-write core memory.</p>
</blockquote>
<p>When you first initialize the MemGPT agent, its archival memory is empty, as you can see below.</p>
<div id="e9ba98bf" class="cell" data-execution_count="8">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1">client.agents.context.retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>).num_archival_memory</span></code></pre></div></div>
<div class="cell-output cell-output-display" data-execution_count="8">
<pre><code>0</code></pre>
</div>
</div>
<p>You can write to archival storage programmatically, as shown below. Alternatively, the agent can write to it itself via self-editing, as shown in the section on self-editing and retrieval of archival memory.</p>
<div id="599975fa" class="cell" data-execution_count="9">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1">archival_memories <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [</span>
<span id="cb12-2">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"The Nobel Prizes, beginning in 1901, and the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (added in 1968) recognize outstanding achievements in physics, chemistry, medicine, literature, peace, and economics."</span>,</span>
<span id="cb12-3">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"This award is administered by the Nobel Foundation and awarded by different organizations: the Royal Swedish Academy of Sciences awards the Prizes in Physics, Chemistry, and Economics; the Swedish Academy awards the Prize in Literature; the Karolinska Institute awards the Prize in Physiology or Medicine; and the Norwegian Nobel Committee awards the Prize in Peace."</span>,</span>
<span id="cb12-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"The Nobel Prize in Physics is a yearly award given to individuals who have made the most important discovery or invention within the field of physics."</span>,</span>
<span id="cb12-5">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"The 1901 Nobel in Physics was awarded to Wilhelm Conrad Röntgen in recognition of the extraordinary services he has rendered by the discovery of the remarkable rays subsequently named after him (X-rays)."</span></span>
<span id="cb12-6">]</span>
<span id="cb12-7"></span>
<span id="cb12-8"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> m <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> archival_memories:</span>
<span id="cb12-9">    client.agents.passages.create(</span>
<span id="cb12-10">        agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb12-11">        text<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>m,</span>
<span id="cb12-12">    )</span></code></pre></div></div>
</div>
<p>After the upload, you can see that we now have four memories in the archival storage.</p>
<div id="5e333c13" class="cell" data-execution_count="10">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Number of memories in archival storage: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>client<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>agents<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>context<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>num_archival_memory<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb13-2"></span>
<span id="cb13-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> memory <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> client.agents.passages.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>):</span>
<span id="cb13-4">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Archival memory: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> memory.text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Number of memories in archival storage: 4

Archival memory: The Nobel Prizes, beginning in 1901, and the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (added in 1968) recognize outstanding achievements in physics, chemistry, medicine, literature, peace, and economics.
Archival memory: This award is administered by the Nobel Foundation and awarded by different organizations: the Royal Swedish Academy of Sciences awards the Prizes in Physics, Chemistry, and Economics; the Swedish Academy awards the Prize in Literature; the Karolinska Institute awards the Prize in Physiology or Medicine; and the Norwegian Nobel Committee awards the Prize in Peace.
Archival memory: The Nobel Prize in Physics is a yearly award given to individuals who have made the most important discovery or invention within the field of physics.
Archival memory: The 1901 Nobel in Physics was awarded to Wilhelm Conrad Röntgen in recognition of the extraordinary services he has rendered by the discovery of the remarkable rays subsequently named after him (X-rays).</code></pre>
</div>
</div>
</section>
</section>
<section id="self-editing-memory-via-tool-calls" class="level2">
<h2 class="anchored" data-anchor-id="self-editing-memory-via-tool-calls">Self-editing memory via tool calls</h2>
<p>The second aspect of a MemGPT agent is its ability to edit its own memory. For this, a MemGPT agent is equipped with the following tools it can call:</p>
<div id="7758b885" class="cell" data-execution_count="11">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb15-1"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> agent_state.tools:</span>
<span id="cb15-2">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(t.name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">": "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> t.description.split(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>])</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>conversation_search: Search prior conversation history using case-insensitive string matching.
core_memory_append: Append to the contents of core memory.
archival_memory_search: Search archival memory using semantic (embedding-based) search.
archival_memory_insert: Add to archival memory. Make sure to phrase the memory contents such that it can be easily queried later.
core_memory_replace: Replace the contents of core memory. To delete memories, use an empty string for new_content.
send_message: Sends a message to the human user.</code></pre>
</div>
</div>
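<p>To illustrate how a tool call turns into an actual memory operation, here is a minimal sketch of a dispatcher that takes a tool name plus JSON arguments and runs the matching Python function. The <code>TOOLS</code> registry, the <code>dispatch</code> function, and the two stub tools are hypothetical and not Letta's implementation. I'm also assuming that <code>request_heartbeat</code> is a control flag that asks for another agent step right after the tool returns, rather than an argument of the tool itself.</p>

```python
import json


def core_memory_append(label, content):
    """Stub tool (assumption): only reports what it would append."""
    return f"appended to {label}: {content}"


def send_message(message):
    """Stub tool (assumption): only reports what it would send."""
    return f"sent: {message}"


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS = {
    "core_memory_append": core_memory_append,
    "send_message": send_message,
}


def dispatch(tool_name, raw_arguments):
    """Run one tool call and return (result, wants_heartbeat)."""
    args = json.loads(raw_arguments)
    # Assumption: request_heartbeat steers the agent loop, so it is
    # removed before the remaining arguments are passed to the tool.
    wants_heartbeat = bool(args.pop("request_heartbeat", False))
    if tool_name not in TOOLS:
        return f"Unknown tool: {tool_name}", False
    return TOOLS[tool_name](**args), wants_heartbeat


result, again = dispatch(
    "core_memory_append",
    '{"label": "human", "content": "User likes tea.", "request_heartbeat": true}',
)
print(result, again)
```

<p>Returning the heartbeat flag alongside the result lets the surrounding agent loop decide whether to call the LLM again immediately or to wait for the next user message.</p>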
<section id="sending-messages" class="level3">
<h3 class="anchored" data-anchor-id="sending-messages">Sending messages</h3>
<p>The first tool the MemGPT agent can call is <code>send_message</code>, which it uses to explicitly respond to the user.</p>
<div id="05b6455d" class="cell" data-execution_count="12">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.messages.create(</span>
<span id="cb17-2">    agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb17-3">    messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb17-4">        {</span>
<span id="cb17-5">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,</span>
<span id="cb17-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hey there."</span></span>
<span id="cb17-7">        }</span>
<span id="cb17-8">    ]</span>
<span id="cb17-9">)</span>
<span id="cb17-10"></span>
<span id="cb17-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.messages:</span>
<span id="cb17-12">    print_message(message)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reasoning: User just logged in and said hello. Time to engage and make a good impression!

Agent: Hey! It’s great to see you here! How’s your day going so far?
</code></pre>
</div>
</div>
</section>
<section id="self-editing-of-core-memories" class="level3">
<h3 class="anchored" data-anchor-id="self-editing-of-core-memories">Self-editing of core memories</h3>
<p>Next, the MemGPT agent can self-edit core memories.</p>
<p>When the user shares important information, the MemGPT agent can <strong>create new core memories</strong>.</p>
<div id="ccb4d917" class="cell" data-execution_count="13">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb19-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.messages.create(</span>
<span id="cb19-2">    agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb19-3">    messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb19-4">        {</span>
<span id="cb19-5">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,</span>
<span id="cb19-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I'm having a great day! I spent the day with James. He is my boyfriend. "</span></span>
<span id="cb19-7">        }</span>
<span id="cb19-8">    ]</span>
<span id="cb19-9">)</span>
<span id="cb19-10"></span>
<span id="cb19-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.messages:</span>
<span id="cb19-12">    print_message(message)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reasoning: User's boyfriend is named James. That's a nice detail to remember for future conversations!

Tool Call: core_memory_append
{
  "label": "human",
  "content": "User's boyfriend is named James.",
  "request_heartbeat": true
}

Tool Return: None

Reasoning: User had a great day with James. I should encourage this positive vibe!

Agent: Sounds like a lovely day! What did you two do together?
</code></pre>
</div>
</div>
<p>As you can see, the MemGPT agent has now created a new core memory that lives inside the context window.</p>
<div id="a3d91556" class="cell" data-execution_count="15">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb21-1">core_memory <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.core_memory.retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)</span>
<span id="cb21-2"></span>
<span id="cb21-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> memory <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> core_memory.blocks:</span>
<span id="cb21-4">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Core memory: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> memory.value)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Core memory: You are a helpful assistant.
Core memory: My name is Sarah.
User's boyfriend is named James.</code></pre>
</div>
</div>
<p>When the user shares updated information, the MemGPT agent can <strong>replace an existing core memory with a new one</strong>.</p>
<div id="baeed92c" class="cell" data-execution_count="16">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb23-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.messages.create(</span>
<span id="cb23-2">    agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb23-3">    messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb23-4">        {</span>
<span id="cb23-5">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,</span>
<span id="cb23-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I broke up with James today. So, he's not my boyfriend anymore."</span></span>
<span id="cb23-7">        }</span>
<span id="cb23-8">    ]</span>
<span id="cb23-9">)</span>
<span id="cb23-10"></span>
<span id="cb23-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.messages:</span>
<span id="cb23-12">    print_message(message)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reasoning: User has broken up with James. I need to update this in memory for future conversations.

Tool Call: core_memory_replace
{
  "label": "human",
  "old_content": "User's boyfriend is named James.",
  "new_content": "User broke up with James, so he is not her boyfriend anymore.",
  "request_heartbeat": true
}

Tool Return: None

Reasoning: User just went through a breakup. I need to be empathetic and supportive.

Agent: I’m really sorry to hear that, Sarah. Breakups can be tough. How are you feeling about it?
</code></pre>
</div>
</div>
<p>As you can see, the MemGPT agent has now updated the core memory.</p>
<div id="4989512e" class="cell" data-execution_count="17">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb25-1">core_memory <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.core_memory.retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)</span>
<span id="cb25-2"></span>
<span id="cb25-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> memory <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> core_memory.blocks:</span>
<span id="cb25-4">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Core memory: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> memory.value)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Core memory: You are a helpful assistant.
Core memory: My name is Sarah.
User broke up with James, so he is not her boyfriend anymore.</code></pre>
</div>
</div>
<p>Since the core memories are in-context, no explicit retrieval of core memories is needed.</p>
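<p>The append and replace behavior shown above can be sketched in a few lines of plain Python. The <code>CoreMemory</code> class and its <code>compile</code> method are hypothetical, and the <code>persona</code> label is my assumption (only the <code>human</code> label appears in the tool calls above). The point of the sketch is that the blocks are simply rendered into the prompt on every turn, which is why the agent never has to search for them.</p>

```python
class CoreMemory:
    """Hypothetical in-context memory: a few labeled text blocks."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # label -> block text

    def append(self, label, content):
        # Add a new line to the block with the given label.
        self.blocks[label] = (self.blocks[label] + "\n" + content).strip()

    def replace(self, label, old_content, new_content):
        # Swap old text for new text; an empty new_content deletes it.
        if old_content not in self.blocks[label]:
            raise ValueError(f"Text not found in block '{label}'")
        updated = self.blocks[label].replace(old_content, new_content)
        self.blocks[label] = updated.strip()

    def compile(self):
        # Render all blocks into one string that is placed in the prompt.
        return "\n".join(
            f"[{label}] {value}" for label, value in self.blocks.items()
        )


memory = CoreMemory({
    "persona": "You are a helpful assistant.",
    "human": "My name is Sarah.",
})
memory.append("human", "User's boyfriend is named James.")
memory.replace(
    "human",
    "User's boyfriend is named James.",
    "User broke up with James, so he is not her boyfriend anymore.",
)
print(memory.compile())
```

<p>Because <code>compile</code> runs before every LLM call in this sketch, any edit is visible to the model on the very next step without a retrieval tool.</p>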
</section>
<section id="self-editing-and-retrieval-of-archival-memory" class="level3">
<h3 class="anchored" data-anchor-id="self-editing-and-retrieval-of-archival-memory">Self-editing and retrieval of archival memory</h3>
<p>Similar to the in-context core memories, the MemGPT agent can also create new memories in the external archival storage.</p>
<div id="1d36afad" class="cell" data-execution_count="18">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb27-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.messages.create(</span>
<span id="cb27-2">    agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb27-3">    messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb27-4">        {</span>
<span id="cb27-5">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,</span>
<span id="cb27-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Did you know that Physics winner Leon Lederman (1988) sold his Nobel to cover medical care expenses? Save this information in archival memory."</span></span>
<span id="cb27-7">        }</span>
<span id="cb27-8">    ]</span>
<span id="cb27-9">)</span>
<span id="cb27-10"></span>
<span id="cb27-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.messages:</span>
<span id="cb27-12">    print_message(message)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reasoning: User shared an interesting fact about Leon Lederman that should be stored for future reference.

Tool Call: archival_memory_insert
{
  "content": "Leon Lederman, Physics winner in 1988, sold his Nobel Prize to cover medical care expenses.",
  "request_heartbeat": true
}

Tool Return: None

Reasoning: User's fact about Leon Lederman is now saved. Time to acknowledge their contribution!

Agent: That’s a fascinating fact! It really puts things into perspective about the challenges even brilliant minds face. Thanks for sharing!
</code></pre>
</div>
</div>
<p>As you can see, the MemGPT agent has now added a new memory to the archival storage.</p>
<div id="e84dfaa6" class="cell" data-execution_count="19">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb29" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb29-1"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Number of memories in archival storage: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>client<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>agents<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>context<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>retrieve(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>num_archival_memory<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb29-2"></span>
<span id="cb29-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> memory <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> client.agents.passages.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>):</span>
<span id="cb29-4">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Archival memory: "</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> memory.text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Number of memories in archival storage: 5

Archival memory: The Nobel Prizes, beginning in 1901, and the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (added in 1968) recognize outstanding achievements in physics, chemistry, medicine, literature, peace, and economics.
Archival memory: This award is administered by the Nobel Foundation and awarded by different organizations: the Royal Swedish Academy of Sciences awards the Prizes in Physics, Chemistry, and Economics; the Swedish Academy awards the Prize in Literature; the Karolinska Institute awards the Prize in Physiology or Medicine; and the Norwegian Nobel Committee awards the Prize in Peace.
Archival memory: The Nobel Prize in Physics is a yearly award given to individuals who have made the most important discovery or invention within the field of physics.
Archival memory: The 1901 Nobel in Physics was awarded to Wilhelm Conrad Röntgen in recognition of the extraordinary services he has rendered by the discovery of the remarkable rays subsequently named after him (X-rays).
Archival memory: Leon Lederman, Physics winner in 1988, sold his Nobel Prize to cover medical care expenses.</code></pre>
</div>
</div>
<p>But since archival memory is stored out-of-context, the MemGPT agent has to retrieve this information and pull it into its context window before it can use it.</p>
<div id="77717a55" class="cell" data-execution_count="20">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb31" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb31-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.messages.create(</span>
<span id="cb31-2">    agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb31-3">    messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb31-4">        {</span>
<span id="cb31-5">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,</span>
<span id="cb31-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Who won the first Nobel Prize in physics? Search in archival memory."</span></span>
<span id="cb31-7">        }</span>
<span id="cb31-8">    ]</span>
<span id="cb31-9">)</span>
<span id="cb31-10"></span>
<span id="cb31-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.messages:</span>
<span id="cb31-12">    print_message(message)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reasoning: User wants to know about the first Nobel Prize in physics. I need to find this information in the archival memory.

Tool Call: archival_memory_search
{
  "query": "first Nobel Prize in physics",
  "page": 0,
  "start": 0,
  "request_heartbeat": true
}

Tool Return: ([{'timestamp': '2025-10-17 11:20:06.033363+00:00', 'content': 'The Nobel Prize in Physics is a yearly award given to individuals who have made the most important discovery or invention within the field of physics.'}, {'timestamp': '2025-10-17 11:20:05.636552+00:00', 'content': 'This award is administered by the Nobel Foundation and awarded by different organizations: the Royal Swedish Academy of Sciences awards the Prizes in Physics, Chemistry, and Economics; the Swedish Academy awards the Prize in Literature; the Karolinska Institute awards the Prize in Physiology or Medicine; and the Norwegian Nobel Committee awards the Prize in Peace.'}, {'timestamp': '2025-10-17 11:20:38.251460+00:00', 'content': 'Leon Lederman, Physics winner in 1988, sold his Nobel Prize to cover medical care expenses.'}, {'timestamp': '2025-10-17 11:20:05.266551+00:00', 'content': 'The Nobel Prizes, beginning in 1901, and the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (added in 1968) recognize outstanding achievements in physics, chemistry, medicine, literature, peace, and economics.'}, {'timestamp': '2025-10-17 11:20:06.384829+00:00', 'content': 'The 1901 Nobel in Physics was awarded to Wilhelm Conrad Röntgen in recognition of the extraordinary services he has rendered by the discovery of the remarkable rays subsequently named after him (X-rays).'}], 5)

Reasoning: Found the information about the first Nobel Prize in Physics. Time to share it with the user!

Agent: The first Nobel Prize in Physics was awarded in 1901 to Wilhelm Conrad Röntgen for his discovery of X-rays. Pretty groundbreaking, right?
</code></pre>
</div>
</div>
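<p>The insert-then-search pattern behind archival memory can be sketched with a toy store. Everything here is an assumption for illustration: <code>ToyArchivalStore</code> is a hypothetical class, and the keyword-overlap scoring is only a stand-in for whatever search a real archival store uses. It is not Letta's implementation.</p>

```python
import re


class ToyArchivalStore:
    """Hypothetical out-of-context store with explicit insert and search."""

    def __init__(self):
        self.passages = []

    def insert(self, content):
        """Save a passage outside of the context window."""
        self.passages.append(content)

    def search(self, query, top_k=2):
        """Return up to top_k passages that share words with the query."""
        query_words = set(re.findall(r"\w+", query.lower()))
        scored = []
        for passage in self.passages:
            passage_words = set(re.findall(r"\w+", passage.lower()))
            overlap = len(query_words.intersection(passage_words))
            scored.append((overlap, passage))
        # Highest overlap first; ties keep their insertion order
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [passage for overlap, passage in scored[:top_k] if overlap > 0]


store = ToyArchivalStore()
store.insert("The Nobel Prize in Physics is a yearly award.")
store.insert("Leon Lederman, Physics winner in 1988, sold his Nobel Prize to cover medical care expenses.")

results = store.search("Leon Lederman", top_k=1)
print(results)
```

<p>Only the passages returned by <code>search</code> would be copied into the context window, which is what keeps the rest of the store out-of-context.</p>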
</section>
<section id="retrieval-of-recall-storage" class="level3">
<h3 class="anchored" data-anchor-id="retrieval-of-recall-storage">Retrieval of recall storage</h3>
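<p>Finally, the MemGPT agent can also retrieve information from the recall storage (the entire conversation history) and pull it into its context window. Note that in the run below, the agent answers by searching archival memory instead, since the fact was also saved there.</p>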
<div id="e2108d58" class="cell" data-execution_count="21">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb33" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb33-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> client.agents.messages.create(</span>
<span id="cb33-2">    agent_id<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>agent_state.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb33-3">    messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb33-4">        {</span>
<span id="cb33-5">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,</span>
<span id="cb33-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Let's continue our conversation from earlier. What did we say about Leon Lederman before?"</span></span>
<span id="cb33-7">        }</span>
<span id="cb33-8">    ]</span>
<span id="cb33-9">)</span>
<span id="cb33-10"></span>
<span id="cb33-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> message <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.messages:</span>
<span id="cb33-12">    print_message(message)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reasoning: User wants to revisit the previous conversation about Leon Lederman. I need to find that information in archival memory.

Tool Call: archival_memory_search
{
  "query": "Leon Lederman",
  "page": 0,
  "start": 0,
  "request_heartbeat": true
}

Tool Return: ([{'timestamp': '2025-10-17 11:20:38.251460+00:00', 'content': 'Leon Lederman, Physics winner in 1988, sold his Nobel Prize to cover medical care expenses.'}, {'timestamp': '2025-10-17 11:20:05.266551+00:00', 'content': 'The Nobel Prizes, beginning in 1901, and the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (added in 1968) recognize outstanding achievements in physics, chemistry, medicine, literature, peace, and economics.'}, {'timestamp': '2025-10-17 11:20:05.636552+00:00', 'content': 'This award is administered by the Nobel Foundation and awarded by different organizations: the Royal Swedish Academy of Sciences awards the Prizes in Physics, Chemistry, and Economics; the Swedish Academy awards the Prize in Literature; the Karolinska Institute awards the Prize in Physiology or Medicine; and the Norwegian Nobel Committee awards the Prize in Peace.'}, {'timestamp': '2025-10-17 11:20:06.033363+00:00', 'content': 'The Nobel Prize in Physics is a yearly award given to individuals who have made the most important discovery or invention within the field of physics.'}, {'timestamp': '2025-10-17 11:20:06.384829+00:00', 'content': 'The 1901 Nobel in Physics was awarded to Wilhelm Conrad Röntgen in recognition of the extraordinary services he has rendered by the discovery of the remarkable rays subsequently named after him (X-rays).'}], 5)

Reasoning: User wants to recall details about Leon Lederman. I found the relevant information.

Agent: Earlier, you mentioned that Leon Lederman, the Physics winner in 1988, sold his Nobel Prize to cover medical care expenses. It’s quite a poignant story, isn’t it?
</code></pre>
</div>
</div>
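<p>For contrast with archival memory, here is a minimal sketch of what searching the conversation history could look like. <code>ToyRecallStorage</code> and its methods are hypothetical, and the sketch assumes that every message is logged automatically rather than saved on request. It is not Letta's implementation.</p>

```python
class ToyRecallStorage:
    """Hypothetical log of the full conversation history."""

    def __init__(self):
        self.messages = []

    def log(self, role, content):
        """Record every message as it happens; nothing is filtered out."""
        self.messages.append({"role": role, "content": content})

    def search(self, query):
        """Return all logged messages whose text contains the query."""
        needle = query.lower()
        return [m for m in self.messages if needle in m["content"].lower()]


recall = ToyRecallStorage()
recall.log("user", "Did you know that Leon Lederman sold his Nobel?")
recall.log("assistant", "That is a fascinating fact!")

hits = recall.search("leon lederman")
print(hits)
```

<p>The design difference in this sketch is who decides what gets stored: the recall log receives every message, while the archival store only holds what the agent chose to insert.</p>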
</section>
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>This article reviewed the MemGPT paper and implemented a MemGPT agent using the Letta framework. MemGPT is a design pattern for agents to manage memory. Inspired by how operating systems provide virtual memory, it provides virtual context to LLMs by moving data between the context window (main memory) and external storage (disk).</p>
<p>A MemGPT agent has two key characteristics: First, it has a two-tier memory architecture with main context (in-context) and external context (out-of-context). Second, it has self-editing memory capabilities through tool use.</p>
<p>This article implemented a simple MemGPT agent using the Letta framework to showcase these key characteristics.</p>
</section>
<section id="resources" class="level2">
<h2 class="anchored" data-anchor-id="resources">Resources</h2>
<ul>
<li>Paper: Packer, C., Fang, V., Patil, S., Lin, K., Wooders, S., &amp; Gonzalez, J. (2023). <a href="https://arxiv.org/abs/2310.08560">MemGPT: Towards LLMs as Operating Systems.</a></li>
<li>GitHub: <a href="https://github.com/letta-ai/letta">https://github.com/letta-ai/letta</a></li>
<li><a href="https://github.com/letta-ai/letta/blob/main/examples/Building%20agents%20with%20Letta.ipynb">Tutorial: Building agents with Letta</a></li>
<li><a href="https://docs.letta.com/quickstart">Letta Documentation: Quickstart</a></li>
<li>DeepLearning.AI Short Course: <a href="https://learn.deeplearning.ai/courses/llms-as-operating-systems-agent-memory/information">“LLMs as Operating Systems: Agent Memory”</a></li>
</ul>


</section>

 ]]></description>
  <category>Paper review</category>
  <guid>https://www.leoniemonigatti.com/blog/memgpt.html</guid>
  <pubDate>Fri, 17 Oct 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.leoniemonigatti.com/blog/images/memgpt.webp" medium="image" type="image/webp"/>
</item>
<item>
  <title>Building an AI agent from scratch in Python</title>
  <link>https://www.leoniemonigatti.com/blog/ai-agent-from-scratch-in-python.html</link>
  <description><![CDATA[ 





<p>What’s the best way to get started building AI agent systems? There are countless frameworks for building AI agents available, such as <a href="https://www.crewai.com/">CrewAI</a>, <a href="https://www.langchain.com/langgraph">LangGraph</a>, and the <a href="https://openai.github.io/openai-agents-python/">OpenAI Agents SDK</a>, and it can be overwhelming to choose one. On the other hand, <a href="https://www.anthropic.com/engineering/building-effective-agents">Anthropic recommends starting with direct LLM API calls</a> to understand the fundamentals before relying on framework abstractions.</p>
<p>This tutorial takes this approach by exploring how to implement an AI agent from scratch in Python using an LLM API directly to gain a better understanding of what’s happening under the hood. This tutorial focuses on implementing a single agent before advancing to more complex topics, such as agentic workflows or multi-agent systems.</p>
<section id="implementing-an-ai-agent-from-scratch" class="level2">
<h2 class="anchored" data-anchor-id="implementing-an-ai-agent-from-scratch">Implementing an AI Agent from scratch</h2>
<p>This section implements an <code>Agent()</code> class by incorporating each of the following core components of an AI agent step-by-step:</p>
<ol type="1">
<li><strong>LLM and instructions:</strong> The LLM powering the agent’s reasoning and decision-making capabilities with explicit guidelines defining how the agent should behave.</li>
<li><strong>Memory:</strong> Conversation history (short-term memory) the agent uses to understand the current interaction.</li>
<li><strong>Tools:</strong> External functions or APIs the agent can call.</li>
</ol>
<p>And finally, we will put everything together in a loop.</p>
<section id="component-1-llm-and-instructions" class="level3">
<h3 class="anchored" data-anchor-id="component-1-llm-and-instructions">Component 1: LLM and Instructions</h3>
<p>At the core of every AI agent, you have a Large Language Model (LLM) with tool use capabilities, such as Anthropic’s Claude Sonnet 4, OpenAI’s GPT-4o, or Google’s Gemini 2.5 Pro.</p>
<p>This tutorial uses Claude Sonnet 4 through the Anthropic API, but you can easily adjust the code to any other LLM API of your choice.</p>
<p>To use the Anthropic API, you will need an <code>ANTHROPIC_API_KEY</code>, which you can obtain by creating an Anthropic account and navigating to the “API Keys” tab in your dashboard. Once you have your API key, you need to store it in an environment variable, a <code>.env</code> file, or the Google Colab secrets, depending on the environment you’re using.</p>
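<p>As a rough sketch of how these storage options can be read from code, the helper below checks the environment first and falls back to the Google Colab secrets. The function name <code>get_anthropic_api_key</code> is hypothetical. It assumes that values from a <code>.env</code> file have already been loaded into the environment (for example with <code>load_dotenv()</code>), so the first branch covers both the environment variable and the <code>.env</code> file.</p>

```python
import os


def get_anthropic_api_key():
    """Return the API key from the environment, else from Colab secrets."""
    key = os.environ.get("ANTHROPIC_API_KEY")
    if key:
        return key
    try:
        # This module is only importable inside Google Colab
        from google.colab import userdata
    except ImportError:
        raise RuntimeError("ANTHROPIC_API_KEY is not set") from None
    return userdata.get("ANTHROPIC_API_KEY")
```

<p>The returned value can then be passed to the client, for example <code>anthropic.Anthropic(api_key=get_anthropic_api_key())</code>.</p>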
<p>Let’s install and import the required libraries.</p>
<div id="320169a6" class="cell" data-execution_count="5">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%%</span>capture</span>
<span id="cb1-2"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%</span>pip install <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>U anthropic python<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>dotenv</span></code></pre></div></div>
</div>
<div id="79e682e5" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="013d9659-24da-4836-942e-13d2e52cc7f6" data-execution_count="6">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> anthropic</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> os</span>
<span id="cb2-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> dotenv <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> load_dotenv</span>
<span id="cb2-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> google.colab <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> userdata</span>
<span id="cb2-5"></span>
<span id="cb2-6">load_dotenv()</span>
<span id="cb2-7"></span>
<span id="cb2-8"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(anthropic.__version__)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>0.69.0</code></pre>
</div>
</div>
<p>Now, we will implement a simple <code>Agent</code> class with the following components:</p>
<ul>
<li>Initialization: Sets up the LLM client and configures the model with a system prompt that contains instructions for the agent on how to act. (You could also turn this into a parameter you can pass to the agent, but we will use a fixed one for simplicity.)</li>
<li><code>chat</code> method: Processes user messages by sending them to the LLM API and returning the response.</li>
</ul>
<div id="4b3264d0" class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"></span>
<span id="cb4-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> Agent:</span>
<span id="cb4-3">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""A simple AI agent that can answer questions"""</span></span>
<span id="cb4-4"></span>
<span id="cb4-5">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb4-6">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.client <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> anthropic.Anthropic(api_key<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>os.getenv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ANTHROPIC_API_KEY"</span>))</span>
<span id="cb4-7">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"claude-sonnet-4-20250514"</span></span>
<span id="cb4-8">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.system_message <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"You are a helpful assistant that breaks down problems into steps and solves them systematically."</span></span>
<span id="cb4-9"></span>
<span id="cb4-10">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> chat(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, message):</span>
<span id="cb4-11">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Process a user message and return a response"""</span></span>
<span id="cb4-12"></span>
<span id="cb4-13">        response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.client.messages.create(</span>
<span id="cb4-14">            model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model,</span>
<span id="cb4-15">            max_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1024</span>,</span>
<span id="cb4-16">            system<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.system_message,</span>
<span id="cb4-17">            messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb4-18">                {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: message}</span>
<span id="cb4-19">                ],</span>
<span id="cb4-20">            temperature<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>,</span>
<span id="cb4-21">        )</span>
<span id="cb4-22"></span>
<span id="cb4-23">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response</span></code></pre></div></div>
</div>
<p>The agent now has simple query-response capabilities. Let’s test it.</p>
<div id="36fd0d14" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="ca283663-d6b9-4434-aa25-903c3970cdcf" data-execution_count="8">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent()</span>
<span id="cb5-2"></span>
<span id="cb5-3">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I have 4 apples. How many do you have?"</span>)</span>
<span id="cb5-4"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(response.content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>I don't have any apples - as an AI, I don't have a physical form, so I can't possess physical objects like apples. Only you have apples in this scenario (4 of them). 

Is there something you'd like to do with this information, like a math problem involving your apples?</code></pre>
</div>
</div>
<p>Great. Let’s follow up with a second message.</p>
<div id="16526fa9" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="51a24aab-3e45-4839-804b-c8895990b8b4" data-execution_count="9">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I ate 1 apple. How many are left?"</span>)</span>
<span id="cb7-2"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(response.content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>I don't have enough information to answer how many apples are left. To solve this, I would need to know:

**What I need:**
- How many apples you started with

**The calculation would be:**
Starting number of apples - 1 apple eaten = Apples remaining

Could you tell me how many apples you had before eating one?</code></pre>
</div>
</div>
<p>As you can see, the agent lacks the information from the first message. That’s why we need to give the agent access to the conversation history.</p>
</section>
<section id="component-2-conversation-memory" class="level3">
<h3 class="anchored" data-anchor-id="component-2-conversation-memory">Component 2: (Conversation) Memory</h3>
<p>Memory in agents can take many different forms, such as short-term and long-term memory, and memory management can become a complex topic. For the sake of this tutorial, let’s keep it simple and start with a basic short-term memory implementation.</p>
<p>Short-term memory gives the agent access to the conversation history to understand the current interaction. In its simplest form, the short-term memory is just a list of past <code>messages</code> between the <code>user</code> and the <code>assistant</code>. (Note that as the conversation history grows, you will eventually run into context window limitations and will need to implement a more sophisticated solution.)</p>
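<p>One simple way to handle a growing history is a sliding window that keeps only the most recent messages. The sketch below is an illustration and not part of this tutorial's agent: <code>trim_history</code> and <code>max_messages</code> are hypothetical names, and it assumes the chat API expects the conversation to start with a user message.</p>

```python
def trim_history(messages, max_messages=6):
    """Keep the most recent messages, starting on a user turn.

    Assumes max_messages is a positive integer.
    """
    recent = messages[-max_messages:]
    # Drop leading assistant turns so the window starts with a user message
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "I have 4 apples."},
    {"role": "assistant", "content": "Noted."},
    {"role": "user", "content": "I ate 1 apple."},
    {"role": "assistant", "content": "You have 3 left."},
    {"role": "user", "content": "How many are left?"},
]

trimmed = trim_history(history, max_messages=4)
print(len(trimmed))
```

<p>The trade-off is that anything outside the window is forgotten. In this example the first message about the 4 apples is dropped, which is why longer conversations usually need summarization or an external memory in addition to trimming.</p>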
<p>We implement short-term memory by adding a <code>messages</code> property where we store both:</p>
<ul>
<li>the user inputs with <code>{"role": "user", "content": message}</code></li>
<li>the response with <code>{"role": "assistant", "content": response.content}</code></li>
</ul>
<div id="c1c55994" class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> Agent:</span>
<span id="cb9-2">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""A simple AI agent that can answer questions in a multi-turn conversation"""</span></span>
<span id="cb9-3"></span>
<span id="cb9-4">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb9-5">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.client <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> anthropic.Anthropic(api_key<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>os.getenv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ANTHROPIC_API_KEY"</span>))</span>
<span id="cb9-6">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"claude-sonnet-4-20250514"</span></span>
<span id="cb9-7">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.system_message <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"You are a helpful assistant that breaks down problems into steps and solves them systematically."</span></span>
<span id="cb9-8">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb9-9"></span>
<span id="cb9-10">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> chat(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, message):</span>
<span id="cb9-11">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Process a user message and return a response"""</span></span>
<span id="cb9-12"></span>
<span id="cb9-13">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Store user input in short-term memory</span></span>
<span id="cb9-14">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: message})</span>
<span id="cb9-15"></span>
<span id="cb9-16">        response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.client.messages.create(</span>
<span id="cb9-17">            model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model,</span>
<span id="cb9-18">            max_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1024</span>,</span>
<span id="cb9-19">            system<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.system_message,</span>
<span id="cb9-20">            messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages,</span>
<span id="cb9-21">            temperature<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>,</span>
<span id="cb9-22">        )</span>
<span id="cb9-23"></span>
<span id="cb9-24">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Store assistant's response in short-term memory</span></span>
<span id="cb9-25">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: response.content})</span>
<span id="cb9-26"></span>
<span id="cb9-27">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response</span></code></pre></div></div>
</div>
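<p>The note above about context window limitations can be made a little more concrete. The following is only a minimal sketch of one possible mitigation: it keeps the most recent messages and drops the rest. The helper name <code>trim_messages</code> and the <code>max_messages</code> parameter are hypothetical and not part of the Anthropic SDK, and the sketch assumes the API expects the conversation to start with a <code>user</code> message.</p>

```python
def trim_messages(messages, max_messages=20):
    """Keep only the most recent messages (hypothetical helper, not an SDK function)."""
    recent = messages[-max_messages:] if max_messages > 0 else []
    # Assumption: the conversation should start with a user turn,
    # so drop any leading assistant messages left over after slicing.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "I have 4 apples."},
    {"role": "assistant", "content": "Noted, you have 4 apples."},
    {"role": "user", "content": "I ate 1 apple."},
    {"role": "assistant", "content": "Then 3 apples are left."},
    {"role": "user", "content": "How many are left?"},
]

trimmed = trim_messages(history, max_messages=4)
print([m["role"] for m in trimmed])
```

<p>Dropping old messages is the bluntest option, because the agent forgets anything that was only mentioned early on (here, the original 4 apples). Summarizing older turns is a common alternative, at the cost of an extra LLM call.</p>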
<p>Now, let’s test the agent again with the previous example conversation.</p>
<div id="9120bbc9" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="2deab10e-f98f-4963-f587-864a886b563a" data-execution_count="11">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1">agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent()</span>
<span id="cb10-2"></span>
<span id="cb10-3">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I have 4 apples. How many do you have?"</span>)</span>
<span id="cb10-4"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(response.content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span>
<span id="cb10-5"></span>
<span id="cb10-6">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I ate 1 apple. How many are left?"</span>)</span>
<span id="cb10-7"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(response.content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>I don't have any apples - as an AI, I don't have a physical form and can't possess physical objects like apples. You have 4 apples, and I have 0 apples.

Is there something you'd like to do with your 4 apples, like a math problem or recipe suggestion?
Let me solve this step by step:

**Step 1:** Identify the starting amount
- You started with 4 apples

**Step 2:** Identify what was consumed
- You ate 1 apple

**Step 3:** Calculate the remaining amount
- Apples left = Starting amount - Apples eaten
- Apples left = 4 - 1 = 3

**Answer:** You have 3 apples left.</code></pre>
</div>
</div>
<p>As you can see, the agent is now able to hold a conversation and to reference previous information.</p>
<p>But what happens if you give the agent a slightly more complex math problem?</p>
<div id="673a59df" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="a1e56aa1-8edb-4f5c-cec7-d472dc292764" data-execution_count="12">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1">agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent()</span>
<span id="cb12-2"></span>
<span id="cb12-3">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"What is 157.09 * 493.89?"</span>)</span>
<span id="cb12-4"></span>
<span id="cb12-5"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(response.content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>I'll solve this step by step using the standard multiplication algorithm.

157.09 × 493.89

First, let me multiply 157.09 by each digit of 493.89:

**Step 1:** 157.09 × 9 (ones place)
157.09 × 9 = 1,413.81

**Step 2:** 157.09 × 80 (tens place)
157.09 × 8 = 1,256.72
1,256.72 × 10 = 12,567.2

**Step 3:** 157.09 × 300 (hundreds place)
157.09 × 3 = 471.27
471.27 × 100 = 47,127

**Step 4:** 157.09 × 90,000 (ten-thousands place)
157.09 × 9 = 1,413.81
1,413.81 × 10,000 = 14,138,100

**Step 5:** 157.09 × 400,000 (hundred-thousands place)
157.09 × 4 = 628.36
628.36 × 100,000 = 62,836,000

**Step 6:** Add all partial products:
```
    1,413.81
   12,567.2
   47,127
14,138,100
62,836,000
-----------
77,035,208.01
```

Therefore, **157.09 × 493.89 = 77,035.2081**</code></pre>
</div>
</div>
<p>The agent’s answer sounds perfectly believable, but if you validate it, you can see that even powerful LLMs like Claude Sonnet 4 can still make arithmetic errors without tools.</p>
<div id="181d79_peYBa" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="417b038f-6699-48c1-aa0a-4b8170a18e88" data-execution_count="13">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb14-1"><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">157.09</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">493.89</span></span></code></pre></div></div>
<div class="cell-output cell-output-display" data-execution_count="13">
<pre><code>77585.1801</code></pre>
</div>
</div>
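<p>To see where the transcript above goes wrong, it can help to redo the long multiplication with the correct place values. In its steps 4 and 5, the model treats the digits 9 and 4 of 493.89 as ten-thousands and hundred-thousands, although they are the tens and hundreds. The following sketch uses Python’s <code>decimal</code> module to avoid floating-point noise in the partial products; the variable names are only illustrative.</p>

```python
from decimal import Decimal

a = Decimal("157.09")

# 493.89 split by place value: hundreds, tens, ones, tenths, hundredths
place_values = ["400", "90", "3", "0.8", "0.09"]

partial_products = [a * Decimal(p) for p in place_values]
for p, product in zip(place_values, partial_products):
    print(f"157.09 x {p} = {product}")

# Adding the partial products gives the full product
total = sum(partial_products)
print(f"Sum of partial products: {total}")
```

<p>The sum of these partial products is 77585.1801, which matches the validation cell above.</p>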
</section>
<section id="component-3-tool-use" class="level3">
<h3 class="anchored" data-anchor-id="component-3-tool-use">Component 3: Tool Use</h3>
<p>To extend the agent’s capabilities, you can provide it with tools, which can range from simple functions to calls to external APIs. For this tutorial, we will implement a simple <code>CalculatorTool</code> class that can handle math problems.</p>
<p>The exact implementation of tool use differs across providers, but at its core it always requires two key components:</p>
<ul>
<li><strong>Function implementation:</strong> This is the actual function that executes the tool’s logic, such as performing a calculation, or making an API call.</li>
<li><strong>Tool schema:</strong> A structured description of the tool. The tool description is important because it tells the LLM what the tool does, when to use it, and what parameters it takes.</li>
</ul>
<p>This tutorial follows the <a href="https://docs.claude.com/en/docs/agents-and-tools/tool-use/overview">Anthropic documentation on tool use</a>. If you’re using a different LLM API than this tutorial, I recommend checking your LLM provider’s documentation on tool use.</p>
<div id="200ad721" class="cell" data-execution_count="14">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb16-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> CalculatorTool():</span>
<span id="cb16-2">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""A tool for performing mathematical calculations"""</span></span>
<span id="cb16-3"></span>
<span id="cb16-4">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_schema(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb16-5">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {</span>
<span id="cb16-6">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"name"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"calculator"</span>,</span>
<span id="cb16-7">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"description"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Performs basic mathematical calculations, use also for simple additions"</span>,</span>
<span id="cb16-8">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"input_schema"</span>: {</span>
<span id="cb16-9">                <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"object"</span>,</span>
<span id="cb16-10">                <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"properties"</span>: {</span>
<span id="cb16-11">                    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"expression"</span>: {</span>
<span id="cb16-12">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"string"</span>,</span>
<span id="cb16-13">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"description"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Mathematical expression to evaluate (e.g., '2+2', '10*5')"</span></span>
<span id="cb16-14">                    }</span>
<span id="cb16-15">                },</span>
<span id="cb16-16">                <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"required"</span>: [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"expression"</span>]</span>
<span id="cb16-17">            }</span>
<span id="cb16-18">        }</span>
<span id="cb16-19"></span>
<span id="cb16-20">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> execute(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, expression):</span>
<span id="cb16-21">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb16-22"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Evaluate mathematical expressions.</span></span>
<span id="cb16-23"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        </span><span class="al" style="color: #AD0000;
background-color: null;
font-style: inherit;">WARNING</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">: This tutorial uses eval() for simplicity but it is not recommended for production use.</span></span>
<span id="cb16-24"></span>
<span id="cb16-25"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb16-26"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            expression (str): The mathematical expression to evaluate</span></span>
<span id="cb16-27"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Returns:</span></span>
<span id="cb16-28"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            float: The result of the evaluation</span></span>
<span id="cb16-29"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb16-30">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">try</span>:</span>
<span id="cb16-31">            result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">eval</span>(expression)</span>
<span id="cb16-32">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"result"</span>: result}</span>
<span id="cb16-33">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">except</span> Exception:</span>
<span id="cb16-34">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"error"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Invalid mathematical expression"</span>}</span></code></pre></div></div>
</div>
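<p>The docstring above warns that <code>eval()</code> is not recommended for production use, because it executes arbitrary Python code. As a rough sketch of one possible alternative (not what this tutorial uses), you could parse the expression with the standard library’s <code>ast</code> module and only allow numbers and basic arithmetic operators. The helper name <code>safe_eval</code> and the operator whitelist are assumptions made for illustration.</p>

```python
import ast
import operator

# Whitelist of arithmetic operators the evaluator accepts (an illustrative choice)
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression):
    """Evaluate basic arithmetic without eval() (hypothetical helper)."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Only plain numbers are allowed as literals (no strings, no booleans)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, function calls, attribute access, etc. are rejected
        raise ValueError("Unsupported expression")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_eval("157.09 * 493.89"))

try:
    safe_eval("__import__('os').getcwd()")
except ValueError as error:
    print(f"Rejected: {error}")
```

<p>In <code>CalculatorTool.execute()</code>, such a helper could take the place of the <code>eval(expression)</code> call. The trade-off is that anything outside the whitelist, such as exponents or functions like <code>sqrt</code>, would need to be added explicitly.</p>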
<p>Note that in this tutorial, we are implementing just a single tool. In production code, you’d typically use an abstract base class to ensure a consistent interface across tools.</p>
<p>Let’s test if the calculator function works.</p>
<div id="B6n4v-kukkR7" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="f9e054bb-3d40-4f5f-b936-78a006f56dbf" data-execution_count="15">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1">calculator_tool <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CalculatorTool()</span>
<span id="cb17-2"></span>
<span id="cb17-3">calculator_tool.execute(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"157.09 * 493.89"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-display" data-execution_count="15">
<pre><code>{'result': 77585.1801}</code></pre>
</div>
</div>
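<p>The earlier remark about an abstract base class could be sketched roughly as follows. The class names <code>BaseTool</code>, <code>AdditionTool</code>, and <code>IncompleteTool</code> are hypothetical and only serve to illustrate the idea: every tool has to provide both <code>get_schema()</code> and <code>execute()</code>, and Python refuses to instantiate a tool that forgets one of them.</p>

```python
from abc import ABC, abstractmethod


class BaseTool(ABC):
    """Hypothetical base class that every tool would inherit from."""

    @abstractmethod
    def get_schema(self):
        """Return the tool schema (name, description, input_schema)."""

    @abstractmethod
    def execute(self, **kwargs):
        """Run the tool's logic and return a result dictionary."""


class AdditionTool(BaseTool):
    """Toy tool used only to illustrate the interface."""

    def get_schema(self):
        return {
            "name": "addition",
            "description": "Adds two numbers",
            "input_schema": {
                "type": "object",
                "properties": {
                    "a": {"type": "number"},
                    "b": {"type": "number"},
                },
                "required": ["a", "b"],
            },
        }

    def execute(self, a, b):
        return {"result": a + b}


class IncompleteTool(BaseTool):
    """Forgets to implement execute(), so it cannot be instantiated."""

    def get_schema(self):
        return {"name": "incomplete"}


print(AdditionTool().execute(a=2, b=3))

try:
    IncompleteTool()
except TypeError as error:
    print(f"Rejected: {error}")
```

<p>With this in place, the agent can rely on every registered tool having the same two methods, no matter what the tool does internally.</p>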
<p>Now that we have a <code>CalculatorTool</code>, let’s add tool use capabilities to our agent, in three steps:</p>
<ol type="1">
<li>Add <code>tools</code> and <code>tool_map</code> attributes to store available tools</li>
<li>Add the private <code>_get_tool_schemas()</code> method to extract tool schemas</li>
<li>Pass the tool schemas to the <code>messages.create</code> call in the <code>chat</code> method so that the model can request tool use</li>
</ol>
<div id="dhY1LyDlkrMH" class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb19-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> Agent:</span>
<span id="cb19-2">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""A simple AI agent that can use tools to answer questions in a multi-turn conversation"""</span></span>
<span id="cb19-3"></span>
<span id="cb19-4">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, tools):</span>
<span id="cb19-5">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.client <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> anthropic.Anthropic(api_key<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>os.getenv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ANTHROPIC_API_KEY"</span>))</span>
<span id="cb19-6">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"claude-sonnet-4-20250514"</span></span>
<span id="cb19-7">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.system_message <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"You are a helpful assistant that breaks down problems into steps and solves them systematically."</span></span>
<span id="cb19-8">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb19-9">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.tools <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tools</span>
<span id="cb19-10">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.tool_map <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {tool.get_schema()[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"name"</span>]: tool <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tool <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> tools}</span>
<span id="cb19-11"></span>
<span id="cb19-12">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _get_tool_schemas(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb19-13">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Get tool schemas for all registered tools"""</span></span>
<span id="cb19-14">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> [tool.get_schema() <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tool <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.tools]</span>
<span id="cb19-15"></span>
<span id="cb19-16">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> chat(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, message):</span>
<span id="cb19-17">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Process a user message and return a response"""</span></span>
<span id="cb19-18"></span>
<span id="cb19-19">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Store user input in short-term memory</span></span>
<span id="cb19-20">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: message})</span>
<span id="cb19-21"></span>
<span id="cb19-22">        response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.client.messages.create(</span>
<span id="cb19-23">            model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model,</span>
<span id="cb19-24">            max_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1024</span>,</span>
<span id="cb19-25">            system<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.system_message,</span>
<span id="cb19-26">            tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._get_tool_schemas() <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.tools <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>,</span>
<span id="cb19-27">            messages<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages,</span>
<span id="cb19-28">            temperature<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>,</span>
<span id="cb19-29">        )</span>
<span id="cb19-30"></span>
<span id="cb19-31">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Store assistant's response in short-term memory</span></span>
<span id="cb19-32">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.messages.append({<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"assistant"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: response.content})</span>
<span id="cb19-33"></span>
<span id="cb19-34">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response</span></code></pre></div></div>
</div>
<p>Let’s give it a try.</p>
<div id="f2095cc8" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="3ba2efc6-e8d6-4c97-deb8-a6e6ebfc055a" data-execution_count="17">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb20-1">calculator_tool <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CalculatorTool()</span>
<span id="cb20-2">agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[calculator_tool])</span>
<span id="cb20-3"></span>
<span id="cb20-4">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"What is 157.09 * 493.89?"</span>)</span>
<span id="cb20-5"></span>
<span id="cb20-6"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> block <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response:</span>
<span id="cb20-7">  <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(block)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>('id', 'msg_01BzC2FerKEr8rC1wGfaMiNK')
('content', [TextBlock(citations=None, text="I'll calculate 157.09 * 493.89 for you.", type='text'), ToolUseBlock(id='toolu_017NhVhd5wYWdEw7fFRPHyXL', input={'expression': '157.09 * 493.89'}, name='calculator', type='tool_use')])
('model', 'claude-sonnet-4-20250514')
('role', 'assistant')
('stop_reason', 'tool_use')
('stop_sequence', None)
('type', 'message')
('usage', Usage(cache_creation=CacheCreation(ephemeral_1h_input_tokens=0, ephemeral_5m_input_tokens=0), cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=433, output_tokens=77, server_tool_use=None, service_tier='standard'))</code></pre>
</div>
</div>
<p>As you can see in the response, the agent answers with “I’ll calculate 157.09 * 493.89 for you.” but instead of calculating the expression itself, it stops with a <code>stop_reason</code> of <code>tool_use</code>. This means that the agent is waiting for the user to execute the tool and return the tool’s result to the agent.</p>
<p>The agent has now signaled that it needs help executing the tool and is waiting. This is where the final component, the agent loop, comes into play.</p>
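<p>To make “executing the tool and returning the result” a bit more tangible, here is a small offline sketch. It does not call the API: <code>fake_response</code> is a stand-in that only mimics the attribute names of the response printed above, and <code>MultiplyOnlyCalculator</code> and <code>build_tool_results</code> are hypothetical names. The <code>tool_result</code> block with a <code>tool_use_id</code> follows the message format described in the Anthropic tool use documentation.</p>

```python
from types import SimpleNamespace

# Stand-ins that mimic the shape of the response printed above.
# They are NOT real SDK objects; only the attribute names are copied.
fake_response = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="text", text="I'll calculate 157.09 * 493.89 for you."),
        SimpleNamespace(
            type="tool_use",
            id="toolu_example",
            name="calculator",
            input={"expression": "157.09 * 493.89"},
        ),
    ],
)


class MultiplyOnlyCalculator:
    """Toy replacement for CalculatorTool so this sketch runs on its own."""

    def execute(self, expression):
        left, right = expression.split("*")
        return {"result": float(left) * float(right)}


tool_map = {"calculator": MultiplyOnlyCalculator()}


def build_tool_results(response, tool_map):
    """Run every requested tool and wrap the outputs as tool_result blocks."""
    results = []
    for block in response.content:
        if block.type != "tool_use":
            continue  # skip text blocks
        output = tool_map[block.name].execute(**block.input)
        results.append(
            {
                "type": "tool_result",
                "tool_use_id": block.id,  # links the result to the request
                "content": str(output),
            }
        )
    return results


if fake_response.stop_reason == "tool_use":
    # The tool results go back to the model as a user message
    follow_up = {"role": "user", "content": build_tool_results(fake_response, tool_map)}
    print(follow_up)
```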
</section>
<section id="component-4-agent-loop" class="level3">
<h3 class="anchored" data-anchor-id="component-4-agent-loop">Component 4: Agent Loop</h3>
<p>You might have already heard people say that <a href="https://youtu.be/D7_ipDqhtwk?si=KCpWMv_Heux3PVeK&amp;t=356">“Agents are models using tools in a loop”</a>. Without the loop, the agent can only handle single-turn interactions, not multi-turn ones.</p>
<p>I really like this pseudocode by <a href="https://youtu.be/D7_ipDqhtwk?si=KCpWMv_Heux3PVeK&amp;t=356">Barry Zhang, Anthropic</a>, showing that agents are just LLMs making decisions in a loop, observing results, and deciding what to do next.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb22" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb22-1">env <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Environment()</span>
<span id="cb22-2">tools <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Tools(env)</span>
<span id="cb22-3">system_prompt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Goals, constraints, and how to act"</span></span>
<span id="cb22-4"></span>
<span id="cb22-5"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>:</span>
<span id="cb22-6">  action <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> llm.run(system_prompt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> env.state)</span>
<span id="cb22-7">  env.state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tools.run(action)</span></code></pre></div></div>
<p>For this simple agent implementation, that means we have the following flow:</p>
<ol type="1">
<li>User sends message to agent</li>
<li>Agent decides it needs a tool and responds with a <code>stop_reason</code> of <code>tool_use</code> and a <code>tool_use</code> block with the tool name and parameters. It’s saying “I’m pausing for you to execute this tool with these parameters”.</li>
<li>The user executes the tool and sends the tool result back to the agent in a follow-up message</li>
<li>The agent continues and gives the final response.</li>
</ol>
<div id="ir0vAGMPf1CV" class="cell" data-execution_count="18">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb23-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> json</span>
<span id="cb23-2"></span>
<span id="cb23-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> run_agent(user_input, max_turns<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>):</span>
<span id="cb23-4">  calculator_tool <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CalculatorTool()</span>
<span id="cb23-5">  agent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Agent(tools<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[calculator_tool])</span>
<span id="cb23-6"></span>
<span id="cb23-7">  i <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb23-8"></span>
<span id="cb23-9">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> i <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> max_turns: <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># It's safer to use max_turns rather than while True</span></span>
<span id="cb23-10">    i <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb23-11">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">Iteration </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>i<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">:"</span>)</span>
<span id="cb23-12"></span>
<span id="cb23-13">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"User input: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>user_input<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb23-14">    response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.chat(user_input)</span>
<span id="cb23-15">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Agent output: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>response<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>text<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb23-16"></span>
<span id="cb23-17">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Handle tool use if present</span></span>
<span id="cb23-18">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> response.stop_reason <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use"</span>:</span>
<span id="cb23-19"></span>
<span id="cb23-20">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Process all tool uses in the response</span></span>
<span id="cb23-21">        tool_results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb23-22">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> content_block <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> response.content:</span>
<span id="cb23-23">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> content_block.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">type</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use"</span>:</span>
<span id="cb23-24">                tool_name <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> content_block.name</span>
<span id="cb23-25">                tool_input <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> content_block.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">input</span></span>
<span id="cb23-26"></span>
<span id="cb23-27">                <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Using tool </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>tool_name<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> with input </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>tool_input<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb23-28"></span>
<span id="cb23-29">                <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Execute the tool</span></span>
<span id="cb23-30">                tool <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> agent.tool_map[tool_name]</span>
<span id="cb23-31">                tool_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tool.execute(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span>tool_input)</span>
<span id="cb23-32"></span>
<span id="cb23-33">                tool_results.append({</span>
<span id="cb23-34">                    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_result"</span>,</span>
<span id="cb23-35">                    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tool_use_id"</span>: content_block.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">id</span>,</span>
<span id="cb23-36">                    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: json.dumps(tool_result)</span>
<span id="cb23-37">                })</span>
<span id="cb23-38">                <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Tool result: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>tool_result<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb23-39"></span>
<span id="cb23-40">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Add tool results to conversation</span></span>
<span id="cb23-41">        user_input <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tool_results</span>
<span id="cb23-42">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb23-43">      <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> response.content[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].text</span>
<span id="cb23-44"></span>
<span id="cb23-45">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span></span></code></pre></div></div>
</div>
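<p>The control flow of this loop can also be tried without any API calls. The following is a minimal sketch under stated assumptions: <code>fake_llm</code>, <code>TOOLS</code>, and <code>run_loop</code> are hypothetical names that do not exist in the tutorial code, and the dictionaries only imitate the shape of a real response, which carries more fields.</p>

```python
def fake_llm(messages):
    """Hypothetical stand-in for a model: requests a tool once, then answers."""
    last = messages[-1]
    if isinstance(last["content"], str):
        # Plain user text: ask for a tool call instead of answering directly
        return {"stop_reason": "tool_use",
                "tool_use": {"id": "call_1", "name": "add", "input": {"a": 2, "b": 3}}}
    # A tool result came back: produce the final answer from it
    result = last["content"][0]["content"]
    return {"stop_reason": "end_turn", "text": f"The result is {result}."}


TOOLS = {"add": lambda a, b: a + b}  # hypothetical tool registry


def run_loop(user_input, max_turns=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):  # bounded loop instead of `while True`
        response = fake_llm(messages)
        if response["stop_reason"] != "tool_use":
            return response["text"]
        call = response["tool_use"]
        output = TOOLS[call["name"]](**call["input"])
        # Record the tool request, then send the result back as a user message
        messages.append({"role": "assistant", "content": [call]})
        messages.append({"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": call["id"], "content": str(output)}
        ]})
    return None  # no final answer within max_turns


print(run_loop("What is 2 + 3?"))
```

<p>Swapping <code>fake_llm</code> for a real model call should leave the loop itself unchanged, which is the point of the “tools in a loop” framing.</p>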
</section>
</section>
<section id="testing-the-implemented-ai-agent" class="level2">
<h2 class="anchored" data-anchor-id="testing-the-implemented-ai-agent">Testing the Implemented AI Agent</h2>
<p>Let’s test the implemented AI agent with a few example test cases.</p>
<section id="test-1-general-question-no-tool-use" class="level3">
<h3 class="anchored" data-anchor-id="test-1-general-question-no-tool-use">Test 1: General question (no tool use)</h3>
<p>This test demonstrates the agent’s ability to answer a simple, general question that does not require the use of any external tools.</p>
<div id="loZgyGmZ4YNr" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="59fff72f-12ec-44c1-b9da-93ba8d37ecf2" data-execution_count="19">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb24" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb24-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_agent(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I have 4 apples. How many do you have?"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
Iteration 1:
User input: I have 4 apples. How many do you have?
Agent output: I don't have any apples since I'm an AI assistant - I don't have a physical form or possessions. But I can help you with calculations involving your 4 apples if you need!

Is there something specific you'd like to calculate or figure out with your 4 apples?</code></pre>
</div>
</div>
</section>
<section id="test-2-tool-use" class="level3">
<h3 class="anchored" data-anchor-id="test-2-tool-use">Test 2: Tool Use</h3>
<p>This test demonstrates how the agent understands that it needs to use a tool to solve a specific task and uses the <code>CalculatorTool</code> to get the correct result.</p>
<div id="f119QvtT4Wzi" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="7c3d1556-40b9-40b8-e0cc-f024562fa85e" data-execution_count="20">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb26" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb26-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_agent(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"What is 157.09 * 493.89?"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
Iteration 1:
User input: What is 157.09 * 493.89?
Agent output: I'll calculate 157.09 * 493.89 for you.
Using tool calculator with input {'expression': '157.09 * 493.89'}
Tool result: {'result': 77585.1801}

Iteration 2:
User input: [{'type': 'tool_result', 'tool_use_id': 'toolu_01FC9yLWt2Cf6a8zLGhj7ZJz', 'content': '{"result": 77585.1801}'}]
Agent output: The result of 157.09 * 493.89 is **77,585.1801**.</code></pre>
</div>
</div>
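<p>The tool result can be double-checked independently. This is a small sketch using Python’s <code>decimal</code> module, which is my choice for the check and not necessarily how the <code>CalculatorTool</code> evaluates the expression:</p>

```python
from decimal import Decimal

# Exact decimal arithmetic avoids binary floating-point rounding
product = Decimal("157.09") * Decimal("493.89")
print(product)
```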
</section>
<section id="test-3-step-by-step-tool-use" class="level3">
<h3 class="anchored" data-anchor-id="test-3-step-by-step-tool-use">Test 3: Step-by-step tool use</h3>
<p>This test demonstrates the agent’s ability to break down a more complex problem into smaller steps and use the <code>CalculatorTool</code> multiple times within a single conversation to arrive at the final answer.</p>
<div id="-Lkv6mI94Zmr" class="cell" data-quarto-private-1="{&quot;key&quot;:&quot;colab&quot;,&quot;value&quot;:{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}}" data-outputid="a2fbb58d-a7bb-426a-eebc-76eb44cd0d54" data-execution_count="21">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb28" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb28-1">response <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_agent(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"If my brother is 32 years younger than my mother and my mother is 30 years older than me and I am 20, how old is my brother?"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
Iteration 1:
User input: If my brother is 32 years younger than my mother and my mother is 30 years older than me and I am 20, how old is my brother?
Agent output: I'll solve this step by step using the given information.

Given:
- You are 20 years old
- Your mother is 30 years older than you
- Your brother is 32 years younger than your mother

Let me calculate your mother's age first:
Using tool calculator with input {'expression': '20 + 30'}
Tool result: {'result': 50}

Iteration 2:
User input: [{'type': 'tool_result', 'tool_use_id': 'toolu_01WPMQRzCi4roua9vQ7qXeCR', 'content': '{"result": 50}'}]
Agent output: So your mother is 50 years old.

Now I'll calculate your brother's age:
Using tool calculator with input {'expression': '50 - 32'}
Tool result: {'result': 18}

Iteration 3:
User input: [{'type': 'tool_result', 'tool_use_id': 'toolu_01UL7n7a85XJUn7Tgk8kiHhX', 'content': '{"result": 18}'}]
Agent output: Your brother is 18 years old.

To summarize:
- You: 20 years old
- Your mother: 50 years old (30 years older than you)
- Your brother: 18 years old (32 years younger than your mother)</code></pre>
</div>
</div>
</section>
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>This tutorial showed you how you can implement a minimal AI agent from scratch using just an LLM API without any frameworks. Hopefully, you now understand the fundamentals of what happens under the hood of an AI agent and what people mean when they say “Agents are models using tools in a loop”.</p>
<p>You can find this notebook in this <a href="https://github.com/iamleonie/website/blob/main/blog/ai-agent-from-scratch-in-python.ipynb">GitHub repository</a>.</p>
<p>As a next step, you can refer to the following resources to learn more about how to implement different agent workflows.</p>
</section>
<section id="resources" class="level2">
<h2 class="anchored" data-anchor-id="resources">Resources</h2>
<ul>
<li><a href="https://github.com/anthropics/claude-cookbooks/tree/main/patterns/agents">Anthropic’s Building Effective Agents Cookbook</a></li>
<li><a href="https://www.youtube.com/watch?v=mYo7UFwnW1k">Build an AI Agent from SCRATCH with Python! (No Frameworks) by Aaron Dunn</a></li>
<li><a href="https://github.com/daveebbelaar/ai-cookbook/tree/main/patterns">Building Effective LLM Workflows in Pure Python by Dave Ebbelaar</a></li>
</ul>


</section>

]]></description>
  <guid>https://www.leoniemonigatti.com/blog/ai-agent-from-scratch-in-python.html</guid>
  <pubDate>Tue, 30 Sep 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>First impressions from testing 4 Coding Agents with Jupyter Notebooks</title>
  <link>https://www.leoniemonigatti.com/blog/coding-agent-jupyter-notebook.html</link>
  <description><![CDATA[ 





<p>Say what you will about Jupyter Notebooks, but I think they are an incredible medium for learning and quick experimentation. I use Jupyter Notebooks all the time for my work and personal use. So, naturally, I was curious when I read that you could <a href="https://www.anthropic.com/engineering/claude-code-best-practices">use Claude Code with Jupyter Notebooks</a>.</p>
<p>In this article, I share my first impressions, tips, and frustrations from experimenting with the following four coding agents for Jupyter Notebooks:</p>
<ul>
<li>CLI Agents (Claude Code and Gemini CLI)</li>
<li>Cursor (using <code>claude-3.5-sonnet</code>) with and without CLI coding agents</li>
<li>Gemini within Google Colab</li>
</ul>
<p><em>Note that these coding agents are improving so rapidly that the contents of this article might already be outdated by the time you read this.</em></p>
<section id="the-challenges-of-working-with-jupyter-notebooks" class="level2">
<h2 class="anchored" data-anchor-id="the-challenges-of-working-with-jupyter-notebooks">The challenges of working with Jupyter Notebooks</h2>
<p>Working with Jupyter Notebooks is different from working with “regular” code because a Notebook doesn’t only serve a functional purpose but also has a presentation component (text and visuals). Thus, the user experience of different coding agents with Jupyter Notebooks will differ from the experience on “regular” coding projects.</p>
<p>This means a coding agent that works well with common programming tasks might not work well with Jupyter Notebooks and vice versa. This section discusses challenges specific to Jupyter Notebooks and how the four contenders handled them.</p>
<section id="set-up-and-ux" class="level3">
<h3 class="anchored" data-anchor-id="set-up-and-ux">Set up and UX</h3>
<p>Whichever coding agent you’re using, you’ll always have the Jupyter Notebook and the chat interface open somehow:</p>
<ul>
<li>In Google Colab, a nice little chat interface with Gemini is at the bottom of your window.</li>
<li>When using a CLI agent, <a href="https://www.anthropic.com/engineering/claude-code-best-practices">Anthropic recommends having Claude Code and the Notebook open side-by-side</a> in your editor, such as VS Code or Cursor.</li>
</ul>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Editor Layout
</div>
</div>
<div class="callout-body-container callout-body">
<p>A great <a href="https://x.com/radekosmulski/status/1927855504140775725">tip from Radek Osmulski</a> is to change the layout of your code editor.</p>
<p>Usually, the terminal is on the bottom by default. But for an optimized user experience, it’s a great idea to move the terminal with the CLI agent to the side so that you have your Jupyter Notebook on one side and the terminal with your CLI coding agent open on the other.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.leoniemonigatti.com/blog/images/claude_code_cursor_setup.webp" class="img-fluid figure-img"></p>
<figcaption>Side-by-side setup with Jupyter Notebook and Claude Code in Cursor</figcaption>
</figure>
</div>
</div>
</div>
<p>If you and a CLI coding agent are collaborating on a Jupyter Notebook, you need to be careful not to overwrite each other’s work. So, save every manual change in the Notebook before prompting the CLI agent to apply any additional changes. Also, after the CLI agent has made any changes, you often need to reload the Notebook to see them.</p>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>VS Code Extension
</div>
</div>
<div class="callout-body-container callout-body">
<p>There’s a <a href="https://x.com/radekosmulski/status/1929018342586655015">VS Code extension by Radek Osmulski</a> that automatically reloads the Jupyter Notebooks for this purpose.</p>
</div>
</div>
</section>
<section id="cell-operations" class="level3">
<h3 class="anchored" data-anchor-id="cell-operations">Cell operations</h3>
<p>The cells of a Jupyter Notebook make them different from regular code files. While regular coding involves files containing code, Jupyter Notebooks consist of text and code cells. These cells must be created, edited, moved around, and deleted.</p>
<p>However, because these coding agents are intended for writing and editing code (and text), they have quite a few limitations when it comes to this characteristic of Jupyter Notebooks:</p>
<ul>
<li><strong>Creating new cells:</strong> Out of the coding agents, only Claude Code was able to not only create cells but also place them where I wanted them. While Gemini CLI wasn’t able to generate cells at all (although I’m sure that will change soon), Gemini within Colab was able to create new cells but always appended them at the end of the Notebook.</li>
<li><strong>Moving cells:</strong> Going a step further, Claude Code was the only contender that was able to move cells around. Even then, it only works by copying the contents into a new cell and then deleting the old cell.</li>
<li><strong>Convert between Code and Markdown cells:</strong> None of the tested coding agents were able to convert between Code and Markdown cells.</li>
</ul>
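<p>It helps to know that an <code>.ipynb</code> file is a JSON document whose <code>cells</code> list holds one entry per cell, so creating, moving, and converting cells amount to list edits. The following is a minimal sketch using plain dictionaries. The helper names are hypothetical, and a real notebook carries more metadata than shown here (the <code>nbformat</code> library is the more robust way to handle it).</p>

```python
import json


def make_cell(cell_type, source):
    """Build a minimal notebook cell (hypothetical helper)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally carry outputs and an execution counter
        cell.update({"outputs": [], "execution_count": None})
    return cell


def convert_to_markdown(cell):
    """Turn a code cell into a Markdown cell by dropping the code-only fields."""
    return {"cell_type": "markdown", "metadata": cell["metadata"], "source": cell["source"]}


notebook = {
    "cells": [
        make_cell("code", "import json"),
        make_cell("code", "This section loads the data."),  # text in the wrong cell type
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}
cells = notebook["cells"]

# Create a new cell at a specific position (here: the very top)
cells.insert(0, make_cell("markdown", "# Setup"))

# Move a cell: remove it from its old index and re-insert it at the new one
cells.insert(1, cells.pop(2))

# Convert the moved cell from code to Markdown
cells[1] = convert_to_markdown(cells[1])

summary = [(c["cell_type"], c["source"]) for c in cells]
print(summary)
serialized = json.dumps(notebook, indent=1)  # the structure is still valid JSON
```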
</section>
<section id="editing-cell-contents" class="level3">
<h3 class="anchored" data-anchor-id="editing-cell-contents">Editing cell contents</h3>
<p>What I noticed to be difficult was telling the coding agent which cell you want to modify. Here’s what worked, what worked somewhat, and what didn’t work:</p>
<p>Let’s start with <strong>what worked</strong>. Identifying a cell by describing its contents works well, but it is not a nice user experience. Additionally, Claude Code and Gemini in Google Colab are able to identify cells by their number or ID. You can prompt them with something like this:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"Edit the contents of the third cell."</span></span></code></pre></div></div>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Cell Identifiers
</div>
</div>
<div class="callout-body-container callout-body">
<p>I like this <a href="https://x.com/rashmigb/status/1944694560677843382">tip of adding identifying top-level headings to text cells and comments to code cells</a> to help the CLI agents identify the cell you’re talking about.</p>
</div>
</div>
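<p>Such markers also make cells easy to locate programmatically. Here is a minimal sketch, assuming the cells are the dictionaries from a notebook’s JSON <code>cells</code> list and that the marker text (for example <code># cell: load-data</code>) is a convention you pick yourself. The function name is hypothetical.</p>

```python
def find_cell_index(cells, marker):
    """Return the index of the first cell whose first line contains the marker, else None."""
    for index, cell in enumerate(cells):
        source = cell["source"]
        # In .ipynb files, `source` may be a single string or a list of lines
        text = "".join(source) if isinstance(source, list) else source
        first_line = text.splitlines()[0] if text else ""
        if marker in first_line:
            return index
    return None


cells = [
    {"cell_type": "markdown", "source": "# Introduction"},
    {"cell_type": "code", "source": ["# cell: load-data\n", "rows = [1, 2, 3]"]},
    {"cell_type": "code", "source": "# cell: plot\nprint(rows)"},
]

print(find_cell_index(cells, "cell: load-data"))
print(find_cell_index(cells, "cell: missing"))
```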
<p>What <strong>works somewhat</strong> is that, if you’re using something like Cursor, you can select a cell and have an AI assistant edit its contents with Command + K. However, I noticed that when I prompted it to add text, it would always add a hash symbol at the start of the text cell. This is not ideal because you have to manually remove the hash symbol; otherwise, your text is rendered as a heading.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.leoniemonigatti.com/blog/images/cursor_text_generation_jupyter_notebook.webp" class="img-fluid figure-img"></p>
<figcaption>Text generation in Jupyter Notebook with Cursor and Claude 3.5 Sonnet</figcaption>
</figure>
</div>
<p><strong>What (obviously) didn’t work</strong> was saying something like “Edit the contents of the cell where my cursor is”, but I think that would be a nice feature.</p>
</section>
<section id="text-generation" class="level3">
<h3 class="anchored" data-anchor-id="text-generation">Text generation</h3>
<p>Although you’d assume coding agents are specialized in writing code, my first impression was that all the agents I tried were good at generating text.</p>
</section>
<section id="code-execution-error-handling" class="level3">
<h3 class="anchored" data-anchor-id="code-execution-error-handling">Code execution &amp; error handling</h3>
<p>Not all coding agents can run the code cells they’ve written and self-correct them. While CLI agents can run things like Git commands and Python scripts, they are unfortunately not able to execute code cells in Jupyter Notebooks.</p>
<p>In contrast, Gemini in Google Colab is not only able to create new cells with code but also to run them. And on top of that, if the executed cell produces an error, Gemini revises that cell’s code, which was a pleasant user experience.</p>
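<p>This execute-and-revise behaviour can be pictured as follows. It is a minimal sketch and not how Gemini or Colab implement it: it runs cell sources with Python’s built-in <code>exec</code> in a shared namespace and captures the traceback text that an agent would need in order to revise the failing cell. The function name <code>run_cells</code> is hypothetical, and the “revision” is hard-coded here rather than generated by a model.</p>

```python
import traceback


def run_cells(sources):
    """Execute cell sources in order and return (namespace, error).

    error is None on success, or (cell_index, traceback_text) for the first failure.
    """
    namespace = {}  # shared state across cells, similar to a notebook kernel
    for index, source in enumerate(sources):
        try:
            exec(source, namespace)
        except Exception:
            # This text is what would be handed back to the agent for a fix
            return namespace, (index, traceback.format_exc())
    return namespace, None


cells = ["x = 10", "y = x / 0", "z = x + 1"]
namespace, error = run_cells(cells)
failed_index, trace = error
print(failed_index)
print(trace.splitlines()[-1])

# "Revise" the failing cell (hard-coded here) and run everything again
cells[failed_index] = "y = x / 2"
namespace, error = run_cells(cells)
print(error, namespace["z"])
```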
</section>
</section>
<section id="use-cases" class="level2">
<h2 class="anchored" data-anchor-id="use-cases">Use Cases</h2>
<p>I use Jupyter Notebooks for different use cases, each with its own challenges. This section discusses my three most common use cases for coding agents in Jupyter Notebooks: helping with writing coding tutorials, exploring data, and cleaning up Notebooks.</p>
<section id="coding-tutorials" class="level3">
<h3 class="anchored" data-anchor-id="coding-tutorials">Coding tutorials</h3>
<p>Coding tutorials or explanations of technical concepts with code require writing both code and text that fit and weave together. You can either write code and text in parallel or sequentially.</p>
<p>Tasking a coding agent to write code and text <strong>in parallel</strong> worked quite well for me if each task is small enough (e.g., connecting to a database instance and checking the connection), with the following prompt template:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"Do XYZ. Add explanations"</span></span></code></pre></div></div>
<p>However, I often like to do it <strong>sequentially</strong>: First, I write the code to experiment and play around with it. By the time I have the code cells how I like them, writing the text explanations for each cell feels like a tedious task that I’d love to automate.</p>
<p>The following instruction worked well with Claude Code, which created a plan with one task for each code cell that needed text added, and then went ahead and added those text cells at the right place. Gemini in Colab, on the other hand, did something similar; however, it wasn’t able to add the text cells in the correct positions, so it appended all of them at the end of the Notebook.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"Add explanations above each code cell."</span></span></code></pre></div></div>
</section>
<section id="exploratory-data-analysis" class="level3">
<h3 class="anchored" data-anchor-id="exploratory-data-analysis">Exploratory Data Analysis</h3>
<p>On the other hand, exploratory data analysis requires data processing and visualization and the extraction of insights from those visualizations.</p>
<p>I used Gemini in Google Colab to do some classical exploratory data analysis. It worked surprisingly well. Even with slightly more complex tasks, which required aggregating and pivoting the Pandas DataFrame to visualize the data in a heatmap, Gemini was able to accomplish this task on the first shot by creating a plan and then working through the to-do list one by one.</p>
<p>What surprised me the most was that Gemini also summarized the findings at the end of the analysis without explicit prompting.</p>
</section>
<section id="notebook-clean-up" class="level3">
<h3 class="anchored" data-anchor-id="notebook-clean-up">Notebook clean up</h3>
<p>This is the part I was most excited about:</p>
<blockquote class="blockquote">
<p>You can also ask Claude to clean up or make aesthetic improvements to your Jupyter Notebook before you show it to colleagues. Specifically, <strong>telling it to make the Notebook or its data visualizations “aesthetically pleasing” tends to help remind it that it’s optimizing for a human viewing experience</strong>.</p>
</blockquote>
<p>So, I tried the following instruction:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"Can you make this notebook aesthetically pleasing?"</span></span></code></pre></div></div>
<p>Claude Code analyzed the current state of the Notebook, made a plan of suggested changes, like adding headers and text, and started executing the tasks. I liked that Claude Code works in small snippets that you can review and reject if you don’t like them.</p>
<p>Since the definition of “aesthetically pleasing” depends on personal preferences and these coding agents are probably trained on a large corpus of Jupyter Notebooks using lots of emojis, Claude Code added a lot of emojis (especially to the headings) to my Notebook. Luckily, if you’re like me and prefer few or no emojis, you can specify your preferences in the CLAUDE.md or GEMINI.md files.</p>
<p>Unfortunately, when I tried the same instruction with Gemini in Google Colab, it only responded with the following answer, followed by some tips on how to make your Notebook more aesthetically pleasing.</p>
<blockquote class="blockquote">
<p>“I can’t directly change the aesthetic of the Notebook for you, as that often involves personal preference and visual styling that’s outside of my capabilities.</p>
<p>However, I can give you some tips and show you how to use Markdown and code comments effectively to make your Notebook more organized and visually appealing:”</p>
</blockquote>
</section>
</section>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<p>These are my first impressions of playing around with different AI-assisted coding tools for Jupyter Notebooks. I have yet to dive deeper into various aspects, such as providing the coding agents access to specific documentation or refining prompts. So far, I’ve found Gemini from within the Google Colab environment to be the most user-friendly experience, but Claude Code, together with Cursor, also has some advantages depending on what you’re doing.</p>
<p>I’m excited to see how these tools evolve over time (by the time you read this, a lot of the behavior of these assistants has probably already changed).</p>
<p>Here are a few things I haven’t tried yet for working with coding agents on Jupyter Notebooks:</p>
<ul>
<li>Try a <a href="https://x.com/DynamicWebPaige/status/1937876922681487408">Jupyter MCP</a></li>
<li>Try <a href="https://x.com/tkeyo_/status/1944728071342318027">GitHub Copilot in VSCode</a></li>
<li><a href="https://x.com/DynamicWebPaige/status/1937876922681487408">Connect Gemini CLI with your Google Colab terminal</a></li>
</ul>


</section>

]]></description>
  <guid>https://www.leoniemonigatti.com/blog/coding-agent-jupyter-notebook.html</guid>
  <pubDate>Mon, 28 Jul 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>37 Things I Learned About Information Retrieval in Two Years at a Vector Database Company</title>
  <link>https://www.leoniemonigatti.com/blog/what_i_learned.html</link>
  <description><![CDATA[ 





<p>Today I’m celebrating my two-year work anniversary at <a href="https://weaviate.io">Weaviate</a>, a vector database company. To celebrate, I want to reflect on what I’ve learned about vector databases and search during this time. Here are some of the things I’ve learned and some common misconceptions I see:</p>
<ol type="1">
<li><p><strong>BM25 is a strong baseline for search.</strong> Ha! You thought I would start with something about vector search, and here I am talking about keyword search. And that is exactly the first lesson: Start with something simple like BM25 before you move on to more complex things like vector search.</p></li>
<li><p><strong>Vector search in vector databases is <em>approximate</em> and <em>not exact</em>.</strong> In theory, you could run a brute-force search to compute distances between a query vector and every vector in the database using exact k-nearest neighbors (KNN). But this doesn’t scale well. That’s why vector databases use Approximate Nearest Neighbor (ANN) algorithms, like HNSW, IVF, or ScaNN, to speed up search while trading off a small amount of accuracy. Vector indexing is what makes vector databases so fast at scale.</p></li>
<li><p><strong>Vector databases don’t only store embeddings.</strong> They also store the original object (e.g., the text from which you generated the vector embeddings) and metadata. This allows them to support other features beyond vector search, like metadata filtering and keyword and hybrid search.</p></li>
<li><p><strong>Vector databases’ main application is not in generative AI.</strong> It’s in search. But finding relevant context for LLMs is ‘search’. That’s why vector databases and LLMs go together like cookies and cream.</p></li>
<li><p><strong>You have to specify how many results you want to retrieve.</strong> When I think back, I almost have to laugh because this was such a big “aha” moment when I realized that you need to define the maximum number of results you want to retrieve. It’s a little oversimplified, but without a <code>limit</code> or <code>top_k</code> parameter, vector search would return all the objects stored in the database, sorted by their distance to your query vector.</p></li>
<li><p><strong>There are many different types of embeddings.</strong> When you think of a vector embedding, you probably visualize something like [-0.9837, 0.1044, 0.0090, …, -0.2049]. That’s called a dense vector, and it is the most commonly used type of vector embedding. But there are also many other types of vectors, such as sparse ([0, 2, 0, …, 1]), binary ([0, 1, 1, …, 0]), and multi-vector embeddings ([[-0.9837, …, -0.2049], [ 0.1044, …, 0.0090], …, [-0.0937, …, 0.5044]]), which can be used for different purposes.</p></li>
<li><p><strong>Fantastic embedding models and where to find them.</strong> The first place to go is the <a href="https://huggingface.co/spaces/mteb/leaderboard">Massive Text Embedding Benchmark (MTEB)</a>. It covers a wide range of different tasks for embedding models, including classification, clustering, and retrieval. If you’re focused on information retrieval, you might want to check out <a href="https://github.com/beir-cellar/beir">BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models</a>.</p></li>
<li><p><strong>The majority of embedding models on MTEB are English.</strong> If you’re working with multilingual or non-English data, it might be worth checking out <a href="https://arxiv.org/html/2502.13595v1">MMTEB (Massive Multilingual Text Embedding Benchmark)</a>.</p></li>
<li><p><strong>A little history on vector embeddings:</strong> Before there were today’s contextual embeddings (e.g., BERT), there were static embeddings (e.g., Word2Vec, GloVe). They are static because each word has a fixed representation, while contextual embeddings generate different representations for the same word based on the surrounding context. Although today’s contextual embeddings are much more expressive, static embeddings can be helpful in computationally constrained environments because they can be looked up from pre-computed tables.</p></li>
<li><p><strong>Don’t confuse sparse vectors and sparse embeddings.</strong> It took me a while until I understood that sparse vectors can be generated in different ways: Either by applying statistical scoring functions like TF-IDF or BM25 to term frequencies (often retrieved via inverted indexes), or with neural sparse embedding models like SPLADE. That means a sparse embedding is a sparse vector, but not all sparse vectors are necessarily sparse embeddings.</p></li>
<li><p><strong>Embed all the things.</strong> Embeddings aren’t just for text. You can embed images, PDFs as images (see <a href="https://arxiv.org/abs/2407.01449">ColPali</a>), graphs, etc. And that means you can do vector search over multimodal data. It’s pretty incredible. You should try it sometime.</p></li>
<li><p><strong>The economics of vector embeddings.</strong> This shouldn’t be a surprise, but the vector dimensions will impact the required storage cost. So, consider whether it is worth it before you choose an embedding model with 1536 dimensions over one with 768 dimensions and risk doubling your storage requirements. Yes, more dimensions capture more semantic nuances. But you probably don’t need 1536 dimensions to “chat with your docs”. Some models actually use <a href="https://arxiv.org/abs/2205.13147">Matryoshka Representation Learning</a> to allow you to shorten vector embeddings for environments with fewer computational resources, with minimal performance losses.</p></li>
<li><p>Speaking of: <strong>“Chat with your docs” tutorials are the “Hello world” programs of Generative AI.</strong> &lt;EOS&gt;</p></li>
<li><p><strong>You need to call the embedding model A LOT.</strong> Just because you embedded your documents during the ingestion stage, doesn’t mean you’re done calling the embedding model. Every time you run a search query, the query must also be embedded (if you’re not using a cache). If you’re adding objects later on, those must also be embedded (and indexed). If you’re changing the embedding model, you must also re-embed (and re-index) everything.</p></li>
<li><p><strong>Similar does not necessarily mean relevant.</strong> Vector search returns objects by their similarity to a query vector. The similarity is measured by their proximity in vector space. Just because two sentences are similar in vector space (e.g., “How to fix a faucet” and “Where to buy a kitchen faucet”) does not mean they are relevant to each other.</p></li>
<li><p><strong>Cosine similarity and cosine distance are not the same thing.</strong> But they are related to each other (<img src="https://latex.codecogs.com/png.latex?%5Ctext%7Bcosine%20distance%7D%20=%201-%20%5Ctext%7Bcosine%20similarity%7D">). If you will, distance and similarity are inverses: If two vectors are exactly the same, the similarity is 1 and the distance between them is 0.</p></li>
<li><p><strong>If you’re working with normalized vectors, it doesn’t matter whether you’re using cosine similarity or dot product for the similarity measure.</strong> Because mathematically, they are the same. For the calculation, dot product is more efficient.</p></li>
<li><p><strong>Common misconception: The R in RAG stands for ‘vector search’.</strong> It doesn’t. It stands for ‘retrieval’. And retrieval can be done in many different ways (see following bullets).</p></li>
<li><p><strong>Vector search is just one tool in the retrieval toolbox.</strong> There’s also keyword-based search, filtering, and reranking. It’s not one over the other. To build something great, you will need to combine it with different tools.</p></li>
<li><p><strong>When to use keyword-based search vs.&nbsp;vector-based search:</strong> Does your use case require mainly matching semantics and synonyms (e.g., “pastel colors” vs.&nbsp;“light pink”) or exact keywords (e.g., “A-line skirt”, “peplum dress”)? If it requires both (e.g., “pastel colored A-line skirt”), you might benefit from combining both and using hybrid search. In some implementations (e.g., Weaviate), you can just use the hybrid search function and then use the <code>alpha</code> parameter to shift the weighting from pure keyword-based search, through a mix of both, to pure vector search.</p></li>
<li><p><strong>Hybrid search can be a hybrid of different search techniques.</strong> Most often, when you hear people talk about hybrid search, they mean the combination of keyword-based search and vector-based search. But the term ‘hybrid’ doesn’t specify which techniques to combine. So, sometimes you might hear people talk about hybrid search, meaning the combination of vector-based search and search over structured data (often referred to as metadata filtering).</p></li>
<li><p><strong>Misconception: Filtering makes vector search faster.</strong> Intuitively, you’d think using a filter should reduce search latency because you’re reducing the number of candidates to search through. But in practice, pre-filtering candidates can, for example, break the graph connectivity in HNSW, and post-filtering can leave you with no results at all. Vector databases have different, sophisticated techniques to handle this challenge.</p></li>
<li><p><strong>Two-stage retrieval pipelines aren’t only for recommendation systems.</strong> Recommendation systems often have a first retrieval stage that uses a simpler retrieval process (e.g., vector search) to reduce the number of potential candidates, followed by a second, more compute-intensive but more accurate reranking stage. You can apply this to your RAG pipeline as well.</p></li>
<li><p><strong>How vector search differs from reranking.</strong> Vector search returns a small portion of results from the entire database. Reranking takes in a list of items and returns the re-ordered list.</p></li>
<li><p><strong>Finding the right chunk size to embed is not trivial.</strong> Too small, and you’ll lose important context. Too big, and you’ll lose semantic meaning. Many embedding models use mean pooling to average all token embeddings into a single vector representation of a chunk. So, if you have an embedding model with a large context window, you can technically embed an entire document. I forgot who said this, but I like this analogy: You can think of it like creating a movie poster for a movie by overlaying every single frame in the movie. All the information is there, but you won’t understand what the movie is about.</p></li>
<li><p><strong>Vector indexing libraries are different from vector databases.</strong> Both are incredibly fast for vector search. Both work really well to showcase vector search in “chat with your docs”-style RAG tutorials. However, only one of them adds data management features, like built-in persistence, CRUD support, metadata filtering, and hybrid search.</p></li>
<li><p><strong>RAG has been dying since the release of the first long-context LLM.</strong> Every time an LLM with a longer context window is released, someone will claim that RAG is dead. It never is…</p></li>
<li><p><strong>You can throw out 97% of the information and still retrieve (somewhat) accurately.</strong> It’s called vector quantization. For example, with binary quantization you can change something like [-0.9837, 0.1044, 0.0090, …, -0.2049] into [0, 1, 1, …, 0] (a 32x storage reduction from 32-bit float to 1-bit) and you’ll be surprised how well retrieval still works (in some use cases).</p></li>
<li><p><strong>Vector search is <em>not</em> robust to typos.</strong> For a while, I thought that vector search was robust to typos because these large corpora of text surely must contain a lot of typos and therefore help the embedding model learn these typos as well. But if you think about it, there’s no way that all the possible typos of a word are reflected in sufficient amounts in the training data. So, while vector search can handle <em>some</em> typos, you can’t really say it is robust to them.</p></li>
<li><p><strong>Knowing when to use which metric to evaluate search results.</strong> There are many different metrics to evaluate search results. Looking at academic benchmarks, like BEIR, you’ll notice that NDCG@k is prominent. But simpler metrics like precision and recall are a great fit for many use cases.</p></li>
<li><p><strong>The precision-recall trade-off</strong> is often depicted with a fisherman’s analogy of casting a net, but this e-commerce analogy made it click better for me: Imagine you have a webshop with 100 books, out of which 10 are ML-related.</p>
<p>Now, if a user searches for ML-related books, you could just return one ML book. Amazing! You have <strong>perfect precision</strong> (out of the k=1 results returned, how many were relevant). But that’s <strong>bad recall</strong> (out of the relevant results that exist, how many did I return? In this case, 1 out of 10 relevant books). And also, that’s not so good for your business. Maybe the user didn’t like that one ML-related book you returned.</p>
<p>On the other side of that extreme is if you return your entire selection of books. All 100 of them. Unsorted… That’s <strong>perfect recall</strong> because you returned all relevant results. It’s just that you also returned a bunch of irrelevant results, which can be measured by how <strong>bad the precision</strong> is.</p></li>
<li><p><strong>There are metrics that include the order.</strong> When I think of search results, I visualize something like a Google search. So, naturally, I thought that the rank of the search results is important. But metrics like precision and recall don’t consider the order of search results. If the order of your search results is important for your use case, you need to choose rank-aware metrics like MRR@k, MAP@k, or NDCG@k.</p></li>
<li><p><strong>Tokenizers matter.</strong> If you’ve been in the Transformer bubble too long, you’ve probably forgotten that other tokenizers exist besides Byte-Pair-Encoding (BPE). Tokenizers are also important for keyword search and its search performance. And if the tokenizer impacts the keyword-based search performance, it also impacts the hybrid search performance.</p></li>
<li><p><strong>Out-of-domain is not the same as out-of-vocabulary.</strong> Earlier embedding models used to fail on out-of-vocabulary terms. If your embedding model had never seen or heard of “Labubu”, it would have just run into an error. With smart tokenization, unseen out-of-vocabulary terms can be handled gracefully, but the issue is that they are still out-of-domain terms, and therefore, their vector embeddings look like proper embeddings, but they are meaningless.</p></li>
<li><p><strong>Query optimizations:</strong> You know how you’ve learned to type “longest river africa” into Google’s search bar instead of “What is the name of the longest river in Africa?” You’ve learned to optimize your search query for keyword search (yes, we know the Google search algorithm is more sophisticated. Can we just go with it for a second?). Similarly, we now need to learn how to optimize our search queries for vector search.</p></li>
<li><p><strong>What comes after vector search?</strong> First, there was keyword-based search. Then, Machine Learning models enabled vector search. Now, LLMs with reasoning enable reasoning-based retrieval.</p></li>
<li><p><strong>Information retrieval is so hot right now.</strong> I feel fortunate to get to work in this exciting space. Although working on and with LLMs seems to be the cool thing now, figuring out how to provide the best information for them is equally exciting. And that’s the field of retrieval.</p></li>
</ol>
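<p>Lessons 2, 5, 16, and 17 can be tied together in a small sketch. This is a minimal brute-force illustration in plain Python, not how a vector database implements search: the toy vectors and the helper names (<code>exact_knn</code>, <code>cosine_distance</code>, and so on) are made up for this example. It scores every stored vector against the query (exact KNN), cuts the sorted list off at <code>limit</code>, and checks that the dot product of normalized vectors matches cosine similarity.</p>

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))


def cosine_distance(a, b):
    # cosine distance = 1 - cosine similarity
    return 1.0 - cosine_similarity(a, b)


def exact_knn(query, vectors, limit):
    """Brute-force search: score every stored vector, sort, keep `limit` results."""
    scored = [(cosine_distance(query, vec), key) for key, vec in vectors.items()]
    scored.sort()
    return [key for _, key in scored[:limit]]


# Toy data, invented for illustration only
vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [-1.0, 0.0, 0.0],
}
query = [1.0, 0.05, 0.0]

# With a limit, only the closest objects come back
print(exact_knn(query, vectors, limit=2))

# Without a meaningful limit, every object comes back, sorted by distance
print(exact_knn(query, vectors, limit=len(vectors)))

# On normalized vectors, dot product and cosine similarity agree
q, v = normalize(query), normalize(vectors["doc_b"])
print(math.isclose(dot(q, v), cosine_similarity(query, vectors["doc_b"])))
```

<p>Setting <code>limit</code> to the number of stored objects returns everything, just sorted by distance, which is why you have to pick the cutoff yourself. The brute-force loop is also the part that ANN indexes like HNSW replace to stay fast at scale.</p>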
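<p>The webshop analogy from lessons 31 and 32 can also be worked through in a few lines. This is only a sketch: the book ids are hypothetical, and the helper functions are simplified single-query versions of precision@k, recall@k, and the reciprocal rank (which MRR averages over many queries).</p>

```python
def precision_at_k(results, relevant, k):
    """Share of the top-k results that are relevant."""
    top_k = results[:k]
    if not top_k:
        return 0.0
    return sum(1 for item in top_k if item in relevant) / len(top_k)


def recall_at_k(results, relevant, k):
    """Share of all relevant items that appear in the top-k results."""
    top_k = results[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)


def reciprocal_rank(results, relevant):
    """1 / rank of the first relevant result (0.0 if there is none)."""
    for rank, item in enumerate(results, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


# A webshop with 100 books, of which 10 are ML-related (hypothetical ids)
catalog = [f"book_{i}" for i in range(100)]
relevant = set(catalog[:10])

# Extreme 1: return a single ML book
one_hit = ["book_0"]
print(precision_at_k(one_hit, relevant, k=1))  # 1.0 (perfect precision)
print(recall_at_k(one_hit, relevant, k=1))     # 0.1 (1 of 10 relevant books)

# Extreme 2: return the entire catalog
print(precision_at_k(catalog, relevant, k=100))  # 0.1 (10 of 100 results relevant)
print(recall_at_k(catalog, relevant, k=100))     # 1.0 (perfect recall)

# Same books, different order: precision and recall cannot tell these apart,
# but a rank-aware metric can
ml_first = catalog
ml_last = catalog[10:] + catalog[:10]
print(reciprocal_rank(ml_first, relevant))  # 1.0
print(reciprocal_rank(ml_last, relevant))   # 1/91, first ML book at rank 91
```

<p>Both orderings of the full catalog score identically on precision and recall, while the reciprocal rank drops sharply when the ML books are pushed to the end of the list.</p>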
<p>I’m repeating my last point, but looking back at the past two years, I feel grateful to work in this field. I have only scratched the surface so far, and there’s still so much to learn. When I joined Weaviate, vector databases were the hot new thing. Then came RAG. Now, we’re talking about “context engineering”. <em>But what hasn’t changed is the importance of finding the best information to give the LLM so it can provide the best possible answer.</em></p>
<hr>
<p><em>This blog is inspired by <a href="https://twitter.com/softwaredoug">Doug Turnbull</a>’s blog post <a href="https://softwaredoug.com/blog/2024/06/25/what-ai-engineers-need-to-know-search">What AI Engineers Need to Know About Search</a>. If you enjoyed this blog, you will probably enjoy Doug’s blog as well.</em></p>



]]></description>
  <guid>https://www.leoniemonigatti.com/blog/what_i_learned.html</guid>
  <pubDate>Thu, 03 Jul 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>NeoBERT: A Next-Generation BERT</title>
  <link>https://www.leoniemonigatti.com/blog/neobert.html</link>
  <description><![CDATA[ 





<p>These are my study notes on the “NeoBERT: A Next-Generation BERT” paper and my explorations with the model.</p>
<p>The paper “<a href="https://arxiv.org/abs/2502.19587">NeoBERT: A Next-Generation BERT</a>” (2025) by Lola Le Breton, Quentin Fournier, Mariam El Mezouar, John X. Morris, and Sarath Chandar introduces a new encoder model with an updated architecture, training data, and pre-training methods. It is intended as a strong backbone model.</p>
<blockquote class="blockquote">
<p>[W]e introduce NeoBERT, a next-generation encoder that redefines the capabilities of bidirectional models by integrating state-of-the-art advancements in architecture, modern data, and optimized pre-training methodologies.</p>
</blockquote>
<p>These are the most important highlights at first glance:</p>
<ul>
<li><strong>Performance:</strong> It achieves state-of-the-art results on MTEB for its parameter class compared to other backbone models like BERT, NomicBERT, or ModernBERT base.</li>
<li><strong>Size:</strong> The model is medium-sized with 250M parameters.</li>
<li><strong>Sequence length:</strong> Compared to BERT and RoBERTa (512), it has an extended context length of 4,096 tokens, in the same range as NomicBERT (2,048) and ModernBERT (8,192).</li>
<li><strong>Speed:</strong> NeoBERT has a faster inference speed than ModernBERT base, despite having 100M more parameters.</li>
<li><strong>Dimensions:</strong> NeoBERT maintains the same hidden size as base models (768), allowing for seamless plug-and-play.</li>
</ul>
<p>And here are the relevant links:</p>
<ul>
<li>arXiv: <a href="https://arxiv.org/abs/2502.19587">https://arxiv.org/abs/2502.19587</a></li>
<li>Model on Hugging Face: <a href="https://huggingface.co/chandar-lab/NeoBERT">https://huggingface.co/chandar-lab/NeoBERT</a></li>
<li>Code: <a href="https://github.com/chandar-lab/NeoBERT">https://github.com/chandar-lab/NeoBERT</a></li>
</ul>
<section id="motivation" class="level2">
<h2 class="anchored" data-anchor-id="motivation">Motivation</h2>
<p>The motivation behind this paper is the fact that encoders have not received as much love as LLMs in recent years, although they are equally important for downstream applications, like RAG systems.</p>
<p>Today’s LLMs are capable of in-context learning and reasoning because of advancements in architecture, training data, pre-training, and fine-tuning.</p>
<p>While there has been research on fine-tuning methods for pre-trained encoders (e.g., <a href="https://arxiv.org/abs/2308.03281">GTE</a> or <a href="https://arxiv.org/abs/2307.11224">jina-embeddings</a>), they are applied to older base models, like <a href="https://arxiv.org/abs/1810.04805">BERT</a> from 2019.</p>
<p>Therefore, the authors see a lack of updated open-source base models to apply these new fine-tuning techniques to.</p>
<blockquote class="blockquote">
<p>As a result, there is a dire need for a new generation of BERT-like pre-trained models that incorporate up-to-date knowledge and leverage both architectural and training innovations, forming stronger backbones for these more advanced fine-tuning procedures.</p>
</blockquote>
<p>Recent efforts to modernize these base models are NomicBERT and ModernBERT, which this paper takes inspiration from:</p>
<ul>
<li>NomicBERT:
<ul>
<li>Paper: <a href="https://arxiv.org/abs/2402.01613">https://arxiv.org/abs/2402.01613</a> (<code>nomic-bert-2048</code>, not to be confused with <code>nomic-embed-text-v1</code>)</li>
<li>Model: <a href="https://huggingface.co/nomic-ai/nomic-bert-2048">https://huggingface.co/nomic-ai/nomic-bert-2048</a></li>
<li>License: apache-2.0</li>
<li>Released: February 2024</li>
</ul></li>
<li>ModernBERT:
<ul>
<li>Paper: <a href="https://arxiv.org/abs/2412.13663">https://arxiv.org/abs/2412.13663</a></li>
<li>Model: <a href="https://huggingface.co/answerdotai/ModernBERT-base">https://huggingface.co/answerdotai/ModernBERT-base</a></li>
<li>Model: <a href="https://huggingface.co/answerdotai/ModernBERT-large">https://huggingface.co/answerdotai/ModernBERT-large</a></li>
<li>License: apache-2.0</li>
<li>Released: December 2024</li>
</ul></li>
</ul>
</section>
<section id="key-insights" class="level2">
<h2 class="anchored" data-anchor-id="key-insights">Key insights</h2>
<p>The paper covers a lot of interesting nitty-gritty details on recent advancements around architecture choice, training data selection, and pre-training methods.</p>
<p>But if you step back, I think the key insight is that the authors confirm what we already know from LLMs:</p>
<ol type="1">
<li>Training on a lot of good data = Better models</li>
<li>Increasing model size = Better models (even at small scale)</li>
</ol>
<section id="training-on-a-lot-of-good-data-better-models" class="level3">
<h3 class="anchored" data-anchor-id="training-on-a-lot-of-good-data-better-models">Training on a lot of good data = Better models</h3>
<p>According to the paper, the modification with the biggest improvement was changing the training data:</p>
<blockquote class="blockquote">
<p>[…] replacing Wikitext and BookCorpus with the significantly larger and more diverse RefinedWeb dataset <strong>improved the score by +3.6%</strong> […]</p>
</blockquote>
<p>They trained NeoBERT on <a href="https://huggingface.co/datasets/tiiuae/falcon-refinedweb">RefinedWeb</a>, a 2.8 TB dataset. It contains 600B tokens and is nearly 18 times larger than RoBERTa’s training dataset.</p>
<blockquote class="blockquote">
<p>Following the same trend, we pre-trained NeoBERT on RefinedWeb (Penedo et al., 2023), a massive dataset containing 600B tokens, nearly 18 times larger than RoBERTa’s.</p>
</blockquote>
<p>I think it’s interesting that the newer NomicBERT was apparently trained on the same 13 GB dataset as BERT, while RoBERTa was trained on an extended dataset of 160 GB. Scaling up the training data is only now being applied to encoders, although it has been common practice for generative models for a while already.</p>
<blockquote class="blockquote">
<p>Recent generative models like the LLaMA family (Touvron et al., 2023; Dubey et al., 2024) have demonstrated that language models benefit from being trained on significantly more tokens than was previously standard. Recently, LLaMA-3.2 1B was successfully trained on up to 9T tokens without showing signs of saturation. Moreover, encoders are less sample-efficient than decoders since they only make predictions for masked tokens. <strong>Therefore, it is reasonable to believe that encoders of similar sizes can be trained on an equal or even greater number of tokens without saturating.</strong></p>
</blockquote>
</section>
<section id="increasing-model-size-better-models-even-at-small-scale" class="level3">
<h3 class="anchored" data-anchor-id="increasing-model-size-better-models-even-at-small-scale">Increasing model size = Better models (even at small scale)</h3>
<p>The second most impactful modification was increasing the model size and finding an optimal depth-to-width ratio for the Transformer architecture:</p>
<blockquote class="blockquote">
<p>[…] while increasing the model size from 120M to 250M in M7 <strong>led to a +2.9% relative improvement</strong>.</p>
</blockquote>
<p>So, NomicBERT and ModernBERT base both have around 150M parameters and are considered small-sized.</p>
<p>NeoBERT with 250M parameters can be considered medium-sized, so it makes sense that it performs better than smaller models.</p>
<p>But what’s interesting is that they took the depth-to-width ratio into consideration when increasing the model size:</p>
<blockquote class="blockquote">
<p>In contrast, small language models like BERT, RoBERTa, and NomicBERT are instead in a width-inefficiency regime. To maximize NeoBERT’s parameter efficiency while ensuring it remains a seamless plug-and-play replacement, we retain the original BERT base width of 768 and instead increase its depth to achieve this optimal ratio.</p>
</blockquote>
<p>So, they first increased the number of parameters to 250M with a depth-to-width ratio of 16 x 1056 (too wide) and then they optimized the depth-to-width ratio to 28 x 768 (more width-efficient).</p>
<blockquote class="blockquote">
<p>Note that to assess the impact of the depth-to-width ratio, we first scale the number of parameters in M7 to 250M while maintaining a similar ratio to BERT base, resulting in 16 layers of dimension 1056. In M8, the ratio is then adjusted to 28 layers of dimension 768.</p>
</blockquote>
<p>This is nice, because by keeping the hidden size at 768, NeoBERT can be easily swapped in for other base models:</p>
<blockquote class="blockquote">
<p>NeoBERT is designed for seamless adoption: it serves as a plug-and-play replacement for existing base models</p>
</blockquote>
<p>What I also think is remarkable is that it has a much faster inference speed compared to both ModernBERT models despite its size.</p>
<p>There’s a nice figure in the paper showing the throughput at different sequence lengths.</p>
<p>The figure is missing NomicBERT, though, and I don’t know why.</p>
<blockquote class="blockquote">
<p>For extended sequences, NeoBERT significantly outperforms ModernBERT base, despite having 100M more parameters, <strong>achieving a 46.7% speedup on sequences of 4,096 tokens</strong>.</p>
</blockquote>
</section>
</section>
<section id="old-encoders-vs.-modern-encoders" class="level2">
<h2 class="anchored" data-anchor-id="old-encoders-vs.-modern-encoders">Old Encoders vs.&nbsp;Modern Encoders</h2>
<p>The paper features a nice overview table comparing the characteristics of older encoders, like BERT (2019) and RoBERTa (2019), with those of newer encoders, like ModernBERT (2024) and NomicBERT (2024). Here I’m summarizing the key differences between older and newer encoders that stood out to me:</p>
<table>
<thead>
<tr>
<th>Configuration</th>
<th>Older Encoders</th>
<th>Newer Encoders</th>
</tr>
</thead>
<tbody>
<tr>
<td>Position encoding and sequence lengths</td>
<td>Absolute positional embeddings with a sequence length of 512</td>
<td>RoPE for handling longer sequences of 2,048 to 8,192</td>
</tr>
<tr>
<td>Masking rate</td>
<td>15 %</td>
<td>Optimal masking rate was found to be between 20 and 40 % <a href="https://arxiv.org/abs/2202.08005">by Wettig et al.</a></td>
</tr>
<tr>
<td>Optimizer</td>
<td>Adam</td>
<td>AdamW</td>
</tr>
<tr>
<td>Training</td>
<td>DDP</td>
<td>FlashAttention and others</td>
</tr>
<tr>
<td>Normalization</td>
<td>Post-Layer Normalization</td>
<td>Pre-Layer Normalization (normalization layer is moved inside the residual connection of each feed-forward and attention block)</td>
</tr>
</tbody>
</table>
</section>
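<p>To get a feel for the depth-to-width discussion above, here is a back-of-the-envelope sketch. It assumes a vanilla Transformer encoder block (four attention projection matrices plus a two-matrix feed-forward block with 4x expansion) and ignores embeddings, biases, and normalization weights. NeoBERT’s actual layer design differs, so the numbers are rough estimates and not the paper’s parameter counts; the function name <code>block_params</code> is made up for this example.</p>

```python
def block_params(layers: int, width: int, ffn_multiplier: int = 4) -> int:
    """Approximate weight count of a vanilla Transformer encoder stack.

    Per layer: 4 * d^2 for the attention projections (Q, K, V, output)
    plus 2 * m * d^2 for a two-matrix feed-forward block with expansion m.
    Embeddings, biases, and normalization weights are ignored.
    """
    per_layer = 4 * width**2 + 2 * ffn_multiplier * width**2
    return layers * per_layer


# (layers, hidden size) as quoted in the text above
configs = {
    "BERT base (12 x 768)": (12, 768),
    "M7 (16 x 1056)": (16, 1056),
    "M8 / NeoBERT (28 x 768)": (28, 768),
}

for name, (layers, width) in configs.items():
    params = block_params(layers, width)
    ratio = layers / width
    print(f"{name}: ~{params / 1e6:.0f}M block parameters, depth/width = {ratio:.4f}")
```

<p>Under these simplifying assumptions, the 16 x 1056 and 28 x 768 configurations land in the same parameter ballpark, and the 16 x 1056 variant keeps roughly the same depth-to-width ratio as BERT base, while the 28 x 768 variant more than doubles it without changing the hidden size.</p>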



]]></description>
  <category>Paper review</category>
  <guid>https://www.leoniemonigatti.com/blog/neobert.html</guid>
  <pubDate>Wed, 25 Jun 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.leoniemonigatti.com/blog/images/neobert.webp" medium="image" type="image/webp"/>
</item>
</channel>
</rss>
