Build your first AI agent from scratch in 30 minutes

Ren Okabe, June 16, 2026, 6 min read

An AI agent is just a loop: you call a model, the model asks to run a tool, you run it, you feed the result back, and you repeat until the model is done. In this tutorial you build that loop yourself in plain TypeScript against the Anthropic Messages API, no framework. You will wire up two tools (read a file, run a calculation), let the model orchestrate them, add a turn cap and basic guardrails, then verify the whole thing end to end. The result is a small research agent you fully understand and can extend with your own tools.

Updated on July 15, 2026

Quick answer (July 2026): An AI agent is a loop: you call the model, it asks to run a tool, you run the tool, you feed the result back, and you repeat until it is done. This tutorial builds that loop by hand in about 30 minutes in plain TypeScript against the Anthropic Messages API with claude-sonnet-5, no agent framework, so you understand every moving part before reaching for one.

Most "build an agent" tutorials hand you a framework and hide the loop that actually matters. This one does the opposite. In about 30 minutes you will write the agent loop yourself (model call, tool dispatch, result feedback) in plain TypeScript against the Anthropic Messages API, with no agent framework in the middle. By the end you will understand exactly what an agent is: a while loop that lets a model call functions until it decides it is done.
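To make that shape concrete before touching the API, here is a minimal sketch of the loop with the model replaced by a scripted stand-in. The names `Reply`, `toolbox`, `fakeModel`, and `runLoop` are hypothetical and exist only for this illustration; the real request and response shapes appear later in the tutorial.

```typescript
// A toy agent loop. The "model" here is a scripted stand-in, not an API call.
type Reply =
  | { kind: "tool"; name: string; input: string }
  | { kind: "final"; text: string };

const toolbox: Record<string, (input: string) => string> = {
  shout: (input) => input.toUpperCase(),
};

// Hypothetical stand-in: requests one tool call, then finishes once it sees a result.
function fakeModel(history: string[]): Reply {
  const lastResult = history.find((line) => line.startsWith("result:"));
  return lastResult
    ? { kind: "final", text: `Done. Tool said ${lastResult.slice("result:".length)}` }
    : { kind: "tool", name: "shout", input: "hello" };
}

function runLoop(task: string, maxTurns = 5): string {
  const history = [`task:${task}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = fakeModel(history);
    if (reply.kind === "final") return reply.text; // the model decided it is done
    const tool = toolbox[reply.name];
    const output = tool ? tool(reply.input) : `Error: unknown tool ${reply.name}`;
    history.push(`result:${output}`); // feed the result back for the next turn
  }
  return "Stopped: turn limit reached.";
}

console.log(runLoop("say hello loudly")); // prints "Done. Tool said HELLO"
```

Everything the rest of the tutorial adds (JSON schemas, tool_use blocks, error handling) is detail layered on this skeleton.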

We will build a small research agent that can read a local file and do arithmetic, then verify it end to end.

New to agents entirely? Our overview of how to make an AI agent walks through the five-part anatomy (model, instructions, tools, loop, guardrails) at a higher level; this tutorial is the hands-on, build-the-loop-by-hand companion to it.

Prerequisites

Node.js 20+ and npm installed.
An Anthropic API key exported as ANTHROPIC_API_KEY.
Comfort with basic TypeScript: async/await, interfaces, and JSON.parse.
A terminal and about 30 minutes.

Expected outcome: a runnable agent.ts script that accepts a natural-language task, lets the model call two tools (read_file and calculator) in a loop, and prints a final answer. You will be able to add your own tools without touching the loop.

Scaffold the project

Create a fresh directory and install the Anthropic SDK plus tsx so you can run TypeScript directly.

bash

mkdir first-agent && cd first-agent
npm init -y
npm install @anthropic-ai/sdk
npm install -D tsx typescript
npm pkg set type=module

Setting type=module matters: the code in this tutorial uses ES module syntax throughout (import statements, import.meta.url), and mixing CommonJS require into an ESM project is the most common first error.

> Heads up: if you later see Error [ERR_REQUIRE_ESM], something in your toolchain is loading an ES module with require. Keep "type": "module" in package.json and use import everywhere.

Define your tools as plain functions

A tool is nothing magical: it is a normal function that takes JSON-shaped input and returns a string. Create tools.ts:

typescript

import { readFile } from "node:fs/promises";

export async function read_file(input: { path: string }): Promise<string> {
  const text = await readFile(input.path, "utf8");
  // Trim to keep the model's context small.
  return text.slice(0, 4000);
}

export function calculator(input: { expression: string }): string {
  // Only allow digits and basic math operators, never eval arbitrary input.
  if (!/^[-+*/().\d\s]+$/.test(input.expression)) {
    return "Error: expression contains unsupported characters.";
  }
  // eslint-disable-next-line no-new-func
  const result = Function(`"use strict"; return (${input.expression});`)();
  return String(result);
}

> Heads up: never pass model output straight into eval. The regex guard above is the difference between a calculator and a remote code execution hole. Treat every tool input as untrusted.
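The same caution applies to read_file, which as written will open any path the model asks for. One possible guard, sketched below, resolves the requested path against a root directory and refuses anything that lands outside it. `resolveInsideRoot` is a hypothetical helper, not part of the tutorial's code, and the sketch assumes you want to confine reads to a single directory.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve a model-supplied path and refuse anything outside `root`.
function resolveInsideRoot(root: string, requested: string): string {
  const absoluteRoot = path.resolve(root);
  const target = path.resolve(absoluteRoot, requested);
  // A relative path that climbs upward (or is absolute) points outside the root.
  const relative = path.relative(absoluteRoot, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Refusing to read outside ${absoluteRoot}: ${requested}`);
  }
  return target;
}

console.log(resolveInsideRoot(process.cwd(), "notes.txt")); // absolute path under cwd
try {
  resolveInsideRoot(process.cwd(), "../secrets.txt");
} catch (err) {
  console.log((err as Error).message); // the refusal message
}
```

Inside read_file, the call would become readFile(resolveInsideRoot(process.cwd(), input.path), "utf8"). The check does not follow symlinks, so treat it as a first line of defense rather than a sandbox.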

Describe the tools to the model

The model cannot see your TypeScript. It only sees a JSON description of each tool: a name, a one-line purpose, and an input schema. Create schemas.ts:

typescript

import type Anthropic from "@anthropic-ai/sdk";

export const tools: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read the first 4000 characters of a UTF-8 text file from disk.",
    input_schema: {
      type: "object",
      properties: { path: { type: "string", description: "Relative path to the file." } },
      required: ["path"],
    },
  },
  {
    name: "calculator",
    description: "Evaluate a simple arithmetic expression like '12 * (3 + 4)'.",
    input_schema: {
      type: "object",
      properties: { expression: { type: "string" } },
      required: ["expression"],
    },
  },
];

The description fields are prompt engineering, not documentation. The model decides which tool to call based on these sentences, so make them specific about when to use each tool.
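Because the schema is a description sent to the model, the tutorial code itself never checks that the arguments coming back actually match it. If you want that check, one option is a small validator run before dispatch. The sketch below uses a simplified local `ToolSpec` type that mirrors the entries above (it is not the SDK's type), and `checkInput` is a hypothetical helper that only understands required fields and string properties.

```typescript
// Simplified local shape that mirrors the tool entries above; not the SDK's own type.
interface ToolSpec {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

const calculatorSpec: ToolSpec = {
  name: "calculator",
  description: "Evaluate a simple arithmetic expression like '12 * (3 + 4)'.",
  input_schema: {
    type: "object",
    properties: { expression: { type: "string" } },
    required: ["expression"],
  },
};

// Hypothetical check: list problems with a model-supplied input; an empty array means OK.
function checkInput(spec: ToolSpec, input: unknown): string[] {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return ["input must be a JSON object"];
  }
  const record = input as Record<string, unknown>;
  const problems: string[] = [];
  for (const key of spec.input_schema.required ?? []) {
    if (!(key in record)) problems.push(`missing required field "${key}"`);
  }
  for (const [key, value] of Object.entries(record)) {
    const expected = spec.input_schema.properties[key]?.type;
    if (expected === "string" && typeof value !== "string") {
      problems.push(`field "${key}" should be a string`);
    }
  }
  return problems;
}

console.log(checkInput(calculatorSpec, { expression: "2 + 2" })); // []
console.log(checkInput(calculatorSpec, { expr: "2 + 2" })); // [ 'missing required field "expression"' ]
```

Returning the problem list to the model as the tool result gives it a chance to correct the call on the next turn.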

Write the agent loop

This is the whole idea of an agent. Create agent.ts:

typescript

import Anthropic from "@anthropic-ai/sdk";
import { tools } from "./schemas.js";
import * as impl from "./tools.js";

const client = new Anthropic();
const MODEL = "claude-sonnet-5";
const MAX_TURNS = 10;

export async function runAgent(task: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [{ role: "user", content: task }];

  for (let turn = 0; turn < MAX_TURNS; turn++) {
    console.log(`[agent] turn ${turn + 1}`);
    const res = await client.messages.create({
      model: MODEL,
      max_tokens: 1024,
      tools,
      messages,
    });
    // Dispatch is filled in under "Handle tool_use and feed results back".
    return await handleResponse(res, messages);
  }
  return "Stopped: reached the turn limit without a final answer.";
}

Notice there is no framework here. The loop is the agent. Everything else is plumbing around messages.create.

Handle tool_use and feed results back

When the model wants to run a tool, it returns stop_reason: "tool_use" with one or more tool_use blocks. You run each tool and reply with matching tool_result blocks. Replace the placeholder with a real dispatcher:

typescript

const REGISTRY: Record<string, (input: any) => unknown> = {
  read_file: impl.read_file,
  calculator: impl.calculator,
};

async function handleResponse(
  res: Anthropic.Message,
  messages: Anthropic.MessageParam[],
): Promise<string> {
  // Record what the model just said (text + tool requests).
  messages.push({ role: "assistant", content: res.content });

  if (res.stop_reason !== "tool_use") {
    const text = res.content.find((b) => b.type === "text");
    return text && text.type === "text" ? text.text : "(no text returned)";
  }

  const results: Anthropic.ToolResultBlockParam[] = [];
  for (const block of res.content) {
    if (block.type !== "tool_use") continue;
    const fn = REGISTRY[block.name];
    console.log(`[agent] -> ${block.name}(${JSON.stringify(block.input)})`);
    let output: string;
    try {
      output = String(await fn(block.input));
    } catch (err) {
      console.error(`[agent] tool ${block.name} failed:`, err);
      output = `Error running ${block.name}: ${(err as Error).message}`;
    }
    results.push({ type: "tool_result", tool_use_id: block.id, content: output });
  }

  messages.push({ role: "user", content: results });
  // Loop again with the tool results in context.
  return runAgain(messages);
}

The key insight: a tool_result is sent back as a user message. From the model's perspective, the environment is a user that answers its tool requests.

> Heads up: you must push the assistant's entire content array back into messages before adding the tool results. If you drop the original tool_use blocks, the API rejects the next request because the tool_result has no matching tool_use_id.
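When that 400 error does show up, it can help to check the message history locally before sending it. The sketch below is a hypothetical debugging helper, `findOrphanResults`, written against simplified local types modeled on the blocks used above (not the SDK's own types). It reports every tool_result whose id was not requested by the assistant message immediately before it.

```typescript
// Simplified local shapes modeled on the blocks used above; not the SDK's own types.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: string | Block[];
}

// Hypothetical helper: return tool_result ids with no matching tool_use
// in the assistant message immediately before them.
function findOrphanResults(messages: Message[]): string[] {
  const orphans: string[] = [];
  messages.forEach((message, index) => {
    if (message.role !== "user" || typeof message.content === "string") return;
    const previous = messages[index - 1];
    const requested = new Set<string>();
    if (previous && previous.role === "assistant" && typeof previous.content !== "string") {
      for (const block of previous.content) {
        if (block.type === "tool_use") requested.add(block.id);
      }
    }
    for (const block of message.content) {
      if (block.type === "tool_result" && !requested.has(block.tool_use_id)) {
        orphans.push(block.tool_use_id);
      }
    }
  });
  return orphans;
}

const good: Message[] = [
  { role: "user", content: "Read notes.txt" },
  {
    role: "assistant",
    content: [{ type: "tool_use", id: "toolu_1", name: "read_file", input: { path: "notes.txt" } }],
  },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "toolu_1", content: "file text" }] },
];
// Simulate the bug: the assistant message was never pushed.
const bad = good.filter((message) => message.role !== "assistant");

console.log(findOrphanResults(good)); // []
console.log(findOrphanResults(bad)); // [ 'toolu_1' ]
```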

Add a stop condition and a clean entry point

Wrap the loop so it re-enters cleanly and honors MAX_TURNS. Refactor runAgent to call the model inside the loop and delegate to handleResponse, which recurses through a small runAgain helper:

typescript

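async function runAgain(messages: Anthropic.MessageParam[]): Promise<string> {
  // Every completed model call has pushed one assistant message, so this count
  // is the number of turns used so far, including the first one in runAgent.
  const turnsUsed = messages.filter((m) => m.role === "assistant").length;
  if (turnsUsed >= MAX_TURNS) {
    return "Stopped: reached the turn limit without a final answer.";
  }
  const res = await client.messages.create({
    model: MODEL, max_tokens: 1024, tools, messages,
  });
  return handleResponse(res, messages);
}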

if (import.meta.url === `file://${process.argv[1]}`) {
  const task = process.argv.slice(2).join(" ") || "Read notes.txt and tell me the total of any numbers in it.";
  runAgent(task).then((answer) => console.log("\n=== FINAL ===\n" + answer));
}

The turn cap is your single most important guardrail. Without it, a confused model can loop on the same failing tool until you run out of budget.
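A turn cap limits the damage but does not tell you why the run went wrong. If you want an earlier signal for the "same failing tool" case, one option is to count identical calls. `RepeatGuard` below is a hypothetical helper, not part of the tutorial's code; it treats two calls as identical when the tool name and the JSON-serialized input match, which assumes the model emits keys in a stable order.

```typescript
// Hypothetical guard: flag a tool call that repeats with identical input too often.
class RepeatGuard {
  private readonly counts = new Map<string, number>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns true once this exact call has been seen more than `limit` times.
  tooMany(name: string, input: unknown): boolean {
    const key = `${name}:${JSON.stringify(input)}`;
    const seen = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, seen);
    return seen > this.limit;
  }
}

const guard = new RepeatGuard(2);
const failingCall = { path: "missing.txt" };
console.log(guard.tooMany("read_file", failingCall)); // false (first call)
console.log(guard.tooMany("read_file", failingCall)); // false (second call)
console.log(guard.tooMany("read_file", failingCall)); // true (third identical call)
```

In the dispatcher you could call tooMany before running the tool and, when it returns true, send back an error string as the tool result or end the run.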

Verify your install

Create a test file the agent can read, then run it.

bash

echo "Q2 revenue lines: 1200, 980, 1500. Sum them." > notes.txt
export ANTHROPIC_API_KEY=sk-ant-...
npx tsx agent.ts "Read notes.txt and compute the sum of the numbers."

You should see the agent make two tool calls (one read_file, then one calculator) and then print something like the following; the exact wording of the final answer will vary:

text

[agent] turn 1
[agent] -> read_file({"path":"notes.txt"})
[agent] -> calculator({"expression":"1200 + 980 + 1500"})

=== FINAL ===
The sum of the numbers in notes.txt is 3680.

If you see the read_file call but no calculator call, your tool descriptions are probably too vague: make the calculator description explicitly mention summing numbers. If you get a 400 error about tool_use_id, re-read the "Handle tool_use and feed results back" section: you dropped the assistant message.

Limitations and open questions

This agent is deliberately minimal, and that minimalism has costs:

No streaming. You wait for each full turn. For a UI you would stream text and tool calls as they arrive.
No parallelism. Tools run sequentially. The Messages API can return several tool_use blocks at once; here we run them in order rather than with Promise.all.
No memory across runs. Each invocation starts fresh. Persisting messages to a store is the next step toward a stateful assistant.
Naive error recovery. We return tool errors to the model and hope it adapts. Production agents need retry budgets and explicit failure handling per tool.
First-build stability is a measurable property. Per-prompt outcomes vary across runs even when the loop logic is identical. BuilderProof's recent first-build stability axis proposal documents a per-prompt outcome scoring scheme (clean, self-recoverable, operator-intervention, failed) that adapts cleanly to agent loops of this shape; worth reading before you decide what failure rate is tolerable for your stack.
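As an illustration of the parallelism point in the list above, here is a minimal sketch of a dispatch step that starts every requested tool at once and awaits them together with Promise.all. `ToolCall`, `ToolResult`, `ToolFn`, `runToolsInParallel`, and the demo registry are hypothetical names using simplified local shapes, and the sketch assumes your tools are safe to run concurrently.

```typescript
// Simplified local shapes; the real blocks come from the Messages API response.
interface ToolCall { id: string; name: string; input: unknown }
interface ToolResult { type: "tool_result"; tool_use_id: string; content: string }
type ToolFn = (input: unknown) => Promise<string> | string;

// Hypothetical variant of the dispatch step: start every call, then await them together.
async function runToolsInParallel(
  calls: ToolCall[],
  registry: Record<string, ToolFn>,
): Promise<ToolResult[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolResult> => {
      let content: string;
      try {
        const fn = registry[call.name];
        if (!fn) throw new Error(`unknown tool ${call.name}`);
        content = String(await fn(call.input));
      } catch (err) {
        // A failing tool becomes an error result instead of rejecting the whole batch.
        content = `Error running ${call.name}: ${(err as Error).message}`;
      }
      return { type: "tool_result", tool_use_id: call.id, content };
    }),
  );
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
const demoRegistry: Record<string, ToolFn> = {
  slow: async () => { await sleep(30); return "slow done"; },
  fast: async () => "fast done",
};

runToolsInParallel(
  [
    { id: "a", name: "slow", input: {} },
    { id: "b", name: "fast", input: {} },
    { id: "c", name: "nope", input: {} },
  ],
  demoRegistry,
).then((results) => console.log(results.map((r) => r.content)));
```

Promise.all returns results in the order of the input array, so each tool_result still lines up with its tool_use even when a later tool finishes first.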

Prefer not to hand-build streaming, parallel tool calls, and context management yourself? Anthropic's official Claude Agent SDK ships the agent loop, tool handling, and permissions as a managed package. We break down when the SDK wins and when writing the loop yourself still makes sense so you can choose deliberately.

The open question every team hits next is when to stop: a fixed turn cap is blunt. Better stop conditions (confidence thresholds, explicit "done" tools, or a planner that commits to a step count) are an active area worth exploring once the basic loop works.
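To illustrate the explicit "done" tool idea, here is one possible shape for it. The `finish` tool, its schema, and the `extractFinish` helper are all hypothetical: this is a sketch of the pattern under the assumption that you add the tool to the tools array and check for it before dispatching anything else, not something the Messages API provides.

```typescript
// Simplified local shape for a tool request; not the SDK's own type.
interface ToolUse { type: "tool_use"; id: string; name: string; input: unknown }

// Hypothetical "done" tool the model would call to end the run explicitly.
const finishTool = {
  name: "finish",
  description: "Call this exactly once, when the task is complete, with the final answer.",
  input_schema: {
    type: "object" as const,
    properties: { answer: { type: "string", description: "The final answer for the user." } },
    required: ["answer"],
  },
};

// Returns the final answer if the model called `finish`, otherwise null (keep looping).
function extractFinish(blocks: ToolUse[]): string | null {
  for (const block of blocks) {
    if (block.name !== finishTool.name) continue;
    const input = block.input as { answer?: unknown } | null;
    if (input && typeof input.answer === "string") return input.answer;
  }
  return null;
}

const requested: ToolUse[] = [
  { type: "tool_use", id: "toolu_9", name: "finish", input: { answer: "The total is 3680." } },
];
console.log(extractFinish(requested)); // prints "The total is 3680."
```

The turn cap still belongs in the loop as a backstop, since nothing forces the model to call finish.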

Once the tool surface includes a retrieval call, the loop shape stays the same but the failure modes shift: tool errors become "context missing" answers instead of crashes, and the silent failure becomes the model answering from prior knowledge rather than the retrieved context. ShipGarden's LlamaIndex.TS RAG starter review for June 2026 walks a Next.js retrieval tool into exactly this loop shape and shows where the eval pass catches the silent-failure mode early.

Written by

Ren Okabe

Ren builds agent infrastructure and writes copy-paste tutorials for engineers shipping LLM tool-use systems.

Frequently asked questions

Do I need an agent framework like LangChain to build an agent?

No. An agent is a loop that calls a model, runs whatever tools the model asks for, feeds the results back, and repeats until the model stops requesting tools. Frameworks wrap that loop in abstractions, but the loop itself is about 30 lines of code. Writing it yourself first makes those frameworks far easier to debug later.

Which model should I use for a first agent?

Start with a mid-tier model such as claude-sonnet-5. It follows tool schemas reliably and is cheap enough to iterate with. Move to a larger model only if you see the agent failing to choose the right tool or mis-formatting arguments.

How does the model actually 'call' my function?

It does not call it directly. The model returns a structured tool_use block containing a tool name and JSON arguments. Your code reads that block, runs the matching function, and returns the result in a tool_result block on the next turn. The model never executes code itself.

How do I stop the agent from looping forever?

Cap the number of turns (for example 10) and break the loop when the model returns a stop_reason other than tool_use. Always log each turn so you can see where a runaway loop comes from.
