Ren Okabe · 12 min read

Build your first AI agent from scratch in 30 minutes

An AI agent is just a loop: you call a model, the model asks to run a tool, you run it, you feed the result back, and you repeat until the model is done. In this tutorial you build that loop yourself in plain TypeScript against the Anthropic Messages API — no framework. You will wire up two tools (read a file, run a calculation), let the model orchestrate them, add a turn cap and basic guardrails, then verify the whole thing end to end. The result is a small research agent you fully understand and can extend with your own tools.

Most "build an agent" tutorials hand you a framework and hide the loop that actually matters. This one does the opposite. In about 30 minutes you will write the agent loop yourself — model call, tool dispatch, result feedback — in plain TypeScript against the Anthropic Messages API, with no agent framework in the middle. By the end you will understand exactly what an agent is: a while loop that lets a model call functions until it decides it is done.

We will build a small research agent that can read a local file and do arithmetic, then verify it end to end.

Prerequisites

  • Node.js 20+ and npm installed.
  • An Anthropic API key exported as ANTHROPIC_API_KEY.
  • Comfort with basic TypeScript: async/await, interfaces, and JSON.parse.
  • A terminal and about 30 minutes.

Expected outcome: a runnable agent.ts script that accepts a natural-language task, lets the model call two tools (read_file and calculator) in a loop, and prints a final answer. You will be able to add your own tools without touching the loop.

Scaffold the project

Create a fresh directory and install the Anthropic SDK plus tsx so you can run TypeScript directly.

bash
mkdir first-agent && cd first-agent
npm init -y
npm install @anthropic-ai/sdk
npm install -D tsx typescript
npm pkg set type=module

Setting type=module matters: every file in this tutorial uses ESM syntax (import statements, .js import specifiers, import.meta.url), and mixing CommonJS require into an ESM project is the most common first error.

> Heads up: if you later see Error [ERR_REQUIRE_ESM], it means something in your toolchain is loading an ES module with require. Keep "type": "module" in package.json and use import everywhere.

Define your tools as plain functions

A tool is nothing magical — it is a normal function that takes JSON-shaped input and returns a string. Create tools.ts:

typescript
import { readFile } from "node:fs/promises";

export async function read_file(input: { path: string }): Promise<string> {
  const text = await readFile(input.path, "utf8");
  // Trim to keep the model's context small.
  return text.slice(0, 4000);
}

export function calculator(input: { expression: string }): string {
  // Only allow digits and basic math operators — never eval arbitrary input.
  if (!/^[-+*/().\d\s]+$/.test(input.expression)) {
    return "Error: expression contains unsupported characters.";
  }
  // eslint-disable-next-line no-new-func
  const result = Function(`"use strict"; return (${input.expression});`)();
  return String(result);
}

> Heads up: never pass model output straight into eval. The regex guard above is the difference between a calculator and a remote code execution hole. Treat every tool input as untrusted.
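One way to remove dynamic evaluation altogether is to parse the expression yourself. The sketch below is an alternative to the Function-based calculator, not the tutorial's implementation: a small recursive-descent evaluator (the name evaluate is an assumption) that accepts + - * /, parentheses, unary minus, and decimals, and throws on anything else.

```typescript
// Hypothetical alternative to the Function-based calculator. Grammar:
//   expr   := term (("+" | "-") term)*
//   term   := factor (("*" | "/") factor)*
//   factor := "-" factor | "(" expr ")" | number
function evaluate(source: string): number {
  let pos = 0;

  // Skip whitespace and return the next character ("" at end of input).
  const peek = (): string => {
    while (pos < source.length && /\s/.test(source.charAt(pos))) pos++;
    return source.charAt(pos);
  };

  function parseExpr(): number {
    let value = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = peek();
      pos++;
      const right = parseTerm();
      value = op === "+" ? value + right : value - right;
    }
    return value;
  }

  function parseTerm(): number {
    let value = parseFactor();
    while (peek() === "*" || peek() === "/") {
      const op = peek();
      pos++;
      const right = parseFactor();
      value = op === "*" ? value * right : value / right;
    }
    return value;
  }

  function parseFactor(): number {
    const ch = peek();
    if (ch === "-") {
      pos++;
      return -parseFactor();
    }
    if (ch === "(") {
      pos++;
      const value = parseExpr();
      if (peek() !== ")") throw new Error(`Expected ")" at position ${pos}`);
      pos++;
      return value;
    }
    const match = /^\d+(\.\d+)?/.exec(source.slice(pos));
    if (!match) throw new Error(`Unexpected input at position ${pos}`);
    pos += match[0].length;
    return Number(match[0]);
  }

  const result = parseExpr();
  if (peek() !== "") throw new Error(`Unexpected input at position ${pos}`);
  return result;
}

console.log(evaluate("12 * (3 + 4)")); // 84
console.log(evaluate("1200 + 980 + 1500")); // 3680
```

Because the parser only recognizes numbers and operators, there is no code path that executes text from the model. The trade-off is more code to maintain than a regex plus Function.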

Describe the tools to the model

The model cannot see your TypeScript. It only sees a JSON description of each tool: a name, a one-line purpose, and an input schema. Create schemas.ts:

typescript
import type Anthropic from "@anthropic-ai/sdk";

export const tools: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read the first 4000 characters of a UTF-8 text file from disk.",
    input_schema: {
      type: "object",
      properties: { path: { type: "string", description: "Relative path to the file." } },
      required: ["path"],
    },
  },
  {
    name: "calculator",
    description: "Evaluate a simple arithmetic expression like '12 * (3 + 4)'.",
    input_schema: {
      type: "object",
      properties: { expression: { type: "string" } },
      required: ["expression"],
    },
  },
];

The description fields are prompt engineering, not documentation. The model decides which tool to call based on these sentences, so make them specific about when to use each tool.
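The input_schema is sent to the model, but your own code can check tool inputs against it too before running a tool. Here is a minimal sketch, assuming a hypothetical helper checkInput and a simplified local ObjectSchema type that mirrors the schema shape above (it is not an SDK type). It covers only required keys and primitive typeof checks, not full JSON Schema.

```typescript
// Simplified local mirror of the input_schema objects above (not an SDK type).
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required?: string[];
}

// Hypothetical helper: returns a list of problems, empty when the input passes.
// Only handles types that match a typeof result ("string", "number", "boolean").
function checkInput(schema: ObjectSchema, input: unknown): string[] {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return ["input must be a JSON object"];
  }
  const record = input as Record<string, unknown>;
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (!(key in record)) problems.push(`missing required field "${key}"`);
  }
  for (const [key, spec] of Object.entries(schema.properties)) {
    if (key in record && typeof record[key] !== spec.type) {
      problems.push(`field "${key}" should be of type ${spec.type}`);
    }
  }
  return problems;
}

const readFileSchema: ObjectSchema = {
  type: "object",
  properties: { path: { type: "string", description: "Relative path to the file." } },
  required: ["path"],
};

console.log(checkInput(readFileSchema, { path: "notes.txt" })); // no problems
console.log(checkInput(readFileSchema, { path: 42 })); // wrong type for "path"
console.log(checkInput(readFileSchema, {})); // "path" is missing
```

Returning the problem list to the model as the tool result gives it a chance to correct its arguments on the next turn.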

Write the agent loop

This is the whole idea of an agent. Create agent.ts:

typescript
import Anthropic from "@anthropic-ai/sdk";
import { tools } from "./schemas.js";
import * as impl from "./tools.js";

const client = new Anthropic();
const MODEL = "claude-sonnet-4-6";
const MAX_TURNS = 10;

export async function runAgent(task: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [{ role: "user", content: task }];

  for (let turn = 0; turn < MAX_TURNS; turn++) {
    console.log(`[agent] turn ${turn + 1}`);
    const res = await client.messages.create({
      model: MODEL,
      max_tokens: 1024,
      tools,
      messages,
    });
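    // handleResponse (next section) runs any requested tools and keeps the
    // conversation going through runAgain, so it resolves with the final answer.
    return await handleResponse(res, messages);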
  }
  return "Stopped: reached the turn limit without a final answer.";
}

Notice there is no framework here. The loop is the agent. Everything else is plumbing around messages.create.
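To see the control flow without spending API calls, the same loop can be sketched against a scripted stand-in for the model. Everything in this block is hypothetical (the Step and Model types, runLoop, the add tool); it illustrates the shape of the loop and the turn cap, not the SDK.

```typescript
// Hypothetical stand-ins: no SDK, no network.
type Step =
  | { kind: "tool"; name: string; input: string }
  | { kind: "final"; text: string };

type Model = (history: string[]) => Step;
type ToolFn = (input: string) => string;

function runLoop(
  task: string,
  model: Model,
  toolbox: Record<string, ToolFn>,
  maxTurns = 10,
): string {
  const history: string[] = [`user: ${task}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = model(history);
    if (step.kind === "final") return step.text; // the model decided it is done
    const tool: ToolFn | undefined = toolbox[step.name];
    const output = tool ? tool(step.input) : `Error: unknown tool ${step.name}`;
    // Record the request and its result so the next turn can see both.
    history.push(`assistant: call ${step.name}(${step.input})`, `tool: ${output}`);
  }
  return "Stopped: reached the turn limit without a final answer.";
}

// A scripted "model": asks for one tool call, then answers with its result.
const scripted: Model = (history) => {
  const last = history[history.length - 1] ?? "";
  return last.startsWith("tool: ")
    ? { kind: "final", text: `The total is ${last.slice("tool: ".length)}.` }
    : { kind: "tool", name: "add", input: "1200,980,1500" };
};

const toolbox: Record<string, ToolFn> = {
  add: (csv) => String(csv.split(",").reduce((sum, n) => sum + Number(n), 0)),
};

console.log(runLoop("Sum the revenue lines.", scripted, toolbox));

// A model that never finishes hits the cap instead of looping forever.
const stubborn: Model = () => ({ kind: "tool", name: "add", input: "1,1" });
console.log(runLoop("Loop forever.", stubborn, toolbox, 3));
```

This version is written iteratively, so the cap is enforced by the for loop itself.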

Handle tool_use and feed results back

When the model wants to run a tool, it returns stop_reason: "tool_use" with one or more tool_use blocks. You run each tool and reply with matching tool_result blocks. Replace the placeholder with a real dispatcher:

typescript
const REGISTRY: Record<string, (input: any) => unknown> = {
  read_file: impl.read_file,
  calculator: impl.calculator,
};

async function handleResponse(
  res: Anthropic.Message,
  messages: Anthropic.MessageParam[],
): Promise<string> {
  // Record what the model just said (text + tool requests).
  messages.push({ role: "assistant", content: res.content });

  if (res.stop_reason !== "tool_use") {
    const text = res.content.find((b) => b.type === "text");
    return text && text.type === "text" ? text.text : "(no text returned)";
  }

  const results: Anthropic.ToolResultBlockParam[] = [];
  for (const block of res.content) {
    if (block.type !== "tool_use") continue;
    const fn = REGISTRY[block.name];
    console.log(`[agent] -> ${block.name}(${JSON.stringify(block.input)})`);
    let output: string;
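    try {
      // An unknown tool name leaves fn undefined; report that clearly to the model.
      if (!fn) throw new Error(`unknown tool "${block.name}"`);
      output = String(await fn(block.input));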
    } catch (err) {
      console.error(`[agent] tool ${block.name} failed:`, err);
      output = `Error running ${block.name}: ${(err as Error).message}`;
    }
    results.push({ type: "tool_result", tool_use_id: block.id, content: output });
  }

  messages.push({ role: "user", content: results });
  // Loop again with the tool results in context.
  return runAgain(messages);
}

The key insight: a tool_result is sent back as a user message. From the model's perspective, the environment is a user that answers its tool requests.

> Heads up: you must push the assistant's entire content array back into messages before adding the tool results. If you drop the original tool_use blocks, the API rejects the next request because the tool_result has no matching tool_use_id.
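The pairing rule can be checked locally before a request goes out. The sketch below assumes simplified local types that mirror the block shapes described above (they are not SDK types) and a hypothetical helper, orphanToolResults, that lists every tool_result whose tool_use_id does not appear in the assistant message immediately before it. The ids are made up for illustration.

```typescript
// Simplified local shapes (not SDK types).
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Msg {
  role: "user" | "assistant";
  content: string | Block[];
}

// Hypothetical helper: ids of tool_result blocks with no matching tool_use
// in the assistant message that directly precedes them.
function orphanToolResults(messages: Msg[]): string[] {
  const orphans: string[] = [];
  messages.forEach((msg, i) => {
    const content = msg.content;
    if (msg.role !== "user" || typeof content === "string") return;
    const prev = i > 0 ? messages[i - 1] : undefined;
    const requested = new Set<string>();
    if (prev && prev.role === "assistant" && typeof prev.content !== "string") {
      for (const block of prev.content) {
        if (block.type === "tool_use") requested.add(block.id);
      }
    }
    for (const block of content) {
      if (block.type === "tool_result" && !requested.has(block.tool_use_id)) {
        orphans.push(block.tool_use_id);
      }
    }
  });
  return orphans;
}

const good: Msg[] = [
  { role: "user", content: "Read notes.txt" },
  {
    role: "assistant",
    content: [{ type: "tool_use", id: "toolu_1", name: "read_file", input: { path: "notes.txt" } }],
  },
  {
    role: "user",
    content: [{ type: "tool_result", tool_use_id: "toolu_1", content: "1200, 980, 1500" }],
  },
];

// Simulate the mistake from the callout: the assistant message was dropped.
const broken = good.filter((m) => m.role !== "assistant");

console.log(orphanToolResults(good)); // empty list
console.log(orphanToolResults(broken)); // lists "toolu_1"
```

Running a check like this in development turns the API's 400 response into a local error that points at the offending id.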

Add a stop condition and a clean entry point

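The dispatcher calls runAgain to re-enter the model after each batch of tool results. Because that recursive path never returns to the for loop in runAgent, runAgain is also where MAX_TURNS has to be enforced: it counts the assistant messages already recorded and stops once the cap is reached. Add the helper and a small entry point to agent.ts:

typescript
async function runAgain(messages: Anthropic.MessageParam[]): Promise<string> {
  // One assistant message is recorded per completed turn, so this count is
  // the number of model calls made so far.
  const turns = messages.filter((m) => m.role === "assistant").length;
  if (turns >= MAX_TURNS) {
    return "Stopped: reached the turn limit without a final answer.";
  }
  const res = await client.messages.create({
    model: MODEL, max_tokens: 1024, tools, messages,
  });
  return handleResponse(res, messages);
}

// Run only when this file is executed directly, not when it is imported.
if (import.meta.url === `file://${process.argv[1]}`) {
  const task = process.argv.slice(2).join(" ") || "Read notes.txt and tell me the total of any numbers in it.";
  runAgent(task)
    .then((answer) => console.log("\n=== FINAL ===\n" + answer))
    .catch((err) => {
      console.error(err);
      process.exitCode = 1;
    });
}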

The turn cap is your single most important guardrail. Without it, a confused model can loop on the same failing tool until you run out of budget.
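A turn cap only trips after the budget is spent. A complementary guard, sketched here under the assumption that a repeated identical call signals a stuck model, stops earlier. The helper makeRepeatGuard is hypothetical and not part of the tutorial's agent.ts.

```typescript
// Hypothetical guard: the returned function reports true once the same tool
// call (same name, same JSON-serialized input) has been seen `limit` times in a row.
function makeRepeatGuard(limit = 3): (name: string, input: unknown) => boolean {
  let lastKey = "";
  let streak = 0;
  return (name, input) => {
    // JSON.stringify is key-order sensitive, which is good enough for a sketch.
    const key = `${name}:${JSON.stringify(input)}`;
    streak = key === lastKey ? streak + 1 : 1;
    lastKey = key;
    return streak >= limit;
  };
}

const tripped = makeRepeatGuard(3);
console.log(tripped("read_file", { path: "missing.txt" })); // false
console.log(tripped("read_file", { path: "missing.txt" })); // false
console.log(tripped("read_file", { path: "missing.txt" })); // true
console.log(tripped("calculator", { expression: "1 + 1" })); // false, the streak resets
```

In the dispatcher, a guard like this would be consulted before running each tool, and the agent would return a "stopped" message when it reports true.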

Verify your install

Create a test file the agent can read, then run it.

bash
echo "Q2 revenue lines: 1200, 980, 1500. Sum them." > notes.txt
export ANTHROPIC_API_KEY=sk-ant-...
npx tsx agent.ts "Read notes.txt and compute the sum of the numbers."

You should see two tool calls, one read_file and one calculator, followed by the final answer. The exact wording varies from run to run, but the output should look like this:

text
[agent] turn 1
[agent] -> read_file({"path":"notes.txt"})
[agent] -> calculator({"expression":"1200 + 980 + 1500"})

=== FINAL ===
The sum of the numbers in notes.txt is 3680.

If you see the read_file call but no calculator call, your tool descriptions are too vague: make the calculator description explicitly mention summing numbers. If you get a 400 error about tool_use_id, re-read "Handle tool_use and feed results back": you dropped the assistant message.

Limitations and open questions

This agent is deliberately minimal, and that minimalism has costs:

  • No streaming. You wait for each full turn. For a UI you would stream text and tool calls as they arrive.
  • No parallelism. Tools run sequentially. The Messages API can return several tool_use blocks at once; here we run them in order rather than with Promise.all.
  • No memory across runs. Each invocation starts fresh. Persisting messages to a store is the next step toward a stateful assistant.
  • Naive error recovery. We return tool errors to the model and hope it adapts. Production agents need retry budgets and explicit failure handling per tool.
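The parallelism limitation is the easiest one to lift when tools do not depend on each other. Here is a minimal sketch, assuming hypothetical local types (ToolCall, ToolOutput, AsyncTool) and a hypothetical runAll helper in place of the tutorial's sequential for loop. Promise.all preserves input order, and each call catches its own error so one failure does not reject the whole batch.

```typescript
// Hypothetical local shapes; in the agent these would be tool_use and tool_result blocks.
interface ToolCall {
  id: string;
  name: string;
  input: unknown;
}
interface ToolOutput {
  tool_use_id: string;
  content: string;
}
type AsyncTool = (input: unknown) => Promise<string> | string;

async function runAll(
  calls: ToolCall[],
  registry: Record<string, AsyncTool>,
): Promise<ToolOutput[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolOutput> => {
      try {
        const fn: AsyncTool | undefined = registry[call.name];
        if (!fn) throw new Error(`unknown tool "${call.name}"`);
        return { tool_use_id: call.id, content: String(await fn(call.input)) };
      } catch (err) {
        // Report the failure as this call's result so the batch still resolves.
        const message = err instanceof Error ? err.message : String(err);
        return { tool_use_id: call.id, content: `Error running ${call.name}: ${message}` };
      }
    }),
  );
}

const demoRegistry: Record<string, AsyncTool> = {
  slow: () => new Promise<string>((resolve) => setTimeout(() => resolve("slow done"), 20)),
  fast: () => "fast done",
};

const demoCalls: ToolCall[] = [
  { id: "call_1", name: "slow", input: {} },
  { id: "call_2", name: "fast", input: {} },
  { id: "call_3", name: "nope", input: {} },
];

// Results come back in request order even though "fast" finishes first.
runAll(demoCalls, demoRegistry).then((outputs) => console.log(outputs));
```

Tools with side effects that must happen in sequence should stay on the sequential path.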

The open question every team hits next is when to stop: a fixed turn cap is blunt. Better stop conditions — confidence thresholds, explicit "done" tools, or a planner that commits to a step count — are an active area worth exploring once the basic loop works.

Sources

  • Anthropic, "Tool use (function calling) with the Messages API" — official documentation, 2025.
  • Anthropic, "Building effective agents" — engineering guide on agent loops and patterns, 2024.
  • Node.js Documentation, "ECMAScript modules" — node:fs/promises and ESM behavior, 2025.

Written by Ren Okabe

Ren builds agent infrastructure and writes copy-paste tutorials for engineers shipping LLM tool-use systems.

Frequently asked questions

Do I need an agent framework like LangChain to build an agent?

No. An agent is a loop that calls a model, runs whatever tools the model asks for, feeds the results back, and repeats until the model stops requesting tools. Frameworks wrap that loop in abstractions, but the loop itself is about 30 lines of code. Writing it yourself first makes those frameworks far easier to debug later.

Which model should I use for a first agent?

Start with a mid-tier model such as claude-sonnet-4-6. It follows tool schemas reliably and is cheap enough to iterate with. Move to a larger model only if you see the agent failing to choose the right tool or mis-formatting arguments.

How does the model actually 'call' my function?

It does not call it directly. The model returns a structured tool_use block containing a tool name and JSON arguments. Your code reads that block, runs the matching function, and returns the result in a tool_result block on the next turn. The model never executes code itself.

How do I stop the agent from looping forever?

Cap the number of turns (for example 10) and break the loop when the model returns a stop_reason other than tool_use. Always log each turn so you can see where a runaway loop comes from.
