Run Claude Tool Calls in Parallel (TypeScript, 2026)

Ren Okabe

July 2, 2026 · 5 min read

Speed up your Claude agent loop by running independent tool calls concurrently in TypeScript. Copy-paste code for parallel tool_use, is_error handling, and disable_parallel_tool_use.

When your Claude agent needs to call two or more independent tools in one turn, running them one after another wastes wall-clock time. As of July 2, 2026, Claude returns every tool call as a separate tool_use block in a single assistant message, so you can execute them all at once with Promise.all and return one tool_result block per call. This tutorial builds a small TypeScript agent loop, shows the naive sequential version, then fixes it. Two tools that each take about 800 ms run in roughly 1.6 seconds sequentially and about 0.8 seconds in parallel. Across a five-tool fan-out the gap widens from about 4 seconds to about 0.8 seconds.

What you will build

A single-file TypeScript agent loop that lets Claude call two independent tools, get_weather and get_time, and executes both concurrently. You will see exactly where the naive loop blocks, how to fan the tool calls out, how to return results the API accepts, and when to switch parallel execution off.

Prerequisites

Node.js 20+ and a package manager.
The Anthropic TypeScript SDK: npm i @anthropic-ai/sdk.
An ANTHROPIC_API_KEY in your environment.
Basic familiarity with the agent loop. If you have not built one yet, start with your first Claude agent from scratch.

This tutorial uses TypeScript and the model id claude-opus-4-8, the current default for tool use per the Anthropic tool-use docs (2026).

Why sequential tool execution is the bottleneck

Claude decides on its own when a turn needs more than one tool. Given a prompt like "What is the weather and local time in Lisbon?", it emits two tool_use blocks in the same response. Those two calls do not depend on each other. If you await them one at a time, your loop sits idle during the first tool's network round trip before it even starts the second. The model already did the parallel thinking; the slowdown is entirely in your executor.
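To make the response shape concrete, here is a minimal sketch of what the assistant content for that prompt might look like and how to pick out the tool calls. The block types are local stand-ins so the snippet runs without the SDK, and the ids and text are invented for illustration.

```typescript
// Local stand-ins for the SDK's content block types, so this runs without the SDK.
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;

// Hypothetical assistant content for the Lisbon prompt; the ids are made up.
const sampleContent: ContentBlock[] = [
  { type: "text", text: "I'll look up both." },
  { type: "tool_use", id: "toolu_01A", name: "get_weather", input: { city: "Lisbon" } },
  { type: "tool_use", id: "toolu_01B", name: "get_time", input: { city: "Lisbon" } },
];

// Keep only the tool calls; the type guard narrows the union to ToolUseBlock.
function pendingToolCalls(content: ContentBlock[]): ToolUseBlock[] {
  return content.filter((b): b is ToolUseBlock => b.type === "tool_use");
}

const pendingCalls = pendingToolCalls(sampleContent);
console.log(pendingCalls.map((c) => `${c.id} -> ${c.name}`));
```

Neither call reads the other's output, which is what makes them safe to run at the same time.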

Define two independent tools

ts

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY

const tools: Anthropic.Tool[] = [
  {
    name: "get_weather",
    description:
      "Get the current weather for a city. Returns a short text summary. Use when the user asks about weather or temperature.",
    input_schema: {
      type: "object",
      properties: { city: { type: "string", description: "City name, e.g. Lisbon" } },
      required: ["city"],
    },
  },
  {
    name: "get_time",
    description:
      "Get the current local time for a city as an ISO 8601 string. Use when the user asks what time it is somewhere.",
    input_schema: {
      type: "object",
      properties: { city: { type: "string", description: "City name, e.g. Lisbon" } },
      required: ["city"],
    },
  },
];

Now the implementations. The sleep calls stand in for real API round trips so the timing difference is visible.

ts

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

async function getWeather(city: string): Promise<string> {
  await sleep(800); // pretend this is a real weather API
  return `${city}: 24C, clear skies`;
}

async function getTime(city: string): Promise<string> {
  await sleep(800); // pretend this is a real time API
  return `${city}: 2026-07-02T14:05:00Z`;
}

// Dispatch a tool call by name; an unknown name throws so the caller can report it.
async function runTool(name: string, input: any): Promise<string> {
  switch (name) {
    case "get_weather":
      return getWeather(input.city);
    case "get_time":
      return getTime(input.city);
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
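Because runTool accepts input as any, a malformed input would only surface as an undefined city inside the tool. One possible guard, sketched here with a hypothetical parseCityInput helper that is not part of the SDK, narrows the input before use and throws a readable error otherwise.

```typescript
// Hypothetical helper: narrow an unknown tool input to the { city: string }
// shape that both tools in this tutorial expect.
function parseCityInput(input: unknown): { city: string } {
  if (typeof input === "object" && input !== null && "city" in input) {
    const city = (input as { city: unknown }).city;
    if (typeof city === "string" && city.trim() !== "") {
      return { city };
    }
  }
  throw new Error("Invalid input: expected { city: string }");
}

console.log(parseCityInput({ city: "Lisbon" }).city);
```

If you adopt something like this, calling it at the top of each case in runTool keeps the thrown error on the same path as the Unknown tool error.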

The agent loop, naive sequential version

Here is the part most first drafts get wrong. It works, but it runs the tool calls back to back.

ts

// NAIVE: one tool at a time
const toolResults: Anthropic.ToolResultBlockParam[] = [];
for (const block of message.content) {
  if (block.type !== "tool_use") continue;
  const output = await runTool(block.name, block.input); // blocks the whole loop
  toolResults.push({
    type: "tool_result",
    tool_use_id: block.id,
    content: output,
  });
}

Two 800 ms tools take about 1.6 seconds here, because the second await cannot start until the first resolves.

Run the tool calls in parallel

Collect every tool_use block first, then fire them together with Promise.all. The array order is preserved, and each result carries its own tool_use_id, so nothing gets crossed.

ts

const toolUses = message.content.filter(
  (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
);

const toolResults = await Promise.all(
  toolUses.map(async (block) => {
    try {
      const output = await runTool(block.name, block.input);
      return {
        type: "tool_result" as const,
        tool_use_id: block.id,
        content: output,
      };
    } catch (err) {
      return {
        type: "tool_result" as const,
        tool_use_id: block.id,
        content: `Error: ${(err as Error).message}`,
        is_error: true,
      };
    }
  })
);

Both tools now overlap, so the round trip drops to about 0.8 seconds. The try / catch is not optional: if one tool throws, you still owe Claude a tool_result for that tool_use_id, flagged with is_error: true. Skip it and the next API call rejects because a tool call went unanswered.
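The error path is easy to check offline. The sketch below restates the same pattern with local types and a hypothetical flakyRunner whose get_time call always fails, so you can confirm that one failure still yields a result for every id. The names executeAll, flakyRunner, and the ids are assumptions for illustration only.

```typescript
type ToolUse = { id: string; name: string; input: unknown };
type ToolResult = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Runs every call concurrently. A thrown error becomes an is_error result
// instead of rejecting the whole Promise.all.
async function executeAll(
  toolUses: ToolUse[],
  run: (name: string, input: unknown) => Promise<string>
): Promise<ToolResult[]> {
  return Promise.all(
    toolUses.map(async (block): Promise<ToolResult> => {
      try {
        const output = await run(block.name, block.input);
        return { type: "tool_result", tool_use_id: block.id, content: output };
      } catch (err) {
        // Non-Error throws (strings, objects) are stringified rather than cast.
        const message = err instanceof Error ? err.message : String(err);
        return {
          type: "tool_result",
          tool_use_id: block.id,
          content: `Error: ${message}`,
          is_error: true,
        };
      }
    })
  );
}

// Hypothetical runner: get_time always fails, to exercise the error path.
async function flakyRunner(toolName: string): Promise<string> {
  if (toolName === "get_time") throw new Error("time service unavailable");
  return "Lisbon: 24C, clear skies";
}

const demoCalls: ToolUse[] = [
  { id: "toolu_01A", name: "get_weather", input: { city: "Lisbon" } },
  { id: "toolu_01B", name: "get_time", input: { city: "Lisbon" } },
];

executeAll(demoCalls, flakyRunner).then((results) => console.log(results));
```

Because each mapped promise catches its own error, Promise.all never rejects here, and the successful weather result is not lost when the time lookup fails.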

Put it in the full loop

ts

async function runAgent(userPrompt: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userPrompt },
  ];

  while (true) {
    const message = await client.messages.create({
      model: "claude-opus-4-8",
      max_tokens: 1024,
      tools,
      messages,
    });

    messages.push({ role: "assistant", content: message.content });

    if (message.stop_reason !== "tool_use") {
      const text = message.content.find((b) => b.type === "text");
      return text?.type === "text" ? text.text : "";
    }

    const toolUses = message.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
    );

    const toolResults = await Promise.all(
      toolUses.map(async (block) => {
        try {
          const output = await runTool(block.name, block.input);
          return { type: "tool_result" as const, tool_use_id: block.id, content: output };
        } catch (err) {
          return {
            type: "tool_result" as const,
            tool_use_id: block.id,
            content: `Error: ${(err as Error).message}`,
            is_error: true,
          };
        }
      })
    );

    messages.push({ role: "user", content: toolResults });
  }
}

console.log(await runAgent("What is the weather and local time in Lisbon?"));

All the tool results for one assistant turn go into a single user message. The API matches each tool_result to its tool_use by tool_use_id, so the order inside that array does not actually matter to Claude. Keeping array order just makes your logs easier to read.
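Since matching is by id, a cheap pre-flight check before the next request is to confirm that every tool_use id has an answer. This is a sketch with a hypothetical unansweredToolUseIds helper and made-up ids; it only compares ids and does not call the API.

```typescript
type ToolUseRef = { id: string; name: string };
type ToolResultRef = { tool_use_id: string };

// Returns the ids of tool_use blocks that have no matching tool_result yet.
function unansweredToolUseIds(
  toolUses: ToolUseRef[],
  results: ToolResultRef[]
): string[] {
  const answered = new Set(results.map((r) => r.tool_use_id));
  return toolUses.filter((t) => !answered.has(t.id)).map((t) => t.id);
}

const requestedCalls: ToolUseRef[] = [
  { id: "toolu_01A", name: "get_weather" },
  { id: "toolu_01B", name: "get_time" },
];

// Results in a different order than the calls: still fully answered.
const reorderedResults: ToolResultRef[] = [
  { tool_use_id: "toolu_01B" },
  { tool_use_id: "toolu_01A" },
];

console.log(unansweredToolUseIds(requestedCalls, reorderedResults)); // → []
console.log(unansweredToolUseIds(requestedCalls, [{ tool_use_id: "toolu_01A" }])); // → [ 'toolu_01B' ]
```

An empty array means the user message is safe to send; anything else names the call that still needs a result.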

When to turn parallel execution off

Parallelism is wrong when your tools have side effects that must happen in a fixed sequence, for example create_invoice then charge_card. In that case tell Claude to emit at most one tool call per turn:

ts

const message = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 1024,
  tools,
  tool_choice: { type: "auto", disable_parallel_tool_use: true },
  messages,
});

disable_parallel_tool_use lives inside tool_choice. It limits Claude to at most one tool_use block per turn, so dependent calls arrive in strict order across multiple turns instead of in one batch. Leave it off for read-only tools; turn it on for ordered writes.
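If an agent mixes read-only and write tools, one way to apply that rule is to derive tool_choice from the tools you are about to expose. The sketch below assumes a hand-maintained set of ordered-write tool names; toolChoiceFor and orderedWriteTools are hypothetical, and the AutoToolChoice type is a local mirror of the shape shown above rather than an SDK import.

```typescript
// Local mirror of the tool_choice shape used above; not imported from the SDK.
type AutoToolChoice = { type: "auto"; disable_parallel_tool_use?: boolean };

// Hypothetical metadata: tools whose side effects must happen in a fixed order.
const orderedWriteTools = new Set(["create_invoice", "charge_card"]);

// Disable parallel tool use only when an ordered-write tool is on offer.
function toolChoiceFor(toolNames: string[]): AutoToolChoice {
  const hasOrderedWrites = toolNames.some((n) => orderedWriteTools.has(n));
  return hasOrderedWrites
    ? { type: "auto", disable_parallel_tool_use: true }
    : { type: "auto" };
}

console.log(toolChoiceFor(["get_weather", "get_time"]));
console.log(toolChoiceFor(["create_invoice", "charge_card"]));
```

The returned object can be passed as tool_choice in the messages.create call.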

This client-side loop is not the same as Anthropic's newer programmatic tool calling, where Claude orchestrates tools from generated code in a sandbox. The pattern above is the plain loop you run inside your own process, and it is still the one most production agents use in 2026. OpenAI's function calling exposes the same parallel shape; see the OpenAI function-calling guide (2026) if you are porting between providers.

How much time does this actually save?

The savings scale with fan-out width, not tool count in general. If Claude asks for N independent tools that each take T milliseconds, sequential execution costs about N times T, while parallel execution costs about T (the slowest single call). For a five-tool fan-out at 800 ms each, that is about 4 seconds down to about 0.8 seconds. Token usage and the number of API round trips are identical either way; you are only overlapping your own I/O.
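The arithmetic can be written down directly. This sketch uses a hypothetical estimateLatency helper and assumes the tools are fully independent and I/O bound, so it ignores scheduling overhead and any rate limits on the services you call.

```typescript
// Sequential cost is the sum of the durations; parallel cost is the slowest call.
function estimateLatency(durationsMs: number[]): {
  sequentialMs: number;
  parallelMs: number;
} {
  const sequentialMs = durationsMs.reduce((sum, ms) => sum + ms, 0);
  const parallelMs = durationsMs.length === 0 ? 0 : Math.max(...durationsMs);
  return { sequentialMs, parallelMs };
}

// Five 800 ms tools: 4000 ms sequential, 800 ms parallel.
console.log(estimateLatency([800, 800, 800, 800, 800]));

// Uneven durations: 1300 ms sequential, 800 ms parallel (the slowest call).
console.log(estimateLatency([200, 800, 300]));
```

With uneven durations the parallel figure is set by the slowest tool, so speeding up the fast ones does not help.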

Where to run this in production

Drop runAgent into a Next.js route handler and deploy on Vercel, or on any Node host. If you would rather not hand-wire the surrounding app (auth, database, deploy), an AI app builder like Totalum generates an owned Next.js codebase you can paste this loop into; its FAQ states the code is 100% yours with no vendor lock-in (totalum.app, checked July 2, 2026). Both routes run the exact same loop; pick based on how much of the app shell you want to own versus write yourself.

Verify it works

Run the file. If the loop is wired correctly you will see a final answer that combines both tool outputs, something like:

text

The weather in Lisbon is 24C with clear skies, and the local time is 2026-07-02T14:05:00Z.

To prove the parallel path is actually overlapping, wrap the Promise.all in a timer:

ts

const t0 = performance.now();
const toolResults = await Promise.all(/* ...as above... */);
console.log(`tools finished in ${Math.round(performance.now() - t0)}ms`);

With two 800 ms tools you should see roughly 800ms, not 1600ms. If you see 1600, you are still awaiting inside a loop somewhere. The full runnable SDK is on GitHub: anthropics/anthropic-sdk-typescript.
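The same comparison can be run without an API key. The sketch below uses stub tools and a shorter 100 ms delay to keep it quick; fakeTool, runSequential, runConcurrent, and compareTimings are hypothetical names, and the exact numbers will vary by machine.

```typescript
const delay = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Stand-in tool: waits, then returns its label.
async function fakeTool(label: string, ms: number): Promise<string> {
  await delay(ms);
  return label;
}

// Awaits each call before starting the next.
async function runSequential(ms: number): Promise<number> {
  const start = Date.now();
  await fakeTool("weather", ms);
  await fakeTool("time", ms);
  return Date.now() - start;
}

// Starts both calls, then awaits them together.
async function runConcurrent(ms: number): Promise<number> {
  const start = Date.now();
  await Promise.all([fakeTool("weather", ms), fakeTool("time", ms)]);
  return Date.now() - start;
}

async function compareTimings(ms: number) {
  const sequentialMs = await runSequential(ms);
  const concurrentMs = await runConcurrent(ms);
  return { sequentialMs, concurrentMs };
}

compareTimings(100).then(({ sequentialMs, concurrentMs }) => {
  console.log(`sequential ~${sequentialMs}ms, concurrent ~${concurrentMs}ms`);
});
```

The concurrent figure should land near one delay and the sequential figure near two.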

If you want the streaming version of this loop, so tokens and tool calls surface as they happen, read streaming Claude tool calls in a TypeScript agent loop next.

Written by

Ren Okabe

Ren builds agent infrastructure and writes copy-paste tutorials for engineers shipping LLM tool-use systems.

Frequently asked questions

Does Claude call tools in parallel by default?

Yes. When tool_choice is auto or any and the tasks are independent, Claude may emit several tool_use blocks in one assistant turn. You decide whether to execute them concurrently in your own code.

How do I return results for parallel tool calls?

Put one tool_result block per tool_use into a single user message, each with its matching tool_use_id. Claude matches by id, so the order inside the array does not matter.

What is disable_parallel_tool_use?

A flag inside tool_choice that forces Claude to emit at most one tool_use block per turn. Use it when your tools have side effects that must run in a fixed order, such as create then charge.

Does parallel execution change token usage or cost?

No. The number of API round trips and tokens is identical. Running tool calls in parallel only cuts wall-clock time by overlapping your own tool I/O.

What happens if one parallel tool fails?

Return that tool_result with is_error set to true and an error message in content. Claude reads the error and can retry or route around it. Never skip a result, or the next API call will reject.

Is this the same as Anthropic programmatic tool calling?

No. Programmatic tool calling (2026) lets Claude orchestrate tools from generated code in a sandbox. The pattern here is the classic client-side agent loop you run inside your own process.

Related tutorials

Stream Claude tool calls in a TypeScript agent loop (June 2026)

A complete TypeScript tutorial for the streaming agent loop on Claude: input_json_delta accumulation, multi-turn dispatch, AbortController cancellation, and the eager_input_streaming workaround for the verified 5 second first-content delay on tool use. About $0.03 per call with claude-sonnet-4-6 at June 2026 pricing.

June 19, 2026 · 11 min read

Build your first AI agent from scratch in 30 minutes

An AI agent is just a loop: you call a model, the model asks to run a tool, you run it, you feed the result back, and you repeat until the model is done. In this tutorial you build that loop yourself in plain TypeScript against the Anthropic Messages API, no framework. You will wire up two tools (read a file, run a calculation), let the model orchestrate them, add a turn cap and basic guardrails, then verify the whole thing end to end. The result is a small research agent you fully understand and can extend with your own tools.

June 16, 2026 · 6 min read
