Parallel vs Sequential Tool Dispatch in Agentic Systems

When an AI agent calls multiple tools in one turn, the naive implementation runs them sequentially: call tool A, wait, call tool B, wait, call tool C, wait. This is safe. It is also unnecessarily slow.

In a typical investigation turn, the model might emit five tool calls: read three files and run two searches. Each of those is independent. None of them mutates state. Running them sequentially costs the sum of all five latencies. Running them concurrently costs roughly the latency of the slowest call.
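
To make the difference concrete, here is a minimal sketch of the latency arithmetic. The latencies are made-up illustrative values, and the model deliberately ignores scheduling overhead, rate limits, and any cap on concurrent requests.

```typescript
// Hypothetical per-call latencies in milliseconds for one investigation turn:
// three file reads and two searches. These numbers are illustrative only.
const latenciesMs: number[] = [120, 90, 150, 400, 350];

// Sequential dispatch waits for each call in turn, so the costs add up.
function sequentialMs(latencies: number[]): number {
  return latencies.reduce((total, ms) => total + ms, 0);
}

// Concurrent dispatch starts every call at once, so the turn takes about as
// long as the slowest call (overhead and rate limits are ignored here).
function concurrentMs(latencies: number[]): number {
  return latencies.length === 0 ? 0 : Math.max(...latencies);
}

console.log(sequentialMs(latenciesMs)); // 1110
console.log(concurrentMs(latenciesMs)); // 400
```

The gap widens as the number of independent calls grows, because the sequential cost keeps adding while the concurrent cost stays pinned to the slowest call.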

The Two Classes of Tool

Every tool falls into one of two categories:

Parallel-safe: read-only operations with no side effects. File reads, search queries, database lookups, API fetches that don’t mutate anything. Two of these can run simultaneously without any risk of interference.

Sequential: mutating operations where order matters. File edits, git commits, API calls that create or modify records, shell commands that write to disk. These must run in the exact order the model emitted them — because each may depend on the state left by the previous one.

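The classification is a property of the tool itself, not the context it’s called in. A file read is always parallel-safe. A file write is always sequential. It follows that a tool whose effect depends on its arguments, such as a general-purpose shell command, should be classified by its worst case and treated as sequential.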

The Dispatch Algorithm

Given a list of tool calls the model emitted in a single turn:

Walk the list left to right.
If the current tool is parallel-safe, collect it and the consecutive parallel-safe tools that follow into a batch, execute the batch with Promise.all, and advance past it.
If the current tool is sequential, run it alone and advance.

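// Assumes runOne(block, index) catches its own errors and resolves with a
// tool result; otherwise Promise.all rejects as soon as one call fails.
let cursor = 0;
while (cursor < toolUseBlocks.length) {
  if (tools.isParallelSafe(toolUseBlocks[cursor].name)) {
    // Extend the batch across every consecutive parallel-safe call.
    const batchStart = cursor;
    while (
      cursor < toolUseBlocks.length &&
      tools.isParallelSafe(toolUseBlocks[cursor].name)
    ) {
      cursor++;
    }
    const batch = toolUseBlocks.slice(batchStart, cursor);
    // Run the batch concurrently; each call keeps its original index.
    await Promise.all(batch.map((b, i) => runOne(b, batchStart + i)));
  } else {
    // Sequential tool: run it alone, in emission order.
    await runOne(toolUseBlocks[cursor], cursor);
    cursor++;
  }
}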

This preserves the “don’t reorder writes” contract while letting reads happen concurrently. A mixed turn (read, read, write, read, read) runs the first two reads in parallel, then the write on its own, then the last two reads in parallel.
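
The grouping rule can be checked separately from execution. The sketch below uses a hypothetical helper, planBatches, that only computes the batches and runs nothing. The ToolCall shape and the tool names in parallelSafeNames are assumptions made for illustration.

```typescript
// Hypothetical shape of a tool call; only the name matters for planning.
interface ToolCall {
  name: string;
}

// Assumed classification for this sketch: these tool names are read-only.
const parallelSafeNames = new Set<string>(["read_file", "search"]);

// Group calls into batches without reordering them. Consecutive
// parallel-safe calls share a batch; every other call gets its own.
function planBatches(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let current: ToolCall[] = [];
  for (const call of calls) {
    if (parallelSafeNames.has(call.name)) {
      current.push(call);
      continue;
    }
    // A sequential call closes any open batch, then runs alone.
    if (current.length > 0) {
      batches.push(current);
      current = [];
    }
    batches.push([call]);
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

const turn: ToolCall[] = [
  { name: "read_file" },
  { name: "read_file" },
  { name: "edit_file" },
  { name: "search" },
  { name: "search" },
];

// Batch sizes for the mixed turn: two reads, one write, two reads.
console.log(planBatches(turn).map((batch) => batch.length)); // [ 2, 1, 2 ]
```

Keeping the planning step pure makes the ordering contract easy to unit test: flattening the batches should always give back the original list.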

Why Emission Order Matters for Sequential Tools

The model emits tool calls in a deliberate order. If it emits edit_file(a.php) followed by edit_file(b.php), it intends a.php to be modified first. The edit to b.php may depend on the state introduced by the edit to a.php.

Reordering writes to gain parallelism is not a safe optimization. It changes the semantics of the operation. Even if the two edits happen to be independent in a specific case, the dispatch layer can’t know that — only the model does, and it expressed its intent through ordering.

The rule is simple: sequential tools run in the order they were emitted, no exceptions.
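
A toy example shows what goes wrong when writes are reordered. Everything here is hypothetical: the in-memory Files map stands in for a file system, and editB is written so that it reads what editA produced.

```typescript
// A toy in-memory "file system" and two hypothetical edits.
type Files = Map<string, string>;
type Edit = (files: Files) => void;

// First edit: bump the version constant in a.php.
const editA: Edit = (files) => {
  files.set("a.php", "const VERSION = 2;");
};

// Second edit: copy the current contents of a.php into b.php,
// so its result depends on whether editA has already run.
const editB: Edit = (files) => {
  files.set("b.php", files.get("a.php") ?? "");
};

// Apply edits strictly in the order given, starting from the same state.
function applyInOrder(edits: Edit[]): Files {
  const files: Files = new Map([["a.php", "const VERSION = 1;"]]);
  for (const edit of edits) edit(files);
  return files;
}

// Emission order: b.php picks up the new version.
console.log(applyInOrder([editA, editB]).get("b.php")); // const VERSION = 2;

// Reordered: b.php captures the stale version.
console.log(applyInOrder([editB, editA]).get("b.php")); // const VERSION = 1;
```

Running the two edits concurrently would make the outcome depend on timing, which is worse than either fixed order.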

The Real-World Impact

In investigation-heavy phases, where the model reads multiple files and runs multiple searches per turn, parallel dispatch cuts wall-clock time by 40–60% on typical turns. Fully parallel-safe turns can do better: a five-tool turn that takes 8 seconds sequentially takes 2–3 seconds with batching, a reduction of roughly 60–75%.

Over a 25-turn agent run where 15 turns involve parallel-safe batches of 3–4 tools, these per-turn savings add up to minutes of wall-clock time per run.

The implementation cost is low. Classifying tools at registration time with a single parallelSafe: boolean field is the only change to the tool interface. The only other change is the dispatch loop described above.
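
A registry carrying that field might look like the following sketch. ToolDefinition, register, and the run handler are assumed names chosen for illustration; the text implies only the parallelSafe field and an isParallelSafe lookup.

```typescript
// Hypothetical tool definition: beyond the name and handler, the only
// field the dispatcher needs is the parallelSafe flag.
interface ToolDefinition {
  name: string;
  parallelSafe: boolean;
  run: (input: unknown) => Promise<unknown>;
}

class ToolRegistry {
  private readonly tools = new Map<string, ToolDefinition>();

  // Classification happens once, when the tool is registered.
  register(tool: ToolDefinition): void {
    this.tools.set(tool.name, tool);
  }

  // Unknown tools are treated as sequential, the conservative default
  // (an assumption of this sketch).
  isParallelSafe(name: string): boolean {
    return this.tools.get(name)?.parallelSafe ?? false;
  }
}

const tools = new ToolRegistry();
tools.register({ name: "read_file", parallelSafe: true, run: async () => "contents" });
tools.register({ name: "edit_file", parallelSafe: false, run: async () => "ok" });

console.log(tools.isParallelSafe("read_file")); // true
console.log(tools.isParallelSafe("edit_file")); // false
console.log(tools.isParallelSafe("unknown_tool")); // false
```

Defaulting unregistered names to sequential means a missing classification can cost speed but cannot cause two writes to overlap.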

Applying This to Subagents

Subagents (specialist agents spawned by the primary loop for specific tasks) benefit from the same dispatch pattern. An explorer subagent running ten codebase searches per turn should run them in parallel. A code-writer subagent making edits should run them sequentially.

The parallel-safe classification is defined once per tool and shared across both the primary agent loop and all subagent loops. There’s no duplication — the same ToolRegistry instance carries the same classifications everywhere it’s used.

The discipline here is simple: know which of your tools are reads and which are writes. Run reads together. Run writes in order.