Calls

Calls are individual LLM API invocations. The SDK automatically captures the full context — messages, response, tokens, cost, latency, and status — so you never have to log any of it manually.

How call tracking works

Call tracking is a two-step process:

1. Intercept

When you wrap a client with warp(), the SDK intercepts every API call. It records the request metadata (model, messages, tools) and the response (content, tokens, latency, status) in memory — but doesn't send anything yet.

2. Link

When you call call(target, response), the SDK links the intercepted call to a run or group and queues it for transmission. Only calls that are explicitly linked are ever sent to the API.

This lazy linking design means you can use your wrapped client anywhere — in utility functions, libraries, tests — and only the calls you explicitly link will be tracked. No noise.
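
For example, a shared helper can use the wrapped client freely, and only the response you pass to call() is ever transmitted. This is a minimal sketch; the classify() helper and its prompt are illustrative, not part of the SDK.

Lazy linking in practice
import OpenAI from 'openai';
import { warp, run, call, flush } from '@warpmetrics/warp';

const openai = warp(new OpenAI());

// Any helper can use the wrapped client like a normal OpenAI client
async function classify(text) {
  return openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: `Classify: ${text}` }],
  });
}

const r = run('Classifier');

const tracked = await classify('First document');  // intercepted, held in memory
await classify('Scratch experiment');              // intercepted, never linked, never sent

call(r, tracked); // only this response is queued for transmission
await flush();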

API

call(target: Run | Group, response: Response, opts?: object): void

Links an intercepted LLM response to a run or group. Queues the call for transmission.

Parameters
target — The run or group to link this call to.
response — The response object returned by the wrapped LLM client. For streams, pass the stream object after consuming it.
opts — Optional metadata to attach to this call.
Returns

Nothing. This is a fire-and-forget operation. The call is queued for batch transmission.
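
The shape of opts isn't specified above, so the keys in this sketch are hypothetical; it only illustrates attaching metadata while linking.

Linking with metadata
// Hypothetical keys; opts is documented only as optional metadata to attach to the call
call(g, res, { step: 'summarize', attempt: 1 });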

Basic example

Tracking a call
import OpenAI from 'openai';
import { warp, run, group, call, flush } from '@warpmetrics/warp';

const openai = warp(new OpenAI());
const r = run('Summarizer');
const g = group(r, 'Analysis');

// Make the LLM call as usual
const res = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Summarize this...' }],
});

// Link it to the group
call(g, res);

// The response works exactly as before
console.log(res.choices[0].message.content);

await flush();

What gets captured

The SDK automatically extracts all of this from the LLM response:

provider — openai or anthropic
model — The model used (e.g., gpt-4o, claude-sonnet-4-5-20250514)
messages — The input messages sent to the LLM
response — The text content of the LLM's response
tools — Tool/function names if tools were provided
toolCalls — Tool calls made by the LLM (id, name, arguments)
tokens.prompt — Input token count
tokens.completion — Output token count
tokens.total — Total tokens used
tokens.cachedInput — Cached input tokens (OpenAI)
tokens.cacheWrite — Cache write tokens (Anthropic)
tokens.cacheRead — Cache read tokens (Anthropic)
cost — Computed cost in USD (server-side)
latency — Wall-clock time in milliseconds
status — success or error
error — Error message if the call failed
timestamp — ISO 8601 timestamp
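
As a rough TypeScript sketch of that shape (illustrative only; this is not a type exported by the SDK, the field names simply mirror the list above):

Captured call shape (illustrative)
// Not part of the public API; a sketch that mirrors the fields listed above
interface CapturedCall {
  provider: 'openai' | 'anthropic';
  model: string;                // e.g. 'gpt-4o'
  messages: unknown[];          // input messages sent to the LLM
  response: string;             // text content of the LLM's response
  tools?: string[];             // tool/function names, if provided
  toolCalls?: { id: string; name: string; arguments: string }[];
  tokens: {
    prompt: number;
    completion: number;
    total: number;
    cachedInput?: number;       // OpenAI
    cacheWrite?: number;        // Anthropic
    cacheRead?: number;         // Anthropic
  };
  cost: number;                 // USD, computed server-side
  latency: number;              // wall-clock milliseconds
  status: 'success' | 'error';
  error?: string;               // present if the call failed
  timestamp: string;            // ISO 8601
}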

Linking to runs vs groups

You can link a call to either a run or a group. Both work the same way.

Linking to runs vs groups
// Link directly to a run (no groups needed)
const r = run('Simple agent');
const res = await openai.chat.completions.create({...});
call(r, res);

// Or link to a group within a run
const r2 = run('Complex agent');
const g = group(r2, 'Analysis');
const res2 = await openai.chat.completions.create({...});
call(g, res2);

Streaming

Streaming works automatically. The SDK wraps the async iterator to collect content chunks and capture usage data as the stream completes. Call call() after consuming the stream.

Streaming support
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
});

// Consume the stream as usual
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

// Link after the stream is consumed
call(g, stream);

Token counts and latency are captured when the stream finishes. Cost is calculated server-side based on the model's pricing.

Error tracking

When an LLM call throws an error, the SDK captures it and attaches tracking data to the error object. You can still link failed calls.

Tracking failed calls
try {
  const res = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello' }],
  });
  call(g, res);
} catch (err) {
  // call() automatically extracts tracking data from the error
  call(g, err);

  // Record an outcome for the failure
  outcome(g, 'Error', { message: err.message });
}

Failed calls appear in the dashboard with status: 'error' and include the error message, latency, and any partial data available.

Outcomes on calls

You can record outcomes on individual calls. Use the response object as the target.

Call-level outcomes
const res = await openai.chat.completions.create({...});
call(g, res);

// Record an outcome on the call itself
outcome(res, 'Helpful');
outcome(res, 'Hallucination Free');

Supported providers

OpenAI — chat.completions.create() and responses.create(). Supports streaming, tool calls, and cached input tokens.
Anthropic — messages.create(). Supports streaming and cache write/read tokens.
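
A minimal Anthropic sketch, assuming the Anthropic client is wrapped with warp() in the same way as the OpenAI client:

Tracking an Anthropic call
import Anthropic from '@anthropic-ai/sdk';
import { warp, run, call, flush } from '@warpmetrics/warp';

// Assumption: warp() wraps the Anthropic client exactly like the OpenAI client
const anthropic = warp(new Anthropic());
const r = run('Claude summarizer');

const res = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Summarize this...' }],
});

call(r, res);
await flush();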

Tips

- Each response can only be linked once. A second call() with the same response is silently ignored.
- Unlinked responses are never transmitted. If you forget to call(), the data stays in memory until garbage collected — no noise in your dashboard.
- The wrapped client is fully compatible. It returns the same types, supports the same options, and works with TypeScript autocomplete.
- call() is a fire-and-forget operation. It never throws and adds no latency to your agent.