Vercel AI SDK

stillrunning-vercel-ai-sdk wraps the Vercel AI SDK so every generateText, streamText, and generateObject call reports its duration, tokens, estimated cost, model, and tool-call count to StillRunning. One line of setup, no other code changes.

Install

terminal

npm install stillrunning-vercel-ai-sdk

30-second quickstart

1. Create a workflow at stillrunning.ai/app/new and copy its ping token.

2. Set it as an environment variable:

.env

STILLRUNNING_TOKEN=your_token_here

3. Swap your ai import for the StillRunning client. Everything else stays the same:

app.ts

import { stillrunning } from 'stillrunning-vercel-ai-sdk'
import { openai } from '@ai-sdk/openai'

const { generateText } = stillrunning() // reads STILLRUNNING_TOKEN

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Summarize the standup notes.',
})

That's it. Every call now appears in StillRunning with cost, tokens, and timing, and you get an alert the moment a run fails, stalls, or spikes in cost.

What gets captured

On each run the SDK sends a ping with:

Field	Source
`durationMs`	Wall-clock time of the call
`tokensIn / tokensOut`	result.totalUsage, aggregated across all steps
`costUsd`	Estimated from a built-in pricing table (overridable)
`model`	result.response.modelId
`toolCalls`	Total tool calls across every step
`traceId`	Groups one logical run (auto-generated, or set via withTrace)
`metadata`	{ finishReason, steps }
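
For reference, the fields above can be written as a TypeScript shape. This is a sketch inferred from the table, not the SDK's published type: the RunPing name, the optionality of costUsd, and the example values are all assumptions made for illustration.

```typescript
// Hypothetical shape inferred from the field table; not the SDK's exported type.
interface RunPing {
  durationMs: number // wall-clock time of the call
  tokensIn: number
  tokensOut: number
  costUsd?: number // assumed optional: omitted when the model has no known price
  model: string
  toolCalls: number // total across every step
  traceId: string
  metadata: { finishReason: string; steps: number }
}

// Illustrative values only.
const examplePing: RunPing = {
  durationMs: 1840,
  tokensIn: 1200,
  tokensOut: 350,
  costUsd: 0.0065,
  model: 'gpt-4o',
  toolCalls: 2,
  traceId: 'trace-123',
  metadata: { finishReason: 'stop', steps: 3 },
}
```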

A failed call sends a fail ping with the error message, then rethrows the original error unchanged. Monitoring never alters your control flow, and a ping that fails to send never throws into your code.
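
The error-handling contract described above can be sketched as a small wrapper. This is not the SDK's source: monitored, Ping, and sendPing are hypothetical names, and the real implementation may differ. It only illustrates the two guarantees: the original error is rethrown unchanged, and a failed ping is swallowed.

```typescript
// Hypothetical ping shape for this sketch only.
type Ping = { status: 'success' | 'fail'; durationMs: number; error?: string }

// Hypothetical helper: runs `call`, reports the outcome via `sendPing`,
// and never lets a reporting failure reach the caller.
async function monitored<T>(
  call: () => Promise<T>,
  sendPing: (ping: Ping) => Promise<void>,
): Promise<T> {
  const startedAt = Date.now()

  const safePing = async (ping: Ping): Promise<void> => {
    try {
      await sendPing(ping)
    } catch {
      // Delivery failed: swallow it so monitoring cannot break the caller.
    }
  }

  try {
    const result = await call()
    await safePing({ status: 'success', durationMs: Date.now() - startedAt })
    return result
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    await safePing({ status: 'fail', durationMs: Date.now() - startedAt, error: message })
    throw err // the same error object, unchanged
  }
}
```

With this shape, a caller's existing try/catch around the model call keeps working exactly as before, whether or not the ping is delivered.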

Streaming

streamText is handled too. The success ping fires when the stream finishes, and your own onFinish / onError callbacks are preserved:

stream.ts

import { stillrunning } from 'stillrunning-vercel-ai-sdk'
import { openai } from '@ai-sdk/openai'

const { streamText } = stillrunning()

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Write a haiku about uptime.',
  onFinish: ({ text }) => console.log('done:', text), // still called
})

for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

Grouping multi-step agent runs

By default each call is its own run with its own traceId. When an agent makes several model calls that are really one logical execution, wrap them in withTrace so they share a trace, and StillRunning stitches them into a single outcome chain:

agent.ts

import { stillrunning, withTrace } from 'stillrunning-vercel-ai-sdk'
import { openai } from '@ai-sdk/openai'

const sr = stillrunning()
const model = openai('gpt-4o')

await withTrace(async () => {
  await sr.generateText({ model, prompt: 'plan the task' })
  await sr.generateText({ model, prompt: 'execute step 1' })
  await sr.generateText({ model, prompt: 'execute step 2' })
}) // all three pings share one traceId

For nested agents, pass explicit ids as the second argument: withTrace(fn, { traceId, parentRunId }).
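
If you are curious how a trace id can follow a run across awaits without being passed around by hand, the sketch below shows the general technique with Node's AsyncLocalStorage, which the requirements list. It is not the SDK's implementation: traceStore, withTraceSketch, and currentTraceId are hypothetical names, and the fallback to a fresh id per call is an assumption based on the default behavior described above.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'
import { randomUUID } from 'node:crypto'

// Hypothetical store holding the id of the trace currently in scope.
const traceStore = new AsyncLocalStorage<{ traceId: string }>()

// Runs `fn` so that everything it awaits can see the same trace id.
function withTraceSketch<T>(fn: () => Promise<T>, traceId: string = randomUUID()): Promise<T> {
  return traceStore.run({ traceId }, fn)
}

// Inside a trace: the shared id. Outside one: a fresh id for each call.
function currentTraceId(): string {
  return traceStore.getStore()?.traceId ?? randomUUID()
}
```

Any code called inside withTraceSketch, however deeply nested, reads the same id from currentTraceId(), which is what lets several pings be grouped into one run.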

Cost estimation

Cost is estimated from token counts and a built-in pricing table covering current Claude, GPT, and Gemini models. It's intentionally approximate: its job is to power relative cost-anomaly detection (a 5x spike is a 5x spike regardless of the exact rate) and to give a ballpark spend figure. For exact accounting, override it:

cost.ts

import { stillrunning, registerModelPricing } from 'stillrunning-vercel-ai-sdk'

// Full control (myExactPricing stands in for your own pricing function):
const sr = stillrunning({
  computeCost: ({ model, inputTokens, outputTokens }) =>
    myExactPricing(model, inputTokens, outputTokens),
})

// Or extend / override the built-in table (USD per 1M tokens):
registerModelPricing([[/my-custom-model/, { input: 1.5, output: 6 }]])

For models that are not in the table, the SDK sends no cost rather than a wrong one.
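
To make the per-1M-token arithmetic concrete, here is a minimal sketch of a table lookup. It is not the SDK's built-in table or code: pricingTable and estimateCostUsd are hypothetical names, and the only rate used is the illustrative my-custom-model entry from the example above.

```typescript
type Pricing = { input: number; output: number } // USD per 1M tokens

// Illustrative entry only; not the SDK's built-in pricing.
const pricingTable: Array<[RegExp, Pricing]> = [
  [/my-custom-model/, { input: 1.5, output: 6 }],
]

// Returns the estimated cost in USD, or undefined when no pattern matches.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  const match = pricingTable.find(([pattern]) => pattern.test(model))
  if (!match) return undefined // unknown model: report no cost rather than guess
  const [, price] = match
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000
}

// 200,000 input tokens at $1.50/1M plus 50,000 output tokens at $6/1M:
// 0.30 + 0.30 = 0.60 USD
const estimated = estimateCostUsd('my-custom-model-v2', 200_000, 50_000)
```

Returning undefined for an unmatched model mirrors the behavior described above: a missing figure is easier to spot and fix than a silently wrong one.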

Configuration

options.ts

stillrunning({
  token,         // ping token; defaults to process.env.STILLRUNNING_TOKEN
  baseUrl,       // defaults to https://stillrunning.ai
  computeCost,   // (input) => number | undefined — override cost estimation
  awaitPing,     // default true; false = fire-and-forget (lowest latency)
  pingTimeoutMs, // default 3000
  onError,       // (err) => void — observe ping delivery failures
  fetch,         // custom fetch (testing / non-global-fetch runtimes)
})

By default the ping is awaited, so it is delivered before a serverless function can be suspended or shut down. It's a single small POST bounded by pingTimeoutMs, so a slow or unreachable StillRunning endpoint never hangs your agent.
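
The sketch below shows one common way to bound a POST with a timeout, using AbortController. It illustrates the idea behind pingTimeoutMs and the injectable fetch option and makes no claim about how the SDK does it internally: sendBoundedPing and its parameters are hypothetical, and the URL in the usage note is a placeholder.

```typescript
// Hypothetical helper: POSTs a JSON payload and gives up after `timeoutMs`.
// Resolves to true on a 2xx response and false on any failure; it never throws.
async function sendBoundedPing(
  url: string,
  payload: unknown,
  timeoutMs = 3000,
  fetchImpl: typeof fetch = fetch,
): Promise<boolean> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    const res = await fetchImpl(url, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(payload),
      signal: controller.signal, // aborts the request when the timer fires
    })
    return res.ok
  } catch {
    return false // timeout, network error, or abort
  } finally {
    clearTimeout(timer) // avoid keeping the process alive after a fast response
  }
}
```

A call such as sendBoundedPing('https://example.invalid/ping', { status: 'success' }, 3000) therefore waits at most about three seconds, whatever the remote endpoint does.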

Requirements

Node 18+ (or any runtime with fetch and AsyncLocalStorage), and the Vercel AI SDK (ai) v5 or later as a peer dependency.

You're set

Open your dashboard to watch runs land in real time, with cost, duration, and anomaly alerts.
