AIPricingLab · Guide · 6 min read

The reserve / commit / release pattern: atomic AI quota enforcement

Why naive AI usage metering breaks under concurrency, and the pattern that fixes it. Reserve / commit / release explained, with a full TypeScript example and the failure modes it prevents.

Last updated: 2026-05-10

Naive AI metering looks like this: check if the user has quota, call OpenAI, then count it. Three lines, all of them wrong under concurrency.

This guide explains exactly why that pattern breaks, what reserve / commit / release does instead, and the practical TypeScript implementation.

Step-by-step

1. See the bug in the naive pattern

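The naive flow has a window between "check quota" and "increment counter", and that window spans the entire AI call. Any request that arrives inside it sees the old counter and also passes the check. With one slot left, two concurrent requests both pass and both increment: the quota is overshot.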

// BROKEN: the check and the increment are separate steps
async function chat(userId: string) {
  const ok = await vevee.canUse(userId, "llm.call");   // check only, nothing is held
  if (!ok.allowed) throw new Error("limit");
  const res = await openai.chat.completions.create(/*…*/);
  await vevee.track(userId, "llm.call");          // ← increment; the race window runs from canUse to here
  return res;
}
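The overshoot can be reproduced without any network calls. The following is a minimal sketch, assuming an in-memory counter: NaiveMeter, fakeAiCall, naiveChat and simulateOvershoot are hypothetical stand-ins written for this illustration and are not part of the vevee SDK.

```typescript
// Hypothetical in-memory stand-in for a quota service; not the vevee SDK.
class NaiveMeter {
  used = 0;
  readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Check only: reads the counter but does not change it.
  async canUse(): Promise<boolean> {
    return this.used < this.limit;
  }

  // Increment only: runs after the AI call has finished.
  async track(): Promise<void> {
    this.used += 1;
  }
}

// Stand-in for a slow AI call; it yields to the event loop like real I/O.
const fakeAiCall = (): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve("ok"), 10));

async function naiveChat(meter: NaiveMeter): Promise<string> {
  if (!(await meter.canUse())) throw new Error("limit");
  const res = await fakeAiCall(); // every other request passes the check during this await
  await meter.track();
  return res;
}

// Fire five concurrent requests against a limit of one and report the counter.
async function simulateOvershoot(): Promise<number> {
  const meter = new NaiveMeter(1);
  await Promise.all(Array.from({ length: 5 }, () => naiveChat(meter)));
  return meter.used; // resolves to 5, although the limit is 1
}
```

All five requests read the counter before any of them has incremented it, so none is rejected.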

2. Reserve atomically before the AI call

vevee.reserve() does the check AND the increment in a single atomic operation. If two requests reserve simultaneously, exactly one succeeds (assuming the user only had one slot).

const r = await vevee.reserve(userId, "llm.call", 1);
if (!r.allowed) throw new Error("limit");
// the slot is now ours - held by reservationId for 60 seconds
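To see why a combined check-and-increment closes the window, here is a minimal in-memory sketch. AtomicMeter, ReserveResult and simulateReserve are hypothetical names for this illustration; the sketch does not describe how vevee.reserve is implemented. It relies on the assumption that the check and the increment run with no await between them, which is enough on a single-threaded event loop. A real service would need the same guarantee from its datastore.

```typescript
// Hypothetical in-memory model of an atomic reserve; not the vevee SDK.
interface ReserveResult {
  allowed: boolean;
  reservationId?: string;
}

class AtomicMeter {
  used = 0;
  private nextId = 1;
  readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Check and increment happen in one synchronous step, with no await
  // between them, so no other request can interleave.
  async reserve(amount: number): Promise<ReserveResult> {
    if (this.used + amount > this.limit) return { allowed: false };
    this.used += amount;
    return { allowed: true, reservationId: `res_${this.nextId++}` };
  }
}

// Five concurrent reserves against a limit of one.
async function simulateReserve(): Promise<{ granted: number; used: number }> {
  const meter = new AtomicMeter(1);
  const results = await Promise.all(
    Array.from({ length: 5 }, () => meter.reserve(1)),
  );
  const granted = results.filter((r) => r.allowed).length;
  return { granted, used: meter.used }; // resolves to { granted: 1, used: 1 }
}
```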

3. Commit on success

After the AI call succeeds, vevee.commit confirms the reservation. The counter stays incremented; this is the "happy path".

const res = await openai.chat.completions.create(/*…*/);
await vevee.commit(r.reservationId!);
return res;

4. Release on failure

If the AI call throws - network error, content policy block, OpenAI 500 - vevee.release rolls back the reservation. The counter is decremented, so the user is not charged for a call that never produced a result.

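let res;
try {
  res = await openai.chat.completions.create(/*…*/);
} catch (err) {
  // the AI call failed: give the slot back, then rethrow
  await vevee.release(r.reservationId!);
  throw err;
}
// only reached when the AI call succeeded
await vevee.commit(r.reservationId!);
return res;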
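Steps 2 to 4 can be folded into one reusable helper. This is a sketch under stated assumptions: the QuotaClient interface is inferred from the calls shown in this guide and may not match the real SDK types, and withReservation and QuotaExceededError are hypothetical names. The helper releases only when the wrapped work fails, so a failed commit never discards a successful AI response.

```typescript
// Assumed client shape, inferred from the calls in this guide; the real SDK types may differ.
interface QuotaClient {
  reserve(
    userId: string,
    event: string,
    amount: number,
  ): Promise<{ allowed: boolean; reservationId?: string }>;
  commit(reservationId: string): Promise<void>;
  release(reservationId: string): Promise<void>;
}

// Hypothetical error type so callers can tell "over limit" from other failures.
class QuotaExceededError extends Error {}

async function withReservation<T>(
  client: QuotaClient,
  userId: string,
  event: string,
  work: () => Promise<T>,
): Promise<T> {
  const r = await client.reserve(userId, event, 1);
  if (!r.allowed || !r.reservationId) throw new QuotaExceededError("limit");

  let result: T;
  try {
    result = await work();
  } catch (err) {
    // The work failed: hand the slot back. A failed release is swallowed so
    // the caller sees the original error; the reservation expires on its own.
    await client.release(r.reservationId).catch(() => undefined);
    throw err;
  }
  // The work succeeded: confirm the reservation.
  await client.commit(r.reservationId);
  return result;
}
```

A call site would then look like withReservation(vevee, userId, "llm.call", () => openai.chat.completions.create(/*…*/)), assuming the vevee client satisfies the interface above.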

5. Trust the auto-release for crashes

If your server crashes between reserve and commit/release, the reservation auto-releases after 60 seconds. No orphan locks. No need for a watchdog.
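The expiry behaviour can be modelled in a few lines. This is a hypothetical in-memory sketch, not a description of vevee internals: ExpiringMeter and TTL_MS are illustrative names, and the clock is injected so the 60-second timeout can be exercised without waiting.

```typescript
// Hypothetical in-memory model of reservation expiry; not vevee internals.
const TTL_MS = 60 * 1000;

class ExpiringMeter {
  private held = new Map<string, { amount: number; expiresAt: number }>();
  private committed = 0;
  private nextId = 1;
  readonly limit: number;
  readonly now: () => number;

  constructor(limit: number, now: () => number) {
    this.limit = limit;
    this.now = now;
  }

  // Drop reservations whose TTL has passed, which returns their quota.
  private sweep(): void {
    const t = this.now();
    this.held.forEach((h, id) => {
      if (h.expiresAt <= t) this.held.delete(id);
    });
  }

  // Committed usage plus every reservation that is still live.
  used(): number {
    this.sweep();
    let total = this.committed;
    this.held.forEach((h) => { total += h.amount; });
    return total;
  }

  reserve(amount: number): string | null {
    if (this.used() + amount > this.limit) return null;
    const id = `res_${this.nextId++}`;
    this.held.set(id, { amount, expiresAt: this.now() + TTL_MS });
    return id;
  }

  // Returns false when the reservation has expired or is unknown.
  commit(id: string): boolean {
    this.sweep();
    const h = this.held.get(id);
    if (!h) return false;
    this.held.delete(id);
    this.committed += h.amount;
    return true;
  }
}
```

In this model a worker that reserves and then crashes simply never calls commit; once the injected clock passes the TTL, the next used() or reserve() call sees the slot as free again.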

Why 60 seconds

Long enough to outlast most AI calls (LLM streams, image rendering, short agent loops). Short enough that a quota leak from a crashed worker is bounded to one minute. If your AI calls genuinely take longer than 60 seconds, design the reservation as a check-out / check-in pair with periodic heartbeats. For most apps, 60 seconds is a sensible default.

When you can skip reserve and just track

Two cases: (1) you do not enforce hard limits and only track usage for invoicing; (2) your AI call is so fast (under roughly 50 ms) that the race window is small enough to tolerate. For everything else, use reserve.

Reserving more than 1

For token-based metering, reserve an upper bound (4000 tokens, say), then refund the unused portion via a separate track event after commit. This holds the right amount of quota during the call without over-charging the user for what they did not use.
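The arithmetic behind the refund is simple enough to isolate in a pure function. This is a sketch only: settleTokens and TokenSettlement are hypothetical names, and how the refund is submitted to the service is not shown here because the guide does not specify that call.

```typescript
interface TokenSettlement {
  refund: number;  // tokens to hand back after commit
  overage: number; // tokens used beyond the reserved upper bound
}

// Compare the reserved upper bound with the usage the provider reported.
function settleTokens(reserved: number, actual: number): TokenSettlement {
  if (reserved < 0 || actual < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return {
    refund: Math.max(0, reserved - actual),
    overage: Math.max(0, actual - reserved),
  };
}

// Reserved 4000 tokens, the call used 1250: 2750 go back to the user.
const settlement = settleTokens(4000, 1250); // { refund: 2750, overage: 0 }
```

A non-zero overage means the upper bound was too low for that call, which is a signal to raise the reserved amount.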

Composite events and reservations

A single reservation increments every limit group the event matches. If your image render hits both "total renders" and "premium renders", reserve once and both are atomically held. No need for two separate reservations.
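One way to picture "reserve once, hold every group" is an all-or-nothing update across counters. This is a hypothetical in-memory sketch; LimitGroup and reserveAcross are illustrative names and say nothing about how the service stores limit groups.

```typescript
// Hypothetical model: each limit group has its own counter and cap.
interface LimitGroup {
  used: number;
  limit: number;
}

// Increment every matched group, or none of them.
// Assumes the matched names are distinct.
function reserveAcross(
  groups: Map<string, LimitGroup>,
  matched: string[],
  amount: number,
): boolean {
  const targets: LimitGroup[] = [];
  // First pass: verify every group exists and has room.
  for (const name of matched) {
    const g = groups.get(name);
    if (!g || g.used + amount > g.limit) return false;
    targets.push(g);
  }
  // Second pass: every check passed, so apply all increments.
  for (const g of targets) g.used += amount;
  return true;
}
```

If "premium renders" is full, the call returns false and "total renders" is left untouched, so a rejected request never consumes quota in only one of the groups.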

Frequently asked questions

Is reserve / commit slower than canUse / track?

Not meaningfully. reserve is one HTTP call (in place of canUse) and commit is one HTTP call (in place of track), so the number of round trips is the same. For AI calls measured in seconds, any difference is irrelevant.

What if commit fails (e.g. network error after the AI call)?

The reservation auto-releases after 60 seconds, so the user gets their quota back even though they did use the AI call. It is a small leak in the user's favor. For high-precision accounting, retry commit with idempotency.
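A retry loop for commit might look like the sketch below. It assumes that committing the same reservationId more than once is safe on the server side, which is the idempotency the answer above relies on; verify that against the SDK before depending on it. commitWithRetry and sleep are hypothetical helpers, and the commit function is passed in so the sketch does not depend on any particular client.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retry a commit with exponential backoff.
// Assumes repeating commit for the same reservationId is harmless.
async function commitWithRetry(
  commit: (reservationId: string) => Promise<void>,
  reservationId: string,
  attempts = 3,
  baseDelayMs = 200,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      await commit(reservationId);
      return true;
    } catch (_err) {
      // Back off before the next attempt: 200 ms, 400 ms, ... by default.
      if (i < attempts - 1) await sleep(baseDelayMs * 2 ** i);
    }
  }
  return false; // the caller decides whether to log or queue for reconciliation
}
```

Keep the total retry time well inside the 60-second window; once the reservation has auto-released, a late commit has nothing left to confirm.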

Can I have nested reservations (one per agent step)?

Yes. Each vevee.reserve returns its own reservationId, and each reservation can be committed or released independently. Common pattern: reserve a budget for the agent run, then track each step (without reserving) within that budget.

Why not use a database transaction or a Redis lock?

You could. AIPricingLab essentially gives you that, plus the period semantics, dashboards, plan model, and webhook firing. If you only need a lock and have no other reason to use AIPricingLab, a Redis Lua script is fine.
