AI Gateway — Control AI costs with spend limits
Summary
AI Gateway now supports spend limits: dollar-based budgets that track cumulative cost (token usage × model pricing) and automatically block requests once a budget is exceeded. Unlike request rate limiting, which caps the number of requests, spend limits cap monetary cost.
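To make the cost arithmetic concrete, here is a minimal Python sketch of the idea, not AI Gateway's actual implementation. The `PRICING` table, the model name `example-model`, and the `Budget` class are hypothetical, and blocking only once spend is strictly above the limit is an assumption about the boundary case.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per one million tokens (not real provider pricing).
PRICING = {
    "example-model": {"input": 1.00, "output": 2.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: token usage multiplied by the model's per-token price."""
    price = PRICING[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


@dataclass
class Budget:
    """Cumulative dollar budget; hypothetical helper for illustration only."""
    limit_usd: float
    spent_usd: float = 0.0

    def allows_request(self) -> bool:
        # Assumption: a request is blocked only after spend exceeds the limit.
        return self.spent_usd <= self.limit_usd

    def record(self, cost_usd: float) -> None:
        # Add the cost of a completed request to the running total.
        self.spent_usd += cost_usd
```

In this sketch a request is checked with `allows_request()` before it is forwarded, and its cost is added with `record()` once token usage is known, so the request that pushes spend over the limit still completes and the next one is blocked.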
Key Points
- Tracks actual cost from token usage and known model pricing, and blocks requests once the budget is exceeded.
- Scope limits by model, provider, or custom metadata (e.g., per-user, per-team, per-model).
- Configurable time windows: fixed or sliding enforcement intervals.
- Examples: $200/day per user, $10,000/day gateway cap, or $50/day per user for a specific model.
- Works with both Unified Billing and BYOK (bring your own key) requests, for models with known pricing.
- Use cases: per-user budgets, global spend caps, per-model cost controls to prevent unexpected charges.
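The scoping and time-window options above can be modeled roughly as follows. This is a hedged sketch rather than the gateway's real configuration format or enforcement logic: the `SpendLimit` class, its parameters (`scope_keys`, `match`, `sliding`), and the attribute names `user` and `model` are all hypothetical. It assumes fixed windows are aligned to multiples of the window length and that timestamps arrive in non-decreasing order.

```python
from collections import defaultdict, deque

DAY = 86_400  # seconds


class SpendLimit:
    """Hypothetical in-memory spend limit, for illustration only."""

    def __init__(self, limit_usd, window_seconds, scope_keys=(), match=None, sliding=False):
        self.limit_usd = limit_usd
        self.window_seconds = window_seconds
        self.scope_keys = tuple(scope_keys)  # attributes that get separate budgets
        self.match = dict(match or {})       # limit applies only to matching requests
        self.sliding = sliding
        self._events = defaultdict(deque)    # scope -> deque of (timestamp, cost)

    def applies(self, attrs):
        return all(attrs.get(k) == v for k, v in self.match.items())

    def _scope(self, attrs):
        # An empty scope_keys tuple puts every request in one shared bucket.
        return tuple(attrs.get(k) for k in self.scope_keys)

    def _expired(self, ts, now):
        if self.sliding:
            # Sliding window: keep only spend from the last window_seconds.
            return ts <= now - self.window_seconds
        # Fixed window: spend resets at each multiple of window_seconds.
        return ts < now - (now % self.window_seconds)

    def spent(self, attrs, now):
        events = self._events[self._scope(attrs)]
        while events and self._expired(events[0][0], now):
            events.popleft()
        return sum(cost for _, cost in events)

    def allows(self, attrs, now):
        if not self.applies(attrs):
            return True
        return self.spent(attrs, now) <= self.limit_usd

    def record(self, attrs, cost_usd, now):
        if self.applies(attrs):
            self._events[self._scope(attrs)].append((now, cost_usd))


# The three examples from the list, expressed with the hypothetical class:
per_user = SpendLimit(200.0, DAY, scope_keys=["user"])
gateway_cap = SpendLimit(10_000.0, DAY)
per_user_model = SpendLimit(50.0, DAY, scope_keys=["user"], match={"model": "example-model"})
```

A request would be allowed only if every configured limit returns `True` from `allows()`. The fixed variant forgets all spend at the window boundary, while the sliding variant keeps counting spend until it is a full window old.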