Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
Cloudflare / Mar 19, 2026
- Kimi K2.5: 256k context & multi-turn tool calling
- Prefix caching + x-session-affinity cuts prefill cost and improves TTFT
- Pull-based async API for durable, high-volume inference