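Building the foundation for running extra-large language models
  Cloudflare / Apr 16, 2026
  Highlights: 3x speedup from prefill-decode (PD) disaggregation; cache hit rate raised from 60% to 80%; startup in under 20 seconds with Infire
  Tags: prefill-decode, kv-cache, speculative-decoding, infire, multi-gpu, session-affinity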
Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
  Cloudflare / Mar 19, 2026
  Highlights: Kimi K2.5 support; 256k context; enhanced prefix caching
  Tags: workers-ai, kimi-k2.5, prefix-caching, async-api, infire, agents-sdk, large-models