claudeenmodel: claude-sonnet-4-20250514
AI Search adds four new Workers AI models for text generation and embedding
Key Points
- Four new Workers AI models added to AI Search
- GLM-4.7-Flash supports 131K token context window
- No additional provider keys required
Summary
Cloudflare AI Search now supports four additional Workers AI models, expanding text generation and embedding capabilities with improved performance characteristics.
Key Points
Text Generation Models
- GLM-4.7-Flash: Lightweight model with 131,072 token context window for long-document tasks
- Qwen3-30B-A3B: Mixture-of-experts model activating only 3B parameters per pass for fast inference
Embedding Models
- Qwen3-Embedding-0.6B: 1,024 vector dimensions, supports up to 4,096 input tokens for longer text chunks
- EmbeddingGemma-300M: 768-dimension vectors optimized for low-latency embedding workloads
Integration
- All models run on Workers AI without requiring additional provider keys
- Available through dashboard and API when creating or updating AI Search instances