
Google Gemma 4 26B A4B Model Now Available on Cloudflare Workers AI

workers-ai gemma-4 mixture-of-experts llm vision-ai function-calling multilingual cloudflare

Key Points

Mixture-of-Experts model with 26B total parameters, of which only 4B are active per forward pass
256,000 token context window with built-in reasoning capabilities
Vision understanding and function calling with multilingual support

Summary

Cloudflare has partnered with Google to bring the Gemma 4 26B A4B model to Workers AI as @cf/google/gemma-4-26b-a4b-it. This Mixture-of-Experts (MoE) model has 26B total parameters but activates only 4B per forward pass, so it runs at close to the speed of a 4B-parameter model while delivering the quality of a much larger one.

Key Features

Mixture-of-Experts Architecture: 8 active experts out of 128 total (plus 1 shared expert), giving frontier-level performance at a fraction of the compute cost of a dense model
Extended Context Window: 256,000 token context for long conversations, tool definitions, and document processing
Built-in Reasoning: Thinking mode enables step-by-step reasoning for improved accuracy on complex tasks
Vision Capabilities: Object detection, document/PDF parsing, screen and UI understanding, chart comprehension, multilingual OCR, and handwriting recognition, with support for variable aspect ratios and resolutions
Function Calling: Native structured tool support for agentic workflows and multi-step planning
Multilingual Support: Out-of-the-box support for 35+ languages, pre-trained on 140+ languages
Code Generation: Comprehensive coding capabilities including generation, completion, and correction
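
To make the "8 active experts out of 128" figure concrete, here is a toy sketch of top-k expert routing. It is an illustration of the general technique only, not Gemma 4's actual router: the scores are made up, the softmax over the selected experts is an assumed gating choice, and the always-on shared expert is left out.

```typescript
// Toy top-k expert routing. Illustrative only: not Gemma 4's actual router.
const TOTAL_EXPERTS = 128; // figure from the announcement
const ACTIVE_EXPERTS = 8; // figure from the announcement

interface RoutedExpert {
  index: number; // which expert was selected
  weight: number; // normalized gate weight for that expert
}

// Pick the k highest-scoring experts and softmax-normalize their scores.
function routeTopK(scores: number[], k: number): RoutedExpert[] {
  if (k <= 0 || k > scores.length) {
    throw new RangeError("k must be between 1 and scores.length");
  }
  const top = scores
    .map((score, index) => ({ index, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
  // Subtract the largest score before exponentiating for numerical stability.
  const maxScore = top[0].score;
  const exps = top.map((e) => Math.exp(e.score - maxScore));
  const total = exps.reduce((sum, v) => sum + v, 0);
  return top.map((e, i) => ({ index: e.index, weight: exps[i] / total }));
}

// Hypothetical router scores for one token: expert i gets score i % 16.
const scores = Array.from({ length: TOTAL_EXPERTS }, (_, i) => i % 16);
const routed = routeTopK(scores, ACTIVE_EXPERTS);
console.log(`${routed.length} of ${TOTAL_EXPERTS} experts selected`);
```

Only the selected experts run for a given token, which is why the active parameter count (4B) is much smaller than the total (26B).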

Access Methods

Workers AI binding (env.AI.run())
REST API endpoints (/run or /v1/chat/completions)
OpenAI-compatible endpoint
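
As a minimal sketch of the binding route, the snippet below calls env.AI.run() with a chat-style messages array. The AiBinding and Env interfaces are simplified stand-ins written for this example (the real types ship with Cloudflare's Workers type definitions), the system prompt is arbitrary, and the result is returned untouched because the announcement does not describe the response shape.

```typescript
// Model ID as given in the Workers AI announcement.
const MODEL = "@cf/google/gemma-4-26b-a4b-it";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Simplified stand-in for the Workers AI binding type (assumption: the real
// binding type is richer than this).
interface AiBinding {
  run(model: string, inputs: { messages: ChatMessage[] }): Promise<unknown>;
}

interface Env {
  AI: AiBinding;
}

// Build the chat input for a single user prompt.
function buildMessages(prompt: string): ChatMessage[] {
  return [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: prompt },
  ];
}

// Run the model through the binding and return the raw result unchanged.
async function askGemma(env: Env, prompt: string): Promise<unknown> {
  return env.AI.run(MODEL, { messages: buildMessages(prompt) });
}
```

Inside a Worker's fetch handler this would be called as `await askGemma(env, prompt)`, with the AI binding configured for the Worker.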


Google Gemma 4 26B A4B now available on Workers AI

April 4, 2026 | Workers AI

In partnership with Google, we have brought @cf/google/gemma-4-26b-a4b-it to Workers AI. Gemma 4 26B A4B is a Mixture-of-Experts (MoE) model built from Gemini 3 research, with 26B total parameters and only 4B active per forward pass. By activating a small subset of its parameters during inference, the model runs almost as fast as a 4B-parameter model while delivering the quality of a much larger one.

Gemma 4 is Google's most capable family of open models, designed to maximize intelligence per parameter.

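Key Features

Mixture-of-Experts architecture: 8 of 128 experts are active (plus 1 shared expert), delivering frontier-level performance at a fraction of the compute cost of a dense model
256,000-token context window: Retains full conversation history, tool definitions, and long documents across extended sessions
Built-in thinking mode: The model reasons step by step before answering, improving accuracy on complex tasks
Vision understanding: Object detection, document and PDF parsing, screen and UI understanding, chart comprehension, OCR (including multilingual), and handwriting recognition, with support for variable aspect ratios and resolutions
Function calling: Native support for structured tool use, enabling agentic workflows and multi-step planning
Multilingual: Out-of-the-box support for 35+ languages, pre-trained on 140+ languages
Coding: Code generation, completion, and correction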
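
To illustrate what structured tool use can look like on the application side, the sketch below defines one tool in the widely used OpenAI-style "tools" format and dispatches a tool call to a local handler. Everything here is assumed for illustration: the get_weather tool, its stubbed data, and the ToolCall shape are hypothetical, and the exact request and response format for this model should be taken from the model page.

```typescript
// Tool definition in the OpenAI-style "tools" format (assumed to be accepted).
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, { type: string; description: string }>;
      required: string[];
    };
  };
}

// Shape of a tool call as it might come back from the model (assumption).
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Hypothetical tool: report a stubbed temperature for a city.
const getWeatherTool: ToolDefinition = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current temperature for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string", description: "City name" } },
      required: ["city"],
    },
  },
};

type ToolHandler = (args: Record<string, unknown>) => string;

// Local implementations, keyed by tool name.
const handlers: Record<string, ToolHandler> = {
  get_weather: (args) =>
    JSON.stringify({ city: String(args.city), temperatureC: 21 }), // stubbed data
};

// Route a tool call to its handler; unknown tools produce an error payload.
function dispatchToolCall(call: ToolCall): string {
  const handler = handlers[call.name];
  if (!handler) {
    return JSON.stringify({ error: `unknown tool: ${call.name}` });
  }
  return handler(call.arguments);
}

// Request body that advertises the tool alongside the user message.
const requestBody = {
  messages: [{ role: "user", content: "What is the weather in Tokyo?" }],
  tools: [getWeatherTool],
};
```

In an agentic loop, the string returned by dispatchToolCall would be sent back to the model as the tool result so it can continue planning.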

Usage

Gemma 4 26B A4B can be used in the following ways:

Workers AI binding (env.AI.run())
REST API (/run or /v1/chat/completions)
OpenAI-compatible endpoint

For details, see the Gemma 4 26B A4B model page.
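
For the REST routes, the following sketch builds requests for both the /run and /v1/chat/completions paths under the Cloudflare API base URL. The account ID and API token are placeholders supplied by the caller, the helper names are made up for this example, and the request bodies are a minimal assumed shape (a messages array, plus a model field for the OpenAI-compatible route).

```typescript
const API_BASE = "https://api.cloudflare.com/client/v4";
// Model ID as given in the announcement.
const GEMMA_MODEL = "@cf/google/gemma-4-26b-a4b-it";

interface RestRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Shared headers: bearer token auth and a JSON body.
function authHeaders(apiToken: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiToken}`,
    "Content-Type": "application/json",
  };
}

// Native route: the model ID is part of the URL path.
function buildRunRequest(accountId: string, apiToken: string, prompt: string): RestRequest {
  return {
    url: `${API_BASE}/accounts/${accountId}/ai/run/${GEMMA_MODEL}`,
    init: {
      method: "POST",
      headers: authHeaders(apiToken),
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    },
  };
}

// OpenAI-compatible route: the model ID goes in the request body.
function buildChatCompletionsRequest(accountId: string, apiToken: string, prompt: string): RestRequest {
  return {
    url: `${API_BASE}/accounts/${accountId}/ai/v1/chat/completions`,
    init: {
      method: "POST",
      headers: authHeaders(apiToken),
      body: JSON.stringify({
        model: GEMMA_MODEL,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

Either request can then be sent with `fetch(req.url, req.init)`. Because the second route follows the OpenAI chat completions convention, existing OpenAI-style clients can usually be pointed at it by changing the base URL and API key.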
