Gemini 3.5: frontier intelligence with action
Key Points
- Frontier agentic and coding performance
- Approximately 4× faster output throughput
- Available now across app, API, Antigravity and enterprise
Summary
Gemini 3.5 (launching with 3.5 Flash) combines frontier reasoning, coding and multimodal understanding with high-throughput execution for agentic workflows. 3.5 Flash delivers flagship-level accuracy on benchmarks while providing up to 4× higher output throughput, making it suited for long-horizon, multi-step tasks at lower latency and cost. It is available today across the Gemini app, AI Mode in Search, Google Antigravity, the Gemini API (AI Studio, Android Studio) and enterprise platforms; 3.5 Pro is rolling out next month.
Key Points
- Performance: matches or exceeds prior flagship results on agentic and coding benchmarks (Terminal-Bench 2.1: 76.2%; GDPval-AA: 1656 Elo; MCP Atlas: 83.6%; CharXiv Reasoning: 84.2%) while delivering ~4× token/sec throughput.
- Agent orchestration: optimized for long‑horizon workflows via the Antigravity harness and subagents to plan, build, iterate and execute multi-step tasks (examples: codebase migration to Next.js, game development, automated asset categorization).
- Multimodal outputs: generates richer interactive UIs, animations and hardware mockups; speeds up UX and branding iterations.
- Real-world adoption: deployed with partners (Shopify, Macquarie Bank, Salesforce, Ramp, Xero, Databricks) for forecasting, document reasoning, OCR, onboarding and data diagnostics.
- Personal agents & Search: powers Gemini Spark (personal 24/7 agent) and enhanced AI Mode experiences in Search.
- Safety: developed under the Frontier Safety Framework with strengthened cyber and CBRN safeguards, safety training, mitigations and interpretability tools for inner reasoning checks.
- Availability: 3.5 Flash is generally available now; 3.5 Pro is in internal use and planned public rollout next month.
Practical guidance for engineers
- Use cases: choose 3.5 Flash for latency-sensitive agentic workflows, heavy multimodal generation, and coding tasks where throughput reduces wall time.
- Integration: access via Gemini API (AI Studio/Android Studio) for direct model calls or use Antigravity for orchestrating subagents; use Gemini Enterprise for enterprise deployments and governance.
- Cost & benchmarking: expect significant time/cost reductions for long workflows (often <50% cost vs other frontier models); validate with representative end-to-end benchmarks and monitor orchestration overhead.
- Safety operations: enable provided interpretability and safety controls when enabling tool-calling or continuous agents; apply enterprise governance and review risk profiles before production rollout.