Warp’s big bet on building open source with GPT-5.5
Key Points
- GPT-5.5 uses ~30% fewer tokens per agentic coding task than GPT-5.4 in Warp's internal benchmarks
- Agents co-create ~90% of Warp's internal PRs
- Oz orchestrates agents with persistent memory and live monitoring
Summary
Warp has open-sourced its terminal client and launched a model-driven approach called Open Agentic Development that uses GPT-5.5 to orchestrate persistent agents across local and cloud environments. Warp’s Oz control plane runs and coordinates agents, handles memory and context compaction, and routes work to appropriate model configurations. In internal benchmarks, GPT-5.5 used ~30% fewer tokens per agentic coding task than GPT-5.4, and agents now co-create ~90% of Warp’s internal pull requests.
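To make the ~30% figure concrete, here is a small arithmetic sketch of what a token reduction of that size means for a fixed token budget. The baseline of 200,000 tokens per task and the 10,000,000-token budget are hypothetical numbers chosen for illustration; they are not Warp's figures.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float = 0.30) -> int:
    """Return tokens used after a fractional reduction (0.30 means 30% fewer)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


def tasks_per_budget(budget_tokens: int, tokens_per_task: int) -> int:
    """Return how many whole tasks fit in a fixed token budget."""
    return budget_tokens // tokens_per_task


# Hypothetical inputs, for illustration only.
baseline = 200_000
budget = 10_000_000

reduced = tokens_after_reduction(baseline)
print(tasks_per_budget(budget, baseline))  # 50
print(tasks_per_budget(budget, reduced))   # 71
```

Under these assumptions, a 30% cut per task lets the same budget cover roughly 1.4 times as many tasks, which is why per-task efficiency matters for long-running workflows.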
Key Points
- Platform: Oz is a control plane for launching, monitoring, and coordinating agents across local and cloud hosts (web UI, model/hosting selection, live sessions, recurring workflows).
- Model role: GPT-5.5 is part of Warp’s model mix for demanding coding, planning, and review tasks; models also act as judges in evaluation pipelines.
- Efficiency: GPT-5.5 reduced token usage by ~30% vs GPT-5.4 on Warp’s agentic coding tasks, improving cost and scale for long-running workflows.
- Operational needs: persistent memory, context compaction, subagents (e.g., code search, file analysis), observability, reproducible environments, permissions, and human review are critical for scaling agents.
- Workflow pattern: humans define intent and product judgment; agents plan, write, test, and open PRs for human review. Open Agentic Development focuses community contributions on supervision and vision rather than direct implementation.
- Business signal: Warp reports strong growth tied to enterprise demand for scalable agent orchestration.
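The context compaction mentioned under operational needs can be sketched as follows. This is a minimal illustration of one common approach (summarize older messages, keep recent ones verbatim), not a description of how Oz implements it. The `Message` type, the word-count token estimate, and the `summarize` callback are all assumptions introduced for the example; in practice the summary would come from a model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def compact(
    history: List[Message],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """Replace older messages with one summary when history exceeds the budget."""
    total = sum(estimate_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return list(history)  # Nothing to compact.
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    summary = Message(role="system", text=summarize(older))
    return [summary] + recent


# Hypothetical usage with a stub summarizer.
history = [Message("user", "word " * 10) for _ in range(5)]
compacted = compact(
    history,
    budget=20,
    keep_recent=2,
    summarize=lambda msgs: f"Summary of {len(msgs)} earlier messages",
)
print(len(compacted))     # 3
print(compacted[0].text)  # Summary of 3 earlier messages
```

Keeping the most recent messages intact preserves the agent's immediate working state, while the summary bounds how much older history costs on every subsequent call.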
Practical takeaways for engineers
- Route complex, long-horizon tasks to stronger model configurations and use lighter models for routine tasks to save tokens.
- Build observable control planes that surface live sessions, execution state, and artifacts for human-in-the-loop review.
- Invest in memory management (context compaction, persistent state) and reproducible environments to keep agent outputs consistent over time.
- Use automated evaluation (LLM-as-judge) and explicit permission/coordination layers when agents can open PRs or modify repositories.
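The first takeaway, routing by task difficulty, might look like the sketch below. The `Task` fields, the configuration names, and the step threshold are hypothetical; the source does not describe Warp's routing criteria, so treat this as one possible heuristic rather than Oz's behavior.

```python
from dataclasses import dataclass

# Hypothetical configuration labels, not real model identifiers.
STRONG_CONFIG = "strong-config"
LIGHT_CONFIG = "light-config"


@dataclass(frozen=True)
class Task:
    description: str
    estimated_steps: int
    needs_planning: bool = False


def route(task: Task, step_threshold: int = 10) -> str:
    """Send long-horizon or planning-heavy tasks to the stronger configuration."""
    if task.needs_planning or task.estimated_steps >= step_threshold:
        return STRONG_CONFIG
    return LIGHT_CONFIG


print(route(Task("rename a variable", estimated_steps=2)))          # light-config
print(route(Task("migrate the build system", estimated_steps=40)))  # strong-config
print(route(Task("design a new API", 3, needs_planning=True)))      # strong-config
```

A real router would likely use richer signals (repository size, past failure rates, cost ceilings), but even a simple threshold makes the routing decision explicit and testable.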
Where this matters
- Teams building agent orchestration, CI for model-generated code, and platforms that mix local/cloud execution should prioritize memory, observability, and evaluation pipelines to scale safely.
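As an illustration of combining a permission layer with automated evaluation, the sketch below gates an agent's proposed change before it is opened as a PR for human review. Everything here is assumed for the example: the `Proposal` type, the permissions mapping, the `judge` callback (a stand-in for an LLM-as-judge call returning a score between 0 and 1), and the 0.8 threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple


@dataclass(frozen=True)
class Proposal:
    agent_id: str
    repo: str
    diff: str


def may_open_pr(
    proposal: Proposal,
    permissions: Dict[str, Set[str]],
    judge: Callable[[str], float],
    min_score: float = 0.8,
) -> Tuple[bool, str]:
    """Check permissions first, then the judge score; return (allowed, reason)."""
    if proposal.repo not in permissions.get(proposal.agent_id, set()):
        return False, "agent lacks permission for this repository"
    score = judge(proposal.diff)
    if score < min_score:
        return False, f"judge score {score:.2f} below threshold {min_score:.2f}"
    return True, "ready for human review"


# Hypothetical usage with a stub judge that always returns the same score.
permissions = {"agent-1": {"example/repo"}}
proposal = Proposal("agent-1", "example/repo", "diff --git a/x b/x")
print(may_open_pr(proposal, permissions, judge=lambda diff: 0.9))
# (True, 'ready for human review')
print(may_open_pr(proposal, permissions, judge=lambda diff: 0.5))
# (False, 'judge score 0.50 below threshold 0.80')
```

Checking permissions before calling the judge avoids spending model tokens on changes the agent was never allowed to make, and passing the gate only queues the change for human review rather than merging it.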