Introducing GPT-5.4 — Frontier model for professional work
Key Points
- up to 1M tokens of context
- native computer-use capabilities
- individual claims 33% less likely to be false than with GPT‑5.2
Summary
GPT‑5.4 (available as GPT‑5.4 Thinking and GPT‑5.4 Pro in ChatGPT, and in the API and Codex) is a frontier model focused on professional knowledge work, coding, and agentic computer use. It combines improved general reasoning, native computer-use capabilities, extended context, and more token-efficient problem solving to deliver higher-quality outputs faster and with fewer iterations.
Key Points
- Availability: ChatGPT (Thinking, Pro), API, and Codex. Experimental Codex skill: Playwright (Interactive).
- Context & scale: supports up to 1M tokens of context for long-horizon planning and verification.
- Computer use: OpenAI's first general-purpose model with native computer-use tool support (mouse and keyboard actions driven by screenshots, or code written against libraries such as Playwright), improved DOM/screenshot interaction, and configurable safety/confirmation policies.
- Vision: higher-fidelity image inputs (the original detail level accepts up to 10.24M total pixels or a 6000px maximum dimension) and better visual perception for document parsing and UI interaction.
- Coding & latency: incorporates the coding strengths of GPT‑5.3‑Codex; offers faster token velocity through /fast mode in Codex and priority processing in the API; Playwright (Interactive) enables visual debugging and testing of browser and Electron apps.
- Accuracy & efficiency: more factual than GPT‑5.2 (individual claims are 33% less likely to be false, and full responses are 18% less likely to contain errors) and substantially more token-efficient in its reasoning.
- Benchmarks: GDPval 83.0% (state-of-the-art), OSWorld-Verified 75.0%, MMMU-Pro 81.2%, OmniDocBench average error 0.109 (lower is better; an improvement over GPT‑5.2).
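The image limits in the Vision point above lend themselves to a quick pre-flight check before upload. The following is a minimal sketch, not part of any official SDK: it assumes both caps (10.24M total pixels and 6000px on the longest side) apply at the same time, and the function name `fit_within_limits` is hypothetical. It only computes target dimensions; the actual resizing would be done with an image library of your choice.

```python
import math

# Limits taken from the Vision bullet; treating them as simultaneous caps is an assumption.
MAX_TOTAL_PIXELS = 10_240_000  # 10.24M pixels
MAX_DIMENSION = 6000           # longest side, in pixels


def fit_within_limits(width: int, height: int) -> tuple[int, int]:
    """Return (width, height), scaled down only if needed to satisfy both limits."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale factor demanded by the longest-side limit.
    side_scale = MAX_DIMENSION / max(width, height)
    # Scale factor demanded by the total-pixel limit (area grows with scale squared).
    area_scale = math.sqrt(MAX_TOTAL_PIXELS / (width * height))
    # Never upscale: cap the factor at 1.0.
    scale = min(1.0, side_scale, area_scale)
    # Floor so rounding cannot push the result back over a limit.
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


if __name__ == "__main__":
    for size in [(1920, 1080), (8000, 6000), (12000, 100)]:
        print(size, "->", fit_within_limits(*size))
```

Images already inside both limits pass through unchanged, so the check can be applied unconditionally to every screenshot or document scan.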
Practical guidance for engineers
- Use GPT‑5.4 Thinking/Pro in ChatGPT to see an upfront plan of the model's reasoning, which you can adjust mid-response.
- In the API/Codex, enable the updated computer tool to drive UI interactions; prefer original/high image detail for high-resolution perception and click accuracy.
- For low-latency coding workflows, use /fast mode (Codex) or priority processing (API).
- Configure developer messages and custom confirmation policies to tune agent safety and behavior for your application.
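One way to think about a custom confirmation policy is as a small application-side gate that inspects each action the agent proposes before it is executed. The sketch below is illustrative only: `ProposedAction`, `Decision`, `decide`, and the keyword lists are all hypothetical names invented for this example, not part of the OpenAI API or Codex, and a real policy would likely use richer signals than keyword matching.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"      # execute without asking
    CONFIRM = "confirm"  # pause and ask the user first
    BLOCK = "block"      # never execute


@dataclass(frozen=True)
class ProposedAction:
    """Hypothetical description of one UI action an agent wants to take."""
    kind: str       # e.g. "click", "type", "navigate" (illustrative labels)
    target: str     # human-readable description of the element or URL
    text: str = ""  # text to be typed, if any


# Hypothetical keyword lists; tune these for your own application.
RISKY_KEYWORDS = ("delete", "purchase", "pay", "send", "submit")
BLOCKED_KEYWORDS = ("format disk", "wire transfer")


def decide(action: ProposedAction) -> Decision:
    """Classify an action; blocked keywords take precedence over risky ones."""
    haystack = f"{action.target} {action.text}".lower()
    if any(keyword in haystack for keyword in BLOCKED_KEYWORDS):
        return Decision.BLOCK
    if any(keyword in haystack for keyword in RISKY_KEYWORDS):
        return Decision.CONFIRM
    return Decision.ALLOW


if __name__ == "__main__":
    print(decide(ProposedAction("click", "Next page link")))
    print(decide(ProposedAction("click", "Delete account button")))
```

Keeping the policy as a pure function makes it easy to unit-test separately from the agent loop and to tighten or relax per deployment.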
Key benefits
- Faster, more accurate outputs for spreadsheets, presentations, documents, and long-running agent workflows.
- Improved agent reliability across web and desktop tasks, with lower token cost and faster completion times.
Where to start
- Try GPT‑5.4 Thinking or Pro in ChatGPT for knowledge-work tasks; read the updated API/Codex docs and try Playwright (Interactive) for visual test/debug flows.