OpenAI Launches Safety Bug Bounty Program for AI Abuse and Safety Risks
Key Points
- New Safety Bug Bounty program launched targeting AI abuse risks
- Focus on agentic risks, prompt injection, and platform integrity
- Complements existing security program with safety-specific scope
Summary
OpenAI launched a public Safety Bug Bounty program on March 25, 2026, specifically targeting AI abuse and safety risks across its products. The program complements the company's existing Security Bug Bounty by covering safety issues that may not qualify as traditional security vulnerabilities but still pose real risks.
Program Scope
- Agentic Risks: Third-party prompt injection, data exfiltration, and unauthorized actions by OpenAI agents (including Browser and ChatGPT Agent); the issue must be reproducible at least 50% of the time
- Proprietary Information: Model generations exposing reasoning-related proprietary information and other OpenAI confidential data
- Platform Integrity: Account manipulation, bypassing anti-automation controls, evading restrictions, and unauthorized access to features
- Exclusions: General jailbreaks and content-policy bypasses without demonstrable safety impact are out of scope
- Private Campaigns: Periodic focused bug bounty programs for specific harm types like biorisk content in ChatGPT Agent and GPT-5
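For researchers preparing an agentic-risk report, the 50%+ reproducibility requirement can be checked before submission with a simple success-rate count. The sketch below is illustrative only: the program does not specify how reproducibility is measured, so the trial count, the `attempt` callback, and both function names are assumptions rather than part of any OpenAI tooling.

```python
from typing import Callable


def reproduction_rate(attempt: Callable[[], bool], trials: int = 20) -> float:
    """Run a hypothetical attempt `trials` times and return the success fraction.

    `attempt` is a placeholder supplied by the researcher; it should return
    True when the behavior being reported was observed on that run.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    successes = sum(1 for _ in range(trials) if attempt())
    return successes / trials


def meets_threshold(rate: float, threshold: float = 0.5) -> bool:
    """Check a measured rate against the assumed 50% reproducibility bar."""
    return rate >= threshold
```

A report that succeeds in 3 of 4 runs would give a rate of 0.75 and clear the assumed bar, while one that succeeds in 1 of 4 would not. A larger number of trials gives a more stable estimate.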
Participation
Researchers can apply through the Safety Bug Bounty program portal. Reports are triaged by Safety and Security teams and may be rerouted between programs based on scope.