Introducing the OpenAI Safety Bug Bounty program
Key Points
- Public Safety Bug Bounty targets AI-specific abuse and safety risks
- In-scope: agentic risks, proprietary leaks, and account/platform integrity
- Agent-hijack reports must reproduce at least 50% of the time to pass triage
Summary
OpenAI launched a public Safety Bug Bounty program focused on identifying AI-specific safety and abuse risks that fall outside conventional security vulnerabilities. The program accepts reports on agentic risks (including third-party prompt injection and issues involving the Model Context Protocol, MCP), exposure of proprietary information, and account/platform integrity issues. Submissions are triaged by OpenAI’s Safety and Security Bug Bounty teams and may be routed between the two programs depending on which team owns the issue.
Key Points
- Scope: in-scope categories include agentic risks (e.g., reliable agent hijacking, Browser and ChatGPT Agent behavior), third-party prompt injection and data exfiltration, exposure of OpenAI proprietary information, and vulnerabilities affecting account or platform integrity.
- Reproducibility: agentic hijack/MCP reports must be reproducible at least 50% of the time and demonstrate plausible, material harm.
- Triage & routing: reports are reviewed by Safety and Security teams and may be rerouted between the Safety and Security bounty programs based on scope.
- Out-of-scope: general content-policy bypasses and jailbreaks that lack demonstrable safety/abuse impact (e.g., rude responses or easily searchable info) are not eligible; separate private campaigns may cover certain high-risk categories (e.g., biorisk).
- Responsible testing: MCP and third-party tests must comply with the third party's terms of service and follow responsible disclosure practices. Do not perform unauthorized actions or escalate harm.
- Report guidance for engineers and researchers: provide clear reproduction steps, PoC or test cases, success rate and scale, evidence (logs/screenshots), affected product(s), potential impact, and suggested mitigations to help triage and remediation.
- How to participate: apply and submit findings through the Safety Bug Bounty program portal; OpenAI invites collaboration with researchers, ethical hackers, and the safety/security community.
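To make the reproducibility and report-guidance points concrete, here is a minimal Python sketch a researcher might use to sanity-check a draft before submitting. It is not an OpenAI tool or schema: the class `DraftReport`, its field names, and the way the success rate is computed (successful attempts divided by total attempts) are all assumptions made for illustration. Only the 50% figure and the list of recommended report contents come from the summary above.

```python
from dataclasses import dataclass, field, fields

# From the summary: agentic hijack/MCP reports must reproduce at least 50% of the time.
MIN_SUCCESS_RATE = 0.5


@dataclass
class DraftReport:
    """Hypothetical local checklist for a report; not an official schema."""
    affected_product: str = ""
    reproduction_steps: str = ""
    potential_impact: str = ""
    suggested_mitigations: str = ""
    evidence: list[str] = field(default_factory=list)   # e.g. log or screenshot paths
    attempts: list[bool] = field(default_factory=list)  # True = attempt succeeded

    def success_rate(self) -> float:
        """Fraction of recorded attempts that succeeded (assumed definition)."""
        if not self.attempts:
            raise ValueError("no attempts recorded")
        return sum(self.attempts) / len(self.attempts)

    def meets_threshold(self, threshold: float = MIN_SUCCESS_RATE) -> bool:
        """True if the observed success rate is at or above the threshold."""
        return self.success_rate() >= threshold

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


if __name__ == "__main__":
    report = DraftReport(
        affected_product="ChatGPT Agent",
        attempts=[True, True, False, True],
    )
    print(f"success rate: {report.success_rate():.0%}")
    print("meets 50% threshold:", report.meets_threshold())
    print("still missing:", report.missing_fields())
```

A small number of attempts gives only a rough estimate of the true success rate, so recording more trials and stating the count alongside the rate makes the figure easier for triagers to interpret.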