Boston Children’s uses AI to unlock new diagnoses
Key Points
- 40+ rare-disease diagnoses
- 60,000 hours saved
- enterprise AI layer (internal ChatGPT)
Summary
Boston Children’s embedded an enterprise AI layer (an internal ChatGPT environment) across clinical, research, and operational teams, treating AI as shared infrastructure rather than a collection of point tools. The platform provides secure access to internal data, model-assisted synthesis of literature and genetic data, and workflow automation. Reported outcomes include 40+ previously unresolved rare-disease diagnoses, ~60,000 hours saved across 50+ automations (≈$7M in redeployed labor), and one-third of employees using AI daily. Governance, monitoring, and deployment cycles measured in days rather than months enable safe, repeatable rollouts.
Key Points (detailed)
- Architecture: single enterprise AI layer (internal ChatGPT) that integrates with internal data and workflows.
- Clinical impact: a "co-pilot geneticist" synthesizes genetic, phenotypic, and literature data, contributing to 40+ new diagnoses and the discovery of gene targets.
- Operational impact: 50+ automations (invoice intake, surgical scheduling, document drafting) have saved ~60,000 hours, equivalent to $7M+ in redeployed labor.
- Governance & safety: monitoring, evaluation, and governance structures are paired with the technology to support safe clinical and operational use.
- Developer velocity: tools and capabilities can be deployed in days, enabling rapid iteration and role-specific solutions.
- Next steps: deeper clinical integration, cross-specialty expansion, and continued model refinement in partnership with OpenAI.
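To make the "single enterprise AI layer" idea concrete, here is a minimal Python sketch of one gateway that sits between users, internal data connectors, and a central model endpoint. It is an illustration under stated assumptions, not a description of Boston Children’s actual system: the `AILayer` class, the connector and role names, and the stub model are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Set


@dataclass
class AuditEvent:
    """One record per request, whether it was allowed or denied."""
    timestamp: str
    user: str
    role: str
    connector: str
    allowed: bool


@dataclass
class AILayer:
    """Hypothetical shared gateway: one model endpoint, many data connectors."""
    model: Callable[[str], str]  # central model endpoint (stubbed below)
    connectors: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    permissions: Dict[str, Set[str]] = field(default_factory=dict)  # role -> connectors
    audit_log: List[AuditEvent] = field(default_factory=list)

    def register(self, name: str, fetch: Callable[[str], str], roles: Set[str]) -> None:
        """Attach a data connector and grant the listed roles access to it."""
        self.connectors[name] = fetch
        for role in roles:
            self.permissions.setdefault(role, set()).add(name)

    def ask(self, user: str, role: str, connector: str, question: str) -> str:
        """Check access, log the attempt, then query the model with connector context."""
        allowed = connector in self.permissions.get(role, set())
        self.audit_log.append(
            AuditEvent(datetime.now(timezone.utc).isoformat(), user, role, connector, allowed)
        )
        if not allowed:
            raise PermissionError(f"role {role!r} may not query {connector!r}")
        context = self.connectors[connector](question)
        return self.model(f"Context: {context}\nQuestion: {question}")


def stub_model(prompt: str) -> str:
    # Placeholder for a real model call; returns a canned string.
    return f"[stub answer based on {len(prompt)} prompt characters]"


layer = AILayer(model=stub_model)
layer.register("invoices", lambda q: "invoice records (placeholder)", roles={"finance"})
answer = layer.ask("alice", "finance", "invoices", "Which invoices are overdue?")
```

Because every team goes through the same `ask` path, access checks and audit records are applied uniformly, which is the main argument for a shared layer over separate point tools.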
Implications for engineers
- Prioritize a shared platform approach (central model endpoint, secure data connectors, role-based access) over one-off integrations.
- Build observability: logging, auditing, and performance monitoring for safety and compliance.
- Design small automated workflows tied to measurable KPIs to justify scale-up and capture operational ROI.
- Collaborate with clinical teams to validate model outputs and iterate on prompt engineering, data inputs, and evaluation metrics.
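The advice to tie small automations to measurable KPIs can be sketched as a simple tally of hours saved and the labor value they represent. The sketch below is illustrative only: the `AutomationKPI` class, the run counts, the minutes saved per run, and the hourly rate are made-up placeholders, not figures from Boston Children’s.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AutomationKPI:
    """Hypothetical per-automation record; all values are placeholders."""
    name: str
    runs: int
    minutes_saved_per_run: float

    @property
    def hours_saved(self) -> float:
        return self.runs * self.minutes_saved_per_run / 60


def summarize(kpis: List[AutomationKPI], hourly_rate: float) -> Dict[str, float]:
    """Total hours saved and their labor value at an assumed hourly rate."""
    total_hours = sum(k.hours_saved for k in kpis)
    return {"hours_saved": total_hours, "labor_value": total_hours * hourly_rate}


# Placeholder inputs chosen only to make the arithmetic easy to follow.
kpis = [
    AutomationKPI("invoice intake", runs=1200, minutes_saved_per_run=5),
    AutomationKPI("surgical scheduling", runs=300, minutes_saved_per_run=20),
]
report = summarize(kpis, hourly_rate=50.0)
print(report)
```

Tracking these numbers per automation, rather than only in aggregate, makes it easier to see which workflows justify further investment.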