Wayfair boosts catalog accuracy and support speed with OpenAI
Key Points
- Tag-agnostic OpenAI model for catalog attributes
- Wilma automates ticket triage and handles ~41,000 supplier tickets/month
- 2.5M product tags corrected; 70x attribute expansion
Summary
Wayfair embedded OpenAI models into its catalog and supplier support workflows to reduce manual work, improve product attribute quality, and accelerate ticket resolution. The company built a tag-agnostic architecture: a single model paired with a "definition agent" that ingests internal and web definitions along with product data, so the same model can classify attributes across product classes. In supplier support, an agentic system named Wilma handles triage, co-pilot, and autopilot flows, automating about 41,000 supplier tickets per month and cutting time-to-resolution.
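To make the tag-agnostic idea concrete, here is a minimal sketch of how a definition layer could feed a single classifier. It is an illustration under stated assumptions, not Wayfair's implementation: `TagDefinition`, `define_tag`, `build_prompt`, and `classify` are hypothetical names, the prompt wording is invented, and `model` is any callable that maps a prompt string to a text answer (no specific OpenAI API call is assumed).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TagDefinition:
    """Contextual meaning of one attribute tag (hypothetical structure)."""
    tag: str
    meaning: str
    allowed_values: List[str]


def define_tag(tag: str,
               internal_defs: Dict[str, str],
               web_defs: Dict[str, str],
               allowed_values: List[str]) -> TagDefinition:
    """Stand-in for the definition agent: merge internal and web definitions."""
    meanings = [src[tag] for src in (internal_defs, web_defs) if tag in src]
    if not meanings:
        raise KeyError(f"no definition found for tag {tag!r}")
    return TagDefinition(tag, " ".join(meanings), allowed_values)


def build_prompt(definition: TagDefinition, product: Dict[str, str]) -> str:
    """Combine the tag definition with aggregated product data."""
    facts = "\n".join(f"- {k}: {v}" for k, v in sorted(product.items()))
    return (
        f"Attribute: {definition.tag}\n"
        f"Definition: {definition.meaning}\n"
        f"Allowed values: {', '.join(definition.allowed_values)}\n"
        f"Product data:\n{facts}\n"
        "Answer with exactly one allowed value."
    )


def classify(definition: TagDefinition,
             product: Dict[str, str],
             model: Callable[[str], str]) -> str:
    """Run the single shared model; reject answers outside the allowed set."""
    answer = model(build_prompt(definition, product)).strip()
    if answer not in definition.allowed_values:
        raise ValueError(f"unexpected value {answer!r} for tag {definition.tag!r}")
    return answer
```

The point of the design is that `classify` never changes per tag: supporting a new attribute means supplying a new definition, not training a new model.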
Key Points
- Architecture: single, tag-agnostic OpenAI model with a "definition agent" that produces contextual meanings for tags and consumes aggregated product data. This replaced costly bespoke models per tag.
- Deployment pattern: staged rollout with human audits and supplier validation. High-confidence predictions overwrite tags automatically, high-risk tags require supplier confirmation, and alignment-rate monitoring determines when a workflow moves from assist to semi-autonomous mode.
- Measured impact: corrected ~2.5M product tags across >1M products; scaled attribute coverage ~70x year-over-year; automated ~41,000 supplier tickets/month and reduced time-to-resolution across workflows.
- Wilma details: ticket triage reads intent, pulls context from internal databases, and routes tickets to owners; co-pilots synthesize case history and draft responses; the triage flow went from prototype to production in ~1 month.
- Engineering takeaways: centralize semantics with a definition layer, train assistants on historical success signals, integrate models with your data ecosystem for context, and enforce confidence thresholds + human-in-the-loop audits for high-risk changes.
- Metrics & monitoring: track alignment rate (how often the model's decision matches the human decision), A/B test ecommerce metrics (impressions, clicks, page rank), ticket throughput, and reopen rates to validate impact.
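The deployment pattern described above (automatic overwrites at high confidence, supplier confirmation for high-risk tags, human audits otherwise) can be sketched as a small routing function. This is a hedged illustration: the `Action` names, the `route_change` function, and the 0.95 default threshold are assumptions chosen for the example, not values reported by Wayfair.

```python
from enum import Enum


class Action(Enum):
    AUTO_OVERWRITE = "auto_overwrite"      # apply the model's value directly
    SUPPLIER_CONFIRM = "supplier_confirm"  # ask the supplier to validate
    HUMAN_AUDIT = "human_audit"            # queue for internal review


def route_change(confidence: float, high_risk: bool,
                 threshold: float = 0.95) -> Action:
    """Decide how a proposed tag change is applied.

    `threshold` is an assumed cut-off; in practice it would be tuned
    per tag from audit results.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if high_risk:
        # High-risk tags always go to the supplier, whatever the confidence.
        return Action.SUPPLIER_CONFIRM
    if confidence >= threshold:
        return Action.AUTO_OVERWRITE
    return Action.HUMAN_AUDIT
```

Checking `high_risk` before confidence means that a very confident prediction still cannot bypass supplier sign-off on a sensitive attribute.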
Practical next steps for engineers
- Build a reusable semantic layer (definition agent) to avoid per-tag models.
- Log and monitor model confidence and define alignment thresholds to gate automation.
- Keep human audits and supplier sign-off for high-risk or low-confidence changes while expanding autopilot scopes gradually.
- Use historical case data to train co-pilot/autopilot behaviors and measure alignment before enabling auto-actions.
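As a rough sketch of the last step, alignment can be measured by replaying historical cases and comparing the model's decision with what the human actually did. The function names, the 0.90 threshold, and the 500-case minimum below are illustrative assumptions, not figures from the source.

```python
from typing import Sequence


def alignment_rate(model_decisions: Sequence[str],
                   human_decisions: Sequence[str]) -> float:
    """Share of cases where the model's decision equals the human decision."""
    if len(model_decisions) != len(human_decisions):
        raise ValueError("decision lists must have the same length")
    if not model_decisions:
        raise ValueError("at least one case is required")
    matches = sum(m == h for m, h in zip(model_decisions, human_decisions))
    return matches / len(model_decisions)


def automation_mode(rate: float, sample_size: int,
                    min_rate: float = 0.90, min_samples: int = 500) -> str:
    """Gate the move from assist to semi-autonomous operation.

    Both thresholds are assumed values; a small sample keeps the
    workflow in assist mode regardless of the measured rate.
    """
    if sample_size >= min_samples and rate >= min_rate:
        return "semi_autonomous"
    return "assist"


# Replay of four historical cases: the model disagrees with the human once.
rate = alignment_rate(["refund", "route", "close", "route"],
                      ["refund", "route", "escalate", "route"])
mode = automation_mode(rate, sample_size=4)
```

Requiring a minimum sample size alongside the rate guards against enabling auto-actions on the strength of a handful of lucky matches.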