Where the goblins came from
OpenAI News / Apr 29, 2026

Rewards reinforced the vocabulary
The behavior transferred from Nerdy
Fixing the rewards and data calmed it down

reinforcement-learning, reward-modeling, fine-tuning, data-filtering, model-audit, alignment