rl Articles | DocsDigest

Matched posts: 1

Reasoning models struggle to control their chains of thought, and that’s good

OpenAI News / Mar 5, 2026

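CoT controllability rates are extremely low
Controllability declines with longer reasoning and additional training
For now, CoT monitoring is an effective layer of defense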

chain-of-thought cot monitoring evaluation safety rl prompting
