OpenAI talks about not talking about goblins

April 30, 2026, 21:42 | The Verge | AI | International news | 2 min read | 349 words | Archived: May 1, 2026, 02:48

References to goblins and gremlins spiked with the release of GPT-5.1’s ‘Nerdy’ personality, and then spread to other models.

OpenAI is opening up about its goblin problem. After a report from Wired revealed instructions to OpenAI’s coding model to “never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures,” the AI startup published an explanation on its website, calling references to the creatures a “strange habit” its models developed as a result of their training.

As outlined in the blog post, OpenAI began noticing metaphors referencing goblins and other creatures starting with its GPT-5.1 model, specifically when using the “Nerdy” personality option. OpenAI says the problem continued to worsen with subsequent model releases, until it found that its reinforcement learning training rewarded the quirky metaphors under the Nerdy personality, and that newer models were in turn being trained on that output.

The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.
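The leak described above can be illustrated with a toy model. This is a minimal sketch and not OpenAI's training setup: it assumes a hypothetical two-weight policy in which one weight (`w_shared`) is used in every condition and another (`w_nerdy`) is active only under the Nerdy personality. The names, starting values, and learning rate are all invented for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def p_goblin(w_shared: float, w_nerdy: float, nerdy: bool) -> float:
    """Probability of emitting a goblin metaphor (hypothetical toy policy)."""
    logit = w_shared + (w_nerdy if nerdy else 0.0)
    return sigmoid(logit)


def train_nerdy_only(w_shared: float, w_nerdy: float,
                     steps: int = 200, lr: float = 0.5):
    """Policy-gradient updates using only Nerdy-condition episodes.

    Assumed reward: 1 for a goblin metaphor, 0 otherwise. The expected
    REINFORCE gradient with respect to the logit is then p * (1 - p).
    """
    for _ in range(steps):
        p = p_goblin(w_shared, w_nerdy, nerdy=True)
        grad = p * (1.0 - p)
        # Both weights feed the Nerdy logit, so both receive the update,
        # even though w_shared is also used outside the Nerdy condition.
        w_shared += lr * grad
        w_nerdy += lr * grad
    return w_shared, w_nerdy


before_default = p_goblin(-3.0, 0.0, nerdy=False)
w_shared, w_nerdy = train_nerdy_only(-3.0, 0.0)
after_default = p_goblin(w_shared, w_nerdy, nerdy=False)
after_nerdy = p_goblin(w_shared, w_nerdy, nerdy=True)

print(f"default condition: {before_default:.3f} -> {after_default:.3f}")
print(f"nerdy condition after training: {after_nerdy:.3f}")
```

In this sketch the default-condition probability rises even though no default-condition episode was ever rewarded, because the shared weight absorbs part of every update. Real models share far more parameters across conditions, so the analogy is only qualitative.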

Though references to goblins and gremlins dropped off after OpenAI discontinued the Nerdy personality in March, they didn’t disappear completely with GPT-5.5 inside its Codex coding tool, as OpenAI started training the model before finding the “root cause.” The company had to give Codex very specific instructions not to talk about the mythological creatures as a result. But if you’d prefer to have your AI code with some goblin sprinkled in, OpenAI has shared a way to reverse its instructions.

Emma Roth