What caused ChatGPT to start using goblin references in its responses?

A personality feature meant to make AI more playful ended up filling responses with mythological creatures

Highlights

  • Goblin mentions in ChatGPT rose 175 per cent after GPT-5.1 launched in November.
  • The "Nerdy" personality drove 66.7 per cent of all goblin references.
  • OpenAI has retired the personality and removed creature words from training data.

OpenAI has explained how an attempt to make ChatGPT more fun and personable accidentally gave it a fixation on goblins, gremlins, and other creatures.
The company published a detailed blog post laying out exactly how the problem started, how it spread, and what was done to fix it.

The issue traces back to a feature called the Nerdy personality, one of several personality options OpenAI built to let ChatGPT communicate in different styles.

This particular one was designed to make the AI sound enthusiastic, witty, and playful, like a knowledgeable friend who enjoys making complex ideas accessible.


The problem first came to light after the launch of GPT-5.1 in November. Users began complaining that the model felt strangely overfamiliar in its tone.

When a safety researcher flagged a pattern of goblin and gremlin references, an internal review found that goblin mentions had risen by 175 per cent since the GPT-5.1 launch. Gremlin mentions were up by 52 per cent.

The Nerdy personality used a system prompt that pushed the model to be enthusiastic and use playful language.

During training, responses that included creature words were repeatedly scored more favourably, even though that was never the intention.

Spreading through training

The goblins, however, did not go away. By the time GPT-5.4 was being developed, the creature language had become harder to ignore.

Another internal review was carried out, and this time the connection to the Nerdy personality became clearer.

The Nerdy personality made up just 2.5 per cent of all ChatGPT responses, but it was behind 66.7 per cent of goblin mentions.
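
To put those two percentages in proportion, a quick calculation helps. The sketch below uses only the two figures quoted above. The function and variable names are illustrative, and the second calculation assumes both figures were measured over the same set of responses, which the article does not state.

```python
def overrepresentation(share_of_responses, share_of_mentions):
    """How many times larger a group's share of mentions is than its share of responses."""
    return share_of_mentions / share_of_responses


def per_response_rate_ratio(share_of_responses, share_of_mentions):
    """Mentions per response inside the group, relative to mentions per response outside it."""
    inside = share_of_mentions / share_of_responses
    outside = (1 - share_of_mentions) / (1 - share_of_responses)
    return inside / outside


nerdy_responses = 0.025  # 2.5 per cent of all ChatGPT responses
nerdy_mentions = 0.667   # 66.7 per cent of goblin mentions

print(round(overrepresentation(nerdy_responses, nerdy_mentions), 1))
print(round(per_response_rate_ratio(nerdy_responses, nerdy_mentions), 1))
```

On these figures, the Nerdy personality's share of goblin mentions was roughly 27 times its share of responses. Under the stated assumption, that works out to about 78 times as many goblin mentions per response as in all other responses combined.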

The bigger concern was that the habit was not staying within the Nerdy personality. Reinforcement learning does not automatically keep a learned behaviour tied to the condition that produced it.

Once creature language was being rewarded in one part of training, it began seeping into other responses too.

As goblin and gremlin mentions increased under the Nerdy personality prompt, they increased by nearly the same proportion in responses generated without it.
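
The leakage described above can be illustrated with a deliberately tiny toy model. This is not OpenAI's training setup: it is a two-parameter sketch in which the chance of using a creature word depends on one weight shared by every context and one extra weight active only under a persona prompt. The names, starting values and learning rate are all assumptions chosen for illustration. Reward is given only under the persona prompt, yet the shared weight moves as well.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(steps=200, lr=0.5):
    """Return P(creature word) with and without the persona prompt after training."""
    w_shared = -3.0   # hypothetical weight used in every context
    w_persona = 0.0   # hypothetical extra weight, active only under the persona prompt
    for _ in range(steps):
        # Training samples come only from the persona context in this toy.
        p = sigmoid(w_shared + w_persona)
        # Expected policy-gradient (REINFORCE) update on the logit when a
        # creature word earns reward 1 and anything else earns 0: p * (1 - p).
        grad = p * (1.0 - p)
        # Both weights feed the same logit, so both receive the update.
        w_shared += lr * grad
        w_persona += lr * grad
    return sigmoid(w_shared + w_persona), sigmoid(w_shared)


with_persona, without_persona = train()
print(f"before training, no persona: {sigmoid(-3.0):.3f}")
print(f"after training, persona:     {with_persona:.3f}")
print(f"after training, no persona:  {without_persona:.3f}")
```

Because nothing in the update separates the shared weight from the persona-specific one, the probability of a creature word rises in the no-persona case too, even though that case was never rewarded. Real models have far more parameters, but the same absence of automatic containment applies.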

When GPT-5.5 was being tested in Codex, OpenAI's coding assistant, staff quickly spotted the same goblin tendency.

The company added a direct instruction telling Codex not to mention goblins, gremlins, raccoons, trolls, ogres, or pigeons unless clearly relevant.

OpenAI has since retired the Nerdy personality, removed the reward signal that encouraged creature words, and cleaned up its training data.

The episode, OpenAI said, is a clear example of how reward signals can shape AI behaviour in ways that are difficult to predict.

Its research team now treats understanding why a model behaves unexpectedly, and building the tools to investigate those patterns quickly, as a core capability.
