ChatGPT Generates Gory and Sexual Images: Mindgard

Researchers from the British AI security startup Mindgard have discovered that the latest public version of OpenAI’s chatbot can be manipulated into producing sexualized images and scenes of graphic violence.
By slightly altering a widely shared humorous prompt, the team successfully bypassed the system’s safety filters.
OpenAI, the creator of ChatGPT, said that it has taken action to block these responses after being contacted by the BBC. The company noted, “After investigating this trend, we’ve introduced additional safeguards against this type of prompt.”
Despite these efforts and the company’s claim of having multiple layers of protection, researchers found that minor adjustments to the prompt still allowed the chatbot to generate concerning material.
Nature of the Generated Imagery
The BBC confirmed how the GPT-5.4 model was prompted to create graphic content. Peter Garraghan, the founder of Mindgard and a professor at Lancaster University, described the outputs as “very gruesome, sometimes sexualised, sometimes both together.”
He expressed significant concern that the AI produced these images of “its own volition”, without being given any specific subject-matter instructions. Garraghan remarked, “This is a perfectly innocent-looking instruction to an AI, but the consequence is it generates very, very bad imagery and content.”
The generated images were described as deeply distressing. One image depicted a man with a severe head injury, while another showed a deceased young woman covered in blood.
Mindgard researcher Jim Nightingale, who uncovered the flaw, stated he was left “shaken, and in tears” by the visuals. One specific image, titled “Grim crime scene aftermath” by ChatGPT, suggested sexual violence.
Another, titled “abandoned in fear and restraint,” showed a frightened young woman tied up and gagged.
Bypassing Safeguards and Training Data
Mindgard noted that previous research indicated ChatGPT could be tricked into creating nude deepfakes of real people by swapping faces. While OpenAI claimed to have fixed that particular issue, researchers demonstrated that alternative approaches still succeed.
Garraghan warned that further exploration would likely reveal even worse content, stating, “Other topics, I’m sure, would also come out if we spent more time doing so.”
Nightingale believes the output reflects the internet data used to train the model, noting in his report, “I’m struck that while what I saw was generated, an artificial image, it has ties to real images, and the real world.”
Although the researchers alerted OpenAI in May, they initially received only an automated response.
Industry and Regulatory Challenges
OpenAI maintains that it uses a combination of automated systems and human review to block harmful material.
Their official policy states, “The assistant should not generate erotica, depictions of illegal or non-consensual sexual activities, or extreme gore, except in scientific, historical, news, artistic or other contexts where sensitive content is appropriate.”
However, experts describe the task of securing AI as “mountainous”. Dr. Rumman Chowdhury, CEO of Humane Intelligence, characterized the situation as “a game of cat and mouse”. She explained that models lack human understanding: “Models do not understand intent. They do not understand context. They do not understand propriety or right or wrong.”
The UK’s AI Security Institute previously found “jailbreaks” in every system it tested. The Department for Science, Innovation and Technology acknowledged that “safeguards in AI models are improving, but there is more to do,” emphasizing continued collaboration with developers to strengthen security.
Source: BBC (adapted)


