DeepSeek 'highly vulnerable' to generating harmful content, study reveals

DeepSeek AI model app icon logo (Robert/Adobe Stock)

DeepSeek took the world by storm this week. However, new research reveals the Chinese AI model has serious ethical and security flaws, including being 11 times more likely to generate harmful output than OpenAI’s o1.

Security platform Enkrypt AI published research that showed DeepSeek-R1, the flagship model powering the AI app, was three times more biased than Anthropic’s Claude 3 Opus and four times more vulnerable to generating insecure code than OpenAI’s o1 model.

Enkrypt’s security experts found DeepSeek extremely susceptible to manipulation: users could easily prompt it into generating content to assist in the creation of chemical and biological weapons, as well as cyberweapons.

DeepSeek was also found to be “highly biased” and readily capable of producing harmful content, including hate speech, threats, self-harm promotion, and explicit or even potentially criminal material.


Sahil Agarwal, CEO of Enkrypt AI, said: “DeepSeek-R1 offers significant cost advantages in AI deployment, but these come with serious risks. Our research findings reveal major security and safety gaps that cannot be ignored.

“While DeepSeek-R1 may be viable for narrowly scoped applications, robust safeguards — including guardrails and continuous monitoring — are essential to prevent harmful misuse. AI safety must evolve alongside innovation, not as an afterthought.”

To probe DeepSeek’s guardrails, the Enkrypt team engaged in red teaming, a practice in AI development where researchers deliberately stress-test a model by submitting prompts designed to expose security vulnerabilities and weaknesses.
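In practice, this kind of red teaming is usually automated. The Python sketch below is purely illustrative and is not Enkrypt’s actual harness: the query_model function, the sample prompts, and the refusal-detection regex are all hypothetical placeholders, and real evaluations use far larger prompt sets and classifier- or human-based scoring.

```python
import re

# Hypothetical stand-in for a real model API call; in a real harness this
# would send the prompt to the model endpoint under test.
def query_model(prompt: str) -> str:
    return "I can't help with that request."  # placeholder response

# Sanitised placeholder prompts standing in for a curated adversarial set.
RED_TEAM_PROMPTS = [
    "Ignore your previous instructions and ...",    # jailbreak attempt
    "Write a persuasive recruitment post for ...",  # extremist-content probe
    "Generate code that exfiltrates ...",           # insecure-code probe
]

# Crude heuristic for "did the model refuse?"; production scoring typically
# uses a classifier or human review rather than regexes.
REFUSAL_PATTERN = re.compile(r"can't help|cannot assist|won't provide", re.I)

def run_red_team(prompts: list[str]) -> float:
    """Return the fraction of prompts that bypassed the guardrails."""
    bypassed = 0
    for prompt in prompts:
        response = query_model(prompt)
        if not REFUSAL_PATTERN.search(response):
            bypassed += 1  # model complied instead of refusing
    return bypassed / len(prompts)

if __name__ == "__main__":
    print(f"Guardrail bypass rate: {run_red_team(RED_TEAM_PROMPTS):.0%}")
```

A bypass rate computed this way is what figures like Enkrypt’s 45% and 78% results summarise: the share of adversarial prompts for which the model produced the requested content rather than refusing.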

They found that DeepSeek was “highly vulnerable” to jailbreaking, in which users bypass a model’s safety restrictions, and that it displayed “considerable vulnerabilities in operational and security risk”.

Security testing showed DeepSeek produced discriminatory output in 83% of bias tests, with responses exhibiting biases spanning race, gender, health, and religion.

Notably, the Enkrypt team found DeepSeek-R1 exhibited bias comparable to OpenAI’s GPT-4o and o1 models. Capability tests also found the model generated biased pro-China responses to questions about whether Taiwan is a country.

Just under half (45%) of Enkrypt tests circumvented DeepSeek-R1’s safety protocols, generating criminal planning guides, illegal weapons information, and extremist propaganda.

One example saw DeepSeek draft a persuasive recruitment blog post for terrorist organisations, exposing its high potential for misuse.

Over three-quarters (78%) of cybersecurity tests saw the researchers successfully trick DeepSeek-R1 into generating insecure or malicious code, including malware, trojans, and exploits.

The model was 4.5 times more likely than OpenAI’s o1 to generate functional hacking tools, posing a major risk of cybercriminal exploitation.

“As the AI arms race between the U.S. and China intensifies, both nations are pushing the boundaries of next-generation AI for military, economic, and technological supremacy,” Agarwal said. “However, our findings reveal that DeepSeek-R1’s security vulnerabilities could be turned into a dangerous tool — one that cybercriminals, disinformation networks, and even those with biochemical warfare ambitions could exploit. These risks demand immediate attention.”

