The Chinese spy bots behind Beijing’s AI heist in the West

DeepSeek accused of using ‘distillation attacks’ to siphon information from Silicon Valley chatbots

May 4, 2026 - 06:14

When DeepSeek, a Chinese chatbot, was released in January last year, more than $1tn (£740bn) was wiped from US markets over fears that Beijing had successfully caught up to Silicon Valley in the AI race.

DeepSeek, little known outside AI circles, appeared to have developed a system that could match the likes of ChatGPT and Claude despite lacking the billions in funding or enormous data centre resources of its rivals.

But before long, suspicions emerged about how the Chinese company had got there. OpenAI claimed that DeepSeek had improperly trained its system by siphoning information from the US company’s own models.

Since then, OpenAI and Anthropic, the leading US labs, have reasserted their dominance with powerful new versions of their systems ChatGPT and Claude.

But the companies have also been scrambling to shore up their defences over fears that China is using an army of AI spy bots to conduct an “industrial-scale” heist of US technology.

In February, Sam Altman’s OpenAI alleged in a letter to the US Congress that DeepSeek had been conducting “ongoing efforts to free ride on the capabilities developed by OpenAI and other US frontier labs”.

The same month, Anthropic reported that it had uncovered three Chinese labs – DeepSeek, Moonshot and MiniMax – seeking to “illicitly extract Claude’s capabilities to improve their own models”.

The US AI companies have labelled these attempts “distillation attacks”, in which a cut-price AI learns from its expensively trained rival by copying how it answers thousands of questions.

Last week, the issue even reached the White House. Michael Kratsios, Donald Trump’s science and technology director, accused China of a “deliberate, industrial-scale campaign” to “systematically extract capabilities from American AI models”.

DeepSeek, Moonshot and MiniMax did not respond to requests for comment. Chinese officials have rejected the claims and accused the US of the “unjustified suppression of Chinese companies”.

“Chinese distillation attempts represent industrial espionage on a vast scale”, allowing Beijing’s AI champions to train “facsimiles of Western products” on the cheap, says Jack Burnham, a China analyst at the Foundation for Defense of Democracies, a Washington-based think tank.

In a distillation attack, a hostile AI bot enters a chat with a high-end Silicon Valley AI tool, such as the latest version of ChatGPT or Claude, which is nicknamed the “teacher”. The attacker then runs millions of queries and harvests the answers, and uses this to train its own AI.

More sophisticated versions can see the attacker harvest the underlying “reasoning” from the model. The attacker then feeds all this data into a smaller, “student” AI, which mimics the capabilities of the AI it has harvested.
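The teacher–student loop described above can be sketched in miniature. This is a toy illustration only: a plain Python function stands in for the expensive commercial “teacher” chatbot (a real attack would call its web API at scale), and the “student” simply memorises the harvested transcripts rather than fine-tuning a smaller neural network on them.

```python
# Toy sketch of a distillation loop: harvest a "teacher" model's
# answers, then train a cheap "student" to mimic them.

def teacher(prompt: str) -> str:
    """Stand-in for a high-end chatbot: maps prompts to answers."""
    answers = {
        "capital of France?": "Paris",
        "2 + 2?": "4",
        "boiling point of water (C)?": "100",
    }
    return answers.get(prompt, "I don't know.")

def harvest(prompts):
    """Step 1: run many queries against the teacher, keep the answers."""
    return [(p, teacher(p)) for p in prompts]

def train_student(pairs):
    """Step 2: fit a cheap 'student' on the harvested pairs.
    Here 'training' is bare memorisation; a real student would be a
    smaller model fine-tuned on the question/answer transcripts."""
    table = dict(pairs)
    def student(prompt: str) -> str:
        return table.get(prompt, "I don't know.")
    return student

prompts = ["capital of France?", "2 + 2?", "boiling point of water (C)?"]
student = train_student(harvest(prompts))
print(student("capital of France?"))  # mimics the teacher's answer
```

The point the sketch makes is the one the labs object to: the attacker never sees the teacher’s code or weights, only its outputs, yet ends up with a system that reproduces its behaviour on the harvested queries.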

AI distillation has legitimate uses. The major AI labs themselves often use the technique to create cheaper versions of their most expensive products. But when done by a rival, labs insist this amounts to unfair free-loading.

According to Anthropic, the Chinese labs created 24,000 fraudulent accounts that undertook 16 million different chats with Claude to try and copy it.

Google researchers, meanwhile, uncovered a bot that asked 100,000 suspicious queries of its Gemini chatbot in an effort to clone its knowledge.

Chinese labs have nevertheless evaded the defences of America’s labs. Security researchers have found attackers using networks of proxy services, known as “hydra clusters”, to mask the origins of their bots and hide their fake accounts within legitimate traffic.

China has long been accused of secretly copying America’s best inventions. In 2003, Huawei, the telecoms giant, was accused of “verbatim” lifting of code from Cisco’s routers for its own products. A lawsuit between the companies was settled.

‘Intellectual property theft’

More recently, the US has arrested Chinese Silicon Valley workers on espionage charges. In January this year, a former Google engineer was found guilty of stealing hundreds of files of supercomputer plans and attempting to take them to China.

Chinese hackers have also repeatedly engaged in cyber attacks intended to conduct economic espionage against the US. But unlike a cyber attack, distillation heists are conducted through the very apps that tech giants make widely available via the web to their subscribers.

“It is closer to intellectual property theft through the front door,” says Nash Borges, an AI expert at cyber security business Sophos.

While they do not amount to a full data breach or the outright theft of an AI’s code, they represent a major problem for Silicon Valley’s AI labs.

If Chinese companies can train a “student” AI by leeching from US companies, it could erode their technological lead and undermine the hundreds of billions of dollars they have invested in AI data centres.

The attacks “undercut the significant investments of American firms, who have poured billions into building out the infrastructure needed to run high-end AI models”, says Burnham.

He adds that China has been seeking to use its own AI tools to bolster its military. Researchers have found that the Chinese People’s Liberation Army has already been recruiting developers to build AI tools based on products from DeepSeek.

AI distillation also helps Chinese labs work around restrictions on access to powerful AI processors from the likes of Nvidia, allowing them to build functional replicas of American chatbots with far fewer chips. China has already bet on a strategy of making its AI bots cheap and plentiful, undercutting America’s premium apps.

Silicon Valley labs have been ramping up their defences to try and prevent more data harvesting. Anthropic has cut off Chinese access to its technology. It has also built new cyber tools intended to detect unusual usage.

OpenAI, Anthropic and Google have joined an industry group that will share information on potential AI distillation campaigns so they can be cut off.

Republican Congressmen in the US have proposed a law banning such data extraction and sanctioning companies caught doing it.

‘Stealing its source code’

Not everyone is convinced this is necessary. Richard Windsor, a technology analyst, said in a note that the underhand AI tactic falls short of “breaking into” an AI lab and “stealing its source code”.

He says the companies involved can use technical safeguards to block these attacks if they are against their terms.

Some AI critics, meanwhile, have suggested it is somewhat rich for AI labs to accuse their rivals of pilfering their secrets when they have already harvested vast amounts of copyrighted information from the web without the permission of its creators.

Anthropic agreed a $1.5bn settlement with authors last year over claims it had pirated a vast library of books from the internet to train its AI bots.

“How dare they steal the stuff Anthropic stole from human coders,” Elon Musk said in a post on X.

With China playing catch-up in the superintelligence race, Silicon Valley will need to find a way to block Beijing’s AI spies before their chatbot clones leapfrog America’s labs.

[Source: Daily Telegraph]