Key Points
- Chinese startup DeepSeek became the top-rated free application on the App Store on Monday, overtaking ChatGPT.
- DeepSeek says its free assistant uses less data at a fraction of the cost of its competitors’ models.
- Analysts have disputed DeepSeek’s claims about its total training costs for its V3 and R1 models.
Chinese startup DeepSeek says it will temporarily limit registrations due to a cyberattack, after the company’s artificial intelligence assistant surged in popularity.
On Monday, the company was also hit by outages on its website after its AI assistant became the top-rated free application available on Apple’s App Store in the United States, overtaking rival ChatGPT.
The company resolved issues relating to its application programming interface and users’ inability to log in to the website, according to its status page.
The outages on Monday were the company’s longest in about 90 days and coincided with its skyrocketing popularity.
DeepSeek’s AI
Last week, DeepSeek launched a free assistant it says uses less data at a fraction of the cost of competitors’ models, possibly marking a turning point in the level of investment needed for AI.
Powered by the DeepSeek-V3 model, which its creators say “tops the leaderboard among open-source models and rivals the most advanced closed-source models globally”, the AI application has surged in popularity among US users since it was released on 10 January, according to app data research firm Sensor Tower.
The release of OpenAI’s ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence.
But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between US and Chinese firms.
The quality and cost efficiency of DeepSeek’s models have turned this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and US tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta’s most advanced models, the Chinese startup has said.
Controversy over chips
AI models from ChatGPT to DeepSeek require advanced chips to power their training.
The US administration of former president Joe Biden had since 2021 widened the scope of bans designed to stop these chips from being exported to China and used to train Chinese firms’ AI models.
However, DeepSeek researchers wrote in a paper last month that DeepSeek-V3 was trained on Nvidia’s H800 chips at a cost of less than $US6 million ($9.5 million).
Although this detail has since been disputed, the claim that the chips used were less powerful than the most advanced Nvidia products the US has sought to keep out of China, as well as the relatively cheap training costs, has prompted US tech executives to question the effectiveness of tech export controls.
What is DeepSeek?
DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng, co-founder of quantitative hedge fund High-Flyer, according to Chinese corporate records.
Liang’s fund announced in March 2023 on its official WeChat account that it was “starting again”, going beyond trading to concentrate resources on creating a “new and independent research group, to explore the essence of AGI” (Artificial General Intelligence). DeepSeek was created later that year.
It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, according to Chinese corporate records.
High-Flyer’s AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips.
But some have publicly expressed scepticism about DeepSeek’s success story.
Alexandr Wang, the CEO of Scale AI — a company that provides training data for machine learning models — said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips (the highest-powered Nvidia chips on the market).
He claimed these chips would not be disclosed because that would violate Washington’s export controls that ban such advanced AI chips from being sold to Chinese companies.
DeepSeek did not immediately respond to a request for comment on the allegation.
On Monday, analysts from Bernstein Research said in a research note that DeepSeek’s total training costs for its V3 model were unknown but much higher than the figure the startup claimed was spent on computing power. The analysts also said the training costs of the equally acclaimed R1 model were not disclosed.