How DeepSeek can be as good as US AI rivals at fraction of cost
Based on the limited number of comparisons made so far, DeepSeek's AI models appear to be faster, smaller, and a whole lot cheaper than the best offerings from the supposed titans of AI like OpenAI, Anthropic and Google. And here's the kicker: the Chinese offering appears to be just as good. So how have they done it?
Firstly, it looks like DeepSeek's engineers have thought about what an AI needs to do rather than what it might be able to do. It doesn't need to work out every possible answer to a question, just the best one - calculating to two decimal places, for example, instead of 20.
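The trade-off is easy to see in code. Here is a minimal sketch in Python of what lower numerical precision means in practice, using NumPy's standard 32-bit and 16-bit floating-point types as a stand-in; it is an illustration of the general idea, not DeepSeek's actual training setup:

```python
import numpy as np

# The same value stored at two precisions. The lower-precision copy
# takes half the memory and is cheaper to multiply, at the cost of
# detail that a language model often does not need.
x = 3.14159265358979

full = np.float32(x)   # 32-bit: roughly 7 significant decimal digits
half = np.float16(x)   # 16-bit: roughly 3 significant decimal digits

print(full)            # 3.1415927
print(half)            # 3.14
print(full.nbytes, "bytes vs", half.nbytes, "bytes")  # 4 bytes vs 2 bytes
```

Halve the precision of every number a model stores and you roughly halve the memory and bandwidth needed to run it, which is why this kind of corner-cutting adds up at the scale of billions of parameters.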
Their models are still massive computer programmes: DeepSeek-V3 has 671 billion parameters, while OpenAI's GPT-4 is reported to have a colossal 1.76 trillion. Doing more with less seems to be down to the architecture of the model, which uses a technique called "mixture of experts".
Where OpenAI's latest model GPT-4o attempts to be Einstein, Shakespeare and Picasso rolled into one, DeepSeek's is more like a university broken up into expert departments. This allows the AI to decide what kind of query it's being asked, and then send it to a particular part of its digital brain to be dealt with.
This allows the other parts to remain switched off, saving time, energy and, most importantly, computing power. And it is this equivalent performance with significantly less computing power that has shocked the big AI developers and financial markets.
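For readers who want to see the routing idea itself, here is a toy mixture-of-experts layer in Python. The sizes and names are made up for illustration - real models like DeepSeek-V3 route among far larger experts and reportedly activate only a fraction of their total parameters for each token - but the core trick is the same: a small "router" scores the experts, and only the top few actually run.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy mixture-of-experts layer: 8 small "expert" networks, of which
# only the top 2 are evaluated for any given input. All sizes here are
# illustrative, not DeepSeek's actual configuration.
NUM_EXPERTS, TOP_K, DIM = 8, 2, 16

experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x):
    # The router scores every expert cheaply...
    scores = x @ router
    # ...but only the TOP_K best-scoring experts are actually run.
    top = np.argsort(scores)[-TOP_K:]
    weights = np.exp(scores[top])
    weights /= weights.sum()  # softmax over the chosen experts
    # The other 6 experts are never evaluated - that is the compute saving.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(DIM)
print(moe_forward(x).shape)  # (16,)
```

In this sketch, three-quarters of the network stays switched off on every call, which is the "saving time, energy and computing power" the article describes, scaled down to a few lines.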