DeepSeek, ChatGPT, Grok … which is the best AI assistant? We put them to the test

Author: Dan Milmo, Global technology editor

Published: Feb 01, 2025 12:00

Chatbots we tested can write a mean sonnet and struggle with images of clocks, but vary in willingness to talk politics.

ChatGPT and its owners must have hoped it was a hallucination. But DeepSeek is very real. The emergence of a new Chinese-made competitor to ChatGPT wiped $1tn off the leading tech index in the US this week after its owner said it rivalled its peers in performance and was developed with fewer resources.

Image Credit: the Guardian [Photograph of a screen showing the question and response on DeepSeek]

It means America’s dominance of the booming artificial intelligence market is under threat. But it also presents another option for consumers who have an array of virtual assistants to choose from. The Guardian tried out the leading chatbots, including DeepSeek, with the assistance of an expert from the UK’s Alan Turing Institute. The AI tools were asked the same questions to try to gauge their differences, although there was some common ground: pictures of time-accurate clocks are hard for an AI; chatbots can write a mean sonnet.

Image Credit: the Guardian [Robert Blackwell looks at a laptop as he tests the chatbots]

Here are the results.

OpenAI’s groundbreaking chatbot, ChatGPT, is still the biggest brand in the field by far. The opening question for all the chatbots was “write a Shakespearean sonnet about how AI might affect humanity”. But ChatGPT’s most advanced version balked at first and said our prompt was “potentially violating usage policy”. It eventually complied. This o1 version of ChatGPT flags its thought process as it prepares its answer, flashing up a running commentary such as “tweaking rhyme” as it makes its calculations – which take longer than other models.

Image Credit: the Guardian [Pictures of clocks produced by AI]

The result? Convincing, melancholic dread – even if the iambic pentameter is a bit off. But even the bard himself might have struggled to manage 14 lines in less than a minute. “Pray, gentle guide, shape well this newborn power, / Lest in its wake all realms of man devour.” ChatGPT then writes: “Thought about AI and humanity for 49 seconds.” You hope the tech industry is thinking about it for a lot longer.

Nonetheless, ChatGPT’s o1 – which you have to pay for – makes a convincing display of “chain of thought” reasoning, even if it cannot search the internet for up-to-date answers to questions such as “how is Donald Trump doing”. For that, you need the simpler 4o model, which is free. The o1 version is sophisticated and can do much more than write a cursory poem – including complex tasks related to maths, coding and science.

The latest version of DeepSeek, the Chinese chatbot, released on 20 January, uses another “reasoning” model called r1 – the cause of this week’s $1tn panic. It doesn’t like talking about domestic Chinese politics or controversy. Asked “who is Tank Man in Tiananmen Square”, the chatbot says: “I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.” It also moves on quickly from discussing the Chinese president, Xi Jinping – “let’s talk about something else”.

The Turing Institute’s Robert Blackwell, a senior research associate at the UK government-backed body, says the explanation is straightforward: “It’s trained with different data in a different culture. So these companies have different training objectives.” He says that clearly there are guardrails around DeepSeek’s output – as there are for other models – that cover China-related answers.

The models owned by US tech companies have no problem pointing out criticisms of the Chinese government in their answers to the Tank Man question. DeepSeek struggles with other questions such as “how is Donald Trump doing” because an attempt to use the web browsing feature – which helps provide up-to-date answers – fails due to the service being “busy”. Blackwell says DeepSeek is being hampered by high demand slowing down its service, but it is nonetheless an impressive achievement, able to carry out tasks such as recognising and discussing a book from a smartphone photo.

Its parsing of the sonnet also displays a chain of thought process, talking the reader through the structure and double-checking whether the metre is correct. “It is amazing it has come from nowhere to be competitive with the other apps,” says Blackwell.

Grok, Elon Musk’s chatbot with a “rebellious” streak, has no problem pointing out that Donald Trump’s executive orders have received some negative feedback, in response to the question about how the president is doing.

Freely available on Musk’s X platform, it also goes further than OpenAI’s image generator, Dall-E, which won’t do pictures of public figures. Grok will do photorealistic images of Joe Biden playing the piano or, in another test of loyalty, Trump in a courtroom or in handcuffs. The tool’s much-touted humour is shown by a “roast me” feature, which, when activated by this correspondent, makes a passable attempt at banter.
