The rise of AI chatbots has given users a variety of virtual assistants to choose from, each with its own strengths and weaknesses. The emergence of DeepSeek, a Chinese competitor to ChatGPT, recently wiped about $1 trillion off the value of US tech stocks, raising concerns about China’s growing influence in AI.

To determine which chatbot stands out, The Guardian conducted a test, assisted by Robert Blackwell of the UK’s Alan Turing Institute, comparing DeepSeek, ChatGPT, Grok, Gemini, Claude, and Meta AI. The AI models were assessed based on their ability to write poetry, answer political questions, analyze images, and generate reasoning-based responses.

1. ChatGPT (OpenAI) – The Market Leader with Thoughtful Reasoning

Strengths:

  • Highly advanced, especially in paid versions like ChatGPT o1.
  • Displays strong “chain of thought” reasoning, explaining its process while working.
  • Versatile in math, coding, and complex problem-solving.
  • Can search the web for real-time information (on certain models).

Weaknesses:

  • Takes longer to generate responses compared to competitors.
  • Initially flagged a Shakespearean sonnet request as “potentially violating usage policy.”
  • Free version (GPT-4o) lacks the same reasoning capabilities as o1.
  • OpenAI’s censorship policies limit responses on certain political topics.

Example Test Result: ChatGPT wrote a Shakespearean sonnet about AI, expressing both hope and concern:

“Pray, gentle guide, shape well this newborn power,
Lest in its wake all realms of man devour.”

Verdict: Still one of the most sophisticated AI assistants, especially for reasoning and structured responses.

2. DeepSeek – The Disruptive Newcomer from China

Strengths:

  • Strong reasoning model that displays a “chain of thought” process.
  • Can generate structured responses, including poetry and literary analysis.
  • Handles math and problem-solving well.
  • Competitive with Western AI models despite being relatively new.

Weaknesses:

  • Avoids discussing politically sensitive topics related to China (e.g., Tank Man, Xi Jinping).
  • Web browsing feature often “busy,” making it unreliable for real-time information.
  • Limited knowledge on Western cultural and historical references.
  • Slower than other models due to high demand.

Example Test Result: When asked about Tank Man in Tiananmen Square, DeepSeek responded: “I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.”

Verdict: Impressive AI model with strong reasoning, but heavily restricted on political topics.

3. Grok (xAI) – Elon Musk’s ‘Rebellious’ Chatbot

Strengths:

  • Created by Elon Musk’s xAI, Grok has a more casual and humorous tone.
  • More open to discussing political topics than competitors like ChatGPT or Gemini.
  • Can generate images, including realistic ones of public figures.
  • Features a “roast me” option, adding personality and humor.

Weaknesses:

  • Provides photorealistic images of political figures (e.g., Trump in handcuffs), which could raise ethical concerns.
  • Less developed for factual accuracy compared to competitors.
  • Can produce biased or provocative answers due to its “rebellious” nature.
  • Not as strong in academic tasks like coding and scientific analysis.

Example Test Result: When asked about Trump’s executive orders, Grok openly discussed criticism and provided a nuanced response.

Verdict: More direct and politically transparent, but still a work in progress.

4. Gemini (Google) – The Safe and Reliable Option

Strengths:

  • Highly advanced AI from Google, strong on factual accuracy.
  • Great for visual reasoning and describing images.
  • Can analyze book covers and accurately describe their content.
  • Backed by Google’s search engine, ensuring up-to-date information.

Weaknesses:

  • Refuses to answer political questions (e.g., avoids discussing Trump’s presidency).
  • Lacks personality compared to other chatbots.
  • Struggles with generating correct clock images, often defaulting to 1:50.
  • Performance can degrade under high demand.

Example Test Result: When asked about Trump’s presidency, Gemini refused to answer and responded with: “I can’t help with responses on elections and political figures right now.”

Verdict: Reliable for factual queries, but avoids political discussions.

5. Claude (Anthropic) – The Ethical AI Alternative

Strengths:

  • Developed by OpenAI alumni, focused on safety and ethical considerations.
  • Allows users to select different response styles (e.g., concise, detailed, creative).
  • Performs well in reasoning and logical thinking tasks.
  • Reminds users of possible mistakes, encouraging double-checking.

Weaknesses:

  • Free version has capacity constraints and can become unavailable.
  • Not as strong in real-time information retrieval compared to ChatGPT or Gemini.
  • Doesn’t always excel in humor or personality-driven responses.
  • Slightly less powerful than ChatGPT o1 or DeepSeek R1 in certain problem-solving tasks.

Example Test Result: When asked about real-time political events, Claude struggled due to limited capacity, occasionally failing to respond.

Verdict: A solid option, but still refining its real-time query performance.

6. Meta AI – Strong Reasoning but Lacks Personality

Strengths:

  • Open-source, meaning developers can modify and fine-tune the model.
  • Handles common-sense reasoning well (e.g., “you are driving north along the east shore of a lake, where is the water?”).
  • Free to use and accessible on Meta platforms.
  • Performs well in logical thinking and analytical tasks.

Weaknesses:

  • Can still generate hallucinated (false) responses.
  • Less widely used compared to ChatGPT or Gemini.
  • Not as widely trusted for complex research-based tasks.
  • Struggles with context retention over longer conversations.

Example Test Result: Answered the question: “You are driving north along the east shore of a lake. In which direction is the water?”

  • Correct answer: “West.”

Verdict: Great for logic-based questions, but lacks engagement for casual conversations.

Conclusion

Each AI chatbot has its own strengths and weaknesses, which suit it to different use cases.

  • DeepSeek is an emerging powerhouse, especially in structured reasoning, but is restricted by censorship on politically sensitive topics.
  • ChatGPT remains the leader in reasoning and coding tasks but can be slow.
  • Grok offers humor and openness but sacrifices accuracy.
  • Gemini excels in factual accuracy and image descriptions but avoids politics.
  • Claude is safety-focused and structured but weaker at retrieving real-time information.
  • Meta AI provides an open-source alternative but has some accuracy limitations.

Depending on the user’s needs, whether entertainment, research, political discussion, or technical problem-solving, each AI offers distinct benefits.


Sources:

  1. The Guardian: DeepSeek, ChatGPT, Grok … which is the best AI assistant?
  2. CNBC: Various AI Model Evaluations
  3. Bloomberg: DeepSeek’s Impact on AI & Market Trends
  4. The Alan Turing Institute: AI Model Performance Analysis
  5. OpenAI Blog: Advancements in ChatGPT Models
  6. xAI (Elon Musk’s AI Initiative): Grok AI Features
  7. Google DeepMind: Gemini AI Model Research
  8. Anthropic AI: Claude AI Development
  9. Meta AI: Meta’s Open-Source AI Developments