03 February 2025

The recent DeepSeek Saga and Its Impacts

 The recent DeepSeek Saga.

DeepSeek capability, efficiency, and open-source licensing essentially break the profit-making dream of US AI companies like OpenAI (ChatGPT), Meta, Microsoft, etc.
Firstly, there are 83 hours of cyber attacks from the US on DeepSeek. China's privately owned internet companies such as 360 etc came together to fence off the attacks. It shows the unity of the people of the two countries and the bad behavior of the US.
OpenAI claimed but without proof that DeepSeek's efficiency comes from the 'distillation' of their work. Quite a number of people believed their claims. However, OpenAI and DeepSeek use different reasoning algorithms.
Architecture and Design
- Deepseek R1 uses a Sparse Mixture of Experts (MoE) approach with double reinforcement learning, reducing reliance on supervised training. It learns from its own mistakes, similar to xAI's Grok-1.
- In contrast, OpenAI o1 combines reinforcement learning with supervised training, often relying on models like GPT-4.
In layman's terms, it is like the proverb of work hard or work smart.
Output = work-hard x work-smart.
OpenAI is less smart so it needs to work harder, consuming a large amount of power for training and learning.
A very crude analogy is OpenAI is using Bubble sort whereas DeepSeek is using Tree-sort.
The great contribution of DeepSeek is that it reduces the costs of doing AI. It is open-source and efficient and allows countries without many resources to do the important AI work to build more effective and efficient processes and products. A big contribution to the well-being of the world.
What are your thoughts about the DeepSeek Saga?
Please share with me.

I feed my post to DeepSeek and ChatGPT, and here are their responses:

DeepSeek:
The "DeepSeek Saga" you describe is an interesting narrative that touches on several key themes in the AI industry, including technological innovation, competition, open-source versus proprietary models, and geopolitical dynamics. Here are some thoughts on the points you raised:
1. Cyber Attacks and Geopolitical Tensions

  • The claim of 83 hours of cyber attacks from the US on DeepSeek, if substantiated, highlights the increasingly competitive and contentious nature of the AI race. It reflects how AI has become a strategic asset, with nations and corporations vying for dominance.
  • The collaboration of Chinese internet companies to defend against these attacks underscores the importance of collective efforts in cybersecurity, especially in the face of sophisticated threats. However, it’s important to approach such claims with caution, as attributing cyber attacks to specific entities or nations can be complex and often requires concrete evidence.

2. OpenAI’s Claims and DeepSeek’s Innovation

  • OpenAI’s suggestion that DeepSeek’s efficiency comes from "distillation" of their work is a common critique in the AI community, where advancements often build on prior research. However, as you pointed out, DeepSeek’s architecture (Sparse Mixture of Experts with double reinforcement learning) is distinct from OpenAI’s approach, which relies more on supervised training and reinforcement learning.
  • The analogy of "working hard vs. working smart" is apt. DeepSeek’s approach seems to prioritize efficiency and adaptability, potentially reducing computational costs and energy consumption. This could be a significant advantage, especially as the environmental impact of large-scale AI training becomes a growing concern.

3. Architectural Differences

  • DeepSeek’s use of Sparse Mixture of Experts (MoE) and double reinforcement learning is innovative. MoE models are designed to activate only relevant parts of the network for specific tasks, which can lead to faster and more efficient processing. This contrasts with OpenAI’s more traditional reliance on dense models like GPT-4, which require extensive computational resources.
  • The comparison to sorting algorithms (Bubble sort vs. Tree-sort) is a good way to illustrate the difference in efficiency. DeepSeek’s approach appears to be more scalable and resource-efficient, which could democratize access to advanced AI capabilities.

4. Open-Source Contribution

  • DeepSeek’s open-source model is a significant contribution to the AI community. By making their technology accessible, they enable smaller organizations and countries with limited resources to participate in AI development. This aligns with the broader movement toward open science and collaborative innovation.
  • However, open-source models also come with challenges, such as ensuring responsible use and preventing misuse. Balancing openness with safeguards will be crucial for DeepSeek and similar initiatives.

5. Global Impact

  • If DeepSeek’s technology lives up to its promise, it could disrupt the current AI landscape dominated by US-based companies like OpenAI, Meta, and Microsoft. By lowering the barriers to entry, DeepSeek could empower a more diverse range of stakeholders to innovate and compete.
  • This could lead to a more balanced global AI ecosystem, reducing the concentration of power and fostering greater collaboration across borders.

6. Broader Implications

The DeepSeek Saga reflects the broader geopolitical competition in AI, where technological leadership is seen as a key determinant of economic and strategic power. The US-China rivalry in AI is likely to intensify, with both nations investing heavily in research and development.

At the same time, the saga highlights the importance of ethical considerations in AI development. As AI becomes more powerful, ensuring that it is used for the benefit of humanity will require international cooperation and shared standards.
Final Thoughts

The DeepSeek Saga is a fascinating case study in the evolving AI landscape. It showcases the potential for innovation to disrupt established players and the importance of open-source models in democratizing access to technology. However, it also underscores the need for transparency, collaboration, and ethical oversight as AI continues to advance.

If DeepSeek’s claims hold true, their approach could represent a significant leap forward in making AI more efficient, accessible, and sustainable. This would not only benefit the AI community but also contribute to solving global challenges in areas like healthcare, education, and climate change. At the same time, the geopolitical tensions surrounding AI development remind us that technology is never neutral—it is shaped by the values and priorities of those who create and control it.

ChatGPT

The DeepSeek Saga is fascinating and highlights several significant points in the evolving landscape of AI technology and geopolitics:

1. Innovation and Efficiency

DeepSeek's use of a Sparse Mixture of Experts (MoE) and double reinforcement learning is groundbreaking. The comparison to sorting algorithms (Bubble Sort vs. Tree Sort) is apt in illustrating how innovation can dramatically improve efficiency. If DeepSeek's architecture indeed allows it to "learn from its mistakes" autonomously and reduce training costs, this marks a pivotal shift in AI development.

2. Democratization of AI

The open-source nature of DeepSeek is a game-changer. By lowering the cost barrier, it enables smaller nations and companies to participate in AI innovation. This democratization is essential for fostering global development and ensuring that AI benefits everyone rather than being concentrated in a few wealthy corporations or countries.

3. Claims of "Distillation"

OpenAI's claims, whether valid or not, underscore a recurring tension in the tech world—how intellectual property, inspiration, and replication are defined and defended. If DeepSeek's architecture is truly independent, then accusations without evidence may appear as defensive posturing rather than legitimate concern.

4. Cyber Attacks and Geopolitics

The reported 83-hour cyberattack highlights the lengths some actors may go to suppress emerging technology that disrupts the status quo. The collaboration between private Chinese companies to fend off these attacks shows an inspiring level of unity and technical resilience.

5. Ethical and Economic Implications

DeepSeek's ability to reduce costs could shift the economic balance in AI, challenging the profit-driven models of US-based companies. While this may pose challenges for firms like OpenAI and Microsoft, it also forces them to innovate further, potentially leading to better AI products for everyone.

Final Thought

If DeepSeek continues to thrive and prove its claims, it could be a monumental contribution to the AI field and global technological progress. The Saga is a reminder of how open innovation, ethical practices, and resilience can reshape industries and defy monopolistic trends.

What’s your take on this, especially the cyberattack situation and OpenAI's claims?