According to people familiar with the matter, OpenAI, the global AI leader, has recently been actively and systematically seeking alternatives to NVIDIA compute. The move stems from disappointment with NVIDIA's latest generation of AI chips, particularly their response speed in certain inference workloads.
Core Pain Point: Slow Inference Hinders User Experience
OpenAI has found that in scenarios such as code generation and interaction with complex software systems, the response speed of its current hardware has become a bottleneck:
Shift in Strategic Focus: OpenAI is shifting its attention from "training" models to "inference" (the process of generating answers for end users).
Latency and Throughput: Performance during the inference phase directly affects user experience and operating costs. In tasks that demand high bandwidth and low latency, traditional GPU architectures incur delays from frequent access to external memory, leaving the chip "waiting for data" for much of the time (a back-of-envelope sketch follows this list).
High Demands from Professional Users: CEO Sam Altman has pointed out that professional users such as developers are extremely sensitive to the generation speed of coding models, and that the current hardware architecture limits the user experience of the related products.
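How severe the memory bottleneck is can be seen with a rough, roofline-style estimate of a single decode step. The sketch below is a minimal illustration using assumed figures for parameter count, memory bandwidth, and compute throughput (none of them tied to any specific OpenAI model or NVIDIA product); it only shows why, at batch size 1, streaming the weights from external memory takes far longer than the arithmetic itself.

```python
# Back-of-envelope estimate of why single-stream decoding is memory-bound.
# All numbers below are illustrative assumptions, not measured figures for
# any specific OpenAI model or NVIDIA product.

def decode_step_estimate(n_params: float, bytes_per_param: float,
                         mem_bw_bytes_per_s: float, peak_flops: float) -> dict:
    """Estimate one autoregressive decode step for a dense model at batch size 1."""
    weight_bytes = n_params * bytes_per_param      # every weight is read once per token
    flops = 2.0 * n_params                         # ~2 FLOPs per parameter (multiply-add)
    t_memory = weight_bytes / mem_bw_bytes_per_s   # time to stream weights from memory
    t_compute = flops / peak_flops                 # time to do the math
    return {
        "arithmetic_intensity_flop_per_byte": flops / weight_bytes,
        "t_memory_ms": t_memory * 1e3,
        "t_compute_ms": t_compute * 1e3,
        "memory_bound": t_memory > t_compute,
    }

# Hypothetical 70B-parameter model in FP16 on a GPU with an assumed ~3 TB/s of
# external memory bandwidth and ~1 PFLOP/s of usable FP16 compute.
print(decode_step_estimate(
    n_params=70e9,
    bytes_per_param=2,
    mem_bw_bytes_per_s=3e12,
    peak_flops=1e15,
))
# Streaming ~140 GB of weights takes tens of milliseconds, while the matrix
# math itself takes well under a millisecond: the chip spends most of each
# token "waiting for data", which is exactly the bottleneck described above.
```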
Alternative Solutions: Partnering with Emerging Players in Inference Acceleration
To address the hardware bottleneck, OpenAI plans to introduce new hardware to handle roughly 10% of its future inference compute needs:
Collaborating with Cerebras: OpenAI has already partnered with Cerebras, whose architecture integrates a large amount of static RAM (SRAM) on a single chip, drastically shortening the memory access path and improving response speed (see the comparative sketch after this list).
Discussions with Groq: The company had previously approached Groq, seeking to draw on its expertise in inference acceleration to optimize AI systems such as chatbots.
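The appeal of keeping weights in on-chip SRAM can be sketched with the same memory-bound model: if per-token latency is roughly model bytes divided by effective memory bandwidth, then raising that bandwidth by orders of magnitude raises the ceiling on response speed in proportion. The bandwidth figures below are hypothetical placeholders, not published Cerebras, Groq, or NVIDIA specifications.

```python
# Same memory-bound latency model as above, applied to two hypothetical
# memory systems. Bandwidth figures are assumptions for illustration only.

def tokens_per_second(model_bytes: float, effective_bw_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode rate when weight streaming dominates."""
    return effective_bw_bytes_per_s / model_bytes

MODEL_BYTES = 70e9 * 2  # hypothetical 70B-parameter model in FP16 (~140 GB)

profiles = {
    "external DRAM/HBM (assumed ~3 TB/s)": 3e12,
    "on-chip SRAM (assumed ~100x higher)": 3e14,
}

for name, bw in profiles.items():
    print(f"{name}: <= {tokens_per_second(MODEL_BYTES, bw):,.0f} tokens/s per stream")
# The point is not the exact numbers but the scaling: if the memory wall is the
# bottleneck, a shorter, wider path to the weights raises the ceiling on
# response speed in direct proportion.
```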
Power Struggle: A Once-Certain Investment Is Now in Doubt
This shift in technology strategy has made OpenAI's relationship with its long-time core supplier NVIDIA more delicate:
Stalled $100 Billion Deal: The two sides had been negotiating a $100 billion investment and supply agreement (with NVIDIA effectively trading chips for equity), but the talks have stalled for several months.
Diversified Procurement: OpenAI has signed new GPU procurement or cooperation agreements with other vendors such as AMD, further reducing its reliance on a single supplier.
Competitive Pressure: In contrast, Anthropic's Claude and Google's Gemini rely more heavily on Google's in-house TPUs, which have inherent advantages in inference workloads, putting significant pressure on NVIDIA.
Although both sides maintain a positive front in public, and NVIDIA CEO Jensen Huang has strongly denied rumors of discord, as OpenAI begins placing real orders for third-party inference chips, the "one dominant player, many strong challengers" structure of the AI compute market is headed for a shake-up.
