
OpenAI’s New Codex-Spark AI Codes Faster on Cerebras, Ditching Nvidia

by Sophie Lin - Technology Editor

OpenAI has unveiled GPT-5.3-Codex-Spark, a new coding model designed for speed and responsiveness, marking a significant departure from its reliance on Nvidia hardware. The model, powered by chips from Cerebras Systems, generates code at more than 1,000 tokens per second, a substantial speed gain for developers building software. The move signals a broader strategy by OpenAI to diversify its AI infrastructure and optimize for specific workloads, particularly those demanding low latency.

The release of Codex-Spark comes at a pivotal moment in the AI landscape, where the ability to rapidly generate and iterate on code is becoming a key differentiator. AI coding assistants, like OpenAI’s Codex and Anthropic’s Claude Code, are increasingly used for prototyping, interface development, and generating boilerplate code. The race to deliver faster, more efficient coding tools is intensifying, with latency – the time it takes for the model to respond – emerging as a critical factor for developer productivity. This focus on speed is driving innovation in both model architecture and the underlying hardware.

While OpenAI maintains that GPUs remain foundational to its overall AI strategy, the partnership with Cerebras represents a strategic complement. According to Sachin Katti, head of compute at OpenAI, Cerebras excels at workflows requiring extremely low latency, enhancing the responsiveness of coding tools like Codex-Spark. This is particularly important for real-time coding scenarios where immediate feedback is crucial for a seamless developer experience.

Cerebras Hardware and Performance Metrics

Codex-Spark runs on Cerebras’ Wafer Scale Engine 3 (WSE-3), a wafer-scale chip designed for AI workloads. Cerebras announced the partnership with OpenAI in January, and Codex-Spark is the first product to emerge from the collaboration. While Codex-Spark delivers over 1,000 tokens per second, Cerebras has demonstrated even higher throughput with other models: 2,100 tokens per second on Llama 3.1 70B and 3,000 tokens per second on OpenAI’s own gpt-oss-120B. This suggests that the comparatively lower speed of Codex-Spark reflects the model’s specific design and optimization for coding tasks rather than a limitation of the Cerebras hardware itself.
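Taken at face value, those throughput figures translate directly into wall-clock generation time. A quick back-of-envelope sketch using the speeds cited above (the 3,000-token output length is an illustrative assumption, not a figure from the article):

```python
# Reported throughputs, in tokens per second, from the article.
speeds_tps = {
    "Codex-Spark": 1000,
    "Llama 3.1 70B": 2100,
    "gpt-oss-120B": 3000,
}

tokens = 3000  # illustrative output size, roughly a few hundred lines of code

for name, tps in speeds_tps.items():
    # Time to emit the whole output = tokens / throughput
    print(f"{name}: {tokens / tps:.1f} s to emit {tokens} tokens")
```

At these rates the difference between models is about two seconds on a file-sized output, which matters less for batch generation than for the interactive, keystroke-level feedback loops Codex-Spark targets.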

OpenAI has been actively working to reduce its dependence on Nvidia over the past year. In October 2025, the company signed a multi-year deal with AMD, followed by a $38 billion cloud computing agreement with Amazon in November. OpenAI is developing its own custom AI chip, slated for fabrication by TSMC. These moves indicate a deliberate effort to build a more diversified and resilient AI infrastructure.

Shifting Dynamics in the AI Chip Market

The shift away from exclusive reliance on Nvidia comes after a planned $100 billion infrastructure deal with the chipmaker reportedly stalled. While Nvidia has since committed to a $20 billion investment, Reuters reported that OpenAI expressed concerns about the speed of Nvidia chips for inference tasks – the process of using a trained model to generate outputs – which is precisely the area where Codex-Spark aims to excel.

Still, OpenAI emphasizes that GPUs will continue to play a vital role in its operations. A spokesperson stated that GPUs remain “foundational” for training and broad usage, while Cerebras complements this foundation by providing specialized capabilities for low-latency workloads. This nuanced approach suggests OpenAI is seeking to leverage the strengths of different hardware architectures to optimize performance across a range of AI applications.

Codex-Spark is currently available as a research preview to ChatGPT Pro subscribers ($200/month) through the Codex app, command-line interface, and VS Code extension, with API access being rolled out to select design partners. The model features a 128,000-token context window and currently supports text-only input.
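For developers who eventually get API access, the practical question is effective client-side throughput rather than the vendor's headline number. Below is a minimal, self-contained sketch of how one might measure tokens per second from a streamed response; the stub generator stands in for a real API stream (no actual OpenAI call is made, and no real model identifier or SDK signature is assumed):

```python
import time
from typing import Iterator, Tuple


def measure_stream(chunks: Iterator[str]) -> Tuple[int, float]:
    """Consume a streamed response and return (token_count, tokens_per_sec)."""
    start = time.monotonic()
    count = 0
    for _ in chunks:
        count += 1
    elapsed = time.monotonic() - start
    return count, (count / elapsed if elapsed > 0 else float("inf"))


def fake_stream(n: int = 100, delay: float = 0.001) -> Iterator[str]:
    """Stub standing in for an API stream: n tokens, ~1 ms apart."""
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"


count, tps = measure_stream(fake_stream())
print(f"{count} tokens at ~{tps:.0f} tokens/sec")
```

In a real client, `chunks` would be the iterator returned by the provider's streaming API; measuring at this layer captures network and SDK overhead, which is exactly what end-to-end latency claims gloss over.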

Looking Ahead: The Future of AI-Powered Coding

The launch of GPT-5.3-Codex-Spark underscores the rapid evolution of AI-powered coding tools and the growing importance of hardware diversification in the AI industry. As OpenAI and its competitors continue to push the boundaries of code generation, the demand for specialized hardware capable of delivering low-latency performance will likely increase. The success of this partnership between OpenAI and Cerebras will be closely watched as a potential model for future collaborations in the AI ecosystem. The focus will now shift to evaluating the real-world impact of this new model on developer workflows and the overall speed of software development.

What are your thoughts on the increasing role of AI in software development? Share your insights in the comments below.
