OpenAI Releases GPT‑5.3‑Codex‑Spark for Real-Time Coding



OpenAI today unveiled a research preview of GPT‑5.3‑Codex‑Spark, a smaller, speed-optimized version of GPT‑5.3‑Codex designed for real-time coding. The announcement comes just a few days after the company introduced GPT‑5.3‑Codex to take on Anthropic’s Claude Opus 4.6.

Notably, the release of GPT‑5.3‑Codex‑Spark marks the first milestone in OpenAI’s collaboration with Cerebras, which aims to deliver coding results almost instantly on ultra-low-latency hardware. Codex-Spark can produce more than 1,000 tokens per second while remaining capable of real-world development tasks.

Codex-Spark Optimized for Speed and Interactivity

Unlike previous Codex models, Codex-Spark focuses on interactive coding. It’s designed to make specific edits, refine logic, and update interfaces in real time. The model has a 128k token context window and currently supports text-only interactions. Users can collaborate with the model as it works, interrupting or redirecting it instantly, which makes it suitable for rapid iteration and hands-on coding experiments.

Early benchmarks show Codex-Spark completing tasks far faster than its predecessor while maintaining strong accuracy. On SWE-Bench Pro and Terminal-Bench 2.0, the model reduced task durations significantly, confirming that speed improvements are not just token-level but across the full request-response pipeline. OpenAI also introduced WebSocket-based persistent connections, cutting per-token overhead by 30% and time-to-first-token by 50%.
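To see how the reported reductions could add up over a whole response, here is a minimal Python sketch. It assumes a simple additive latency model (time-to-first-token plus a fixed cost per token), which is an illustration and not OpenAI's methodology. The `LatencyProfile` class, both helper functions, and every baseline number are hypothetical; only the 30% and 50% reductions and the roughly 1,000 tokens per second figure come from the article.

```python
from dataclasses import dataclass


@dataclass
class LatencyProfile:
    """Hypothetical per-request timing figures, all in seconds."""
    time_to_first_token: float   # delay before the first token arrives
    per_token_overhead: float    # transport/serving overhead added per token
    per_token_generation: float  # pure model generation time per token


def response_time(profile: LatencyProfile, n_tokens: int) -> float:
    """Total time to stream n_tokens under a simple additive model."""
    per_token = profile.per_token_overhead + profile.per_token_generation
    return profile.time_to_first_token + n_tokens * per_token


def apply_reported_reductions(profile: LatencyProfile) -> LatencyProfile:
    """Apply the reductions quoted in the article:
    30% less per-token overhead and 50% less time-to-first-token."""
    return LatencyProfile(
        time_to_first_token=profile.time_to_first_token * 0.5,
        per_token_overhead=profile.per_token_overhead * 0.7,
        per_token_generation=profile.per_token_generation,
    )


# Assumed baseline: illustrative numbers, not measurements.
baseline = LatencyProfile(
    time_to_first_token=0.4,
    per_token_overhead=0.0005,
    per_token_generation=0.001,  # 1 ms per token, i.e. 1,000 tokens per second
)
improved = apply_reported_reductions(baseline)

for n in (50, 500, 5000):
    before = response_time(baseline, n)
    after = response_time(improved, n)
    print(f"{n:>5} tokens: {before:.3f}s -> {after:.3f}s")
```

Under these assumed numbers, short replies such as a small edit gain mostly from the lower time-to-first-token, while long outputs gain mostly from the lower per-token overhead. That is one way to read the claim that the speedup covers the full request-response pipeline and not only token generation.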

Powered by Cerebras, Complementing GPU Infrastructure

Codex-Spark runs on Cerebras’ Wafer Scale Engine 3, a purpose-built AI chip for ultra-low-latency inference. GPUs remain central for training and broad usage, while Cerebras handles workflows where minimal lag matters most. Sean Lie, Cerebras CTO, emphasized the model’s potential to reshape developer workflows, introducing new interaction patterns and use cases.

Codex-Spark is available now as a research preview for ChatGPT Pro users, with its own separate rate limits during the preview. OpenAI plans to expand access gradually and use feedback to refine the model’s real-time capabilities, while keeping safeguards in place.
