
GPT-5.6 Sol: OpenAI previews its new flagship model under limited access

OpenAI has previewed the GPT-5.6 family: Sol, Terra and Luna. The flagship model is starting with limited access through API and Codex while OpenAI completes additional safety evaluations.

[Image: GPT-5.6 Sol graphic with a solar core, data grid and pricing labels for Sol, Terra and Luna]

OpenAI has previewed GPT-5.6 Sol, the most capable model in its new GPT-5.6 family. This is not a normal consumer launch where every ChatGPT user immediately gets a new model picker option. OpenAI is calling it a preview: limited access for selected partners and the US government through the API and Codex, combined with additional safety evaluation before broader availability.

This article is based primarily on OpenAI's official announcement and the GPT-5.6 Preview system card.

That framing matters. OpenAI is trying to send two messages at once. First, model progress is still moving quickly: Sol is positioned as a stronger system for coding, agentic tasks, scientific work and cybersecurity. Second, the most capable models are no longer just consumer products. They now sit inside a broader conversation about deployment controls, government review, export-sensitive capabilities and safety monitoring.

The announcement introduces three model tiers: Sol, Terra and Luna. The names are simple, but the product structure is telling. Sol is the flagship model, Terra is the cheaper high-capability tier, and Luna is the most affordable option for scaled production use.

What OpenAI actually announced

The main point is straightforward: GPT-5.6 Sol is beginning with limited preview access. OpenAI says selected partners and the US government can access GPT-5.6 through the API and Codex starting June 26, 2026. The company has not made Sol broadly available inside ChatGPT during this preview.

For everyday users, that means GPT-5.6 Sol may already be used in real development and research environments without appearing on every ChatGPT account. For technical teams, the signal is clearer: OpenAI is first positioning Sol as a model for harder work, including coding, analysis, scientific reasoning and longer agentic workflows.

OpenAI says broader availability will come after additional safety review. The company also references CATO, a process involving the US government for evaluating models with especially high capabilities. That is part of a larger shift in the AI industry: frontier model launches are increasingly shaped not only by product readiness, but also by formal deployment governance.

Sol, Terra and Luna: why three models

OpenAI did not announce one model. It announced a three-tier family:

| Model | Input price per 1M tokens | Cached input | Output price per 1M tokens | Natural use case |
|---|---|---|---|---|
| GPT-5.6 Sol | $5.00 | $0.50 | $30.00 | hardest tasks, coding, science, agentic workflows |
| GPT-5.6 Terra | $2.50 | $0.25 | $15.00 | production use with high quality and lower cost |
| GPT-5.6 Luna | $1.00 | $0.10 | $6.00 | scale, cheaper workloads, routing less demanding prompts |

This is a practical product decision. The model market is no longer a single leaderboard where one model wins everything. Real deployments increasingly use model routing: cheaper models handle simple work, mid-tier models handle more demanding cases, and the flagship model is reserved for problems where quality is worth the extra cost.
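To make the routing idea concrete, here is a minimal sketch in Python. It is an illustration, not anything OpenAI has described: the `difficulty` score, the two thresholds and the lowercase tier labels are all assumptions made for this example, and the labels are not confirmed API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    prompt: str
    # Hypothetical difficulty score in [0, 1], e.g. produced by a cheap classifier.
    difficulty: float


def route(task: Task, mid_threshold: float = 0.4, high_threshold: float = 0.8) -> str:
    """Return the cheapest tier label whose band covers the task's difficulty.

    The labels "luna", "terra" and "sol" mirror the article's tier names;
    they are placeholders, not real model IDs.
    """
    if not 0.0 <= task.difficulty <= 1.0:
        raise ValueError("difficulty must be within [0, 1]")
    if task.difficulty >= high_threshold:
        return "sol"    # flagship: reserved for the hardest work
    if task.difficulty >= mid_threshold:
        return "terra"  # mid tier: demanding but routine cases
    return "luna"       # cheapest tier: simple, high-volume prompts


tasks = [
    Task("Classify this support ticket", 0.1),
    Task("Summarise a long contract and flag risks", 0.5),
    Task("Refactor a multi-service codebase and fix failing tests", 0.9),
]
for task in tasks:
    print(f"{route(task):5s} <- {task.prompt}")
```

In practice the hard part is the difficulty estimate, not the branching. A real router would probably also escalate to a higher tier when a cheaper model's answer fails a quality check.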

That makes Sol the headline model, but Terra and Luna may be just as important commercially. If Terra preserves much of Sol's quality at half the price, it could become the tier companies actually deploy at scale. Luna looks more suited to high-volume automation: classification, summarisation, simpler analysis, embedded assistants and workloads where every output token matters.
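The price gap is easier to judge with a quick calculation. The sketch below uses the prices from the table above and a hypothetical workload. It assumes the cached-input price is also quoted per 1M tokens and that cached tokens are billed at the cached rate instead of the regular input rate; neither detail is spelled out in this article.

```python
# USD per 1M tokens, copied from the pricing table in this article.
# Assumption: the "Cached input" column is also quoted per 1M tokens.
PRICES = {
    "sol": {"input": 5.00, "cached_input": 0.50, "output": 30.00},
    "terra": {"input": 2.50, "cached_input": 0.25, "output": 15.00},
    "luna": {"input": 1.00, "cached_input": 0.10, "output": 6.00},
}


def request_cost(tier: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the USD cost of a single request.

    Assumption: cached tokens are a subset of the input tokens and are
    billed at the cached rate instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[tier]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * price["cached_input"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000


# Hypothetical request: 2,000 input tokens (1,500 of them cached), 500 output tokens.
for tier in PRICES:
    cost = request_cost(tier, input_tokens=2_000, output_tokens=500, cached_tokens=1_500)
    print(f"{tier:5s} ${cost:.6f} per request")
```

Because every Terra price is exactly half of Sol's and every Luna price is one fifth, the ratio between tiers stays the same whatever the token mix. The workload shape changes the absolute bill, not the relative saving.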

The biggest focus: coding and long-running tasks

OpenAI positions GPT-5.6 Sol as a model for tasks that do not end with one answer. The official announcement highlights coding, cybersecurity, biology, agentic work and multi-step problem solving.

The interesting part is long-horizon reliability. OpenAI describes software tasks that can stretch over several days, where the model needs to maintain context, fix bugs, test hypotheses and revisit earlier decisions. That is more important than another chat-style benchmark. Real AI work increasingly looks like supervising a process, not asking a single question.

This is why Codex matters. If GPT-5.6 Sol is entering Codex, OpenAI can test it in an environment where the model does not merely answer questions. It can interact with repositories, files, commands and tools. That is a much better test of practical usefulness: can the model ship a meaningful code change, not just write a clean function in isolation?

Benchmarks: what OpenAI shared in its system card

OpenAI did not publish a single simple competitor-comparison image in the announcement itself. It did publish a system card with evaluation charts, safety discussion and tables. The table below summarises one of the useful slices for readers: medical and biological evaluations across GPT-5 and the GPT-5.6 family.

| OpenAI system-card benchmark | GPT-5 | GPT-5.6 Luna | GPT-5.6 Terra | GPT-5.6 Sol |
|---|---|---|---|---|
| MedQA | 94.9% | 95.5% | 95.5% | 95.7% |
| MedXpertQA | 52.5% | 56.2% | 56.2% | 60.1% |
| HealthBench | 60.7% | 61.2% | 62.0% | 63.4% |
| CBRNE: Bio and Chemistry | 71.2% | 74.2% | 74.8% | 77.7% |
| Biology Lab Protocol | 72.0% | 82.0% | 84.0% | 84.0% |

These numbers are not a generational leap, but they show consistent movement in high-stakes domains: Sol improves on GPT-5 by between 0.8 and 12.0 percentage points across the five evaluations. A few percentage points matter more when the task is scientific, medical or safety-sensitive, especially if better performance arrives with tighter risk controls.
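One way to read the table is to look at both the raw point gain and how much of the remaining headroom each model closes. The sketch below does that with the scores quoted above. Treating `100 - score` as an error rate is a simplifying assumption made here: it presumes every benchmark is a higher-is-better, accuracy-style percentage, which may not hold for rubric-based evaluations such as HealthBench.

```python
# Scores in percent, copied from the system-card table in this article.
SCORES = {
    "MedQA": {"GPT-5": 94.9, "Luna": 95.5, "Terra": 95.5, "Sol": 95.7},
    "MedXpertQA": {"GPT-5": 52.5, "Luna": 56.2, "Terra": 56.2, "Sol": 60.1},
    "HealthBench": {"GPT-5": 60.7, "Luna": 61.2, "Terra": 62.0, "Sol": 63.4},
    "CBRNE: Bio and Chemistry": {"GPT-5": 71.2, "Luna": 74.2, "Terra": 74.8, "Sol": 77.7},
    "Biology Lab Protocol": {"GPT-5": 72.0, "Luna": 82.0, "Terra": 84.0, "Sol": 84.0},
}


def point_gain(benchmark: str, model: str, baseline: str = "GPT-5") -> float:
    """Percentage-point difference between a model and the baseline."""
    row = SCORES[benchmark]
    return round(row[model] - row[baseline], 1)


def error_reduction(benchmark: str, model: str, baseline: str = "GPT-5") -> float:
    """Fraction of the baseline's remaining headroom that the model closes.

    Simplifying assumption: 100 minus the score is treated as an error rate.
    """
    row = SCORES[benchmark]
    baseline_error = 100.0 - row[baseline]
    return (row[model] - row[baseline]) / baseline_error


for name in SCORES:
    gain = point_gain(name, "Sol")
    reduction = error_reduction(name, "Sol")
    print(f"{name:26s} +{gain:4.1f} pts, {reduction:.0%} of remaining headroom")
```

On this reading, a sub-one-point gain on a near-saturated benchmark such as MedQA can still represent a meaningful share of the errors that were left to fix.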

Cybersecurity is the other area to watch. OpenAI says Sol is strong on terminal, coding and vulnerability-analysis benchmarks. At the same time, the company is trying to show that stronger capability does not mean uncontrolled output. In ExploitBench, OpenAI compares Sol with Claude Mythos Preview and says Sol reaches similar capability while using far fewer output tokens.

That detail is more important than it first sounds. Fewer tokens do not only reduce cost. In a safety-sensitive setting, a shorter and more disciplined answer can reduce the risk surface: fewer unnecessary details, less wandering into procedural instructions and more control over what the model reveals.

Why the rollout is so cautious

GPT-5.6 Sol is arriving after several sharp lessons for the AI industry. Frontier models are becoming better at code, automation and tool use. Those same capabilities make deployment more sensitive.

A stronger coding model can help engineering teams patch bugs faster. It can also increase the capability of people looking for vulnerabilities. A stronger biology model can accelerate research. It also demands more careful monitoring of prompts, refusals and outputs. OpenAI is clearly trying to show that it understands this dual-use tension.

That is why a limited preview makes sense. OpenAI can gather data from real usage without immediately releasing the most capable tier to mass distribution. For the market, that may feel frustrating. Everyone wants to test the new model right away. For regulators and large enterprise customers, however, the preview is a signal that OpenAI is trying to build a more controlled deployment model.

What this means for ChatGPT users

The simple answer is: if you only use ChatGPT, do not assume GPT-5.6 Sol will appear on your account immediately. OpenAI is talking about API and Codex access during the preview, not broad ChatGPT availability. The company has not given a public release date for general access.

That does not make the model irrelevant to everyday users. If the preview goes well, GPT-5.6 could later reach ChatGPT or power features through automatic model routing. Users may not always see the names Sol, Terra or Luna, but they could benefit from them behind the scenes when OpenAI chooses the right model for a task.

Codex is also worth watching. If Sol improves long-running coding work, the difference may show up there first: fewer stalls, better planning, more effective bug fixing and a higher chance that the model moves from problem description to finished change.

Our take

GPT-5.6 Sol looks more important as a market signal than as an instant toy for users to try. This is not the usual moment where everyone opens a new model and compares answers on social media. It is another step toward models that are expected to do longer, more responsible work in code, science, analysis and tools.

The question is not only whether Sol wins a benchmark. The better question is whether OpenAI can show that high-capability models can be deployed without freezing innovation and without pretending the risks are trivial. If this preview works, GPT-5.6 could become a template for future launches: fewer fireworks, more control and more real work.
