The artificial intelligence landscape is moving at breakneck speed. Just three months after Google’s Gemini 3 Pro established itself as a leading frontier model, the company has unveiled Gemini 3.1 Pro, an update focused on bringing the power of its specialized Deep Think system to a wider range of tasks. This isn’t a full version leap, but it is a significant incremental upgrade – a shift Google signals with its first “point one” release for the Gemini family.
The core innovation of Gemini 3.1 Pro lies in its adjustable reasoning capabilities. Enterprise AI teams and developers now have a single model capable of dynamically scaling its computational effort, from swiftly handling routine queries to dedicating extended processing time to complex problem-solving. This flexibility aims to streamline AI deployments and reduce the operational burden of managing multiple specialized models.
Gemini 3.1 Pro is currently rolling out in preview across a broad range of Google platforms, including the Gemini API via Google AI Studio and Gemini CLI, the agentic development platform Antigravity, Vertex AI, Gemini Enterprise, Android Studio, the consumer Gemini app, and NotebookLM.
The ‘Deep Think Mini’ Effect: Granular Control Over Reasoning
While benchmark numbers are important, the most impactful feature of Gemini 3.1 Pro is the introduction of a three-tier thinking level system. Previously, Gemini 3 Pro offered only “low” and “high” reasoning modes. The new version adds a “medium” setting, roughly equivalent to the previous “high,” and fundamentally redefines what “high” means. When set to its highest level, Gemini 3.1 Pro effectively operates as a “mini version of Gemini Deep Think,” Google’s specialized reasoning model recently upgraded to tackle complex challenges in science, research, and engineering.
This approach has significant implications for enterprise deployment. Instead of routing requests to different models based on complexity – a common but often cumbersome practice – organizations can now utilize a single endpoint and adjust the reasoning depth on demand. Simple tasks like document summarization can be handled quickly with “low” thinking, while complex analytical work can leverage the “high” setting for Deep Think-level reasoning.
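The single-endpoint pattern can be sketched as a small routing table that maps task types to thinking tiers. This is a minimal illustration only: the `thinking_level` parameter name, the model identifier, and the request shape below are assumptions for the sake of the sketch, not confirmed details of the Gemini API surface.

```python
# Hypothetical sketch: one endpoint, with reasoning depth chosen per task.
# Tier names mirror the low/medium/high system described above; all field
# names and the model id are illustrative assumptions, not official API.

TASK_THINKING_LEVELS = {
    "summarize": "low",             # routine work: fast and cheap
    "classify": "low",
    "code_review": "medium",        # roughly the old "high" tier
    "research_analysis": "high",    # Deep Think-style extended reasoning
}

def build_request(task_type: str, prompt: str) -> dict:
    """Build a single request payload, picking a thinking level for the
    task instead of routing to a different model per complexity tier."""
    level = TASK_THINKING_LEVELS.get(task_type, "medium")  # safe default
    return {
        "model": "gemini-3.1-pro-preview",  # illustrative model id
        "contents": prompt,
        "config": {"thinking_level": level},
    }
```

The point of the pattern is operational: complexity-based routing lives in one small, auditable table rather than in infrastructure that juggles multiple model endpoints.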
Benchmark Performance: Significant Gains Across the Board
Google’s published benchmarks demonstrate substantial improvements, particularly in reasoning and agentic capabilities. On ARC-AGI-2, a benchmark evaluating abstract reasoning, Gemini 3.1 Pro achieved a score of 77.1% – more than double the 31.1% of Gemini 3 Pro and surpassing competitors like Anthropic’s Sonnet 4.6 (58.3%), Opus 4.6 (68.8%), and OpenAI’s GPT-5.2 (52.9%).
The gains extend to other key benchmarks. Gemini 3.1 Pro scored 44.4% on Humanity’s Last Exam, a rigorous academic reasoning test, up from 37.5% for Gemini 3 Pro and exceeding Claude Sonnet 4.6 (33.2%) and Opus 4.6 (40.0%). It also reached 94.3% on GPQA Diamond, a scientific knowledge evaluation, outperforming all listed competitors.
Perhaps most relevant for enterprise applications are the improvements in agentic benchmarks – evaluations measuring performance with tools and multi-step tasks. Gemini 3.1 Pro scored 68.5% on Terminal-Bench 2.0 (agentic terminal coding), compared to 56.9% for its predecessor. On MCP Atlas, which measures multi-step workflows, it reached 69.2%, a 15-point improvement over Gemini 3 Pro’s 54.1%. And on BrowseComp, which tests web search capabilities, the score jumped to 85.9% from Gemini 3 Pro’s 59.2%.
A Shift in Release Strategy and the Role of Reinforcement Learning
The decision to designate this update as “3.1” rather than another “3 Pro preview” suggests Google views the improvements as substantial enough to warrant a version increment, while the “point one” framing indicates an evolutionary, rather than revolutionary, change. According to Google, 3.1 Pro builds directly on lessons learned from the Gemini Deep Think series, incorporating techniques from both earlier and more recent versions.
The benchmark results strongly suggest that reinforcement learning played a crucial role in these gains, particularly in areas like ARC-AGI-2, coding, and agentic evaluations – domains where reinforcement learning environments can provide clear reward signals.
Gemini 3.1 Pro is being released as a preview, with Google stating it will continue refining agentic workflows before a full general availability launch.
For IT decision-makers evaluating frontier model providers, the release of Gemini 3.1 Pro is a prompt to revisit model selection and to plan for the rapid pace of change in the AI landscape. The competitive response will likely unfold in the coming weeks, as Anthropic, OpenAI, and the open-weight community assess and answer Google’s latest advances.
Gemini 3.1 Pro is available now in preview for developers through the Gemini API in Google AI Studio, Gemini CLI, Google Antigravity, and Android Studio. Enterprise customers can access it through Vertex AI and Gemini Enterprise, and consumers on Google AI Pro and Ultra plans can access it through the Gemini app and NotebookLM.
As Google continues to iterate on its Gemini models, the focus will likely remain on enhancing agentic capabilities and refining the balance between reasoning depth and computational efficiency. The ongoing evolution of these models promises to unlock new possibilities for AI-powered applications across a wide range of industries.