
Google Cloud Next 2025: News and releases
Google Cloud Next 2025 kicked off with even more momentum than previous editions, and we have already seen the company's biggest bets, focused on generative AI, cloud infrastructure, chips, agents, and more.
At Plain Concepts, we didn't want to miss a single detail, so we traveled to Las Vegas to follow everything announced during the event. Here is our summary of the highlights!
Google Cloud Next 2025 Recap
It's been exactly one year since the last Google Cloud Next, and the 2025 edition has been a spectacle of announcements, partnerships, and bets on both the present and the future.
Twelve months ago, Google shared a vision of how AI could radically transform organizations; today, that vision is no longer just a possibility, it is a reality they are building.
Today, more than 4 million developers are using Gemini, Vertex AI usage has grown 20x, and more than 2 million AI assists are delivered to users every month. In addition, they have expanded their infrastructure to 42 regions, with more than 200 points of presence (PoPs) across over 200 countries and territories, creating a truly global and resilient foundation.
For those of you who were unable to attend, here is the keynote video from Thomas Kurian, CEO of Google Cloud, along with our review of the event's highlights:
AI Hypercomputer and Ironwood
Google's AI Hypercomputer is a revolutionary supercomputing system designed to simplify AI deployment, improve performance, and optimize costs.
At the event, they unveiled Ironwood, their 7th-generation TPU and the largest and most powerful one to date, with a more than 10x improvement over the previous generation and their most power-efficient performance yet. With over 9,000 chips per pod, Ironwood delivers a staggering 42.5 exaflops of compute per pod, meeting the growing demands of the most sophisticated thinking models, such as Gemini 2.5.
They also discussed Cluster Director, which enables enterprises to deploy and manage many accelerators as a single unified computing unit, maximizing performance, efficiency, and resiliency.
Partnership with NVIDIA
Google has partnered with NVIDIA to bring Gemini to NVIDIA's Blackwell systems, with Dell as a key partner, so that Gemini can be used locally in both connected and air-gapped environments.
This also brings new AI hardware options: the GPU portfolio now includes A4 and A4X virtual machines powered by NVIDIA's Blackwell B200 and GB200 GPUs, and Google will be among the first to offer NVIDIA's next-generation Vera Rubin GPUs, which deliver up to 15 exaflops of FP4 inference performance per rack.
Improvements for AI inference
Three significant improvements for AI inference were presented during the event:
- Google Kubernetes Engine (GKE) Inference: new inference capabilities in GKE, including AI-aware load balancing and scaling, can reduce serving costs by up to 30%, reduce tail latency by up to 60%, and increase throughput by up to 40%, based on internal benchmarks.
- Pathways availability: Pathways, Google's distributed machine learning runtime, is now available to cloud customers for the first time, enabling state-of-the-art multi-host inference with dynamic scaling and exceptional performance at optimal cost.
- Availability of vLLM on TPUs: vLLM support is coming to TPUs, enabling customers who have optimized PyTorch workloads with vLLM on GPUs to run them on TPUs easily and cost-effectively, maximizing their investment and flexibility (a minimal sketch follows below).
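
For teams already serving models with vLLM on GPUs, the move to TPUs is meant to require little more than a change of backend. Below is a minimal, hedged sketch using vLLM's standard offline-inference API; the model ID is purely illustrative, and TPU support depends on installing a vLLM build with the TPU backend enabled, so treat the details as assumptions rather than Google's official recipe.

```python
# Minimal vLLM offline-inference sketch.
# Assumes a vLLM installation with the TPU backend enabled; the same code
# runs unchanged on GPUs, which is the portability point made above.
from vllm import LLM, SamplingParams

# Illustrative model id only; any model supported by your vLLM build works.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

prompts = [
    "Summarize the key announcements from Google Cloud Next 2025 in one sentence.",
]

# vLLM batches and schedules the requests; the hardware backend (GPU or TPU)
# is picked up from the installation and environment, not from this code.
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```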
Gemini and new AI models
They announced the arrival of Gemini 2.5 Flash, their workhorse model specifically optimized for low latency and cost efficiency, on Vertex AI. It is ideal for everyday use cases, such as providing fast responses during high-volume customer interactions where real-time summaries or quick access to documents are required.
The model adjusts its depth of reasoning based on the complexity of the prompt and allows performance to be tuned against the customer's budget. These new features make AI easier to use and more affordable for all types of use cases.
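
As a rough illustration of that budget-based control, here is a minimal sketch that calls Gemini 2.5 Flash on Vertex AI with a capped thinking budget via the google-genai Python SDK. The project ID, region, model ID, and the exact thinking-budget field are assumptions; check the current SDK and model documentation for the precise names.

```python
# Minimal sketch: Gemini 2.5 Flash on Vertex AI with a capped thinking budget.
# Assumes the google-genai SDK (`pip install google-genai`) and Vertex AI access;
# project, region, model id and config field names are illustrative assumptions.
from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project="my-gcp-project",   # hypothetical project id
    location="us-central1",     # hypothetical region
)

response = client.models.generate_content(
    model="gemini-2.5-flash",   # assumed model id for the 2.5 Flash release
    contents="Summarize this support ticket in two sentences: ...",
    config=types.GenerateContentConfig(
        # Cap the tokens the model may spend on internal reasoning;
        # a lower budget trades some reasoning depth for latency and cost.
        thinking_config=types.ThinkingConfig(thinking_budget=256),
    ),
)

print(response.text)
```

Raising or lowering the thinking budget is one way to trade answer quality against latency and cost for a given workload.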
In addition, they announced other innovative developments:
- Imagen 3: its highest-quality text-to-image model now has improved image generation and inpainting capabilities for reconstructing missing or damaged parts of an image. It also improves object removal, resulting in a more natural and fluid editing experience.
- Chirp 3: its audio generation model now includes a new way to create custom voices from as little as 10 seconds of audio input, allowing companies to personalize their call centers and develop accessible content, alongside new transcription features.
- Lyria: the industry's first enterprise-ready text-to-music model, which can transform simple text prompts into music clips, opening up new avenues for creative expression.
- Veo 2: its video generation model now includes new features that help organizations create and edit videos and add visual effects.
Cloud WAN
It is an AI-assisted, next-generation global network with near-zero latency. It supports essential Google services and spans more than three million kilometers of fiber optics, spread across more than 200 countries and territories.
This network is now available to businesses around the world and is optimized to maximize application performance, delivering up to 40% faster performance and reducing total cost of ownership by up to 40%.
Google DeepMind developments
In their quest to make the cloud more suitable for global scientific research and discovery, they are combining the best of Google DeepMind and Google Research with new infrastructure and AI capabilities that include:
- AlphaFold 3: can predict the structure and interactions of all molecules with unprecedented accuracy. The new high-throughput AlphaFold 3 solution, available for non-commercial use and deployable via Google Cloud Cluster Toolkit, enables efficient batch processing of up to tens of thousands of protein sequences, while minimizing costs through an automated scaling infrastructure.
- WeatherNext AI models: enable fast and accurate weather forecasting, and are now available in Vertex AI Model Garden.
Vertex AI
Vertex AI Model Garden now has over 200 models, including models from Google, third-party models from companies such as Anthropic, AI21, and Mistral, and open models such as Gemma and Llama. Recently, they have added models from CAMB.AI, Qodo, and the full portfolio of open source models from The Allen Institute.
Vertex AI usage has experienced explosive growth, increasing 20x in the past year, empowering companies to gain significant new efficiencies by automating and accelerating routine and mission-critical processes.
Building on this, they also announced new advancements such as Vertex AI dashboards, new training and tuning capabilities, Model Optimizer, and the Live API, among others. In addition, they have introduced new capabilities to move toward a multi-agent ecosystem:
- Agent Development Kit (ADK): an open-source framework that simplifies the process of creating sophisticated multi-agent systems while maintaining precise control over agent behavior (see the sketch after this list).
- Agent-to-Agent (A2A) Protocol: Google becomes the first hyperscaler to create an open Agent-to-Agent (A2A) protocol to help enterprises support multi-agent ecosystems so that agents can communicate with each other, regardless of the underlying technology.
- Agent Garden: a collection of ready-to-use samples and tools, accessible directly from ADK.
- Interoperability: with Vertex AI, you can seamlessly manage agents created in multiple agent frameworks, such as LangGraph and CrewAI.
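
To give a feel for the Agent Development Kit mentioned above, here is a minimal single-agent sketch in Python. It assumes the google-adk package, a Gemini model ID, and a toy tool; the class and parameter names follow the ADK documentation around the time of the announcement and may evolve, so treat this as a sketch rather than a canonical example.

```python
# Minimal ADK sketch: one agent with a single custom tool.
# Assumes `pip install google-adk` and credentials for a Gemini model;
# the model id and the tool are illustrative assumptions.
from google.adk.agents import Agent


def get_weather(city: str) -> dict:
    """Toy tool: return a canned weather report for a city."""
    return {"city": city, "forecast": "sunny", "temperature_c": 24}


root_agent = Agent(
    name="weather_assistant",
    model="gemini-2.5-flash",  # assumed model id
    description="Answers simple weather questions.",
    instruction="Use the get_weather tool whenever the user asks about the weather.",
    tools=[get_weather],
)

# With the ADK CLI, running `adk run` or `adk web` in the project directory
# picks up `root_agent` and lets you chat with the agent locally.
```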
Google Agentspace
Agentspace integrates Google-quality enterprise search, conversational AI, Gemini, and external agents to enable employees to find and synthesize information within their organizations, interact with AI agents, and take action with their business applications.
It now includes new enhancements:
- Chrome Enterprise integration: Agentspace is now seamlessly integrated with Chrome Enterprise, allowing employees to search and access all enterprise resources directly from the search box in Chrome, streamlining workflows and increasing productivity.
- Agent Gallery: provides employees with a single view of the agents available across the enterprise, making them easy to discover and use.
- Agent Designer: a codeless interface for creating custom agents that automate daily work tasks. It helps employees tailor agents to their individual workflows and needs.
- Idea Generation agent: helps employees generate and refine ideas, using a tournament-style framework to efficiently rank them according to employee-defined criteria.
- Deep Research agent: explores complex topics on behalf of employees and delivers the results in a comprehensive, easy-to-read report.
Google Workspace
With the aim of empowering AI users, new innovations have been included in Workspace:
- Help Me Analyze: this feature turns Google Sheets into a personal business analyst, intelligently surfacing insights from your data to help you make confident decisions.
- Docs Audio Overview: creates high-quality audio read-throughs or podcast-style summaries of your documents.
- Google Workspace Flows: helps automate daily work and repetitive tasks, such as managing approvals, researching clients, organizing your email, summarizing your schedule, etc.
Google Unified Security (GUS)
This new security solution uses Gemini to improve efficiency across every aspect of the cybersecurity professional's experience. It creates a scalable, searchable, unified fabric of security data that covers the entire attack surface.
In addition, it provides visibility, detection, and response capabilities across networks, endpoints, clouds, and applications, and automatically enriches security data with Google's advanced threat intelligence.
And that's the summary of the first day; we'll be back tomorrow with more news!