Pichai highlighted that Gemini now powers every one of Google’s half-billion-user products — including seven with more than two billion users — and teased the arrival of Gemini 2.5 Flash, a new low-latency model optimised for fast reasoning and cost-efficiency.
Thomas Kurian, CEO of Google Cloud, expanded on this vision: “What was once a possibility is now the vibrant reality we’re collectively building.”
Kurian revealed that more than four million developers are now building with Gemini, while Vertex AI usage has grown 20x year-over-year, fuelled by the surging adoption of models like Gemini, Imagen, and Veo.
This surge in usage is backed by Google’s vast infrastructure: 42 regions, over two million miles of subsea and terrestrial fibre, and more than 200 points of presence globally — all now accessible to enterprises through the new Cloud WAN service.
Across AI models, agentic systems, networking, and security, Google Cloud’s message was clear: This isn’t just an AI platform; it’s a full-stack transformation engine for the enterprise.
Here are all the major announcements from Google Cloud Next 2025:
New AI models on the way
Alphabet CEO Sundar Pichai took to the keynote stage to tease the next model in the hyperscaler’s AI arsenal: Gemini 2.5 Flash, a low-latency reasoning model. No specific release timeframe was revealed, but Pichai said it represents an evolution of the company’s popular workhorse model.
Google Cloud also provided an update on Veo 2, a Google DeepMind-developed video generation model, revealing it’s now “production-ready” in the Gemini API.
The model can follow both simple and complex instructions, as well as simulate real-world physics in high-quality videos spanning a wide range of visual styles.
Early adopters include Wolf Games, which is using Veo 2 to build “cinematic experiences” for its personalised interactive story game platform.
Meet the new hypercomputer hardware: Ironwood
Google Cloud’s AI Hypercomputer is the workhorse behind almost every AI workload on its cloud platform. The integrated supercomputing system now features the latest iteration of its custom hardware line, Tensor Processing Units (TPUs).
Ironwood, the 7th-generation TPU, offers 5x more peak compute capacity and 6x the high-bandwidth memory (HBM) capacity compared to the prior generation, Trillium.

The new Ironwood TPUs come in two scale-up pod configurations of 256 chips or 9,216 chips, with the larger pod delivering a staggering 42.5 exaFLOPS of compute.
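The pod figures quoted above imply a per-chip number worth sanity-checking. A quick back-of-the-envelope calculation (the announcement does not state the precision or number format behind the FLOPS rating):

```python
# Back-of-the-envelope check of the Ironwood pod figures:
# a 9,216-chip pod rated at 42.5 exaFLOPS works out to roughly
# 4.6 petaFLOPS per chip.

POD_CHIPS = 9216
POD_EXAFLOPS = 42.5

per_chip_petaflops = POD_EXAFLOPS * 1e18 / POD_CHIPS / 1e15
print(f"~{per_chip_petaflops:.2f} PFLOPS per chip")

# Scaling the same per-chip figure to the smaller pod suggests the
# 256-chip configuration lands somewhere around 1.2 exaFLOPS.
small_pod_exaflops = per_chip_petaflops * 1e15 * 256 / 1e18
print(f"~{small_pod_exaflops:.2f} exaFLOPS for the 256-chip pod")
```

The 256-chip estimate assumes both pod sizes use identically rated chips, which the announcement implies but does not spell out.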
The Hypercomputer hardware is designed to be 2x more power-efficient than Trillium while delivering more value per watt.
Developers can now access Ironwood through Google Cloud’s optimised stack across PyTorch and JAX.
Agentic AI additions
Next 2025 saw the hyperscaler double down on its agentic AI offerings, unveiling new tools that let businesses build, deploy, and scale multi-agent systems.
At the heart of the updates was the new Agent Development Kit (ADK) — an open-source framework that lets developers build sophisticated AI agents in under 100 lines of code. It’s already being used by brands like Renault and Revionics to automate workflows and decision-making.
To deploy these agents in production, Google introduced Agent Engine, a fully managed runtime on Vertex AI. It supports short- and long-term memory, built-in evaluation tooling, and native integration with Google’s Agentspace platform for secure internal sharing.
The second major agentic announcement was the Agent2Agent (A2A) protocol — an open interoperability standard that allows agents to communicate and collaborate across different frameworks such as ADK, LangGraph, and CrewAI. More than 50 partners, including Box, ServiceNow, UiPath, and Deloitte, are already on board.
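The value of an interoperability standard like A2A is that agents built on different frameworks can exchange work through a shared, framework-neutral message format. The sketch below illustrates that idea only — the envelope shape, field names, and version tag are hypothetical, not the actual A2A specification:

```python
# Illustrative sketch only: the message shape below is hypothetical and is
# NOT the actual A2A wire format. It shows the idea behind an open
# agent-interoperability protocol: two agents from different frameworks
# exchanging tasks through a neutral JSON envelope.
import json

def make_task(sender: str, recipient: str, text: str) -> str:
    """Serialise a task request into a framework-neutral JSON envelope."""
    return json.dumps({
        "protocol": "a2a-sketch/0.1",   # hypothetical version tag
        "from": sender,
        "to": recipient,
        "task": {"type": "text", "content": text},
    })

def handle_task(raw: str) -> str:
    """The receiving agent (any framework) parses the envelope and replies."""
    msg = json.loads(raw)
    reply_text = f"{msg['to']} completed: {msg['task']['content']}"
    return make_task(msg["to"], msg["from"], reply_text)

# One agent hands a task to another; neither needs to know how the
# other is implemented, only the shared message format.
request = make_task("adk-agent", "langgraph-agent", "summarise Q1 report")
reply = json.loads(handle_task(request))
print(reply["task"]["content"])  # → langgraph-agent completed: summarise Q1 report
```

In the real protocol the transport, discovery, and task-lifecycle details are defined by the A2A specification rather than ad-hoc JSON as here.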
Networking updates: Cloud WAN, Gen AI service cost reductions
Networking at Next 2025 was centred on scaling for AI and improving cross-cloud performance.
A new 400G Cloud Interconnect and Cross-Cloud Interconnect, arriving later this year, promise 4x the bandwidth for faster data onboarding and multi-cloud model training.
Google Cloud also introduced support for AI clusters of up to 30,000 GPUs in a non-blocking configuration — now available in preview — aimed at supercharging training and inference throughput.
Generative AI serving costs have been cut by up to 30%, with throughput improvements of up to 40%, thanks to innovations like the GKE Inference Gateway.
Google also debuted Cloud WAN, a fully managed enterprise backbone that opens up its global network infrastructure for wide area networking. Designed to simplify and secure enterprise WAN architectures, it delivers up to 40% faster performance compared to the public internet.
At the edge, Google announced enhanced programmability and performance, with Service Extensions now GA for Cloud Load Balancing. Cloud CDN support is on the way, enabling developers to customise application behaviour at the edge using open standards like WebAssembly.
Security updates: Google Unified Security, Gemini agents
Enterprise infrastructure is growing in complexity, widening the attack surface and overloading siloed security teams. Google’s answer? Google Unified Security (GUS), which is now generally available.
GUS is designed to unify threat intelligence, security operations, cloud security, and secure browsing into a single AI-powered platform, integrating expertise from the firm’s Mandiant subsidiary to deliver more scalable, efficient protection.
The new security solution creates a searchable security data fabric across the entire attack surface, offering real-time visibility, detection, and response across networks, endpoints, cloud, and apps. Security signals are then automatically enriched with Google Threat Intelligence, and every workflow is streamlined with its flagship Gemini AI models.
Google also introduced Gemini-powered security agents. Among the new agentic AI tools is an alert triage agent in Google Security Operations, which automatically investigates alerts, compiles evidence, and renders verdicts.
A new malware analysis agent in Google Threat Intelligence evaluates potentially malicious code, executes deobfuscation scripts, and delivers verdicts with full explainability. Both agents are due to preview in Q2.
Partnerships: Team-ups with Nvidia, Juniper, SAP & more
It wouldn’t be a Google Cloud Next without a series of partnerships being struck or extended, and this year was no different.
The hyperscaler expanded its partnership with Lumen to enhance cloud and network solutions. The team-up will focus on integrating Cloud WAN with Lumen's services, providing direct fibre access to Google Cloud regions, and offering secure, air-gapped connections to Google Distributed Cloud.
Google Cloud also joined forces with Nvidia to bring its Gemini family of AI models to the chipmaker’s Blackwell systems. The move sees Gemini models become available on-prem, enabling customers to lock down sensitive information, such as patient records, financial transactions and classified government information.
“By bringing our Gemini models on premises with Nvidia Blackwell’s breakthrough performance and confidential computing capabilities, we’re enabling enterprises to unlock the full potential of agentic AI,” said Sachin Gupta, VP and general manager of infrastructure and solutions at Google Cloud.
Google’s Gemini models are also coming to SAP’s generative AI hub on its Business Technology Platform. The hyperscaler also added its video and speech intelligence capabilities to support multimodal retrieval-augmented generation (RAG) for video-based learning and knowledge discovery in SAP products.
Also announced was a collaboration with Juniper Networks to accelerate new enterprise campus and branch deployments. Customers will be able to use Google’s Cloud WAN solution alongside Juniper Mist wired, wireless, NAC, firewalls and secure SD-WAN solutions, enabling them to connect critical applications and AI workloads whether on the internet, across clouds or within data centres.
The hyperscaler partnered with Oracle to unveil a partner programme designed to enable Oracle and Google Cloud partners to offer Oracle Database@Google Cloud to their customers.
Data storage firm DataDirect Networks (DDN) also joined up with Google Cloud on its Managed Lustre parallel file system service, which provides up to 1 TB/s of throughput for enterprises and startups building AI and high-performance computing (HPC) applications.
Accenture also widened its strategic partnership with Google Cloud, with the pair pledging to work together to develop industry-specific AI solutions.
These latest partnerships add to those penned earlier this year, such as the one with Deutsche Telekom, which sees the pair working together on AI advancement and cloud integration across the operator’s network infrastructure.