
All news with the #cloud tpu tag

Mon, October 20, 2025

AI Hypercomputer Update: vLLM on TPUs and Tooling Advances

🔧 Google Cloud’s Q3 AI Hypercomputer update highlights inference improvements and expanded tooling to accelerate model serving and diagnostics. The release integrates vLLM with Cloud TPUs via the new tpu-inference plugin, unifying JAX and PyTorch runtimes and boosting TPU inference for models such as Gemma, Llama, and Qwen. Additional launches include improved XProf profiling and Cloud Diagnostics XProf, an AI inference recipe for NVIDIA Dynamo, NVIDIA NeMo RL recipes, and GA of the GKE Inference Gateway and Quickstart to help optimize latency and cost.
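The tpu-inference plugin keeps vLLM's standard Python serving API, so TPU inference looks the same as on other backends. Below is a minimal, hypothetical sketch of offline generation with vLLM; the model name, prompts, and sampling values are illustrative, and it assumes a Cloud TPU VM with the TPU build of vLLM installed per the plugin's setup instructions.

```python
# Minimal offline-inference sketch with vLLM (illustrative values only).
# On a correctly provisioned Cloud TPU VM, vLLM selects the TPU backend
# automatically; the Python API below is unchanged.
from vllm import LLM, SamplingParams

prompts = [
    "Summarize the benefits of hardware-aware inference serving.",
    "Explain what a KV cache is in one sentence.",
]
sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

# Any supported checkpoint (e.g. a Gemma, Llama, or Qwen model) can be passed here.
llm = LLM(model="google/gemma-2b-it")

for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```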

read more →

Mon, September 29, 2025

Google Cloud Customers: Monthly Innovations Roundup

🚀 This roundup highlights how leading organizations are using Google Cloud to optimize networks, accelerate AI, and scale mission-critical services. From Uber reducing edge latency with Hybrid NEGs to Target rebuilding search with AlloyDB AI hybrid search, customers report measurable gains in performance, cost, and reliability. Healthcare, finance, media, and telecommunications teams also describe operational wins — faster inference, seamless migrations, and stronger real-time experiences.

read more →

Tue, September 23, 2025

Escalante Uses JAX on TPUs for AI-driven Protein Design

🧬 Escalante leverages JAX's functional, composable design to combine many predictive models into a single differentiable objective for protein engineering. By translating models (including AlphaFold and Boltz-2) into a JAX-native stack and composing them serially or linearly, they compute gradients with respect to input sequences and evolve candidates via optimization. Each job samples thousands of sequences, filters to roughly ten lab-ready designs, and runs at scale on Google Kubernetes Engine using spot TPU v6e, yielding a reported 3.65x performance-per-dollar advantage over H100 GPUs.
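To make that pattern concrete, here is a toy JAX sketch (not Escalante's code): two stand-in scoring functions take the place of real predictors such as AlphaFold or Boltz-2, are combined linearly into one differentiable objective, and the gradient with respect to a relaxed sequence representation (per-position logits over amino acids) drives a gradient-ascent update. All names, weights, and dimensions are illustrative.

```python
# Toy sketch of composing differentiable scorers into one objective and
# optimizing a protein-sequence representation by gradient ascent.
import jax
import jax.numpy as jnp

SEQ_LEN, N_AA = 64, 20  # sequence length, number of amino acids

def stability_score(probs):
    # Stand-in for a structure/stability predictor.
    return -jnp.sum(probs * jnp.log(probs + 1e-9))

def binding_score(probs):
    # Stand-in for a binding-affinity predictor.
    target = jnp.ones((SEQ_LEN, N_AA)) / N_AA
    return -jnp.mean((probs - target) ** 2)

def objective(logits):
    probs = jax.nn.softmax(logits, axis=-1)
    # Linear combination of the individual scores into a single differentiable objective.
    return 0.5 * stability_score(probs) + 0.5 * binding_score(probs)

@jax.jit
def step(logits, lr=0.1):
    # Gradient of the composite objective with respect to the sequence representation.
    grads = jax.grad(objective)(logits)
    return logits + lr * grads

logits = jax.random.normal(jax.random.PRNGKey(0), (SEQ_LEN, N_AA))
for _ in range(100):
    logits = step(logits)

designed = jnp.argmax(logits, axis=-1)  # discretize back to an amino-acid index sequence
print(designed[:10])
```

In the real workflow the scorers would be JAX-native ports of large predictive models and the loop would run across many sampled starting sequences on TPU, but the compose-then-differentiate structure is the same.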

read more →