Tag Banner

All news with #baseten tag

Mon, September 29, 2025

Google Cloud Customers: Monthly Innovations Roundup

🚀 This roundup highlights how leading organizations are using Google Cloud to optimize networks, accelerate AI, and scale mission-critical services. From Uber reducing edge latency with Hybrid NEGs to Target rebuilding search with AlloyDB AI hybrid search, customers report measurable gains in performance, cost, and reliability. Healthcare, finance, media, and telecommunications teams also describe operational wins — faster inference, seamless migrations, and stronger real-time experiences.

read more →

Thu, September 4, 2025

Baseten: improved cost-performance for AI inference

🚀 Baseten reports major cost-performance gains for AI inference by combining Google Cloud A4 VMs powered by NVIDIA Blackwell GPUs with Google Cloud’s Dynamic Workload Scheduler. The company cites 225% better cost-performance for high-throughput inference and 25% improvement for latency-sensitive workloads. Baseten pairs cutting-edge hardware with an open, optimized software stack — including TensorRT-LLM, NVIDIA Dynamo, and vLLM — and multi-cloud resilience to deliver scalable, production-ready inference.

read more →