All news with #anyscale tag
Tue, November 4, 2025
Anyscale's Managed Ray on Azure for Distributed AI
🚀 Microsoft and Anyscale announced a private preview bringing Anyscale’s managed Ray to Azure, enabling developers to run distributed Python AI/ML workloads with native Azure integration. The service leverages the RayTurbo runtime and Azure Kubernetes Service (AKS) to provide elastic scaling, GPU packing, spot VM support, and enhanced observability. It aims to simplify scaling from prototype to production and reduce operational overhead.
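The workloads in question are ordinary Ray programs. As a minimal, illustrative sketch using only the plain open-source Ray API (nothing Azure-specific is assumed, since the announcement does not cover how the managed service is connected to):

```python
import ray

# Plain open-source Ray: on a managed cluster the address would normally be
# supplied by the platform, but ray.init() with no arguments also works when
# run inside an existing cluster or for a local test.
ray.init()

@ray.remote(num_cpus=1)
def preprocess(shard):
    # Stand-in for a CPU-bound preprocessing step.
    return sum(x * x for x in shard)

# Fan the work out across the cluster and gather the results.
shards = [list(range(i, i + 1000)) for i in range(0, 10_000, 1000)]
futures = [preprocess.remote(s) for s in shards]
print(sum(ray.get(futures)))
```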
Mon, November 3, 2025
Ray on GKE: New AI Scheduling and Scaling Features
🚀 Google Cloud and Anyscale describe tighter integration between Ray and Kubernetes to improve distributed AI scheduling and autoscaling on GKE. The release introduces a Ray Label Selector API (Ray v2.49) that aligns the placement of tasks, actors, and placement groups with Kubernetes labels and GKE custom compute classes, enabling targeted placement and fallback strategies across GPU types and capacity markets (e.g., spot vs. on-demand). It also adds Dynamic Resource Allocation for A4X/GB200 racks, writable cgroups for Ray resource isolation on GKE v1.34+, TPU/JAX training support via a JAXTrainer in Ray v2.49, and in-place pod resizing (Kubernetes v1.33) for vertical autoscaling and higher efficiency.
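A minimal sketch of what label-based placement could look like in application code; the label_selector keyword mirrors the Label Selector API named above, but the label keys, values, and the per-call override shown are illustrative assumptions rather than details verified against the Ray v2.49 docs:

```python
import ray

ray.init()

# Label-based placement as described in the post. The label_selector keyword
# mirrors the Ray Label Selector API; the label key and values are
# illustrative assumptions, not verified against the Ray v2.49 docs.
@ray.remote(num_gpus=1, label_selector={"accelerator-type": "H100"})
def train_step(batch_id):
    return f"trained batch {batch_id} on an H100-labeled node"

# The same selector could be overridden per call via .options(), which is one
# way a fallback from one GPU type or capacity market to another might be
# expressed (assumed usage, for illustration only).
result = ray.get(
    train_step.options(label_selector={"accelerator-type": "A100"}).remote(0)
)
print(result)
```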
Mon, November 3, 2025
Ray on TPUs with GKE: Native, Lower-Friction Integration
🚀 Google Cloud and Anyscale have enhanced the Ray experience on Cloud TPUs with GKE, reducing setup complexity and improving performance. The new ray.util.tpu library and a SlicePlacementGroup with a label_selector API automatically reserve co-located TPU slices and preserve the SPMD topology, avoiding resource fragmentation. Ray Train and Ray Serve gain expanded TPU support, including alpha JAX training, while TPU metrics and libtpu logs now appear in the Ray Dashboard for faster troubleshooting and easier migration between GPUs and TPUs.
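To make the slice-reservation idea concrete, here is a sketch using Ray's existing generic placement-group API, which, per the post, the new ray.util.tpu SlicePlacementGroup effectively automates for TPU slices; the "TPU" resource name and the bundle shape are assumptions for illustration:

```python
import ray
from ray.util.placement_group import placement_group
from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

ray.init()

# Manual wiring with Ray's long-standing placement-group API, shown only to
# make concrete what "reserving co-located TPU slices" means; per the post,
# the new ray.util.tpu SlicePlacementGroup automates this reservation while
# preserving SPMD topology. The "TPU" resource name and the bundle shape
# below are assumptions for illustration.
pg = placement_group([{"TPU": 4} for _ in range(4)], strategy="PACK")
ray.get(pg.ready())

@ray.remote(resources={"TPU": 4})
def spmd_worker(rank):
    # Each worker would run one shard of an SPMD (e.g. JAX) program.
    return rank

futures = [
    spmd_worker.options(
        scheduling_strategy=PlacementGroupSchedulingStrategy(placement_group=pg)
    ).remote(i)
    for i in range(4)
]
print(ray.get(futures))
```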