All news with #autoscaling tag
Thu, September 18, 2025
Amazon SageMaker HyperPod Adds Managed Karpenter Autoscaling
🛠️ Amazon SageMaker HyperPod now supports managed node autoscaling using Karpenter, enabling automated cluster scaling for both inference and training workloads. This managed capability removes the operational burden of installing and maintaining autoscaling infrastructure while providing integrated resilience and fault tolerance. Customers gain just-in-time GPU provisioning, scale-to-zero during low demand, workload-aware instance selection, and cost reductions through intelligent consolidation.
Thu, August 28, 2025
Container-Optimized Compute Delivers Fast Autopilot Scaling
🚀 GKE Autopilot now runs on a container-optimized compute platform that rethinks autoscaling to deliver near-real-time capacity. The platform uses dynamically resizable VMs and a pool of pre-provisioned compute, so nodes can be resized or allocated without disrupting workloads. Customers on GKE Autopilot 1.32+ get faster pod scheduling, improved HPA responsiveness, and support for in-place pod resize out of the box. Google recommends the general-purpose compute class for small, gradually scaling services.