All news with the #trainium tag

Fri, September 19, 2025

AWS Neuron SDK 2.26 Adds Trn2, PyTorch 2.8, JAX 0.6.2

🚀 AWS has released Neuron SDK 2.26.0 as generally available, delivering framework and runtime improvements for Inferentia- and Trainium-based instances. The update adds support for PyTorch 2.8 and JAX 0.6.2, enhances inference on Trainium2 (Trn2) instances, and enables deployment of models such as FLUX.1-dev, with Llama 4 Scout and Maverick support in beta. It also introduces expert parallelism (beta) for Mixture-of-Experts (MoE) models, new Neuron Kernel Interface (NKI) APIs, and an improved Neuron Profiler with system profile grouping for distributed workloads.
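For context, a minimal sketch of how a PyTorch model is typically compiled for NeuronCores with torch-neuronx; the model, shapes, and file names here are illustrative, and the snippet assumes a Trn/Inf instance with the Neuron SDK and torch-neuronx installed.

```python
# Illustrative only: ahead-of-time compile a tiny PyTorch model for a
# Trainium/Inferentia NeuronCore using torch_neuronx.trace.
import torch
import torch_neuronx

class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(128, 256),
            torch.nn.GELU(),
            torch.nn.Linear(256, 10),
        )

    def forward(self, x):
        return self.net(x)

model = TinyMLP().eval()
example = torch.rand(1, 128)

# Compile the model for NeuronCores and save the traced artifact.
neuron_model = torch_neuronx.trace(model, example)
neuron_model.save("tiny_mlp_neuron.pt")

# Reload and run inference.
restored = torch.jit.load("tiny_mlp_neuron.pt")
print(restored(example).shape)
```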

read more →

Tue, September 2, 2025

AWS Split Cost Allocation Adds GPU and Accelerator Cost Tracking

🔍 Split Cost Allocation Data now supports accelerator-based workloads running in Amazon Elastic Kubernetes Service (EKS), allowing customers to track costs for Trainium and Inferentia accelerators as well as NVIDIA and AMD GPUs, alongside CPU and memory. Cost details are included in the AWS Cost and Usage Report (including CUR 2.0) and can be visualized with the Containers Cost Allocation dashboard in Amazon QuickSight or queried with Amazon Athena. New customers can enable the feature in the Billing and Cost Management console; it is automatically enabled for existing Split Cost Allocation Data customers.
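As a rough illustration of the Athena path, the sketch below starts a query against a CUR 2.0 export; the database, table, S3 output location, and split_line_item_* column names are assumptions to adapt to your own export, not values from the announcement.

```python
# Illustrative only: run an Athena query over an assumed CUR 2.0 table to
# pull split-cost rows for EKS workloads (boto3's start_query_execution is
# real; the SQL identifiers below are placeholders).
import boto3

athena = boto3.client("athena", region_name="us-east-1")

QUERY = """
SELECT
    line_item_resource_id,
    split_line_item_split_usage,   -- assumed split-cost column name
    split_line_item_split_cost     -- assumed split-cost column name
FROM cur2_database.cur2_table      -- assumed database/table names
WHERE line_item_product_code = 'AmazonEKS'
LIMIT 100;
"""

response = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "cur2_database"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print("Started Athena query:", response["QueryExecutionId"])
```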

read more →

Thu, August 21, 2025

AWS Neuron SDK 2.25: Inference and Monitoring Enhancements

🚀 AWS has released Neuron SDK 2.25.0, now generally available for Inferentia and Trainium instances, adding context and data parallelism support plus chunked attention to accelerate long-sequence inference. The update enhances the neuron-ls and neuron-monitor APIs to show node affinities and device utilization, and introduces automatic aliasing (beta) and improvements to disaggregated serving (beta). Upgraded AMIs and Deep Learning Containers are provided for inference and training.
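For reference, a minimal sketch of reading output from the monitoring tools mentioned above; it assumes neuron-ls and neuron-monitor from the Neuron SDK are on PATH on a Trn/Inf instance, and the flag and JSON structure shown are assumptions for illustration.

```python
# Illustrative only: capture machine-readable output from the Neuron CLI tools.
import json
import subprocess

# List Neuron devices (JSON output flag assumed; structure may differ by release).
ls_out = subprocess.run(
    ["neuron-ls", "--json-output"],
    capture_output=True, text=True, check=True,
)
devices = json.loads(ls_out.stdout)
print(json.dumps(devices, indent=2)[:500])

# neuron-monitor streams periodic JSON snapshots to stdout; read one and exit.
with subprocess.Popen(["neuron-monitor"], stdout=subprocess.PIPE, text=True) as mon:
    snapshot = json.loads(mon.stdout.readline())
    mon.terminate()
print(sorted(snapshot.keys()))
```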

read more →