<ciso brief />

All news tagged #amazon sagemaker ai


Amazon SageMaker enables self-service notebook migration

🔁 Amazon SageMaker Notebook instances now support self-service migration via the PlatformIdentifier parameter in the UpdateNotebookInstance API. You can update unsupported platform identifiers (notebook-al1-v1, notebook-al2-v1, notebook-al2-v2) to supported versions (notebook-al2-v3, notebook-al2023-v1) while preserving data and configurations. The capability is available through AWS CLI (v2.31.27+) and SDKs in all Regions where Notebook instances are supported. This simplifies keeping instances current and reduces manual migration effort.
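
As a rough illustration of the migration described above, the sketch below checks a notebook's platform identifier against the lists from the announcement and builds the parameters for an UpdateNotebookInstance request. The helper name `build_migration_request` and the notebook name are hypothetical, and the code only assembles the request rather than calling AWS.

```python
# Platform identifiers as listed in the announcement above.
UNSUPPORTED = {"notebook-al1-v1", "notebook-al2-v1", "notebook-al2-v2"}
SUPPORTED = {"notebook-al2-v3", "notebook-al2023-v1"}


def build_migration_request(name: str, current: str, target: str) -> dict:
    """Hypothetical helper: return keyword arguments for UpdateNotebookInstance."""
    if current not in UNSUPPORTED:
        raise ValueError(f"{current!r} is not listed as an unsupported platform")
    if target not in SUPPORTED:
        raise ValueError(f"{target!r} is not listed as a supported platform")
    # PlatformIdentifier is the parameter named in the announcement.
    return {"NotebookInstanceName": name, "PlatformIdentifier": target}


request = build_migration_request(
    "my-notebook", "notebook-al2-v1", "notebook-al2023-v1"
)
print(request)
```

Assuming an SDK release that already includes the new parameter, the resulting dictionary could be passed to boto3 as `boto3.client("sagemaker").update_notebook_instance(**request)`.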

Amazon SageMaker HyperPod Adds Checkpointless Training

🚀 Amazon SageMaker HyperPod now supports checkpointless training, a foundational capability that eliminates the need for checkpoint-based, job-level restarts for distributed model training. Checkpointless training preserves forward training state across the cluster, automatically swaps out failed nodes, and uses peer-to-peer state transfer to resume progress, reducing recovery time from hours to minutes. The feature can deliver up to 95% training goodput at very large scale, is available in all Regions where HyperPod runs, and can be enabled with zero code changes for popular recipes or with minimal PyTorch modifications for custom models.

Amazon SageMaker AI adds serverless model customization

🚀 Amazon SageMaker AI now offers a serverless model customization capability that lets developers quickly fine-tune popular models using supervised learning, reinforcement learning, and direct preference optimization. The fully managed, end-to-end workflow simplifies data preparation, synthetic data generation, training, evaluation, and deployment through an easy-to-use interface. Supported base models include Amazon Nova, Llama, Qwen, DeepSeek, and GPT-OSS. The AI agent-guided workflow is in preview with regional availability and a waitlist.

Amazon SageMaker HyperPod Adds Elastic Training at Scale

⚡ Amazon SageMaker HyperPod now supports elastic training, automatically scaling distributed training jobs to absorb idle accelerators and contract when higher‑priority workloads require resources. This eliminates the manual cycle of halting jobs, reconfiguring parameters, and restarting distributed training, which previously demanded specialized engineering time. Organizations can start training with minimal resources and grow opportunistically, improving cluster utilization and reducing costs. Elastic training can be enabled with zero code changes for public models like Llama and GPT-OSS, and requires only lightweight configuration updates for custom architectures.

AWS AI Factories: Dedicated High-Performance AI Infrastructure

🚀 AWS AI Factories are now available to deploy high-performance AWS AI infrastructure inside customer data centers, combining AWS Trainium, NVIDIA GPUs, low-latency networking, and optimized storage. The service integrates Amazon Bedrock and Amazon SageMaker to provide immediate access to foundation models without separate provider contracts. AWS manages procurement, setup, and operations while customers supply space and power, enabling isolated, sovereign deployments that accelerate AI initiatives.

Amazon Nova Forge: Build Frontier Models with Nova

🚀 Amazon Web Services announced general availability of Nova Forge, a SageMaker AI service that enables organizations to build custom frontier models from Nova checkpoints across pre-, mid-, and post-training phases. Developers can blend proprietary data with Amazon-curated datasets, run Reinforcement Fine Tuning (RFT) with in-environment reward functions, and apply custom safety guardrails via a built-in responsible AI toolkit. Nova Forge includes early access to Nova 2 Pro and Nova 2 Omni and is available today in US East (N. Virginia).

Amazon SageMaker AI Adds Serverless MLflow Support

🧠 Amazon SageMaker AI now offers a serverless MLflow capability that automatically scales to support experiment tracking and model development without infrastructure setup. The service scales up for demanding workloads and scales down during idle periods, reducing operational overhead. Administrators can enable cross-account access via Resource Access Manager (RAM). The feature integrates with SageMaker AI JumpStart, Model Registry, and Pipelines and is offered at no additional charge in select AWS Regions.

Amazon SageMaker Catalog Exports Asset Metadata to Iceberg

🔍 Amazon SageMaker Catalog now exports asset metadata as an Apache Iceberg table via Amazon S3 Tables, enabling teams to query catalog inventory with standard SQL without building custom ETL. The export includes technical fields (resource_id, resource_type), business metadata (asset_name, business_description), ownership details, and timestamps, partitioned by snapshot_date for time travel queries. The dataset appears in SageMaker Unified Studio and is queryable from Amazon Athena, Studio notebooks, AI agents, and BI tools. Available in all supported Regions at no additional SageMaker charge; you pay for S3 Tables storage and Athena queries.
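
To make the "standard SQL" point concrete, here is a minimal sketch that builds an inventory query over the exported columns for a single snapshot. The column names come from the summary above; the table name `catalog_db.asset_metadata`, the helper `build_inventory_query`, and the assumption that `snapshot_date` is a DATE column are all hypothetical.

```python
from datetime import date


def build_inventory_query(table: str, snapshot: date) -> str:
    """Hypothetical helper: list catalog assets as of one snapshot date."""
    return (
        "SELECT resource_id, resource_type, asset_name, business_description\n"
        f"FROM {table}\n"
        # Filtering on the partition column limits the scan to one snapshot.
        f"WHERE snapshot_date = DATE '{snapshot.isoformat()}'\n"
        "ORDER BY asset_name"
    )


# The table name below is a placeholder, not the real export location.
query = build_inventory_query("catalog_db.asset_metadata", date(2025, 12, 1))
print(query)
```

Changing the date in the filter is what gives the time-travel style comparison between snapshots.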

AWS Expands AI Competency with New Agentic AI Categories

🚀 AWS announced a major expansion of its AI Competency, validating 60 partners across three new Agentic AI categories: Agentic AI Tools, Agentic AI Applications, and Agentic AI Consulting Services. The launch includes an AI agent in AWS Partner Central to provide immediate feedback and speed specialization approvals. Validated partners demonstrate production-grade capabilities using services such as Amazon Bedrock AgentCore, Strands Agents, and Amazon SageMaker AI, and must meet AWS standards for security, reliability, and responsible AI.

AWS AI League 2026 Championship Expands Challenges

🤖 AWS has launched the AWS AI League 2026 Championship, expanding its flagship AI tournament with new challenge tracks and a doubled prize pool of $50,000 to drive builder innovation. The program pairs a brief orientation with two competition tracks: a Model Customization track using Amazon SageMaker AI to fine-tune foundation models for domain-specific tasks, and an Agentic AI track using Amazon Bedrock AgentCore to build planning and execution agents. Enterprises can apply to host internal tournaments and receive AWS credits to run team competitions, while individual developers can compete at AWS Summits to test skills and build with AWS AI services.

SageMaker HyperPod: Managed Tiered KV Cache Launch

⚡ Amazon SageMaker HyperPod now offers Managed Tiered KV Cache and Intelligent Routing to optimize LLM inference for long-context prompts and multi-turn conversations. The two-tier cache combines local CPU memory (L1) with disaggregated cluster storage (L2) — with AWS-native tiered storage recommended and Redis optional — to reuse computed key-value pairs and reduce recomputation. Intelligent Routing directs requests using prefix-aware, KV-aware, or round-robin strategies, while built-in observability integrates with Amazon Managed Grafana and deployment is enabled via InferenceEndpointConfig or SageMaker JumpStart.
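
The two-tier idea can be pictured with a toy cache: a small, fast L1 that evicts its least recently used entry, backed by a larger L2 that keeps everything. This is only an illustrative sketch, not HyperPod's implementation; the `TieredCache` class and its string "KV" payloads are hypothetical stand-ins.

```python
from collections import OrderedDict


class TieredCache:
    """Toy two-tier cache keyed by prompt prefix (illustration only)."""

    def __init__(self, l1_capacity: int):
        self.l1 = OrderedDict()  # small local tier with LRU eviction
        self.l2 = {}             # larger shared tier, never evicted here
        self.l1_capacity = l1_capacity

    def _promote(self, prefix, kv):
        self.l1[prefix] = kv
        self.l1.move_to_end(prefix)
        while len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)  # evicted entries remain in L2

    def put(self, prefix, kv):
        self.l2[prefix] = kv
        self._promote(prefix, kv)

    def get(self, prefix):
        if prefix in self.l1:
            self.l1.move_to_end(prefix)
            return self.l1[prefix], "L1"
        if prefix in self.l2:
            kv = self.l2[prefix]
            self._promote(prefix, kv)  # pull back into the fast tier
            return kv, "L2"
        return None, "miss"  # caller would recompute the key-value pairs


cache = TieredCache(l1_capacity=1)
cache.put("system prompt A", "kv-a")
cache.put("system prompt B", "kv-b")  # pushes A out of L1
print(cache.get("system prompt A"))   # served from L2, then promoted
print(cache.get("system prompt A"))   # now served from L1
```

Only a miss in both tiers forces recomputation, which is the saving the tiered design is after for long prompts and multi-turn conversations.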

SageMaker HyperPod Adds Custom Kubernetes Labels and Taints

🛠️ Amazon SageMaker HyperPod now supports custom Kubernetes labels and taints configured at the instance group level via the CreateCluster and UpdateCluster APIs. You can specify up to 50 labels and 50 taints per instance group using the KubernetesConfig parameter. HyperPod automatically applies and preserves these settings across node creation, replacement, scaling, and patching, eliminating manual kubectl work and ensuring device plugin pods (EFA, NVIDIA) schedule correctly while allowing NoSchedule taints to protect costly GPU nodes.
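
A small sketch of how a client might check the 50-label and 50-taint limits before sending a KubernetesConfig value. The limits and the parameter name come from the summary above; the field names `Labels` and `Taints`, the taint shape (`Key`, `Value`, `Effect`), and the helper itself are assumptions made for illustration and may not match the real API schema.

```python
MAX_LABELS = 50  # per instance group, per the announcement
MAX_TAINTS = 50  # per instance group, per the announcement


def build_kubernetes_config(labels: dict, taints: list) -> dict:
    """Hypothetical helper: validate limits and assemble a config payload."""
    if len(labels) > MAX_LABELS:
        raise ValueError(f"at most {MAX_LABELS} labels per instance group")
    if len(taints) > MAX_TAINTS:
        raise ValueError(f"at most {MAX_TAINTS} taints per instance group")
    # Field names below are assumed, not taken from the API reference.
    return {"Labels": dict(labels), "Taints": [dict(t) for t in taints]}


config = build_kubernetes_config(
    labels={"team": "research", "accelerator": "gpu"},
    taints=[{"Key": "dedicated", "Value": "gpu", "Effect": "NoSchedule"}],
)
print(config)
```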

Amazon SageMaker HyperPod: Programmatic Node Recovery

🚀 New programmatic APIs for Amazon SageMaker HyperPod are now generally available, letting administrators reboot or replace cluster nodes at scale. The BatchRebootClusterNodes and BatchReplaceClusterNodes APIs provide an orchestrator-agnostic way to recover unresponsive or degraded nodes for both Slurm and EKS clusters. Each API supports batch operations for up to 25 instances and complements existing orchestrator-specific workflows. The capabilities are currently available in US East (Ohio), Asia Pacific (Mumbai), and Asia Pacific (Tokyo) and are accessible via the AWS CLI, SDKs, or API calls.
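
Because each call covers at most 25 instances, recovering a larger set of nodes means splitting the list into batches first. The sketch below shows only that chunking step; the `batches` helper and the node IDs are hypothetical, and no AWS call is made.

```python
from typing import Iterator, List, Sequence

MAX_BATCH = 25  # per-call instance limit stated in the announcement


def batches(node_ids: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive groups of node IDs, each within the per-call limit."""
    if not 1 <= size <= MAX_BATCH:
        raise ValueError(f"batch size must be between 1 and {MAX_BATCH}")
    for start in range(0, len(node_ids), size):
        yield list(node_ids[start:start + size])


# Placeholder IDs for 60 degraded nodes.
degraded = [f"i-{n:04d}" for n in range(60)]
groups = list(batches(degraded))
print([len(g) for g in groups])  # [25, 25, 10]
```

Each group would then be sent in its own reboot or replace request.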

SageMaker AI Adds Flexible Training Plans for Inference

⚙️ Amazon SageMaker AI's Flexible Training Plans (FTP) now support inference endpoints, allowing customers to reserve guaranteed GPU capacity for planned evaluations and production peaks. You choose instance types, compute requirements, reservation length, and start date, then reference the reservation ARN when creating an endpoint. SageMaker AI automatically provisions and runs the endpoint on the reserved capacity for the plan duration, removing much of the infrastructure scheduling overhead. FTP for inference is initially available in US East (N. Virginia), US West (Oregon), and US East (Ohio).

Manage SageMaker HyperPod Clusters with AI MCP Server

🔧 The Amazon SageMaker AI MCP Server now provides tools to set up and manage HyperPod clusters, allowing AI coding assistants to provision and operate clusters for distributed training, fine‑tuning, and deployment. It automates prerequisites and orchestrates clusters via Amazon EKS or Slurm with CloudFormation templates that optimize networking, storage, and compute. The server also delivers lifecycle operations — scaling, patching, diagnostics — so administrators and data scientists can manage large-scale AI/ML clusters without deep infrastructure expertise.

SageMaker AI Inference Adds Bidirectional Streaming

🎙️ Amazon SageMaker AI Inference now supports bidirectional streaming, enabling real-time speech-to-text transcription that returns partial transcripts while audio is still being captured. Using the new Bidirectional Stream API, clients open an HTTP/2 connection to the SageMaker AI runtime, which automatically creates a WebSocket to your model container so audio frames and interim transcripts flow continuously. Any container that implements a WebSocket handler per the SageMaker AI contract works out of the box, allowing real-time models such as Deepgram to run without modification. The feature eliminates weeks or months of custom streaming infrastructure work so teams can focus on model accuracy, latency tuning, and agent behavior.

Amazon SageMaker Adds EAGLE for Faster Inference Throughput

⚡ Amazon SageMaker AI now supports EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) speculative decoding to boost large language model inference throughput by up to 2.5x. The capability enables models to predict and validate multiple tokens in parallel rather than one at a time, preserving output quality while reducing latency. SageMaker automatically selects between EAGLE 2 and EAGLE 3 depending on model architecture and provides built‑in optimization jobs using curated or customer datasets. Optimized models can be deployed through existing SageMaker inference workflows without infrastructure changes, and the feature is available in select AWS Regions.
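
The "predict and validate multiple tokens" idea can be shown with a generic draft-and-verify loop. This is a toy sketch of greedy speculative decoding in general, not EAGLE's specific algorithm or SageMaker's implementation; `target_model` and `draft_model` are made-up stand-ins that map a token list to the next token.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[int], k: int) -> List[int]:
    """Draft k tokens, keep the prefix the target agrees with, add one target token."""
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    accepted: List[int] = []
    # A real system scores all drafted positions in one parallel target pass;
    # the target is called position by position here only for readability.
    for token in proposed:
        expected = target(context + accepted)
        if token != expected:
            accepted.append(expected)  # target's token replaces the mismatch
            return accepted
        accepted.append(token)
    accepted.append(target(context + accepted))  # extra token when all drafts match
    return accepted


def target_model(tokens: List[int]) -> int:
    return (tokens[-1] + 1) % 10  # toy "large" model: count upward


def draft_model(tokens: List[int]) -> int:
    # Toy "small" model: agrees with the target except after token 3.
    return 0 if tokens[-1] == 3 else (tokens[-1] + 1) % 10


print(speculative_step(target_model, draft_model, [1], k=4))  # [2, 3, 4]
```

The output matches what the target model alone would produce greedily, which is why this style of decoding can raise throughput without changing results: three tokens are emitted in one verification round instead of three sequential target steps.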

Amazon Athena for Apache Spark Integrated with SageMaker

🚀 Amazon SageMaker now supports Amazon Athena for Apache Spark, combining a new notebook experience with a fast serverless Spark runtime in a single workspace. Data engineers, analysts, and data scientists can query data, run Python, develop jobs, train models, and visualize results with no infrastructure to manage and per-second billing. The service runs Spark 3.5.6, is optimized for Apache Iceberg and Delta Lake, and adds debugging, real-time Spark UI monitoring, and secure Spark Connect communication. Table-level access controls are enforced through AWS Lake Formation.

Amazon SageMaker One-Click Onboarding for Existing Data

✨ Amazon SageMaker now offers one-click onboarding of existing AWS datasets into Amazon SageMaker Unified Studio, letting customers begin data work in minutes while retaining their current IAM roles and permissions. The feature provisions a pre-configured serverless notebook with a built-in AI agent that supports SQL, Python, Spark, and natural language. Users can start from the SageMaker, Amazon Athena, Amazon Redshift, or Amazon S3 Tables consoles, and the setup imports permissions from AWS Glue Data Catalog, Lake Formation, and S3 to accelerate first use.

Amazon SageMaker Data Agent for Analytics and ML Development

🤖 Amazon SageMaker Data Agent is a built-in AI agent in the new notebook experience that accelerates analytics and ML development. It translates natural-language prompts into detailed execution plans and generates SQL and Python code, while staying aware of notebook context and data catalog metadata. Available in multiple AWS Regions, it speeds common tasks like data transformation, statistical analysis, and model prototyping.