All news with #amazon sagemaker ai tag


SageMaker HyperPod Adds Continuous Provisioning for Slurm

🚀 Amazon SageMaker HyperPod now supports continuous provisioning for clusters using the Slurm orchestrator, allowing training jobs to start immediately on available instances while remaining capacity is provisioned in the background. Priority-based provisioning brings up the Slurm controller first, then login and worker nodes in parallel, with asynchronous retries for failed launches. The feature reduces time-to-training, improves utilization, and removes the need for manual scaling interventions.

Amazon SageMaker Unified Studio Adds Custom Filters

🔎 Amazon SageMaker Unified Studio now supports custom metadata search filters, enabling teams to narrow catalog results using organization-specific attributes like business region, data classification, or study name. Filters accept string fields with a contains operator and numeric fields (Integer, Long) with equals, greater than, and less than operators. Users can also filter by asset name, description, and date range, combine multiple filters, and retain selections across browser sessions; the feature is available in all AWS Regions where Unified Studio is supported.
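The operator rules described above can be illustrated with a minimal sketch. It is not the Unified Studio API: `MetadataFilter`, `matches`, and `search` are hypothetical names, the assets are plain dictionaries, and case-insensitive matching for `contains` is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Numeric operators named in the announcement (equals, greater than, less than).
NUMERIC_OPERATORS = {
    "equals": lambda actual, wanted: actual == wanted,
    "greater_than": lambda actual, wanted: actual > wanted,
    "less_than": lambda actual, wanted: actual < wanted,
}


@dataclass(frozen=True)
class MetadataFilter:
    """Hypothetical filter: one metadata field, one operator, one value."""
    field: str
    operator: str
    value: Any


def matches(asset: Dict[str, Any], flt: MetadataFilter) -> bool:
    """Return True if the asset satisfies a single filter."""
    actual = asset.get(flt.field)
    if flt.operator == "contains":
        # String fields only; case-insensitivity is an assumption of this sketch.
        return isinstance(actual, str) and str(flt.value).lower() in actual.lower()
    if flt.operator in NUMERIC_OPERATORS:
        # Integer-like fields only; bool is excluded even though it subclasses int.
        if isinstance(actual, bool) or not isinstance(actual, int):
            return False
        return NUMERIC_OPERATORS[flt.operator](actual, flt.value)
    raise ValueError(f"unsupported operator: {flt.operator}")


def search(assets: List[Dict[str, Any]], filters: List[MetadataFilter]) -> List[Dict[str, Any]]:
    """Combine multiple filters: an asset must satisfy all of them."""
    return [asset for asset in assets if all(matches(asset, f) for f in filters)]
```

Combining filters with a logical AND is also an assumption here; the announcement only says that multiple filters can be combined.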

Amazon SageMaker Unified Studio Aggregated Lineage View

🔍 Amazon SageMaker Unified Studio now provides an aggregated view of data lineage that consolidates all jobs contributing to a dataset. The aggregated view displays multi-level transformations and dependencies to help you identify upstream sources and downstream consumers across the full lineage graph. It is the default for IdC-based domains, with an option to revert to the previous event-timestamp-ordered view, and the new QueryGraph API exposes node graphs with metadata and augmented business context.

Amazon SageMaker Unified Studio: Aggregated Lineage

🔍 Amazon SageMaker Unified Studio now offers an aggregated lineage view that shows all jobs contributing to a dataset across multiple levels of the lineage graph. The aggregated view is the default for IdC-based domains, while the previous event-timestamp snapshot can be restored via a "display in event timestamp order" toggle. A new QueryGraph API returns lineage node graphs with metadata and augmented business context; the capability is available in all SageMaker Unified Studio regions, with documentation and API references provided.

Amazon SageMaker Training Plan Extension Now Available

🔁 Amazon SageMaker Training Plans can now be extended to cover AI training runs that take longer than originally scheduled. Extensions are available in 1-day increments up to 14 days, or in 7-day increments up to 182 days, and can be purchased through the SageMaker console or via API. Once an extension is purchased, the reserved GPU capacity (clusters of up to 64 instances) continues to run without interruption, and SageMaker automatically provisions infrastructure so workloads keep running without reconfiguration. The feature helps teams maintain cost-efficient training schedules and reduce operational disruptions.
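One way to read the increment rule is sketched below. This is an interpretation, not AWS's validation logic: the function names are hypothetical, and the sketch assumes that any whole number of days from 1 to 14 is allowed, and that longer extensions must be a multiple of 7 days and no more than 182 days.

```python
from typing import List

MAX_DAILY_DAYS = 14    # 1-day increments apply up to this length
MAX_WEEKLY_DAYS = 182  # 7-day increments apply up to this length


def is_valid_extension(days: int) -> bool:
    """Check an extension length against the assumed increment rule."""
    if days < 1:
        return False
    if days <= MAX_DAILY_DAYS:
        return True
    return days % 7 == 0 and days <= MAX_WEEKLY_DAYS


def valid_extensions() -> List[int]:
    """List every extension length (in days) accepted by the rule above."""
    return [d for d in range(1, MAX_WEEKLY_DAYS + 1) if is_valid_extension(d)]
```

Under these assumptions a 15-day extension is rejected, and the next available length after 14 days is 21 days.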

Amazon SageMaker Unified Studio Adds Light Mode Option

🔆 AWS has added light mode support to Amazon SageMaker Unified Studio for IAM-based domains, allowing users to choose between dark and light visual themes. The addition improves readability in bright environments and offers a familiar look for customers who prefer lighter interfaces. In Studio, select Profile > customize appearance to switch modes. The setting persists across browsers and devices and complements the existing dark mode to give users full control over their development environment's appearance.

Amazon SageMaker HyperPod Adds RIG Observability for Training

🔍 Amazon SageMaker HyperPod now provides integrated observability for Restricted Instance Groups (RIG), giving teams training foundation models with Nova Forge a unified view of compute resources and training workloads. A pre-configured Amazon Managed Grafana dashboard, backed by Amazon Managed Service for Prometheus, aggregates metrics from four exporters to show GPU utilization, NVLink bandwidth, CPU pressure, FSx for Lustre usage, network fabric, Kubernetes state, and curated logs including epoch progress, step-level logs, pipeline errors, and Python tracebacks. Observability is automatically enabled for new RIG clusters and can be turned on for existing clusters via the HyperPod console; it is available in all Regions where SageMaker HyperPod RIG is supported.

SageMaker Unified Studio Syncs Catalog Metadata to Partners

🔁 Amazon SageMaker Unified Studio now synchronizes catalog metadata and context with Atlan, Collibra, and Alation, aligning projects, assets, descriptions, glossary terms, and hierarchies across platforms. Collibra supports bidirectional synchronization and can manage SageMaker Unified Studio data access requests, while Atlan and Alation ingest metadata from SageMaker with additional enhancements planned. The Collibra integration is provided as an open-source solution on GitHub, and setup is performed by establishing connections from each partner to SageMaker Unified Studio.

Amazon SageMaker Unified Studio Adds AWS Glue 5.1 Support

🚀 Amazon SageMaker Unified Studio now supports AWS Glue 5.1 for Visual ETL, notebook, and code-based data processing jobs. With Glue 5.1 you can run on Apache Spark 3.5.6 with Python 3.11 and Scala 2.12.18, and use updated open table formats including Apache Iceberg 1.10.0, Apache Hudi 1.0.2, and Delta Lake 3.3.2. Select Glue 5.1 from the job version dropdown to apply the runtime across Visual ETL, notebooks, and code jobs.

Kiro IDE Now Connects Remotely to SageMaker Unified

🔗 AWS now enables Kiro IDE to connect remotely to Amazon SageMaker Unified Studio, allowing data scientists, ML engineers, and developers to use their local Kiro setup — including spec-driven development, conversational coding, and automated feature generation — while running workloads on SageMaker’s scalable compute. The integration uses the AWS Toolkit extension for secure IAM-based authentication and preserves local specs, steering files, and hooks. This reduces context switching and keeps agentic development workflows within a single environment across AWS analytics and ML services. The capability is available in all Regions where SageMaker Unified Studio is offered.

Amazon SageMaker HyperPod: API-driven Slurm Management

🔧 Amazon SageMaker HyperPod now supports API-driven Slurm configuration, enabling you to define Slurm topology, instance-group-to-partition mappings, and FSx filesystem mounts directly through the CreateCluster and UpdateCluster APIs or via the AWS Console. The update lets you specify node roles such as Controller, Login, and Compute per instance group and mount FSx for Lustre or FSx for OpenZFS filesystems. A new SlurmConfigStrategy (Managed, Overwrite, Merge) detects partition-node drift and controls whether updates are paused, overwritten, or merged to preserve manual customizations.

Amazon SageMaker HyperPod Adds Console Node Actions

🔧 Amazon SageMaker HyperPod now lets operators manage individual cluster nodes directly from the AWS Console. The console enables SSM session launches, copyable pre-populated SSM CLI commands, and direct node actions such as reboot, delete, and replace, with support for batch operations across multiple nodes. Available in all Regions where HyperPod is supported, these controls reduce context switching and speed manual recovery for time-sensitive AI training and inference workloads.

Cartesia Sonic 3 Available in SageMaker JumpStart Catalog

🔈 Cartesia's Sonic 3 model is now available in Amazon SageMaker JumpStart, giving AWS customers a turnkey option for advanced streaming text-to-speech. Sonic 3 is a state space model (SSM) offering high naturalness, accurate transcript following, sub-100ms latency, and fine-grained control over volume, speed, and emotion. It supports 42 languages, natural laughter, and voices optimized for agents and expressive characters. Deployments can be launched from SageMaker Studio or via the SageMaker Python SDK.

Apache Spark Lineage Now in SageMaker Unified Studio

🔍 Amazon SageMaker now provides data lineage for Apache Spark jobs run on Amazon EMR and AWS Glue within IdC-based SageMaker Unified Studio domains. The feature captures schema and column-level transformations from EMR-EC2, EMR-Serverless, EMR-EKS, and Glue, and makes lineage explorable as a visual graph or queryable via APIs. Teams can compare transformation history across Spark jobs to investigate regressions, trace root causes, and assess impact. Spark lineage is available in all existing SageMaker Unified Studio regions.

AWS Adds DeepSeek OCR, MiniMax, and Qwen3 to JumpStart

📢 AWS has added DeepSeek OCR, MiniMax M2.1, and Qwen3-VL-8B-Instruct to SageMaker JumpStart, expanding the set of foundation models available to customers. DeepSeek OCR focuses on visual-text compression and structured extraction from forms, invoices, diagrams, and other dense document layouts. MiniMax M2.1 targets multilingual coding, tool use, instruction following, and long-horizon planning to support autonomous workflows. Qwen3-VL-8B-Instruct enhances vision-language reasoning, spatial and video dynamics comprehension, and extended context handling. Customers can deploy any of these models via the JumpStart catalog or the SageMaker Python SDK to accelerate AI application development on AWS infrastructure.

NVIDIA NIMs Now Available on Amazon SageMaker JumpStart

🚀 With Amazon SageMaker JumpStart, customers can now deploy four NVIDIA NIMs — ProteinMPNN, Nemotron-3.5B-Instruct, MSA Search NIM, and Cosmos Reason — with one click. These prebuilt, optimized inference microservices are designed for NVIDIA-accelerated infrastructure and target biosciences and physical AI use cases. They enable protein sequence optimization, GPU-accelerated multiple sequence alignment, large-context reasoning and agentic tool calling, and vision-language planning for robotics. Deployments are accessible from the SageMaker JumpStart catalog or via the SageMaker Python SDK.

Amazon SageMaker HyperPod: Enhanced lifecycle script logging

🔍 Amazon SageMaker HyperPod now surfaces detailed error messages and points directly to the CloudWatch log group and log stream that captured lifecycle script output. You can view these messages through the DescribeCluster API or via the SageMaker console, which includes a 'View lifecycle script logs' button to open the exact CloudWatch stream. CloudWatch entries now contain execution markers (begin, download start/complete, success/failure) to help pinpoint where provisioning failed. This enhancement is available in all Regions where HyperPod is supported and reduces time to diagnose lifecycle script issues.

Amazon SageMaker HyperPod Validates Account Service Quotas

🧭 The Amazon SageMaker HyperPod console now validates AWS service quotas for your account before initiating cluster creation. The console automatically compares your requested cluster configuration—instance types, EBS volume sizes, and VPC-related resources—against account-level quotas and presents a clear table of expected utilization, applied quota values, and compliance status. If validation detects potential quota shortfalls, it issues a warning and provides direct links to the Service Quotas console so you can request increases before provisioning begins.
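The comparison the console performs can be pictured with a small sketch. It does not call the Service Quotas API or reproduce the console's logic: `QuotaCheck`, `validate_quotas`, and the quota names in the usage note are hypothetical, and "expected utilization" is assumed to mean current usage plus the requested amount.

```python
from typing import Dict, List, NamedTuple, Optional


class QuotaCheck(NamedTuple):
    """One row of the validation table (hypothetical structure)."""
    name: str
    expected_utilization: float
    quota: float
    compliant: bool


def validate_quotas(
    requested: Dict[str, float],
    quotas: Dict[str, float],
    in_use: Optional[Dict[str, float]] = None,
) -> List[QuotaCheck]:
    """Compare a requested cluster configuration against account-level quotas."""
    in_use = in_use or {}
    rows = []
    for name, amount in requested.items():
        # Assumption: expected utilization = what is already in use + the request.
        expected = in_use.get(name, 0.0) + amount
        # A quota that is missing from the mapping is treated as zero.
        limit = quotas.get(name, 0.0)
        rows.append(QuotaCheck(name, expected, limit, expected <= limit))
    return rows


def shortfalls(rows: List[QuotaCheck]) -> List[str]:
    """Names of the quotas that would need an increase before provisioning."""
    return [row.name for row in rows if not row.compliant]
```

For example, requesting 8 `cluster_instances` with 2 already in use against a quota of 8 yields an expected utilization of 10 and a non-compliant row, which is the case where the console would warn and link to the Service Quotas console.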

MiniMax-M2 Now Deployable via SageMaker JumpStart

🚀 MiniMax-M2 is now available on SageMaker JumpStart, enabling deployment of this efficient open-source mixture-of-experts (MoE) model in minutes. The model combines 230 billion total parameters with 10 billion active parameters to deliver a compact, fast, and cost-effective option optimized for coding and agentic tasks while preserving strong general intelligence. Customers can deploy via SageMaker Studio or the SageMaker Python SDK and follow AWS best practices for production use.

Amazon SageMaker AI Launches in Asia Pacific (NZ) Region

🚀 Amazon announced that SageMaker AI is now available in the Asia Pacific (New Zealand) AWS Region. Starting today, developers and data scientists in New Zealand can build, train, and deploy machine learning models locally using the fully managed SageMaker AI platform. The service removes much of the operational overhead across the ML lifecycle, helping teams move from experimentation to production more quickly and consistently. Customers should review AWS documentation and pricing to get started.