< ciso
brief />

All news tagged #amazon sagemaker ai


SageMaker Training Plans: CloudWatch Metrics for Capacity

📊 Amazon SageMaker Training Plans now publishes Amazon CloudWatch metrics to track utilization of capacity reservations tied to purchased Flexible Training Plans. Administrators gain both historical and real‑time views of instance usage at the individual plan level and across an account, enabling informed decisions about capacity allocation and cost optimization. This observability helps teams align compute consumption with AI budgets and timelines while reducing wasted reserved capacity.

SageMaker Unified Studio: Notebook Kernels Now in VPC

🔒 Amazon SageMaker Unified Studio now runs notebook kernels inside the domain-configured Amazon VPC, providing network isolation for interactive ML and data workloads. Kernels inherit VPC settings, subnets, and security groups defined at the domain level, enabling centralized network policy and secure access to private databases, internal APIs, and non-public data sources. This VPC configuration applies to the interactive compute where Python code and dataframes execute; other compute engines have separate VPC considerations. VPC-enabled kernels are available in all Regions where SageMaker Unified Studio is supported.

SageMaker Unified Studio Adds Multiple Code Spaces

🧑‍💻 Amazon SageMaker Unified Studio now lets data workers create and manage multiple code spaces within a single project for IAM domains. Each space maintains its own persistent Amazon EBS volume and independent compute and storage settings, and can be paused, resumed, or connected to a local IDE while preserving files and session state. This enables parallel workstreams and isolated experiments with tailored runtimes and is available in all Regions where SageMaker Unified Studio is offered.

Amazon SageMaker Adds Serverless Fine-Tuning for Qwen3.5

🧩 Amazon SageMaker AI now supports serverless model customization for Qwen3.5, enabling supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT) of 4B, 9B, and 27B parameter models. With serverless customization, SageMaker handles infrastructure provisioning and training orchestration so teams can focus on data, evaluation, and domain adaptation while paying only for consumed resources. This capability is available in US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and EU (Ireland) and can be launched from SageMaker Studio or the SageMaker Python SDK.

SageMaker AI Introduces Automated Inference Recommendations

🔧 Amazon SageMaker AI now provides inference recommendations that automate optimization and benchmarking to deliver validated, deployment-ready configurations. Customers supply their own generative models, define expected traffic patterns, and set a performance objective — optimize for cost, minimize latency, or maximize throughput. SageMaker analyzes model architecture, benchmarks across multiple instance types using NVIDIA AIPerf, and returns metrics such as time to first token, inter-token latency, request latency percentiles, throughput, and cost projections. The capability is available today in seven AWS Regions.
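The latency metrics named above can be derived from per-token timestamps. The following is a minimal sketch of that arithmetic, not the way SageMaker or NVIDIA AIPerf computes them; the function names, the nearest-rank percentile method, and the sample values are assumptions made for illustration.

```python
import math
import statistics

def token_latency_metrics(request_start: float, token_times: list[float]) -> dict[str, float]:
    """Derive per-request latency metrics from token arrival times (seconds).

    Hypothetical helper for illustration; not part of any AWS or NVIDIA API.
    """
    # Time to first token: delay between sending the request and the first token.
    ttft = token_times[0] - request_start
    # Inter-token latency: mean gap between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    inter_token = statistics.mean(gaps) if gaps else 0.0
    # Request latency: delay between sending the request and the last token.
    return {
        "ttft": ttft,
        "inter_token_latency": inter_token,
        "request_latency": token_times[-1] - request_start,
    }

def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile (an assumed method; tools may interpolate instead)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Made-up sample: request sent at t=0.0, four tokens arrive afterwards.
metrics = token_latency_metrics(0.0, [0.5, 0.6, 0.7, 0.8])
print(metrics)
```

Percentiles such as P50 or P99 would then be taken over the `request_latency` values of many requests, for example `percentile(latencies, 99)`.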

Amazon SageMaker HyperPod Adds Flexible Instance Groups

🆕 Amazon SageMaker HyperPod now supports flexible instance groups, allowing multiple instance types and multiple subnets within a single instance group. Using a new InstanceRequirements parameter, HyperPod provisions the highest-priority instance type first and automatically falls back to lower-priority types when capacity is unavailable. The feature integrates with Karpenter autoscaling, and flexible instance groups can be created via the CreateCluster/UpdateCluster APIs, the AWS CLI, or the Management Console.
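The priority-with-fallback behavior described above can be pictured with a small simulation. This is a sketch of the idea only: it does not call the HyperPod API, the function name and capacity snapshot are hypothetical, and it assumes the whole request is placed on a single instance type, which the announcement does not specify.

```python
from typing import Optional

def pick_instance_type(
    priorities: list[str],
    available: dict[str, int],
    needed: int,
) -> Optional[str]:
    """Return the highest-priority instance type with enough capacity, else None.

    Hypothetical helper; `priorities` is ordered highest priority first.
    """
    for instance_type in priorities:
        # Fall through to the next type when this one lacks capacity.
        if available.get(instance_type, 0) >= needed:
            return instance_type
    return None

# Made-up capacity snapshot: the preferred type has no free instances.
priorities = ["ml.p5.48xlarge", "ml.p4d.24xlarge", "ml.g5.48xlarge"]
available = {"ml.p5.48xlarge": 0, "ml.p4d.24xlarge": 4, "ml.g5.48xlarge": 16}

print(pick_instance_type(priorities, available, needed=2))  # ml.p4d.24xlarge
```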

Amazon SageMaker HyperPod Adds On-Demand GPU Health Checks

🔍 Amazon SageMaker HyperPod now supports on-demand deep health checks for Amazon EKS and Slurm-orchestrated clusters. Administrators can run comprehensive GPU stress and connectivity tests on entire instance groups or specific instances, with progress and results visible at both group and instance levels via the SageMaker console and APIs. Instances under test are isolated from scheduling. Those that pass are returned to service; those that fail are rebooted or replaced when automatic node recovery is enabled.

SageMaker JumpStart Adds Optimized Deployments for FMs

🚀 SageMaker JumpStart now offers optimized deployments for foundation models, providing pre-configured, task-aware settings tailored to specific use cases and performance goals. Customers can choose cost-optimized, throughput-optimized, latency-optimized, or balanced configurations and preview P50 latency, time-to-first-token, and throughput metrics before deployment. Supported models include variants from Meta, Microsoft, Mistral AI, Qwen, Google, and TII, and deployments target SageMaker AI Managed Inference endpoints or HyperPod clusters. The feature leverages VPC deployment capabilities and is available in all Regions where JumpStart is supported.

Nemotron-3-Super-120B and Qwen3.5 Models Added to SageMaker

🚀 Amazon SageMaker JumpStart now includes NVIDIA’s Nemotron-3-Super-120B and the Qwen3.5 family (9B and 27B), giving customers turnkey access to foundation models optimized for agentic reasoning, multilingual coding, and advanced instruction following. Nemotron-3-Super-120B employs a hybrid Latent Mixture-of-Experts architecture with Mamba-2 and MoE layers to support collaborative agents and high-volume automation such as IT ticket triage and cybersecurity workflows. Qwen3.5-9B prioritizes efficiency for resource-constrained environments, while Qwen3.5-27B offers deeper contextual and multimodal reasoning for large-scale document processing and complex scenarios. Users can deploy these models directly from the JumpStart catalog or programmatically via the SageMaker Python SDK.

SageMaker HyperPod Adds Gang Scheduling for EKS Clusters

Amazon SageMaker HyperPod task governance now supports gang scheduling for HyperPod clusters using the EKS orchestrator. Administrators can configure readiness timeouts, node-failure behavior, single-workload admission and retry policies so distributed training jobs are only admitted when all required pods are ready. Pulled workloads are automatically requeued to avoid stalls and wasted compute. This reduces deadlocks, resource contention, and unexpected cost overruns for multi-pod training jobs.
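The all-or-nothing admission rule behind gang scheduling can be sketched in a few lines. This is an illustration of the concept under stated assumptions, not HyperPod task governance itself: the `Workload` class, the `admit_or_requeue` function, and the retry handling are hypothetical, and readiness timeouts are left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    """Hypothetical multi-pod training job."""
    name: str
    pods_required: int
    pods_ready: int
    attempts: int = 0

def admit_or_requeue(workload: Workload, max_retries: int) -> str:
    """Admit only when every required pod is ready; otherwise requeue or fail."""
    if workload.pods_ready >= workload.pods_required:
        return "admitted"
    # A partially ready gang is never started, so it cannot hold nodes idle.
    workload.attempts += 1
    return "requeued" if workload.attempts <= max_retries else "failed"

job = Workload(name="llm-pretrain", pods_required=8, pods_ready=6)
print(admit_or_requeue(job, max_retries=2))  # requeued
job.pods_ready = 8
print(admit_or_requeue(job, max_retries=2))  # admitted
```

Admitting a job with only some of its pods running is what produces the deadlocks mentioned above, since two half-started jobs can each hold the nodes the other needs.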

Amazon SageMaker Serverless Workflows for Identity Center

⚙️ Amazon SageMaker Unified Studio now supports Serverless Workflows in Identity Center domains, allowing customers to orchestrate data-processing tasks with Apache Airflow (via Managed Workflows for Apache Airflow) without provisioning Airflow infrastructure. Serverless Workflows auto-provision compute during runs and release it afterward, so you pay only for actual run time. Each workflow runs with its own execution role and isolated worker to ensure workflow-level security and prevent cross-workflow interference. The Visual Workflow experience supports around 200 operators and built-in integrations with services such as Amazon S3, Amazon Redshift, Amazon EMR, AWS Glue, and Amazon SageMaker AI.

SageMaker Unified Studio Adds Notebook Import/Export

📝 Amazon has added import/export capabilities to SageMaker Unified Studio notebooks to simplify migration from JupyterLab and other platforms. The feature supports .ipynb, .json, and .py formats while preserving cell types, outputs, execution history, and metadata. Exports are available in four package types (.zip with requirements, .ipynb, .py, and native .json). The release also introduces developer productivity features including cell reordering, keyboard shortcuts, cell renaming, and multi-line SQL with tabbed results.

Amazon SageMaker Data Agent Adds Charts, SQL, and MVs

📊 Amazon SageMaker Data Agent now embeds interactive charting, SQL analytics across Snowflake sources, and materialized view management directly inside SageMaker Unified Studio notebooks. You can enter natural-language prompts such as "plot monthly revenue trends by region for 2025" to generate interactive charts that support hover, editing, and refinement without writing code. When an analysis spans AWS and Snowflake, the agent can join Snowflake tables (reached via external connections) with AWS Glue Data Catalog data in a single prompt. The agent can also recommend and create materialized views, including refresh schedules, to optimize query performance.

SageMaker Data Agent Adds Japan and Australia CRI Support

🔒 SageMaker Data Agent now supports cross-region inference profiles for Japan (JP-CRIS) and Australia (AU-CRIS) via Amazon Bedrock. Inference requests originating in Asia Pacific (Tokyo) and Asia Pacific (Sydney) are processed entirely within their respective geographies, helping customers meet data residency and sovereignty requirements. Data Agent continues to provide conversational data exploration, Python and SQL code generation, troubleshooting, and analytics inside SageMaker Unified Studio Notebooks and the Query Editor, with traffic routed exclusively over the AWS Global Network.

SageMaker Unified Studio: CloudWatch Metrics for Glue Jobs

🔍 Amazon SageMaker Unified Studio now surfaces Amazon CloudWatch metrics for AWS Glue jobs directly alongside job logs in a single, unified interface. Data engineers can correlate DPU utilization, memory consumption, CPU load, and data movement size with log output to diagnose compute bottlenecks and memory pressure faster. The consolidated view reduces mean time to resolution for ETL pipelines and is available in all Regions where SageMaker Unified Studio is generally available. To view metrics, open a Glue job run and select the Metrics tab.

Amazon SageMaker Data Agent in Query Editor for SQL

🔍 Data Agent in the Amazon SageMaker Unified Studio Query Editor brings natural-language-to-SQL capabilities to your SQL analytics workflow. You can ask questions in plain language and have the agent generate context-aware SQL for Amazon Redshift and Amazon Athena, propose step-by-step plans, and use Fix with AI to diagnose and correct failed queries. It preserves query context across follow-ups and is available in IAM domains where SageMaker Unified Studio is supported.

SageMaker Studio Now Supports Remote Kiro and Cursor IDEs

🔗 AWS now enables remote connections from Kiro and Cursor IDEs to Amazon SageMaker Studio. Data scientists, ML engineers, and developers can use their local Kiro/Cursor setups — including spec-driven development, conversational coding, and automated feature generation — while running workloads on SageMaker's scalable cloud compute. Authentication is supported via the AWS Toolkit extension or SageMaker Studio's web UI, preserving Studio security boundaries and reducing context switching.

Cursor IDE Connects Remotely to SageMaker Unified Studio

🔗 AWS announced remote connection from Cursor IDE to SageMaker Unified Studio via the AWS Toolkit extension. This integration lets data scientists, ML engineers, and developers use their local Cursor setup — including AI-powered code completion, natural language editing, and multi-file editing — while leveraging SageMaker's scalable compute and data. Authentication is handled securely through AWS IAM via the Toolkit, preserving custom Cursor settings and enterprise-grade security.

AWS Batch Adds Quota Management and Preemption for SageMaker

⚙️ AWS Batch now supports quota management with job preemption for SageMaker Training, enabling prioritized GPU allocation and automatic preemption of lower-priority workloads. You can create up to 20 quota shares per job queue and choose resource-sharing strategies, with both cross-share and in-share preemption modes to restore or reallocate borrowed capacity. Capacity utilization is visible at queue, quota share, and job levels, and you can update job priorities after submission and set preemption retry limits. The feature integrates with the SageMaker Python SDK via the aws_batch module and is available in all AWS Regions where AWS Batch is offered; AWS provides an example notebook and user-guide documentation for implementation guidance.

Amazon SageMaker AI Adds Serverless Customization for Models

🚀 Amazon SageMaker AI now offers serverless model customization and reinforcement fine-tuning for 12 additional open‑weight models, enabling SFT, DPO, and advanced RFT techniques such as RLVR and RLAIF without infrastructure management. You can fine‑tune and evaluate these models on a pay‑per‑use basis across multiple regions. This simplifies alignment for complex, domain‑specific tasks and improves accuracy on verifiable tasks like code generation and structured extraction. No cluster setup, capacity planning, or distributed training expertise is required.