ciso brief

All news tagged #amazon sagemaker ai

126 articles · page 2 of 7

SageMaker Adds Serverless Fine-Tuning for Qwen3.6 Model

🚀 Amazon SageMaker AI now supports serverless customization for the Qwen3.6 27B parameter model using supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT). This extends SageMaker's existing fine-tuning support for Qwen3.5 and other open-weight models. With serverless customization, SageMaker handles provisioning and orchestration, so teams have no infrastructure to manage and pay only for what they use. The feature is available in US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and EU (Ireland).

AWS Adds GLM-5.1-FP8 and Phi-4-mini to SageMaker JumpStart

🔔 AWS has added GLM-5.1-FP8 (from Z.ai) and Phi-4-mini-instruct (from Microsoft) to Amazon SageMaker JumpStart, expanding foundation model choices for enterprise workloads. GLM-5.1-FP8 targets agentic software engineering and multi-round optimization for repository-level code, debugging, and long-horizon automation. Phi-4-mini-instruct provides compact, low-latency reasoning across 24 languages and supports function calling for edge and latency-sensitive use cases. Customers can deploy these models in a few clicks from SageMaker Studio or programmatically with the SageMaker Python SDK.

Qwen Speech Models Added to Amazon SageMaker JumpStart

🔊 AWS has added three Qwen speech foundation models (Qwen3-TTS-12Hz-1.7B-CustomVoice, Qwen3-TTS-12Hz-1.7B-Base, and Qwen3-ASR-1.7B) to Amazon SageMaker JumpStart. The models deliver multilingual text-to-speech across more than 10 languages and automatic speech recognition across 52 languages and dialects. CustomVoice offers instruction-driven control over timbre, emotion, and prosody, while Base enables rapid voice cloning from a 3-second sample. Customers can deploy these models in a few clicks from SageMaker Studio or programmatically with the SageMaker Python SDK.

New Image and Embedding Models Available in SageMaker

🆕 AWS added FLUX.2-klein-base-4B and Qwen3-Embedding-0.6B to Amazon SageMaker JumpStart. FLUX.2-klein-base-4B targets real-time image generation and multi-reference editing in a compact architecture that can run on consumer GPUs with about 13 GB of VRAM. Qwen3-Embedding-0.6B delivers instruction-aware, multilingual text embeddings across 100+ languages for retrieval, RAG, and semantic search. Customers can deploy these models via SageMaker Studio or the SageMaker Python SDK.

SageMaker Data Agent Now Supports IAM Identity Center

🧭 Amazon SageMaker Data Agent is now available in SageMaker Unified Studio domains configured with IAM Identity Center. The agent enables data analysts and engineers to describe analysis goals in plain English and receive working Python or SQL code for connected sources such as Amazon Athena, Amazon Redshift, Amazon S3, and AWS Glue Data Catalog. It preserves conversational context across notebook cells, selected tables, and query history, proposes step-by-step plans, and includes a Fix with AI feature to help debug execution errors. The capability is available in all commercial AWS Regions where Unified Studio is supported.

SageMaker Feature Store Adds SDK v3, Lake Formation

🔒 Amazon SageMaker Feature Store now supports the SageMaker Python SDK v3, providing modular APIs to manage feature groups with less boilerplate. Data scientists can enable Lake Formation access controls at feature group creation to enforce column- and row-level permissions on offline store data. The SDK also exposes Apache Iceberg table properties for configuring compaction and snapshot expiration to optimize storage and queries. The feature is available in all AWS Regions where Feature Store is offered; install SDK v3.8.0 or later to begin.

P6-B200 Instances Available in US East for SageMaker

🚀 AWS has announced general availability of Amazon EC2 P6-B200 instances in US East (N. Virginia) for use with SageMaker Studio notebooks. These instances feature eight NVIDIA Blackwell GPUs, 1440 GB of high-bandwidth GPU memory, and 5th Generation Intel Xeon (Emerald Rapids) processors, offering up to 2x the training performance of P5en. They enable interactive development and fine-tuning of large foundation models directly in JupyterLab or Code Editor for generative AI workloads.

P5.4xl Instances Now in SageMaker Studio Notebooks

🚀 Amazon Web Services has announced general availability of Amazon EC2 P5.4xl instances for SageMaker Studio notebooks, powered by NVIDIA H100 Tensor Core GPUs. According to AWS, these instances offer up to 4x faster time-to-solution than previous-generation GPU instances and up to 40% lower training cost for ML models. They are designed to accelerate training and deployment of demanding deep learning (DL) and high-performance computing (HPC) workloads, including large language models and diffusion models. P5.4xl is available now in select US, Asia Pacific, and South America regions; AWS provides developer guides and pricing details.

G6 EC2 Instances Now in Dubai and Malaysia for SageMaker

🚀 Amazon Web Services announced general availability of Amazon EC2 G6 instances for SageMaker Studio notebooks in the Middle East (Dubai) and Asia Pacific (Malaysia). G6 instances pair up to eight NVIDIA L4 Tensor Core GPUs (24 GB each) with third-generation AMD EPYC processors, delivering roughly 2× better deep-learning inference performance than G4dn. These instances support interactive model deployment and training for generative AI fine-tuning, NLP, vision, and recommender workloads. Refer to the developer guides for JupyterLab and Code Editor setup and to the pricing page for cost details.

AWS Adds G6e EC2 Instances to SageMaker Studio Regions

🚀 Amazon Web Services announced general availability of EC2 G6e instances on SageMaker Studio notebooks in Dubai, Tokyo, Seoul, Frankfurt, Stockholm, and Spain. G6e instances provide up to 8 NVIDIA L40S Tensor Core GPUs with 48 GB per GPU and 3rd‑generation AMD EPYC processors, delivering up to 2.5× the performance of G5. They target interactive model testing, training, and generative AI fine‑tuning, and can host LLMs up to 13B parameters as well as diffusion models for image, video, and audio generation. Developer guides cover JupyterLab and Code Editor setup; pricing is available on the AWS pricing page.

P4de Instances Expand to SageMaker Studio Notebooks

🚀 Amazon Web Services has announced the general availability of EC2 P4de instances on SageMaker Studio notebooks in Asia Pacific (Tokyo, Singapore) and Europe (Frankfurt). Each P4de instance packs eight NVIDIA A100 GPUs with 80 GB of HBM2e each (640 GB total), offering 2× the per‑GPU memory of P4d. AWS reports up to 60% faster ML training and roughly 20% lower training cost compared to P4d, which benefits workloads with large, high‑resolution datasets. Developers can follow the SageMaker JupyterLab and Code Editor guides and consult the pricing page for cost planning.

SageMaker Unified Studio adds guided tutorials and notes

🧭 Amazon SageMaker Unified Studio introduces a getting-started section with short tutorials that guide users through core workflows—running a first SQL query, analyzing notebook data, building a Visual ETL pipeline, and training an ML model—each using pre-loaded sample data and completable in under 10 minutes. The development environment now auto-matches your OS light/dark mode on first sign-in. A new in-product “What’s New” area surfaces release notes and recent feature announcements to help users discover capabilities as they launch.

SageMaker Unified Studio adds identity and user controls

🔐 Amazon announced new administration features for SageMaker Unified Studio that give administrators finer control over identity configuration and user management across both IAM and IAM Identity Center domain types. Administrators can now configure AWS IAM Identity Center for SSO onboarding, add IAM roles, users, and groups as project members, and manage domain users from a consolidated admin portal. For Identity Center domains, federated access through IAM roles now produces unique user sessions so collaborators sharing a role do not overwrite each other and actions remain auditable. These updates enable teams to use corporate IAM or IAM Identity Center identities consistently across domains and simplify collaboration and auditing in the Studio environment.

Amazon SageMaker HyperPod Adds AMI-Based Node Setup

🔧 Amazon SageMaker HyperPod now supports AMI-based node lifecycle configuration for Slurm clusters, provisioning nodes with the software and configurations needed for production-ready AI/ML training environments. The AMI includes required components such as Docker, Enroot, and Pyxis, plus Slurm accounting, SSH key generation, log rotation, and user home setup. To enable it, omit the LifeCycleConfig block when creating clusters or select "None" under Lifecycle scripts in the console; you can still supply an extension script for additional customization or continue using full custom lifecycle scripts if you need complete control. This feature is available in all AWS Regions where SageMaker HyperPod is offered.
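
As a rough illustration of what "omit the LifeCycleConfig block" could look like in a request payload, the sketch below builds two instance-group dictionaries, one AMI-based and one with custom lifecycle scripts. It makes no AWS calls. The helper `build_instance_group`, the role ARN, the bucket, the script name, and the instance type are hypothetical, and the field names are assumptions modeled on the CreateCluster request shape that should be checked against the current API reference.

```python
# Sketch only: builds request fragments, does not contact AWS.
# Field names are assumed to mirror the HyperPod CreateCluster
# InstanceGroups structure; verify them before real use.

def build_instance_group(name, instance_type, count, execution_role,
                         lifecycle_config=None):
    """Return an instance-group dict (hypothetical helper).

    When lifecycle_config is None, the LifeCycleConfig key is left out
    entirely, which is how the announcement describes opting in to
    AMI-based node setup.
    """
    group = {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": count,
        "ExecutionRole": execution_role,
    }
    if lifecycle_config is not None:
        group["LifeCycleConfig"] = lifecycle_config
    return group


# Placeholder role ARN, not a real account.
ROLE = "arn:aws:iam::123456789012:role/ExampleHyperPodRole"

# AMI-based setup: no LifeCycleConfig key at all.
ami_based = build_instance_group("workers", "ml.g5.xlarge", 2, ROLE)

# Full custom lifecycle scripts: LifeCycleConfig supplied explicitly.
custom = build_instance_group(
    "workers", "ml.g5.xlarge", 2, ROLE,
    lifecycle_config={
        "SourceS3Uri": "s3://example-bucket/lifecycle/",  # placeholder
        "OnCreate": "on_create.sh",                       # placeholder
    },
)
```

Either dictionary would then be passed as one element of the instance-group list in a cluster creation request.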

Amazon Quick Adds Direct Query to S3 Table Buckets

🔍 Amazon Quick now supports Amazon S3 table buckets as a direct data source, enabling dashboards, conversational analytics, and exploration of Apache Iceberg tables stored in S3 without intermediate warehouses or OLAP layers. Paired with Zero-ETL ingestion from systems like Salesforce, SAP, and Amazon Kinesis Data Firehose, organizations can access near real-time insights with reduced pipeline complexity. Admins configure S3 table bucket permissions once, and authors can immediately create datasets and use Dataset Q&A to query the lakehouse in natural language.

Amazon SageMaker AI Adds Agentic Model Customization

🤖 Amazon SageMaker AI introduces an agentic experience that shortens model customization from months to days or hours. Using SageMaker AI model customization agent skills, developers work with coding agents in natural language to prepare data, fine-tune models, evaluate quality with LLM-as-a-judge metrics, and generate reusable code artifacts. Skills can be installed into IDEs via the sagemaker-ai agent plugin or used pre-installed in SageMaker Studio notebooks, and they support deployment to Amazon Bedrock or SageMaker AI endpoints.

Amazon SageMaker AI adds prioritized instance fallback

🚀 Amazon SageMaker AI endpoints now support prioritized instance pools for flexible provisioning. When your preferred instance type has insufficient capacity, SageMaker AI automatically provisions from the next option in your prioritized list during endpoint creation, updates, and autoscaling, keeping endpoints reliable without manual intervention. You can specify hardware-optimized model artifacts per instance type and monitor per-instance-type CloudWatch metrics for latency, throughput, GPU utilization, and instance counts.
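
The fallback behavior described above can be pictured with a small simulation. This is not the SageMaker API: `pick_instance_type`, the capacity table, and the instance types listed are hypothetical, chosen only to show how a prioritized list might be walked until an option with enough capacity is found.

```python
# Illustration of prioritized fallback; not SageMaker's implementation.

def pick_instance_type(prioritized_types, available_capacity, needed=1):
    """Return the first instance type, in priority order, whose
    available capacity covers the requested count, or None."""
    for instance_type in prioritized_types:
        if available_capacity.get(instance_type, 0) >= needed:
            return instance_type
    return None


# Hypothetical priority list, most preferred first.
priorities = ["ml.p5.48xlarge", "ml.g6e.12xlarge", "ml.g5.12xlarge"]

# Hypothetical capacity snapshot: the preferred type has none free.
capacity = {"ml.p5.48xlarge": 0, "ml.g6e.12xlarge": 3, "ml.g5.12xlarge": 10}

chosen = pick_instance_type(priorities, capacity, needed=2)
print(chosen)
```

In this toy snapshot the first choice has no capacity, so the selection falls through to the second entry in the list.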

Amazon SageMaker JumpStart Adds Google DeepMind Gemma 4

🤖 AWS has added Google DeepMind's instruction‑tuned Gemma 4 E4B, Gemma 4 26B‑A4B, and Gemma 4 31B to SageMaker JumpStart, making multimodal foundation models directly accessible to AWS customers. The models offer configurable step‑by‑step reasoning, interleaved text and image inputs, video and image understanding, native function calling, and multilingual support across 140+ languages. Gemma 4 E4B also supports audio input for ASR and speech‑to‑translated‑text workflows. Customers can deploy these models via SageMaker Studio or the SageMaker Python SDK for rapid experimentation and production.

New Multilingual and Table Models in SageMaker JumpStart

🆕 Amazon SageMaker JumpStart now includes paraphrase-multilingual-MiniLM-L12-v2, Microsoft Table Transformer Detection, and Bielik-11B-v3.0-Instruct. The MiniLM model maps sentences to 384-dimensional dense vectors across 50+ languages for cross-lingual semantic search, multilingual clustering, and sentence similarity scoring. The Microsoft Table Transformer is a DETR-based detector trained on PubTables-1M to locate tables in PDFs and scanned images for document digitization. Bielik-11B offers an 11B-parameter multilingual generative model focused on Polish and 32 European languages for dialogue, STEM reasoning, and enterprise NLP.
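
To make "sentence similarity scoring" over 384-dimensional vectors concrete, here is a minimal sketch of cosine similarity, the measure commonly used to compare sentence embeddings. The vectors are random stand-ins generated locally, not output from the MiniLM model, and the names `query`, `near`, and `unrelated` are illustrative only.

```python
import math
import random

EMBEDDING_DIM = 384  # dimensionality stated for the MiniLM model


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption).
rng = random.Random(0)
query = [rng.gauss(0, 1) for _ in range(EMBEDDING_DIM)]
near = [x + 0.1 * rng.gauss(0, 1) for x in query]   # slightly perturbed copy
unrelated = [rng.gauss(0, 1) for _ in range(EMBEDDING_DIM)]

# Rank candidates by similarity to the query, highest first.
ranked = sorted(
    [("near", near), ("unrelated", unrelated)],
    key=lambda item: cosine_similarity(query, item[1]),
    reverse=True,
)
```

With real embeddings, the same ranking step is the core of cross-lingual semantic search: sentences with similar meaning should score close to 1 regardless of language.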

Amazon SageMaker HyperPod adds G7e and r5d.16xlarge

🚀 Amazon SageMaker HyperPod now supports G7e and r5d.16xlarge instances to improve large-model development, training, and deployment at scale. G7e uses NVIDIA RTX PRO 6000 Blackwell GPUs, offering up to 2.3x better inference performance than G6e and up to 768 GB of GPU memory for larger LLMs, multimodal, and agentic AI workloads. The r5d.16xlarge provides 64 vCPUs, 512 GB of RAM, and NVMe storage for distributed preprocessing, feature engineering, and memory-heavy orchestration. G7e is available in select US and Asia Pacific regions, while r5d.16xlarge is available across all HyperPod regions.