< ciso
brief />
Tag Banner

All news with #amazon bedrock tag

173 articles · page 5 of 9

Amazon Bedrock Reserved Tier for Claude Sonnet in GovCloud

🔒 Amazon Bedrock is expanding its Reserved service tier to provide predictable, guaranteed tokens-per-minute capacity and prioritized compute for mission-critical workloads. The Reserved tier lets customers allocate separate input and output tokens-per-minute capacities to match asymmetric workload needs and control costs, while automatically overflowing to the pay-as-you-go Standard tier when reserved capacity is exceeded. This offering is available today for Anthropic Claude Sonnet 4.5 in AWS GovCloud (US-West) with 1- and 3-month reservation options billed monthly.
read more →

Amazon Bedrock Reserved Tier Adds Claude Opus & Haiku

🔒 Amazon Bedrock expands its Reserved service tier to provide predictable tokens‑per‑minute capacity for mission‑critical workloads. The tier lets customers reserve prioritized compute and separately configure input and output tokens‑per‑minute to match asymmetric usage patterns and control costs. When reserved capacity is exceeded, traffic automatically overflows to the pay‑as‑you‑go Standard tier to avoid interruptions. Reserved access is available today for Anthropic Claude Opus 4.5 and Claude Haiku 4.5, with 1‑month or 3‑month reservations billed monthly per 1K tokens‑per‑minute.
read more →

AWS Enables Granular Bedrock Operation Billing Labels

📊 AWS Data Exports now surfaces granular operation types for Amazon Bedrock in cost reports, replacing generic "Usage" labels with explicit operations such as InvokeModelInference and InvokeModelStreamingInference. These operation values appear in the line_item_operation column for Legacy CUR and CUR 2.0, the x_Operation column in FOCUS exports, and as Operation dimension values in the AWS Cost Explorer API. The change applies to all foundation models on Bedrock and is intended to help FinOps and cost optimization teams analyze and optimize model-driven spend.
read more →

Amazon Bedrock: Granular Operation Visibility in Cost Reports

📊 AWS Data Exports now surfaces granular Amazon Bedrock operation types in billing outputs, replacing generic "Usage" labels with explicit actions such as InvokeModelInference and InvokeModelStreamingInference. These operation values appear in Legacy CUR and CUR 2.0 via the line_item_operation column, in FOCUS exports via x_Operation, and as Operation dimension values in the Cost Explorer API. The visibility applies across all Bedrock foundation models and is intended to help FinOps and cost optimization teams perform more precise usage tracking and billing analysis.
read more →

Amazon Bedrock API Keys Now Available in GovCloud Regions

🔐 Amazon Bedrock now supports API keys in AWS GovCloud (US), extending the capability first introduced in commercial regions in July 2025. Developers can create short-term API keys (valid for the console session or up to 12 hours) and long-term keys with configurable lifetimes. Long-term keys are manageable through the AWS IAM console, reducing the need to manually configure IAM principals and policies and streamlining generative AI development.
read more →

NVIDIA Nemotron 3 Nano Now Available on Amazon Bedrock

🚀 Amazon Bedrock now supports NVIDIA Nemotron 3 Nano 30B A3B, NVIDIA's efficient hybrid Mixture-of-Experts language model with a 256k token context window and native tool-calling support. The model delivers higher throughput for agentic, coding, and complex reasoning workloads while preserving the depth of larger models through advanced reinforcement learning and multi-environment post-training. Powered by Project Mantle, Bedrock provides serverless distributed inference, QoS controls, automated capacity management and OpenAI API compatibility across multiple AWS Regions.
read more →

Amazon Bedrock Data Automation Adds Blueprint Optimization

🔧 Amazon Bedrock Data Automation now offers blueprint instruction optimization to improve custom field extraction accuracy using just a few example document assets with ground truth labels. The feature analyzes differences between expected values and Data Automation inferences, then refines the natural-language instructions in your blueprint to boost extraction performance without model training or fine-tuning. It produces evaluation metrics including exact match rates and F1 scores and is available in all Regions where Bedrock Data Automation is supported.
read more →

Amazon Quick Suite Adds Memory for Personalized Chat Agents

🧠 Amazon Quick Suite now adds memory to its chat agents, enabling personalized responses based on prior conversations and stated preferences. The feature stores inferred user preferences—such as response format, acronyms, dashboards, and integrations—and lets users view and remove any remembered items. Users may also choose Private Mode, in which chats are not used to infer memories. Memory is currently available in US East (N. Virginia) and US West (Oregon).
read more →

Amazon GameLift Servers Adds AI Assistance in Console

🤖 Amazon GameLift Servers now offers AI-powered assistance within the AWS Console, leveraging Amazon Q Developer to deliver tailored guidance for game developers. The integrated assistant helps with game server integration, fleet configuration, and performance optimization by surfacing in-console recommendations and troubleshooting steps. It is intended to streamline decision making, reduce troubleshooting time, and improve resource utilization for cost savings and better player experiences. The feature is available in all supported regions except AWS China.
read more →

Pegasus 1.2 Available with Global Cross-Region Inference

📣 Amazon Bedrock now offers TwelveLabs Pegasus 1.2 via Global cross-Region inference, expanding availability by 23 new Regions in addition to the seven where it was already supported. You can also access the model in all EU Regions using Geographic cross-Region inference to meet data-residency requirements. Pegasus 1.2 is a video-first model for long-form video-to-text generation and temporal understanding, enabling lower latency and simplified architecture for video-intelligence applications.
read more →

Amazon Bedrock Adds OpenAI-Compatible Responses API

🚀 Amazon Bedrock now exposes an OpenAI-compatible Responses API on new service endpoints, enabling asynchronous inference for long-running workloads, streaming and non-streaming modes, and automatic stateful conversation reconstruction so developers no longer must resend full histories. The endpoints provide Chat Completions with reasoning-effort support for models served by Mantle, Amazon’s distributed inference engine. Integration requires only a base URL change for OpenAI SDK–compatible code, and support starts today for OpenAI’s GPT OSS 20B and 120B models, with additional models coming soon.
read more →

Amazon Bedrock Adds Reinforcement Fine‑Tuning for Models

🔧 Amazon Bedrock now supports reinforcement fine-tuning, enabling developers to improve model accuracy without deep ML expertise or large labeled datasets. The service automates the reinforcement fine-tuning workflow and trains models by learning from feedback on multiple candidate responses, improving model judgment about what makes a good reply. AWS reports an average 66% accuracy gain over base models, allowing teams to deploy smaller, faster, and more cost-effective variants while maintaining quality. At launch the feature supports Amazon Nova 2 Lite, and it can be accessed via the Bedrock console or APIs.
read more →

AWS launches Apache Spark Upgrade Agent for Amazon EMR

🛠️ AWS announced the Apache Spark upgrade agent, a capability that automates and accelerates Spark version upgrades for Amazon EMR on EC2 and EMR Serverless. The agent performs automated code analysis across PySpark and Scala, identifies API and behavioral changes for Spark 2.4→3.5, and suggests precise code transformations. Engineers can invoke the agent from SageMaker Unified Studio, the Kiro CLI, or any MCP-compatible IDE, interact via natural-language prompts, review proposed edits, and approve implementations. Functional correctness is validated through data quality checks to help maintain processing accuracy during migration.
read more →

AWS announces Amazon Nova 2 models in Amazon Bedrock

🤖AWS has introduced Amazon Nova 2, a next-generation family of foundation models now available in Amazon Bedrock. The release includes Nova 2 Lite, optimized for fast, cost-effective reasoning for everyday workloads, and Nova 2 Pro (Preview), designed for complex, multistep tasks. Both models support step-by-step reasoning, three thinking intensity levels, built-in tools such as code interpreter and web grounding, remote MCP tool support, and a one-million-token context window. Nova 2 Lite supports supervised fine-tuning on Bedrock and SageMaker; full fine-tuning is available on SageMaker. Nova 2 Pro is available in preview for Amazon Nova Forge customers with global cross-region inference.
read more →

Amazon Bedrock Adds 18 Fully Managed Open Models Today

🚀 Amazon Bedrock expanded its model catalog with 18 new fully managed open-weight models, the largest single addition to date. The offering includes Gemma 3, Mistral Large 3, NVIDIA Nemotron Nano 2, OpenAI gpt-oss variants and other vendor models. Through a unified API, developers can evaluate, switch, and adopt these models in production without rewriting applications or changing infrastructure. Models are available in supported AWS Regions.
read more →

Amazon Announces Nova 2 Sonic for Real‑Time Voice AI

🎙️ Amazon announced Amazon Nova 2 Sonic, a speech-to-speech model for natural, real-time conversational AI available via Amazon Bedrock. The model delivers streaming speech understanding robust to background noise and diverse speaking styles, expressive polyglot voices, turn-taking controllability, asynchronous tool calling, and a one‑million token context window. Developers can integrate Nova 2 Sonic with Amazon Connect, leading telephony providers, open-source frameworks, and Bedrock’s bidirectional streaming API; it’s initially available in select AWS Regions.
read more →

AWS AI Factories: Dedicated High-Performance AI Infrastructure

🚀 AWS AI Factories are now available to deploy high-performance AWS AI infrastructure inside customer data centers, combining AWS Trainium, NVIDIA GPUs, low-latency networking, and optimized storage. The service integrates Amazon Bedrock and Amazon SageMaker to provide immediate access to foundation models without separate provider contracts. AWS manages procurement, setup, and operations while customers supply space and power, enabling isolated, sovereign deployments that accelerate AI initiatives.
read more →

Amazon S3 Vectors GA: Scalable, Cost‑Optimized Vector Store

🚀 Amazon S3 Vectors is now generally available, delivering native, purpose-built vector storage and query capabilities in cloud object storage. It supports up to two billion vectors per index, 10,000 indexes per vector bucket, and offers up to 90% lower costs to upload, store, and query vectors. S3 Vectors integrates with Amazon Bedrock, SageMaker Unified Studio, and OpenSearch Service, supports SSE-S3 and optional SSE-KMS encryption with per-index keys, and provides tagging for ABAC and cost allocation.
read more →

Mistral Large 3 and Ministral 3 Now on Amazon Bedrock

🚀 Amazon Bedrock now offers Mistral Large 3 and the Ministral 3 family alongside additional Mistral AI checkpoints, giving customers early access to open-weight multimodal models. Mistral Large 3 employs a granular Mixture-of-Experts architecture with 41B active and 675B total parameters and supports a 256K context window for long-form comprehension and agentic workflows. The Ministral 3 series (14B, 8B, 3B) plus Voxtral and Magistral small models let developers choose scales optimized for production assistants, RAG systems, single-GPU edge deployment, or low-resource environments.
read more →

Bedrock AgentCore Runtime Adds Bi-Directional Streaming

🔁 Amazon Bedrock AgentCore Runtime now supports bi-directional streaming, enabling real-time, continuous conversations where agents listen and respond simultaneously and handle interruptions or context shifts mid-turn. This removes stop-start friction in voice and text agents and preserves context across exchanges. Built into AgentCore Runtime, the capability reduces months of engineering work required to implement streaming infrastructure, letting developers focus on agent experiences rather than plumbing. Available in nine AWS Regions with consumption-based pricing.
read more →