Bedrock Batch Inference: Claude Sonnet 4 and GPT-OSS
🚀 Amazon Bedrock now supports Batch inference for Anthropic Claude Sonnet 4 and OpenAI GPT-OSS (120B, 20B), enabling asynchronous processing of large workloads at approximately 50% of on-demand inference cost. The update targets bulk scenarios such as document analysis, large-scale summarization, content generation, and structured data extraction, and is optimized to deliver higher overall batch throughput on these newer models. Batch progress and workload metrics — including pending and processed records, tokens per minute, and Claude-specific pending tokens — are exposed at the AWS account level via Amazon CloudWatch.
