All news with the #embeddings tag

Fri, December 5, 2025

Amazon OpenSearch Service Adds Automatic Semantic Enrichment

🔍 Amazon OpenSearch Service now provides automatic semantic enrichment for managed domains, extending a capability introduced on OpenSearch Serverless to managed clusters and enabling semantic search with minimal configuration. The feature performs semantic processing at ingest time, so customers do not need to deploy or manage ML models. It is offered in an English-only variant and a multilingual variant covering 15 languages (including Arabic, French, Hindi, Japanese, and Korean), and is billed by ingestion usage as OpenSearch Compute Units (OCUs) for Semantic Search. The capability requires OpenSearch 2.19 or later and is currently available for non‑VPC domains in selected AWS Regions; see the OpenSearch Service documentation for setup and configuration details.
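
As a rough sketch of what "minimal configuration" can look like: a mapping that marks a text field for embedding generation at ingest. The `semantic` field type shown comes from OpenSearch's neural-search plugin and may not be the exact mechanism the managed feature uses; domain URL, index, and field names are placeholders, and SigV4 authentication is omitted.

```python
import requests  # SigV4 signing omitted for brevity

DOMAIN = "https://my-domain.us-east-1.es.amazonaws.com"  # placeholder

# Mark a text field for automatic embedding generation at ingest time.
mapping = {
    "mappings": {
        "properties": {
            "description": {"type": "semantic"}
        }
    }
}
requests.put(f"{DOMAIN}/products", json=mapping, timeout=30).raise_for_status()

# Ingest plain text; enrichment happens server-side, no ML model to manage.
doc = {"description": "Wireless noise-cancelling headphones"}
requests.post(f"{DOMAIN}/products/_doc", json=doc, timeout=30).raise_for_status()
```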

read more →

Thu, December 4, 2025

PubMed Data in BigQuery to Accelerate Medical Research

🔬 Google Cloud has made PubMed content available as a BigQuery public dataset with integrated vector search via Vertex AI, enabling semantic search across more than 35 million biomedical articles. Both BigQuery and Vertex AI Vector Search are FedRAMP High authorized, so regulated organizations can run embedding models and VECTOR_SEARCH queries directly inside BigQuery. Early adopters such as the Princess Máxima Center report literature reviews reduced from hours to minutes, and example SQL plus a demo repo are provided to help teams get started.
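
A sketch of what such a query can look like from Python. The dataset table and column names are illustrative assumptions (the post's example SQL and demo repo have the authoritative names), and the embedding model is a Vertex AI remote model you create beforehand.

```python
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT base.pmid, base.title, distance
FROM VECTOR_SEARCH(
  TABLE `bigquery-public-data.pubmed.abstract_embeddings`,  -- assumed table
  'embedding',                                              -- assumed column
  (
    SELECT ml_generate_embedding_result AS embedding
    FROM ML.GENERATE_EMBEDDING(
      MODEL `my_dataset.embedding_model`,  -- your Vertex AI remote model
      (SELECT 'CAR-T therapy outcomes in pediatric leukemia' AS content)
    )
  ),
  top_k => 10
)
"""
for row in client.query(sql).result():
    print(row.pmid, row.distance, row.title)
```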

read more →

Tue, December 2, 2025

Amazon S3 Vectors GA: Scalable, Cost‑Optimized Vector Store

🚀 Amazon S3 Vectors is now generally available, delivering native, purpose-built vector storage and query capabilities in cloud object storage. It supports up to two billion vectors per index and 10,000 indexes per vector bucket, and cuts the cost of uploading, storing, and querying vectors by up to 90% compared to alternative solutions. S3 Vectors integrates with Amazon Bedrock, SageMaker Unified Studio, and OpenSearch Service, supports SSE-S3 and optional SSE-KMS encryption with per-index keys, and provides tagging for ABAC and cost allocation.
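
A minimal write-and-query sketch using the boto3 `s3vectors` client; bucket and index names are placeholders, and the request shapes follow the GA announcement's examples, so verify them against the current SDK docs.

```python
import boto3

s3v = boto3.client("s3vectors", region_name="us-east-1")

# Write a small batch of vectors with filterable metadata.
s3v.put_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="product-embeddings",
    vectors=[{
        "key": "item-001",
        "data": {"float32": [0.12, -0.05, 0.33]},  # real embeddings are much wider
        "metadata": {"category": "audio"},
    }],
)

# Nearest-neighbour query against the index.
resp = s3v.query_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="product-embeddings",
    queryVector={"float32": [0.10, -0.02, 0.30]},
    topK=5,
    returnMetadata=True,
    returnDistance=True,
)
for match in resp["vectors"]:
    print(match["key"], match.get("distance"))
```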

read more →

Sun, November 30, 2025

AWS Bedrock Knowledge Bases Adds Multimodal Retrieval

🔍 AWS has announced general availability of multimodal retrieval in Amazon Bedrock Knowledge Bases, enabling unified search across text, images, audio, and video. The managed Retrieval Augmented Generation (RAG) workflow provides developers full control over ingestion, parsing, chunking, embedding (including Amazon Nova multimodal), and vector storage. Users can submit text or image queries and receive relevant text, image, audio, and video segments back, which can be combined with the LLM of their choice to generate richer, lower-latency responses. Region availability varies by feature set and is documented by AWS.
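For illustration, a text query through the `bedrock-agent-runtime` Retrieve API against a hypothetical knowledge base ID; multimodal hits return source locations rather than inline text.

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

resp = runtime.retrieve(
    knowledgeBaseId="KB1234567890",  # placeholder
    retrievalQuery={"text": "turbine blade inspection defects"},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
)

for result in resp["retrievalResults"]:
    # Text chunks arrive inline; image/audio/video hits carry source locations.
    print(result["location"], result["content"].get("text", "<non-text segment>"))
```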

read more →

Tue, November 25, 2025

OpenSearch Service Introduces Agentic Search for NLP Queries

🔎 Amazon Web Services has introduced Agentic Search for OpenSearch Service, an agent-driven layer that interprets natural-language intent, orchestrates search tools, and generates OpenSearch DSL queries while providing transparent summaries of its decision process. The built-in QueryPlanningTool uses LLMs to plan and emit DSL, removing the need for manual query syntax. Two agent types are available: conversational agents with memory and flow agents optimized for throughput. Administrators can configure agents via APIs or OpenSearch Dashboards, and Agentic Search is supported on OpenSearch Service version 3.3+ across AWS Commercial and GovCloud regions.
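A rough sketch of the moving parts, assuming the ML Commons agent-registration API and an `agentic` query clause; the exact wiring (some releases route the agent through a search pipeline processor instead) should be checked against the OpenSearch Service docs.

```python
import requests  # SigV4 signing omitted for brevity

DOMAIN = "https://my-domain.us-east-1.es.amazonaws.com"  # placeholder

# Register a flow agent whose QueryPlanningTool lets an LLM emit the DSL.
# The model_id is a placeholder for an LLM already registered with ML Commons.
agent = {
    "name": "nl-search-agent",
    "type": "flow",
    "tools": [{"type": "QueryPlanningTool", "parameters": {"model_id": "<llm-model-id>"}}],
}
r = requests.post(f"{DOMAIN}/_plugins/_ml/agents/_register", json=agent, timeout=30)
agent_id = r.json()["agent_id"]

# Natural-language search via the agentic query clause (field names assumed).
query = {"query": {"agentic": {"query_text": "red running shoes under $50", "agent_id": agent_id}}}
print(requests.get(f"{DOMAIN}/products/_search", json=query, timeout=60).json())
```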

read more →

Mon, November 24, 2025

Amazon OpenSearch Service: OpenSearch 3.3 Now Available

📢 Amazon OpenSearch Service now supports OpenSearch 3.3, introducing search performance, observability, and agentic AI integration improvements. Vector search enhancements include agentic search for natural-language queries without complex DSLs, batch processing for the semantic highlighter to lower latency and improve GPU utilization, and optimizations in the Neural Search plugin. The release also makes Apache Calcite the default query engine for PPL, adds a broader PPL command library, and improves the approximation framework for more responsive pagination and dashboards. A new workload management plugin enables grouping of search traffic and tenant-level isolation to prevent any one workload from exhausting cluster resources.

read more →

Fri, November 21, 2025

BigQuery AI: Unified ML, Generative AI, and Agents

🤖 BigQuery AI consolidates BigQuery’s built-in ML, generative AI functions, vector search, and agent tools into a unified platform. It enables users to apply generative models and embeddings directly via SQL, perform semantic vector search, and run end-to-end ML workflows without moving data. Role-specific data agents and assistive features like a data canvas and code completion accelerate work for engineers, data scientists, and business users.
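As an example of the SQL-first pattern, a generative call over warehouse rows via ML.GENERATE_TEXT; the remote model and table names are placeholders you would create first.

```python
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL `my_dataset.gemini_model`,  -- remote model over a Vertex AI endpoint
  (SELECT CONCAT('Summarize this review: ', review_text) AS prompt
   FROM `my_dataset.reviews` LIMIT 10),
  STRUCT(256 AS max_output_tokens, TRUE AS flatten_json_output)
)
"""
for row in client.query(sql).result():
    print(row.summary)
```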

read more →

Thu, November 20, 2025

BigQuery Agent Analytics: Stream and Analyze Agent Data

📊 Google introduces BigQuery Agent Analytics, an ADK plugin that streams agent interaction events into BigQuery to capture, analyze, and visualize performance, usage, and cost. The plugin provides a predefined schema and uses the BigQuery Storage Write API for low-latency, high-throughput streaming of requests, responses, and tool calls. Developers can filter and preprocess events (for example, redaction) and build dashboards in Looker Studio or Grafana while leveraging vector search and generative AI functions for deeper analysis.
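Once events land in BigQuery, analysis is plain SQL. The table and column names below are hypothetical stand-ins for the plugin's predefined schema.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Per-tool call volume, latency, and token spend (hypothetical schema).
sql = """
SELECT tool_name,
       COUNT(*)          AS calls,
       AVG(latency_ms)   AS avg_latency_ms,
       SUM(total_tokens) AS tokens
FROM `my_project.agent_analytics.events`
WHERE event_type = 'tool_call'
GROUP BY tool_name
ORDER BY calls DESC
"""
for row in client.query(sql).result():
    print(row.tool_name, row.calls, row.avg_latency_ms, row.tokens)
```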

read more →

Fri, November 14, 2025

Using BigQuery ML to Solve Lookalike Audiences at Zeotap

🔍 Zeotap and Google Cloud describe a SQL-first approach to building scalable lookalike audiences entirely within BigQuery. They convert low-cardinality categorical features into one-hot and multi-hot vectors, reframe Jaccard similarity in terms of dot products and Manhattan (L1) norms, and index the vectors with BigQuery's VECTOR_SEARCH. By pre-filtering on discriminative features and batching queries, the workflow reduces compute, latency, and cost while avoiding a separate vector database.
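
The key identity: for binary (one-hot/multi-hot) vectors, |A∩B| is a dot product and |A∪B| = ‖a‖₁ + ‖b‖₁ − a·b, which is what lets an index built on dot products approximate Jaccard similarity. A minimal sketch:

```python
import numpy as np

def jaccard_binary(a: np.ndarray, b: np.ndarray) -> float:
    intersection = a @ b                      # |A ∩ B| for 0/1 vectors
    union = a.sum() + b.sum() - intersection  # |A ∪ B| = ‖a‖₁ + ‖b‖₁ − a·b
    return float(intersection / union) if union else 0.0

a = np.array([1, 0, 1, 1, 0])  # e.g. multi-hot encoded user attributes
b = np.array([1, 1, 1, 0, 0])
print(jaccard_binary(a, b))    # 2 / 4 = 0.5
```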

read more →

Wed, October 29, 2025

TwelveLabs Marengo 3.0 Now on Amazon Bedrock Platform

🎥 TwelveLabs' Marengo Embed 3.0 is now available on Amazon Bedrock, providing a unified video-native multimodal embedding that represents video, images, audio, and text in a single vector space. The release doubles processing capacity—up to 4 hours and 6 GB per file—expands language support to 36 languages, and improves sports analysis and multimodal search precision. It supports synchronous low-latency text and image inference and asynchronous processing for video, audio, and large files.
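A sketch of the synchronous text path; the model identifier and request shape are assumptions patterned on earlier Marengo releases, so confirm them in the Bedrock model catalog.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

resp = bedrock.invoke_model(
    modelId="twelvelabs.marengo-embed-3-0-v1:0",  # assumed identifier
    body=json.dumps({"inputType": "text", "inputText": "goal scored in the final minute"}),
)
payload = json.loads(resp["body"].read())
print(payload)  # expect the text embedding inline; video/audio go through async invocation
```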

read more →

Tue, October 28, 2025

Amazon Nova Multimodal Embeddings — Unified Cross-Modal

🚀 Amazon announces general availability of Amazon Nova Multimodal Embeddings, a unified embedding model designed for agentic RAG and semantic search across text, documents, images, video, and audio. The model handles inputs up to 8K tokens and video/audio segments up to 30 seconds, with segmentation for larger files and selectable embedding dimensions. Both synchronous and asynchronous APIs are supported to balance latency and throughput, and Nova is available in Amazon Bedrock in US East (N. Virginia).
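For media longer than the synchronous limits, the asynchronous API writes segmented embeddings to S3. The model ID and `modelInput` fields below are assumptions for illustration.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

job = bedrock.start_async_invoke(
    modelId="amazon.nova-multimodal-embeddings-v1:0",  # assumed identifier
    modelInput={
        # Field names assumed for illustration; media beyond ~30 s is segmented.
        "taskType": "SEGMENTED_EMBEDDING",
        "video": {"s3Location": {"uri": "s3://my-bucket/talks/keynote.mp4"}},
    },
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://my-bucket/embeddings/"}},
)
print(job["invocationArn"])  # poll get_async_invoke(invocationArn=...) until complete
```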

read more →

Mon, October 13, 2025

Amazon ElastiCache Adds Vector Search with Valkey 8.2

🚀 Vector search for Amazon ElastiCache is now generally available with Valkey 8.2, enabling indexing, searching, and updating of billions of high-dimensional embeddings from providers such as Amazon Bedrock, Amazon SageMaker, Anthropic, and OpenAI with microsecond latency and up to 99% recall. Key use cases include semantic caching for LLMs, multi-turn conversational agents, and RAG-enabled agentic systems that need lower latency and cost. Vector search runs on node-based clusters in all AWS Regions at no additional cost, and existing Valkey or Redis OSS clusters can be upgraded to Valkey 8.2 with no downtime.
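
Valkey's vector search speaks the familiar FT.* command family, so a standard Redis-protocol client works. A small HNSW index sketch (endpoint, index, and field names are placeholders):

```python
import struct
import redis  # works against Valkey's RESP-compatible endpoint

r = redis.Redis(host="my-cache.xxxxxx.use1.cache.amazonaws.com", port=6379, ssl=True)

# Create an HNSW vector index over hashes with the doc: prefix.
r.execute_command(
    "FT.CREATE", "idx", "ON", "HASH", "PREFIX", "1", "doc:",
    "SCHEMA", "embedding", "VECTOR", "HNSW", "6",
    "TYPE", "FLOAT32", "DIM", "4", "DISTANCE_METRIC", "COSINE",
)

# Store one embedding as packed little-endian float32 bytes.
vec = struct.pack("<4f", 0.1, 0.2, 0.3, 0.4)
r.hset("doc:1", mapping={"embedding": vec})

# k-NN query: top 2 nearest neighbours to the query vector.
res = r.execute_command(
    "FT.SEARCH", "idx", "*=>[KNN 2 @embedding $vec]",
    "PARAMS", "2", "vec", vec, "DIALECT", "2",
)
print(res)
```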

read more →

Fri, October 3, 2025

Amazon OpenSearch Service Adds Batch AI Inference Support

🧠 You can now run asynchronous batch AI inference inside Amazon OpenSearch Ingestion pipelines to enrich and ingest very large datasets for Amazon OpenSearch Service domains. The same AI connectors previously used for real-time calls to Amazon Bedrock, Amazon SageMaker, and third parties now support high-throughput, offline jobs. Batch inference is intended for offline enrichment scenarios—generating up to billions of vector embeddings—with improved performance and cost efficiency versus streaming inference. The feature is available in regions that support OpenSearch Ingestion on domains running 2.17+.
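Pipelines are created through the OpenSearch Ingestion (OSIS) API; the boto3 call below is real, but the YAML body is only a hypothetical outline, since the batch ML-inference processor configuration lives in the service docs.

```python
import boto3

# Hypothetical pipeline outline; the batch AI-connector settings go where noted.
PIPELINE_BODY = """
version: "2"
batch-enrichment-pipeline:
  source:
    s3:
      codec: { ndjson: {} }
      # bucket/prefix, batch job settings, and the Bedrock/SageMaker
      # connector configuration go here per the documentation
  sink:
    - opensearch:
        hosts: ["https://my-domain.us-east-1.es.amazonaws.com"]
        index: "enriched-docs"
"""

osis = boto3.client("osis", region_name="us-east-1")
osis.create_pipeline(
    PipelineName="batch-enrichment",
    MinUnits=1,
    MaxUnits=4,
    PipelineConfigurationBody=PIPELINE_BODY,
)
```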

read more →

Thu, October 2, 2025

Cohere Embed v4 Multimodal Embeddings on Amazon Bedrock

🚀 Amazon Bedrock now supports Cohere Embed v4, a multimodal embedding model that generates high-quality embeddings for text, images, and complex business documents. The model natively processes tables, charts, diagrams, code snippets, and handwritten notes, reducing the need for extensive preprocessing and data cleanup. It supports over 100 languages and includes industry fine-tuning for finance, healthcare, and manufacturing. Cohere Embed v4 is available for on-demand inference in select AWS Regions; access is requested via the Bedrock console.
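An on-demand call sketch; the v4 model identifier is an assumption (check the Bedrock console), and the body follows the shape used by earlier Cohere Embed models on Bedrock.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

resp = bedrock.invoke_model(
    modelId="cohere.embed-v4:0",  # assumed identifier
    body=json.dumps({
        "input_type": "search_document",
        "texts": ["Q3 revenue grew 12% on strong cloud demand."],
    }),
)
payload = json.loads(resp["body"].read())
print(len(payload["embeddings"][0]))  # embedding dimensionality
```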

read more →

Thu, September 25, 2025

Enabling AI Sovereignty Through Choice and Openness Globally

🌐 Cloudflare argues that AI sovereignty should mean choice: the ability for nations to control data, select models, and deploy applications without vendor lock-in. Through its distributed edge network and serverless Workers AI, Cloudflare promotes accessible, low-cost deployment and inference close to users. The company hosts regional open-source models—India’s IndicTrans2, Japan’s PLaMo-Embedding-1B, and Singapore’s SEA-LION v4-27B—and offers an AI Gateway to connect diverse models. Open standards, interoperability, and pay-as-you-go economics are presented as central to resilient national AI strategies.

read more →

Thu, September 18, 2025

Amazon OpenSearch Serverless Adds Disk-Optimized Vectors

🔍 Amazon has added disk-optimized vector storage to OpenSearch Serverless, offering a lower-cost alternative to memory-optimized vectors while maintaining equivalent accuracy and recall. The disk-optimized option may introduce slightly higher latency, so it is best suited for semantic search, recommendation systems, and other AI search scenarios that do not require sub-millisecond responses. As a fully managed service, OpenSearch Serverless continues to automatically scale compute capacity (measured in OCUs) to match workload demands.
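In open-source OpenSearch, the k-NN plugin exposes this trade-off as an `on_disk` mode on the vector mapping; whether Serverless uses exactly this switch is an assumption, but it illustrates the disk-optimized pattern (endpoint and field names are placeholders, SigV4 auth omitted).

```python
import requests  # SigV4 signing omitted for brevity

COLLECTION = "https://my-collection.us-east-1.aoss.amazonaws.com"  # placeholder

index = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "mode": "on_disk",  # lower cost, slightly higher latency than in-memory
            }
        }
    },
}
requests.put(f"{COLLECTION}/docs", json=index, timeout=30).raise_for_status()
```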

read more →

Wed, September 17, 2025

BigQuery scalability and reliability upgrades for Gen AI

🚀 Google Cloud announced BigQuery performance and usability enhancements to accelerate generative AI inference. Improvements include >100x throughput for first-party text generation and >30x for embeddings, plus support for Vertex AI Provisioned Throughput and dynamic token batching to pack many rows per request. New reliability features—partial-failure mode, adaptive traffic control, and robust retries—prevent individual row failures from failing whole queries and simplify large-scale LLM workflows.

read more →

Tue, September 9, 2025

TwelveLabs Marengo 2.7 Embeddings Now Synchronous in Bedrock

⚡ Amazon Bedrock now supports synchronous inference for TwelveLabs Marengo Embed 2.7, delivering low-latency text and image embeddings directly in API responses. Previously optimized for asynchronous processing of large video, audio, and image files, Marengo 2.7’s new mode enables responsive search and retrieval features—such as instant natural-language video search and image similarity discovery—while retaining advanced video understanding via asynchronous workflows.

read more →

Wed, September 3, 2025

Target modernizes search with hybrid AlloyDB AI platform

🔍 Target rebuilt its on-site search to combine lexical keyword matching with semantic vector retrieval, using AlloyDB AI to power filtered vector queries at scale. The engineering team implemented a multi-index architecture and a multi-channel relevance framework so hybrid queries can apply native SQL filters alongside vector similarity. The overhaul produced measurable gains — ~20% improvement in product discovery relevance, halved "no results" occurrences, and large latency reductions — while consolidating the stack and accelerating development.
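The pattern is easy to picture as a single statement: structured SQL predicates pre-filter the candidate set, then pgvector-style similarity orders it. Table and column names below are illustrative, not Target's schema.

```python
import psycopg2

conn = psycopg2.connect("dbname=catalog user=app")  # AlloyDB speaks the PostgreSQL protocol

query_embedding = [0.12, -0.05, 0.33]  # produced by your embedding model
vec_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"  # pgvector text format

sql = """
SELECT product_id, title
FROM products
WHERE category = %s                 -- structured pre-filter in plain SQL
  AND in_stock
ORDER BY embedding <=> %s::vector   -- cosine distance via pgvector
LIMIT 20
"""
with conn, conn.cursor() as cur:
    cur.execute(sql, ("headphones", vec_literal))
    for product_id, title in cur.fetchall():
        print(product_id, title)
```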

read more →

Thu, August 28, 2025

What's New in Google Data Cloud: August Product Roundup

🔔 This Google Cloud roundup summarizes recent product milestones, GA launches, previews, and integrations across the data analytics, BI, and database portfolio. It highlights updates to BigQuery, Firestore, Cloud SQL, AlloyDB, and adjacent services aimed at easing ingestion, migration, and AI-driven operations. Notable items include MongoDB-compatible Firestore GA, PSC networking improvements for Database Migration Service, and a redesigned BigQuery data ingestion experience. The post also emphasizes resilience and DR enhancements such as immutable backups and Near Zero Downtime maintenance.

read more →