All news with #embeddings tag
Thu, November 20, 2025
BigQuery Agent Analytics: Stream and Analyze Agent Data
📊 Google introduces BigQuery Agent Analytics, an ADK plugin that streams agent interaction events into BigQuery to capture, analyze, and visualize performance, usage, and cost. The plugin provides a predefined schema and uses the BigQuery Storage Write API for low-latency, high-throughput streaming of requests, responses, and tool calls. Developers can filter and preprocess events (for example, redaction) and build dashboards in Looker Studio or Grafana while leveraging vector search and generative AI functions for deeper analysis.
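The event filtering and redaction step mentioned above can be sketched as a small preprocessing callback. This is an illustrative stand-in, not the plugin's actual API: the event shape, field names, and `redact_event` function are all hypothetical.

```python
import re

# Hypothetical sketch of a preprocessing hook applied to agent events
# before they are streamed to BigQuery; field names are illustrative,
# not the plugin's predefined schema.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_event(event: dict) -> dict:
    """Return a copy of the event with email addresses masked."""
    redacted = dict(event)
    for key in ("request", "response"):
        if isinstance(redacted.get(key), str):
            redacted[key] = EMAIL_RE.sub("[REDACTED]", redacted[key])
    return redacted

event = {"agent": "support-bot", "request": "Contact me at jane@example.com"}
clean = redact_event(event)
print(clean["request"])  # email replaced with [REDACTED]
```

A hook like this would run before the Storage Write API call, so sensitive strings never reach the warehouse.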
Fri, November 14, 2025
Using BigQuery ML to Build Lookalike Audiences at Zeotap
🔍 Zeotap and Google Cloud describe a SQL-first approach to building scalable lookalike audiences entirely within BigQuery. They convert low-cardinality categorical features into one-hot and multi-hot vectors, use Jaccard similarity reframed in terms of dot products and Manhattan (L1) norms, and index vectors with BigQuery’s VECTOR_SEARCH. By combining pre-filtering on discriminative features and batching queries, the workflow reduces compute, latency, and cost while avoiding a separate vector database.
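The dot-product reframing rests on a simple identity for binary vectors: the intersection size is the dot product, and the union size is the sum of the Manhattan (L1) norms minus that dot product. A minimal sketch:

```python
# Jaccard similarity for binary (one-hot / multi-hot) vectors via the
# dot-product identity: |A ∩ B| = a·b, |A ∪ B| = |a|₁ + |b|₁ − a·b.
def jaccard(a, b):
    intersection = sum(x * y for x, y in zip(a, b))
    union = sum(a) + sum(b) - intersection
    return intersection / union if union else 0.0

a = [1, 0, 1, 1, 0]  # multi-hot feature vector for user A
b = [1, 1, 1, 0, 0]  # multi-hot feature vector for user B
print(jaccard(a, b))  # intersection 2, union 4 → 0.5
```

Because both terms reduce to dot products and sums, the metric can be expressed in the inner-product form that a vector index like VECTOR_SEARCH can accelerate.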
Wed, October 29, 2025
TwelveLabs Marengo 3.0 Now on Amazon Bedrock Platform
🎥 TwelveLabs' Marengo Embed 3.0 is now available on Amazon Bedrock, providing a unified video-native multimodal embedding that represents video, images, audio, and text in a single vector space. The release doubles processing capacity—up to 4 hours and 6 GB per file—expands language support to 36 languages, and improves sports analysis and multimodal search precision. It supports synchronous low-latency text and image inference and asynchronous processing for video, audio, and large files.
Tue, October 28, 2025
Amazon Nova Multimodal Embeddings — Unified Cross-Modal
🚀 Amazon announces general availability of Amazon Nova Multimodal Embeddings, a unified embedding model designed for agentic RAG and semantic search across text, documents, images, video, and audio. The model handles inputs up to 8K tokens and video/audio segments up to 30 seconds, with segmentation for larger files and selectable embedding dimensions. Both synchronous and asynchronous APIs are supported to balance latency and throughput, and Nova is available in Amazon Bedrock in US East (N. Virginia).
Mon, October 13, 2025
Amazon ElastiCache Adds Vector Search with Valkey 8.2
🚀 Vector search is now generally available in Amazon ElastiCache with Valkey 8.2, enabling indexing, searching, and updating of billions of high-dimensional embeddings from providers such as Amazon Bedrock, Amazon SageMaker, Anthropic, and OpenAI with microsecond latency and up to 99% recall. Key use cases include semantic caching for LLMs, multi-turn conversational agents, and RAG-enabled agentic systems to reduce latency and cost. Vector search runs on node-based clusters in all AWS Regions at no additional cost, and existing Valkey or Redis OSS clusters can be upgraded to Valkey 8.2 with no downtime.
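The semantic-caching pattern mentioned above can be sketched in a few lines: before calling an LLM, compare the query's embedding against cached query embeddings and reuse the stored answer when similarity clears a threshold. This toy version keeps vectors in a Python list with made-up embeddings; a real deployment would store them in Valkey and query its vector index instead.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

cache = []  # list of (query_embedding, cached_answer) pairs

def lookup(query_vec, threshold=0.9):
    """Return a cached answer if a semantically similar query was seen."""
    best = max(cache, key=lambda entry: cosine(query_vec, entry[0]), default=None)
    if best and cosine(query_vec, best[0]) >= threshold:
        return best[1]  # cache hit: skip the LLM call entirely
    return None        # cache miss: fall through to the LLM

cache.append(([0.9, 0.1, 0.0], "Resets happen at midnight UTC."))
print(lookup([0.88, 0.12, 0.01]))  # near-duplicate query → cached answer
print(lookup([0.0, 0.0, 1.0]))     # unrelated query → None
```

The latency win comes from replacing an LLM round trip with a single vector lookup, which is why microsecond-latency stores are a natural fit.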
Fri, October 3, 2025
Amazon OpenSearch Service Adds Batch AI Inference Support
🧠 You can now run asynchronous batch AI inference inside Amazon OpenSearch Ingestion pipelines to enrich and ingest very large datasets for Amazon OpenSearch Service domains. The same AI connectors previously used for real-time calls to Amazon Bedrock, Amazon SageMaker, and third parties now support high-throughput, offline jobs. Batch inference is intended for offline enrichment scenarios—generating up to billions of vector embeddings—with improved performance and cost efficiency versus streaming inference. The feature is available in regions that support OpenSearch Ingestion, on domains running OpenSearch 2.17 or later.
Thu, October 2, 2025
Cohere Embed v4 Multimodal Embeddings on Amazon Bedrock
🚀 Amazon Bedrock now supports Cohere Embed v4, a multimodal embedding model that generates high-quality embeddings for text, images, and complex business documents. The model natively processes tables, charts, diagrams, code snippets, and handwritten notes, reducing the need for extensive preprocessing and data cleanup. It supports over 100 languages and includes industry fine-tuning for finance, healthcare, and manufacturing. Cohere Embed v4 is available for on-demand inference in select AWS Regions; access is requested via the Bedrock console.
Thu, September 25, 2025
Enabling AI Sovereignty Through Choice and Openness Globally
🌐 Cloudflare argues that AI sovereignty should mean choice: the ability for nations to control data, select models, and deploy applications without vendor lock-in. Through its distributed edge network and serverless Workers AI, Cloudflare promotes accessible, low-cost deployment and inference close to users. The company hosts regional open-source models—India’s IndicTrans2, Japan’s PLaMo-Embedding-1B, and Singapore’s SEA-LION v4-27B—and offers an AI Gateway to connect diverse models. Open standards, interoperability, and pay-as-you-go economics are presented as central to resilient national AI strategies.
Thu, September 18, 2025
Amazon OpenSearch Serverless Adds Disk-Optimized Vectors
🔍 Amazon has added disk-optimized vector storage to OpenSearch Serverless, offering a lower-cost alternative to memory-optimized vectors while maintaining equivalent accuracy and recall. The disk-optimized option may introduce slightly higher latency, so it is best suited for semantic search, recommendation systems, and other AI search scenarios that do not require sub-millisecond responses. As a fully managed service, OpenSearch Serverless continues to automatically scale compute capacity (measured in OCUs) to match workload demands.
Wed, September 17, 2025
BigQuery scalability and reliability upgrades for Gen AI
🚀 Google Cloud announced BigQuery performance and usability enhancements to accelerate generative AI inference. Improvements include >100x throughput for first-party text generation and >30x for embeddings, plus support for Vertex AI Provisioned Throughput and dynamic token batching to pack many rows per request. New reliability features—partial-failure mode, adaptive traffic control, and robust retries—prevent individual row failures from failing whole queries and simplify large-scale LLM workflows.
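The dynamic token batching idea above (packing many rows into one model request) can be illustrated with a simple greedy loop. The token budget and whitespace token counter below are made-up stand-ins for BigQuery's internals, which this sketch does not claim to reproduce.

```python
# Greedy sketch of dynamic token batching: accumulate rows into a request
# until adding the next row would exceed the token budget, then start a
# new batch. Budget and tokenizer are illustrative assumptions.
def batch_rows(rows, token_budget=16):
    batches, current, used = [], [], 0
    for row in rows:
        tokens = len(row.split())  # crude stand-in for a real tokenizer
        if current and used + tokens > token_budget:
            batches.append(current)
            current, used = [], 0
        current.append(row)
        used += tokens
    if current:
        batches.append(current)
    return batches

rows = ["short text", "a somewhat longer piece of row text here",
        "tiny", "another medium sized row of words"]
batches = batch_rows(rows)
print([len(b) for b in batches])  # → [3, 1]
```

Fewer, fuller requests amortize per-call overhead, which is where throughput gains of this kind typically come from.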
Tue, September 9, 2025
TwelveLabs Marengo 2.7 Embeddings Now Synchronous in Bedrock
⚡ Amazon Bedrock now supports synchronous inference for TwelveLabs Marengo Embed 2.7, delivering low-latency text and image embeddings directly in API responses. Previously optimized for asynchronous processing of large video, audio, and image files, Marengo 2.7’s new mode enables responsive search and retrieval features—such as instant natural-language video search and image similarity discovery—while retaining advanced video understanding via asynchronous workflows.
Wed, September 3, 2025
Target modernizes search with hybrid AlloyDB AI platform
🔍 Target rebuilt its on-site search to combine lexical keyword matching with semantic vector retrieval, using AlloyDB AI to power filtered vector queries at scale. The engineering team implemented a multi-index architecture and a multi-channel relevance framework so hybrid queries can apply native SQL filters alongside vector similarity. The overhaul produced measurable gains — ~20% improvement in product discovery relevance, halved "no results" occurrences, and large latency reductions — while consolidating the stack and accelerating development.
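The hybrid pattern described above can be sketched as a structured filter followed by a blend of lexical and vector scores. Everything here is invented for illustration: the catalog, the toy two-dimensional embeddings, the keyword-overlap lexical score, and the 50/50 weights. Target's actual system runs the equivalent logic inside AlloyDB with native SQL filters alongside vector similarity.

```python
# Hedged sketch of hybrid retrieval: apply a structured (SQL-style)
# filter first, then rank survivors by a weighted mix of a lexical
# keyword-overlap score and a dot-product vector score.
catalog = [
    {"name": "red running shoes", "vec": [0.9, 0.1], "in_stock": True},
    {"name": "blue running shoes", "vec": [0.85, 0.2], "in_stock": False},
    {"name": "red dress", "vec": [0.2, 0.9], "in_stock": True},
]

def hybrid_search(query, query_vec, w_lex=0.5, w_vec=0.5):
    q_terms = set(query.split())
    results = []
    for item in catalog:
        if not item["in_stock"]:  # structured filter, akin to a SQL WHERE
            continue
        lex = len(q_terms & set(item["name"].split())) / len(q_terms)
        vec = sum(x * y for x, y in zip(query_vec, item["vec"]))
        results.append((w_lex * lex + w_vec * vec, item["name"]))
    return [name for score, name in sorted(results, reverse=True)]

print(hybrid_search("red shoes", [0.9, 0.1]))
```

Fusing the two signals lets exact keyword matches and semantic neighbors both surface, which is the core of the "multi-channel relevance" idea.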
Thu, August 28, 2025
What's New in Google Data Cloud: August Product Roundup
🔔 This Google Cloud roundup summarizes recent product milestones, GA launches, previews, and integrations across the data analytics, BI, and database portfolio. It highlights updates to BigQuery, Firestore, Cloud SQL, AlloyDB, and adjacent services aimed at easing ingestion, migration, and AI-driven operations. Notable items include MongoDB-compatible Firestore GA, PSC networking improvements for Database Migration Service, and a redesigned BigQuery data ingestion experience. The post also emphasizes resilience and DR enhancements such as immutable backups and Near Zero Downtime maintenance.
Thu, August 28, 2025
Make Websites Conversational with NLWeb and AutoRAG
🤖 Cloudflare offers a one-click path to conversational search by combining Microsoft’s NLWeb open standard with Cloudflare’s managed retrieval engine, AutoRAG. The integration crawls and indexes site content into R2 and a managed vector store, serves embeddings and inference via Workers AI, and exposes both a user-facing /ask endpoint and an agent-focused /mcp endpoint. Publishers get continuous re-indexing, controlled agent access, and observability through an AI Gateway, removing much of the infrastructure burden for conversational experiences.
Mon, August 25, 2025
Amazon RDS Supports MariaDB 11.8 with Vector Engine
🚀 Amazon RDS for MariaDB now supports MariaDB 11.8 (minor version 11.8.3), the community's latest long-term maintenance release. The update introduces MariaDB Vector, which enables storing and searching vector embeddings for retrieval-augmented generation (RAG) directly in the managed database. It also adds controls to limit maximum temporary file and table sizes to better manage storage. You can upgrade manually, via snapshot restore, or with Amazon RDS Managed Blue/Green deployments; 11.8 is available in all regions where RDS MariaDB is offered.
Mon, August 25, 2025
Amazon Bedrock Data Automation Adds Five Document Languages
📄 Amazon Web Services' Bedrock Data Automation now supports five additional document languages — Portuguese, French, Italian, Spanish, and German — expanding multilingual document processing beyond English. Customers can build blueprints, prompts, and instructions in these languages using BDA Custom Output, while BDA Standard Output will produce summaries and figure captions in the detected document language. This update is generally available across multiple AWS commercial and GovCloud regions and aims to accelerate multilingual document workflows for intelligent document processing and multimodal automation.
Mon, August 25, 2025
Amazon Neptune Adds BYOKG RAG Support via GraphRAG
🔍 Amazon Web Services announced general availability of Bring Your Own Knowledge Graph (BYOKG) support for Retrieval-Augmented Generation (RAG) using the open-source GraphRAG Toolkit. Developers can now connect domain-specific graphs stored in Amazon Neptune (Database or Analytics) directly to LLM workflows, combining graph queries with vector search. This reduces hallucinations and improves multi-hop and temporal reasoning, easing operationalization of graph-aware generative AI.
Fri, August 22, 2025
What’s New in Google Cloud: Releases, Previews, and News
🔔 Google Cloud published a consolidated roundup of product releases and previews from early July through August 22, 2025, covering GA launches, public previews, and platform enhancements. Highlights include Earth Engine in BigQuery (GA), Vertex AI embedding scaling, new GKE features for NUMA alignment and swap, expanded NodeConfig controls, and Cloud Run with GPUs. Customers should review the linked documentation, request preview access via account teams where needed, and plan upgrades or migrations accordingly.
Thu, August 7, 2025
Google July AI updates: tools, creativity, and security
🔍 In July, Google announced a broad set of AI updates designed to expand access and practical value across Search, creativity, shopping, and infrastructure. AI Mode in Search received Canvas planning, Search Live video, PDF uploads, and better visual follow-ups via Circle to Search and Lens. NotebookLM added Mind Maps, Study Guides, and Video Overviews, while Google Photos gained animation and remixing tools. Research advances include DeepMind’s Aeneas for reconstructing fragmentary texts and AlphaEarth Foundations for satellite embeddings, and Google said it used an AI agent to detect a security vulnerability and stop its exploitation.