<ciso brief />

All news with #bigquery tag

83 articles · page 3 of 5

Back Market Migrates to Google Data Cloud, Cuts Costs

🔁 Back Market migrated its data and core tech stack from AWS-based Snowflake and Databricks to Google Cloud, consolidating all historical and operational data in BigQuery. The team ran a two-week proof of concept, then a live double-run migration that kept production on Databricks while writing to cloned BigQuery tables until the outputs matched. They replaced AWS DMS with Datastream, implemented hourly batching to control small-file costs, and completed the critical switchover in six months. The move halved data processing times, cut CDC costs by 90%, reduced technical debt, and improved observability, governance, and developer productivity.
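
The double-run step amounts to comparing the legacy output with the candidate output until no differences remain. A minimal sketch of such a comparison follows, assuming both runs can be read as lists of rows sharing a unique key; the function name `diff_outputs`, the `id` key, and the sample rows are hypothetical and are not taken from Back Market's tooling.

```python
def diff_outputs(legacy_rows, candidate_rows, key="id"):
    """Compare two pipeline outputs row by row (hypothetical helper)."""
    legacy = {row[key]: row for row in legacy_rows}
    candidate = {row[key]: row for row in candidate_rows}
    shared = legacy.keys() & candidate.keys()
    return {
        # Keys produced by the legacy run but absent from the candidate run.
        "missing": sorted(legacy.keys() - candidate.keys()),
        # Keys produced only by the candidate run.
        "unexpected": sorted(candidate.keys() - legacy.keys()),
        # Keys present in both runs whose row contents differ.
        "changed": sorted(k for k in shared if legacy[k] != candidate[k]),
    }


# Illustrative sample data, not real migration output.
legacy_run = [{"id": 1, "total": 10}, {"id": 2, "total": 20}, {"id": 3, "total": 30}]
candidate_run = [{"id": 1, "total": 10}, {"id": 2, "total": 25}, {"id": 4, "total": 40}]

report = diff_outputs(legacy_run, candidate_run)
outputs_match = not any(report.values())
print(report, outputs_match)
```

In a real migration the comparison would more likely run as aggregate or hash queries inside the warehouse; the in-memory version only illustrates the idea of gating the switchover on an empty diff.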

PubMed Data in BigQuery to Accelerate Medical Research

🔬 Google Cloud has made PubMed content available as a BigQuery public dataset with integrated vector search via Vertex AI, enabling semantic search across more than 35 million biomedical articles. Both BigQuery and Vertex AI Vector Search are FedRAMP High authorized, allowing organizations to run embedding models and VECTOR_SEARCH queries inside BigQuery. Early adopters like The Princess Máxima Center report literature reviews reduced from hours to minutes, and example SQL plus a demo repo are provided to help teams get started.

Automated Metadata Generation in Google Data Cloud

🧭 Google announces generally available automated metadata generation in the Google Data Cloud, using Dataplex Universal Catalog and Gemini to convert profiling and schema context into human-readable table and column descriptions. The capability integrates with BigQuery, stores generated descriptions for search and governance, and is accessible via an API. It aims to reduce "metadata debt," accelerate time-to-insight, and provide reliable grounding for AI agents, while still encouraging human review for key business definitions.

Building Conversational Genomics with Multi-Agent AI

🧬 Combining Google’s ADK, Gemini, and Cloud infrastructure, this work reframes variant interpretation as a conversational workflow that removes repetitive scripting and context switching. A two-phase design performs heavy VEP annotation once, stores versioned ADK artifacts and public BigQuery datasets, and enables sub-5-second interactive queries via a QueryAgent. Validation with an APOB spike-in demonstrated single-variant precision, compatibility across DeepVariant versions, and scalability to ~8.8M variants.

BigQuery AI: Unified ML, Generative AI, and Agents

🤖 BigQuery AI consolidates BigQuery’s built-in ML, generative AI functions, vector search, and agent tools into a unified platform. It enables users to apply generative models and embeddings directly via SQL, perform semantic vector search, and run end-to-end ML workflows without moving data. Role-specific data agents and assistive features like a data canvas and code completion accelerate work for engineers, data scientists, and business users.

Google: Leader in 2025 Gartner Magic Quadrant for CDBMS

📈 Google announces it was named a Leader in the 2025 Gartner Magic Quadrant for Cloud Database Management Systems for the sixth consecutive year and positioned furthest in vision. The post presents the company's AI-native Data Cloud—a unified stack integrating BigQuery, Spanner, AlloyDB, Looker, and Dataplex—to support agentic AI. Google highlights embedded specialized agents, developer tooling (Data Agents API, ADK, Gemini CLI) and Agent Analytics in BigQuery to accelerate AI-driven applications while asserting cost and governance benefits on a single, open platform.

BigQuery Data Transfer Service Enhancements and Compliance

🔔 The BigQuery Data Transfer Service expands its connector ecosystem with new GA integrations (Oracle, Salesforce, ServiceNow, SFMC, Facebook Ads, and GA4) and preview connectors like Stripe, PayPal, Snowflake, and Hive. Platform improvements include event-driven transfers, incremental ingestion, GAQL-based custom Google Ads reports, and enhanced Oracle scale. Security and compliance gains—EU Data Boundary GA, FedRAMP High, CJIS, access transparency, regional endpoints, and key usage tracking—support regulated workloads. A new consumption-based pricing model applies to third-party connectors once they reach GA.

BigQuery Agent Analytics: Stream and Analyze Agent Data

📊 Google introduces BigQuery Agent Analytics, an ADK plugin that streams agent interaction events into BigQuery to capture, analyze, and visualize performance, usage, and cost. The plugin provides a predefined schema and uses the BigQuery Storage Write API for low-latency, high-throughput streaming of requests, responses, and tool calls. Developers can filter and preprocess events (for example, redaction) and build dashboards in Looker Studio or Grafana while leveraging vector search and generative AI functions for deeper analysis.

TimesFM Integration Brings Forecasting to BigQuery

🕒 Google is integrating the TimesFM time-series foundation model into BigQuery and AlloyDB, enabling zero-shot forecasting on customer data without retraining. AI.FORECAST and AI.EVALUATE are now Generally Available in BigQuery, while AI.DETECT_ANOMALIES is in public preview. TimesFM 2.5 offers improved accuracy and lower latency, supports dynamic context windows up to 15K, and can return historical data with forecasts. AlloyDB preview lets users call TimesFM endpoints hosted on Vertex AI so operational data can be forecasted in-place, preserving data residency and reducing export overhead.

Using BigQuery ML to Solve Lookalike Audiences at Zeotap

🔍 Zeotap and Google Cloud describe a SQL-first approach to building scalable lookalike audiences entirely within BigQuery. They convert low-cardinality categorical features into one-hot and multi-hot vectors, use Jaccard similarity reframed via dot-product and Manhattan norms, and index vectors with BigQuery’s VECTOR_SEARCH. By combining pre-filtering on discriminative features and batching queries, the workflow reduces compute, latency, and cost while avoiding a separate vector database.
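
For binary (one-hot or multi-hot) vectors, the dot product counts the shared active features and the Manhattan (L1) norm counts each vector's active features, so Jaccard similarity can be written as dot / (|a|₁ + |b|₁ − dot). The sketch below illustrates that identity in plain Python; it assumes this is the reframing the summary refers to, and the function name and sample vectors are hypothetical.

```python
def jaccard_binary(a, b):
    """Jaccard similarity of two equal-length 0/1 vectors via dot product and L1 norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))   # size of the intersection
    l1_a = sum(abs(x) for x in a)            # number of active features in a
    l1_b = sum(abs(y) for y in b)            # number of active features in b
    union = l1_a + l1_b - dot                # inclusion-exclusion
    return dot / union if union else 0.0


# Illustrative multi-hot profiles over five hypothetical categories.
user_a = [1, 0, 1, 1, 0]
user_b = [1, 1, 1, 0, 0]
score = jaccard_binary(user_a, user_b)
print(score)
```

Because the numerator is a dot product, the expensive part of the computation maps onto the kind of operation a vector index accelerates, which is presumably why the reframing matters for a VECTOR_SEARCH-based workflow.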

BigQuery AI Functions: Reimagining SQL for the AI Era

🤖 BigQuery is introducing managed AI functions in public preview — AI.IF, AI.CLASSIFY, and AI.SCORE — that let analysts apply generative AI directly inside SQL queries. These functions enable semantic filtering and joins, label-based classification of text and images, and natural-language ranking, while BigQuery applies prompt, query-plan, and endpoint optimizations to reduce LLM calls and control cost. They complement existing Gemini inference functions and remove much of the need for complex prompt tuning or separate model selection, making AI-driven analytics more accessible within familiar SQL workflows.

BigQuery adds MATCH_RECOGNIZE for row-sequence SQL

🔍 BigQuery now supports MATCH_RECOGNIZE, a SQL clause for identifying ordered patterns across rows and time-series data. It lets analysts express complex sequence logic—using PARTITION BY, ORDER BY, PATTERN, DEFINE and MEASURES—inside a single query without heavy joins or external processing. The feature targets use cases like funnels, fraud detection, log sequencing, and financial pattern detection, and is immediately available to all BigQuery users.
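
One way to build intuition for row-pattern matching is to emulate it outside SQL: partition and order the rows, map each row to a symbol (the role DEFINE plays), and run a regular expression over the symbol string (the role PATTERN plays). The Python sketch below is only an analogy under those assumptions. It is not BigQuery syntax, the event data and names are hypothetical, and the real clause has richer semantics (for example, conditions that refer to neighboring rows).

```python
import re
from itertools import groupby
from operator import itemgetter

# Hypothetical clickstream rows.
events = [
    {"user": "u1", "ts": 1, "action": "view"},
    {"user": "u1", "ts": 2, "action": "view"},
    {"user": "u1", "ts": 3, "action": "cart"},
    {"user": "u1", "ts": 4, "action": "purchase"},
    {"user": "u2", "ts": 1, "action": "view"},
    {"user": "u2", "ts": 2, "action": "cart"},
]

# "DEFINE": classify each row as a single-character symbol.
SYMBOLS = {"view": "V", "cart": "C", "purchase": "P"}
# "PATTERN": one or more views, then a cart event, then a purchase.
FUNNEL = re.compile(r"V+CP")


def find_funnels(rows):
    matches = []
    # "PARTITION BY user ORDER BY ts"
    ordered = sorted(rows, key=itemgetter("user", "ts"))
    for user, group in groupby(ordered, key=itemgetter("user")):
        group = list(group)
        encoded = "".join(SYMBOLS.get(r["action"], "x") for r in group)
        for m in FUNNEL.finditer(encoded):
            # "MEASURES": summarize each matched run of rows.
            matches.append({
                "user": user,
                "start_ts": group[m.start()]["ts"],
                "end_ts": group[m.end() - 1]["ts"],
                "views": m.group().count("V"),
            })
    return matches


funnels = find_funnels(events)
print(funnels)
```

Only u1 completes the funnel here; u2 stops at the cart step, so no match is reported for that partition.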

How BigQuery Brought Vector Search to Analytics at Scale

🔍 In early 2024 Google introduced native vector search in BigQuery, embedding semantic search directly into the data warehouse to remove the need for separate vector databases. Users can create indexes with a simple CREATE VECTOR INDEX statement and run semantic queries via the VECTOR_SEARCH function or through Python integrations like LangChain. BigQuery provides serverless scaling, asynchronous index refreshes, model rebuilds with no downtime, partitioned indexes, and ScaNN-based TreeAH for improved price/performance, while retaining row- and column-level security and a pay-as-you-go pricing model.
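
Conceptually, a vector search returns the stored embeddings closest to a query embedding. The sketch below shows the exact, brute-force version of that idea with cosine similarity in plain Python. It does not call BigQuery, the function name and toy two-dimensional embeddings are hypothetical, and an index such as the one described above would approximate this ranking rather than scan every row.

```python
import math


def top_k_cosine(query, corpus, k=2):
    """Return (row_index, cosine_similarity) for the k rows most similar to query."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    q_norm = norm(query)
    scored = []
    for idx, row in enumerate(corpus):
        dot = sum(x * y for x, y in zip(query, row))
        scored.append((idx, dot / (q_norm * norm(row))))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real embeddings typically have hundreds of dimensions.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k_cosine([1.0, 0.1], embeddings, k=2)
print(hits)
```

The brute-force scan is linear in the number of rows, which is the cost an approximate index trades a little recall to avoid.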

Zeotap cuts costs 46% migrating to Bigtable from ScyllaDB

🚀 Zeotap migrated its Customer Data Platform from ScyllaDB to Bigtable to address scaling challenges, operational overhead, and highly spiky workloads. The cloud-native stack—using Dataflow, a home-grown streaming engine, Memorystore as a cache, Bigtable as the hot store, and BigQuery for analytics—delivers predictable low-latency reads and writes at scale. The transition yielded a 46% reduction in TCO and a ~20% drop in operational tasks while enabling sub-second SLAs and faster ML deployment.

Automating FinOps Governance with Workload Manager

🔧 Workload Manager automates FinOps governance by codifying cost-control policies and enforcing them across Google Cloud environments. It supports both predefined checks (for example, bigquery-missing-labels) and custom rules written in Open Policy Agent (OPA) Rego, allowing organization-, folder-, or project-level scans. Scheduled evaluations can export results to BigQuery, trigger notifications (email, Slack, PagerDuty), and feed Looker Studio dashboards for reporting and trend analysis. New pricing reduces scan costs by up to 95% and includes a small free tier to accelerate adoption.

BigQuery's Data Engineering Agent: Automating Pipelines

🔧 The preview of the Data Engineering Agent in BigQuery introduces a Gemini-powered assistant that automates pipeline development, maintenance, and migrations. The agent converts natural-language requirements into SQL, enforces engineering best practices, and supports custom instructions and UDFs to reflect organizational logic. Integrated with Dataplex, it uses governance metadata to improve table descriptions, data quality assertions, and PII-aware handling, and it also generates documentation and troubleshooting guidance. The feature is available in preview via BigQuery Pipelines and the Dataform UI.

Mercado Libre's Spanner-Based Platform for Scale and AI

🚀 Mercado Libre uses Spanner as the core of a developer-facing platform, exposing consistent, globally scalable transactions through its internal gateway, Fury. Fury abstracts distributed database complexity and serves both relational and key-value workloads. Integration with BigQuery via Data Boost and Change Streams enables near-real-time analytics and reverse ETL to operational systems.

Integrating Oracle with Google Cloud for AI Automation

🔁 This Google Cloud post explains how enterprises can integrate Oracle Database with cloud-native analytics and AI by moving transactional data into BigQuery. It recommends ingestion patterns such as low-latency Change Data Capture via Datastream and batch staging to Cloud Storage, and notes that ODBC/JDBC connections suit interactive queries but not continuous replication. Once data resides in BigQuery, organizations can leverage Gemini-powered features, BigQuery ML, and AI agents (via the Agent Development Kit) for natural-language exploration, assisted coding, multimodal analysis, and automated workflows across retail and education use cases.

Agent Factory Recap: AI Agents for Data Engineering

🔍 The episode of The Agent Factory reviewed practical AI agents for data engineering and data science, highlighting demos that combine Gemini, BigQuery, Colab Enterprise, and Spanner-based graph queries. It showcased a BigQuery Data Engineering Agent that generates pipelines, time dimensions, and data-quality assertions from SQL, and a Data Science Agent that runs end-to-end anomaly detection in Colab. The post also covered CodeMender for autonomous code security fixes and a creative Spanner+ADK comic demo illustrating multi-region concepts.

Dataplex Supports Column-Level Lineage for BigQuery

🔍 Dataplex Universal Catalog now captures column-level lineage for BigQuery, extending object-level tracing to granular column transformations at no extra cost. The update provides interactive visual lineage graphs so users can inspect upstream and downstream flows for individual columns, trace origins, and assess downstream impact of modifications. This granularity helps validate authoritative sources for AI/ML features, enforce column-level governance, and improve compliance. It also surfaces freshness and usage metadata to support context-aware agents.