<ciso brief />

All news with the #data governance tag

95 articles

SageMaker adds catalog and governance for IAM domains

🛠️ Amazon SageMaker Unified Studio now adds business context, metadata, and data governance features for IAM-based domains. Customers can annotate AWS Glue Data Catalog tables with business names, descriptions, and README documentation, and use AI-generated metadata to automate cataloging. Teams can build business glossaries, define metadata form templates, and capture structured attributes like classification, retention, and ownership. These capabilities enable search, filtering by glossary or metadata fields, and access requests with automated Lake Formation permission grants, and are available in all regions where SageMaker Unified Studio is supported.

SageMaker Unified Studio adds data quality tools

🛠️ Amazon SageMaker Unified Studio now integrates data quality rule authoring and evaluation powered by AWS Glue Data Quality. Data engineers, analysts, and data scientists can define rules, run evaluations, and view results for both data at rest and data in transit. The feature supports catalog table checks and Visual ETL job evaluations to detect issues before they impact analytics or ML workloads.

SageMaker Feature Store Adds SDK v3, Lake Formation

🔒 Amazon SageMaker Feature Store now supports the SageMaker Python SDK v3, providing modular APIs to manage feature groups with less boilerplate. Data scientists can enable Lake Formation access controls to enforce column- and row-level permissions on offline store data at feature group creation. The SDK also exposes Apache Iceberg table properties for configuring compaction and snapshot expiration to optimize storage and queries. Available in all AWS Regions where Feature Store is offered; install v3.8.0 or later to begin.

FTC to Bar Kochava From Selling Americans' Location Data

🔒 The Federal Trade Commission will ban data broker Kochava and its subsidiary Collective Data Solutions (CDS) from selling precise geolocation data without consumers' affirmative express consent as part of a settlement stemming from an August 2022 suit. The FTC alleged Kochava supplied paid clients — via an AWS Marketplace feed — with high-volume raw latitude/longitude transactions that enabled tracking to sensitive sites. Under the proposed court order, sales or transfers of precise location data are prohibited unless consumers directly request a service and explicitly consent; the companies must also implement a sensitive location program, supplier assessments, consent withdrawal and disclosure mechanisms, incident reporting to the FTC, and retention/deletion schedules.

How CISOs Should Use DSPM to Inform Risk Decisions

🔎 Data security posture management (DSPM) is less about buying a single product and more about adopting a mindset: identify where sensitive data lives, quantify its value-at-risk, and use that information to prioritize remediation and investments. Full DSPM platforms can demand one to three dedicated FTEs to maintain, so many organizations should start with manual inventories, lightweight scanners or existing DLP outputs. The piece highlights practical scenarios—patch prioritization, M&A integrations, and IAM reviews—and warns that rising agentic AI and vendor access requirements make timely, measurable data discovery increasingly urgent in 2026.

Spatial Data Management on AWS: Connectors and Installer

🔧 Spatial Data Management on AWS (SDMA) now supports custom transformation connectors and a unified desktop installer. Custom connectors enable submission of compute-intensive jobs, such as format conversion, 3D rendering, image tiling, and metadata extraction, to AWS Deadline Cloud using Open Job Description templates, and can extend SDMA's built-in content analysis with bespoke verification or transformation logic. Connectors run in isolated compute environments and automatically ingest declared outputs back into SDMA's governed asset repository, allowing automated, chained processing across spatial data pipelines. The SDMA desktop application now offers a standalone installer that bundles required dependencies, removing the need to install the CLI or other components separately.

UKG Builds People Fabric with AlloyDB and Agentic Cloud

🤖 UKG built People Fabric to unify its legacy HCM and WFM systems into a single, real-time data and intelligence platform powered by AlloyDB for PostgreSQL and Google's Agentic Data Cloud. The platform establishes a canonical data model, ingests change streams via a custom CDC pipeline and Dataflow, and serves operational queries from AlloyDB while routing analytics workloads to BigQuery and tenancy metadata to Cloud SQL. The outcome is millisecond read-after-write behavior, native vector support for AI agents, and faster developer velocity across 126 application teams and thousands of database instances.

House GOP Privacy Bills Challenge Enterprise Data Practices

📜 The House Republican proposals — the SECURE Data Act and the GUARD Financial Data Act — would establish federal privacy standards that broadly preempt stronger state laws while limiting private lawsuits and centralizing enforcement with the FTC and state attorneys general. The bills emphasize data minimization, controller-processor obligations, a federal data broker registry, and new limits on automated profiling and teen data. Critics warn the measures could weaken existing protections, impose heavy operational burdens on CIOs and CISOs, and force vendors and legal teams to rework procurement, retention, and AI training practices.

Google's Agentic Data Cloud: System of Action for Agents

🤖 Google Cloud introduces the Agentic Data Cloud, an AI-native architecture that converts enterprise data platforms into a dynamic System of Action for autonomous agents. It pairs a universal Knowledge Catalog, agentic-first practitioner tools, and a cross-cloud lakehouse to deliver trusted context, secure orchestration, and borderless data access. Early customers report substantial time and cost savings from agent-driven automation.

Looker Enhancements for Agentic BI and BigQuery Integration

🚀 At Google Cloud Next '26, Looker was updated to enable Agentic BI through deeper integration with Gemini and BigQuery, introducing conversational agents that can trigger downstream business actions. New agents include upgraded Conversational Agents, Dashboard Agents, embedded conversational experiences, and Agentic Workflows. The release also modernizes the UI with AI-powered self-service tools like Visualization, Expression, and Insight Assistants. Emphasis is on governed semantic layers, open protocols, and developer tooling to reduce hallucinations and accelerate model-driven analytics.

Google Cloud Cross-Cloud Lakehouse Platform for Agentic AI

🤖 Google Cloud introduced a next-generation cross-cloud Lakehouse engineered for the agentic AI era. It combines fully managed Apache Iceberg storage with read/write interoperability, a high-performance Managed Service for Apache Spark, and BigQuery integration to run multimodal workloads in real time. The service adds cross-cloud interconnect and caching to access AWS and Azure data with low latency, and unified governance via Knowledge Catalog to secure and profile data. Customers like Spotify and partners such as Accenture are already testing the platform.

Configuration-Driven ETL to Convert Logs to OCSF at Scale

🔁 The AWS Professional Services team provides a configuration-driven ETL accelerator that converts custom security logs into OCSF v1.1 and writes OCSF-compliant Parquet files partitioned for use with Amazon Security Lake or other data lakes. The serverless-first solution uses S3, Lambda, DynamoDB, Step Functions and either AWS Glue or EMR Serverless, and ingests mapping and metadata CSVs to drive transformations. An open-source GitHub repository includes deployment artifacts, example mappings, and instructions to validate outputs and run historical loads.
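As a rough illustration of what "configuration-driven" means here, the sketch below applies a mapping CSV to one raw log record. It is not the accelerator's code: the CSV column names (`source_field`, `ocsf_path`), the sample field names, and the helper functions are all hypothetical, and the real solution runs on AWS Glue or EMR Serverless rather than plain Python.

```python
import csv
import io

# Hypothetical mapping file: one row per source field, with the OCSF
# attribute path it should land in. Column names are assumptions made
# for this sketch, not the accelerator's actual schema.
MAPPING_CSV = """source_field,ocsf_path
ts,time
src,src_endpoint.ip
dst,dst_endpoint.ip
user,actor.user.name
"""


def load_mapping(text):
    """Parse the mapping CSV into a {source_field: ocsf_path} dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["source_field"]: row["ocsf_path"] for row in reader}


def set_path(target, dotted_path, value):
    """Set a value in a nested dict using a dotted attribute path."""
    keys = dotted_path.split(".")
    for key in keys[:-1]:
        target = target.setdefault(key, {})
    target[keys[-1]] = value


def to_ocsf(record, mapping):
    """Apply the mapping to one raw record.

    Fields without a mapping entry are kept under 'unmapped' so that
    nothing from the source log is silently dropped.
    """
    event = {}
    for field, value in record.items():
        if field in mapping:
            set_path(event, mapping[field], value)
        else:
            event.setdefault("unmapped", {})[field] = value
    return event
```

With this shape, onboarding a new log source means editing the CSV rather than the transformation code, which is the property the summary attributes to the accelerator.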

Redirects for AI Training enforces canonical content

🔁 Cloudflare introduces Redirects for AI Training, a toggle that turns existing rel="canonical" tags into HTTP 301 redirects for verified AI training crawlers. On paid Cloudflare plans this enforcement redirects AI crawler traffic (examples include GPTBot, ClaudeBot, Bytespider) to canonical URLs, preventing ingestion of deprecated content. Human visitors and other automated classes are unaffected.
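The behavior described above can be sketched as a small decision function. This is an illustration only, not Cloudflare's implementation: the function names are hypothetical, and matching on the User-Agent string is a simplifying assumption, since the summary says the feature applies to verified crawlers.

```python
# Crawler tokens taken from the examples in the summary. Matching on the
# User-Agent header alone is an assumption made to keep the sketch short;
# it is not a form of verification.
AI_TRAINING_CRAWLERS = ("GPTBot", "ClaudeBot", "Bytespider")


def is_ai_training_crawler(user_agent):
    """Return True if the User-Agent contains a known crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_TRAINING_CRAWLERS)


def respond(request_url, canonical_url, user_agent):
    """Return (status, location) for one request.

    canonical_url is the page's rel="canonical" target, or None when the
    page declares no canonical URL.
    """
    if (
        canonical_url
        and canonical_url != request_url
        and is_ai_training_crawler(user_agent)
    ):
        # AI training crawler on a non-canonical page: redirect permanently.
        return 301, canonical_url
    # Human visitors and other automated traffic get the page unchanged.
    return 200, None
```

The check that the canonical URL differs from the requested URL is there to avoid a redirect loop on pages that are already canonical.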

Smart Tier for Azure Blob and Data Lake Generally Available

☁️ Azure announces the general availability of smart tier for Blob and Data Lake Storage, a fully managed automated tiering service that continuously optimizes object placement across hot, cool, and cold tiers. It evaluates last-access timestamps: objects idle for 30 days move to cool, objects idle for a further 60 days move to cold, and data is promoted back to hot on access. Enable it during account creation, or switch existing zonal accounts to start optimizing automatically.
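The tiering rule in the summary reduces to simple date arithmetic, sketched below. The function and constant names are hypothetical, and whether the service treats the thresholds as inclusive is an assumption; this is not Azure's code or API.

```python
from datetime import datetime, timedelta

# Thresholds from the summary: 30 idle days to cool, then 60 more to cold.
COOL_AFTER = timedelta(days=30)
COLD_AFTER = COOL_AFTER + timedelta(days=60)  # 90 idle days in total


def target_tier(last_access, now):
    """Return the tier an object would sit in given its last access time.

    Inclusive thresholds (>=) are an assumption made for this sketch.
    """
    idle = now - last_access
    if idle >= COLD_AFTER:
        return "cold"
    if idle >= COOL_AFTER:
        return "cool"
    return "hot"


now = datetime(2026, 1, 1)
print(target_tier(now - timedelta(days=10), now))   # hot
print(target_tier(now - timedelta(days=45), now))   # cool
print(target_tier(now - timedelta(days=120), now))  # cold
```

Promotion back to hot falls out of the same rule: an access resets the last-access timestamp, so the idle time returns to zero.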

Data Curation Accelerators for Google Data Cloud Platform

🔍 Google outlines a set of curation accelerators within Google Data Cloud that automate cataloging, metadata enrichment, profiling, lineage, and pipeline generation to shorten time-to-insight. Key capabilities include Cloud Storage auto-discovery via Dataplex Universal Catalog, semantic metadata augmentation with Data Insights, automated data quality and lineage controls, and AI agents that generate ingestion and transformation code. The platform also provides built-in AI SQL functions, embeddings, and continuous queries to support multimodal and real-time curation. These features are designed to reduce manual ETL work so teams can focus on analysis, ML, and business decisions.

CloudWatch Pipelines Adds Compliance and Governance

🛡️ Amazon CloudWatch pipelines introduces compliance and governance controls to help preserve data integrity and restrict pipeline creation. You can enable a "keep original" toggle to store raw logs before any transformation, and processed entries now include metadata indicating they were transformed. New IAM condition keys let administrators limit pipeline creation by log source and type. These capabilities are provided at no additional cost and are available in Regions where CloudWatch pipelines is supported.

Google Reintroduces Data Studio for Data Cloud Assets

📊 Google is reintroducing Data Studio (formerly Looker Studio) as the central home for Google Data Cloud assets, emphasizing unified access to reports, BigQuery conversational agents, and data apps built in Colab. The redesigned product will sit alongside Looker, targeted to personal, ad-hoc exploration while Looker remains the governed enterprise BI solution. A free edition continues to serve individuals and a Data Studio Pro tier offers AI, enterprise security, and management features; existing assets will be migrated transparently.

BigQuery read/write interoperability for Apache Iceberg

🧊 Google announced preview read/write interoperability between BigQuery and Iceberg-compatible engines via the Google-managed Iceberg REST Catalog. The capability lets BigQuery, Trino, Spark, Flink and others create, update, and query a single Iceberg table type while enforcing unified governance and table-level access controls. Customers can offload compaction and garbage collection to BigLake to reduce small-file and metadata bloat and improve query performance.

Protecting Gmail Privacy as Gemini AI Enters Inbox

🔒 Google explains how it designed Gmail to protect user data as Gemini-powered features roll out. The company says Gemini is not trained on personal email content and only accesses messages for specific, isolated tasks like summarization. According to Gmail’s VP of product, Blake Barnes, the feature processes requests inside the inbox and does not retain the processed data.

Activating Your Data Layer for Production-Ready AI

🔍 This article introduces labs demonstrating how to prepare and use data stored in Google Cloud databases to support production-ready AI. It highlights semantic search using embeddings in AlloyDB and Cloud SQL (PostgreSQL and MySQL), multimodal image–text embeddings, and AlloyDB AI functions like on-the-fly semantic evaluation and reranking. It also covers NL2SQL generation via the alloydb_ai_nl extension and points to hands-on modules for moving from tests to production.