<ciso brief />

All news with #data governance tag


Amazon SageMaker Catalog Exports Asset Metadata to Iceberg

🔍 Amazon SageMaker Catalog now exports asset metadata as an Apache Iceberg table via Amazon S3 Tables, enabling teams to query catalog inventory with standard SQL without building custom ETL. The export includes technical fields (resource_id, resource_type), business metadata (asset_name, business_description), ownership details, and timestamps, partitioned by snapshot_date for time travel queries. The dataset appears in SageMaker Unified Studio and is queryable from Amazon Athena, Studio notebooks, AI agents, and BI tools. Available in all supported Regions at no additional SageMaker charge; you pay for S3 Tables storage and Athena queries.
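As a rough illustration of the kind of inventory query this enables, the sketch below builds a SQL string that could be submitted to Athena. The column names come from the summary above; the table name `asset_metadata_export`, the helper `build_inventory_query`, and the assumption that `snapshot_date` is a DATE column are hypothetical and would need to match your own setup.

```python
from datetime import date

# Hypothetical table name; the real database and table names depend on your setup.
CATALOG_TABLE = "asset_metadata_export"


def build_inventory_query(snapshot: date, resource_type: str) -> str:
    """Return SQL listing assets of one type as of a given snapshot_date."""
    # Simple allow-list check so the value can be embedded in the SQL string.
    if not resource_type.replace("_", "").isalnum():
        raise ValueError("resource_type must contain only letters, digits, or underscores")
    return (
        "SELECT resource_id, resource_type, asset_name, business_description\n"
        f"FROM {CATALOG_TABLE}\n"
        # Assumes snapshot_date is stored as a DATE partition column.
        f"WHERE snapshot_date = DATE '{snapshot.isoformat()}'\n"
        f"  AND resource_type = '{resource_type}'\n"
        "ORDER BY asset_name"
    )


if __name__ == "__main__":
    print(build_inventory_query(date(2025, 1, 15), "table"))
```

Filtering on the partition column keeps the scan limited to a single snapshot, which is also what makes point-in-time comparisons between two dates cheap.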

Amazon SageMaker Catalog Adds Automated Data Classification

🤖 Amazon SageMaker Catalog now provides automated data classification that suggests business glossary terms during dataset publishing, reducing manual tagging and improving metadata consistency. The capability uses Amazon Bedrock language models to analyze table metadata and schema and to recommend relevant business and sensitive-data terms from organizational glossaries. Data producers receive AI-generated suggestions they can accept or modify before publishing, helping standardize vocabulary and improve data discoverability. The feature is available in multiple AWS Regions and can be managed via SageMaker Unified Studio, the AWS CLI, or SDKs.

AWS Glue Adds Apache Iceberg-Based Materialized Views

⚡ AWS Glue now supports materialized views stored in Apache Iceberg format and managed in the AWS Glue Data Catalog. Data teams can create views with standard Spark SQL, attach a refresh schedule, and rely on automatic change detection, incremental updates, and managed compute for refresh jobs. Query engines across Athena, EMR, and AWS Glue rewrite queries to use these views, improving performance by up to 8x and lowering compute costs, while SQL tools like Redshift and SageMaker can read the Iceberg tables directly.

Amazon S3 Metadata Now Available in 22 More Regions

🔍 Amazon S3 Metadata is expanding to twenty-two additional AWS Regions, bringing automated, queryable object and custom metadata closer to more customers. The feature automatically populates metadata for both new and existing objects in near real-time and supports system-defined details (size, source) and user-defined tags such as product SKUs or transaction IDs. This expansion makes S3 Metadata generally available in 28 Regions and enables faster data discovery, curation, and analytics inside existing S3 workflows.

AWS Glue: Zero-ETL Replication for Self-Managed Databases

🔁 AWS Glue now supports zero-ETL for self-managed database sources, enabling no-code replication from Oracle, SQL Server, MySQL, and PostgreSQL hosted on-premises or on EC2 to Amazon Redshift. The feature auto-creates ongoing integrations to simplify setup, reduce operational overhead, and eliminate much of the engineering work previously required to build ingestion pipelines. It is available in multiple AWS Regions and aims to save teams weeks of engineering effort.

BigQuery Data Transfer Service Enhancements and Compliance

🔔 The BigQuery Data Transfer Service expands its connector ecosystem with new GA integrations (Oracle, Salesforce, ServiceNow, SFMC, Facebook Ads, and GA4) and preview connectors like Stripe, PayPal, Snowflake, and Hive. Platform improvements include event-driven transfers, incremental ingestion, GAQL-based custom Google Ads reports, and enhanced Oracle scale. Security and compliance gains—EU Data Boundary GA, FedRAMP High, CJIS, access transparency, regional endpoints, and key usage tracking—support regulated workloads. A new consumption-based pricing model applies to third-party connectors once they reach GA.

Amazon SageMaker Catalog Adds Column-Level Metadata

📣 Amazon SageMaker Catalog now supports custom column-level metadata forms and markdown-enabled rich text descriptions so data stewards can attach business-specific key-value metadata and formatted documentation directly to individual columns. Form values and rich text are indexed in real time and become immediately searchable alongside column names, descriptions, and glossary terms. This capability is available in all AWS Regions where SageMaker is supported.

Amazon SageMaker Catalog Enforces Glossary Metadata

📌 Amazon SageMaker Catalog now enforces glossary-term metadata during asset publishing. Administrators can require data producers to tag assets with approved business vocabulary from organizational glossaries, and enforcement rules will block publication if required terms are missing. This standardizes metadata, aligns technical schemas with business language, and improves discoverability and governance. Available in all regions where Amazon SageMaker Catalog operates; policies can be managed via the console, CLI, or SDKs.
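The blocking behavior described above amounts to a set-difference check at publish time. The sketch below is only an illustration of that idea, not the service's API; the function names and the example glossary terms are hypothetical.

```python
def missing_required_terms(asset_terms: set[str], required_terms: set[str]) -> set[str]:
    """Return the required glossary terms that the asset has not been tagged with."""
    return required_terms - asset_terms


def can_publish(asset_terms: set[str], required_terms: set[str]) -> bool:
    """An asset may be published only when no required term is missing."""
    return not missing_required_terms(asset_terms, required_terms)


if __name__ == "__main__":
    required = {"Data Owner", "Sensitivity"}  # hypothetical rule set by an administrator
    tagged = {"Data Owner", "Finance"}        # terms a producer attached to the asset
    print(can_publish(tagged, required))              # False: one required term is absent
    print(missing_required_terms(tagged, required))   # the term that blocks publication
```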

India DPDP Rules 2025 Make Privacy an Engineering Challenge

🔒 India’s new Digital Personal Data Protection (DPDP) Rules, 2025 impose strict consent and verification requirements and fixed deletion timelines, requiring large platforms and enterprises to redesign how they collect, store, and erase personal data. The rules create Significant Data Fiduciaries with added audit and algorithmic-check obligations and formalize certified Consent Managers. Organizations have 12–18 months to adopt automated consent capture, verification, retention enforcement, and data-mapping across cloud, on‑prem, and SaaS environments.

Amazon SageMaker Unified Studio Adds Catalog Notifications

🔔 Amazon SageMaker Unified Studio now delivers real-time notifications for data catalog activities, including new dataset publications, metadata changes, subscription requests, comments, and access approvals. Alerts are surfaced via a bell icon on the project home page and through a notification center that shows a recent list and a full, filterable tabular view by catalog, project, and event type. The feature is available in all regions where SageMaker Unified Studio is supported.

Ericsson Secures Data Integrity with Dataplex Governance

🔒 Ericsson has implemented a global data governance framework using Dataplex Universal Catalog on Google Cloud to ensure data integrity, discoverability, and compliance across its Managed Services operation. The program standardized a business glossary, automated quality checks with incident-driven alerts, and visualized column-level lineage to support analytics, AI, and automation at scale. It balances defensive compliance with offensive innovation and embeds stewardship through Ericsson’s Data Operating Model.

Data Security Posture Management: Top DSPM Tools Reviewed

🛡️ Data Security Posture Management (DSPM) tools help organizations discover, classify and manage sensitive data across dynamic cloud environments. They focus on locating "shadow data" in known and unknown repositories and typically collect metadata via agentless or API-based scans to avoid moving raw data. DSPM dashboards catalog findings, map lineage and assess compliance, while remediation often integrates with SOAR, SIEM or CNAPP solutions. Many vendors now combine discovery with some automated "fix it" capabilities to streamline response.

LinkedIn to Use EU, UK and Other Profiles for AI Training

🔒 Microsoft-owned LinkedIn will begin using profile details, public posts and feed activity from users in the UK, EU, Switzerland, Canada and Hong Kong to train generative AI models and to support personalised ads across Microsoft starting 3 November 2025. Private messages are excluded. Users can opt out via Settings & Privacy > Data Privacy and toggle Data for Generative AI Improvement to Off. Organisations should update social media policies and remind staff to review their advertising and data-sharing settings.

Social Media Privacy Ranking 2025: Platforms Compared

🔒 Incogni’s Social Media Privacy Ranking 2025 evaluates 15 major platforms across data collection, resale, AI training, privacy settings, and regulatory fines. The analysis identifies Pinterest and Quora as the most privacy-conscious, while TikTok and Facebook rank lowest, driven by extensive data use and historical penalties. The report highlights practical differences in opt-outs, data-sharing, and default settings and recommends users review privacy controls and use Kaspersky’s Privacy Checker.

DeepSeek Privacy and Security: What Users Should Know

🔒 DeepSeek collects extensive interaction data — chats, images and videos — plus account details, IP address and device/browser information, and retains it for an unspecified period under a vague “retain as long as needed” policy. The service operates under Chinese jurisdiction, so stored chats may be accessible to local authorities and have been observed on China Mobile servers. Users can disable model training in web and mobile Data settings, export or delete chats (export is web-only), or run the open-source model locally to avoid server-side retention, but local deployment and deletion have trade-offs and require device protections.

Dataplex Supports Column-Level Lineage for BigQuery

🔍 Dataplex Universal Catalog now captures column-level lineage for BigQuery, extending object-level tracing to granular column transformations at no extra cost. The update provides interactive visual lineage graphs so users can inspect upstream and downstream flows for individual columns, trace origins, and assess downstream impact of modifications. This granularity helps validate authoritative sources for AI/ML features, enforce column-level governance, and improve compliance. It also surfaces freshness and usage metadata to support context-aware agents.

Aurora PostgreSQL zero-ETL now integrates SageMaker

🔁 Amazon Aurora PostgreSQL-Compatible Edition now offers zero-ETL integration with Amazon SageMaker, enabling near-real-time replication of PostgreSQL tables into a lakehouse. The synced data conforms to Apache Iceberg open standards and is immediately accessible to SQL, Apache Spark, BI, and ML tools via a simple no-code interface without impacting production workloads. Comprehensive, fine-grained access controls are enforced across analytics engines, and the capability is available in multiple AWS Regions.

BigQuery Data Clean Room Query Templates — Preview

🔒 BigQuery data clean room query templates are now available in preview, enabling clean room owners to publish fixed, reusable TVF-based queries that accept table or field inputs and return only aggregated rows. Templates reduce data exfiltration risk, simplify onboarding for non-SQL users, and enforce consistent analytical and privacy controls via aggregation thresholds and approval workflows. They support single-direction and multi-party collaboration while keeping query logic hidden from subscribers.
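To make the aggregation-threshold idea concrete, here is a minimal sketch in plain Python. It only illustrates the general technique (suppress any group backed by too few distinct entities); it is not BigQuery's implementation, and the function name, parameters, and sample rows are hypothetical.

```python
from collections import defaultdict


def aggregate_with_threshold(rows, group_key, entity_key, threshold):
    """Count rows per group, dropping groups with fewer than `threshold` distinct entities."""
    entities = defaultdict(set)   # distinct entities seen in each group
    counts = defaultdict(int)     # row count per group
    for row in rows:
        group = row[group_key]
        entities[group].add(row[entity_key])
        counts[group] += 1
    # Only groups that meet the threshold are returned; smaller groups are suppressed.
    return {g: counts[g] for g in counts if len(entities[g]) >= threshold}


if __name__ == "__main__":
    sample = [
        {"region": "EU", "user": 1},
        {"region": "EU", "user": 2},
        {"region": "US", "user": 3},
    ]
    print(aggregate_with_threshold(sample, "region", "user", threshold=2))
```

Because the threshold is applied to distinct entities rather than raw row counts, a single user with many rows cannot push a small group over the limit.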

AWS Clean Rooms Adds Cross-Region Data Collaboration

🌐 AWS Clean Rooms now supports cross-region collaboration, letting organizations analyze partner data stored in different AWS and Snowflake Regions without copying or sharing underlying datasets. Collaboration creators can specify allowed result regions to help meet data residency and sovereignty requirements. This reduces integration work—no new pipelines or replication—and enables faster, secure joint analyses across advertising, investment, and R&D use cases.

SageMaker Unified Studio adds SSO for Spark sessions

🔐 Amazon SageMaker Unified Studio now supports corporate identities for interactive Apache Spark sessions using AWS IAM Identity Center trusted identity propagation. Data engineers and scientists can sign on to JupyterLab Spark sessions with organizational credentials while administrators apply fine-grained access controls and maintain end-to-end data access traceability. The integration uses AWS Lake Formation, Amazon S3 Access Grants, and the Amazon Redshift Data API, and includes comprehensive AWS CloudTrail logging for interactive and background sessions to streamline compliance.