All news with #aws glue tag

Sun, November 30, 2025

AWS Glue Adds Apache Iceberg-Based Materialized Views

#AWS #AWS Glue #Apache Iceberg #Product Release

⚡ AWS Glue now supports materialized views stored in Apache Iceberg format and managed in the AWS Glue Data Catalog. Data teams can create views with standard Spark SQL, attach a refresh schedule, and rely on automatic change detection, incremental updates, and managed compute for refresh jobs. Query engines across Athena, EMR, and AWS Glue rewrite queries to use these views, improving performance by up to 8x and lowering compute costs, while SQL tools like Redshift and SageMaker can read the Iceberg tables directly.

Wed, November 26, 2025

AWS Adds Apache Iceberg V3 Deletion Vectors and Lineage

#AWS #Apache Iceberg #Amazon EMR #AWS Glue #Amazon SageMaker

🔔 AWS now supports Apache Iceberg V3 deletion vectors and row lineage across key analytics services. These features, available in Amazon EMR 7.12, AWS Glue, SageMaker notebooks, Amazon S3 Tables, and the AWS Glue Data Catalog, accelerate data modifications and make it simpler to identify changed records. Enable V3 by setting the table property 'format-version' to 3 in CREATE TABLE or by updating the metadata of an existing table; supported AWS query engines will then use deletion vectors and row lineage automatically.
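As a rough illustration of the table property mentioned above, the following Python sketch builds the two Spark SQL statements a job might submit (for example through `spark.sql(...)`). The helper names, the catalog and table names, and the column list are hypothetical; only the `format-version` property comes from the announcement, and the exact DDL accepted may vary by engine and version.

```python
from typing import Iterable, Tuple


def create_v3_table_sql(table: str, columns: Iterable[Tuple[str, str]]) -> str:
    """Return a Spark SQL CREATE TABLE statement for an Iceberg V3 table.

    `table` and `columns` are illustrative inputs, not names from the announcement.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE {table} ({cols}) USING iceberg "
        "TBLPROPERTIES ('format-version' = '3')"
    )


def upgrade_to_v3_sql(table: str) -> str:
    """Return a Spark SQL statement that sets format-version 3 on an existing table."""
    return f"ALTER TABLE {table} SET TBLPROPERTIES ('format-version' = '3')"


if __name__ == "__main__":
    # Hypothetical catalog, database, and table names.
    print(create_v3_table_sql("glue_catalog.sales.orders", [("id", "bigint"), ("status", "string")]))
    print(upgrade_to_v3_sql("glue_catalog.sales.orders"))
```

Generating the statements as plain strings keeps the sketch runnable without a Spark session; in a real job the strings would be passed to whichever engine is in use.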

Wed, November 26, 2025

AWS Glue 5.1 GA: Spark 3.5, Iceberg 3.0, Lake Formation

#AWS #Product Release #AWS Glue #Apache Iceberg #Lake Formation #Apache Hudi #Delta Lake #Apache Spark

⚡ AWS Glue 5.1 is now generally available, upgrading core engines to Apache Spark 3.5.6, Python 3.11, and Scala 2.12.18 to deliver performance and security improvements. The release refreshes open table format support (Apache Hudi 1.0.2, Apache Iceberg 1.10.0, Delta Lake 3.3.2) and adds Apache Iceberg format 3.0 features such as default column values and deletion vectors. AWS Lake Formation now enforces fine‑grained write control for Spark DDL/DML, and Glue adds full‑table access control for Hudi and Delta tables in Spark.

Wed, November 26, 2025

Amazon EMR and AWS Glue Enforce Lake Formation Write FGAC

#Product Release #Amazon EMR #AWS Glue #AWS Lake Formation #Fine-Grained Access Control

🔐 AWS has extended AWS Lake Formation fine-grained access control to include write operations for tables registered with Lake Formation when used in Apache Spark jobs on Amazon EMR and AWS Glue. Administrators can now enforce table-, column-, and row-level permissions for DDL and DML actions (CREATE, ALTER, INSERT, UPDATE, DELETE, MERGE INTO, DROP) as well as read operations, so a single job can both read and write governed tables. The change reduces the need for separate clusters or applications and centralizes governance. The feature is available in all Regions where EMR, Glue, and Lake Formation are supported.

Wed, November 26, 2025

Amazon EMR and AWS Glue Add Audit Context for Lake Formation

#AWS #Amazon EMR #AWS Glue #AWS Lake Formation #CloudTrail

🔒 Amazon EMR and AWS Glue now include comprehensive audit context support for AWS Lake Formation credential vending APIs and AWS Glue Data Catalog GetTable and GetTables calls. Enabled by default, the feature logs platform type and identifiers (Cluster ID, Step ID, Job Run ID, Virtual Cluster ID) to AWS CloudTrail for enhanced security auditing and troubleshooting. It supports EMR 7.12+ and AWS Glue 5.1+ across all Regions that offer EMR, AWS Glue, and Lake Formation.

Tue, November 25, 2025

AWS Glue: Zero-ETL Replication for Self-Managed Databases

#AWS #Product Release #AWS EC2 #AWS Glue #Amazon Redshift #Zero-ETL

🔁 AWS Glue now supports zero-ETL for self-managed database sources, enabling no-code replication from Oracle, SQL Server, MySQL, and PostgreSQL hosted on-premises or on EC2 to Amazon Redshift. The feature auto-creates ongoing integrations to simplify setup, reduce operational overhead, and eliminate much of the engineering work previously required to build ingestion pipelines. It is available in multiple AWS Regions and aims to save teams weeks of engineering effort.

Tue, November 25, 2025

AWS Glue Data Quality Adds Rule Labeling for Reporting

#AWS #Product Release #AWS Glue #Data Quality

🔖 AWS has made AWS Glue Data Quality rule labeling generally available, allowing teams to attach custom key-value labels to data quality rules for better organization and targeted reporting. Labels can represent business context, team ownership, compliance tags, or priority and can be authored in DQDL. Because labels are queryable in rule outcomes, row-level results, and APIs, they enable focused reports and streamlined remediation workflows. The feature is available in all commercial AWS Regions where the service is available.

Tue, November 25, 2025

AWS Glue Data Quality Adds Preprocessing Queries Support

#Product Release #AWS #AWS Glue

🛠️ AWS announces general availability of AWS Glue Data Quality preprocessing queries, which let you transform data before data quality checks run through the Glue Data Catalog APIs. The feature lets you create derived columns, filter datasets, perform calculations, and validate column relationships as part of the quality evaluation. This capability removes separate preprocessing steps, streamlines workflows, and lets you tailor recommendations and rules to specific data subsets. It is available across commercial AWS Regions.

Mon, November 24, 2025

AWS Glue: Catalog Federation for Remote Iceberg Catalogs

#AWS #Product Release #AWS Glue #Apache Iceberg #AWS S3 #Lake Formation

🔗 AWS announces general availability of AWS Glue catalog federation for remote Apache Iceberg catalogs. The feature enables analytics engines to query Iceberg tables stored in Amazon S3 and cataloged remotely without moving or copying data, with real-time metadata synchronization to the AWS Glue Data Catalog. It leverages AWS Lake Formation for fine-grained access controls and supports the Iceberg REST specifications; federation is available in the Lake Formation console and via SDKs/APIs.

Fri, November 21, 2025

AWS Glue zero-ETL now supports CloudFormation & CDK

#AWS #AWS Glue #AWS CloudFormation #AWS CDK #Zero-ETL #AWS S3 #Amazon Redshift

🚀 AWS Glue zero-ETL integrations now support AWS CloudFormation and the AWS Cloud Development Kit (CDK), enabling creation and management of zero-ETL integrations using infrastructure as code. This lets teams ingest data from DynamoDB and enterprise SaaS sources (Salesforce, ServiceNow, SAP, Zendesk) into Amazon Redshift, S3, and S3 Tables. CloudFormation and CDK support makes it easier to deploy, update, and version-control zero-ETL configurations consistently across multiple AWS accounts.

Thu, November 20, 2025

AWS Glue Adds Zero-ETL Support for More SAP Entities

#AWS #AWS Glue #Amazon Redshift #Amazon SageMaker #Zero-ETL

🔄 AWS Glue now provides full snapshot and incremental zero-ETL ingestion for additional SAP entities. The update adds snapshot ingestion for entities without deletion tracking and timestamp-based incremental loads for non-ODP systems, extending existing ODP support. Organizations can ingest SAP data directly into Amazon Redshift or the lakehouse architecture used by Amazon SageMaker, reducing engineering effort and operational complexity. This feature is available in all Regions where AWS Glue zero-ETL is offered.

Thu, November 20, 2025

SageMaker Studio: Long‑Running Sessions with Corporate IDs

#AWS #Amazon SageMaker #AWS IAM #AWS EKS #AWS Glue

⏳ Amazon SageMaker Unified Studio now supports long-running background sessions using corporate identities via AWS IAM Identity Center's trusted identity propagation (TIP). Users can launch interactive notebooks and data processing on SageMaker, Amazon EMR, and AWS Glue that persist when they log off or experience network or credential interruptions. Sessions retain corporate permissions and can run up to 90 days (default 7 days), reducing the need for continuous monitoring and improving productivity for multi-hour or multi-day workloads.

Wed, November 5, 2025

AWS Glue Schema Registry Adds Native C# Client Support

#AWS #AWS Glue #AWS Glue Schema Registry #AWS Kinesis Data Streams #CSharp

🔧 AWS Glue Schema Registry now provides C# support in its client library, extending beyond the existing Java SDK to offer first-class integration for .NET streaming applications. C# services using Apache Kafka, Amazon MSK, Amazon Kinesis Data Streams, or Apache Flink can register, validate, and enforce schemas to keep producers and consumers aligned. The serverless registry enforces centralized schema validation at no additional charge. C# support is available in all regions where Glue Schema Registry is offered and the SDK is distributed via NuGet.

Fri, October 3, 2025

AWS Glue Adds Write Support for Four Application Connectors

#AWS #AWS Glue #Product Release #SAP #Salesforce #HubSpot #Marketo

🔁 AWS Glue now supports write operations for SAP OData, Adobe Marketo Engage, Salesforce Marketing Cloud, and HubSpot connectors, allowing ETL jobs to create and update records directly in those applications. The enhancement lets teams sync leads and CRM records, update subscribers and campaign data, and manage contacts, companies, and deals without custom scripts or intermediate systems. This capability simplifies end-to-end ETL pipelines and reduces integration complexity and latency. The feature is available in all Regions where AWS Glue is offered; consult the AWS Glue documentation for supported entities.

Wed, September 3, 2025

AWS Config Adds Five New Resource Types for Monitoring

#AWS #AWS Config #Product Release #AWS CodeArtifact #AWS Glue #AWS Network Manager #AWS RolesAnywhere

🔔 AWS Config now supports five additional AWS resource types, expanding its ability to discover, assess, audit, and remediate resources across your accounts. The new types — AWS::CodeArtifact::Domain, AWS::Config::ConformancePack, AWS::Glue::Database, AWS::NetworkManager::TransitGatewayPeering, and AWS::RolesAnywhere::TrustAnchor — are tracked automatically if you record all resource types and are available for Config rules and aggregators. Support applies in all Regions where these resources are available, enabling broader compliance and operational visibility. This update simplifies monitoring and remediation workflows.
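To make the discovery step concrete, here is a minimal Python sketch that pages through resources of one of the new types using the AWS Config `list_discovered_resources` call as exposed by boto3. The function name and the default resource type are illustrative choices; the client is passed in so the sketch does not assume any particular credentials or Region setup.

```python
from typing import Any, Dict, Iterator


def iter_discovered(
    config_client: Any, resource_type: str = "AWS::Glue::Database"
) -> Iterator[Dict[str, Any]]:
    """Yield resource identifiers that AWS Config has discovered for one type.

    `config_client` is expected to behave like boto3.client("config").
    """
    token = None
    while True:
        kwargs = {"resourceType": resource_type}
        if token:
            # Continue from where the previous page stopped.
            kwargs["nextToken"] = token
        page = config_client.list_discovered_resources(**kwargs)
        yield from page.get("resourceIdentifiers", [])
        token = page.get("nextToken")
        if not token:
            break
```

With boto3 installed and credentials configured, a call might look like `for r in iter_discovered(boto3.client("config")): print(r["resourceId"])`. Results appear only for resource types that the account's configuration recorder is actually recording.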

Fri, August 29, 2025

Amazon EMR Adds Spark FGAC and Glue Data Catalog Views

#Amazon EMR #Apache Spark #AWS #AWS Glue #Data Governance #Lake Formation

🔒 Amazon EMR on EC2 now supports Apache Spark native fine-grained access control (FGAC) through AWS Lake Formation and adds support for AWS Glue Data Catalog views. These capabilities let administrators define and enforce granular Lake Formation policies once and apply them consistently to Spark jobs and interactive sessions, reducing administrative overhead and security risk. Access checks support named resource grants, data filters, and tag-based controls and are logged in AWS CloudTrail for auditing.