All news with #aws glue tag
Sun, November 30, 2025
AWS Glue Adds Apache Iceberg-Based Materialized Views
⚡ AWS Glue now supports materialized views stored in Apache Iceberg format and managed in the AWS Glue Data Catalog. Data teams can create views with standard Spark SQL, attach a refresh schedule, and rely on automatic change detection, incremental updates, and managed compute for refresh jobs. Query engines across Athena, EMR, and AWS Glue rewrite queries to use these views, improving performance by up to 8x and lowering compute costs, while SQL tools like Redshift and SageMaker can read the Iceberg tables directly.
Wed, November 26, 2025
AWS Adds Apache Iceberg V3 Deletion Vectors and Lineage
🔔 AWS now supports Apache Iceberg V3 deletion vectors and row lineage across key analytics services. These features, available in Amazon EMR 7.12, AWS Glue, SageMaker notebooks, Amazon S3 Tables, and the AWS Glue Data Catalog, accelerate data modifications and make it simpler to identify changed records. Enable V3 by setting the table property 'format-version' to 3 in CREATE TABLE or by updating the table's metadata; supported AWS query engines then use deletion vectors and row lineage automatically.
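As a minimal sketch of the two routes mentioned above (setting the property at creation time, or updating an existing table), the Python helpers below build the corresponding Spark SQL statements. The catalog, table, and column names are hypothetical, and the statements assume a Spark session that is already configured with an Iceberg catalog.

```python
def create_v3_table_sql(table: str, columns: dict[str, str]) -> str:
    """Build a CREATE TABLE statement that requests Iceberg format version 3."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE TABLE {table} ({cols}) USING iceberg "
        "TBLPROPERTIES ('format-version' = '3')"
    )


def upgrade_to_v3_sql(table: str) -> str:
    """Build an ALTER TABLE statement that sets format-version to 3 on an existing table."""
    return f"ALTER TABLE {table} SET TBLPROPERTIES ('format-version' = '3')"


if __name__ == "__main__":
    # Hypothetical identifiers, used only for illustration.
    table = "glue_catalog.sales.orders"
    print(create_v3_table_sql(table, {"id": "bigint", "status": "string"}))
    print(upgrade_to_v3_sql(table))
    # With a live session, each statement would be passed to spark.sql(...).
```

Keeping the statements as plain strings makes them easy to review before they run against a real catalog.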
Wed, November 26, 2025
AWS Glue 5.1 GA: Spark 3.5, Iceberg 3.0, Lake Formation
⚡ AWS Glue 5.1 is now generally available, upgrading core engines to Apache Spark 3.5.6, Python 3.11, and Scala 2.12.18 to deliver performance and security improvements. The release refreshes open table format support (Apache Hudi 1.0.2, Apache Iceberg 1.10.0, Delta Lake 3.3.2) and adds Apache Iceberg format 3.0 features such as default column values and deletion vectors. AWS Lake Formation now enforces fine‑grained write control for Spark DDL/DML, and Glue adds full‑table access control for Hudi and Delta tables in Spark.
Wed, November 26, 2025
Amazon EMR and AWS Glue Enforce Lake Formation Write FGAC
🔐 Amazon has extended AWS Lake Formation fine-grained access control to include write operations for tables registered with Lake Formation when used in Apache Spark jobs on Amazon EMR and AWS Glue. Administrators can now enforce table-, column-, and row-level permissions for DDL and DML actions (CREATE, ALTER, INSERT, UPDATE, DELETE, MERGE INTO, DROP) as well as read operations, enabling single-job read/write pipelines. The change reduces the need for separate clusters or applications and centralizes governance. The feature is available in all Regions where EMR, Glue, and Lake Formation are supported.
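For illustration only, the sketch below builds a table-level grant request in the shape accepted by the Lake Formation GrantPermissions API (boto3 `grant_permissions`). The role ARN, database, and table names are hypothetical, and the announcement does not state which Lake Formation permissions each Spark statement requires, so the default permission list here is an assumption.

```python
def build_write_grant(principal_arn, database, table, permissions=("INSERT", "DELETE")):
    """Return keyword arguments for a table-level Lake Formation grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": list(permissions),
    }


if __name__ == "__main__":
    # Hypothetical principal and table, used only for illustration.
    request = build_write_grant(
        "arn:aws:iam::111122223333:role/etl-job-role", "sales", "orders"
    )
    print(request)
    # With credentials configured, the request could be sent with:
    #   boto3.client("lakeformation").grant_permissions(**request)
```

Building the request separately from the API call lets it be reviewed or logged before any permission changes.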
Wed, November 26, 2025
Amazon EMR and AWS Glue Add Audit Context for Lake Formation
🔒 Amazon EMR and AWS Glue now include comprehensive audit context support for AWS Lake Formation credential vending APIs and AWS Glue Data Catalog GetTable and GetTables calls. Enabled by default, the feature logs platform type and identifiers (Cluster ID, Step ID, Job Run ID, Virtual Cluster ID) to AWS CloudTrail for enhanced security auditing and troubleshooting. It supports EMR 7.12+ and AWS Glue 5.1+ across all Regions that offer EMR, AWS Glue, and Lake Formation.
Tue, November 25, 2025
AWS Glue: Zero-ETL Replication for Self-Managed Databases
🔁 AWS Glue now supports zero-ETL for self-managed database sources, enabling no-code replication from Oracle, SQL Server, MySQL, and PostgreSQL hosted on-premises or on EC2 to Amazon Redshift. The feature auto-creates ongoing integrations to simplify setup, reduce operational overhead, and eliminate much of the engineering work previously required to build ingestion pipelines. It is available in multiple AWS Regions and aims to save teams weeks of engineering effort.
Tue, November 25, 2025
AWS Glue Data Quality Adds Rule Labeling for Reporting
🔖 AWS has made AWS Glue Data Quality rule labeling generally available, allowing teams to attach custom key-value labels to data quality rules for better organization and targeted reporting. Labels can represent business context, team ownership, compliance tags, or priority and can be authored in DQDL. Queryable in rule outcomes, row-level results, and APIs, labels enable focused reports and streamlined remediation workflows across all commercial AWS Regions where the service is available.
Tue, November 25, 2025
AWS Glue Data Quality Adds Preprocessing Queries Support
🛠️ AWS announces general availability of AWS Glue Data Quality preprocessing queries, enabling transformations before running data quality checks through the Glue Data Catalog APIs. The feature lets you create derived columns, filter datasets, perform calculations, and validate column relationships as part of the quality evaluation. This capability removes separate preprocessing steps, streamlines workflows, and tailors recommendations and rules to specific data subsets across commercial AWS Regions.
Mon, November 24, 2025
AWS Glue: Catalog Federation for Remote Iceberg Catalogs
🔗 AWS announces general availability of AWS Glue catalog federation for remote Apache Iceberg catalogs. The feature enables analytics engines to query Iceberg tables stored in Amazon S3 and cataloged remotely without moving or copying data, with real-time metadata synchronization to the AWS Glue Data Catalog. It leverages AWS Lake Formation for fine-grained access controls and supports the Iceberg REST specifications; federation is available in the Lake Formation console and via SDKs/APIs.
Fri, November 21, 2025
AWS Glue zero-ETL now supports CloudFormation & CDK
🚀 AWS Glue zero-ETL integrations now support AWS CloudFormation and the AWS Cloud Development Kit (CDK), enabling creation and management of zero-ETL integrations using infrastructure as code. This lets teams ingest data from DynamoDB and enterprise SaaS sources (Salesforce, ServiceNow, SAP, Zendesk) into Amazon Redshift, S3, and S3 Tables. CloudFormation and CDK support makes it easier to deploy, update, and version-control zero-ETL configurations consistently across multiple AWS accounts.
Thu, November 20, 2025
AWS Glue Adds Zero-ETL Support for More SAP Entities
🔄 AWS Glue now provides full snapshot and incremental zero-ETL ingestion for additional SAP entities. The update adds snapshot ingestion for entities without deletion tracking and timestamp-based incremental loads for non-ODP systems, extending existing ODP support. Organizations can ingest SAP data directly into Amazon Redshift or the lakehouse architecture used by Amazon SageMaker, reducing engineering effort and operational complexity. This feature is available in all Regions where AWS Glue zero-ETL is offered.
Thu, November 20, 2025
SageMaker Studio: Long‑Running Sessions with Corporate IDs
⏳ Amazon SageMaker Unified Studio now supports long-running background sessions using corporate identities via AWS IAM Identity Center's trusted identity propagation (TIP). Users can launch interactive notebooks and data processing on SageMaker, Amazon EMR, and AWS Glue that persist when they log off or experience network or credential interruptions. Sessions retain corporate permissions and can run up to 90 days (default 7 days), reducing the need for continuous monitoring and improving productivity for multi-hour or multi-day workloads.
Wed, November 5, 2025
AWS Glue Schema Registry Adds Native C# Client Support
🔧 AWS Glue Schema Registry now provides C# support in its client library, extending beyond the existing Java SDK to offer first-class integration for .NET streaming applications. C# services using Apache Kafka, Amazon MSK, Amazon Kinesis Data Streams, or Apache Flink can register, validate, and enforce schemas to keep producers and consumers aligned. The serverless registry enforces centralized schema validation at no additional charge. C# support is available in all Regions where Glue Schema Registry is offered, and the SDK is distributed via NuGet.
Fri, October 3, 2025
AWS Glue Adds Write Support for Four Application Connectors
🔁 AWS Glue now supports write operations for SAP OData, Adobe Marketo Engage, Salesforce Marketing Cloud, and HubSpot connectors, allowing ETL jobs to create and update records directly in those applications. The enhancement lets teams sync leads and CRM records, update subscribers and campaign data, and manage contacts, companies, and deals without custom scripts or intermediate systems. This capability simplifies end-to-end ETL pipelines and reduces integration complexity and latency. The feature is available in all Regions where AWS Glue is offered; consult the AWS Glue documentation for supported entities.
Wed, September 3, 2025
AWS Config Adds Five New Resource Types for Monitoring
🔔 AWS Config now supports five additional AWS resource types, expanding its ability to discover, assess, audit, and remediate resources across your accounts. The new types — AWS::CodeArtifact::Domain, AWS::Config::ConformancePack, AWS::Glue::Database, AWS::NetworkManager::TransitGatewayPeering, and AWS::RolesAnywhere::TrustAnchor — are tracked automatically if you record all resource types and are available for Config rules and aggregators. Support applies in all Regions where these resources are available, enabling broader compliance and operational visibility. This update simplifies monitoring and remediation workflows.
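One way to check that Glue databases are being recorded is an AWS Config advanced query. The sketch below assumes a boto3 Config client is passed in by the caller and uses its `select_resource_config` call; the helper name and the selected fields are illustrative choices, not part of the announcement.

```python
import json

# Advanced-query expression for one of the newly supported resource types.
QUERY = (
    "SELECT resourceId, resourceName, awsRegion "
    "WHERE resourceType = 'AWS::Glue::Database'"
)


def list_glue_databases(config_client):
    """Return all recorded AWS::Glue::Database resources as dictionaries."""
    results, token = [], None
    while True:
        kwargs = {"Expression": QUERY}
        if token:
            kwargs["NextToken"] = token
        page = config_client.select_resource_config(**kwargs)
        # Each result row is returned as a JSON-encoded string.
        results.extend(json.loads(row) for row in page.get("Results", []))
        token = page.get("NextToken")
        if not token:
            return results
```

With credentials configured, a caller might run `list_glue_databases(boto3.client("config"))`. Passing the client in also makes the helper easy to exercise with a stub.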
Fri, August 29, 2025
Amazon EMR Adds Spark FGAC and Glue Data Catalog Views
🔒 Amazon EMR on EC2 now supports Apache Spark native fine-grained access control (FGAC) through AWS Lake Formation and adds support for AWS Glue Data Catalog views. These capabilities let administrators define and enforce granular Lake Formation policies once and apply them consistently to Spark jobs and interactive sessions, reducing administrative overhead and security risk. Access checks support named resource grants, data filters, and tag-based controls and are logged in AWS CloudTrail for auditing.