All news with the #Apache DataFusion tag
Thu, September 25, 2025
R2 SQL Deep Dive: Serverless Queries over R2 Data Platform
#Product Release
#Cloudflare Workers
#R2 SQL
#R2 Data Catalog
#Apache Iceberg
#Apache Parquet
#Apache DataFusion
⚡ R2 SQL is Cloudflare’s serverless query engine that runs SQL directly against Iceberg tables stored in R2, eliminating the need for Spark or Trino clusters. The Query Planner uses R2 Data Catalog metadata and multi-level statistics to prune manifests, files, and Parquet row groups so that only the necessary bytes are read. Execution is distributed across Cloudflare’s network using Workers and query workers running Apache DataFusion, with results serialized via Apache Arrow. An ordered, streaming planning pipeline enables early termination for ORDER BY ... LIMIT queries. R2 SQL is currently available in open beta.
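
To make the pruning and early-termination ideas concrete, here is a minimal Python sketch. It is not Cloudflare's implementation. `FileStats`, `prune`, and `top_k_desc` are hypothetical names, a single integer `ts` column is assumed, and each "file" is modeled as an in-memory list. It shows only the general technique: skip files whose min/max range cannot match the filter, visit the remaining files in descending order of their maximum value, and stop once no unread file can change the top-k result.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FileStats:
    """Hypothetical per-file metadata: min/max of the sort column plus the rows."""
    path: str
    min_ts: int
    max_ts: int
    rows: List[int]  # stand-in for the ts values stored in the file


def prune(files: List[FileStats], lo: int, hi: int) -> List[FileStats]:
    """Keep only files whose [min_ts, max_ts] range overlaps the filter [lo, hi]."""
    return [f for f in files if f.max_ts >= lo and f.min_ts <= hi]


def top_k_desc(files: List[FileStats], lo: int, hi: int, k: int) -> Tuple[List[int], int]:
    """Sketch of: SELECT ts WHERE ts BETWEEN lo AND hi ORDER BY ts DESC LIMIT k.

    Returns the top-k values and the number of files actually read.
    """
    # Visit surviving files from the highest max_ts downward.
    candidates = sorted(prune(files, lo, hi), key=lambda f: f.max_ts, reverse=True)
    heap: List[int] = []  # min-heap holding the k largest values seen so far
    files_read = 0
    for f in candidates:
        # Early termination: the current k-th largest value is already at least
        # this file's maximum, so neither it nor any later file can improve the result.
        if len(heap) == k and heap[0] >= f.max_ts:
            break
        files_read += 1
        for ts in f.rows:
            if lo <= ts <= hi:
                if len(heap) < k:
                    heapq.heappush(heap, ts)
                elif ts > heap[0]:
                    heapq.heapreplace(heap, ts)
    return sorted(heap, reverse=True), files_read
```

In this toy model, a query over four files where one file lies outside the filter range and the newest remaining file already holds k matching rows reads a single file. The same min/max overlap test could in principle be applied again at the manifest and row-group levels.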