All news with #pyspark tag

Tue, September 16, 2025

Data Science Agent Adds BigQuery ML, DataFrames, and Spark

🧭 Google Cloud has expanded the Data Science Agent in Colab Enterprise notebooks to support BigQuery ML, BigQuery DataFrames, and Spark, enabling large-scale data transformation, model training, and inference directly on BigQuery or via Serverless for Apache Spark. The agent can now automatically retrieve BigQuery table metadata, and you can add tables from your current project to the prompt context with an @ mention. To invoke a specific framework, include a keyword such as BigQuery ML, BigFrames, or PySpark in the prompt; sample prompts are provided to guide forecasting, supervised learning, and dimensionality reduction workflows. Notable limitations: generated PySpark code targets Spark 4.0, and @ mentions search only the current project. The BigQuery improvements are available now in BigQuery notebooks and are coming soon to Vertex AI.

Wed, August 20, 2025

AWS Clean Rooms adds PySpark error message controls

🔧 AWS Clean Rooms now lets code authors configure the level of error message detail for analyses that use PySpark. When every collaboration member approves an analysis, authors can enable more detailed error messages to speed up debugging and testing. This reduces troubleshooting time for models such as marketing attribution from weeks to hours or days while preserving collaborator data protections.
