The new way to Spark easier, smarter, and faster.
Run Apache Spark workloads on Google Cloud with less operational overhead, more AI-powered assistance, and better price-performance. Focus on your code, not your cluster.
Benefits
Easier - Eliminate the operational burden of Spark
Choose between zero-ops Google Cloud Serverless for Apache Spark or managed Dataproc clusters. Both automate away infrastructure complexity so you can accelerate your development life cycle.
Smarter - AI assisted Spark development
Accelerate your entire workflow with Gemini in Dataproc and Google Cloud Serverless for Apache Spark. Get Gemini-powered assistance to generate and debug code, and troubleshoot failed jobs.
Faster - Accelerate Spark performance
Get industry-leading price-performance, automatically. For your most demanding jobs, unlock over 4.3x faster performance with Lightning Engine. This reduces TCO and accelerates time-to-insight.
Key features
Select from Serverless for Apache Spark for zero-ops simplicity or Dataproc for managed clusters with deep customizations.
Focus solely on your code and accelerate development. With tiers for both cost-effective batch processing and high-performance AI/ML, it’s ideal for new Apache Spark pipelines, interactive analysis, and workloads with unpredictable demand where a "NoOps" model is preferred.
Best for: Data scientists & ML engineers, ad-hoc queries, new applications, developer productivity.
Get maximum control over your cluster environment. Perfect for migrating existing Apache Hadoop/Spark workloads, running long-lived persistent clusters, or using a diverse open source ecosystem.
Best for: Enterprise engineering and operations, on-prem migrations, long-running jobs, deep customization.
Customers
Documentation
Stop choosing between the power of SQL and the flexibility of Spark. BigLake lets you use both engines on the same governed data. It's a unified experience that lets you use the best tool for every job.
Go from data preparation to model training and inference, faster. Our Premium tiers are designed for AI/ML, letting you use pre-configured ML Runtimes with built-in GPU support, such as NVIDIA RAPIDS, to eliminate complex setup.
What's new
Apache Spark is a trademark of The Apache Software Foundation.
** The queries are derived from the TPC-DS standard and TPC-H standard and as such are not comparable to published TPC-DS standard and TPC-H standard results, as these runs do not comply with all requirements of the TPC-DS standard and TPC-H standard specification.
Tell us what you’re solving for. A Google Cloud expert will help you find the best solution.