Dataproc Serverless for Apache Spark - Serverless Spark Execution

Dataproc Serverless enables running Apache Spark jobs without cluster management on Google Cloud.

Data Analytics
Pricing Model: Pay-per-use (per DCU-hour)
Availability: Global, including EU regions
Data Sovereignty: EU regions available
Reliability: 99.9% SLA

What is Dataproc Serverless for Apache Spark?

Dataproc Serverless for Apache Spark is a service from Google Cloud that enables running Apache Spark jobs without cluster management. You submit your Spark code, and the platform automatically provisions the required resources, runs the job, and releases the resources.

Unlike Dataproc on Compute Engine, there is no need to provision, configure, or manage clusters. Jobs start in seconds instead of minutes, and billing is purely usage-based.
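As a rough illustration of what "submitting your Spark code" can look like, the sketch below assembles a `gcloud dataproc batches submit pyspark` command for a PySpark file stored in Cloud Storage. The helper name `build_batch_submit_command`, the bucket names, the region, and the job arguments are all hypothetical placeholders; the helper only builds the command and does not run it.

```python
import shlex
from typing import List, Optional, Sequence


def build_batch_submit_command(
    main_file_uri: str,
    region: str,
    job_args: Sequence[str] = (),
    deps_bucket: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a gcloud command that submits a PySpark batch.

    All argument values are supplied by the caller; nothing here is
    specific to a real project.
    """
    cmd = [
        "gcloud", "dataproc", "batches", "submit", "pyspark",
        main_file_uri,
        f"--region={region}",
    ]
    if deps_bucket:
        # Bucket used for uploading local workload dependencies.
        cmd.append(f"--deps-bucket={deps_bucket}")
    if job_args:
        # Everything after "--" is passed to the Spark job itself.
        cmd.append("--")
        cmd.extend(job_args)
    return cmd


# Hypothetical values, for illustration only.
example = build_batch_submit_command(
    "gs://example-bucket/jobs/etl.py",
    region="europe-west3",
    job_args=["--date", "2024-01-31"],
)
print(shlex.join(example))
```

The printed command can be pasted into a shell where the gcloud CLI is installed and authenticated, or passed to `subprocess.run` as a list.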

Core Features

  • No cluster management: Spark jobs without provisioning or configuring clusters
  • Fast start: Jobs begin within seconds, without the roughly 90-second wait for cluster creation
  • Auto-scaling: Automatic resource adjustment during job execution
  • BigQuery integration: Direct reading and writing of BigQuery tables in Spark jobs
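To make the BigQuery integration more concrete, here is a minimal PySpark sketch that reads one table, aggregates it, and writes the result to another table. It assumes the spark-bigquery connector is available in the Spark runtime and uses the connector's indirect write path through a temporary Cloud Storage bucket. The function names, table names, bucket, and grouping column are hypothetical.

```python
from typing import Dict


def bigquery_write_options(dest_table: str, temp_bucket: str) -> Dict[str, str]:
    """Options for an indirect BigQuery write staged through Cloud Storage."""
    return {"table": dest_table, "temporaryGcsBucket": temp_bucket}


def copy_aggregated(spark, source_table: str, dest_table: str,
                    temp_bucket: str, group_col: str) -> None:
    """Read a BigQuery table, count rows per group, write the counts back.

    `spark` is an existing SparkSession; `group_col` is assumed to be a
    column of the source table.
    """
    df = spark.read.format("bigquery").option("table", source_table).load()
    counts = df.groupBy(group_col).count()
    (
        counts.write.format("bigquery")
        .options(**bigquery_write_options(dest_table, temp_bucket))
        .mode("overwrite")
        .save()
    )


def main(source_table: str, dest_table: str, temp_bucket: str, group_col: str) -> None:
    # Imported lazily so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("bq-aggregate").getOrCreate()
    copy_aggregated(spark, source_table, dest_table, temp_bucket, group_col)
    spark.stop()
```

To use this as the main file of a batch, add a final line that calls `main(...)` with your own table names, bucket, and column.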

Typical Use Cases

Ad-Hoc Data Analysis

Data scientists and analysts use Dataproc Serverless for exploratory analysis with Spark without waiting for or managing clusters. Notebook sessions are available without first creating a cluster.

Scheduled ETL Pipelines

Regularly executed Spark ETL jobs benefit from Dataproc Serverless since no clusters need to be maintained between executions. Integration with Cloud Composer enables orchestration.
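For scheduled runs, each execution is typically submitted as its own batch with a unique ID. The sketch below builds such an ID from the run date and assembles a batch description for a PySpark job. In a Cloud Composer (Airflow) DAG, a dictionary of this shape can be passed as the `batch` argument of the Google provider's `DataprocCreateBatchOperator`. The helper names, the prefix, the file URI, and the ID format check (lowercase letters, digits, and hyphens, 4 to 63 characters) are assumptions made for this example.

```python
import re
from datetime import date
from typing import Any, Dict, Sequence

# Assumed ID format: lowercase letters, digits, hyphens; 4 to 63 characters.
_BATCH_ID_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]{2,61}[a-z0-9]$")


def make_batch_id(prefix: str, run_date: date) -> str:
    """Derive a per-run batch ID such as 'daily-etl-20240131'."""
    batch_id = f"{prefix}-{run_date:%Y%m%d}"
    if not _BATCH_ID_PATTERN.match(batch_id):
        raise ValueError(f"invalid batch id: {batch_id!r}")
    return batch_id


def build_pyspark_batch(main_file_uri: str, args: Sequence[str] = ()) -> Dict[str, Any]:
    """Describe a PySpark batch as a plain dictionary."""
    return {
        "pyspark_batch": {
            "main_python_file_uri": main_file_uri,
            "args": list(args),
        }
    }


# Hypothetical values, for illustration only.
run_day = date(2024, 1, 31)
batch_id = make_batch_id("daily-etl", run_day)
batch = build_pyspark_batch(
    "gs://example-bucket/jobs/etl.py",
    ["--date", run_day.isoformat()],
)
print(batch_id, batch)
```

Deriving the ID from the schedule date keeps reruns of the same day easy to identify, since a second submission with the same ID is rejected rather than silently duplicated.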

Benefits

  • No infrastructure management or cluster tuning
  • Faster iteration cycles for data engineers
  • Cost-effective: pay only for actual execution time
  • Seamless integration with BigQuery, Cloud Storage, and Vertex AI

Integration with innFactory

As a Google Cloud Partner, innFactory supports you with Dataproc Serverless: Spark job migration, pipeline architecture, and cost optimization.

Frequently Asked Questions

What is Dataproc Serverless for Apache Spark?

Dataproc Serverless enables running Apache Spark jobs without cluster provisioning or management. Google Cloud handles the infrastructure entirely, and jobs start within seconds.

What is the difference from Dataproc on Compute Engine?

With Dataproc on Compute Engine, you provision and configure your own clusters. With Dataproc Serverless, you only submit Spark code and the platform handles all infrastructure aspects.

How is Dataproc Serverless billed?

Billing is per Dataproc Compute Unit (DCU) hour. You only pay for resources actually used during job execution, with no costs for idle time.
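The DCU-hour model can be illustrated with simple arithmetic: multiply the number of DCUs by the runtime in hours, then by the price per DCU-hour. The sketch below does exactly that. The price used in the example is a made-up placeholder, not an actual Google Cloud rate, and the helper ignores any other charges or billing rules that may apply.

```python
def dcu_hours(dcu_count: float, runtime_seconds: float) -> float:
    """DCU-hours consumed by a job that holds `dcu_count` DCUs for the given time."""
    return dcu_count * runtime_seconds / 3600.0


def estimate_cost(dcu_count: float, runtime_seconds: float,
                  price_per_dcu_hour: float) -> float:
    """Estimated compute cost; the price must be looked up by the caller."""
    return dcu_hours(dcu_count, runtime_seconds) * price_per_dcu_hour


# Example: 8 DCUs for 15 minutes at a hypothetical price of 0.10 per DCU-hour.
usage = dcu_hours(8, 15 * 60)           # 2.0 DCU-hours
cost = estimate_cost(8, 15 * 60, 0.10)  # 2.0 DCU-hours times the placeholder price
print(f"{usage:.2f} DCU-hours, estimated cost {cost:.2f}")
```

Because auto-scaling changes the DCU count during a run, a real estimate would sum this calculation over each interval with a constant DCU count.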

Google Cloud Partner

innFactory is a certified Google Cloud Partner. We provide expert consulting, implementation, and managed services.

Comparable Products from Other Clouds

As a multi-cloud partner, we help you choose the right platform for your specific requirements.

Ready to start with Dataproc Serverless for Apache Spark?

Our certified Google Cloud experts help you with architecture, integration, and optimization.