When are Spot VMs terminated?

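Spot VMs are terminated (preempted) when Google needs the capacity for regular VMs. Unlike the older Preemptible VMs, they have no 24-hour maximum runtime unless you configure one yourself. Compute Engine announces a preemption with an ACPI G2 Soft Off signal; if the VM has not stopped after 30 seconds, it forces the shutdown with an ACPI G3 Mechanical Off signal. During this window, the application should save checkpoints or shut down gracefully.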

How do Spot VMs differ from Preemptible VMs?

Spot VMs are the successor to Preemptible VMs. Both use the same pricing model, but Spot VMs have no 24-hour runtime limit and support newer features such as a configurable termination action (stop or delete). Google recommends Spot VMs for new workloads.

Which workloads are suitable for Spot VMs?

Ideal workloads are fault-tolerant and horizontally scalable, with no strict SLA requirements. Batch processing, CI/CD, rendering, data analytics, ML training, and simulation workloads benefit significantly. Avoid Spot VMs for databases, web servers, or other stateful, latency-critical services.

How can I handle Spot VM preemptions?

Implement checkpointing (regularly saving state), use shutdown scripts for graceful termination, and rely on retry logic. Managed Instance Groups can automatically start new Spot VMs. For ML training, use TensorFlow checkpoints or similar mechanisms.
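The retry logic mentioned above can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the Preempted exception, run_with_retries, and the load_checkpoint callback are hypothetical names introduced here, and the backoff values are arbitrary.

```python
import time


class Preempted(Exception):
    """Hypothetical signal that the worker VM was preempted mid-job."""


def run_with_retries(job, load_checkpoint, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run job(state), resuming from the last checkpoint after each preemption."""
    for attempt in range(1, max_attempts + 1):
        state = load_checkpoint()  # resume point saved by an earlier attempt
        try:
            return job(state)
        except Preempted:
            if attempt == max_attempts:
                raise  # give up after the configured number of attempts
            # Exponential backoff leaves time for a replacement VM to start.
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep function is injectable so the loop can be exercised without real waiting; in practice the retry would usually run on a freshly started Spot VM rather than in the same process.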

Can I combine Spot VMs with regular VMs?

Yes, hybrid setups are common. Use Spot VMs for scalable worker nodes and regular VMs for master nodes or critical components. Example: A Kubernetes cluster uses Spot VMs for batch jobs and regular VMs for API servers and databases.

What is the average preemption rate?

The preemption rate varies by region, zone, and machine type. Google has historically cited an average of around 5-15% per day. During periods of high demand, it can be higher. Use multiple zones for better availability and monitor preemption rates via Cloud Monitoring (formerly Stackdriver).
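To get a feel for what a daily preemption rate means for a job of a given length, the following sketch may help. It assumes preemptions are independent and occur at a constant daily rate, which is a simplification made here for illustration and not a model published by Google; both function names are hypothetical.

```python
def survival_probability(daily_rate, hours):
    """Chance that one Spot VM runs `hours` without preemption, assuming a constant daily rate."""
    if not 0.0 <= daily_rate <= 1.0:
        raise ValueError("daily_rate must be between 0 and 1")
    return (1.0 - daily_rate) ** (hours / 24.0)


def expected_preemptions(daily_rate, hours, vm_count):
    """Expected number of VMs (out of vm_count) preempted at least once within `hours`."""
    return vm_count * (1.0 - survival_probability(daily_rate, hours))
```

Under these assumptions, a 10% daily rate means roughly 10 of 100 VMs would be preempted at least once over 24 hours, which is why checkpoint intervals matter more as jobs get longer.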

Can I use Persistent Disks with Spot VMs?

Yes, Spot VMs can use Persistent Disks. Disks persist after preemption and can be attached to new Spot VMs. This enables stateful workloads with checkpointing. Boot disks can be configured as Persistent Disks for fast restarts.

Are Spot VMs GDPR-compliant?

Yes, Spot VMs are available in EU regions and meet all GDPR requirements. They are subject to the same compliance standards as regular Compute Engine VMs. Google Cloud offers comprehensive data protection controls and data residency options.

Spot VMs - Google Cloud Preemptible Computing · innFactory

Cost-effective VM instances for batch jobs and fault-tolerant workloads with up to 91% discount compared to regular VMs.

What are Google Cloud Spot VMs?

Spot VMs are preemptible VM instances that offer excess compute capacity from Google Cloud at significantly reduced prices. The service enables up to 91% cost savings compared to regular on-demand VMs with identical performance. The trade-off: Google can terminate (preempt) Spot VMs at any time when the capacity is needed for regular workloads. Unlike the older Preemptible VMs, Spot VMs have no fixed 24-hour maximum runtime. This makes Spot VMs ideal for fault-tolerant, horizontally scalable workloads without strict SLA requirements.

Before terminating a Spot VM, Google sends a preemption notice in the form of an ACPI G2 Soft Off signal. If the VM has not stopped after 30 seconds, Compute Engine forces the shutdown with an ACPI G3 Mechanical Off signal. During this short window, the application should save checkpoints, write state to persistent storage, or shut down gracefully. Shutdown scripts can be executed automatically to complete cleanup tasks. After preemption, the boot disk remains intact (if configured as a Persistent Disk), enabling fast restarts.

Spot VMs support most Compute Engine machine types, from micro to high-memory and high-CPU instances. They are offered across Google Cloud regions and zones, though availability and preemption rates vary between zones. Best practice is to distribute workloads across multiple zones for higher availability. Managed Instance Groups can automatically start new Spot VMs when old ones are preempted, enabling self-healing for fault-tolerant workloads.

Common Use Cases

Batch Processing and Data Pipelines

Spot VMs are ideal for batch jobs that process large amounts of data and scale horizontally. Data processing pipelines with Apache Beam, Dataflow, or Spark can use Spot VMs for worker nodes. On preemption, the pipeline simply starts new workers. Example: A nightly ETL job uses 100 Spot VMs for 2 hours and saves 90% compared to regular VMs.
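The arithmetic behind the ETL example can be written out as below. The hourly price of 0.10 per VM used in the usage note is a made-up figure for illustration, not a Google Cloud list price, and the function names are hypothetical.

```python
def job_cost(vm_count, hours, hourly_price, discount=0.0):
    """Cost of running vm_count VMs for `hours`, after a fractional discount (0.9 = 90%)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return vm_count * hours * hourly_price * (1.0 - discount)


def savings(vm_count, hours, hourly_price, discount):
    """Absolute saving of the discounted run compared to the undiscounted one."""
    full = job_cost(vm_count, hours, hourly_price)
    return full - job_cost(vm_count, hours, hourly_price, discount)
```

With the assumed price, job_cost(100, 2, 0.10) is about 20 per night on regular VMs, while job_cost(100, 2, 0.10, discount=0.90) is about 2, a saving of roughly 18 per run. Work repeated after a preemption would reduce that saving and is not modeled here.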

CI/CD Builds and Test Environments

CI/CD pipelines can use Spot VMs for build and test agents. Builds are short-lived and can be restarted on preemption. GitLab Runners, Jenkins Agents, or GitHub Actions Self-Hosted Runners benefit from Spot VMs. Example: A development organization uses Spot VMs for test environments that only run during work hours.

Rendering and Transcoding

Video rendering, 3D rendering, and media transcoding are well suited to Spot VMs. Render jobs built on tools like Blender or FFmpeg can distribute frames or segments across hundreds of Spot VMs. On preemption, the unfinished frame is simply reassigned to another worker. Example: An animation studio renders films on 500 Spot VMs and saves hundreds of thousands of euros.
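The reassignment of unfinished frames can be illustrated with a simple in-memory queue. This is only a sketch of the idea: real render farms use a distributed queue or a render manager, and WorkerPreempted, render_all, and the render callback are hypothetical names introduced here.

```python
from collections import deque


class WorkerPreempted(Exception):
    """Hypothetical error raised when the worker rendering a frame disappears."""


def render_all(frames, render, max_attempts_per_frame=10):
    """Render every frame, re-queueing a frame whenever its worker is preempted."""
    queue = deque(frames)
    attempts = {}
    results = {}
    while queue:
        frame = queue.popleft()
        attempts[frame] = attempts.get(frame, 0) + 1
        try:
            results[frame] = render(frame)
        except WorkerPreempted:
            if attempts[frame] >= max_attempts_per_frame:
                raise  # avoid looping forever on a frame that never completes
            queue.append(frame)  # hand the unfinished frame to a later worker
    return results
```

Because each frame is an independent unit of work, a preemption costs at most the time already spent on the frames that were in progress.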

Scientific Simulations and HPC

High-performance computing workloads like climate modeling, protein folding, or financial simulations can use Spot VMs. Checkpointing enables resume after preemption. Combined with Persistent Disks for state storage, long simulation runs are possible. Example: A research institute simulates weather patterns on 1,000 Spot VMs with hourly checkpoints.

Machine Learning Training

ML training jobs are often long-running and benefit significantly from Spot VMs. TensorFlow, PyTorch, and other frameworks support checkpointing. On preemption, training loads the last checkpoint and continues. Combine Spot VMs for training and regular VMs for inference. Example: An ML team trains models on Spot VMs and saves 85% of compute costs.

Best Practices

Implement Checkpointing

Regularly save application state to persistent storage (Cloud Storage, Persistent Disk). On preemption, the application can resume from the last checkpoint instead of starting from scratch. Checkpoint frequency should be aligned with workload length: short jobs every 5-10 minutes, long jobs hourly.
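A minimal checkpointing sketch is shown below, assuming the state fits in a JSON file on a Persistent Disk or other mounted storage. The function names and the file layout are illustrative assumptions. The write goes to a temporary file first and is then renamed, so a preemption in the middle of a save does not leave a half-written checkpoint behind.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state as JSON atomically so an interrupted write cannot corrupt the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reached the disk
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default=None):
    """Return the last saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

A job would call load_checkpoint once at startup and save_checkpoint at the chosen interval; for Cloud Storage, the same pattern applies with an object upload in place of the rename.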

Use Shutdown Scripts

Configure shutdown scripts that are executed automatically on preemption. These can save state, complete in-flight requests, or perform cleanup. Note the 30-second limit: scripts must be fast. To detect a preemption from inside the VM, query the instance/preempted value on the Compute Engine metadata server.
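Detecting a preemption from inside the VM could look like the following sketch. It relies on the metadata server's instance/preempted value, which reports TRUE once the VM has been preempted, and on the wait_for_change query parameter to block until the value changes. The function names and the injectable fetch parameter are assumptions made here so the loop can be tried without a real VM.

```python
import urllib.request

PREEMPTED_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
    "?wait_for_change=true"
)


def fetch_metadata(url, timeout=None):
    """Read one value from the metadata server (only reachable from inside a VM)."""
    request = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def wait_for_preemption(on_preempted, fetch=fetch_metadata):
    """Block until the metadata server reports preemption, then call on_preempted()."""
    while True:
        # Loop in case the hanging request returns before the value is TRUE.
        if fetch(PREEMPTED_URL) == "TRUE":
            on_preempted()
            return
```

Running wait_for_preemption in a background thread lets the main workload continue while the callback triggers a final checkpoint.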

Distribute Workloads Across Zones

Use multiple zones for better availability. Managed Instance Groups can automatically start Spot VMs in different zones. This reduces the risk of all instances being preempted simultaneously. Regional Managed Instance Groups simplify multi-zone deployments.
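The effect of spreading instances evenly can be made concrete with a small calculation. Regional Managed Instance Groups handle the actual placement; the sketch below only illustrates the even split and the share of capacity lost if one zone is preempted entirely. Both function names are hypothetical.

```python
def spread_across_zones(total_vms, zones):
    """Split total_vms as evenly as possible across the given zones."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(total_vms, len(zones))
    # The first `extra` zones receive one additional VM each.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def worst_case_loss(placement):
    """Fraction of all VMs lost if the most heavily used zone is preempted entirely."""
    total = sum(placement.values())
    return max(placement.values()) / total if total else 0.0
```

For 10 VMs in europe-west3-a, -b, and -c, the split is 4, 3, and 3, so losing one whole zone removes at most 40% of capacity instead of 100% in a single-zone setup.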

Combine Spot and On-Demand VMs

Hybrid setups offer the best of both worlds: Spot VMs for scalable, fault-tolerant worker nodes and regular VMs for master nodes, databases, or critical services. Example: A Kubernetes cluster uses Spot VMs for 80% of worker nodes and regular VMs for control plane and stateful workloads.

Monitor Preemption Rates

Monitor preemption metrics per zone and machine type. High preemption rates may indicate capacity constraints. If rates stay high, switch to other zones or machine types. Use Cloud Monitoring dashboards to track preemption trends.
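Once preemption events have been exported, for example from logs or monitoring data, rates per zone and machine type can be aggregated as sketched below. The input format of (zone, machine type, VM-days, preemptions) tuples and both function names are assumptions for illustration; the export itself is not shown.

```python
from collections import defaultdict


def preemption_rates(observations):
    """Aggregate (zone, machine_type, vm_days, preemptions) rows into preemptions per VM-day."""
    totals = defaultdict(lambda: [0.0, 0])
    for zone, machine_type, vm_days, preemptions in observations:
        totals[(zone, machine_type)][0] += vm_days
        totals[(zone, machine_type)][1] += preemptions
    return {key: count / days for key, (days, count) in totals.items() if days > 0}


def over_threshold(rates, threshold):
    """Return the (zone, machine_type) pairs whose rate exceeds the threshold, sorted."""
    return sorted(key for key, rate in rates.items() if rate > threshold)
```

Pairs returned by over_threshold are candidates for moving to another zone or machine type.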

Use Instance Templates

Define Instance Templates for consistent Spot VM configuration. Templates can include startup scripts, Persistent Disks, and metadata. Managed Instance Groups use templates for automatic scaling and self-healing. Updates are done via rolling updates without manual VM configuration.
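As a rough sketch, the request body for a Spot VM instance template could be assembled as shown below. The scheduling fields follow the Compute Engine REST API as far as we are aware, but should be checked against the current API reference; the template name, machine type, image, and network in the example are placeholders, and the helper functions are hypothetical.

```python
def spot_scheduling(termination_action="STOP"):
    """Build the `scheduling` section that marks a VM as a Spot VM."""
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be 'STOP' or 'DELETE'")
    return {
        "provisioningModel": "SPOT",
        "instanceTerminationAction": termination_action,
        "automaticRestart": False,  # Spot VMs are not restarted automatically
        "onHostMaintenance": "TERMINATE",
    }


def instance_template_body(name, machine_type, source_image, termination_action="STOP"):
    """Assemble a minimal instance template body with one boot disk and the default network."""
    return {
        "name": name,
        "properties": {
            "machineType": machine_type,
            "disks": [
                {
                    "boot": True,
                    "autoDelete": True,
                    "initializeParams": {"sourceImage": source_image},
                }
            ],
            "networkInterfaces": [{"network": "global/networks/default"}],
            "scheduling": spot_scheduling(termination_action),
        },
    }
```

The resulting dictionary could be sent with any HTTP or API client; keeping it in code makes the Spot settings reviewable alongside the rest of the deployment.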

Google Cloud Spot VMs Comparison

vs. AWS EC2 Spot: Both offer similar discounts (up to about 90%). AWS Spot prices vary with supply and demand, and you can optionally set a maximum price. GCP Spot prices are set by Google and can change over time, with discounts in the range of 60-91% and no bidding. AWS offers Spot Fleet for requesting capacity across instance types.

vs. Azure Spot VMs: Azure Spot has variable pricing like AWS, with optional max-price limits. GCP Spot VMs need no max-price setting. Azure offers eviction policies (deallocate vs. delete); GCP offers a comparable termination action (stop or delete). Preemption rates are broadly similar across all three providers.

vs. On-Demand VMs: Spot VMs offer identical performance at a fraction of the cost, with discounts of up to 91%. The trade-off is preemption risk and no SLA. For production workloads with SLA requirements, on-demand VMs are necessary. Hybrid setups combine both for a good balance of cost and availability.

Integration with innFactory

As a Google Cloud partner, innFactory supports you in migrating cost-intensive batch workloads to Spot VMs, implementing checkpointing strategies, and architecting fault-tolerant systems. We help with Managed Instance Group setup, hybrid architectures, and cost optimization for your compute workloads.
