What is Parallelstore?
Parallelstore is a fully managed, high-performance parallel file system from Google Cloud. It targets high performance computing as well as AI and ML workloads that require extreme IOPS and high throughput at low latency. The parallel file system enables many clients to access the same data concurrently while preserving data integrity.
Conventional storage options often become a bottleneck for data-intensive training and simulations, leaving GPUs and CPUs idle while they wait for data. Parallelstore addresses this with high aggregate throughput and low latency, which shortens training and computation times. Because data resides on local SSDs protected by 2+1 erasure coding, the system is designed as fast scratch storage for temporary workloads rather than as durable primary storage.
Core Features
- High throughput and IOPS: Performance scales per TiB with around 1.15 GiBps read and 0.5 GiBps write throughput, plus roughly 30,000 read and 10,000 write IOPS per TiB.
- Low latency and concurrency: About 0.3 ms latency for 4K reads and support for up to 4,000 concurrent client processes.
- POSIX compliance and integration: The file system is POSIX-compliant and mounts to Compute Engine VMs and Google Kubernetes Engine (GKE) through a CSI driver.
- Fast Cloud Storage transfer: Batch data transfer to and from Cloud Storage at up to 20 GiBps or 5,000 files per second.
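To make the per-TiB figures above concrete, the following is a minimal sketch that multiplies them by a provisioned capacity. It assumes performance scales linearly with capacity, which the feature list implies but does not guarantee. The function and class names are illustrative and not part of any Google Cloud API.

```python
from dataclasses import dataclass

# Approximate per-TiB figures quoted in the feature list above.
READ_GIBPS_PER_TIB = 1.15
WRITE_GIBPS_PER_TIB = 0.5
READ_IOPS_PER_TIB = 30_000
WRITE_IOPS_PER_TIB = 10_000


@dataclass(frozen=True)
class PerformanceEstimate:
    read_gibps: float
    write_gibps: float
    read_iops: int
    write_iops: int


def estimate_performance(capacity_tib: float) -> PerformanceEstimate:
    """Rough aggregate performance, assuming linear per-TiB scaling."""
    if capacity_tib <= 0:
        raise ValueError("capacity_tib must be positive")
    return PerformanceEstimate(
        read_gibps=capacity_tib * READ_GIBPS_PER_TIB,
        write_gibps=capacity_tib * WRITE_GIBPS_PER_TIB,
        read_iops=round(capacity_tib * READ_IOPS_PER_TIB),
        write_iops=round(capacity_tib * WRITE_IOPS_PER_TIB),
    )


if __name__ == "__main__":
    # Hypothetical 12 TiB instance.
    print(estimate_performance(12))
```

Treat the result as a planning estimate only. Real throughput depends on client count, I/O pattern, and network configuration.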
Typical Use Cases
- AI and ML training: Training large models requires delivering extensive datasets quickly and in parallel to many accelerators. Parallelstore reduces GPU idle time and thereby shortens training duration.
- High performance computing: Simulations and scientific computations need concurrent access from many nodes to shared data. The parallel file system provides the aggregate throughput required at low latency.
- Scratch storage for batch jobs: Compute-intensive pipelines use Parallelstore as fast intermediate storage. Data can be loaded from Cloud Storage at high speed, processed, and results written back.
Benefits
- High throughput and IOPS for data-intensive HPC and AI/ML workloads
- Shorter training and computation times through reduced accelerator idle time
- Seamless integration with Compute Engine and GKE via a CSI driver
- Availability in EU regions for data residency requirements
Integration with innFactory
As a certified Google Cloud Partner, innFactory supports you with the adoption and operation of Parallelstore.
Frequently Asked Questions
What is Parallelstore?
Parallelstore is a fully managed, high-performance parallel file system from Google Cloud. It targets high performance computing as well as AI and ML workloads that require extreme IOPS and high throughput at low latency. The system is POSIX-compliant and supports concurrent multi-client access.
When should I use Parallelstore?
Parallelstore fits AI/ML training and inference, HPC simulations, and compute-intensive batch jobs where many clients access the same data concurrently. Because the system runs on local SSDs with 2+1 erasure coding, it is designed for temporary scratch data rather than as durable primary storage.
How much does Parallelstore cost?
Parallelstore is billed on provisioned capacity rather than used storage. Billing is per GiB in one-second increments. You can find the exact per-region rates on the official Google Cloud pricing page.
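The billing model described above can be sketched as a small calculation. This is an illustration under stated assumptions, not the official pricing formula: the rate unit (price per GiB per hour), the rate value in the example, and the rounding up to whole seconds are all hypothetical. Use the actual rates and rules from the pricing page.

```python
import math

GIB_PER_TIB = 1024
SECONDS_PER_HOUR = 3600


def estimate_cost(
    provisioned_tib: float,
    runtime_seconds: float,
    rate_per_gib_hour: float,  # hypothetical unit; check the pricing page
) -> float:
    """Estimate cost from provisioned capacity and instance lifetime.

    Used storage does not appear here because billing is based on
    provisioned capacity.
    """
    if provisioned_tib <= 0 or runtime_seconds < 0 or rate_per_gib_hour < 0:
        raise ValueError("inputs must be non-negative, capacity positive")
    provisioned_gib = provisioned_tib * GIB_PER_TIB
    # Assumption: partial seconds are rounded up to the next full second.
    billed_seconds = math.ceil(runtime_seconds)
    return provisioned_gib * rate_per_gib_hour * billed_seconds / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Hypothetical example: 12 TiB kept for a 6-hour training run
    # at a made-up rate of 0.0001 per GiB per hour.
    print(round(estimate_cost(12, 6 * 3600, 0.0001), 4))
```

Because capacity is billed whether or not it is filled, deleting a scratch instance as soon as a job finishes is the main lever for controlling cost.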
What throughput and capacity does Parallelstore offer?
Usable capacity ranges from 12 TiB to 100 TiB. Performance scales per TiB: around 1.15 GiBps read and 0.5 GiBps write throughput, roughly 30,000 read and 10,000 write IOPS, and about 0.3 ms latency for 4K reads. It supports up to 4,000 concurrent client processes.
