What is Amazon FSx for Lustre?
Amazon FSx for Lustre is a fully managed, high-performance parallel file system based on the open-source Lustre file system. It is designed for workloads that require extremely fast access to large data sets, such as high-performance computing, machine learning training, and media processing.
FSx for Lustre integrates seamlessly with Amazon S3. Data in S3 is made available as files in the Lustre file system without needing to copy it first. After processing, results can be automatically written back to S3.
Core Features
- Sub-Millisecond Latency: Parallel file system with consistently low access times
- S3 Integration: Automatic loading of data from S3 on first access and write-back after processing
- Scratch and Persistent: Scratch file systems for temporary, short-term processing and persistent file systems for long-running workloads
- Scalability: Throughput scales from a few GB/s to hundreds of GB/s
- POSIX-Compatible: Standard POSIX file system interface for existing Linux applications
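Because the file system exposes a standard POSIX interface, existing file-handling code runs against it unchanged. The following is a minimal sketch, not a reference implementation: it reads every file under a directory concurrently using only the Python standard library. The mount point `/fsx`, the `FSX_MOUNT` environment variable, and the function names are assumptions made for illustration.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical mount point: the real path depends on how the file system
# was mounted on the client. Nothing here is specific to Lustre.
DATA_ROOT = Path(os.environ.get("FSX_MOUNT", "/fsx"))


def checksum(path, chunk_size=1 << 20):
    """Read one file in 1 MiB chunks with plain POSIX reads; return its SHA-256."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root, workers=8):
    """Checksum every regular file under root, reading several files at once."""
    files = sorted(p for p in root.rglob("*") if p.is_file())
    keys = [p.relative_to(root).as_posix() for p in files]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so keys and digests line up.
        return dict(zip(keys, pool.map(checksum, files)))
```

Calling `checksum_tree(DATA_ROOT)` on a client with the file system mounted would return a mapping of relative paths to digests. The same code works on any local directory, which is the point of POSIX compatibility.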
Typical Use Cases
Machine Learning Training: ML training jobs require fast access to large datasets. FSx for Lustre delivers the performance needed to fully utilize GPU clusters instead of waiting on I/O.
High-Performance Computing: Simulations, genome sequencing, and scientific computations benefit from the parallel file system with hundreds of GB/s throughput across thousands of parallel accesses.
Video Rendering: Film and media production uses FSx for Lustre for rendering pipelines that need simultaneous access to large media assets.
Benefits
- Fully managed without Lustre administration overhead
- Seamless S3 integration as a high-performance cache
- Consistent performance even under high parallelism
- Flexible deployment options for temporary and permanent workloads
Integration with innFactory
As an AWS Reseller, innFactory supports you with Amazon FSx for Lustre: HPC architecture design, S3 integration for ML pipelines, performance optimization, and hybrid storage configuration.
Frequently Asked Questions
What is Amazon FSx for Lustre?
Amazon FSx for Lustre is a fully managed parallel file system based on the open-source Lustre file system. It delivers sub-millisecond latencies and throughput rates of hundreds of GB/s for compute-intensive workloads.
How does the S3 integration work?
FSx for Lustre can be linked directly to an S3 bucket. Files are automatically loaded from S3 on first access (lazy loading) and can be written back to S3 after processing. The file system serves as a high-performance cache for S3 data.
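As a rough sketch of how such a link could be set up programmatically, the example below builds the request for the FSx `CreateDataRepositoryAssociation` API and passes it to boto3. The file system ID, bucket name, prefix, and the `/data` directory are placeholders, and the helper function names are hypothetical. Whether data repository associations are available depends on the deployment type, so check the AWS documentation before relying on this.

```python
def build_s3_link_request(file_system_id, bucket, prefix="", mount_path="/data"):
    """Build keyword arguments for boto3's create_data_repository_association.

    All argument values are placeholders supplied by the caller.
    """
    events = ["NEW", "CHANGED", "DELETED"]
    repo_path = f"s3://{bucket}/{prefix.strip('/')}".rstrip("/")
    return {
        "FileSystemId": file_system_id,
        "FileSystemPath": mount_path,         # directory inside the file system
        "DataRepositoryPath": repo_path,      # S3 bucket or prefix to link
        "BatchImportMetaDataOnCreate": True,  # load metadata for existing objects
        "S3": {
            # Reflect new, changed, and deleted S3 objects in the file system.
            "AutoImportPolicy": {"Events": list(events)},
            # Write new, changed, and deleted files back to S3.
            "AutoExportPolicy": {"Events": list(events)},
        },
    }


def link_bucket(file_system_id, bucket, prefix="", mount_path="/data"):
    """Send the request. Requires boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so building a request needs no AWS setup

    client = boto3.client("fsx")
    request = build_s3_link_request(file_system_id, bucket, prefix, mount_path)
    return client.create_data_repository_association(**request)
```

Keeping request construction separate from the API call makes it possible to inspect or log the parameters before anything is created in the account.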
What performance does FSx for Lustre offer?
FSx for Lustre scales to hundreds of GB/s of throughput and millions of IOPS. Throughput is provisioned per TiB of storage, so aggregate throughput grows with file system capacity: scratch file systems provide 200 MB/s per TiB, while persistent file systems offer configurable rates from 125 to 1,000 MB/s per TiB.
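To make the per-TiB figures concrete, the following sketch multiplies storage capacity by the per-TiB rate to estimate baseline aggregate throughput. The function name and the dictionary are illustrative assumptions. The ranges are taken from the figures quoted above, and the actual configurable values may be limited to specific steps within that range.

```python
# Per-TiB throughput ranges in MB/s, as quoted in the text above.
# Assumption: treated as a simple (min, max) range for illustration only.
PER_TIB_RANGE = {"scratch": (200, 200), "persistent": (125, 1000)}


def baseline_throughput_mb_s(deployment, capacity_tib, per_tib_mb_s=None):
    """Estimate baseline throughput: capacity (TiB) x rate (MB/s per TiB)."""
    low, high = PER_TIB_RANGE[deployment]
    # Default to the lowest rate of the deployment type if none is given.
    rate = low if per_tib_mb_s is None else per_tib_mb_s
    if not low <= rate <= high:
        raise ValueError(
            f"{deployment} rate must be between {low} and {high} MB/s per TiB"
        )
    if capacity_tib <= 0:
        raise ValueError("capacity_tib must be positive")
    return capacity_tib * rate
```

Under these assumptions, a 2.4 TiB scratch file system works out to roughly 480 MB/s, while a 10 TiB persistent file system at 1,000 MB/s per TiB works out to 10,000 MB/s, or 10 GB/s.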