While there are many benefits to leveraging the cloud for HPC, there are challenges as well. Along with security and cost, data handling is consistently identified as a top barrier. Data requirements vary by application, datasets are often large, and accessing data across even the fastest WAN connection is orders of magnitude slower than accessing it locally. In this short article, we discuss the challenge of managing data in hybrid clouds, offer some practical tips to make things easier, and explain how automation can play a key role in improving efficiency.
Diverse storage and data movement solutions in the cloud
As is the case with local clusters, there are a variety of cloud storage options, including block storage, object storage, and elastic and parallel file systems. Unlike pricing for compute instances, storage pricing is usually complex: it is typically tiered and depends on multiple factors, including data volume, bandwidth, latency, quality of service, and data egress.
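To make the interaction between tiered storage pricing and egress charges concrete, here is a minimal sketch of a monthly cost estimate. The tier boundaries and per-GB prices are hypothetical placeholders, not any provider's actual rates, and the names `Tier`, `tiered_cost`, and `monthly_cost` are invented for illustration. Real pricing also depends on factors this sketch ignores, such as request counts and storage class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    upto_gb: float       # upper bound of the tier in GB (float("inf") for the last tier)
    price_per_gb: float  # hypothetical monthly price per GB within this tier


def tiered_cost(volume_gb, tiers):
    """Charge each slice of the volume at the rate of the tier it falls into."""
    cost = 0.0
    lower = 0.0
    for tier in tiers:
        if volume_gb <= lower:
            break
        billable = min(volume_gb, tier.upto_gb) - lower
        cost += billable * tier.price_per_gb
        lower = tier.upto_gb
    return cost


def monthly_cost(stored_gb, egress_gb, storage_tiers, egress_price_per_gb):
    """Storage charge plus a flat per-GB egress charge (an assumed simplification)."""
    return tiered_cost(stored_gb, storage_tiers) + egress_gb * egress_price_per_gb


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    tiers = [Tier(1000, 0.02), Tier(float("inf"), 0.01)]
    print(monthly_cost(stored_gb=1500, egress_gb=200,
                       storage_tiers=tiers, egress_price_per_gb=0.05))
```

Even in this simplified model, moving results back on premises can account for a meaningful share of the bill, which is why egress volume belongs in any comparison of storage options.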
Just as there are many storage options, there are many approaches to moving, caching, and replicating data. For small datasets, users might use simple utilities such as rcp or scp and have the workload manager stage data in advance. Open-source rsync provides a simple way to keep local and remote file systems synchronized. Customers can also use cloud-specific solutions such as AWS DataSync (for synchronizing local NFS files to Amazon EFS or Amazon S3) and Amazon FSx for Lustre, or hybrid cloud file systems that employ connectors and caching, such as Elastifile and Microsoft Avere vFXT.
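As an illustration of the rsync approach, the sketch below builds an rsync command line that a pre-job staging step might run. The helper names `build_rsync_command` and `stage_data`, along with the host and paths in the usage example, are hypothetical. The rsync options used (`-a`, `-z`, `--partial`, `--delete`, `--dry-run`) are standard, but whether they suit a given site is an assumption to verify.

```python
import subprocess


def build_rsync_command(local_dir, remote_host, remote_dir, delete=False, dry_run=False):
    """Return an rsync argument list that mirrors local_dir into remote_dir."""
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    src = local_dir.rstrip("/") + "/"
    cmd = ["rsync", "-az", "--partial"]  # archive mode, compression, resumable transfers
    if delete:
        cmd.append("--delete")           # remove remote files that no longer exist locally
    if dry_run:
        cmd.append("--dry-run")          # report what would change without copying
    cmd += [src, f"{remote_host}:{remote_dir}"]
    return cmd


def stage_data(local_dir, remote_host, remote_dir, **options):
    """Run the transfer; raises CalledProcessError if rsync exits non-zero."""
    return subprocess.run(build_rsync_command(local_dir, remote_host, remote_dir, **options),
                          check=True)


if __name__ == "__main__":
    # Hypothetical host and paths; print the command instead of running it.
    print(" ".join(build_rsync_command("/data/project", "cloud-login", "/scratch/project",
                                       dry_run=True)))
```

Because rsync only sends differences, rerunning the same command before each job keeps the remote copy current without retransferring the whole dataset.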
Some practical tips for managing data in HPC hybrid clouds
As you devise your strategy for data handling in the cloud, here are some suggestions that can help you arrive at a better, more cost-efficient solution.
Intelligent data handling is the key challenge
While there are a variety of cloud storage technologies and caching and synchronization solutions, the key challenge is controlling these components at runtime to deliver the best service at the lowest cost in a manner that is transparent to users.
Navops Launch simplifies and optimizes the financial aspects of HPC deployments across multiple clouds with fast, workload-driven provisioning that sizes cloud footprint dynamically based on application demand. A built-in automation engine and applet facility allow for dynamic marshalling of cloud storage services. By combining application-related metrics from Univa Grid Engine with usage and cost information extracted from the cloud provider, user-defined applets can make decisions at runtime related to data locality, data movement, and optimizing performance and cost.
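The kind of runtime decision described above can be sketched as a simple placement rule that weighs data movement time and cloud cost against local queue wait. This is not the Navops Launch applet API. All names (`JobProfile`, `CloudProfile`, `transfer_hours`, `choose_placement`) and all numbers are hypothetical, and the model assumes a single instance and a fully utilized WAN link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobProfile:
    input_gb: float
    output_gb: float
    runtime_hours: float
    local_queue_wait_hours: float


@dataclass(frozen=True)
class CloudProfile:
    wan_gbps: float                  # usable WAN bandwidth in gigabits per second
    instance_price_per_hour: float   # hypothetical compute price
    egress_price_per_gb: float       # hypothetical egress price
    data_already_in_cloud: bool = False


def transfer_hours(size_gb, wan_gbps):
    """Convert GB to gigabits, divide by the link rate, then seconds to hours."""
    return size_gb * 8 / wan_gbps / 3600


def choose_placement(job, cloud, max_cost):
    """Return "cloud" only if it finishes sooner than local and stays within budget."""
    upload = 0.0 if cloud.data_already_in_cloud else transfer_hours(job.input_gb, cloud.wan_gbps)
    download = transfer_hours(job.output_gb, cloud.wan_gbps)
    cloud_hours = upload + job.runtime_hours + download
    cloud_cost = (job.runtime_hours * cloud.instance_price_per_hour
                  + job.output_gb * cloud.egress_price_per_gb)
    local_hours = job.local_queue_wait_hours + job.runtime_hours
    if cloud_cost <= max_cost and cloud_hours < local_hours:
        return "cloud"
    return "local"


if __name__ == "__main__":
    job = JobProfile(input_gb=900, output_gb=45, runtime_hours=4, local_queue_wait_hours=6)
    cloud = CloudProfile(wan_gbps=1.0, instance_price_per_hour=2.0, egress_price_per_gb=0.09)
    print(choose_placement(job, cloud, max_cost=20.0))
```

The rule shows why data locality matters: the same job that is worth bursting when the local queue is long stays on premises when the upload time exceeds the wait, unless its input is already cached in the cloud.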