In an earlier article, we compared the big three container technologies for HPC (Docker, Singularity, and Shifter). In a subsequent article, we provided a tutorial explaining how to use Univa Grid Engine (UGE) to manage Docker workloads. In this tutorial, we’ll take a similar look at Sylabs’ Singularity and show how to run Singularity containers on a UGE cluster.
Singularity is a cross-platform, open-source tool for managing containerized workloads. Singularity and Docker are similar in many respects, but their architectures and file formats differ. Docker images are composed of multiple layers managed by the Docker system, whereas Singularity stores a container image in a single file. This means that a Linux user can treat a Singularity image much like a binary executable. You can learn about the Singularity Image Format (SIF) here.
There are drawbacks to this approach (pulling the latest Singularity image from a registry may take a little longer, for example), but the Singularity approach is simpler. It also sidesteps a security concern with Docker, where containers run under the Docker daemon, which by default runs with root privileges.
While Singularity has its own registry, Singularity Hub, users can also pull container images directly from Docker Hub and save them in Singularity’s native SIF format.
If you don’t have a UGE cluster, you can set up Grid Engine in a few different ways. One way is to request a free demo copy from Univa at this link. An automated e-mail responder will point you to zipped files, including documentation, that will help you install a cluster locally or on your favorite cloud. If you have an Amazon Web Services (AWS) account, you can also install UGE via the AWS Marketplace here.
Univa Grid Engine runs across multiple Linux distributions and machine architectures. We have assumed Grid Engine is deployed on a CentOS cluster in our examples below.
Chances are that Singularity won’t be installed on your compute hosts by default, so the first step is to install it.
Installing Singularity is a little trickier than installing Docker. Part of the reason is that the platform is evolving quickly, and there are multiple installation methods with subtle differences between versions. Sylabs provides good installation instructions for each release in their online documentation.
For our purposes, we installed the latest stable 3.0 release. Singularity has been substantially rewritten in Go and C, so you will need to install prerequisite packages, including the Go toolchain, before you can install Singularity. When installing Singularity in an HPC environment, it is helpful to have a single script so that users can install the software via ssh or similar tools. The script below reliably installs Singularity on our CentOS 7 UGE cluster deployed via the AWS Marketplace. This script differs slightly from the instructions provided by Sylabs because the CentOS 7 AMI was missing a few Linux facilities required for Singularity.
After running the script above to install Singularity on each Grid Engine host, you can verify it is installed and running:
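A minimal sketch of such a check is shown here. The function name check_singularity and the SINGULARITY override variable are illustrative assumptions, not part of Singularity or UGE; the only Singularity call used is singularity --version.

```shell
# SINGULARITY may be overridden with a full path; it defaults to the binary on PATH.
SINGULARITY=${SINGULARITY:-singularity}

# Print the installed Singularity version, or report that it is missing.
# (check_singularity is a hypothetical helper name.)
check_singularity() {
    if command -v "$SINGULARITY" >/dev/null 2>&1; then
        "$SINGULARITY" --version
    else
        echo "singularity not found on $(uname -n)" >&2
        return 1
    fi
}

check_singularity || true
```

Running the same function over ssh on each Grid Engine host is one way to confirm that the installation succeeded everywhere.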
The singularity build command extracts an image from Docker Hub and creates a Singularity format image. In the example below, we extract the latest Ubuntu image from Docker Hub and build a Singularity format image (ubuntu.sif) in our working directory. You can use the same approach to pull and create any Docker container image.
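The build step described above can be sketched as follows. The wrapper function build_from_docker_hub and the SINGULARITY variable are assumptions made for illustration; the underlying command is singularity build with a docker:// source.

```shell
SINGULARITY=${SINGULARITY:-singularity}

# Build a SIF image from a Docker Hub repository.
# $1 = output file, $2 = Docker Hub reference such as ubuntu:latest
# (build_from_docker_hub is a hypothetical helper name.)
build_from_docker_hub() {
    "$SINGULARITY" build "$1" "docker://$2"
}

# Only attempt the build where Singularity is actually installed.
if command -v "$SINGULARITY" >/dev/null 2>&1; then
    build_from_docker_hub ubuntu.sif ubuntu:latest
fi
```

Swapping ubuntu:latest for another repository and tag should produce a SIF file for that image in the same way.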
Unlike Docker, Singularity has no built-in facility for image management. There is no command in Singularity analogous to docker images. In Singularity, images are treated as files. From a Grid Engine perspective, this is helpful because it means that Singularity jobs can be managed like any other workload.
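Because images are plain files, ordinary file tools can stand in for docker images. The sketch below uses find and ls; the function name list_images is hypothetical, and -maxdepth assumes GNU or BSD find.

```shell
# List SIF images in a directory (default: current directory) with their sizes.
# (list_images is a hypothetical helper name.)
list_images() {
    find "${1:-.}" -maxdepth 1 -type f -name '*.sif' -exec ls -lh {} +
}

list_images .
```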
Like other container platforms, Singularity is complex under the covers, but there are only a few commands that users need to know to get started and run containers. Some examples are provided below.
Open a shell into a Singularity container:
To run a shell inside a Singularity container, Singularity provides a shell sub-command. In the example below, we open the image we just created (ubuntu.sif) and run a command to show the operating system running in the container. Note that the container reports Ubuntu 18.04, proving that the shell executes in the container and not on the CentOS UGE compute host.
Run a command in a Singularity container:
The exec sub-command is used to run a single command inside a Singularity container. Below we use the exec command to run “cat /etc/os-release” to report on the OS release rather than typing the command into a shell.
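A sketch of that exec call follows. The wrapper run_in_container and the SINGULARITY variable are illustrative names, and ubuntu.sif is assumed to be in the working directory.

```shell
SINGULARITY=${SINGULARITY:-singularity}

# Run a single command inside an image.
# $1 = image file, remaining arguments = command to execute in the container
# (run_in_container is a hypothetical helper name.)
run_in_container() {
    image=$1
    shift
    "$SINGULARITY" exec "$image" "$@"
}

if command -v "$SINGULARITY" >/dev/null 2>&1; then
    run_in_container ubuntu.sif cat /etc/os-release
fi
```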
Running a Singularity container directly:
Similar to Docker, Singularity containers can have default entrypoints. In Singularity, these are referred to as runscripts. For example, a MySQL container would typically run a script that would parse command line arguments and start MySQL services. When you create a Singularity image from a Docker image, the Singularity container will inherit any default entrypoints. If there is no entrypoint, a symbolic link (singularity) in the root directory of the container will point to a default runscript that starts a shell and executes any arguments passed on the command line.
You can explore how this works by opening a shell into a Singularity container and examining the file /singularity and the default runscript that it points to:
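As a non-interactive alternative to opening a shell, the same inspection can be sketched with exec. The function name show_runscript is hypothetical; the path /singularity is the one described above.

```shell
SINGULARITY=${SINGULARITY:-singularity}

# Show where /singularity points and print the runscript it resolves to.
# (show_runscript is a hypothetical helper name.)
show_runscript() {
    "$SINGULARITY" exec "$1" ls -l /singularity
    "$SINGULARITY" exec "$1" cat /singularity
}

if command -v "$SINGULARITY" >/dev/null 2>&1; then
    show_runscript ubuntu.sif
fi
```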
When Singularity starts a container using the run subcommand, the default runscript parses the remainder of the command line and runs the command inside the container. In this example, the runscript would parse the command line and run the command sleep 10.
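A minimal sketch of the run sub-command with trailing arguments is shown here. The helper name run_default is an assumption for illustration.

```shell
SINGULARITY=${SINGULARITY:-singularity}

# Invoke the image's runscript, passing along any extra arguments.
# (run_default is a hypothetical helper name.)
run_default() {
    image=$1
    shift
    "$SINGULARITY" run "$image" "$@"
}

if command -v "$SINGULARITY" >/dev/null 2>&1; then
    run_default ubuntu.sif sleep 10
fi
```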
Singularity provides a convenient shorthand. Running the container file itself from the command line is equivalent to running singularity run <container> <command>. When the container is executed, the runscript is called automatically.
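The shorthand can be sketched as below, assuming ubuntu.sif is in the working directory and has its execute bit set. The function name run_image is hypothetical.

```shell
# Execute an image file directly, as if it were an ordinary program.
# $1 = path to the image, remaining arguments are handed to its runscript
# (run_image is a hypothetical helper name.)
run_image() {
    image=$1
    shift
    "$image" "$@"
}

if [ -x ./ubuntu.sif ]; then
    run_image ./ubuntu.sif sleep 10
fi
```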
Unlike Docker environments where containers execute under control of a Docker daemon, Singularity containers are run directly by a user. Assuming Singularity is installed on each compute host, we can submit a Singularity container and the command we want to run inside the container as a Grid Engine job. In the example below the container is stored in /share, an NFS filesystem accessible to all cluster hosts.
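A sketch of such a submission follows. It assumes the image lives at /share/ubuntu.sif and uses qsub -b y so that Grid Engine treats the command as a binary rather than a job script. The function name submit_exec_job and the QSUB and SINGULARITY variables are illustrative.

```shell
QSUB=${QSUB:-qsub}
SINGULARITY=${SINGULARITY:-singularity}

# Submit "singularity exec <image> <command...>" as a binary job.
# (submit_exec_job is a hypothetical helper name.)
submit_exec_job() {
    image=$1
    shift
    "$QSUB" -b y "$SINGULARITY" exec "$image" "$@"
}

if command -v "$QSUB" >/dev/null 2>&1; then
    submit_exec_job /share/ubuntu.sif cat /etc/os-release
fi
```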
The output of the Grid Engine job is placed in the file <jobname>.o<job-id> in the user’s home directory. Inspecting the output file we see that the command ran inside the Ubuntu container on a UGE compute host.
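The file name pattern above can be turned into a small helper. This sketch assumes the job was submitted without an explicit name, in which case the job name is taken to be the submitted command (singularity); the job ID 42 in the usage note is purely hypothetical.

```shell
# Build the path of a job's standard output file: $HOME/<jobname>.o<job-id>
# (job_output_file is a hypothetical helper name.)
job_output_file() {
    printf '%s/%s.o%s\n' "$HOME" "$1" "$2"
}
```

For example, cat "$(job_output_file singularity 42)" would display the output of job 42 once it has finished.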
As we mentioned previously, running a container directly invokes the container’s runscript and is equivalent to “singularity run container.sif <arguments>”. The simplified example below shows how we can submit the container as a job directly to UGE, passing arguments that are executed by the container’s runscript. Whether the runscript accepts arguments depends on the container and how it was created.
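Submitting the image itself as the job's binary might look like the sketch below. The image path /share/ubuntu.sif and the helper name submit_image_job are assumptions.

```shell
QSUB=${QSUB:-qsub}

# Submit the image file as the job's executable; extra arguments go to its runscript.
# (submit_image_job is a hypothetical helper name.)
submit_image_job() {
    image=$1
    shift
    "$QSUB" -b y "$image" "$@"
}

if command -v "$QSUB" >/dev/null 2>&1; then
    submit_image_job /share/ubuntu.sif sleep 10
fi
```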
A problem with our simple examples above is that they only work if Singularity is installed on each compute host. If a container is dispatched to a host without the Singularity runtime, the job will fail. We can use a Boolean resource in UGE to address this issue.
The qconf -mc command in Grid Engine allows us to add a new resource to what is referred to as the complex configuration. Running this command requires root or Grid Engine administrator privileges. The command opens an editor and allows us to add a new resource to the list. In the example below, we add a Boolean resource called singularity and make it requestable but not consumable (because multiple Singularity containers can run on the same host at the same time). The full list of resources is abbreviated in the figure below. When the editor contents are saved, singularity should be added to the complex entry list.
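For scripted setups, a non-interactive variant can be sketched with qconf -sc (show the complex list) and qconf -Mc (load it from a file) instead of the editor. The shortcut sing, the helper name add_singularity_complex, and the eight-column layout (name, shortcut, type, relop, requestable, consumable, default, urgency) are assumptions; if qconf -sc on your release prints additional columns, extend COMPLEX_LINE to match its header.

```shell
QCONF=${QCONF:-qconf}

# Columns: name shortcut type relop requestable consumable default urgency
# (the shortcut "sing" is an arbitrary choice)
COMPLEX_LINE='singularity sing BOOL == YES NO FALSE 0'

# Dump the current complex list, append the new entry, and load it back.
# (add_singularity_complex is a hypothetical helper name.)
add_singularity_complex() {
    tmp=$(mktemp) || return 1
    "$QCONF" -sc > "$tmp" &&
        printf '%s\n' "$COMPLEX_LINE" >> "$tmp" &&
        "$QCONF" -Mc "$tmp"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

The function is only defined here, not called, because it changes cluster-wide configuration; an administrator would run add_singularity_complex once.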
Next, for each cluster host that has Singularity installed, we run qconf -me <hostname> and set the singularity value to TRUE. When we save the editor contents, the changes are registered with the qmaster process.
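When there are many hosts, the same change can be sketched without an editor by using qconf -aattr to add an entry to each host's complex_values list. This is an alternative to the qconf -me step described above, not the method used in the original walkthrough; the helper name and the host names in the usage note are hypothetical.

```shell
QCONF=${QCONF:-qconf}

# Mark each named execution host as having Singularity installed.
# (enable_singularity_on_hosts is a hypothetical helper name.)
enable_singularity_on_hosts() {
    for host in "$@"; do
        "$QCONF" -aattr exechost complex_values singularity=TRUE "$host"
    done
}
```

An administrator might call it as enable_singularity_on_hosts node1 node2, substituting real host names.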
We can verify that the host resource for singularity is visible using the qhost -F command to check resource availability on each host.
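A sketch of that check, restricted to the singularity resource, is shown below. The helper name show_singularity_hosts and the QHOST variable are illustrative.

```shell
QHOST=${QHOST:-qhost}

# Show each host together with the value of its singularity resource.
# (show_singularity_hosts is a hypothetical helper name.)
show_singularity_hosts() {
    "$QHOST" -F singularity
}

if command -v "$QHOST" >/dev/null 2>&1; then
    show_singularity_hosts
fi
```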
With host resources configured, we can specify singularity as a hard resource requirement on the qsub command line (-l switch) to ensure that containers are only scheduled to hosts that have Singularity installed.
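Putting the pieces together, a submission with the hard resource request might be sketched as follows. The image path /share/ubuntu.sif and the helper name submit_on_singularity_host are assumptions.

```shell
QSUB=${QSUB:-qsub}
SINGULARITY=${SINGULARITY:-singularity}

# Submit a containerized command, restricted to hosts where singularity=TRUE.
# (submit_on_singularity_host is a hypothetical helper name.)
submit_on_singularity_host() {
    image=$1
    shift
    "$QSUB" -l singularity=TRUE -b y "$SINGULARITY" exec "$image" "$@"
}

if command -v "$QSUB" >/dev/null 2>&1; then
    submit_on_singularity_host /share/ubuntu.sif cat /etc/os-release
fi
```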
Univa brings many advantages to Singularity workloads. UGE helps simplify application deployments by providing sophisticated management and monitoring of containerized and non-containerized workloads, allowing applications to co-exist and share resources efficiently. Navops Launch extends these capabilities, enabling automated deployment of Singularity-capable clusters and seamless bursting of Singularity containerized workloads to a variety of public or private clouds.
In July of 2018, Univa announced a partnership with Sylabs Inc. that will see Univa strengthen the integration between Singularity and Univa products to deliver more seamless management of Singularity containers on Univa Grid Engine clusters.