Open OnDemand: AISURREY¶
OOD sessions on AISURREY always start from an Apptainer image. Under the current policy, GUI interactive sessions are restricted to the debug partition; this differs from OOD on eureka2, where users can launch interactive apps on any partition.
Although interactive sessions are limited to the debug partition, jobs can still be submitted to other partitions through the Job Composer menu.
The maximum session time on the debug partition is 4 hours. Because of this limit, OOD interactive apps are unsuitable for running full experiments; they are, however, ideal for code development and debugging with rich GUI-based tools such as VS Code and JupyterLab.
Using Custom Apptainer images¶
The general policy on the AISurrey cluster is to always run your code inside an Apptainer image. This improves reproducibility and reduces the need to install additional tools and libraries on the cluster itself, which simplifies administration and maintenance. As a result, running experiments directly in a bare Conda environment is not recommended.
By default, OOD interactive apps on AISurrey run within a standard Apptainer image. However, if you need to debug code inside a specific Conda environment, you must launch the interactive app (e.g., VS Code or JupyterLab) using your custom Apptainer image, which automatically activates your chosen Conda environment.
To do this, specify a custom Apptainer image when starting the interactive app, making sure the required tool (code-server for VS Code, or JupyterLab for the JupyterLab app) is pre-installed in the image.
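A quick way to confirm that an image has these tools is to check whether they are on the PATH from inside the container. The sketch below is illustrative: run it with the Python interpreter inside your container (for example via `apptainer exec <your-image>.sif python check_tools.py`), where the image name and script name are placeholders.

```python
import shutil

# Check whether the tools needed by the OOD interactive apps are on the PATH.
# shutil.which returns the full path to the executable, or None if not found.
results = {tool: shutil.which(tool) for tool in ("code-server", "jupyter")}

for tool, path in results.items():
    print(f"{tool}: {path if path else 'NOT FOUND on PATH'}")
```

If a tool is reported as missing, the corresponding OOD interactive app will fail to start with that image.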
The following screenshot shows a VS Code interactive app session being launched with a 1-hour runtime, 1 CPU core, 1 GPU, and 8 GB of RAM. The session uses a custom .sif Apptainer image stored in the user's directory.
Screenshot of the OOD VS Code My Interactive Sessions page.¶
Below are two example .def files for building custom Apptainer images to use with VS Code and JupyterLab in OOD interactive sessions:
The following .def file builds an Apptainer image that creates a Conda environment named apptenv, installs PyTorch and several other packages, and automatically activates this environment when the container starts. In addition, it installs code-server, which is required for running VS Code through OOD in this environment.
Bootstrap: docker
From: continuumio/anaconda3

%post
    # Refresh the package list
    apt-get update
    # Create the Conda environment with its own Python and pip,
    # so that packages are installed into apptenv rather than the base environment
    conda create -y --name apptenv python pip
    # Install additional packages into the environment
    /opt/conda/bin/conda run -n apptenv /bin/bash -c "pip install --upgrade pip"
    /opt/conda/bin/conda run -n apptenv /bin/bash -c "pip install torch torchvision pytorch-lightning numpy"
    # Ensure the environment is activated by default in interactive shells
    echo "source /opt/conda/etc/profile.d/conda.sh && conda activate apptenv" >> /etc/profile
    # Install code-server (required for running VS Code through OOD)
    wget https://github.com/coder/code-server/releases/download/v4.95.3/code-server_4.95.3_amd64.deb
    apt-get install -y ./code-server_4.95.3_amd64.deb
    rm code-server_4.95.3_amd64.deb
    # Clean up to reduce image size
    apt-get clean && rm -rf /var/lib/apt/lists/*

%environment
    # Activate the Conda environment when the container starts
    source /opt/conda/etc/profile.d/conda.sh
    conda activate apptenv

%labels
    # Add custom metadata to the container
    Version v0.0.1
When you start VS Code via OOD using this custom Apptainer image, the apptenv environment should be activated in the VS Code terminal. You can then verify GPU availability through PyTorch with the following simple code snippet, which displays the number of available CPU cores along with the number and names of the available GPUs:
import os
import torch

# Number of CPU cores visible to the process
num_cpu_cores = os.cpu_count()
print(f"Num CPU cores: {num_cpu_cores}")

# Number of available GPUs
num_gpus = torch.cuda.device_count()
print(f"Num GPUs: {num_gpus}")

# Print the name of each available GPU
for i in range(num_gpus):
    gpu_name = torch.cuda.get_device_name(i)
    print(f"GPU {i}: {gpu_name}")
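One caveat worth noting: on a Slurm-managed node, `os.cpu_count()` reports the total number of cores on the machine, not the number allocated to your job. On Linux, the process affinity mask usually gives a better estimate of the cores your job can actually use; a small sketch:

```python
import os

# Total cores on the node (may exceed your Slurm allocation)
total_cores = os.cpu_count()

# Cores this process is actually allowed to run on (Linux-only API).
# Under Slurm's cgroup/affinity limits this typically matches your allocation.
usable_cores = len(os.sched_getaffinity(0))

print(f"Total cores on node: {total_cores}")
print(f"Cores usable by this process: {usable_cores}")
```

If the two numbers differ, size your thread pools and data-loader workers by the second one.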
The following .def file is similar to the previous one, but it replaces the code-server installation with JupyterLab. This lets you use the OOD JupyterLab interactive app within the custom Apptainer environment.
Bootstrap: docker
From: continuumio/anaconda3

%post
    # Refresh the package list
    apt-get update
    # Create the Conda environment with its own Python and pip,
    # so that packages are installed into apptenv rather than the base environment
    conda create -y --name apptenv python pip
    # Install additional packages into the environment, including JupyterLab
    /opt/conda/bin/conda run -n apptenv /bin/bash -c "pip install --upgrade pip"
    /opt/conda/bin/conda run -n apptenv /bin/bash -c "pip install torch torchvision pytorch-lightning numpy jupyterlab"
    # Ensure the environment is activated by default in interactive shells
    echo "source /opt/conda/etc/profile.d/conda.sh && conda activate apptenv" >> /etc/profile
    # Clean up to reduce image size
    apt-get clean && rm -rf /var/lib/apt/lists/*

%environment
    # Activate the Conda environment when the container starts
    source /opt/conda/etc/profile.d/conda.sh
    conda activate apptenv

%labels
    # Add custom metadata to the container
    Version v0.0.1
You can use the same code snippet we used to verify GPU access in OOD VS Code to check GPU access with JupyterLab. To confirm that the apptenv Conda environment is automatically activated, open the JupyterLab launcher and start a Terminal. As shown in the screenshot, apptenv should be active by default. When you run the code snippet in a Jupyter notebook, you will see the number of available CPU cores along with the number of GPUs and their types.
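If you prefer to check from inside a notebook rather than the terminal, the active environment can also be inspected from Python itself. A minimal sketch (it prints a placeholder when no Conda environment is active, so it runs safely anywhere):

```python
import os
import sys

# Name of the active Conda environment, if any (set by `conda activate`)
env_name = os.environ.get("CONDA_DEFAULT_ENV", "(no Conda environment active)")
print(f"Active Conda environment: {env_name}")

# Inside the container this should point at the apptenv interpreter
print(f"Python executable: {sys.executable}")
```

When run in a notebook started from the custom image, the first line should report apptenv and the executable path should lie inside the apptenv environment.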
Screenshot of the OOD JupyterLab launcher.
Screenshot of the OOD JupyterLab terminal.
Screenshot of running the code snippet in an OOD JupyterLab notebook.