Building a Customized Conda Environment

Last updated July 05, 2023

Conda, the package and environment manager included with Anaconda, is used primarily for open-source data science packages in the Python and R programming languages. A Conda module is available on CARC systems, so users do not need to install it themselves.

1. Request an interactive session

The login node is meant for login purposes only and has process limits.

It is good practice to request an interactive session for package installation. The following example requests one GPU, 8 CPU cores, and 32 GB of memory in the gpu partition with a time limit of 1 hour.

[user@discovery1 ~]$ salloc --partition=gpu --gres=gpu:1 --cpus-per-task=8 --mem=32GB --time=1:00:00 
salloc: Pending job allocation 15731446
salloc: job 15731446 queued and waiting for resources
salloc: job 15731446 has been allocated resources
salloc: Granted job allocation 15731446
salloc: Waiting for resource configuration
salloc: Nodes a02-15 are ready for job
[user@a02-15 ~]$ 
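
The resource flags in the salloc command above can be varied independently. As a minimal sketch, the following Python snippet assembles the same command from its individual values, which makes it easier to see which value belongs to which flag before pasting the result into your shell. The helper name build_salloc_command is hypothetical and is not part of Slurm or any CARC tooling; it only builds a string and does not submit a job.

```python
import shlex


def build_salloc_command(partition, gpus, cpus, mem, time_limit):
    """Assemble the salloc command line shown above from its resource values.

    Hypothetical helper for illustration only: it returns a string and
    never contacts the scheduler.
    """
    args = [
        "salloc",
        f"--partition={partition}",
        f"--gres=gpu:{gpus}",
        f"--cpus-per-task={cpus}",
        f"--mem={mem}",
        f"--time={time_limit}",
    ]
    # shlex.join quotes any argument that would need it in a POSIX shell.
    return shlex.join(args)


if __name__ == "__main__":
    print(build_salloc_command("gpu", 1, 8, "32GB", "1:00:00"))
```

Called with the values from the example, the helper reproduces the command shown above; pass different values to request fewer cores or a longer time limit.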

Change the resource requests (--cpus-per-task=8 --mem=32GB --time=1:00:00) as needed.

2. Load a Conda module

Once you have been granted the resources and logged in to a compute node, load the Conda module:

module purge
module load conda

This module is based on the minimal Miniconda installer. Included in all versions of Anaconda, Conda is the package and environment manager that installs, runs, and updates packages and their dependencies. This module also provides Mamba, which is a drop-in replacement for most conda commands that enables faster package solving, downloading, and installing.

3. Initialize shell to use Conda and Mamba

The following commands modify your ~/.bashrc file so that Conda and Mamba are ready to use every time you log in (without needing to load the module):

mamba init bash
source ~/.bashrc

4. Create a virtual environment & install packages

Conda environments are isolated project environments designed to manage distinct package requirements and dependencies for different projects. You can create new Conda environments in any of your available directories. By default, environments and their packages are installed in your home directory under /home1/<user_name>/.conda/envs/. We recommend using the mamba command for faster package solving, downloading, and installing, but the conda command, with its various options, can also be used to install and inspect Conda environments.

The process for creating and using environments has three basic steps:

4.1 Create an environment with mamba create

To create a new Conda environment in your home directory, enter:

mamba create --name <env_name>

<env_name> is whatever name you choose for your environment. All packages installed within this new environment are saved under /home1/<user_name>/.conda/envs/<env_name>.

4.2 Activate the environment with conda activate or mamba activate

mamba activate <env_name>

4.3 Install packages into the environment with mamba install

Once the Conda environment has been activated, install packages into that environment. For example, you can install the PyTorch package within your environment by following the PyTorch documentation. For other data science packages, see the data-science-packages page.

Check each package's website for installation instructions. These instructions sometimes change, and the most up-to-date process will be on the package website.

mamba install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

5. Verify the software installation

Once you have installed all your data science packages, verify your installation by starting python in the interactive session and importing the relevant packages.

(env_name) [user@a02-15 ~]$ python
Python 3.11.4 | packaged by conda-forge | (main, Jun 10 2023, 18:08:17) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
>>> print('Using device:', device)
Using device: cuda

After you verify the software installation, type exit() to exit Python within your Conda environment.

>>> exit()
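
When an environment contains several packages, the same import check can be run as a script instead of typing each import by hand. The following is a minimal sketch using only the Python standard library. The function name check_imports is hypothetical, and the package list in the example is taken from the install command above; replace it with whatever you installed.

```python
import importlib


def check_imports(module_names):
    """Return a dict mapping each module name to True if it imports cleanly."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # A missing package raises ImportError; a broken install can
            # raise other errors (for example while loading shared libraries).
            results[name] = False
    return results


if __name__ == "__main__":
    # Replace this list with the packages installed in your environment.
    for name, ok in check_imports(["torch", "torchvision", "torchaudio"]).items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Run the script with the environment activated so that it tests the environment's own Python interpreter rather than another installation.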

To deactivate an environment, enter:

conda deactivate

Then enter exit in the shell. This returns you to the login node:

(env_name) [user@a02-15 ~]$ exit
salloc: Relinquishing job allocation 15731446
[user@discovery1 ~]$

Create a new environment in /project

To create an environment in your project directory instead of your home directory, use the --prefix option. For example:

mamba create --prefix /project/<project_id>/<env_name>

<project_id> is your project’s account ID of the form <pi_username>_<id>.

Then activate the environment:

conda activate /project/<project_id>/<env_name>

To view a list of all your Conda environments, enter:

mamba env list
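
If you want to process the environment list in a script, conda can also print it as JSON with conda env list --json. The sketch below assumes that this output is an object with an "envs" key holding a list of environment paths. It maps each environment's directory name to its full path, which covers both home-directory and /project environments. The function name env_names_from_json and the sample paths are hypothetical.

```python
import json
from pathlib import PurePosixPath


def env_names_from_json(json_text):
    """Map environment directory names to their full paths.

    Assumes the JSON layout printed by `conda env list --json`:
    an object with an "envs" key containing a list of paths.
    """
    paths = json.loads(json_text)["envs"]
    return {PurePosixPath(p).name: p for p in paths}


if __name__ == "__main__":
    # Sample input with hypothetical paths; on a real system, capture the
    # output of `conda env list --json` instead.
    sample = json.dumps({
        "envs": [
            "/home1/user/.conda/envs/myenv",
            "/project/pi_123/torch-env",
        ]
    })
    for name, path in env_names_from_json(sample).items():
        print(f"{name}: {path}")
```

Note that the base installation also appears in the list, under the name of its installation directory.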

To remove a Conda environment, enter:

mamba env remove --name <env_name>