How to use Eddie: the ECDF Linux compute cluster

More about Eddie please see here. Its command is based on the Grid Scheduler.

Login

Open the terminal and input:

ssh <YOUR UUN>@eddie.ecdf.ed.ac.uk

to login to the home directory.

WARNING

As mentioned by the IT staff, Eddie of UoE does not support the Remote-ssh login in VScode extension due to security issues. But it is fine to use the scp or sftp extension to only manage the file system.

Folder Path

This is the shared group space in UoE.
Eddie:

/exports/<COLLEGE>/eddie/<SCHOOL>/groups/<GROUP NAME>/

DataStore:

/exports/<COLLEGE>/datastore/<SCHOOL>/groups/<GROUP NAME>/
TIP

All-caps words in < > should be replaced with your own choices. DataStore is a network file storage service.

Basic Commands on Eddie

To start an interactive session run:

qlogin
# The default maximum runtime is 48 hours. The default resources: 1 core, 1 GB memory.

qlogin -l h_vmem=1G
# request memory for an interactive session

qlogin -pe gpu 1
# request 1 gpu (need for torch and memory usage is at least 2GB)

qlogin -q staging
# choose the staging environment

Staging data (transfer data between Eddie and DataStore):

qsub stagein/out.sh
# stagein/out.sh file content can be seen in Eddie handbook (need UUN login)
TIP

It is also possible to sync the files between local computer and Eddie directly using scp or sftp.

To view current running jobs

qstat

To cancel a job,

qdel <JOB IDENTIFIER>

Detailed information about completed jobs is available with the qacct command:

qacct -j <JOB IDENTIFIER>

Set up the Anaconda Environment

Add path and create a virtual environment

Configure the path for your environments directory, i.e., the directory where all your conda environments will be stored, and packages (pkgs) directory used as a cache for your conda packages. By default cache for packages will be placed under ~/.conda/pkgs and will very likely cause your home directory to go over quota as a result. We recommend using the group space you have used for your conda environments instead. Note that this should be an existing directory, so you need to create it first if it doesn’t already exist, before running the command below.

module load anaconda # version 5.0.1

Skip this step if already created the virtual environment

conda config --add envs_dirs /exports/<COLLEGE>/eddie/<SCHOOL>/groups/<GROUP NAME>/anaconda/envs

conda config --add pkgs_dirs /exports/<COLLEGE>/eddie/<SCHOOL>/groups/<GROUP NAME>/anaconda/pkgs

Once you have configured your envs_dirs and pkgs_dirs, you can create an environment. The example below creates an environment specifying particular Python versions (also the version used for this project):

conda create -n %env_name% python=3.9

To list your conda environments:

conda info --envs

Activate your environment

Switch to the created environment and deactivate the environment when no longer needed:

source activate %env_name%
source deactivate

Install required packages

conda install pip
pip --no-cache-dir install -r requirements.txt
# `--no-cache-dir` is to ensure torch installed successfully

Some package management commands:

conda list
# To see what packages are in the active environment
conda search matplotlib
# To search for a package in the Anaconda repositories
conda install matplotlib=1.4.1
# To install a package e.g. matplotlib version 1.4.1

Using pip

If a package is not available in the Anaconda repositories it can still be installed with pip in the usual way. First, install pip into your active Anaconda environment. You must install pip into your environment first before running a pip command. Otherwise, pip install will use the pip installed in the Anaconda directory (which users do not have write access to) and your installation will fail. Once you have installed pip in your environment, then you can install a package using pip into your environment:

conda install pip
pip install matplotlib

Set up R Environment

Configure R packages path

module load cmake/3.5.2
# some packages may need this to be loaded first
module load igmm/apps/R/4.1.3
export R_LIBS=/exports/<COLLEGE>/eddie/<SCHOOL>/groups/<GROUP NAME>/rlibs

Install required packages in R console

install.packages("tidyverse")

Run Code on Eddie

Through job submission

Example

qsub run_job.sh
# job submission files are available in shell folder

Through interaction session

First start an interaction session, specify max run time, memory usage (gpu) and activate the environment then run according to the following:

Run Jupyter notebook

jupyter nbconvert --to notebook --inplace --execute jupyter-notebook.ipynb

Note that nbconvert will fail if any cell takes longer than 30s to run, then add the following option to disable timing

--ExecutePreprocessor.timeout=-1

View jupyter notebook ouptputs

nbpreview jupyter-notebook.ipynb



    Enjoy Reading This Article?

    Here are some more articles you might like to read next:

  • tmux: a fancy terminal tool for running jobs on compute server
  • Summary of R commands
  • Installing R in Conda Env and easy use with VScode
  • Summary of Conda commands
  • Understanding ChatGPT