Running R on ManeFrame II

Types of Nodes on Which R is to Run

First, you must identify the type of compute resource needed to run your calculation. In the following table, the compute resources are delineated by resource type and the expected duration of the job. The duration and memory allocations are hard limits; jobs with calculations that exceed these limits will fail. Once an appropriate resource has been identified, the partition and Slurm flags from the table can be used in the following examples.

Partition                                    | Duration | Cores   | Memory [GB] | Example Slurm Flags
-------------------------------------------- | -------- | ------- | ----------- | -------------------------------------
development (1 standard-mem, 1 MIC, 1 P100)  | 2 hours  | various | various     | -p development
htc                                          | 1 day    | 1       | 6           | -p htc --mem=6G
standard-mem-s                               | 1 day    | 36      | 256         | -p standard-mem-s --mem=250G
standard-mem-m                               | 1 week   | 36      | 256         | -p standard-mem-m --mem=250G
standard-mem-l                               | 1 month  | 36      | 256         | -p standard-mem-l --mem=250G
medium-mem-1-s                               | 1 day    | 36      | 768         | -p medium-mem-1-s --mem=750G
medium-mem-1-m                               | 1 week   | 36      | 768         | -p medium-mem-1-m --mem=750G
medium-mem-1-l                               | 1 month  | 36      | 768         | -p medium-mem-1-l --mem=750G
medium-mem-2                                 | 2 weeks  | 24      | 768         | -p medium-mem-2 --mem=750G
high-mem-1                                   | 2 weeks  | 36      | 1538        | -p high-mem-1 --mem=1500G
high-mem-2                                   | 2 weeks  | 40      | 1538        | -p high-mem-2 --mem=1500G
mic                                          | 1 week   | 64      | 384         | -p mic --mem=374G
gpgpu-1                                      | 1 week   | 36      | 256         | -p gpgpu-1 --gres=gpu:1 --mem=250G
v100x8                                       | 1 week   | 1       | 20          | -p v100x8 --gres=gpu:1 --mem=20G
fp-gpgpu-2                                   | various  | 24      | 128         | -p fp-gpgpu-2 --gres=gpu:8 --mem=120G
fp-gpgpu-3                                   | various  | 40      | 384         | -p fp-gpgpu-3 --gres=gpu:2 --mem=370G
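
The Slurm flags in the last column drop directly into srun or sbatch commands. For example, a short interactive test session on the development partition could be requested as follows (a minimal sketch using standard Slurm options; adjust the partition and memory flags per the table for real work):

srun -p development --pty $SHELL    # interactive shell on a development node; type exit when done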

Running R Interactively with RStudio

The RStudio graphical user interface can be run directly on ManeFrame II (M2) compute nodes using the HPC OnDemand Web Portal.

Running R Interactively with Jupyter

Initial Setup

  1. Log into the cluster using SSH or the HPC OnDemand Web Portal’s “Shell Access” and run the following commands at the command prompt.

  2. module load python/3 to enable access to Anaconda.

  3. conda create -y -n jupyter_r -c conda-forge jupyterlab r-irkernel to install Jupyter and R.
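
Taken together, the one-time setup consists of the two commands above run from a login shell (the environment name jupyter_r is the one created in step 3):

module load python/3                                               # enable access to Anaconda
conda create -y -n jupyter_r -c conda-forge jupyterlab r-irkernel  # install Jupyter and R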

Usage from the HPC OnDemand Web Portal

Follow the HPC OnDemand Web Portal’s JupyterLab instructions, where the “Additional environment modules to load” contains “python/3” and “Custom environment settings” contains “source activate ~/.conda/envs/jupyter_r”.
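
Before launching JupyterLab, the environment can optionally be checked from a shell to confirm that the R kernel registered (an optional sanity check, not part of the portal instructions; jupyter kernelspec list is a standard Jupyter command):

source activate ~/.conda/envs/jupyter_r
jupyter kernelspec list    # an R kernel (e.g., "ir") should appear alongside python3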

Running R Non-Interactively in Batch Mode

R scripts can be executed non-interactively in batch mode in a myriad of ways depending on the type of compute resource needed for the calculation, the number of calculations to be submitted, and user preference. The types of compute resources are outlined in the table above, where each partition delineates a specific type of compute resource and the expected duration of the calculation. Each of the following methods requires SSH access. Examples can be found at /hpc/examples/r on M2.
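
The example scripts referenced in the following subsections can be listed directly (path as given above):

ls /hpc/examples/r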

Submitting an R Job to a Queue Using Wrap

An R script can be executed non-interactively in batch mode directly using sbatch’s wrapping function.

  1. Log into the cluster using SSH and run the following commands at the command prompt.

  2. module load r to enable access to R.

  3. cd to the directory with the R script.

  4. sbatch -p <partition and options> --wrap "R --vanilla < <R script file name>" where <partition and options> is the partition and associated Slurm flags outlined in the table above, and <R script file name> is the R script to be run.

  5. squeue -u $USER to verify that the job has been submitted to the queue.

Example:

module load r
sbatch -p standard-mem-s --exclusive --mem=250G --wrap "R --vanilla < example.R"
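
Because --wrap does not specify an output file, Slurm writes the job's output to its default location, typically slurm-<job ID>.out in the submission directory (standard sbatch behavior); it can be inspected once the job has started:

cat slurm-<job ID>.out    # replace <job ID> with the ID reported by sbatch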

Submitting an R Job to a Queue Using an sbatch Script

An R script can be executed non-interactively in batch mode by creating an sbatch script. The sbatch script gives the Slurm resource scheduler information about what compute resources your calculation requires to run and also how to run the R script when the job is executed by Slurm.

  1. Log into the cluster using SSH and run the following commands at the command prompt.

  2. cd to the directory with the R script.

  3. cp /hpc/examples/r/r_example.sbatch <descriptive file name> where <descriptive file name> is meaningful for the calculation being done. It is suggested that the file name not contain spaces and that it end with .sbatch for clarity.

  4. Edit the sbatch file using your preferred text editor. Change the partition, flags, and R script file name as required for your specific calculation.

#!/bin/bash
#SBATCH -J R_example                   # Job name
#SBATCH -o example.txt                 # Output file name
#SBATCH -p standard-mem-s              # Partition (queue)
#SBATCH --exclusive                    # Exclusivity
#SBATCH --mem=250G                     # Total memory required per node

module purge                           # Unload all modules
module load r                          # Load R, change version as needed

R --vanilla < example.R                # Edit R script name as needed

  5. sbatch <descriptive file name> where <descriptive file name> is the sbatch script name chosen previously.

  6. squeue -u $USER to verify that the job has been submitted to the queue.
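
While the job runs, its progress can be followed through the output file named by the -o line, and a submitted job can be cancelled if needed (standard Slurm commands; example.txt is the output file name used in the script above):

tail -f example.txt    # follow the R output as it is written
scancel <job ID>       # cancel the job if necessary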

Submitting Multiple R Jobs to a Queue Using a Single sbatch Script

Multiple R scripts can be executed non-interactively in batch mode by creating a single sbatch script. The sbatch script gives the Slurm resource scheduler information about what compute resources your calculations require to run and also how to run the R script for each job when it is executed by Slurm.

  1. Log into the cluster using SSH and run the following commands at the command prompt.

  2. cd to the directory with the R script or scripts.

  3. cp /hpc/examples/r/r_array_example.sbatch <descriptive file name> where <descriptive file name> is meaningful for the calculations being done. It is suggested that the file name not contain spaces and that it end with .sbatch for clarity.

  4. Edit the sbatch file using your preferred text editor. Change the partition, flags, R script file name, and number of jobs that will be executed as required for your specific calculation.

#!/bin/bash
#SBATCH -J R_example                   # Job name
#SBATCH -p standard-mem-s              # Partition (queue)
#SBATCH --exclusive                    # Exclusivity
#SBATCH --mem=250G                     # Total memory required per node
#SBATCH -o R_example_%A-%a.out         # Job output; %A is job ID and %a is array index
#SBATCH --array=1-2                    # Range of indices to be executed

module purge                           # Unload all modules
module load r                          # Load R, change version as needed

R --vanilla < array_example_${SLURM_ARRAY_TASK_ID}.R
# Edit R script name as needed; ${SLURM_ARRAY_TASK_ID} is array index

  5. sbatch <descriptive file name> where <descriptive file name> is the sbatch script name chosen previously.

  6. squeue -u $USER to verify that the job has been submitted to the queue.
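
With --array=1-2 as in the script above, one R script per array index is expected in the submission directory, and each task writes its own output file according to the -o pattern. A quick pre-submission check might look like the following (adjust the file names and range to match your --array setting):

ls array_example_1.R array_example_2.R    # one R script per array index
ls R_example_*-*.out                      # per-task output files, present once the jobs have run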