Technical Details
System Component | Configuration |
---|---|
Compute Nodes | |
CPU Type | AMD EPYC 7352 |
Sockets | 2 |
Cores/socket | 24 |
Clock speed | 2.3 GHz |
Memory | 256 GB RAM |
Local Storage | 512 GB Micron 1300 SSD (/scratch) |
Memory Bandwidth | 409.6 GB/s |
System | |
Total Compute Nodes | 44 |
Total Compute Cores | 2,112 |
Total Memory | 11.6 TB |
Total Storage | 342 TB |
Interconnect | Mellanox Infiniband EDR |
Link bandwidth | 100 Gb/s |
MPI Latency | 1.64 µs |
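The aggregate figures follow from the per-node values. The sketch below reproduces the total core count and the per-node memory bandwidth. The eight memory channels per socket and the DDR4-3200 transfer rate are assumptions based on the public EPYC 7352 specification; they are not stated on this page.

```shell
#!/bin/bash
# Per-node values from the table.
nodes=44
sockets_per_node=2
cores_per_socket=24

cores_per_node=$(( sockets_per_node * cores_per_socket ))
total_cores=$(( nodes * cores_per_node ))
echo "cores per node: ${cores_per_node}"
echo "total cores: ${total_cores}"

# Assumptions (not stated on this page): 8 memory channels per socket,
# DDR4-3200 (3200 mega-transfers per second), 8 bytes per transfer.
channels_per_socket=8
mega_transfers_per_s=3200
bytes_per_transfer=8
node_bw_mb_s=$(( sockets_per_node * channels_per_socket * mega_transfers_per_s * bytes_per_transfer ))
echo "memory bandwidth per node: ${node_bw_mb_s} MB/s"
```

Under these assumptions the result is 409600 MB/s, which matches the 409.6 GB/s listed for memory bandwidth.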
Systems Software Environment
Software | Description |
---|---|
Operating System | CentOS Linux 7.9 |
Cluster Management | Scyld Clusterware 11.0 |
Compilers | AOCC, GCC, Clang, Go |
Parallel Frameworks | Open MPI, MVAPICH2, Sandia OpenSHMEM |
System Access
To use the Lotus cluster, you must first request an account on the system. Rhodes College faculty, students, and staff should request access using the following form:
Rhodes HPC Cluster Access Request Form
Non-Rhodes users must have a guest researcher account that is sponsored by a Rhodes faculty or staff member.
Lotus is accessed through Secure Shell (SSH) at:
lotus.arc.rhodes.edu
For on campus users:
From a terminal prompt, type the following to log in (omit the $ and replace user with your username):
$ ssh user@lotus.arc.rhodes.edu
For off-campus users:
Direct SSH access is not permitted from off-campus. Users may either connect through a VPN and then SSH in, or log in to a virtual desktop at http://desktops.rhodes.edu and use PuTTY to access the cluster. For more information on using these resources, see the Getting Started information.
Notes:
When you log in to lotus.arc.rhodes.edu you will be directed to either lotus-login01 or lotus-login02. These machines are identical in hardware and software configuration.
You may add your SSH public key to ~/.ssh/authorized_keys to enable password-less login using ECDSA, RSA, and ed25519 key types. Please ensure that your private keys are secured with a strong local passphrase. You can use ssh-agent to avoid having to repeatedly type your private key passphrase.
Hosts which attempt to connect very frequently (many times per second) may be blocked temporarily in order to improve system security. If you are blocked, wait 15 minutes and try again.
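For key-based login, one possible setup is a host entry in ~/.ssh/config. The sketch below prints such an entry. The alias lotus, the function name lotus_ssh_config, and the key path ~/.ssh/id_ed25519 are illustrative assumptions, and user stands for your username.

```shell
#!/bin/bash
# Print a host entry suitable for ~/.ssh/config.
# The alias "lotus" and the key path are hypothetical; adjust as needed.
lotus_ssh_config() {
    local user="$1"
    cat <<EOF
Host lotus
    HostName lotus.arc.rhodes.edu
    User ${user}
    IdentityFile ~/.ssh/id_ed25519
    AddKeysToAgent yes
EOF
}

lotus_ssh_config "user"
```

Appending this output to ~/.ssh/config lets you connect with ssh lotus. A key pair can be created with ssh-keygen -t ed25519, which prompts for the passphrase that protects the private key.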
Modules
The cluster provides the modules system for loading specific software packages and environments. Module commands can update your shell environment to automatically find optional tools, compilers, and libraries that you may need to support your application. Modules also provide a flexible mechanism for maintaining several versions of the same software or specific combinations of dependent software packages. New modules can be added upon request.
To list all of the available modules on the system, use the following command:
module available
To load a specific module, use the load command:
module load mvapich2
This would load the MVAPICH2 MPI library into your environment, replacing any other version of MPI that was previously configured. Running a module command only affects the current running shell. You may wish to add specific module commands to batch files for submitting jobs, or add them to shell configuration files that are read on login (typically .bashrc or .zshrc).
Other useful module commands are listed below:
Command | Description |
---|---|
module list | List the modules that are currently loaded |
module avail | List the modules that are available to be loaded |
module show <module_name> | Show the environment variables modified by the <module_name> module |
module load <module_name> | Load the module <module_name> into the environment |
module unload <module_name> | Remove the module <module_name> from the environment |
module swap <mod1> <mod2> | Replace <mod1> with <mod2> in the environment |
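As a sketch of loading modules from a shell configuration file, the fragment below could go in .bashrc. It loads modules only when the module command is defined, so the file stays usable on machines without the modules system. The variable name LOTUS_MODULES is a hypothetical choice; gcc and mvapich2 are the module names used in the examples on this page.

```shell
# Modules to load at login (hypothetical variable name).
LOTUS_MODULES="gcc mvapich2"

# "module" is normally a shell function; skip quietly where it is not defined.
if command -v module >/dev/null 2>&1; then
    for m in $LOTUS_MODULES; do
        module load "$m"
    done
    module list
fi
```

The same lines can be placed near the top of a batch file so that a job always runs with a known set of modules.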
Job Charging and Queue Limits
Currently, the cluster is operating under a free use billing model. There are no explicit time allocations for the cluster or enforced limits on overall usage of the system. This use model is subject to change depending on how usage evolves over time.
This resource is a shared, campus-wide resource. We ask that you use the system in a manner that is consistent with campus community standards and respect the shared nature of the system.
Jobs are subject to the following limits:
Maximum wall clock time for a single job is 48 hours
Jobs may request up to the max number of cores on the system (2,112)
Jobs may request up to the max number of nodes on the system (44)
Users may have at most 128 jobs queued at a time
Queued jobs may be preempted to support priority jobs (e.g. a paper deadline) or for emergency maintenance.
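The node, core, and wall clock limits can be checked before submitting. The sketch below is a hypothetical helper (check_request is not a cluster-provided command) that compares a planned request against the limits listed here.

```shell
#!/bin/bash
# Limits quoted in the list of job limits.
MAX_NODES=44
MAX_CORES=2112
MAX_HOURS=48

# check_request NODES TASKS_PER_NODE HOURS
# Prints "ok" and returns 0 when the request fits; otherwise prints the
# first limit that is exceeded and returns 1. (Hypothetical helper.)
check_request() {
    local nodes="$1"
    local tasks_per_node="$2"
    local hours="$3"
    local cores=$(( nodes * tasks_per_node ))
    if [ "$nodes" -gt "$MAX_NODES" ]; then
        echo "too many nodes: $nodes > $MAX_NODES"; return 1
    elif [ "$cores" -gt "$MAX_CORES" ]; then
        echo "too many cores: $cores > $MAX_CORES"; return 1
    elif [ "$hours" -gt "$MAX_HOURS" ]; then
        echo "wall clock time too long: ${hours}h > ${MAX_HOURS}h"; return 1
    fi
    echo "ok"
}

check_request 2 48 24
```

The scheduler remains the authority on what is accepted; this only catches obvious mistakes early.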
Compiling
All hosts in the cluster have access to the GNU, AOCC (AMD), and Clang compilers along with multiple MPI implementations (Open MPI and MVAPICH2). The default compiler is GCC 10.2.0, which has been compiled with AMD Rome specific optimizations (-march=znver2). The GCC and AOCC compilers can be configured to generate Advanced Vector Extensions 2 (AVX2) instructions. Using AVX2, up to eight floating point operations can be executed per cycle per core. AVX2 is not enabled by default; enable it by setting the appropriate compiler flags.
Using GCC
The GNU GCC compiler family can be loaded with the module system (it is loaded by default):
module load gcc
To compile a program with the GNU toolchain use the following commands:
Language | Serial | MPI | OpenMP | MPI+OpenMP |
---|---|---|---|---|
Fortran | | | | |
C | | | | |
C++ | | | | |
To compile your programs with AVX2 extensions, use the -march=core-avx2 compiler flag. You will probably want to combine it with the usual optimization flags (e.g. -O3).
For more information on the GNU compilers, check the manual pages: man gcc, man g++, or man gfortran.
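As a hedged illustration of typical GNU toolchain command lines, the sketch below prints (but does not run) a compile command for each language and parallel mode. It assumes the conventional driver names gfortran, gcc, and g++, the MPI wrapper names mpif90, mpicc, and mpicxx shipped by Open MPI and MVAPICH2, and GCC's -fopenmp flag. The helper name gnu_compile_cmd and the file names are hypothetical, and the exact commands recommended for Lotus may differ.

```shell
#!/bin/bash
# gnu_compile_cmd LANG MODE SOURCE OUTPUT
# Prints a compile command line without running it. (Hypothetical helper.)
#   LANG: fortran | c | cxx      MODE: serial | mpi | openmp | hybrid
gnu_compile_cmd() {
    local lang="$1"
    local mode="$2"
    local src="$3"
    local out="$4"
    local flags="-O3 -march=core-avx2"
    local compiler wrapper
    case "$lang" in
        fortran) compiler=gfortran; wrapper=mpif90 ;;
        c)       compiler=gcc;      wrapper=mpicc ;;
        cxx)     compiler=g++;      wrapper=mpicxx ;;
        *) echo "unknown language: $lang" >&2; return 1 ;;
    esac
    case "$mode" in
        serial) echo "$compiler $flags $src -o $out" ;;
        mpi)    echo "$wrapper $flags $src -o $out" ;;
        openmp) echo "$compiler $flags -fopenmp $src -o $out" ;;
        hybrid) echo "$wrapper $flags -fopenmp $src -o $out" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

gnu_compile_cmd c hybrid hello_hybrid.c hello_hybrid
```

The final line prints mpicc -O3 -march=core-avx2 -fopenmp hello_hybrid.c -o hello_hybrid. The MPI wrappers are only on your path after an MPI module has been loaded.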
Using AOCC (AMD compiler)
The AMD Optimizing C/C++ Compiler (AOCC) is available and can be loaded with the module system:
module load aocc
To compile a program with the AMD toolchain use the following commands:
Language | Serial | MPI | OpenMP | MPI+OpenMP |
---|---|---|---|---|
Fortran | | | | |
C | | | | |
C++ | | | | |
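AOCC is based on LLVM, and its compiler drivers are clang (C), clang++ (C++), and flang (Fortran). One common way to point a build system at AOCC is through the standard CC, CXX, and FC environment variables, sketched below. The flag choices are assumptions: -march=znver2 targets the AMD Rome architecture mentioned earlier, and -fopenmp enables OpenMP.

```shell
#!/bin/bash
# Select the AOCC drivers for make, configure, or CMake based builds.
# Run "module load aocc" first so these commands are on the PATH.
export CC=clang
export CXX=clang++
export FC=flang

# Assumed flags: optimize for Zen 2 and enable OpenMP.
export CFLAGS="-O3 -march=znver2 -fopenmp"
export CXXFLAGS="$CFLAGS"
export FFLAGS="$CFLAGS"

echo "C compile line: $CC $CFLAGS hello.c -o hello"
```

The file name hello.c is a placeholder. Whether the MPI wrappers use AOCC depends on how the loaded MPI module was built.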
Running Jobs on Lotus
Running programs on the cluster is done by interacting with the job scheduling system. Lotus uses the SLURM job scheduler for managing both batch jobs and interactive runs. You should not run computationally intensive tasks on the login nodes – use the compute nodes.
If you have special needs for running jobs on the cluster, please contact the cluster support staff for help. Submitting large quantities of jobs (especially short jobs) can impact overall scheduler response for all users.
Requesting Interactive Resources
You can request an interactive session by using the srun command.
srun --pty --nodes=2 --ntasks-per-node=48 -t 30:00 --wait=0 /bin/bash
This command requests two full compute nodes with 48 cores each (for a total of 96 cores) for 30 minutes. When this request is granted, you will automatically be logged into the assigned node and can work normally. If you would like to run a parallel program from within the interactive job, you can use srun without any options:
srun myprog
Submitting Batch Jobs
To submit a job using a batch file, create a short text file in the style of the following examples, updating where necessary to reflect your program and parameters. You can add additional SBATCH lines, for example to send email notifications (--mail-user); see man sbatch for more information. To submit your job, use the sbatch command:
sbatch jobfile
The jobfile is the file you create and contains the SLURM resource specifications and shell commands. Several examples are provided below.
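As a sketch of the email notification option, the script below writes a small job file and prints the command that would submit it. The email address, job name, and program name are placeholders. The --mail-type option is a standard sbatch option that selects which events trigger a message.

```shell
#!/bin/bash
# Write a small job file with email notification enabled.
# Address, job name, and program name are placeholders.
jobfile="$(mktemp)"
cat > "$jobfile" <<'EOF'
#!/bin/bash
#SBATCH --job-name="notify_demo"
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH -t 00:10:00
#SBATCH --mail-user=user@example.edu
#SBATCH --mail-type=END,FAIL
srun ./myprog
EOF

echo "job file written to $jobfile"
echo "submit with: sbatch $jobfile"
```

With END,FAIL, a message is sent when the job finishes or fails.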
MPI Job
This job runs on 2 compute nodes with 48 cores each (for a total of 96 cores); each core is assigned a single MPI rank.
    #!/bin/bash
    #SBATCH --job-name="hellompi"
    #SBATCH --output="hellompi.%j.%N.out"
    #SBATCH --partition=compute
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=48
    #SBATCH --export=ALL
    #SBATCH -t 01:30:00
    srun ./hello_mpi
OpenMP Job
This job requests a single compute node and uses 48 threads for all OpenMP parallel sections. OpenMP (non-hybrid) will only work when all processes are on the same node (i.e. --nodes must be 1).
    #!/bin/bash
    #SBATCH --job-name="hello_openmp"
    #SBATCH --output="hello_openmp.%j.%N.out"
    #SBATCH --partition=compute
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=48
    #SBATCH --export=ALL
    #SBATCH -t 01:30:00
    export OMP_NUM_THREADS=48
    ./hello_openmp
Hybrid MPI-OpenMP Job
This job requests 2 nodes and 96 total processors. This will launch 2 MPI ranks per node (total of 4 MPI processes), with each process using 24 OpenMP threads.
    #!/bin/bash
    #SBATCH --job-name="hellohybrid"
    #SBATCH --output="hello_hybrid.%j.%N.out"
    #SBATCH --partition=compute
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=2
    #SBATCH --ntasks=4
    # Reserve 24 cores for each MPI rank so the allocation covers all 96 cores
    #SBATCH --cpus-per-task=24
    #SBATCH --export=ALL
    #SBATCH -t 01:30:00
    export OMP_NUM_THREADS=24
    srun --cpus-per-task=$OMP_NUM_THREADS ./hello_hybrid
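The rank and thread counts for a hybrid job can be derived from the node layout. The sketch below works through the arithmetic for this example; the variable names are illustrative.

```shell
#!/bin/bash
cores_per_node=48    # 2 sockets x 24 cores
nodes=2
ranks_per_node=2     # matches the number of sockets per node

# Give each rank an equal share of the node's cores.
threads_per_rank=$(( cores_per_node / ranks_per_node ))
total_ranks=$(( nodes * ranks_per_node ))
total_cores=$(( total_ranks * threads_per_rank ))

echo "--nodes=${nodes} --ntasks-per-node=${ranks_per_node} --cpus-per-task=${threads_per_rank}"
echo "OMP_NUM_THREADS=${threads_per_rank} (ranks: ${total_ranks}, cores: ${total_cores})"
```

Choosing a rank count that divides 48 evenly (for example 2, 4, or 8 per node) keeps every core in use.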
SLURM No-Requeue Option
SLURM will requeue jobs if there is a node failure or if your job is preempted. In some cases, this may cause input or output files that should be preserved to be overwritten. You may request that your job not be automatically requeued by adding the following line to your batch file:
    #SBATCH --no-requeue
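If you prefer to leave requeueing enabled, one alternative is to write each attempt's output to a separate directory. The sketch below relies on the SLURM_RESTART_COUNT environment variable, which the Slurm documentation describes as set when a job has been restarted or requeued. The directory layout is a hypothetical choice.

```shell
#!/bin/bash
# SLURM_RESTART_COUNT is only set after a restart or requeue;
# default to 0 so the first run (or a run outside Slurm) also works.
restart="${SLURM_RESTART_COUNT:-0}"

# Hypothetical layout: one output directory per attempt.
outdir="results/run_${restart}"
mkdir -p "$outdir"
echo "writing output to $outdir"
```

A program started from this job script can then write into $outdir without overwriting files from an earlier attempt.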
Monitoring Job Status
Users can monitor their jobs using the squeue command.
    [user1@lotus-login01]$ squeue -u user1
     JOBID PARTITION     NAME  USER ST    TIME NODES NODELIST(REASON)
    256556   compute hellompi user1  R 0:03:57     2 compute[01-02]
    256555   compute hellompi user1  R 0:14:44     2 compute[03-04]
This shows two jobs that are currently running in the compute partition and which compute nodes they are assigned to. You can use additional options to customize your display:
-i <interval> repeats the display every <interval> seconds
-j <joblist> shows information for specific jobs
Users can cancel jobs using the scancel command:
[user1@lotus-login01]$ scancel <jobid>
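When many jobs are queued, a per-state summary can be easier to read than the full listing. The sketch below counts jobs by state. The helper name count_states is hypothetical; -h (no header) and -o "%T" (print only the job state) are standard squeue options.

```shell
#!/bin/bash
# count_states: read one job state per line on stdin and
# print "STATE COUNT" pairs, sorted by state name. (Hypothetical helper.)
count_states() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# On the cluster: squeue -u "$USER" -h -o "%T" | count_states
# Sample input so the sketch runs anywhere:
printf 'RUNNING\nRUNNING\nPENDING\n' | count_states
```

For the sample input this prints PENDING 1 followed by RUNNING 2.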
Storage Considerations
Lotus has a single storage server with a total of 504 TB of disk in a ZFS RAID-Z2 filesystem. This filesystem is the primary storage location for all data on the cluster. Programs that perform a lot of file I/O operations in parallel may have poor performance with this storage design. Lotus does not have any storage that uses a parallel filesystem.
Each compute node has access to 512 GB of local SSD storage, which can be used for checkpointing and for programs that benefit from fast local storage. The latency for local SSD access is several orders of magnitude lower than accessing the shared network filesystem. Users may use the /scratch filesystem on each compute node for temporary storage. Scratch storage space will be reclaimed after your job completes, so copy any results you need back to the shared filesystem before the job ends.
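Because scratch space is reclaimed when the job completes, a job that works in /scratch needs to copy its results back before it exits. The sketch below shows one way to structure that. The directory names are hypothetical, SLURM_JOB_ID is the standard Slurm job ID variable, and the fallback to the system temporary directory only exists so the sketch can be tried off the cluster.

```shell
#!/bin/bash
# Use node-local scratch when it is writable; otherwise fall back to the
# system temporary directory (useful for trying this off the cluster).
scratch_root=/scratch
[ -w "$scratch_root" ] || scratch_root="${TMPDIR:-/tmp}"

# Private working directory for this job (hypothetical naming scheme).
workdir="$(mktemp -d "${scratch_root}/job_${SLURM_JOB_ID:-manual}_XXXXXX")"
results_dir="${PWD}/results_${SLURM_JOB_ID:-manual}"
mkdir -p "$results_dir"

# Stand-in for the real program: write a file in the scratch directory.
( cd "$workdir" && echo "checkpoint data" > checkpoint.dat )

# Copy everything back to shared storage, then free the scratch space.
cp -r "$workdir"/. "$results_dir"/
rm -rf "$workdir"
echo "results copied to $results_dir"
```

In a multi-node job each node has its own /scratch, so the copy step would need to run on every node that wrote data.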
Software
Users may request that new software packages be added to the cluster if they may benefit multiple users or research groups. If you would like specific software installed, please contact the research computing support staff.