High-Performance Computing Cluster in Aachen

Christian Kocks
April 3, 2012
www.kommunikationstechnik.org

Typical Simulation Workflow
• Simulator development
• Simulation
• Evaluation and visualization of results
Workflow for Simulations on HPC Cluster
• Develop simulator on local PC / server
• Perform short tests on local PC / server
• Transfer simulator to HPC cluster (Subversion!)
• Perform short tests on HPC cluster using the Linux login shell
• Enqueue simulations on HPC cluster
• Wait for notification e-mail from HPC cluster
• Transfer results to local PC / server (e.g. with WinSCP using SCP protocol)
• Evaluate and visualize results on local PC
HPC: High-Performance Computing
SCP: Secure Copy
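On a Linux or macOS machine, the result transfer step can also be done with the scp command-line client instead of WinSCP. The following is a minimal sketch under assumptions: the user name ab123456 and the local target directory are hypothetical placeholders, and the remote path is the work directory used in the sample queue scripts. Unless DRY_RUN is set to 0, the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch: copy simulation results from the HPC cluster to the local machine.
# CLUSTER_USER and LOCAL_DIR are hypothetical placeholders (assumptions).
CLUSTER_HOST="cluster-linux.rz.rwth-aachen.de"   # login server named in the slides
CLUSTER_USER="${CLUSTER_USER:-ab123456}"
REMOTE_DIR="${REMOTE_DIR:-svn/lib/simulators/sim_awgn_matlab}"   # relative to $HOME
LOCAL_DIR="${LOCAL_DIR:-./results}"

fetch_results() {
    # Build the scp invocation; -r copies the directory recursively.
    set -- scp -r "${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}" "${LOCAL_DIR}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"          # dry run: only show what would be executed
    else
        "$@"               # real transfer over the SCP protocol
    fi
}

fetch_results
```

Running the script with DRY_RUN=0 performs the actual copy.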
Local Simulations vs. HPC Cluster Simulations
Local simulations                                   | HPC cluster simulations
Execution of simulations can be directly controlled | Simulations must be enqueued
Very low setup time necessary to start simulation   | Some preparation necessary for running simulations
Number of available resources very limited          | “Unlimited” resources
Number of parallel jobs limited                     | “Unlimited” number of parallel jobs
Arrangements with colleagues necessary              | No arrangements necessary
[Figure: total simulation time versus number of simulations, for local and HPC cluster simulations]
Using Matlab on HPC Cluster
• Connect to Linux login shell using SSH client (e.g. PuTTY):
• Server name: cluster-linux.rz.rwth-aachen.de
• Load Matlab modules:
module load MISC
module load matlab
• Start Matlab:
matlab -nodisplay -nodesktop -nosplash -nojvm \
    -logfile job.log -c @lic-serv.uni-due.de
• Alternative: use graphical remote session (with NX client)
Using Subversion on HPC Cluster
• Use Subversion module svntest for first tests
• Check out Subversion module (e.g. svntest) from KT server:
svn checkout https://134.91.99.160/svn/svntest/trunk [dst]
• Update local working copy:
svn update
• Add file test.m:
svn add test.m
• Commit changes to KT server:
svn commit
Sample Queue Script File
https://versions1.kt.uni-due.de/svn/lib/trunk/simulators/sim_awgn_matlab/sample_simple.sh
#!/usr/bin/env zsh
#BSUB -J sim_awgn_matlab          # job name
#BSUB -o sim_awgn_matlab.%J       # job output (use %J for job id)
#BSUB -e sim_awgn_matlab.e%J      # error output
#BSUB -W 0:20                     # hard runtime limit in hours:minutes
#BSUB -M 512                      # memory in MB
#BSUB -u [email protected]        # e-mail address for notification
#BSUB -N                          # enable e-mail notification
#BSUB -n 2                        # request number of compute slots
#BSUB -a openmp                   # use esub for OpenMP/shared memory jobs

### load matlab modules
module load MISC
module load matlab

### change to the work directory
cd $HOME/svn/lib/simulators/sim_awgn_matlab

### run matlab
matlab -nodisplay -nodesktop -nosplash -nojvm -logfile job.log \
    -c @lic-serv.uni-due.de < sim_awgn_matlab_run.m
Sample Queue Script File – Advanced
https://versions1.kt.uni-due.de/svn/lib/trunk/simulators/sim_awgn_matlab/sample_advanced.sh
#!/usr/bin/env zsh
#BSUB -J sim_awgn_matlab          # job name
#BSUB -o sim_awgn_matlab.%J       # job output (use %J for job id)
#BSUB -e sim_awgn_matlab.e%J      # error output
#BSUB -W 0:20                     # hard runtime limit in hours:minutes
#BSUB -M 512                      # memory in MB
#BSUB -u [email protected]        # e-mail address for notification
#BSUB -N                          # enable e-mail notification
#BSUB -n 2                        # request number of compute slots
#BSUB -a openmp                   # use esub for OpenMP/shared memory jobs

### load matlab modules
module load MISC
module load matlab

### change to the work directory
cd $HOME/svn/lib/simulators/sim_awgn_matlab

### run matlab
matlab -nodisplay -nodesktop -nosplash -nojvm -logfile job.log -c @lic-serv.uni-due.de <<EOF
sim_awgn_matlab('ebn0', 6.78952, 'ModulationOrder', 2, 'Log', 'true', 'NumSymbols', 1000, ...
    'Filename', 'sample_advanced');
EOF
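The <<EOF ... EOF construct in the advanced script is a shell here-document: the lines between the markers are passed to Matlab on standard input. Because the marker is unquoted, the shell expands variables inside it, which allows the Matlab call to be parameterized from the shell. The sketch below illustrates this with cat standing in for the matlab command so that it runs without Matlab; the variable names EBN0 and OUTFILE and the helper render_matlab_input are illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# Sketch: a here-document with shell variable expansion.
# `cat` stands in for the matlab command line used on the cluster.
EBN0=6.78952                 # illustrative variable holding the slide's value
OUTFILE="sample_advanced"    # illustrative variable holding the slide's file name

render_matlab_input() {
    # Unquoted EOF marker: $EBN0 and $OUTFILE are expanded by the shell
    # before the text reaches the program reading standard input.
    cat <<EOF
sim_awgn_matlab('ebn0', $EBN0, 'ModulationOrder', 2, 'Filename', '$OUTFILE');
EOF
}

render_matlab_input
```

Writing the marker as <<'EOF' instead would switch the expansion off and pass the text through unchanged.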
Job Management
• Enqueue a job:
bsub < myscript.sh
• Query unfinished jobs:
bjobs
• Kill unfinished job:
bkill [job ID]
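For scripted use, the job ID reported by bsub can be captured and passed on to bjobs and bkill. The sketch below assumes that bsub prints a line of the form "Job <12345> is submitted to queue <normal>.", which should be verified on the cluster; extract_job_id is a hypothetical helper name. A canned message is used so that the example runs without LSF.

```shell
#!/bin/sh
# Sketch: pull the numeric job ID out of the message printed by bsub.
# Assumes the message format "Job <ID> is submitted to ... queue <NAME>."
extract_job_id() {
    # Print only the digits between the first pair of angle brackets.
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Example with a canned message, so the sketch runs without LSF installed.
echo "Job <12345> is submitted to queue <normal>." | extract_job_id

# On the cluster this would typically be used as:
#   JOB_ID=$(bsub < myscript.sh | extract_job_id)
#   bjobs "$JOB_ID"     # query this job only
#   bkill "$JOB_ID"     # kill it if necessary
```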
General Hints for using HPC Cluster
• Ask Mrs. Tiedtke from Uni DuE for an HPC cluster account
• Write the simulator so that it can run as multiple small simulations instead of one long simulation
• Collect simulation parameters and results in a MAT file
• Deactivate all graphical outputs
• Read “HPC Primer” for further information on using the cluster
• Visit http://www.rz.rwth-aachen.de/aw/cms/rz/Themen/~mem/hochleistungsrechnen
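The hint about running multiple small simulations can be sketched as a loop that writes one queue script per Eb/N0 value and enqueues each with bsub. This is only an illustration under assumptions: the Eb/N0 list, the file naming scheme, and the names EBN0_LIST, JOB_DIR, SUBMIT and make_job_script are hypothetical, while the sim_awgn_matlab call reuses the parameters from the advanced sample script. By default nothing is submitted; the scripts are only written to disk.

```shell
#!/bin/sh
# Sketch: one small LSF job per Eb/N0 value instead of one long simulation.
# EBN0_LIST, JOB_DIR and SUBMIT are hypothetical names for illustration.
EBN0_LIST="${EBN0_LIST:-0 2 4 6 8}"
JOB_DIR="${JOB_DIR:-./jobs}"
SUBMIT="${SUBMIT:-0}"          # set to 1 on the cluster to call bsub

make_job_script() {
    # $1 = Eb/N0 value; writes the queue script and prints its path.
    ebn0="$1"
    script="$JOB_DIR/sim_awgn_ebn0_${ebn0}.sh"
    cat > "$script" <<EOF
#!/usr/bin/env zsh
#BSUB -J sim_awgn_${ebn0}
#BSUB -o sim_awgn_${ebn0}.%J
#BSUB -W 0:20
#BSUB -M 512
module load MISC
module load matlab
cd \$HOME/svn/lib/simulators/sim_awgn_matlab
matlab -nodisplay -nodesktop -nosplash -nojvm -c @lic-serv.uni-due.de <<MATLAB
sim_awgn_matlab('ebn0', ${ebn0}, 'ModulationOrder', 2, 'Log', 'true', 'NumSymbols', 1000, 'Filename', 'result_ebn0_${ebn0}');
MATLAB
EOF
    echo "$script"
}

mkdir -p "$JOB_DIR"
for ebn0 in $EBN0_LIST; do
    script=$(make_job_script "$ebn0")
    if [ "$SUBMIT" = "1" ]; then
        bsub < "$script"       # enqueue on the cluster
    else
        echo "would submit: $script"
    fi
done
```

Each job writes to its own result file, so the results can be collected and evaluated together once all notification e-mails have arrived.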
Demonstration…