Summary and Setup
Outlining the course
- Targeted audience (see learner profiles: new HPC users, RSEs with users on HPC systems, researchers in HPC.NRW)
- Estimated length and recommended formats (e.g. X full days, X * 2 half days, in-person/online, live coding)
- Course intentions (focus on the learners' perspective!):
  - Speed up research (efficient computations, more done per unit of time, shorter iteration times, “less in the way”)
  - Improve batch utilization by matching application requirements to requested hardware (minimal resource requirements, maximum resource utilization)
  - Convey intuition about job sizes: what is considered large, what small?
  - Sharpen awareness of the importance of not wasting time and energy on a shared system
  - Teach common concepts and terms of performance
  - Take first steps into performance optimization (cluster, node, and application level)
- Well-defined context:
  - Working on HPC systems (batch system, shared file systems, software modules, …)
  - Performance of jobs
  - Application performance is touched on (as related to job efficiency), but in-depth analysis is out of scope; next steps point towards deeper performance analyses
Learning Objectives
After attending this training, participants will be able to:
- Explain efficiency in the context of HPC systems
- Use batch system tools and third-party tools to measure job efficiency (an example follows below)
- Distinguish between better- and worse-performing jobs
- Describe common concepts and terms related to performance on HPC systems
- Identify hardware components involved in performance considerations
- Achieve first results in performance optimization of their application
- Recall next steps to take towards learning performance optimization
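For example, Slurm's accounting tools already give a first impression of job efficiency. A minimal sketch, assuming a Slurm cluster with accounting enabled and a completed job (the job ID `123456` is a placeholder):

```bash
# Query accounting data for a finished job (job ID is a placeholder).
# Comparing TotalCPU with Elapsed*AllocCPUS, and MaxRSS with the
# requested memory, shows how well the allocation was used.
sacct -j 123456 --format=JobID,JobName,AllocCPUS,Elapsed,TotalCPU,MaxRSS,State

# If the seff contrib script is installed on the cluster, it reports
# CPU and memory efficiency for the same job as percentages.
seff 123456
```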
Prerequisites
- Access to an HPC system
- Example workload setup
- Basic knowledge of HPC systems (batch systems, parallel file systems) – being able to submit a simple job and understand what happens in broad terms
- Knowledge of tools to work with HPC systems:
  - Bash shell & scripting
  - `ssh` & `scp`
  - Simple Slurm jobscripts and commands like `srun`, `sbatch`, `squeue`, and `scancel` (see the sketch below)
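For reference, a minimal Slurm jobscript sketch that relies only on the commands listed above (account and partition options are omitted and will depend on the site):

```bash
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --mem-per-cpu=1G

# Run a single task; srun is not strictly required for a serial
# program, but mirrors how parallel jobs are launched.
srun hostname
```

Such a script would be submitted with `sbatch hello.sh`, monitored with `squeue --me`, and cancelled with `scancel <jobid>` if needed.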
Example Workload & Setup
Example workload that:
- Has some instructive performance issues that can be discovered, e.g.
  - Mismatch between requested resources in the job script and resources actually used (see the sketch after this list)
  - Memory leak or unnecessary allocation with a quick fix? Either triggers OOM or just wastes resources, depending on size and the default memory per core
  - No vectorization?
  - Parallelism issues?
  - Uncover several performance issues in layers, one after the other?
- Software that can run on CPU and GPU, to discuss both with the same example
- Should be easy to download, compile, run, and understand (readability)
- Meaningful workflow for batch processing
- Uses programming languages / libraries commonly used in HPC
- Will likely not show all performance issues that could exist; it is only a vehicle to follow a narrative with particular performance issues
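To illustrate the resource-mismatch idea, a hedged sketch of a jobscript that requests more cores than the application can use (the script `analyze.py` is hypothetical):

```bash
#!/bin/bash
#SBATCH --job-name=mismatch-demo
#SBATCH --cpus-per-task=16    # 16 cores are requested ...
#SBATCH --time=01:00:00

# ... but the (hypothetical) analysis script is single-threaded,
# so 15 of the 16 allocated cores sit idle for the whole runtime.
srun python analyze.py
```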
We are still looking, but are considering:
- Kuwahara filter
- Raytracer
- Simple agent-based simulation game
- (Benchmarks are portable, but they don’t really show performance issues and are often complex)
HPC Access
You will need access to an HPC cluster to run the examples in this lesson. Discuss how to find out where to apply for access as a researcher (in general, in the EU, in Germany, in NRW?). Refer to the HPC Introduction lessons to learn how to access and use a compute cluster of that scale.
- Executive summary of a typical HPC workflow? Or refer to other HPCC courses that cover this
- “HPC etiquette”
  - E.g. don't run benchmarks on the login node (see the sketch after this list)
  - Don't disturb jobs on shared nodes
- Setup of the example for performance studies
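As one concrete etiquette example from the list above, short measurements can be moved off the login node by requesting an interactive allocation; the options shown are illustrative and depend on the site's partitions and limits:

```bash
# Request a short interactive shell on a compute node instead of
# benchmarking on the shared login node.
srun --ntasks=1 --cpus-per-task=4 --time=00:30:00 --pty bash

# Inside the allocation, run the measurement as usual.
./my_benchmark   # hypothetical executable
```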
Common Software on HPC Systems
Working on an HPC system commonly involves a
- batch system to schedule jobs (e.g. Slurm, PBS Pro, HTCondor, …), a
- module system to load certain versions of centrally provided software and a
- way to log in to a login node of the cluster.
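As a sketch of typical module-system usage (module names and versions are examples and differ between clusters):

```bash
# List centrally provided software (output is site-specific).
module avail

# Load a particular version of a compiler; the exact name/version
# is an example and will differ on your cluster.
module load GCC/12.3.0

# Show what is currently loaded.
module list
```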
To log in via `ssh`, you can use (remove this since it's discussed in HPC introduction?):
- PuTTY (Windows)
- `ssh` in PowerShell (Windows)
- `ssh` in Terminal.app (macOS)
- `ssh` in Terminal (Linux)
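A minimal login sketch for the command-line variants above (hostname and username are placeholders; use the values provided by your HPC centre):

```bash
# Connect to the cluster's login node.
ssh myuser@login.example-cluster.org
```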
Acknowledgements
Course created in the context of HPC.NRW.