HPC 2022/02 Summer Kickstart

Last modified by juhaheli@helsinki_fi on 2024/02/08 06:49

Part of the Scientific Computing in Practice lecture series at Aalto University.

Audience: All FGCI consortium members looking for an HPC crash course.


About the course


This is the FGCI-wide kickstart for all FGCI consortium members. Support representatives from several universities will take part. Most of the material is common to all participants; in addition, we organize breakout rooms for different sites (a sort of parallel sessions) when needed.

Overall, it is a three-day kickstart for researchers to get started with the computational resources available at FGCI and CSC. The course is modular; see the tentative schedule below. On day one we start with a basic HPC intro, go through the resources available at CSC, and then switch to the practicalities of the FGCI sites. On days two and three we cover, step by step, how to get started on the local computational clusters. Learning by doing.

By the end of the course you will have hints, ready-made solutions and copy/paste examples for finding, running and monitoring your applications and managing your data, as well as for optimizing your workflow in terms of filesystem traffic, memory usage, etc.

Note for Aalto users: the course is obligatory for all new Triton users and recommended for everyone interested in the field.


Times


Time, date: Wed 2.2, Thu 3.2, Fri 4.2, 11:50-16:00 EET

Place: Online: Zoom link is TBA

Lecturers: Aalto Science IT and CSC people

Registration: registration link

The daily schedule is flexible; below is the tentative plan. There will be frequent breaks. You will be given time to try things out and ask questions; it is more like an informal help session to get you started with the computing resources.

BTW, HPC stands for High Performance Computing.


Day #1 (Wed 2nd Feb)

Day #2 (Thu 3rd Feb)

All times approximate, breaks every hour.


Day #3 (Fri 4th Feb)

All times approximate, breaks every hour.

  • 11:50 – 13:00: Simple parallelization with array jobs

    Array jobs

  • 13:00 – 14:00: Using more than one CPU at the same time

    Parallel computing

  • 14:00 – 14:30: Laptops to Lumi, Jussi Enkovaara, CSC

    You now know the basics of using a computing cluster. What if you need more than what a university can provide? CSC (and other national computing centers) have even more resources, and this is a tour of them.

  • 14:40 – 15:30: Running jobs that can utilize GPU hardware

    GPU computing

  • 15:30 – 16:00: Questions to presenters


Cost: Free of charge for FGCI consortium members including University of Helsinki employees and students.


Course prerequisite requirements and other details specific to University of Helsinki

Participants will be provided with access to Kale & Turso for running the examples. Participants are expected to have an SSH client installed. You can use VDI as a convenient access point.

    • Connect to kale with the command

ssh kale.grid.helsinki.fi

    • Or, if your username on kale differs from the one on the machine where you run the SSH client (possible e.g. when using VPN), with the command

ssh username@kale.grid.helsinki.fi

    • If you are connecting from Windows 10, you should be able to install an SSH client from the software store. On earlier Windows versions, you need to install PuTTY (side note: putty dot org is an advertisement site trying to get you to install something else). To use graphical programs, you might need to install an X server on Windows; in that case it is far easier to just use VDI.
    • There are some differences in the directory structure compared with e.g. Aalto. At University of Helsinki:


$PROJ points to a project directory /proj/$USER/
$WRKDIR points to the working directory /wrk/users/$USER/
$HOME is a home directory intended for only profile files. Cache and other data should be redirected to $WRKDIR or $PROJ as appropriate.
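The advice to redirect cache data away from $HOME can be sketched in a few lines of shell. This is only an illustration, not an official site recipe: XDG_CACHE_HOME is the standard cache location honoured by many tools, PIP_CACHE_DIR is pip's own setting, and the fallback value for WRKDIR is an assumption added purely so that the snippet also runs outside the cluster.

```shell
# On the cluster WRKDIR is already set; this fallback is a demo-only assumption.
: "${WRKDIR:=${TMPDIR:-/tmp}/wrkdir-demo}"

# Point generic and pip caches to the working directory instead of $HOME.
export XDG_CACHE_HOME="$WRKDIR/.cache"
export PIP_CACHE_DIR="$XDG_CACHE_HOME/pip"

# Create the directories so tools can start writing there immediately.
mkdir -p "$PIP_CACHE_DIR"
echo "Caches now go to: $XDG_CACHE_HOME"
```

Lines like these would typically go into a shell profile file, which is the kind of small file $HOME is meant for.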

    • Make sure that you have the Aalto software repositories available. You can add them by loading the fgci-common module with the command:
module load fgci-common
  • In the interactive exercises, the Triton user guide instructs you to use the 'interactive' queue. On kale there is no special queue for that; instead, use the 'short' queue for the interactive course jobs.
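As a rough sketch of the queue substitution described above, the helper below prints an interactive-session command that targets the 'short' queue. It assumes that kale is managed with Slurm and that the queue is a Slurm partition of that name. The function name interactive_cmd and the default time and memory values are hypothetical and chosen only for illustration. The command is printed rather than executed, so the snippet is safe to run anywhere.

```shell
# Hypothetical helper: prints (does not run) an srun command for an
# interactive shell. Assumes Slurm and a partition called "short".
interactive_cmd() {
    partition="${1:-short}"      # queue to use; 'short' replaces 'interactive'
    time_limit="${2:-00:30:00}"  # illustrative default, not a site policy
    mem="${3:-1G}"               # illustrative default, not a site policy
    printf 'srun --partition=%s --time=%s --mem=%s --pty bash\n' \
        "$partition" "$time_limit" "$mem"
}

interactive_cmd
```

Where a Triton exercise names the 'interactive' queue, the same command with 'short' as the partition is the natural substitution on kale.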

Storage

  • $WRKDIR is based on Lustre, is optimized for high-throughput, low-latency workloads, and has a large capacity. See the Lustre User Guide for more details. $WRKDIR is always local to a cluster. It is not backed up, and it is intended for storing the temporary files of your analysis. You should remove the files at the end of each run, if possible. Whenever there is a shortage of space, the admins will remove old files, so $WRKDIR is neither long-term data storage nor safe by any means.
  • $PROJ is based on NFS and is therefore much slower; however, it is a good place for binaries, source code, etc. that need backup. $PROJ has much greater latency and much lower throughput, and is not suitable for runtime datasets. $PROJ is shared between all clusters.
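The division of labour between the two filesystems might look like the following in practice: temporary files live in a per-run directory under $WRKDIR, only the final result is copied to $PROJ, and the scratch directory is removed at the end of the run. This is a minimal sketch, not a site-provided script. The function name run_analysis, the file names and the results subdirectory are hypothetical, and the fallback values for WRKDIR and PROJ are assumptions that only exist so the snippet can be tried outside the cluster.

```shell
# On the cluster these are already set; the fallbacks are demo-only assumptions.
: "${WRKDIR:=${TMPDIR:-/tmp}/wrkdir-demo}"
: "${PROJ:=${TMPDIR:-/tmp}/proj-demo}"

run_analysis() {
    mkdir -p "$WRKDIR" "$PROJ/results" || return 1

    # Per-run scratch directory on the fast, non-backed-up filesystem.
    run_dir=$(mktemp -d "$WRKDIR/run.XXXXXX") || return 1

    # Placeholder for the real analysis: temporary and final files.
    echo "intermediate data" > "$run_dir/scratch.tmp"
    echo "final result" > "$run_dir/result.txt"

    # Keep only what needs backup on $PROJ, then clean up $WRKDIR.
    cp "$run_dir/result.txt" "$PROJ/results/result.txt"
    rm -rf "$run_dir"
}

run_analysis
```

Copying a single small result file at the end keeps the NFS traffic low, while all runtime reads and writes stay on Lustre.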

Other

Logical View of HY Clusters

Additional course info at: it4sci <at> helsinki <dot> fi