Last modified by Xwiki VePa on 2024/02/08 08:16

{{toc/}}
= (% style="color: rgb(51,153,102);" %)0.0 A Very Short Course in Batch Scheduling(%%) =

(% style="color: rgb(51,51,51);" %)The first thing you need is user permission to the clusters. These are handled with [[IDM group management>>url:https://wiki.helsinki.fi/display/it4sci/Ukko2+User+Guide#Ukko2UserGuide-1.0Access||style="color: rgb(51,51,51);" shape="rect"]].

(% style="color: rgb(51,51,51);" %)Welcome, and nice to see you got here. [[I take it you have seen what we provide?>>doc:it4sci.IT for Science group.Resources for Research.WebHome]] If not, please do have a look. Now, let me introduce you to a handy little utility called batch scheduling. It is a very old invention, but batch schedulers are in use in most, if not all, supercomputers around the world, and you are bound to meet one eventually if you plan to work with computational challenges. They come in many flavors, but the fundamentals are the same. So, let's get started by logging in:

{{{ssh turso.cs.helsinki.fi}}}

We could just start our little process on the host we just logged in to, but if everyone did that, nobody would get very far. In fact, we'd struggle to get anywhere at all. Let's try something else instead...

{{{/usr/bin/srun hostname}}}

What just happened? The command hostname was run on some host called ukko2-05. Yes, that was a compute node of the cluster.

**Translation:** launch (srun), request default resources, and execute the 'hostname' command on a cluster compute node.

Actually, this is roughly equal to doing this with ssh:

{{{ssh username@yourhost.name.here hostname}}}

Why would you then need a batch thing? Well, there is more to srun than meets the eye. Let's try this:

{{{/usr/bin/srun -c8 --mem=2M -t1-0 hostname}}}

What is the significance of this awkward-looking spell? You can allocate resources to the task at hand (//in short, you can ask for specified amounts of memory, CPUs, and time//). Srun handles the process placement, so you actually do run the command with 8 cores, and not with just one while the seven others idle. The hostname command is just not the most scalable of processes, so you do not get a significant performance gain. Try to do that with ssh.

**Translation:** launch (//srun//), request //8 cores (-c)//, 2 MB of memory (//~-~-mem//), and 1 day of time (//-t//), then execute the 'hostname' command on a cluster compute node; finish when the command has been executed, or be terminated if you exceed what you asked for. 8 cores and a day to run hostname is a bit of an overkill, but let's not think about that now.
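For reference, the time limit given to -t/~-~-time accepts several standard Slurm forms (plain minutes, hh:mm:ss, and days-hours among them). A few equivalent ways of asking for one day, written as sbatch directives:

```shell
#SBATCH -t 1-0            # days-hours: one day
#SBATCH --time=24:00:00   # hours:minutes:seconds: the same day
#SBATCH --time=1440       # plain minutes: still the same day
```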

You could request a pseudo-terminal instead of running hostname, and have a shell opened on a compute node:

{{{/usr/bin/srun -c1 --mem=2M -t1-0 --pty bash}}}

We're not done yet - and my program takes a day, not just a second or two, so what now? //I could start a screen and then execute it there, and log out...// Wait! Now I am drifting a bit. Hold on. Let's make this a bit more convenient and see what we come up with (//pick an editor you like; I prefer vi, others prefer other editors//):

{{{vi job.script.sh}}}

Then type the following:

{{{#!/bin/bash
#SBATCH -o /wrk/users/myusernamehere/output.txt
#SBATCH -c8
#SBATCH --mem=2M
#SBATCH -t1-0
srun hostname
srun sleep 600}}}

Well, that wasn't really more compact, was it? Now, let's see what happens if we do the following:

{{{sbatch job.script.sh}}}

Here's an interesting thing. You could now end your session and log out. Your job would go about its business without any further input from you. This allows you to send jobs into the system in batches of one or more - hence the name batch job. You could ask it to send you an e-mail once it's done if you'd like.

Output is written to the output.txt file.

**Translation:** the sbatch command sends the script into a queue to wait for execution, and then srun executes the commands with the resources you have requested. This is called submitting a batch job (//every line with #SBATCH is a resource reservation; the lines after that, well, that is good old Unix//). The -o option defines the output file, in this case output.txt.
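The e-mail notification mentioned above is requested with two more #SBATCH lines. A sketch (the address below is a placeholder, and actual mail delivery depends on the cluster configuration):

```shell
#SBATCH --mail-type=END,FAIL            # send mail when the job finishes or fails
#SBATCH --mail-user=me@example.invalid  # placeholder address - use your own
```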

**Finally, for the bigger picture...**

Slurm does not replace Linux functionality. In fact, Slurm has two parts. One part (the sbatch script) creates the boundaries of the playground, the second (srun) makes sure your toys are placed in the right places in the playground.

What one does inside the playground is entirely up to the boundaries set by Linux and to creativity. All functions, environment variables, etc. work as they would in any ordinary shell. The difference is that if you exceed the playground boundaries in any way, your right to play will be terminated.

The only limitation is that Slurm ordinarily allows a process to run once, after which it considers the process successfully completed and terminates the surrounding playground. This makes it non-trivial to play with daemons, or with processes set to operate in a similar fashion.

**Congratulations!** You have just finished the short course on batch scheduling! That wasn't hard, was it? [[There is plenty more you can do...>>doc:it4sci.IT for Science group.Internal Documentation.HPC Environment.Archive.OBSOLETE - Ukko2 User Guide.WebHome]]

= (% style="color: rgb(51,153,102);" %)0.1 How to install software(%%) =

Well, let's make one thing straight. You may have already thought about it, or asked about it, but you cannot gain root privileges to install readily packaged software on a cluster as you could on your laptop. However, you can install a lot of software under your $WRKDIR or $PROJ directory. You can of course use your own codes, and the vast majority of other software you come up with does not require privileges to install. Also remember that in tricky-looking cases Conda and Python virtual environments are your dearest friends.

To make things easier, and to maintain different versions, you can and should use [[Environment Modules>>url:http://modules.sourceforge.net/||shape="rect"]]. That way you can run your work independently of the system libraries. Here is our simplified [[guide>>doc:it4sci.IT for Science group.HPC Environment User Guide.Module System.WebHome]] to get you going. Building modules manually is really easy.

Administrators install new modules system-wide when the software is needed by multiple users, when it is related to development, and when they have time to do so. They cannot provide marginally used or eccentric software packages and modules for every need.

= (% style="color: rgb(51,153,102);font-size: 24.0px;letter-spacing: -0.01em;text-transform: uppercase;" %)1.0 Python VirtualENV(%%) =

(% style="color: rgb(51,51,51);" %)A Python virtualenv is the most commonly requested software package, and it is extremely easy to create and customize to your specific needs. Here's an example to walk you through the creation of a Python virtualenv. Simple, efficient, and customizable. Remember that if you wish to make sure the pip caches end up in the right place, set the ~-~-cache-dir flag to the right location.

(% style="color: rgb(0,51,0);" %)**Note:** (% style="color: rgb(0, 51, 0); color: rgb(255, 0, 0)" %)We use the default module versions in the examples. You may wish to use some version other than the default, or to pin the version you use. Also note that default module versions are bound to change over time.

>1. module purge
>2. module load Python
>3. cd $PROJ
>4. python -m venv myVirtualEnv
>5. source myVirtualEnv/bin/activate
>6. pip install ~-~-cache-dir <cache directory> <package you need>
>7. pip install <//any additional packages you wish to install...//>

When you are done for the day:

>deactivate

... and when ready to start again:

>source myVirtualEnv/bin/activate

Oh, yes. Last but not least. Your [[environment is indeed inherited>>url:https://wiki.helsinki.fi/display/it4sci/Ukko2+User+Guide#Ukko2UserGuide-3.1CreatingaBatchjob||shape="rect"]] when you start a batch job or an interactive session on the compute nodes. This includes the Python virtualenv.
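Putting the two previous sections together, a minimal batch script that uses the virtualenv could look like this (a sketch only; the paths and the script name are illustrative):

```shell
#!/bin/bash
#SBATCH -o /wrk/users/myusernamehere/venv_job_output.txt
#SBATCH -c 1
#SBATCH --mem=500M
#SBATCH -t 1:00:00
module purge
module load Python
# Activate the virtualenv created above before running the payload
source $PROJ/myVirtualEnv/bin/activate
srun python my_script.py
```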

= (% style="color: rgb(51,153,102);" %)2.0 Tensorflow(%%) =

Since TensorFlow is very common, here are instructions for creating your own TensorFlow installation using the Python virtualenv approach above. Additionally, we'll go through how to use it in interactive mode and through the batch system, along with a sample program to get you off the ground. If you do not specify module versions, you will be loading the defaults, which are the newest available. If in doubt, or unsure about dependencies, you can always get a clean slate by purging the modules and then loading the correct set. Note that if you need to relocate the pip cache from its default directory of ~~/ to someplace else, you can use the ~-~-cache-dir option.

You cannot leave out the cache option, because the default location in $HOME is restricted by quota. Leaving the cache option out will cause error messages about missing files, etc. Your cache directory can be in $WRKDIR or $PROJ at your discretion, although we recommend $WRKDIR.

**Note:** (% style="color: rgb(255,0,0);" %)Use Python 3.6. It seems that TensorFlow does not work well with 3.7.

>1. module purge
>2. module load Python cuDNN
>3. cd $PROJ
>4. python -m venv myTensorflow
>5. source myTensorflow/bin/activate
>6. pip ~-~-cache-dir </wrk-[vakka|kappa]/users/<username>> install tensorflow
>7. pip ~-~-cache-dir </wrk-[vakka|kappa]/users/<username>> install tensorflow-gpu
>8. pip ~-~-cache-dir </wrk-[vakka|kappa]/users/<username>> install keras
>9. pip ~-~-cache-dir </wrk-[vakka|kappa]/users/<username>> install <//any additional packages you wish to install...//>

You can then add any additional libraries on top, as in steps 8 and 9.

(% style="margin-left: 30.0px;" %)
== 2.1 Running Example ==

(% style="margin-left: 30.0px;" %)
Here's an example batch script for you to use with TensorFlow, assuming that you have followed the directions above to create the virtual environment and wish to invoke it in the batch script. Note that the module environment variables are inherited when the batch job is submitted; hence the module purge, to make sure that you load only the modules required.

{{{#!/bin/bash
############## This section states the requirements of the job:
#SBATCH --job-name=test
#SBATCH --workdir=/wrk-[vakka|kappa]/users/<username>
#SBATCH -o result.txt
#SBATCH -p gpu
#SBATCH -c 2
#SBATCH --gres=gpu:1
#SBATCH --mem-per-cpu=100
############## Here start the actual UNIX commands and payload:
module purge
module load Python cuDNN
source myTensorflow/bin/activate
srun tensorflow-program}}}

(% style="margin-left: 30.0px;" %)
Interactive use example (//this gives the default wall time, which can be altered with -t <time in format [DD-]hh:mm:ss>//):

{{{srun -c 1 --ntasks-per-core=1 --gres=gpu:1 -p gpu --pty bash}}}

And once your session on the node starts, you can do the regular Unix things:

{{{module purge
module load Python cuDNN
source myTensorflow/bin/activate
srun tensorflow-program}}}

(% style="margin-left: 30.0px;" %)
== 2.2 TensorFlow Example ==

(% style="margin-left: 30.0px;" %)
Here's a simple example program [[attach:it4sci.OBSOLETE - Scientific Software FAQ@script.py]] you can use to verify that your environment works as expected.

{{{# Import tensorflow
import tensorflow as tf

# Initialize two constants
x1 = tf.constant([1,2,3,4])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)

# Initialize the Session
sess = tf.Session()

# Print the result
print(sess.run(result))

# Close the session
sess.close()}}}

== 2.3 Additional Tensorflow References ==

[[https:~~/~~/www.tensorflow.org/install/install_sources#tested_source_configurations>>url:https://www.tensorflow.org/install/install_sources#tested_source_configurations||shape="rect"]]

[[https:~~/~~/www.datacamp.com/community/tutorials/tensorflow-tutorial#basics>>url:https://www.datacamp.com/community/tutorials/tensorflow-tutorial#basics||shape="rect"]]

= (% style="color: rgb(51,153,102);" %)3.0 Anaconda(%%) =

You should refer to the [[Aalto Conda/Anaconda documentation>>url:https://scicomp.aalto.fi/triton/apps/python-conda/||shape="rect"]], since the instructions below are outdated.

Marking these for removal as obsolete:

--Anaconda is not available as a module, and we strongly recommend that you install Anaconda yourself in your home directory. The reason for this is to give you better control over the environment and packages. You have the option to use either Python 2.7 or Python 3, as in the examples below.--

(% style="margin-left: 30.0px;" %)
== --3.1 Python2 version is now obsolete. - Python 2.7-- ==

(% style="margin-left: 30.0px;" %)
--First change to your proper working directory; here we assume that you are using $PROJ:--

--cd $PROJ--

(% style="margin-left: 30.0px;" %)
--Then, get the installation packages for Anaconda:--

--wget https:~/~/repo.continuum.io/archive/Anaconda2-4.4.0-Linux-x86_64.sh--

(% style="margin-left: 30.0px;" %)
--Run the installer:--

--sh ./Anaconda2-4.4.0-Linux-x86_64.sh -p $PROJ/anaconda2--

(% style="margin-left: 30.0px;" %)
--The following line in .bashrc will make Anaconda your default Python. To disable it, comment it out with #.--

--echo 'export PATH="$PROJ/anaconda2/bin:$PATH"' >> ~~/.bashrc--

(% style="margin-left: 30.0px;" %)
--Remove the installation packages:--

--rm Anaconda2-4.4.0-Linux-x86_64.sh--

(% style="margin-left: 30.0px;" %)
--Source your .bashrc:--

--source ~~/.bashrc--

(% style="margin-left: 30.0px;" %)
== --3.1 Python 3-- ==

(% style="margin-left: 30.0px;" %)
--First, get the installation packages for Anaconda:--

--wget https:~/~/repo.continuum.io/archive/Anaconda3-4.4.0-Linux-x86_64.sh--

(% style="margin-left: 30.0px;" %)
--Run the installer:--

--sh ./Anaconda3-4.4.0-Linux-x86_64.sh -p /proj/$USER/anaconda3--

(% style="margin-left: 30.0px;" %)
--The following line in .bashrc will make Anaconda your default Python. To disable it, comment it out with #.--

--echo 'export PATH="/proj/$USER/anaconda3/bin:$PATH"' >> ~~/.bashrc--

(% style="margin-left: 30.0px;" %)
--Remove the installation packages:--

--rm Anaconda3-4.4.0-Linux-x86_64.sh--

(% style="margin-left: 30.0px;" %)
--Source your .bashrc:--

--source ~~/.bashrc--

(% style="margin-left: 30.0px;" %)
== --3.2 Setting up the environment-- ==

(% style="margin-left: 30.0px;" %)
--You can now create a conda environment for your project and install all the necessary packages into the environment. Note that you can have multiple environments with conflicting requirements without needing to reinstall Anaconda. See the TensorFlow Python virtual environment above as an example of a similar use case.--

(% style="margin-left: 30.0px;" %)
--Go to the project directory:--

--cd $PROJ--

(% style="margin-left: 30.0px;" %)
--Create a project:--

--conda create -y -n myProject--

(% style="margin-left: 30.0px;" %)
--Activate your project:--

--source activate myProject--

(% style="margin-left: 30.0px;" %)
--Install some additional [[packages>>url:https://anaconda.org/anaconda/repo||shape="rect"]], just like in a Python virtualenv:--

--conda install -y <packages>--

(% style="margin-left: 30.0px;" %)
--Or use pip, as you see fit:--

--pip install <packages>--

(% style="margin-left: 30.0px;" %)
--When you need to activate your project again, just:--

--source activate myProject--

(% style="margin-left: 30.0px;" %)
--It is just that easy. Enjoy. Once you are done, just:--

--conda deactivate--

(% style="margin-left: 30.0px;" %)
--Sharing your environment:--

1. --Activate the environment to export: conda activate myenv (replace myenv with the name of the environment).--
1. --Export your active environment to a new file: conda env export > environment.yml.--
1. --Email or copy the exported environment.yml file to the other person.--

(% style="margin-left: 30.0px;" %)
== 3.3 Additional References ==

(% style="margin-left: 30.0px;" %)
[[Conda User Guide>>url:https://conda.io/docs/user-guide/index.html||shape="rect"]]

(% style="margin-left: 30.0px;" %)
[[Machine Learning Anaconda>>url:https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/||shape="rect"]]

(% style="margin-left: 30.0px;" %)
[[Conda Extra Packages>>url:https://anaconda.org/anaconda/repo||shape="rect"]]

= (% style="color: rgb(51,153,102);" %)4.0 Comsol(%%) =

For Comsol users, here is a specific example of a batch script for running Comsol in batch mode:

{{{#!/bin/bash
#
#SBATCH --workdir=$WRKDIR
#SBATCH --time=04:00:00
#SBATCH --mem-per-cpu=2048
#SBATCH --ntasks=8
#### SLURM 8 processor COMSOL test to run for 4 hours.
module purge
module load COMSOL
# Change the below path to an actual temporary directory!
# To create one on the fly on local node storage, use:
# export TMPDIR=$(mktemp -d)
export TMPDIR="/path/to/temp/dir"
comsol -clustersimple batch -inputfile ${SLURM_JOB_NAME} -outputfile ${SLURM_JOB_NAME//\.mph/}.out.mph -batchlog ${SLURM_JOB_NAME}.${SLURM_JOB_ID}.comsol_log -tmpdir $TMPDIR}}}

You can run Comsol in an interactive session, but the graphical visualization toolkit is not an optimal way to use cluster resources, and there are some limitations (including the use of OpenGL). An interactive session can be started as follows:

{{{srun --ntasks=8 --time=04:00:00 --pty bash}}}

After the session starts, you can start Comsol as on any other Unix system:

{{{comsol -3drendr sw}}}

Note that Comsol requires a .comsol directory, which is created the first time the binary is executed. As a result, the Comsol binary may end up dumping core if .comsol is not present.

= (% style="color: rgb(51,153,102);" %)5.0 MiniSat(%%) =

If you need the MiniSat solver, the most convenient way is to compile it into your own software repository, for example $PROJ/MySoftware.

First, get the sources:

{{{wget http://minisat.se/downloads/minisat-2.2.0.tar.gz}}}

Unpack the sources to the proper place. Then load the necessary modules:

{{{module load GCC zlib}}}

Set the environment to match the location of the sources:

{{{export MROOT=$PROJ/MySoftware/sources/minisat}}}

Choose which one to compile:

{{{cd { core | simp }}}}

Compile a static binary:

{{{make rs}}}

After that you can copy the binary to your program folder. If you wish, [[you can then create your own module>>url:https://wiki.helsinki.fi/display/it4sci/Module+System#ModuleSystem-1.2Creatingmodulesforyourownsoftware||shape="rect"]] for the program and even share it with your colleagues.

= (% style="color: rgb(51,153,102);" %)6.0 CosmoMC(%%) =

Below you can find step-by-step instructions for installing CosmoMC on Ukko2 and Kale. Please note that on Ukko2 the installation location should be in $PROJ. First, load the necessary modules - in this case, OpenMPI includes the necessary dependencies:

{{{module load OpenMPI/3.1.1-GCC-7.3.0-2.30}}}

Get CosmoMC:

{{{git clone https://github.com/cmbant/CosmoMC.git}}}

Compile CosmoMC from sources:

{{{cd $PROJ/cosmoMC}}}

{{{make}}}

Get the COM likelihood code, plc-2.0:

{{{wget http://irsa.ipac.caltech.edu/data/Planck/release_2/software/COM_Likelihood_Code-v2.0.R2.00.tar.bz2}}}

Open the package:

{{{tar xvfj COM_Likelihood_Code-v2.*.tar.bz2}}}

...and compile plc-2.0:

{{{cd plc-2.0}}}

{{{./waf configure --lapack_mkl=${MKLROOT} --install_all_deps --extra_lib m}}}

And finally, install:

{{{./waf install}}}

Don't forget: If you wish, [[you can create your own module>>url:https://wiki.helsinki.fi/display/it4sci/Module+System#ModuleSystem-1.2Creatingmodulesforyourownsoftware||rel="nofollow" shape="rect"]] for the program and even share it with your colleagues.

= (% style="color: rgb(51,153,102);" %)7.0 SUMO(%%) =

[[http:~~/~~/sumo.dlr.de/wiki/Installing/Linux_Build>>url:http://sumo.dlr.de/wiki/Installing/Linux_Build||shape="rect"]]

First, load the necessary modules:

{{{module load GDAL/2.2.3-foss-2018a-Python-3.6.4
module load X11/20180131-GCCcore-6.4.0}}}

Then change to the proper directory:

{{{cd $PROJ}}}

Install the 3rd party libraries, i.e. the FOX Toolkit:

{{{wget ftp://ftp.fox-toolkit.org/pub/fox-1.6.57.tar.gz}}}

{{{tar xzf fox-1.6.57.tar.gz}}}

{{{cd fox-1.6.57/}}}

{{{./configure --prefix $PROJ}}}

{{{make install}}}

{{{cd $PROJ}}}

Then it is time to install SUMO:

{{{git clone --recursive https://github.com/eclipse/sumo}}}

{{{cd sumo}}}

{{{git fetch origin refs/replace/*:refs/replace/*}}}

{{{export SUMO_HOME="$PROJ/sumo"}}}

{{{make -f Makefile.cvs}}}

{{{./configure --prefix $PROJ --with-fox-config=$PROJ/bin/fox-config}}}

{{{make install}}}

Then you are done.

If you encounter an error about version.h (//example: [[https:~~/~~/github.com/eclipse/sumo/issues/3967>>url:https://github.com/eclipse/sumo/issues/3967||shape="rect"]]//), then you have to do:

{{{python sumo/tools/build/version.py}}}

And then it should compile fine.

Don't forget: If you wish, [[you can create your own module>>url:https://wiki.helsinki.fi/display/it4sci/Module+System#ModuleSystem-1.2Creatingmodulesforyourownsoftware||rel="nofollow" shape="rect"]] for the program and even share it with your colleagues.

= (% style="color: rgb(51,153,102);" %)8.0 General Notes(%%) =

(% style="margin-left: 30.0px;" %)
== (% style="color: rgb(51,51,51);" %)8.1 IBM ILOG CPLEX(%%) ==

(% style="margin-left: 30.0px;" %)
A free license is available for academic use:

(% style="margin-left: 30.0px;" %)
[[https:~~/~~/www.ibm.com/developerworks/community/blogs/jfp/entry/CPLEX_Is_Free_For_Students>>url:https://www.ibm.com/developerworks/community/blogs/jfp/entry/CPLEX_Is_Free_For_Students||rel="nofollow" shape="rect" class="external-link"]]

(% style="margin-left: 30.0px;" %)
== 8.2 Turbomole ==

(% style="margin-left: 30.0px;" %)
If Turbomole is set to use a semi-direct method, it will require fast or very fast I/O to provide a performance boost. Conversely, any slowness in I/O will dramatically affect performance. --This can be overcome by using local fast drives when available, or by creating a RAMdisk on the node. In either case, the content needs to be loaded onto the node with sbcast.-- High throughput and low latencies can be provided with Lustre, by storing the data under /wrk/users/$USER. Turbomole may also benefit from exclusive node reservations and from MPI implementation & performance tuning. Hence, the Slurm flag below may resolve some performance-related issues. However, bear in mind that reserving entire nodes exclusively is not a good idea if only a minor portion of the node's core count is used:

{{{#SBATCH --exclusive}}}

(% style="margin-left: 30.0px;" %)
== 8.3 Matlab ==

(% style="margin-left: 30.0px;" %)
=== 8.3.1 Kilosort ===

(% style="margin-left: 30.0px;" %)
The GPU-based spike sorting program [[Kilosort>>url:https://github.com/MouseLand/Kilosort||shape="rect"]] provides quite good GPU performance and a simple installation. To start, clone the git repo, then load the modules necessary for compilation:

{{{module load gcccuda/2019b MATLAB/2019a CUDA/10.0.130}}}

(% style="margin-left: 30.0px;" %)
You can then start up Matlab in an interactive session (//mostly to avoid loading the login node too much//) and proceed with the compilation:

{{{matlab -nodisplay -nodesktop}}}

{{{>> addpath(genpath('/wrk/$USER/path/here/Kilosort'))
>> cd('/wrk/$USER/path/here/Kilosort/CUDA')
>> run mexGPUall}}}

(% style="margin-left: 30.0px;" %)
Matlab will place your newly compiled packages in:

{{{/wrk/$USER/path/here/Kilosort/CUDA}}}

(% style="margin-left: 30.0px;" %)
Then you have to alter the startup file (//and perhaps keep a specific copy of it elsewhere//) to point to the correct packages:

{{{/wrk/$USER/path/here/Kilosort/main_kilosort3.m}}}

(% style="margin-left: 30.0px;" %)
=== 8.3.2 Phy GUI ===

(% style="margin-left: 30.0px;" %)
For visualization you can use the Phy GUI on the Infiniband-capable VDI machines, which have the Kappa $WRKDIR mounted directly on them. They provide excellent throughput and latencies and a pretty good storage capacity. You can use the [[Conda installation instructions>>url:https://github.com/cortex-lab/phy||shape="rect"]] straight out of the box.

= (% style="color: rgb(51,153,102);" %)9.0 Development Tools(%%) =

(% style="margin-left: 30.0px;" %)
== 9.1 PrgEnv: Python Parallel memory consumption ==

(% style="margin-left: 30.0px;" %)
By default, joblib's Parallel uses the Python multiprocessing module to fork separate Python worker processes to execute tasks concurrently on separate CPUs. This is a reasonable default for generic Python programs, but it induces some overhead, as the input and output data need to be serialized in a queue for communication with the worker processes.

(% style="margin-left: 30.0px;" %)
== 9.2 PrgEnv: MPI Implementations and performance ==

(% style="margin-left: 30.0px;" %)
**IntelMPI** is a commercial MPI implementation developed and supported by Intel. IntelMPI may provide a significant performance boost for certain applications compared to OpenMPI. Based on the MPI 3.1 standard, it supports multiple interconnects. The free runtime environment is available from Intel: [[https:~~/~~/software.intel.com/en-us/intel-mpi-library>>url:https://software.intel.com/en-us/intel-mpi-library||rel="nofollow" shape="rect" class="external-link"]].

(% style="margin-left: 30.0px;" %)
**MVAPICH2** is another MPI implementation, now available under a BSD-like license. It may provide a significant performance boost for some applications compared to OpenMPI, and in some cases over the Intel implementation.

(% style="margin-left: 30.0px;" class="title" %)
== 9.3 PrgEnv: Intel Compilers ==

(% style="margin-left: 30.0px;" %)
Intel Parallel Studio XE compilers are available for free for classroom teaching and student use. [[Please see Intel>>url:https://software.intel.com/en-us/parallel-studio-xe||rel="nofollow" shape="rect" class="external-link"]] for details. For others, [[we have two floating licenses>>doc:it4sci.IT for Science group.Resources for Research.WebHome]].

(% style="margin-left: 30.0px;" %)
== 9.4 Profiling ==

(% style="margin-left: 30.0px;" %)
Intel vTune is available through the [[Parallel Studio XE>>url:https://software.intel.com/en-us/parallel-studio-xe||shape="rect"]] license agreement and is the default profiling tool at this time.

(% style="margin-left: 30.0px;" %)
If you wish to do [[I/O profiling, please see the details for Lustre>>url:https://wiki.helsinki.fi/display/it4sci/Lustre+User+Guide#LustreUserGuide-6.0AdvancedI/OProfiling||shape="rect"]].

(% style="margin-left: 30.0px;" %)
== (% style="color: rgb(77,77,77);font-size: 20.0px;letter-spacing: 0.0px;" %)9.5 Jupyter(%%) ==

(% style="margin-left: 30.0px;" %)
Please see details about the University JupyterHub implementation in the User Guide (//not available yet//).

(% style="margin-left: 30.0px;" %)
[[https:~~/~~/zonca.github.io/2015/04/jupyterhub-hpc.html>>url:https://zonca.github.io/2015/04/jupyterhub-hpc.html||rel="nofollow" shape="rect" class="external-link"]]

(% style="margin-left: 30.0px;" %)
== 9.6 Data Scientists - Julia ==

(% style="margin-left: 30.0px;" %)
Users should install Julia packages locally. They are not yet stable enough to be installed as a global module.

(% style="margin-left: 30.0px;" %)
Julia in an HPC environment and parallel use: [[http:~~/~~/www.stochasticlifestyle.com/multi-node-parallelism-in-julia-on-an-hpc/>>url:http://www.stochasticlifestyle.com/multi-node-parallelism-in-julia-on-an-hpc/||rel="nofollow" shape="rect" class="external-link"]]

(% style="margin-left: 30.0px;" %)
Julia sources, etc.: [[https:~~/~~/github.com/JuliaLang>>url:https://github.com/JuliaLang||rel="nofollow" shape="rect" class="external-link"]]

(% style="margin-left: 30.0px;" %)
Julia vs. Python - optimization: [[https:~~/~~/www.ibm.com/developerworks/community/blogs/jfp/entry/Python_Meets_Julia_Micro_Performance?lang=en>>url:https://www.ibm.com/developerworks/community/blogs/jfp/entry/Python_Meets_Julia_Micro_Performance?lang=en||rel="nofollow" style="text-decoration: underline;" shape="rect" class="external-link"]]

(% style="margin-left: 30.0px;" %)
== 9.7 Environment Modules ==

(% style="margin-left: 30.0px;" %)
You can [[create your own>>url:https://wiki.helsinki.fi/display/it4sci/Module+System#ModuleSystem-1.2Creatingmodulesforyourownsoftware||shape="rect"]] modules.

619 (% style="margin-left: 30.0px;" %)
620 == 9.8 AMD Specific issues ==
621
622 (% style="margin-left: 30.0px;" %)
623 Currently, only Carrington is AMD-based, but this may change in time.
624
625 (% style="margin-left: 30.0px;" %)
626 === 9.8.1 Python libgomp Error (//resource temporarily unavailable//) ===
627
628 (% style="margin-left: 30.0px;" %)
629 You may encounter the following issues with Python and libgomb.
630
631 {{{joblib import Parallel, delayed}}}

(% style="margin-left: 30.0px;" %)
Idiom:

{{{Parallel(n_jobs=num_cores)(delayed(...))}}}

(% style="margin-left: 30.0px;" %)
Resulting error:

{{{libgomp: Thread creation failed: Resource temporarily unavailable}}}

(% style="margin-left: 30.0px;" %)
In this case, set the MKL_THREADING_LAYER variable either before launching the job (//it is inherited by the batch/interactive session//) or prior to starting Python threading:

{{{MKL_THREADING_LAYER=GNU}}}
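A minimal sketch of where to place the fix, assuming a joblib-based job; the Python script name below is a placeholder, not part of this guide:

```shell
# Export the threading layer before Python starts, so that MKL uses the
# GNU OpenMP runtime (libgomp) already present in the environment instead
# of trying to spawn a conflicting one.
export MKL_THREADING_LAYER=GNU

# Then launch Python as usual, for example:
# srun python my_analysis.py    # my_analysis.py is a hypothetical script
echo "$MKL_THREADING_LAYER"     # prints "GNU"
```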

= 10.0 Additional Quirks For Advanced Users =

Here we collect additional ideas for getting the most out of the system. The items listed here are not supported; it is your own responsibility to create and maintain your own creations. Most require some level of Linux understanding, some more than others, and they are not intended for people new to the system.

(% style="margin-left: 30.0px;" %)
== 10.1 Slurm and bash functions ==

(% style="margin-left: 30.0px;" %)
One can do a whole lot with bash functions. For example, if you have ever thought it would be nice to start an interactive session directly from your desktop without first logging into the system (say, when you have $WRKDIR mounted on your Linux desktop), you might consider this kind of approach in your .bashrc:

{{{function ukko2-interactive() {
    ntasks=${1:-8}
    mem=${2:-32}
    workdirectory=/wrk/users/$USER
    ssh -YA $USER@ukko2.cs.helsinki.fi -t "salloc -pshort -t1:00:00 -c${ntasks} --mem=${mem}G srun --chdir=$workdirectory -c${ntasks} --mem=${mem}G --pty /bin/bash"
}
}}}

(% style="margin-left: 30.0px;" %)
If you wish to try it out, you can add the function to your .bashrc and then export it:

{{{export -f ukko2-interactive}}}

(% style="margin-left: 30.0px;" %)
Then just call it and there it goes...

{{{ukko2-interactive 1 1}}}

{{{salloc: Pending job allocation 4457837
salloc: job 4457837 queued and waiting for resources
salloc: job 4457837 has been allocated resources
salloc: Granted job allocation 4457837
bash-4.2$}}}

(% style="margin-left: 30.0px;" %)
(% style="letter-spacing: 0.0px;" %)You can, of course, modify this to suit your needs. As written, it lets you adjust the number of tasks and the amount of memory, while time and partition are predefined. With a little additional work, you could create a batch job submitter that builds a batch script and submits batch jobs directly from your desktop.
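Such a submitter could be sketched roughly as follows. This is an unsupported example: the function name, the default resources, and the submission host usage are assumptions, not a provided tool.

```shell
# Hypothetical sketch of a desktop-side batch submitter helper.
function make-batch-script() {
    local cores=${1:-1} mem=${2:-1}
    # Emit a minimal batch script to stdout; partition/time are illustrative.
    cat <<EOF
#!/bin/bash
#SBATCH -p short
#SBATCH -t 1:00:00
#SBATCH -c ${cores}
#SBATCH --mem=${mem}G
hostname
EOF
}

# Submit from the desktop by piping the generated script over ssh:
# make-batch-script 4 8 | ssh $USER@ukko2.cs.helsinki.fi sbatch
```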

(% style="margin-left: 30.0px;" %)
(% style="letter-spacing: 0.0px;" %)If you need to access the environment from outside the university network, you need to include a jump host in the above:

{{{ssh -J $USER@pangolin.it.helsinki.fi}}}

(% style="margin-left: 30.0px;" %)
== (% style="letter-spacing: 0.0px;" %)10.2 Code optimization(%%) ==

(% style="margin-left: 30.0px;" %)
You should have a look at [[NLOpt>>url:https://nlopt.readthedocs.io/en/latest/||shape="rect"]] for your code optimization needs. The package is not globally available at this time, but you can easily compile it into your $PROJ. We will consider making it publicly available at a later time.

(% style="margin-left: 30.0px;" %)
You can find I/O optimization and profiling hints in the [[Lustre User Guide>>url:https://wiki.helsinki.fi/display/it4sci/Lustre+User+Guide#LustreUserGuide-6.0AdvancedI/OProfiling||shape="rect"]].

(% style="margin-left: 30.0px;" %)
== 10.3 Array Example: Boosting up Data Transfer ==

(% style="margin-left: 30.0px;" %)
Why not move your files in parallel in a batch job?

(% style="margin-left: 30.0px;" %)
~-~-array specifies the number of batch tasks you would like to have. This should match the number of lines in arrayparams.txt, where you define the transfers. The script assumes that you use rsync, but this can be modified as required.

{{{#!/bin/bash
#SBATCH -c 1
#SBATCH -t 1-0
#SBATCH -p short
#SBATCH --array=1-528

n=$SLURM_ARRAY_TASK_ID                  # define n
line=`sed "${n}q;d" arrayparams.txt`    # get n:th line (1-indexed) of the file
echo $line
time rsync -rah $line}}}
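The way each array task picks its own line is the sed "${n}q;d" idiom: it prints only the n:th line of a file (1-indexed) and then quits, and each task uses its $SLURM_ARRAY_TASK_ID as n. A quick local demonstration with a throwaway parameter file:

```shell
# Demonstrate the line-picking idiom outside Slurm with a temp file.
params=$(mktemp)
printf 'src1 dst1\nsrc2 dst2\nsrc3 dst3\n' > "$params"
n=2
line=$(sed "${n}q;d" "$params")   # print line n, then quit
echo "$line"                      # prints "src2 dst2"
rm -f "$params"
```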

(% style="margin-left: 30.0px;" %)
File arrayparams.txt would contain the directories to transfer...

{{{/scratch/$USER/foo /wrk/$USER/foo
/scratch/$USER/bar /wrk/$USER/bar
/scratch/$USER/oof /wrk/$USER/oof
/scratch/$USER/rab /wrk/$USER/rab
....}}}
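If you have many directories to transfer, a small helper can generate such a file so that its line count always matches the ~-~-array range. This is an unsupported sketch; the function name and paths are illustrative:

```shell
# Print "source destination" pairs, one per line, for every subdirectory
# of $1; redirect the output to arrayparams.txt and use the line count
# as N in --array=1-N.
function make-arrayparams() {
    local src=$1 dst=$2
    local d name
    for d in "$src"/*/; do
        [ -d "$d" ] || continue          # skip if no subdirectories match
        name=$(basename "$d")
        echo "$src/$name $dst/$name"
    done
}

# e.g. make-arrayparams /scratch/$USER /wrk/$USER > arrayparams.txt
```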

(% style="margin-left: 30.0px;" %)
Then just submit the batch job and it does the I/O ops in parallel while you wait...

(% style="margin-left: 30.0px;" %)
P.S. You can use this array template for many, many different purposes; use it as a template.

= (% style="letter-spacing: 0.0px;" %)11.0 Containers(%%) =

(% style="letter-spacing: 0.0px;" %)Singularity is available on Ukko2 and Kale compute nodes, ready to be used. Note that the login node does not have Singularity installed, because we do not wish containers to be run there. At this time we do not have our own examples of how to build containers, nor do we have ready-made containers yet. However, you can, for example, use Docker to develop containers and then use Singularity to run them on the clusters.

When you use Singularity containers on the clusters, please set the cache directory environment variable to point to a proper place (e.g. /wrk/users/$USER). The default points to your $HOME, and building a container there will fill your quota quickly:

{{{export SINGULARITY_CACHEDIR=/wrk/users/$USER}}}

Note that /wrk is not automatically visible to the running container. You need to specify two options: first clear the environment, then bind /wrk for the running environment (**Important to note**: __an interactive session on a compute node sees /wrk normally, while a batch job does not__):

{{{srun singularity run --cleanenv -B /wrk:/wrk <actual container stuff here>}}}

(% style="letter-spacing: 0.0px;" %)Here are pretty good [[building instructions>>url:https://singularity.lbl.gov/docs-build-container||shape="rect"]] for the time being.

= (% style="letter-spacing: 0.0px;" %)12.0 Scientist's Full Use Case: Trinotate Annotation of the Transcriptome of An Organism(%%) =

Work in progress... (Juhana)

This tutorial describes a full use case of a scientist. We attempt an automatic annotation of the transcriptome of a mammal genome using a pipeline called Trinotate ([[https:~~/~~/github.com/Trinotate/Trinotate.github.io/wiki>>url:https://github.com/Trinotate/Trinotate.github.io/wiki||shape="rect"]]). We describe everything from installing the individual pieces of software needed by the Trinotate pipeline to transferring the data and visualizing the results.

1. Install Trinotate: [[https:~~/~~/github.com/Trinotate/Trinotate.github.io/wiki>>url:https://github.com/Trinotate/Trinotate.github.io/wiki||shape="rect"]]

= 13.0 HPC Garage FAQ =

=== 13.1 My Tmux does not work, or Socket does not work on Lustre ===

Tmux does not like a /tmp directory on Lustre. If the socket is moved to another /tmp location, it works. One should avoid creating sockets on Lustre.

{{{alias tmux="tmux -S /run/user/$UID/tmux-socket"}}}

=== 13.2 Can one use Zarr or HDF5 on Lustre? ===

(% style="letter-spacing: 0.0px;" %)Yes, you can; these are like filesystems in a file and cause no issues at all. You may see some performance difference by adjusting the stripe count; lfs getstripe -c filename shows the current striping. See the [[Lustre user guide>>doc:it4sci.IT for Science group.HPC Environment User Guide.Lustre User Guide.WebHome]] (%%)for examples of how to alter stripe settings.

=== 13.3 I use Conda but Cannot Remember Commands ===

You are in luck: a [[Cheat Sheet>>url:https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf||shape="rect"]] is available for download. It is a handy helper for the most common, and some less common, commands and options for Conda users.

\\