Pilot for changing work culture in Meilahti

Last modified by smaisala@helsinki_fi on 2024/02/08 06:49

Background work

Find volunteers who are willing to change their work practices
We will not target people who do not wish to adapt their workflows to increasing complexity.
- We'll focus on users who are willing to adapt, e.g. for personal reasons (personal development).
Current situation
- Storage:
  - sensitive: NetApp, Umpio, local disks?
  - non-sensitive: NetApp, Allas, local disks, kappa
- Applications and OS
  - Applications
    - Windows ?? → wild wild west
    - Linux: packages and local installations

  - cPouta and ePouta users: Is it possible to move all applications under Modules?
    - ePouta and cPouta have heterogeneous OSes available → standardize the OS → general module repository
  - Windows/Linux users: Is there a way to introduce VDI-Linux/HPC as an alternative solution for analysis?

We cannot change the current data handling workflow (pipeline) to something else
- We can change things under the hood:
  - how data is copied, moved or analyzed
  - automation of some critical 'hand-made' processes
  - where data is analyzed (HPC, VDI, VDI-GPU)?
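
As an illustration of automating one 'hand-made' step without changing the pipeline itself, the sketch below copies a file and verifies the copy by checksum. It is only a minimal example under assumptions: the function names `sha256_of` and `copy_verified` are hypothetical, and the real pipeline's copy steps have not been mapped yet.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large raw-data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source: Path, target_dir: Path) -> Path:
    """Copy source into target_dir and verify the copy by checksum.

    Hypothetical helper: stands in for a copy step that is done by hand today.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Checksum mismatch after copying {source}")
    return target
```

The user still sees the same files in the same places; only the copy step becomes repeatable and checked.
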
People do not want any changes!
People do not want to learn anything new unless it is necessary → Show them the benefits of the new analysis approach
Current culture is afraid of limitations:
- HPC batch job queue for analysis
- Storage quota
- etc.
Can we give users tools that improve their analysis and research?

Pilot case
- Jessica Lucenius case (Eläintalli, Neurotieteiden tutkimuskeskus)
- Current issues:
  - They don't have storage where the team can run the analysis from multiple locations
  - Space needed: 100 TB (will LSDCC solve this in the future?)
    - Question to solve: How I/O-intensive is the analysis?
- Solution:
  - Create an IDM group with access to the kappa-wrk Lustre storage, which can be mounted under Windows and Linux
  - Question: Do they need backups of the raw data, and where will they be stored?
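
One possible way to approach the I/O-intensity question on a Linux machine is to sample the kernel's per-process counters in `/proc/<pid>/io` before and after an analysis run. The sketch below is only a starting point under assumptions: the helper names are hypothetical, it needs permission to read the target process's `/proc` entry, and `read_bytes`/`write_bytes` are documented as accurate for block-backed filesystems, so on network storage the syscall-level `rchar`/`wchar` counters may be the more useful ones.

```python
from pathlib import Path


def parse_proc_io(text: str) -> dict:
    """Parse the 'key: value' lines of /proc/<pid>/io into integers."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            counters[key.strip()] = int(value.strip())
    return counters


def read_process_io(pid: int) -> dict:
    """Read the I/O counters of a running process (Linux only)."""
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


def io_delta(before: dict, after: dict) -> dict:
    """Return the per-counter difference between two samples."""
    return {key: after[key] - before[key] for key in before if key in after}
```

Taking one sample at the start of the analysis and one at the end, then dividing the deltas by the elapsed time, gives a rough bytes-per-second figure to compare against what the storage can deliver.
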
Next steps
- Figure out current pipeline and the data flow
- Figure out where we can fork data to an unchanging target location (CEPH/ALLAS?) → current pipeline; identify the bottlenecks and manual steps of the pipeline
- First need is