Background work

  • Get volunteers to adapt their work practices

    How to handle data
  • sensitive
  • non-sensitive
  • We do not care about people who do not wish to adapt their workflows to increasing complexity.
  • We'll focus on users who are willing to adapt, whether for personal reasons (e.g. personal development) or otherwise.
  • Current situation

    • Storage:
      • sensitive: NetApp, Umpio, local disks?
      • non-sensitive: NetApp, Allas, local disks, kappa
    • Applications and OS
      • Applications
        • Windows ?? → wild wild west
        • Linux: packages and local installations
      • cPouta and ePouta users: Is it possible to move all applications under Modules?
        • ePouta, cPouta heterogeneous OS available → standardize OS → general module repository
      • Windows/Linux users: Is there a way to introduce VDI-Linux/HPC as an alternative solution for analysis?
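If applications were standardized under a general module repository, day-to-day use could look like the following sketch (Environment Modules style; the module name and repository path are illustrative, not an existing setup):

```shell
# List software published in the shared module repository
module avail

# Load one application version into the current shell
module load r-env/4.3.1      # illustrative module name

# Inspect what the module changes (PATH, LD_LIBRARY_PATH, ...)
module show r-env/4.3.1

# On cPouta/ePouta VMs, point clients at a shared modulefile tree
module use /shared/modulefiles   # illustrative path
```

The same `module load` lines would then work identically on any VM regardless of its base OS, which is the point of standardizing the OS first.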

Pitfalls/ Challenges

  • We cannot change the current data handling workflow (pipeline) itself
    • We can change things under the hood:
      • how data is copied, moved, or analyzed
      • automation of some critical manual ('hand-made') processes
      • where data is analyzed (HPC, VDI, VDI-GPU)
  • People do not want any changes!
  • People do not want to learn anything new unless it's necessary → show them the benefits of the new analysis approach
  • Current culture is afraid of limitations:
    • HPC batch job queue for analysis
    • Storage quota
    • etc.
  • Can we give users tools that improve their analysis and research?
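Automating one of the critical 'hand-made' copy steps could be sketched like this, assuming an object-storage target such as Allas reachable through a preconfigured rclone remote (the remote name, bucket, and paths below are illustrative):

```shell
#!/bin/sh
# Sketch: replace a manual copy step with a repeatable sync to
# object storage. "allas:" must be an rclone remote configured
# beforehand; bucket and paths are illustrative.
set -eu

SRC=/scratch/project/results
DEST=allas:my-project-bucket/results

# Copy only files that are missing or changed at the destination
rclone copy --checksum "$SRC" "$DEST"

# Verify the transfer before anything is removed locally
rclone check "$SRC" "$DEST"
```

A script like this can run from cron or at the end of an analysis job, which removes one of the error-prone manual steps without touching the pipeline itself.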


  • Pilot case
    • Jessica Lucenius case (Eläintalli, Neurotieteiden tutkimuskeskus)
    • Current issues:
      • They don't have storage where the team can run the analysis from multiple locations
      • Space needed: 100 TB (will LSDCC solve this in the future?)
        • Question to solve: how I/O-intensive is the analysis?
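One way to answer the I/O question before committing to a storage backend is a synthetic benchmark on the candidate mount, for example with fio (the directory, sizes, and job parameters below are illustrative):

```shell
# Sequential read/write throughput on the candidate storage mount
fio --name=seqrw --directory=/mnt/kappa-wrk/benchmark \
    --rw=rw --bs=1M --size=4G --numjobs=4 --group_reporting

# Small random I/O, closer to many-small-file analysis workloads
fio --name=randrw --directory=/mnt/kappa-wrk/benchmark \
    --rw=randrw --bs=4k --size=1G --numjobs=4 --iodepth=16 \
    --ioengine=libaio --group_reporting
```

Comparing the sequential and random results against the pipeline's actual access pattern would show whether Lustre, NetApp, or object storage is the better fit.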
    • Solution:
      • Create an IDM group with access to the kappa-wrk Lustre storage, which can be mounted under both Windows and Linux
      • Question: do they need backups of the raw data, and where will they be stored?
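The dual-platform mount might look like the sketch below. Server names, filesystem name, and share name are illustrative; since there is no native Windows Lustre client, the Windows side is assumed to go through an SMB gateway that re-exports the Lustre tree:

```shell
# Linux: mount Lustre directly with the Lustre client
# (mgs host and fsname are illustrative)
sudo mount -t lustre mgs.example.csc.fi@tcp:/kappa /mnt/kappa-wrk

# Windows (cmd.exe): map the same data via an SMB gateway
# that re-exports the Lustre tree (share name illustrative):
#   net use K: \\smb-gw.example.csc.fi\kappa-wrk /persistent:yes
```

Access control would then hang off the IDM group on both paths: POSIX group permissions on the Lustre side, and the gateway mapping the same identities for SMB clients.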
  • Next steps
    • Figure out current pipeline and the data flow
    • Figure out where we can fork data to an unchanging target place (CEPH/Allas?) → review the current pipeline and identify bottlenecks and manual steps
    • First need is