2020-06-04 Digi task force meeting

Last modified by Xwiki VePa on 2025/02/04 07:00

Time: 4.6.2020, 10.00 – 12.00

Place: Online – Teams

Participants: Anne Koivunen, Esko Piirainen, Jaana Haapala, Jaakko Hyvönen, Zhengzhe Wu, Jere Kahanpää, Sanna Laaka-Lindberg, Sanna Laine, Hanna Laakkonen, Henry Väre, Anniina Kuusijärvi (secretary)

The current status of the Workflow Process description

  • Link: https://wiki.helsinki.fi/display/LUOMUSdigi/Process+descriptions
  • Sanna L-L and Sanna L have described some processes and will send the process descriptions asap; we can use these as examples of what the process descriptions could look like
    • Bryophyte old data databasing
    • GPI
    • Brotherus digitisation
    • Currently in Finnish; preferably to be translated into English
    • Anniina will create a template based on those descriptions, if needed

Improvement of Multispecimen Workflow

  • See Zhengzhe's notes: Plant digi line issues 2020-06-04
  • There have been issues with the plant digi line: a sheet can hold multiple specimens and QR codes, but the image gets linked to only one QR code
  • It takes a long time for the computer to process these images, because trying to find all QR codes on a sheet is computationally heavy
  • Suggestion: introduce a new step to the digitisation workflow: mark multi-specimen sheets when digitising
    • The operator would write down at least the number of QR codes in the picture (if there is more than one), and maybe also the QR codes of the specimens, if it does not take too much time
    • Technical details need to be discussed: how to do this in Kotka import sheets
  • Zhengzhe will make changes to the software and write down instructions, after the technical details have been confirmed with Kotka
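
The suggested step could be sketched roughly as follows. This is only an illustration, not the plant digi line software or the Kotka import format: the record fields, the function name and the QR code values are all hypothetical. It assumes the operator records the number of QR codes on the sheet (and optionally the codes themselves), so that one image can be linked to every specimen and sheets with a mismatch can be flagged for manual review instead of an exhaustive QR search.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SheetImage:
    """One photographed sheet as the operator might record it (hypothetical fields)."""
    filename: str
    expected_qr_count: int = 1  # number of QR codes the operator counted on the sheet
    operator_qr_codes: List[str] = field(default_factory=list)  # optional, typed in by hand


def link_image(sheet: SheetImage, decoded_qr_codes: List[str]) -> Dict[str, object]:
    """Link one image to every specimen on the sheet and flag mismatches."""
    # Merge machine-decoded and operator-entered codes, keeping order, dropping duplicates.
    codes = list(dict.fromkeys(decoded_qr_codes + sheet.operator_qr_codes))
    return {
        "filename": sheet.filename,
        "specimens": codes,
        # If the number of known codes differs from the operator's count, a person should check.
        "needs_review": len(codes) != sheet.expected_qr_count,
    }


# Example with placeholder values: the operator counted two codes, only one was decoded.
result = link_image(SheetImage("sheet_0001.tif", expected_qr_count=2), ["QR-A"])
print(result["needs_review"])
```

With the operator's count available, only the sheets marked as multi-specimen would need the heavier search for additional QR codes.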

Missing Kotka images

  • See Zhengzhe's notes: Plant digi line issues 2020-06-04
  • Current situation in Kotka: ca. 15 000 pictures are still missing, and the majority of them are multi-specimen images. Need to investigate what the rest are and what to do with them.
  • Update to the situation: Zhengzhe has processed the images included in the messy images dataset, but found only 2000 pictures where about 8000 were expected. Processing of all images has started, and about 1000 more multi-specimen images were found among the first 100 000 processed images. There are still 400 000 images to go; after that we can discuss the ones that are really missing, investigate why they are missing, and then improve the workflow.
    • Plantago lanceolata images (a digitisation-on-demand job that was done separately) are also included in the count of specimens missing an image. Where are those images?

Update on Brotherus digitisation processes

  • Sanna L and Jaana take photos of the specimens with a camera system; each image is saved as TIFF on the P drive
  • Output is about 50 images per day; about 2000 specimens have been digitised so far
  • The P drive is not optimal; IDA could be used instead
    • command line tools or web user interface
  • Would it be possible to get the same kind of file transfer system as we have for the plant lines? Then there would be no need for manual file uploads.
  • Data transcription has also been done; the data is not yet in Kotka but will be added later.
    • Timeline unclear, maybe start uploading data to Kotka this autumn
  • Need to link the data to the images, keep the originals in IDA and show a smaller version in Kotka through our image API. The technical details need to be discussed further later.
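
As a small illustration of the "smaller version" idea, the sketch below computes the pixel size of a downscaled copy while keeping the aspect ratio. It is only an assumption of how such a derivative could be sized: the function name and the 2000-pixel limit are hypothetical and do not describe the actual image API or IDA.

```python
from typing import Tuple


def derivative_size(width: int, height: int, max_side: int = 2000) -> Tuple[int, int]:
    """Return the size of a smaller copy whose longest side is at most max_side pixels."""
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        # The original is already small enough, so it is not enlarged.
        return width, height
    # Scale both sides by the same factor to preserve the aspect ratio.
    return (
        max(1, round(width * max_side / longest)),
        max(1, round(height * max_side / longest)),
    )


# Example: a 6000 x 4000 pixel original with the assumed 2000-pixel limit.
print(derivative_size(6000, 4000))
```

The original TIFF would stay untouched in long-term storage; only the smaller copy would be served for viewing.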

High-resolution photos (incl. GPI) and image loans

  • Workable long-term solution: IDA; Tike?
  • Need to delete old images from Kotka and replace them with the new ones. Specimen data has been updated for lichens and bryophytes, but not yet for vascular plants. Images have not been updated yet for any material.
  • Maybe we could have a similar solution for these as for Brotherus, described above?
  • Need to check if IDA can be used for sending temporary image links to the loanees (John?)
  • Anniina will add "Virtual loan" as an option to transaction type in Kotka as a first aid for registering virtual loans
  • Do we actually need to store the TIFFs, or are the smaller images enough?
    • For the JSTOR and GPI projects the high-resolution images were needed
    • Sanna L et al. need to discuss with Soili Stenroos and others whether we still need to send images to JSTOR in the future
    • Type specimens: for types it is important to store the TIFFs, because all possible quality is needed
    • Depends on the research use: for plants, for example, one needs to look at hairs and other small details, so high quality is needed
    • Digitisation-on-demand idea: take lower-quality photos during mass digitisation and provide better-quality photos if someone asks
    • Keep originals for types and stack images, for example; if we run out of space, get rid of the large (TIFF) files of mass-digitised bulk material
  • Possible to add links to full size images in Kotka or laji.fi?
    • At least information about the availability of full size image should be available in laji.fi and Kotka

Communications

  • What is the best way/platform for this?

    • Use Wiki for archiving files, instructions, memos, etc.: https://wiki.helsinki.fi/display/LUOMUSdigi/luomus-digi+Home, active use encouraged!
    • Anne will create a new Team for us in Teams, to use as a discussion forum and to use for the online meetings
    • Slack was not easy to use for all
    • Online meetings have been nice

Other matters

  • Also 3D images were discussed briefly
    • No time for this right now, but something to keep in mind in the data storage discussions
    • Storing of data and raw data, how to show it online, etc.