0. Introduction





These are instructions for drafting a data management plan, which is separate from a research plan. However, particularly in research which is based on collecting and analysing data, a research plan and data management plan may be closely interconnected and often overlap.

The main difference between a research plan and a data management plan is that while the research plan describes which data will be used in the research, as well as why and how the data will be used, the data management plan lays out how the data will be managed, and how further use of the data is enabled in the course of research.

These instructions supplement the general data management plan guidelines as they pertain to datasets which contain sensitive personal data. Not all of the protective measures described in these instructions will be relevant if the personal data is not deemed sensitive.

Personal data means any information relating to an identified or identifiable natural person. An identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier, or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

Special categories of personal data, which are particularly sensitive (Articles 9 and 10 of the GDPR), include:

  1. Racial or ethnic origin
  2. Political opinion
  3. Religion or beliefs
  4. Trade union membership
  5. Genetic or biometric data processed for the purpose of uniquely identifying a person
  6. Health information
  7. Sexual behaviour or orientation
  8. Criminal convictions and offences

Purely to make this guide easier to understand, we refer to the data described above as "sensitive personal data". However, the exact legal term is "special categories of personal data".

The processing of personal data is governed by the EU's General Data Protection Regulation (GDPR), along with the Data Protection Act that supplements it. The purpose of the new legislation is to improve people's opportunities to decide how information about them is processed, and it also has implications for how personal data is processed in research. New features include the accountability requirement, which means that the controller or processor of the personal data must in future demonstrate in writing that they comply with data protection legislation and the principles of processing personal data while ensuring the legal rights of the data subjects. In addition, there are changes to the rules governing how personal data collected with the consent of the subject can be used.

There are also organisation-specific instructions for many stages of the processing of personal data which must be followed.

Data management planning is particularly important when processing datasets containing personal data, as it allows you to protect your rights and the rights of your organisation, as well as the rights of your research subjects. A breach of data protection legislation may result in administrative sanctions, criminal liability and liability for damages. Letting personal data fall into the wrong hands may cause serious harm to the research subject.

Further information: The Data Protection Ombudsman's office is currently drafting instructions on applying the new data protection legislation:

A version open for comments can be found here:

1. General description of data


1.1 What kinds of data is your research based on? What data will be collected, produced or reused? What file formats will the data be in?

The data management plan should describe the kind of personal data the collection and analysis methods generate. The justifications for the research and the reasons for collecting and processing personal data should be included in the research plan. 

Describe all relevant data sources in the data management plan. For example, list the people or groups of people, authorities and registers involved in the research.

For each data source:

Please note that when you collect personal data or sensitive information, you must also ensure the security of the media used to collect and transport the data. A more detailed description of this is included in section 4.1.

1.2 How will the consistency and quality of data be controlled?

Consider the quality of the data throughout its life cycle, from collection to publication and archiving. What are the biggest risks and how will they be managed? Does the collection of data which contains personal information involve elements that require special attention in relation to the quality of the data? (Information security is covered in section 4.1.)


  • Consider when the data should be protected with a code or whether it should be anonymised.
  • Remember the difference between anonymised and pseudonymised data.
  • Consider whether anonymisation or pseudonymisation will impact the quality of the data. Will the data still be useful after anonymisation?
  • Remember to ensure that no valuable information is lost if the data is made less specific.
  • Recording metadata and using metadata standards are also quality measures and should be entered in more detail in section 3, "Documentation and metadata" of your plan.
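The difference between pseudonymised and less specific data can be illustrated with a minimal Python sketch. It assumes that direct identifiers are replaced with a keyed hash (HMAC-SHA256) and that exact ages are generalised into ten-year bands. The record, the field names and the key are hypothetical, and this is only one possible technique, not a method prescribed by these instructions. Because anyone holding the key can recompute the codes, the result is pseudonymised, not anonymised.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable code for an identifier using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability (an illustrative choice).
    return digest.hexdigest()[:16]


def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band, e.g. 37 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Hypothetical key and record. In practice the key would be generated randomly
# and stored separately from the data, under access control.
SECRET_KEY = b"replace-with-a-randomly-generated-key"
record = {"name": "Jane Doe", "age": 37, "diagnosis": "asthma"}

pseudonymised = {
    "subject_code": pseudonymise(record["name"], SECRET_KEY),
    "age_band": age_band(record["age"]),
    "diagnosis": record["diagnosis"],
}
```

Generalising the age makes the record less identifying but also less precise, which is the quality trade-off the points above ask you to weigh.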

2. Ethical and Legal Compliance


2.1 What ethical issues are related to your data management, for example, in handling sensitive data, protecting the identity of participants, or gaining consent for data sharing?

Indicate in your plan who, or what organisation, is the controller of the data you collect or produce.

Also indicate who the processors are who process the personal data on behalf of the controller. The processing of personal data means any operation which is performed on personal data, such as collection, recording, organisation, use, storage, adaptation or alteration, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction.

Data processing also includes cases in which parties outside the organisation or research project analyse samples. Processing agreements must be drafted with such third parties.

The processor must take protective steps to safeguard the rights of the data subject. Such protective measures include:

    • pseudonymisation
    • anonymisation
    • sufficient safeguards: technical restrictions, use monitoring, described in section 4 of the plan
    • training, instructions, regulations, commitments and agreements
    • processes, practices and certificates
    • data encryption
    • audits

Data protection impact assessment

Your plan should indicate how the impact assessment will be carried out.

The impact assessment describes how the personal data will be processed, assesses the necessity and proportionality of the processing, and assesses the risks resulting from the processing as well as the measures necessary to address them. An impact assessment is required when the processing of personal data is likely to carry a high risk. Its purpose is to help the controller comply with the requirements of the GDPR and to demonstrate this compliance. The data protection impact assessment should begin as early as possible when the processing of personal data is being planned. The assessment must be monitored continuously and updated whenever necessary.


  • Refer to the data protection instructions for your organisation.
  • Refer to your organisation's instructions on processing contracts.
  • Refer to the impact assessment instructions of your organisation and the office of the Data Protection Ombudsman.


2.2 How will data ownership, copyright and Intellectual Property Rights (IPR) issues be managed? Are there any copyrights, licenses or other restrictions which prevent you from using or sharing the data?

The ownership, copyright and intellectual property rights of the data must also be recognised. This is particularly important for sensitive data of any kind.


  • Carefully read the terms of use for all of the IT services you use.
  • Written agreements regarding data ownership, use rights and publication authorship help ensure data protection.

3. Documentation and metadata


3.1 How will you document your data in order to make it findable, accessible, interoperable and re-usable for you and others? What kind of metadata standards, README files or other documentation will you use to help others to understand and use your data?


  • In the description of variables, mention whether the variable contains personal or sensitive data. Refer to, e.g., the Data Management Guidelines.
  • Even if your research data contains personal data, you may publish the metadata if it contains no identifiers which could be used to identify the research subject.

4. Storage and backup during the research project


4.1 Where will your data be stored, and how will it be backed up?

If your research involves collecting or using personal data or sensitive personal data:

  • Consider the requirements of the party disclosing or transmitting the data as early as possible
  • Draft the statutory risk assessment, indicating the information security measures required

Data protection measures include:
  • Backup copies: ensure the ability to recover after a systems failure
  • Access control: who is granted access and on what grounds, and how is the access restricted? This is described in more detail in section 4.2.
  • Encryption: whenever necessary. Encryption is especially recommended for mobile devices, laptop computers and external storage devices.
  • Monitoring: both a technical log and monitoring of data processing and use, described in more detail in section 4.2.
  • Protecting the technical environment: how can the processing environment be protected from third parties
  • Personnel security: orientation of research group members, data protection and information security training, instructions and shared practices
  • Facility security: locks on work spaces, storage furniture, camera surveillance and access control, described in more detail in section 4.2.


  • Whenever possible, use the protected processing environments recommended by the controllers.
  • Remember that the transfer of personal data outside the EU and EEA has been restricted.
  • Bear in mind that consent forms also contain personal data.


4.2 Who will be responsible for controlling access to your data, and how will secured access be controlled?

Access control: who is granted access and on what grounds, how is the access restricted, and who is responsible for access control?

  • A person must be designated to be in charge of access control
  • A list of granted access rights and users must be drafted
  • Access is only granted when needed, and the access must be as limited as possible.
  • The user's need and basis for accessing the data must be inspected before granting access
  • A system must be in place for revoking and deleting access rights
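The access control requirements listed above could be recorded in a simple register, sketched below in Python. The class names, fields and example values are hypothetical and are not part of any organisation's system. The sketch only shows that each grant carries a documented justification and a responsible person, and that access can be revoked.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AccessGrant:
    user: str
    dataset: str
    justification: str  # the user's need and basis for accessing the data
    granted_by: str     # the designated person in charge of access control
    granted_on: date
    revoked_on: Optional[date] = None


@dataclass
class AccessRegister:
    grants: List[AccessGrant] = field(default_factory=list)

    def grant(self, user: str, dataset: str, justification: str,
              granted_by: str, on: date) -> AccessGrant:
        # Refuse to record access that has no documented justification.
        if not justification.strip():
            raise ValueError("access requires a documented justification")
        new_grant = AccessGrant(user, dataset, justification, granted_by, on)
        self.grants.append(new_grant)
        return new_grant

    def revoke(self, user: str, dataset: str, on: date) -> None:
        # Mark every active grant for this user and dataset as revoked.
        for g in self.grants:
            if g.user == user and g.dataset == dataset and g.revoked_on is None:
                g.revoked_on = on

    def has_access(self, user: str, dataset: str) -> bool:
        return any(
            g.user == user and g.dataset == dataset and g.revoked_on is None
            for g in self.grants
        )
```

Revoked grants are kept in the register with a revocation date instead of being deleted, so the list also documents who had access in the past.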

Monitoring: this means both a technical log and procedures for monitoring the processing and use of the data.

  • Consider how the use of the data will be monitored over the course of the research.
    • Where and in what ways will the data be processed?
    • Where and for whom can it be copied?
    • Who can transfer data outside the research group and on what grounds?  Remember that this must be in line with the consent from the data subjects if the data has been made available based on consent.
    • Examine whether, and describe how, the technical tools used can keep a log of who used which data and when. Ask your organisation's IT support for use and change logging.
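Where the tools in use do not provide logging themselves, a use log can be as simple as one structured line per event. The following Python sketch assumes a JSON-lines format with hypothetical field names (`user`, `dataset`, `action`) and uses an in-memory stream as a stand-in for an append-only log file. It is an illustration only and does not replace the logging offered by your organisation's IT support.

```python
import io
import json
from datetime import datetime, timezone


def log_access(stream, user: str, dataset: str, action: str) -> dict:
    """Append one log entry recording who did what to which dataset, and when."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "dataset": dataset,
        "action": action,
    }
    stream.write(json.dumps(entry) + "\n")
    return entry


def entries_for_dataset(log_text: str, dataset: str) -> list:
    """Return all log entries that concern the given dataset."""
    entries = [json.loads(line) for line in log_text.splitlines() if line.strip()]
    return [e for e in entries if e["dataset"] == dataset]


# Hypothetical users and dataset names.
log = io.StringIO()  # stand-in for an append-only log file
log_access(log, "researcher_a", "interviews_pseudonymised", "read")
log_access(log, "researcher_b", "interviews_pseudonymised", "export")
log_access(log, "researcher_a", "survey_raw", "read")

interview_events = entries_for_dataset(log.getvalue(), "interviews_pseudonymised")
```

Note that such a log itself contains personal data about the users and should be protected accordingly.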

Facility security: locks on work spaces, storage furniture, camera surveillance and access control.

  • A person must be designated to be in charge of access control
  • A list of holders of access rights and keys must be drafted
  • Which doors are locked or are lockable between the work space and the outside?
  • Are there theft-proof storage facilities or furniture available in the work spaces for documents, other analogue material and external storage devices?
  • Is camera surveillance available?

You will need:

  • A document that complies with the accountability requirement of the GDPR
  • A statement of data protection measures


5. Opening, publishing and archiving the data after the research project


5.1 What part of the data can be made openly available or published? Where and when will the data, or its metadata, be made available?

Material containing personal data can only be released once it has been anonymised. Pseudonymised data still constitutes personal data and consequently cannot be released. Material which contains personal data may, however, be shared with interested parties upon request for the purpose cited in the original basis for processing.

The basis for processing material containing personal data, for example a statutory reason or consent, may restrict the ways the data can be used later.

Acceptable ways to release or publish material which contains personal data include:

  1. The data is anonymised and released into a data archive with an appropriate level of data protection
  2. Only the metadata for the material is published in a suitable research database or data repository.


  • Key metadata for material containing personal data should be released even if the material itself cannot be.
  • Pseudonymised data is still personal data, and cannot be released for further use. However, further use of the material may be possible by request.
  • Further use of the material may require that new consent be requested from the research subject.


5.2 Where will data with long-term value be archived, and for how long?

When drafting an archiving plan, it is important to consider which parts of the material will be archived, and for what period of time. It is also important to decide which parts will be destroyed and how this can be done securely.

Traditionally, the recommendation has been to destroy all sensitive data after the research project, as storing it carries risks and requires special arrangements. Other unnecessary files and intermediate files generated by IT systems must also be deleted once they are no longer necessary.

Just deleting a file and emptying the recycle bin on the computer does not mean that the file has been permanently destroyed. It is possible to retrieve deleted files even after the hard disk has been reformatted. A variety of methods exist for permanently destroying data, such as applications that overwrite the data and devices that demagnetise (degauss) the hard disk. It is also possible to mechanically crush the storage device so that it cannot be read.
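The overwriting approach mentioned above can be sketched in Python as follows. The function name and the default of a single pass are assumptions made for illustration. Overwriting a file in place is not reliable on all storage: on SSDs, on journaling or copy-on-write file systems, on network drives and in backups, older copies of the data may survive. Treat this as an illustration of the principle and follow your organisation's guidelines for actual destruction.

```python
import os


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite a file's contents with random bytes, then delete the file."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                # Write in chunks of at most 1 MiB to limit memory use.
                chunk = min(remaining, 1024 * 1024)
                f.write(os.urandom(chunk))
                remaining -= chunk
            # Ask the operating system to push the data to the storage device.
            f.flush()
            os.fsync(f.fileno())
    os.remove(path)
```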

Archiving material that contains sensitive personal data requires permission from the National Archives, and the data must be minimised before archiving. Any later use of such material requires a research permit.


  • Please remember that the anonymisation and destruction or archiving of the data must be done by the deadline of the research permit.
  • Genuine anonymisation requires that there is no possibility of either direct or indirect identification, and that the code key is destroyed.
  • Data relating to samples may be archived in a biobank.
  • Many universities and public authorities have their own internal guidelines for destroying storage devices.
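One way to look for a risk of indirect identification is to count how many records share each combination of indirectly identifying variables (a k-anonymity style check). The Python sketch below uses hypothetical rows and variable names. A smallest group size of 1 means that at least one person is unique on those variables. A larger value does not prove that the data is anonymous, because other variables or external data may still allow identification.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group of rows sharing the same values
    for the given indirectly identifying variables (0 if there are no rows)."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values()) if counts else 0


# Hypothetical, already generalised example data.
rows = [
    {"age_band": "30-39", "postcode_area": "001", "diagnosis": "asthma"},
    {"age_band": "30-39", "postcode_area": "001", "diagnosis": "diabetes"},
    {"age_band": "40-49", "postcode_area": "002", "diagnosis": "asthma"},
]

k = smallest_group_size(rows, ["age_band", "postcode_area"])
```

In this example the third row is the only one in its age band and postcode area, so that person could be singled out unless the variables are generalised further or the row is removed.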


5.3 Estimate the time and effort required for preparing the data in order to publish or to archive it.

When evaluating the costs associated with the management of sensitive data, consider:

  • the costs of anonymising data (the time and programs required)
  • the technical requirements of a higher level of security