
Why should you manage your research data and write a data management plan (DMP)?

  • Because it is good research practice!
  • You will reduce the risk of losing your data.
  • You will be able to anticipate complex ownership and user rights issues in advance.
  • It helps you support open access to create productive future collaborations.
  • You will meet funder requirements.
  • It helps you save time and money.
  • The DMP reflects your managerial skills as a project leader.

Data is understood here as a broad term. In the words of Sarah Jones (DCC): "all information that is needed to replicate a study should be preserved, and everything that is potentially useful for others."

Your DMP should describe how you will manage data during the whole research life cycle. The DMP is a living document which should be updated as the research project progresses.

Your research data management practices should follow the FAIR principles, which describe how to make your data Findable, Accessible, Interoperable, and Re-usable.

Good luck with your DMP!

  • First, read through the questions.
  • Check whether your organization has its own data management guidelines.
  • You should be able to answer the questions as they stand; consult the guidance if you need help. The Tips for best practices sections contain more detailed, discipline-specific practical tips.


New questions | Old guidance in the new structure + comments | New guidance | Comments from the guidance working group

1. General description of data




1.1 What kinds of data is your research based on? What data will be collected, produced or reused? What file formats will the data be in?

Briefly describe your research data. Explain what kind of data you are reusing, collecting or producing. Outline how the data will be collected: e.g. via surveys, interviews, laboratory experiments, or observations. Also explain what kind of existing data you will reuse.

Briefly describe what types of data will be used and are expected to be produced: e.g. texts, images, photographs, statistics, physical samples, or code.

File format is a primary factor in the accessibility and reusability of your data in the future. List the file formats the data will be stored in. Note that a file format used during the project might not be the one most suitable for long-term preservation and reuse.

Tips for best practices

  • Describe your data in such a way that you can refer to it later in the plan. Your answer to this question forms the basis of the whole plan.
  • Explain your methods in more detail in the research plan. 
  • List the file formats that the data will be in: e.g. .csv, .txt, .docx, .xlsx, .tif.
  • When listing the data formats you will be using, make sure to include any software necessary to view the data.

Consider your DMP a part of your research plan. The DMP does not need to be readable on its own: it complements your research plan with a description of the technical management of your data. To avoid redundancy, refer to your research plan in your DMP and vice versa.

Briefly describe what types of data you are collecting or producing. Also explain what kinds of existing data you will use. Examples include texts, images, photographs, measurements, statistics, physical samples, or code.

Categorize your data in a way that you can refer to it later in the plan. That is, your answer to this question can form a general structure for the rest of the plan. For example, A) data collected for this project, B) data produced as an outcome of the process, C) previously collected existing data which is reused in this project, D) managerial documents and project deliverables, and so on.

List the file formats for each data set. In some cases file formats used during the research project may differ from those used in archiving the data. List both. File format is a primary factor in the accessibility and reusability of your data in the future.

Tips for best practices

  • Data analysis and methodological issues related to data and materials should be described in your research plan.
  • Examples of file formats: .csv, .txt, .docx, .xlsx, .tif.
  • When listing the file formats you will be using, make sure to include any special or uncommon software necessary to view or use the data, especially if the software is coded in your project.
  • Use a table or bullet points for a concise way of presenting data types, file formats, software used and so on.
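As one possible way to prepare such a table, the sketch below counts the files in a project folder by file extension and prints a small plain-text overview. The function names and the idea of scanning a single project directory are assumptions made for illustration; they are not part of this guidance.

```python
from collections import Counter
from pathlib import Path


def count_file_formats(root: str) -> Counter:
    """Count files under `root` (recursively) by lowercase file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def format_table(counts: Counter) -> str:
    """Render the counts as a plain-text table, most common format first."""
    lines = [f"{'format':<10}{'files':>6}"]
    for ext, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{ext:<10}{n:>6}")
    return "\n".join(lines)
```

Calling `print(format_table(count_file_formats("my_project")))` on a hypothetical `my_project` folder gives a starting point that can be completed by hand with the software needed for each format.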

1.2 How will the consistency and quality of data be controlled?

Data quality control ensures that no data will be lost or accidentally changed during the research process. Quality control of data is an integral part of all research and takes place during data collection, data entry or digitization, and data checking.

Tips for best practices

  • Explain how the data collection methods used will affect the quality of data. You can provide evidence of data quality by documenting in detail how the data is collected.

Explain how the data collection, analysis and processing methods used may affect the quality of data and how you will minimize the risks related to data accuracy.

Data quality control ensures that no data is accidentally changed and that the accuracy of data is maintained over its entire life cycle. Quality problems can emerge from the technical handling, conversion or transfer of data, or during its contextual processing and analysis.

Tips for best practices

  • Transcriptions of audio or video interviews should be checked by someone other than the transcriber.
  • Analog material should be digitized at as high a resolution as possible for accuracy.
  • In all conversions, it should be ensured that the original information content is preserved.
  • Software producing checksums should be used.
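A minimal sketch of how checksums can reveal accidental changes is shown below, assuming SHA-256 and a simple manifest file; the function names and the manifest layout are hypothetical choices for illustration, and dedicated tools can be used instead. A manifest is written once for a data folder and compared again later, for example after a transfer or format conversion.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_dir: str, manifest_path: str) -> None:
    """Write one '<checksum>  <relative path>' line per file in data_dir."""
    root = Path(data_dir)
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        lines.append(f"{sha256_of_file(str(path))}  {rel}")
    Path(manifest_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(data_dir: str, manifest_path: str) -> list[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    root = Path(data_dir)
    changed = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. in a manifest of an empty folder
        expected, rel = line.split("  ", 1)
        target = root / rel
        if not target.is_file() or sha256_of_file(str(target)) != expected:
            changed.append(rel)
    return changed
```

In this sketch the manifest is assumed to be stored outside the data folder, so that it is not itself listed among the files it describes.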



2. Ethical and Legal Compliance




2.1 What ethical issues are related to your data management, for example handling sensitive data, protecting the identity of participants, or obtaining consent for data sharing?

Describe how you will maintain high ethical standards and comply with relevant legislation when managing your research data. Ethical issues must be considered throughout the whole research data life cycle from planning to publication as well as in paving the way for future reuse.

For example, following the guidelines regarding informing research participants is considered an ethical requirement for most research. Moreover, if you are handling personal or sensitive information, describe how you will ensure privacy protection and data anonymization.

Tips for best practices

  • Check your institutional Ethical Guidelines and Security Policy and prepare to follow instructions that are given in these guidelines.
  • Check whether an ethical review is required for your research project.
  • If your research is to be reviewed by an ethics committee, outline in your DMP how you will comply with the protocol (e.g. how you will remove personal or sensitive information from your data before sharing it to ensure privacy protection, or how you will use restricted access procedures).
  • See e.g. Finnish Advisory Board on Research Integrity for more information about the responsible conduct of research.
  • See e.g. The European Code of Conduct for Research Integrity

Describe how you will maintain high ethical standards and comply with relevant legislation when managing your research data. Ethical issues must be considered throughout the whole research data life cycle.

For example, following the guidelines regarding informing research participants is considered an ethical requirement for most research. Moreover, if you are handling personal or sensitive information, describe how you will ensure privacy protection and data anonymization or pseudonymization.
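As an illustration of what pseudonymization can look like in practice, the sketch below replaces participant names with stable pseudonyms using keyed hashing (HMAC-SHA256). The record layout, the field names and the key handling are assumptions made for this example only; your own method should follow your institutional guidelines and any ethical review protocol.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes, length: int = 16) -> str:
    """Map an identifier to a stable pseudonym using keyed hashing (HMAC-SHA256)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


# Hypothetical interview records; names and fields are illustrative only.
records = [
    {"participant": "Alice Example", "answer": "yes"},
    {"participant": "Bob Example", "answer": "no"},
    {"participant": "Alice Example", "answer": "maybe"},
]

# Placeholder key: in practice it would be generated randomly and stored
# separately from the data, with restricted access.
key = b"replace-with-a-randomly-generated-secret"

pseudonymized = [
    {"participant": pseudonymize(r["participant"], key), "answer": r["answer"]}
    for r in records
]
```

The same participant always receives the same pseudonym, so records can still be linked. Anyone holding the key can recompute the mapping, which is why this is pseudonymization rather than anonymization.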

Tips for best practices

  • Check your institutional Ethical Guidelines and Security Policy, and prepare to follow instructions that are given in these guidelines.
  • If your research is to be reviewed by an ethics committee, outline in your DMP how you will comply with the protocol (e.g. how you will remove personal or sensitive information from your data before sharing it to ensure privacy protection, or how you will use restricted access procedures).
  • See e.g. Finnish Advisory Board on Research Integrity for more information about the responsible conduct of research.
  • See e.g. The European Code of Conduct for Research Integrity



2.2 How will data ownership, copyright and Intellectual Property Rights (IPR) issues be managed? Are there any copyrights, licenses or other restrictions which prevent you from using or sharing the data?

Describe who will own the data and who can issue permissions to reuse it. If you use research material or data collected or produced by a third party, consider the copyright issues and potential licenses which may affect its distribution. These issues should be resolved at the planning stage of the research project. If ownership issues are not considered early enough in the research life cycle, sharing and reusing the data may become impossible.

Tips for best practices

  • Check your organizational data policy for ownership guidelines.
  • Also consider the funder's policy on copyrights or IPR.
  • It is recommended to make all research data, code and software created within a research project available for reuse e.g. under Creative Commons, GNU, MIT or another relevant license. The recommended CC license according to open science principles is CC BY.

Describe who will own the data and how the ownership issues have been agreed upon. Describe who can issue permissions to (re)use it.

Tips for best practices

  • Check your organizational data policy for ownership, right of use, and right to distribute.
  • Ownership agreements should be made as early as possible in the project life cycle.
  • Also consider the funder's policy on copyrights or IPR.
  • It is recommended to make all research data, code and software created within a research project available for reuse e.g. under Creative Commons, GNU, MIT or another relevant license.
  • Add links for GNU and MIT.


3. Documentation & metadata




3.1 How will you document your data in order to make it findable, accessible, interoperable and re-usable for you and others? What kind of metadata standards, README files or other documentation will you use to help others to understand and use your data?

Data documentation enables data sets and files to be discovered, used, and properly cited. Metadata is essentially information regarding the data: e.g. where, when, why, and how the data were collected, processed and interpreted. Metadata may also contain details about experiments, analytical methods, and research context.

Metadata elements can include descriptive metadata, which enables indexing, discovery and retrieval (e.g. keywords); technical metadata, which describes how data sets were produced and structured and how they should be used (e.g. file naming); and rights metadata, which defines who owns the data, who can access it, and who has the right to manage it.
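The three groups of metadata elements could, for instance, be captured in a small machine-readable record stored next to the data. The sketch below is only an illustration: every field name and value is a hypothetical placeholder, and a discipline-specific or repository-specific metadata standard should take precedence where one exists.

```python
import json

# All field names and values below are hypothetical placeholders.
metadata_record = {
    "descriptive": {
        "title": "Interview study on commuting habits",
        "keywords": ["commuting", "interviews"],
        "creator": "Example Researcher",
    },
    "technical": {
        "file_format": ".csv",
        "file_naming": "<project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>",
        "collection_method": "semi-structured interviews",
    },
    "rights": {
        "owner": "Example University",
        "license": "CC BY 4.0",
        "access": "open",
    },
}

# Fields this sketch treats as mandatory; adjust to the standard you adopt.
REQUIRED_FIELDS = {
    "descriptive": ["title", "keywords"],
    "technical": ["file_format"],
    "rights": ["owner", "license"],
}


def missing_fields(record: dict) -> list[str]:
    """List required fields that are absent or empty, as 'group.field'."""
    missing = []
    for group, fields in REQUIRED_FIELDS.items():
        for field in fields:
            if not record.get(group, {}).get(field):
                missing.append(f"{group}.{field}")
    return missing


# Serialize the record, e.g. to save it as a JSON file alongside the data.
serialized = json.dumps(metadata_record, indent=2)
```

Checking `missing_fields(metadata_record)` before depositing data is one simple way to notice gaps in the documentation early.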

Tips for best practices

  • Consider how the data will be organized during the project. Describe e.g. your file naming conventions, version control and folder structure.
  • Identify the types of information that should be captured to enable a researcher like you to discover, access, interpret, use, and cite your data.
  • Repositories for long-term preservation often require the use of a specific metadata standard. Check whether a discipline/community or repository based metadata schema or standard (i.e., preferred sets of metadata elements) exists that can be adopted.
  • Ensure the interoperability of your data, for example by using research instruments which create standardized metadata automatically. The data can then be moved from one manufacturer's tool to another.
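A file naming convention is easier to follow when it can be checked automatically. The sketch below assumes a hypothetical convention of the form `<project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>`; the pattern and the function name are illustrative choices, not a recommended standard.

```python
import re
from typing import Optional

# Hypothetical convention: <project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)"
    r"_(?P<datatype>[a-z0-9]+)"
    r"_(?P<date>\d{8})"
    r"_v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def parse_filename(name: str) -> Optional[dict]:
    """Return the parts of a conforming file name, or None if it does not conform."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None
```

For example, `parse_filename("commute_interview_20240131_v01.csv")` returns the project, data type, date, version and extension as a dictionary, while a name such as `final version (2).xlsx` returns `None` and can be flagged for renaming.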

Data documentation enables data sets and files to be discovered, used, and properly cited, also by other users (human or computer). Metadata is essential information regarding the data, for example where, when, why, and how the data were collected, processed and interpreted. Metadata may also contain details about experiments, analytical methods, and research context.

Tips for best practices

  • Describe all types of documentation (README files, metadata, etc.) you will provide to help secondary users to find, understand and reuse your data.
  • Following the FAIR principles will help you ensure the Findability, Accessibility, Interoperability, and Re-usability of your data.
  • Use research instruments which create standardized metadata formats automatically. Your data can then be moved from one manufacturer's tool to another.
  • Consider how the data will be organized during the project. Describe, for example, your file naming conventions, version control and folder structure.
  • Identify the types of information that should be captured to enable a researcher like you to discover, access, interpret, use, and cite your data.
  • Repositories often require the use of a specific metadata standard. Check whether a discipline/community or repository based metadata schema or standard (i.e., preferred sets of metadata elements) exists that can be adopted.
  • A link to discipline-specific metadata standards was suggested, e.g. http://www.dcc.ac.uk/resources/metadata-standards. Standards have been listed in the organizations' own guides.
    We will return to this later.
  • It was suggested to change README → README.txt, i.e. to include the txt file format. Note: not added, since other file formats are also possible.
  • A question was raised about the phrase "What community standards (if any) will be used to annotate the (meta)data?" The bullet point was broken up.
  • "Use research instruments which create standardized metadata formats automatically": could an example or examples be given here? Examples could also help humanities researchers notice this. No action was taken.

4. Storage and backup during the research project



4.1 Where will your data be stored and how will it be backed up?

Describe where you will store and back up your data during your research project. Methods for preserving and sharing your data after your research project has ended are explained in more detail in Section 5.

Consider who will be responsible for backup and recovery. If there are several researchers involved, create a plan with your collaborators and ensure safe transfer between participants.

Tips for best practices

  • Using safe and secure storage provided and maintained by your organization’s IT support is preferable.

Describe where you will store and back up your data during your research project. Usually, data during the active phase of the project is still dynamic and under processing and analysis. Methods for archiving and publishing your static data sets after your research project has ended are considered in Section 5.

Consider who will be responsible for backup and recovery.

Tips for best practices

  • Using safe and secure storage provided and maintained by your organization’s IT support is preferable.

4.2 Who will be responsible for controlling access to your data, and how will secured access be controlled?

It is vital to consider data security issues, especially if your data is sensitive (e.g. personal data, politically sensitive information or trade secrets).

Describe who has access to your data and what they are authorized to do with it. Who will be responsible for access control?

Tips for best practices

  • Access controls should always be proportionate to the kind of data and level of confidentiality involved. 

It is essential to consider data security issues, especially if your data is sensitive, for example personal data, politically sensitive information or trade secrets. Describe who has access to your data, what they are authorized to do with it, and how you will ensure the safe transfer of data to your collaborators.

Tips for best practices

  • Access controls should always be in line with the level of confidentiality involved.

5. Opening, publishing and archiving the data after the research project




5.1 What part of the data can be made openly available or published? Where and when will the data, or its metadata, be made available?

Describe whether you will share all your data or only parts of it, and for how long it will be made available. If your data or parts of it cannot be shared, explain why. Valid explanations might include confidentiality, trade secrets or ownership issues (license, copyright). Sometimes data cannot be shared due to the unreasonable effort required for its sharing (e.g. legacy data or large volumes of analog data).

Tips for best practices

  • Consider data sharing both during and after research.
  • The openness and sharing of research data promotes its reuse.
  • When sharing your data, it is recommended that it be made available for reuse e.g. under Creative Commons or another relevant license. The recommended CC license for open science is CC BY.

Describe whether you will publish or otherwise make all your data or only parts of it openly available. If your data or parts of it cannot be opened, explain why.

The openness of research data promotes its reuse.

Tips for best practices

  • You can publish a description (i.e. the metadata) of your data without making the data itself openly available, which enables you to restrict access to the data.
  • Publish your data in a data repository or peer reviewed data journal.
  • Check re3data.org to find a repository for your data. 
  • Remember to check funder, disciplinary or national recommendations for data repositories.
  • It is recommended to make all research data, code and software created within a research project available for reuse e.g. under Creative Commons, GNU, MIT or another relevant license.
  • It was suggested to change "Remember to check funder, disciplinary or national recommendations for data repositories." → "Remember to check funder, disciplinary or national recommendations for data repositories and research data finders."
  • Add GNU and MIT links.

5.2 Where will data with long-term value be archived and for how long?


The aim of long-term preservation is to store and keep data usable and comprehensible for dozens or even hundreds of years. Data selected for long-term preservation will be submitted to a data repository or data archive. Long-term preservation will ensure your data can be found, understood, accessed and used in the future, even for generations.

Tips for best practices

  • Briefly describe what data to preserve and for how long – as well as what data to dispose of after the project.
  • Remember to check funder, disciplinary or national recommendations for data repositories, data archives or data banks.

Briefly describe what data to archive and for how long – as well as what data to dispose of after the project. Describe the access policy to the archived data.

Tips for best practices

  • Remember to check funder, disciplinary or national recommendations for data archives.

5.3 Estimate time and effort required for preparing the data in order to publish or to archive it.


Tips for best practices

  • Remember to mention that you will specify your data management costs in the budget.
  • Will you need to hire expert help to manage, preserve and share the data?
  • Do you have sufficient storage space, or will you need to include charges for additional services? Consider the additional computational facilities and resources that need to be accessed, and what the associated costs will amount to.

Estimate whether you will need to hire expert help to manage, preserve and share the data. Consider the additional computational facilities and resources that need to be accessed, and what the associated costs will amount to.

Tips for best practices

  • Remember to specify your data management costs in the budget.
Location of the guide in Zenodo and the links pointing to it: will there be a new version, or how will this be implemented?