This guide outlines the process of research data management, how to develop and share data management plans, and the FAIR data principles. Planning for the effective creation, management and sharing of your data enables you to get the most out of your research.
US National Science Foundation: Proposals submitted or due on or after 18 January 2011 must include a supplementary document of no more than two pages labeled 'Data Management Plan'. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results.
Why develop a data plan?
There are many benefits to managing and sharing your data:
you can find and understand your data when you need to use it
there is continuity if project staff leave or new researchers join
you can avoid unnecessary duplication e.g. re-collecting or re-working data
the data underlying publications are maintained, allowing for validation of results
data sharing leads to more collaboration and advances research
your research is more visible and has greater impact
other researchers can cite your data so you gain credit
Planning helps you to achieve these benefits; it is ultimately most useful to you. Making a plan helps you to save time and effort and makes the research process easier. By considering what data will be created and how, you can check you have the necessary support in place. Planning also enables you to make sound decisions, bearing in mind the wider context and consequences of different options.
Publishers and research funders may require that you share your data so it is worth investing time to plan for effective data management. Several funders ask for data plans as part of grant proposals. The DCC views plans submitted in grant proposals as preliminary outlines, which should then be developed into more coherent processes and procedures at the outset of your research. For the purposes of this guide we will focus on the application stage requirements. A further guide - How to put the data management and sharing plan into practice - will address data management during the research process.
How to Develop and Share a Data Management Plan
1. Data Types, Formats, Standards and Capture Methods
What data outputs will your research generate?
- outline volume, type, content, quality and format of the final dataset
Outline the metadata, documentation or other supporting material that should accompany the data for it to be interpreted correctly
What standards and methodologies will be utilised for data collection and management?
State the relationship to other data available in public repositories e.g.
- existing data sources that will be used by the research project
- gaps between available data and that required for the research
- the added value that new data would provide in relation to existing data
Outline and justify your choices: You should detail what data you will create and explain why you have opted for particular formats, standards and methodologies. Bear in mind that the choices you make may make it easier or harder to share and preserve your data.
It can be useful to capture your data in (or convert it to) community-accepted data formats. Using standard or widely-adopted formats will make your data interoperable. Open or non-proprietary formats are preferable, as you and others will have less trouble processing these later. If your data are to be deposited into an archive, particular formats may be preferred.
Documentation and metadata allow your data to be understood and discovered by others. It is fundamental to capture contextual details about how and why the data were created. Metadata is a subset of this broad documentation, describing the data in detail. There are various metadata standards which can help you to describe your data in a consistent way. Librarians, data repositories or your colleagues may be able to advise on relevant standards.
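As an illustration of describing data in a consistent way, the sketch below builds a small descriptive record and writes it out as JSON. It is a minimal example, not a prescribed schema: the field names are assumptions loosely modelled on common Dublin Core elements, and the function name and sample values are hypothetical. Your repository or discipline may require a different standard.

```python
import json


def build_metadata_record(title, creators, created, description, file_format, rights):
    """Return a descriptive metadata record as a plain dictionary.

    The field names are illustrative only; check which metadata
    standard your repository or discipline expects.
    """
    if not title or not creators:
        raise ValueError("title and at least one creator are required")
    return {
        "title": title,
        "creator": list(creators),
        "date": created,          # ISO 8601 date string, e.g. "2024-03-01"
        "description": description,
        "format": file_format,    # e.g. a media type such as "text/csv"
        "rights": rights,
    }


# Hypothetical dataset used purely for demonstration.
record = build_metadata_record(
    title="Example survey responses",
    creators=["A. Researcher"],
    created="2024-03-01",
    description="Anonymised responses collected for an example project.",
    file_format="text/csv",
    rights="CC BY 4.0",
)

# Serialise with sorted keys so the file is stable under version control.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```

Keeping a record like this alongside each data file means the contextual details travel with the data when it is shared or deposited.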
Make informed decisions based on review: It can help to show your awareness of good practice or that you have sought advice to develop your plans. Some funders also expect you to demonstrate that existing data are not sufficient for your needs, so you may need to show that you have reviewed repository and data centre holdings or consulted with similar projects.
2. Ethics and Intellectual Property
Demonstrate that you have sought advice on and addressed all copyright and rights management issues that apply to the resource
Make explicit mention of consent, confidentiality, anonymisation and other ethical considerations, where appropriate
Are any restrictions on data sharing required – for example to safeguard research participants or to gain appropriate intellectual property protection?
Present a strong case for any restrictions on sharing: Explain any constraints, such as embargo periods or restricted access, and ensure these are properly justified as there is a common expectation that publicly funded research data will be openly available as soon as possible. These justifications may also be of use in the event of a Freedom of Information request for your research data.
All research involving human data or material is subject to formal ethical review. Where appropriate, you should outline the steps you will take to protect research participants, e.g. anonymising data. It helps to show that you’ve balanced concerns with the desire to share e.g. by negotiating informed consent for data sharing. Many University Ethics Committees provide sample consent forms and services such as the UK Data Archive provide excellent guidance in this area. You should also demonstrate awareness of relevant legislation such as the Data Protection Act.
Data ownership should be clarified and, where necessary, plans should be in place to negotiate licences at the start of the research process. If you agree/purchase licences to reuse third party data, be aware of any restrictions this places on subsequent deposit and data sharing. JISC Legal provides lots of advice on copyright, IPR and relevant legislation such as the Data Protection Act and Freedom of Information. Institutional support is also available from experts in university libraries, records management and research offices.
3. Access, Data Sharing and Reuse
What are the further intended and/or foreseeable research uses for the completed dataset(s)?
How will you make the resource accessible to the potential audience(s) identified?
- Where will you make the data available?
- How will other researchers be able to access the data?
- Will a data sharing agreement be required?
- What is the timescale for public release of the data?
State any expected difficulties in data sharing, along with causes and possible measures to overcome these difficulties.
How will data sharing provide opportunities for coordination or collaboration?
Anticipate and plan for data reuse: It can help to envisage which users your data would be of value to, and address their needs when deciding how to make the data available. Data centres may also ask you to meet minimum quality standards to make sure your data can be understood and reused by other researchers.
Provide specific details on access: Reassure funders by being very clear about where, when and how your data will be made available. The DCC offers guidance on how to license your data to make clear who can use it and for what purpose. Funders often state expected timeframes for release, such as making data available on publication. If you can’t meet these expectations or need to impose any restrictions, try to demonstrate that you have considered various means of overcoming these challenges.
Use existing infrastructure: Where possible select an appropriate disciplinary database, data centre or institutional repository. If you are unsure which services are available to you, check the repository list collated by DataCite, BioMed Central and the DCC. If access to your data needs to be restricted, look for secure data services or data enclaves.
4. Short-Term Storage and Data Management
Describe the planned quality assurance and back-up procedures [security/storage]
Specify the responsibilities for data management and curation within research teams at all participating institutions
Define data management support: Outline what provision is available to you within your institution and any additional skills or resources that you need to secure. If local support is available, it helps to demonstrate that you have discussed and agreed requirements. If you need to secure external support, justify the selections made and budget requested. Be clear about who will be responsible for different tasks.
Consider the practicalities: Are the investigators co-located, or will you need infrastructure that accommodates secure remote access? How will data quality be monitored if you are working in a distributed network across several sites? Strong file-naming conventions and versioning applications may be of use to keep track of the development process, particularly when several people are working together.
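One way to make a file-naming convention easy to follow across a team is to generate names with a small helper rather than typing them by hand. The sketch below assumes a hypothetical pattern of project, description, date and zero-padded version number. The pattern and the function name are illustrative choices, not a standard.

```python
import re
from datetime import date


def build_filename(project, description, created, version, extension):
    """Build a file name of the form project_description_YYYYMMDD_vNN.ext.

    This pattern is an assumption for illustration; agree your own
    convention with collaborators and document it in your plan.
    """
    def slug(text):
        # Lower-case the text and replace runs of other characters with "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    if version < 1:
        raise ValueError("version numbers start at 1")

    ext = extension.lstrip(".").lower()
    stamp = created.strftime("%Y%m%d")
    return f"{slug(project)}_{slug(description)}_{stamp}_v{version:02d}.{ext}"


# Hypothetical example values.
name = build_filename("Soil Study", "Site A readings", date(2024, 3, 1), 2, ".CSV")
print(name)
```

Putting the date in year-month-day order and padding the version number means files sort chronologically in an ordinary directory listing.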
Apply appropriate levels of data management: Funders want to be reassured that the day-to-day data management is fit for purpose. You may apply differing levels of service or adopt a combination of approaches:
Security may be more robust for any sensitive data you collect than for secondary data you hold under licence. Think about how you will transfer data securely e.g. encrypting data or using secure online storage. If using online services, you should know where your data are hosted and be certain that this is legally permissible.
Back-up of your unique data is more critical than copies of secondary data. The more important the data and the more often it is used, the more regularly it needs to be backed up. Fully managed file services with automated back-up, such as those offered by university IT services, are very robust and save you the time and effort of implementing your own system. Such services could be used in combination with portable storage or cloud computing to meet particular needs.
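A back-up is only useful if the copy is intact. Managed services normally handle this for you, but if you keep your own copies on portable storage, a simple checksum comparison can confirm that a back-up still matches the original. The sketch below is one possible approach, not a required procedure: the function names and file names are hypothetical, and it uses SHA-256 from the Python standard library.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original, backup):
    """Return True when both files have identical contents."""
    return sha256_of(original) == sha256_of(backup)


# Demonstration with temporary files standing in for real data.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "readings.csv"
    backup = Path(workdir) / "readings_backup.csv"

    original.write_text("site,value\nA,1.5\n")
    shutil.copyfile(original, backup)
    copy_ok = backup_matches(original, backup)

    # Simulate a damaged back-up by changing its contents.
    backup.write_text("site,value\nA,9.9\n")
    damage_detected = not backup_matches(original, backup)

print(copy_ok, damage_detected)
```

Recording the checksum when the data are first created also gives you a reference value to check against later, for example before deposit.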
5. Deposit and Long-Term Preservation
Identify which of the data sets produced are considered to be of long-term value
Outline the plans for preparing and documenting data for preservation and sharing
Explain your archiving/preservation plan to ensure the long-term value of key datasets
Select data of long-term value: Data sharing and preservation may not be applicable in every case. The DCC provides a ‘How to …’ guide on appraisal, which offers practical strategies to help you select important data. Deciding what has long-term value and preparing those data to expected standards for deposit are time-consuming processes, for which you should allocate significant resources.
Safeguard the data behind the graph: It is a common expectation among RCUK funders that published results will include information on how to access the supporting data. Even if there is no obvious home for the majority of your data, the data which underpin publications should be extracted, captured in machine-readable form and deposited somewhere so they remain accessible.
Assure that your data will remain accessible: Whatever approach you adopt, focus on making a convincing case that your data will remain accessible. If you plan to deposit in a data centre, it helps to speak with their staff early on as they can advise what is appropriate and feasible in terms of preservation. Universities are increasingly providing infrastructure to support data management and there are some disciplinary services which may be of use.
What resources will you require to deliver your plan?
Outline additional hardware, software and technical expertise, support and training that is likely to be required and how it will be acquired
Outline and justify costs: If you need to purchase storage, outsource services such as back-up and preservation, or plan to pay for data management support, these costs should be outlined and justified in your proposal. Where institutional provision is available, show that the support you require has been discussed and agreed. It also helps to link resources with roles and responsibilities to demonstrate how the plan will be implemented.
Don’t underestimate the human effort required: Creating documentation and making your data understandable to others is very time consuming, so be realistic about how much effort is needed to prepare your data for sharing and preservation. The UKDA offers a toolkit to help researchers cost activities related to managing and sharing social science data.
Show efficient use of public funds: The RCUK Common Principles on Data Policy state that it is appropriate to use public funds to support the management and sharing of publicly-funded research data, but this is expected to be efficient and cost-effective. A summary of individual funders’ views on meeting associated costs is available via the DCC policy pages.