In this module we will simply introduce the topic of data management, and lay the groundwork for the rest of the modules you will take.
Recall some of the points we have touched on up to now:
We have learned what data is, placed it in an ocean context, and learned why data management is so important. With this base understanding of ocean data and its management needs, the next step is learning about data management plans.
This module will give concrete steps for building data management into your research plans.
Data management plans (DMPs) don’t have to be scary. In fact, if you’ve ever done research with a post-secondary institution, it’s likely you’ve already used some sort of basic management plan. On any major group project you’ve worked on, you’ve probably had a plan: who was doing which part, when it was due, and other such considerations. A data management plan is a similar planning process, but for data. DMPs are valuable because they offer built-in check-ins that allow researchers to make sure their project is on track.
A good way to think about data management plans is to divide your project into sections and work out the plan for each section.
Some funders may require that a specific DMP template be used, so always check whether that is the case. A DMP typically uses the future tense and is written in the first person: I (or we, if in a group) plan to do this, we will store the data in this repository, we will use this program to collect data, and so on.
DMP Templates
There are free DMP templates and tools available to help you develop a DMP and feel comfortable that you haven’t missed anything. The DMP Assistant is one of these DMP template tools. The DMP Assistant templates use a series of key data management questions, supported by best-practice guidance and examples to help develop a DMP.
The DMP Assistant has been developed by the Digital Research Alliance of Canada (DRAC) in collaboration with the University of Alberta. The DMP Assistant allows you to select and use a standard template, or to start with templates made by other organizations and institutions for specific data or research types. You can also create your own custom template, which is useful if you know you are going to do a number of projects that will require similar parameters. The DMP Assistant is efficient to use: simply select the template you are going to use, then fill in each section and each question that is applicable to your project to the best of your ability.
DMP Questions
Why have a DMP? You can refer back to a DMP over the course of your project to make sure you are on track, and it can help settle any disagreements over what to do next or which process is best. A data management plan is also important because, if someone drops out of the project for any reason, the DMP will have all the information you need to bring a new person up to speed!
DMPs are best practice! They are part of ethical research, especially on an international scale, and are now required by most grant funders and institutions in Canada, including the Tri-Agency, one of the largest publishers and funders of research in the country.
A DMP contains a lot of information, so if it is helpful, consider making a streamlined version just for your team, in point form or as a checklist, and keeping the full version as the master document to refer back to.
While templates are great, here are some questions and considerations you could use to create your own custom DMP, if that is what you prefer. Most of these questions are similar to what would be on a template.
A good way to divide your plan is to consider:
- Before the project begins – the time before you start collecting your data, such as when you are filling out things like your grant or proposal, or making the data management plan.
- During the project – the time when you are actively collecting data, manipulating it into graphs, compiling your findings and writing your final report.
- After the project – the time when you have completed data collection and are submitting the project to a repository or presenting it at a conference.
Now we will go into more detail about each of those three sections. These are just guidelines; answer whatever is applicable and add it to your data management plan.
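If you do build a custom plan, one way to keep it manageable is to treat it as a checklist grouped by the three phases above. The short Python sketch below is only an illustration of that idea, not part of any DMP template or tool: the class names, the example questions and the example answer are all made up for this module. It records questions under "before", "during" and "after", and lists the ones your team has not answered yet.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    answer: str = ""  # stays empty until the team has agreed on an answer

    @property
    def answered(self) -> bool:
        return bool(self.answer.strip())


@dataclass
class DataManagementPlan:
    # One list of questions per project phase (hypothetical structure).
    phases: dict[str, list[Question]] = field(
        default_factory=lambda: {"before": [], "during": [], "after": []}
    )

    def add(self, phase: str, text: str, answer: str = "") -> None:
        if phase not in self.phases:
            raise ValueError(f"unknown phase: {phase!r}")
        self.phases[phase].append(Question(text, answer))

    def open_questions(self) -> dict[str, list[str]]:
        """Return the unanswered questions, grouped by phase."""
        return {
            phase: [q.text for q in questions if not q.answered]
            for phase, questions in self.phases.items()
        }


# Example entries, invented for illustration.
plan = DataManagementPlan()
plan.add("before", "What repository will we submit our data to?")
plan.add("during", "Who is in charge of active data management?",
         "We will rotate this role weekly.")
plan.add("after", "Who will monitor for license violations?")

for phase, questions in plan.open_questions().items():
    for text in questions:
        print(f"[{phase}] still to decide: {text}")
```

A list like this can double as the streamlined team version of the plan, while the full written DMP remains the master document.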
Before you begin getting deep into your research you’ll want to think about:
- What data will you be collecting and what standard will you follow?
- What metadata will you be collecting and what standard will you follow?
- How will you use and manipulate that data?
- What programs will you need to use to read and manipulate the data? Do you have access to those programs already through your workplace/post-secondary institute/whoever is funding the project? If not, how much will it cost to get access to the programs you need? This is important because some organizations have limited licenses for different programs and will need to know so that they can allocate those resources to your project accordingly.
- What repository will you submit your data to, if one isn’t already specified by the parameters of the project? Set aside time in your plan to research repositories and find the best one.
- Once you’ve chosen a repository, find out what data format it accepts and the requirements for metadata. Add these data formatting and metadata requirements to your plan. Knowing this will help you plan for the potential need to modify or add to your data and metadata so that they can be added to the repository. Or, if the repository uses an acceptable standard for your type of data, plan to use that standard from the beginning of your project and write the requirements into your plan. If you do need to make changes to the dataset and metadata to submit them, who on the team will do the work and how much time will you need to set aside?
- If your funder and the repository you are planning to submit to haven’t specified a license, this is the time to decide on licensing. Will you use Creative Commons or some other form of licensing?
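Once you have written a repository’s format and metadata requirements into your plan, you can also check your files against them automatically. The sketch below shows one possible way to do that in Python. The accepted formats, the required metadata fields, the function name and the example file are all hypothetical; replace them with what your chosen repository actually asks for.

```python
from pathlib import Path

# Hypothetical requirements, for illustration only. Replace them with
# what your chosen repository actually asks for.
ACCEPTED_FORMATS = {".csv", ".nc"}
REQUIRED_METADATA = {"title", "creator", "date_collected", "license"}


def check_submission(filename: str, metadata: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ACCEPTED_FORMATS:
        problems.append(f"file format {suffix or '(none)'} is not accepted")
    # A field counts as missing if it is absent or left blank.
    for name in sorted(REQUIRED_METADATA):
        if not metadata.get(name, "").strip():
            problems.append(f"metadata field '{name}' is missing")
    return problems


# Example dataset, invented for illustration.
problems = check_submission(
    "temperature_profiles.xlsx",
    {"title": "Temperature profiles", "creator": "Our team"},
)
for problem in problems:
    print(problem)
```

Running a check like this early shows how much conversion and metadata work remains, which helps you decide who will do it and how much time to set aside.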
During your research you’ll want to think about:
- How will you store data while you’re working? Who will be in charge of active data management? One of you? All of you?
- Will you have back-ups? What method will you use to back up your information?
- How will you manage citations? Who will make sure citations are getting recorded properly?
- Is there someone to do Quality Assurance/Quality Control (QA/QC)? Who is that person? How long will the QC take? What process will you use for QA/QC and how will it be documented? Make sure they have adequate time before the next step in the process.
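For the back-up question above, whichever method you choose, it is worth confirming that each copy really matches the original. The sketch below is one simple approach in Python, assuming your working data are ordinary files: it copies a file into a back-up folder and compares SHA-256 checksums of the original and the copy. The function names, folder name and sample file are made up for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy by checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also keeps file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"back-up of {source} does not match the original")
    return target


# Demonstration in a temporary folder with an invented sample file.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "samples.csv"
    original.write_text("station,temperature_c\nA1,4.2\n")
    copy = back_up(original, Path(tmp) / "backup")
    print(f"verified back-up written to {copy.name}")
```

In a real project the back-up folder would be on a different drive or service than the working copy, and your plan would say who runs the back-up and how often.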
You know what repository you’ll need, so this is the time to double check that all your data meets the requirements.
After your research you’ll want to think about:
- This is the time to apply a license to your data, and choose someone to monitor for license violations. What will you do if someone violates your license? Will you need to set aside time or money to deal with them?
- What happens to the data 1 year from now? 5 years? 10? Will you need to find a long-term storage solution later, or will the repository you’re using store it indefinitely?
- Do you have an ORCID iD? (This is a stable, permanent ID that connects you to your data.) Make sure you’re properly connected to your data when it is submitted!
- What will you do with the conclusions you have drawn from your project? Will you share at conferences? How will you make sure that your work is seen by people who might be most interested in it? Or most benefit from it?
These are just a few questions that will help you create a data management plan. You may think of more as you are writing it. Don’t be afraid to add extra information that isn’t on the template. While the data management plan need not be a static document, changing it during the project should be something that happens with the full discussion and consent of all project members. You may also have to update your funder, if that is part of the funding agreement. Make sure to discuss how changes like this will be handled by the team before the project starts.
All of these considerations for DMPs will be covered in more detail in the remaining modules.
Before you go! Please consider for the next module:
At the beginning of the module, we discussed how you’ve likely used some sort of plan before if you’ve done work on a project in post-secondary. If you have done so, how useful was that plan? Knowing what you do now, are there any things you would do differently if you did the project again?