In this module we will go over some common terms used in the ocean research community, ways to talk about data, and suggestions for who to talk to about data. You do not need to memorize these terms, but having at least a cursory idea of what they mean will make you more comfortable talking about them.
In our previous module, we asked what you consider data literacy to be, and asked you to think about how you can practice this skill.
Data literacy helps you understand what publishers and funders are looking for. Just as important is knowing the terms and resources that let you have conversations about data.
This module will help you practice skills for the Evaluate part of data literacy, which, as you might recall, is: “Evaluate – This means you would be able to know the right data tool for the job, or the right way to analyze data, identify practical solutions to data problems in the ocean sector, and can develop and implement a plan for your data.”
Common terms to know that will help you have an ocean data conversation:
Buoys – Data buoys are physical devices that float in the ocean and monitor various variables, transmitting data back to shore. They can be either moored or drifting and come in a variety of shapes and sizes. The buoy itself, however, does not do the monitoring and is merely the platform that the monitoring device is attached to.
Gliders – Autonomous (uncrewed) underwater vehicles that collect a variety of ocean data.
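Tri-Agency/Tri-Council – The collective name for Canada's three federal research funding agencies: the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council (NSERC), and the Social Sciences and Humanities Research Council (SSHRC). Together they set out guidelines for ethical research and offer federal grants.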
DMP/Data Management Plan – A plan that covers everything pertaining to the data of your research project, such as how the data will be collected, documented, stored, shared and preserved.
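FAIR – FAIR stands for Findable, Accessible, Interoperable and Reusable. These principles guide how to share your data so that both people and machines can discover and reuse it.
CARE – CARE stands for Collective benefit, Authority to control, Responsibility and Ethics. These principles complement FAIR and detail the ethical sharing and governance of Indigenous data.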
Dirty Data – Dirty data is data that has incorrect or null values. These incorrect values may not be obvious at first glance and are often missed by machines, requiring human understanding and context to parse out.
Data Dictionary – A data dictionary is a guide containing the information you need to know about a particular dataset, such as its variables, controlled vocabulary, metadata, and types or formats of data. It also helps identify dirty data by listing the expected ranges and values that each variable can be compared against.
LIDAR – Short for Light Detection and Ranging, a kind of laser scanning that measures distance (range) to a target. Used in many different ocean contexts.
EOV – Essential Ocean Variables (EOVs) are key variables, identified by the Global Ocean Observing System (GOOS), that have agreed-upon recommendations and best practices for how they should be measured, used and interpreted in the broader ocean research community.
Geospatial Data – Data that directly or indirectly references a specific geographic area or location.
CODAR – Short for Coastal Ocean Dynamics Applications Radar, a land-based, high-frequency radar used to measure ocean surface currents. If you study coastlines, you will likely encounter CODAR.
Metadata – Data about data. Metadata categorizes and defines data, allowing it to be read and understood by machines. It is the descriptive information that allows data to be found on the web.
Repository – A place where data is deposited or stored so that others can find and access your datasets.
CKAN – CKAN stands for Comprehensive Knowledge Archive Network. CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers hundreds of data portals worldwide.
ERDDAP – A data server that shares data and can convert it into common formats on request, allowing for high levels of interoperability. In the words of the ERDDAP documentation: “ERDDAP is a data server that gives you a simple, consistent way to download subsets of scientific datasets in common file formats and make graphs and maps.”
QA/QC – Quality Assurance/Quality Control – The processes, and the people in an organization responsible for them, that make sure best practices are being followed in each step of a project.
QARTOD – Short for Quality Assurance/Quality Control of Real-Time Oceanographic Data, a QA/QC program created and led by the U.S. Integrated Ocean Observing System (IOOS) that outlines best practices for evaluating the quality of ocean data across different parameters.
Controlled Vocabulary – A controlled vocabulary is a set of agreed-upon terms that makes data easier to organize and retrieve by both humans and machines.
Metadata Schema – This is the organization of metadata. A metadata schema explains the metadata’s controlled vocabulary, hierarchy and standards.
Darwin Core – A particular metadata standard focused on biological diversity. It provides a shared set of terms for describing biological taxa and where and when they were observed.
CF Conventions – The Climate and Forecast (CF) Conventions are a metadata standard for describing Earth science data. The conventions define metadata that provide a definitive description of what the data in each variable represents, and the spatial and temporal properties of the data.
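To make a few of these terms more concrete, here is a minimal sketch in Python of how a data dictionary can help flag dirty data during a QA/QC step. Everything in it is an assumption made for illustration: the dictionary contents, the ranges and the function name flag_value are hypothetical, and the ranges are not official thresholds from any program. The flag numbers (1 for pass, 4 for fail, 9 for missing) follow the numbering commonly used in QARTOD-style quality flags.

```python
import math

# Hypothetical data dictionary: the variable names, units and expected
# ranges below are illustrative assumptions, not official thresholds.
DATA_DICTIONARY = {
    "sea_water_temperature": {"units": "degree_Celsius", "min": -2.5, "max": 40.0},
    "sea_water_practical_salinity": {"units": "1", "min": 0.0, "max": 42.0},
}

# Quality flags, numbered in the style commonly used with QARTOD.
PASS, FAIL, MISSING = 1, 4, 9


def flag_value(variable, value):
    """Return a quality flag for one reading of the given variable."""
    entry = DATA_DICTIONARY[variable]
    # Null values (None or NaN) are one kind of dirty data.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING
    # Values outside the expected range are another kind.
    if entry["min"] <= value <= entry["max"]:
        return PASS
    return FAIL


# A short, made-up series of temperature readings from a sensor.
readings = [12.4, None, 99.9, -1.8, float("nan")]
flags = [flag_value("sea_water_temperature", v) for v in readings]
print(flags)  # [1, 9, 4, 1, 9]
```

A check like this only catches values that are missing or clearly out of range. A reading that is wrong but still plausible would pass, which is why human understanding and context are still needed.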
How to talk about data:
A good way to get a taste of the ocean sector and its common terms is to attend an ocean research conference, either in person or online. Even just watching, if you’re not comfortable talking to anyone yet, is a good way to learn more of the terms and the most recent and relevant topics in the industry. Conferences aren’t just great for networking; they’re also a great way for more casual relationships to form. Who knows, you might meet your next project coworker at one! Try to go to the talks that interest you most, participate in the Q&As if possible, and talk to people there who share your interests.
Another great way to talk about data is to meet your professors, if you’re still in post-secondary! Those of you who are graduate students probably have a supervisor that you already work with a lot, but perhaps there are still professors in your field you haven’t met yet. Most professors love it when students come to their office hours to talk about their research, and there’s never a bad time to get to know your favourite professor.
And finally, there’s the option to cold call or cold email; in some cases, cold calling can be the more efficient of the two. This means making an unsolicited call or sending an unsolicited email to an organization. While this may seem anxiety-inducing and potentially rude, if done correctly it can help create new connections within the industry. Doing this requires a bit of research and patience. Many organizations have someone who would love to connect with you, but some will not. Make sure you’re not contacting a person whose role is specifically troubleshooting or customer support. Be willing to take no for an answer if there’s no one who can help you at the moment. Above all, be polite, show interest in the organization and do your research before reaching out.
Before you go! An activity to do before the next module:
Find an ocean data organization that interests you, or choose one of the organizations you researched in module 4, then reach out to one of its members and ask about their data work and the things that interest you about their organization. Alternatively, we challenge you to introduce yourself to one of your professors, if you haven’t done so yet, and ask them about their research. Talk to them about ocean data and the work that they do!