Advanced data sharing
FAIR open data
If you would like to engage more deeply with Open Science through open data publication, the FAIR principles (Findable, Accessible, Interoperable and Reusable) are widely regarded as the key schema for analysing the openness of a dataset. The FAIR principles were drafted at a Lorentz Center workshop in the Netherlands in 2015, and have since been taken up by major organisations such as FORCE11, the National Institutes of Health (NIH) and the European Commission.
The FAIR principles are a way of determining the openness of a particular dataset, as well as providing advice on how to make your data more open. The Australian Research Data Commons has created a FAIR data self-assessment tool you can use to determine the openness of your data. You can also look at the brief guidelines below to see how to make your data increasingly FAIR.
How do I make my data Findable?
How do I make my data Accessible?
Clearly indicate under what conditions the data may be reused. Access may be fully open (such as under a Creative Commons Attribution-only licence) or subject to specific usage constraints. There are good reasons not to share certain kinds of data, such as data owned by commercial partners, confidential data, or disclosive data about at-risk populations. UCT's DataFirst repository has a number of access conditions for different levels of sensitivity, while ZivaHub has the option to make the metadata open while keeping the data files themselves confidential.
How do I make my data Interoperable?
To make your data interoperable, use disciplinary or community data standards, formats, and standardised vocabularies to describe your data. Your metadata should also adhere to disciplinary standards, optionally linking out to related information using identifiers.
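As a rough illustration of metadata that follows a shared standard and links out by identifier, the sketch below builds a small record using the schema.org Dataset vocabulary and serialises it as JSON. The choice of schema.org is an assumption made for the example; your discipline may prescribe a different standard. Every name, DOI and ORCID iD shown is a placeholder, not a real record.

```python
import json

# Hypothetical dataset description; every value below is a placeholder.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example household survey 2020",
    "description": "Illustrative record only; not a real dataset.",
    # Persistent identifier for the dataset itself (placeholder DOI).
    "identifier": "https://doi.org/10.0000/example",
    # Keywords would ideally be drawn from a disciplinary controlled vocabulary.
    "keywords": ["household income", "survey"],
    # Link out to a related resource by identifier rather than by free text.
    "isBasedOn": "https://doi.org/10.0000/related-example",
    "creator": {
        "@type": "Person",
        "name": "A. Researcher",
        # Placeholder ORCID iD identifying the creator unambiguously.
        "identifier": "https://orcid.org/0000-0000-0000-0000",
    },
}

# Serialise to JSON so that other systems can parse the description.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```

Because the field names come from a published vocabulary rather than ad hoc column labels, software that already understands that vocabulary can read the record without custom mapping.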
How do I make my data Reusable?
- As with making data Findable, describing your dataset with rich metadata helps users make sense of the data and understand how they can use it in their own studies. Using discipline-specific metadata further enhances its reusability.
- Ensure that the repository you deposit in provides machine-readable licences, which are increasingly standard. Both ZivaHub and DataFirst provide machine-readable licences.
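To give a sense of what "machine-readable" means for a licence, here is a minimal sketch that checks whether a metadata record states its licence as a recognised URL rather than as free text. The function name, the `license` field name and the two-entry lookup table are assumptions made for illustration; this does not describe how ZivaHub or DataFirst store licences.

```python
from urllib.parse import urlparse

# Small illustrative lookup from licence URL to a short identifier.
# A real repository would recognise many more licences.
KNOWN_LICENCES = {
    "https://creativecommons.org/licenses/by/4.0/": "CC-BY-4.0",
    "https://creativecommons.org/publicdomain/zero/1.0/": "CC0-1.0",
}


def licence_is_machine_readable(metadata: dict) -> bool:
    """Return True if the record's licence is a URL found in KNOWN_LICENCES."""
    value = metadata.get("license", "")
    parsed = urlparse(value)
    # Free text such as "free to reuse with credit" has no scheme or host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    return value in KNOWN_LICENCES


open_record = {"license": "https://creativecommons.org/licenses/by/4.0/"}
vague_record = {"license": "Free to reuse with credit"}

print(licence_is_machine_readable(open_record))   # True
print(licence_is_machine_readable(vague_record))  # False
```

A licence expressed as a stable URL lets harvesters and search tools filter datasets by reuse conditions without a person having to read each record.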
5-Star Open Data
Another schema for judging the openness of a particular set of data is the 5-Star Open Data model developed by Tim Berners-Lee. The 5-Star Open Data model describes data shared online according to how open it is, with the following ratings:
- 1-Star: the data is online and under an open licence.
- 2-Star: the data is online and under an open licence, and is in a structured form (e.g. MS Excel format, rather than a picture of a table).
- 3-Star: the data is online and under an open licence, and is in a structured form in a non-proprietary format (e.g. .csv instead of .xlsx).
- 4-Star: the data is online and under an open licence, in a structured form in a non-proprietary format, and uses URIs to identify things.
- 5-Star: the data is online and under an open licence, in a structured form in a non-proprietary format, using URIs, and links to other data to provide context.
The 5-Star model is graphically displayed below:
The 5-Star Open Data model is geared towards quantitative data, and reaching the 4-Star and 5-Star levels requires specific kinds of data and data expertise that may not be possible in all projects.
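Because each star level in the list above builds on the one before it, the rating can be sketched as a cumulative check that stops at the first unmet criterion. The `Dataset` class and its field names below are hypothetical, chosen only to mirror the five criteria; this is an illustration, not an official scoring tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset, one flag per star criterion."""
    online_open_licence: bool      # 1-Star
    structured: bool               # 2-Star
    non_proprietary_format: bool   # 3-Star
    uses_uris: bool                # 4-Star
    links_to_other_data: bool      # 5-Star


def star_rating(d: Dataset) -> int:
    """Count criteria met in order, stopping at the first one that fails."""
    criteria = [
        d.online_open_licence,
        d.structured,
        d.non_proprietary_format,
        d.uses_uris,
        d.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An openly licensed Excel spreadsheet: structured, but in a proprietary format.
excel_sheet = Dataset(True, True, False, False, False)
# The same table published as CSV.
csv_file = Dataset(True, True, True, False, False)

print(star_rating(excel_sheet))  # 2
print(star_rating(csv_file))     # 3
```

Stopping at the first unmet criterion reflects the cumulative wording of the levels: a dataset that uses URIs but has no open licence would still score zero here.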
For more information on the model, please visit the 5-Star website.
If you would like to read further into the field of research data management (RDM), a number of institutions, consortia and other high-level bodies have created guidance on how to implement RDM, listed below:
- ANDS (Australian National Data Service): Data management overview
- DCC (Digital Curation Centre): Making the Case for Research Data Management
- JISC (Joint Information Systems Committee): Managing research data in your institution
- OCLC (Online Computer Library Center): The Realities of Research Data Management