Data sharing

To allow others to verify and build on the work published in Royal Society journals, it is a condition of publication that authors make the data, code and research materials supporting the results in the article freely and publicly available to allow for reuse. This policy can be cited by DOI via FAIRsharing.org. It is not permitted to state that data will be available from the authors upon request.

Why do I need to make my data available?

We require primary data and materials (such as protocols, software), including source code and other digital research materials, to be made publicly available on publication of articles, as well as at submission for verification/review purposes. As a minimum, sufficient information and data are required to allow others to replicate all study findings reported in the article. This is in line with our policies to promote greater openness in scientific research.

Where can I submit my data?

There are two options for archiving data, code and other materials: in a publicly accessible repository (our preferred option) or as supplementary material associated with the published paper. If your datasets exceed the maximum repository file size, please contact the Editorial Office of the relevant journal.

Repositories

Our preference is for authors to archive their primary data and code with an external repository. Authors should deposit research data in a FAIR-aligned repository, with a preference for those that explicitly follow the FAIR Data Principles and demonstrate compliance with international standards for data repositories (eg CoreTrustSeal).

Your chosen repository should:

  • be publicly available
  • retain data under an open licence (CC0 or CC-BY) that is clearly visible on the landing page of your dataset
  • provide files with a DOI
  • make versioning/changes clear
  • have provisions for permanent access
  • have an English-language translation

To encourage best practice in data sharing, we have provided a list of example repositories below – this list is not exhaustive; authors are encouraged to use the most appropriate repository for their field.

Supplementary material

Our preference is that data are archived in an external repository, and that supplementary material is used for supporting figures, videos and other small files (under 2 MB). In addition, we deposit all supplementary material into the Figshare repository on the author's behalf on publication.

When do I submit my data?

Data files, code and other supporting material must be provided at the point of submission so that our editors and reviewers can assess them during peer review. Files can be provided by hosting them in an external repository with an accessible link included in the data accessibility section (you will be prompted for this during submission of the manuscript). Some material, such as small datasets, videos or figures, may be uploaded as supplementary material via the electronic submission system. For publication, all files must be made publicly available, and a complete data statement must be included, giving details of the repository where data and code have been deposited and the DOI of the deposited data. Details of the repository must also be included in the reference list.

What level of data needs to be made available?

It is a condition of publication that authors make the primary data, materials (such as protocols, software) and code publicly available. Data and code should be deposited in a form that will allow maximum reuse.

Studies that do not rely on data, code or other material (eg theoretical studies) to reach their conclusions can be replicated without recourse to such material. Such papers do not need to take explicit action under the open data policy, but this must be clearly and explicitly stated in the cover letter and in the data accessibility section of our online submission form.

All files, and all data columns within files, should be clearly labelled and readily interpretable. Authors must also provide a 'read-me' file alongside their data and code, describing each file and each column in each dataset file (including units of measurement).
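As an illustration only, the read-me file described above could be started from a generated skeleton. The sketch below assumes the datasets are CSV files in a single folder with a header row; the function name `readme_skeleton` and the `TODO` placeholders are hypothetical and are not part of the policy. Descriptions and units still have to be written by the authors.

```python
import csv
from pathlib import Path


def readme_skeleton(data_dir: str) -> str:
    """Return a plain-text read-me skeleton listing each CSV file and its columns."""
    lines = ["READ-ME", ""]
    for path in sorted(Path(data_dir).glob("*.csv")):
        # Read only the header row to obtain the column names.
        with path.open(newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle), [])
        lines.append(f"File: {path.name}")
        lines.append("Description: TODO")
        for column in header:
            # Placeholders: the meaning and unit of each column are filled in by hand.
            lines.append(f"  - {column}: TODO description (unit: TODO)")
        lines.append("")
    return "\n".join(lines)
```

Calling `readme_skeleton("data")` on a hypothetical `data` folder returns text that can be saved alongside the deposited files and then completed manually.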

Authors do not need to submit the raw data collected during an investigation if the standard in the field is to share data that have been processed (eg CSV files recording response to stimuli rather than the electrical signals on which they were based). If processed data are supplied, rather than raw data, this should be stated in the data accessibility section during submission. Further, in such cases, the code used to process the data should also be supplied.

Raw image data for digital morphology should be provided alongside processed 3D data; eg the field standard is to share such data in museum-linked repositories such as morphosource.org. Raw data are also required for gel imagery such as western blots.

What are the policies around code?

Shared code needs to be made accessible for free and should be easy to locate and download. We strongly recommend that all code be deposited in a permanent, public repository that issues citable digital object identifiers (DOIs) or other persistent identifiers, for example using Zenodo to archive GitHub packages, CodeOcean, or the Software Heritage archive. Ideally, the repository should allow for versioning so that the version of the published record is permanently documented and assigned its own DOI.

Please provide access, without restriction, to all original (not proprietary) code used to generate statistics and figures, together with any (processed) data required as inputs and details of the software it requires (program and version). Code executed within commercial software packages (eg Matlab, SPSS, office software), where directly related to the study’s findings, also needs to be shared. Details of the version of the commercial package should be included in the Data Availability Statement. If proprietary software was used, as a minimum, sufficient information and data are required to allow others to replicate all study findings reported in the article.

Analysis code (such as R scripts) must be made available at the point of submission, as well as any previously unreported algorithms. Where authors have used statistical software with graphical user interfaces and no method to export code, authors must document completely and clearly the choices input into the statistical software for every statistical model or test conducted. This description of detailed methodology can be uploaded along with any other relevant data and/or code. Any restrictions on or reasons for prohibiting the sharing of important code or algorithms must be discussed with the Editors before submission.
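One possible way to capture the "program and version" details mentioned above is to record them programmatically when the analysis is run. The sketch below is an assumption-laden example for analyses written in Python only; the function name `describe_environment` is hypothetical, and other tools (R, Matlab, SPSS) have their own ways of reporting versions.

```python
import platform
from importlib import metadata


def describe_environment(packages: list[str]) -> str:
    """Return a short text report of the interpreter, platform and package versions."""
    lines = [
        f"Python {platform.python_version()}",
        f"Platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            # Look up the installed version of each named package.
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)
```

The resulting text, for example from `describe_environment(["numpy", "pandas"])`, could be saved in a file deposited with the code.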

Code should be made available under an open source licence and deposited in an appropriate repository such as Zenodo or GitHub. Code deposited in repositories should be cited in the reference list of the manuscript as a software citation using the DOI or other persistent identifiers wherever possible. A statement about where and how your code can be accessed must be included in the Data Availability statement in your manuscript. Access to code should be available to editors and reviewers at the time of submission and throughout the editorial process. As authors you are responsible for the quality of your code. Data editors and reviewers will be asked to assess whether you have complied with this policy and are able to review the code at their discretion. Refusal to share code, or to provide access to the code, in accordance with this policy, may be grounds for rejection.

If you cannot provide full details on code, please explain what you have used and how; eg if you used someone else’s code, please explain who the code belongs to and how it was used.

How do I prepare the data accessibility section?

Authors of all papers that report primary data will be required to provide a statement in the manuscript submission form that states where the article's supporting data, materials and code can be accessed.

If these have been deposited in an external repository, this section should list the database, accession number/DOI and any other relevant details to clearly identify the dataset(s). Datasets included here must also be listed in the reference section. Citing datasets and code ensures effective and robust dissemination and appropriate credit to authors.

For example:

  • DNA sequences: GenBank accessions F234391-F234402 [REF#]
  • Phylogenetic data, including alignments: TreeBASE accession number S9123 [REF#]
  • Climate data and MaxEnt input files: Dryad doi:10.5521/dryad.12311 [REF#]

If supporting data have been included in the article’s supplementary material, this should be stated here, for example:
The datasets supporting this article have been uploaded as part of the supplementary material.

It is not permissible to state that data will be available upon request to the authors. Unless previously agreed with the editorial office of the journal, refusal to share data in accordance with this policy may be grounds for rejection.

How do I reference third party data or code in the data accessibility section?

Please provide details of the previously published article, with a link where possible, eg [X] data are available from Smith et al. [2021]: [URL XXX].

Where the third-party material isn't covered by an open licence, please provide evidence that you have permission to use the data. If your data, code or supporting material was supplied by a third party under licence, and the sharing agreement means it is not possible to share the full dataset or code, please provide as much information as you can regarding the supporting material, and provide contact details of a data manager/curator (or organisation) who may be contacted for additional details if required.

How do I cite datasets and code in the references?

Citing datasets and code ensures effective and robust dissemination and appropriate credit to authors. Therefore, we strongly encourage authors to include datasets and code in the reference list as well as in their data accessibility section.

Citations in Royal Society journals are in the Vancouver style, for example:

1. Torres-Campos I, Abram PK, Guerra-Grenier E, Boivin G, Brodeur J. 2016 Data from: A scenario for the evolution of selective egg colouration: the roles of enemy-free space, camouflage, thermoregulation, and pigment limitation. Dryad Digital Repository https://doi.org/10.5061/dryad.5qt2k
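For authors who manage references in scripts, the layout of the example above can be reproduced with a small helper. This is a minimal sketch that mirrors only the single example shown here, not a full specification of the journal's reference style; the function name `format_dataset_citation` and its parameters are hypothetical.

```python
def format_dataset_citation(
    authors: list[str], year: int, title: str, repository: str, doi: str
) -> str:
    """Format a dataset reference in the layout of the example above."""
    # Authors are given as "Surname Initials" and joined with commas.
    author_part = ", ".join(authors)
    # Avoid a doubled full stop if the title already ends with one.
    title_part = title.rstrip(".")
    return f"{author_part}. {year} {title_part}. {repository} https://doi.org/{doi}"


citation = format_dataset_citation(
    ["Torres-Campos I", "Abram PK", "Guerra-Grenier E", "Boivin G", "Brodeur J"],
    2016,
    "Data from: A scenario for the evolution of selective egg colouration: "
    "the roles of enemy-free space, camouflage, thermoregulation, and pigment limitation",
    "Dryad Digital Repository",
    "10.5061/dryad.5qt2k",
)
print(citation)
```

The printed string matches the example reference above without its list number.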

Source code or the commercially available software used should be referenced in an appropriately formatted citation – this article provides guidance.

Common repositories

General repositories

Where no appropriate subject-specific repository exists, data should be deposited in a general repository such as Dryad, Zenodo or the Open Science Framework (OSF).

Biological Sciences

Nucleotide sequence data

Accession numbers must be provided in the data accessibility section of your manuscript.

Microarray data

Protein sequences

Accession numbers must be provided in the data accessibility section of your manuscript.

Proteomics data

We recommend that all proteomics data, including mass spectrometry and protein interaction data, are deposited via the EBI PRIDE website.

Physical Sciences

Chemical data

Chemical structures and bioassays should be deposited in PubChem.

Earth, space and environmental science data

A useful list of repositories can be found on the AGU website.

A frequently used repository is pangaea.de, a member of the World Data System.

Frequently asked questions

What are the benefits of making data and other materials available?

We require supporting data and information, including source code and other digital research materials, to be made publicly available on publication of articles, as well as at submission for verification/review purposes. Making data available can increase citation levels and draw attention to your work. Other benefits include:

  • Verifying results – readers and reviewers can replicate studies and identify statistical or methodological errors
  • Allowing others to build on your work, find new uses for your data and use them in meta-analyses (reducing effort in data collection), thereby increasing the visibility and impact of your work
  • Preserving your full scientific contributions (beyond published articles) in an organised, citable system
  • Taking advantage of professional curation services
  • Reducing the likelihood of repeated work
  • Increasing value for money for funders
  • Catching accidental errors or problems with analysis prior to publication, because data are provided at submission, potentially leading to fewer retractions

Learn more in our Data archiving video.

What licence should apply to datasets?

Please ensure that the licence applied to your dataset is clearly visible on the repository landing page of your data record. All data deposited to Dryad through the integrated submission system will be published under a Creative Commons BY 4.0 licence, as will all supplementary files. Data which do not explicitly have an open licence are not open data. Wherever possible, we ask that authors ensure that the licence accompanying their data record is an open data licence, either CC0 or CC-BY.

Exceptions to the above may be made depending on the circumstances (for example, due to ethical considerations, or if data are obtained from a third party where re-use restrictions may apply), but we ask that authors query this with the editorial office prior to submission to the journal.

What do I do if there are restrictions on accessing my data due to ethical and/or legal reasons?

If data are restricted, eg for ethical and/or legal reasons, you should discuss this with the editors prior to submission.

What are the embargo restrictions?

The general policy is that data, code and materials must be made publicly available at the time of publication. Exceptions to this policy are rare and can only be approved at the journal’s discretion.