Analysis of data journal guidelines in re3data COREF

Main Authors: Kindling, Maxi; Strecker, Dorothea
Format: Proceeding poster Journal
Published: 2020
Online Access: https://zenodo.org/record/4264513
Contents:
  • Data journals enable the publication of data papers, i.e. peer-reviewed, citable and indexed publications that describe datasets as well as the methodology and circumstances of their collection. Data papers usually describe research data in order to demonstrate their potential for reuse. Some journals also aim to showcase datasets that are considered of outstanding importance for research in the respective field. Data journals often provide explicit guidelines for authors and reviewers that cover the formal requirements and review criteria for data papers and the datasets they describe. Data journals expect a data availability section or statement; some encourage data sharing, others mandate it. Some of these guidelines also recommend research data repositories for depositing and sharing datasets and name criteria for selecting suitable repositories. This poster presents preliminary results of an extensive study on data journal guidelines, providing a quantitative description of the data journal sample as well as findings from a qualitative content analysis. Since data papers and data journals can currently be neither clearly defined nor reliably identified, it is not possible to create a complete sample of all data journals. Therefore, the sampling is based on previous studies and includes additional sources, for example publicly available lists of data journals and websites of publishers known from previous studies. The resulting sample comprises 142 journals. Information on all journals in the sample was collected automatically (adding bibliographic information via the Crossref API) and manually (visiting journal websites). Of the 142 journals in the sample, 15 (10.56 %) are categorized as pure data journals that focus primarily on publishing data papers. Most data journals started publishing after 2000. Recommended repositories and criteria for repository selection were extracted from journal guidelines using qualitative content analysis.
For most journals, several guideline documents were included in the analysis (e.g. author guidelines, reviewer guidelines, data policies, specific guidelines for data papers). The guidelines explicitly mention more than 220 repositories; 166 of these recommended repositories are currently listed in re3data. Large generalist repositories are mentioned frequently, but most recommended repositories (120) are disciplinary. Criteria for repository selection include assigning persistent identifiers (PIDs), ensuring long-term availability of datasets, widespread use in the respective community, open licensing, enabling standardized descriptions of datasets, and the sustainability and trustworthiness of the repository. To determine the extent to which the 2580 research data repositories listed in re3data currently meet these requirements, information on the criteria PID assignment, open licensing and metadata standardization was extracted from the re3data API. The analysis showed that 53.68 % of all repositories offer open licenses, only 37.36 % explicitly state which metadata schema is used, and only 35.43 % assign persistent identifiers. The poster provides a brief insight into the landscape of data journals and summarizes characteristics of research data repositories recommended in data journal guidelines. First results show that data journal guidelines reflect the need for transparent research and reusable data across all disciplines. The analysis is a first step in the process of aligning re3data more closely with the needs and requirements of different actors in the field of research data sharing.
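
The percentages reported above are simple shares of a total, and the three repository criteria can be tallied in the same way. The following is a minimal sketch of that kind of tally, not the authors' actual analysis code. The `Repository` class, its field names and the four toy records are hypothetical stand-ins for illustration; they do not reflect the real re3data schema or real re3data entries.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical, simplified stand-in for one repository record."""
    name: str
    pid_systems: list = field(default_factory=list)         # e.g. ["DOI"]
    open_licenses: list = field(default_factory=list)       # e.g. ["CC BY"]
    metadata_standards: list = field(default_factory=list)  # e.g. ["Dublin Core"]


def share(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 2)


def criteria_shares(repos: list) -> dict:
    """Percentage of repositories that meet each of the three criteria.

    A criterion counts as met when the corresponding list is non-empty.
    """
    total = len(repos)
    return {
        "open_licensing": share(sum(bool(r.open_licenses) for r in repos), total),
        "metadata_standard": share(sum(bool(r.metadata_standards) for r in repos), total),
        "pid_assignment": share(sum(bool(r.pid_systems) for r in repos), total),
    }


# Share of pure data journals in the sample, using the counts from the abstract.
pure_data_journal_share = share(15, 142)  # 10.56

# Toy records (invented for illustration only, not real re3data data).
toy_repos = [
    Repository("A", pid_systems=["DOI"], open_licenses=["CC BY"],
               metadata_standards=["Dublin Core"]),
    Repository("B", open_licenses=["CC0"]),
    Repository("C"),
    Repository("D", pid_systems=["Handle"]),
]
toy_shares = criteria_shares(toy_repos)
```

With real data, the list passed to `criteria_shares` would be built from the records retrieved via the re3data API; how those records map onto the three fields is left open here because the abstract does not specify it.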