Since Open Science has become a recurring buzzword for recent meta-scientific developments, this article summarizes what these developments entail. What are the reasons for discussions about Open Access, Open Data and Open Peer Review? Which technological changes can we expect and which impact will they have on society and the research community?
Rima-Maria Rahal1 & Johanna Havemann2
1 Tilburg University, The Netherlands
2 Institute for Globally Distributed Open Research and Education, Germany
This article was written as part of the Open Science Fellow Program 2018/19 and originally published in German (“Wissenschaft in der Krise: Ist Open Science der Ausweg?”) under a CC-BY-SA license on 04.04.2019 in Forum Wissenschaft at bdwi.de/forum/archiv/archiv/10732812.html. This English translation is archived as a preprint at osf.io/preprints/metaarxiv/3hb6g/, Preprint DOI: 10.31222/osf.io/3hb6g.
Open Science is committed to making scientific questions, methods and findings freely accessible and usable for all.[1] In this sense, the term Open Science stands for open and transparent research whose processes and results are as comprehensible, robust and transparent as possible, involving both openness and enabling good scientific practice in the digital age, for which appropriate tools and techniques are being developed.
Why do we need Open Science?
In fact, the majority of scientific work is currently not done “openly”. On the contrary, most research results in the empirical humanities and natural sciences are hidden behind the paywalls of profitable private science publishers and only accessible to a few financially strong institutions and individual researchers. In addition, positive and ground-breaking results are published preferentially, so that important information and findings about other studies remain hidden in file drawers and in data storage places, never reaching the scientific discourse. Methodological approaches, software and laboratory equipment are usually inadequately documented in the scientific literature and are difficult, if not impossible, to standardize, understand and analyze methodically and experimentally.
In the last 10 to 15 years, the dogma “publish or perish” has become deeply entrenched in the scientific landscape. It describes the enormous pressure to publish as quickly as possible and to accumulate, over the course of a career, as long a list as possible of peer-reviewed publications in scientific journals that are as ‘prestigious’ as possible. Consequently, the volume of specialist literature is constantly increasing, making it virtually impossible to obtain an overview of relevant results in any field of research. Decisions about the granting of research funds and the allocation of scientific positions have so far often been based on the number of publications in journals with the highest possible impact factor. Although the impact factor provides information on the average number of citations per article in the journal, it cannot provide information on the quality of individual articles. Therefore, research funding institutions and science consortia are increasingly moving away from the impact factor as an indicator of research quality.[2] Nevertheless, great pressure remains, so that scientists often feel forced to publish in high impact factor journals and to produce results that make such publications more likely.
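Why a journal-level average says little about an individual article can be illustrated with a small sketch. The citation counts below are invented for illustration, and the plain mean is only a simplified stand-in for the impact factor, whose official calculation uses a specific citation window.

```python
from statistics import mean, median

# Hypothetical citation counts for ten articles in one journal (invented numbers).
citations = [0, 0, 1, 1, 2, 2, 3, 4, 7, 180]

# Journal-level average, analogous to what an impact-factor-style metric reports.
journal_average = mean(citations)

# Citations received by the article in the middle of the distribution.
typical_article = median(citations)

# Share of articles cited less often than the journal average.
share_below_average = sum(c < journal_average for c in citations) / len(citations)

print(f"average citations per article: {journal_average}")
print(f"median citations per article:  {typical_article}")
print(f"share of articles below the average: {share_below_average:.0%}")
```

In this made-up example a single highly cited article lifts the average far above what most articles in the journal receive, which is why the average cannot stand in for the quality of any one paper.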
Under this pressure to publish, the withholding of scientific results can also create challenges for scientific practice and lead to questionable research results. These problems are by no means new, even if the current developments in the course of the Open Science movement have heralded a new phase in how they are being addressed. As early as 1962, for example, J. Cohen complained that empirical psychological studies were based on sample sizes that were too small.[3] In 2011, researchers came to the same conclusion.[4] When small samples are combined with the regrettable tendency to publish only significant results (the so-called file drawer problem, which affects all empirical sciences and leads to a distorted scientific literature[5]), the probability increases that published research results represent findings that are actually wrong.[6] Consequently, the reliability and robustness of large parts of the scientific literature are questionable.
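The link between small samples and wrong published findings can be made concrete with the positive predictive value discussed in [6]. The following is a minimal sketch of the simplest case, without any additional bias; the function name and the example values for power and prior odds are assumptions chosen for illustration, not figures from the cited studies.

```python
def positive_predictive_value(power, alpha, prior_odds):
    """Share of significant results that reflect a true effect.

    power:      probability of detecting a true effect (1 - beta)
    alpha:      significance level, i.e. the false-positive rate
    prior_odds: ratio of true to false hypotheses being tested in a field
    """
    true_positives = power * prior_odds  # true effects that reach significance
    false_positives = alpha              # null effects that reach significance
    return true_positives / (true_positives + false_positives)


# Illustrative assumption: one in five tested hypotheses is true (odds 1:4).
well_powered = positive_predictive_value(power=0.80, alpha=0.05, prior_odds=0.25)
small_samples = positive_predictive_value(power=0.20, alpha=0.05, prior_odds=0.25)

print(f"well-powered studies: {well_powered:.0%} of significant findings are true")
print(f"small-sample studies: {small_samples:.0%} of significant findings are true")
```

Under these assumed values, lowering statistical power alone reduces the share of true findings among significant results considerably. If only significant results are published, that share is what readers of the literature actually see.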
The accumulation of problematic research publications finally led to the big bang in psychology and other empirical sciences at the beginning of the 2010s: In the aftermath of a few prominent scientific fraud cases, it turned out that not only these obvious cases of scientific misconduct produced questionable research results. Rather, there are also other problematic practices in everyday scientific life which make it difficult to distinguish robust and comprehensible scientific papers from those that have little veracity. Some opinion leaders[7] lament this “crisis” in the empirical sciences and fear that it will lead to a far-reaching loss of credibility, while others see no “crisis” at all. Still others see this “crisis” rather as an opportunity for positive change, releasing energy to achieve sustainable changes in the way science is made.[8]
The path to Openness: What can Open Science accomplish?
The Open Science movement is broadly committed to ensuring that the entire research cycle shifts towards openness, thereby advancing good scientific practice. Important components of this shift are the support of free access to scientific literature (Open Access) and to data sets (Open Data), of open software and hardware for data collection and processing, as well as of freely available teaching and learning materials (Open Educational Resources). These overarching strategies for opening up science are enriched by tools and practices that enable science to be conducted in a more transparent and robust manner.
Open Science through reliable and robust Research
The Open Science movement has already achieved a number of improvements and advances in everyday research. These include, for example, the increasing number of replication studies carried out and published. Replications are research projects aiming to repeat previous research as accurately as possible to determine whether the same conclusions can be reached again. In this way, reliable findings can be identified in the existing literature, and future research can build on them. In addition to a large number of projects resulting from individual efforts of interested research teams or networks, coming together for instance via the study exchange network Study Swap[9], research laboratories from different institutions increasingly work together on large-scale replication attempts. For example, the Reproducibility Project: Psychology[10] was collaboratively conducted by more than 250 scientists who repeated 100 studies from various fields of psychology. Similar replication projects have been carried out in cancer research[11], behavioral economics[12], and experimental philosophy[13]. Other large-scale projects focus on one central study and repeat it in multiple laboratories.[14] Replication projects, whether large or small, are suitable for systematically reviewing scientific literature. Changes in scientific practice can also be seen in traditional research work and address, for instance, the above-mentioned adequacy of sample sizes for empirical studies. Increasingly, journals demand that the necessary sample size be determined a priori, that is, before data are collected for the study. With such stringent planning of experiments, the reliability of empirical findings can increase substantially.
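As an illustration of what determining the sample size a priori can look like, the sketch below uses the common normal approximation for comparing the means of two independent groups. The function name, the default values for alpha and power, and the example effect sizes are assumptions made for illustration. Dedicated power analysis tools use the exact t distribution and may return slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided
    comparison of two group means (normal approximation).

    effect_size: expected standardized mean difference (Cohen's d)
    alpha:       significance level of the two-sided test
    power:       desired probability of detecting the effect
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

The smaller the expected effect, the more participants are needed, which is why planning the sample size before data collection matters most for the subtle effects that are common in empirical research.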
Open Science through open material and transparent methods
Open Science also increasingly influences research methods and materials. For example, the Gathering for Open Science Hardware[15] aims to reduce barriers between manufacturers and users of research materials and equipment. Based on open materials, 3D printers and user manuals, simple to highly complex laboratory devices and materials can often be produced locally at low cost, and scientists can adapt them specifically to the needs of the experiment in question.[16] This makes the materials more cost-effective, which lowers the barriers to their acquisition, and scientists can carry out their experiments in a timely, uncomplicated and precisely tailored manner. In this way, scientists can bypass the generalized apparatuses and research equipment of commercial manufacturers, which are often highly complex, opaque in their setup, and therefore difficult to adapt.
A similar approach is being applied to materials, chemical compounds and protocols of laboratory experiments. On digital platforms such as Protocols[17] and with electronic laboratory notebooks such as Labfolder[18] or Benchling[19], scientists can document their methodology in detail, making it accessible and reproducible for other researchers. Open source software, the source code of which is freely accessible, is increasingly being utilized for the presentation of stimuli or for carrying out data evaluations and statistical analyses (e.g. R[20]). In addition to increased transparency, open source software offers users the opportunity to participate in the development of the research programs by making their own code publicly available. The code written by users can be integrated into the programs as a package or collected on software platforms such as GitHub[21], where it is accessible to others. There is now a large community of users of these software solutions who exchange information and help each other in forums such as Stack Overflow[22]. The process of writing articles is also becoming increasingly reproducible by linking data analysis and text production with software solutions such as Jupyter Notebooks[23].
Open Science opens up the publishing landscape
In addition to technical changes, Open Science has also led to developments in the publishing process. Meta-analyses, summaries of the status of a field across studies, often provide indications that the tendency to publish studies with significant and surprising effects preferentially has led to a systematic distortion of the literature.[24] Although methods are being developed to correct this distortion mathematically in retrospect, the Open Science debate has sparked calls for changed publication formats to avoid such distortions from the outset or to make them more readily detectable afterwards. Central to this idea is the concept of pre-registrations: hypotheses to be tested are announced and published in advance, so that it is clear in peer-reviewed published articles which results correspond to these original hypotheses and at which point interesting new but still exploratory and thus uncertain findings have emerged. Currently, nearly 280,000 pre-registrations are publicly available on registration platforms.[25] A growing number of scientific journals now ask authors if their hypotheses were registered in advance. Since 2013, more than 300 scientific articles have been published whose hypotheses were pre-registered in one of the newly created forums for pre-registrations, the Open Science Framework[26].[27] The platform AsPredicted currently counts more than 12,000 pre-registrations.[28] In addition, the number of studies that make data, analysis scripts and data collection materials publicly available in the course of publication is steadily increasing.[29],[30] In about 20 percent of the Anglophone academic journals, openly accessible data are part of the submission requirements.[31] Data repositories such as Metabus[32], Curate Science[33], Gesis[34], Dryad[35], The Dataverse Project[36] and Dspace[37] collect online records for large metastudies.
In total, there are currently more than 2,000 data repositories of various disciplines and fields of research.[38] Research accelerators[39], groups of research teams that jointly decide on new projects, also contribute to a better coordination of research resources. Thus, the scientific process is not only becoming more transparent but also more efficient.
The change in scientific publication practice is also reflected in a newly introduced format called Registered Reports, which a growing number of journals offer as a submission type. Registered Reports can currently be published in 162 journals.[40] Submissions are peer reviewed before the data are collected, and are thus accepted for publication solely on the basis of the theoretical background and the intended methodology, without the significance of the results influencing the decision to publish. Another publication format that is becoming increasingly popular is posting scientific papers in publicly accessible preprint repositories before submission to a traditional journal. For instance, the preprint repository arXiv recorded an increase of 14 percent in the number of newly submitted preprints from areas such as Mathematics, Physics, Computer Science and Economics in 2018 compared to the previous year.[41] New preprint repositories in psychology, agricultural studies, social sciences, engineering and law are emerging. These preprint repositories enable a critical dialogue between authors and scientific audiences even before the traditional publication, which can shape and improve the work while it is being created, rather than restricting criticism to the phase after publication, when the work can no longer be changed. The classic peer review process is also undergoing cautious improvement efforts. Several journals offer Open Peer Review, making the review process public and thereby further increasing the transparency of the overall publication process.[42] Reviewers can make their voluntary work visible and contribute to public discourse with their comments. At the same time, the authors are involved as equal partners.
The Open Science movement contributes to revising the “publish or perish” culture, building instead an approach of “publish (Open Access) and flourish”, which promotes collaboration beyond disciplines and – in many cases – makes it possible in the first place.
Open Science in Teaching and Learning
The shift towards Open Science also has an impact on teaching and learning in the higher education sector. Methodological changes in the course of the Open Science movement have created the need to develop and communicate new learning content. Why does science need openness? What epistemological and statistical necessities are at the root of these developments? How can data be prepared in a way that others can understand, question and use them? Where do you store materials that should be accessible to the public? How can the necessary sample size be calculated? Such questions raised by the shift towards Open Science concern scientific core competences and organizational know-how. By developing community guidelines and standards, individual scientists can benefit from these recommendations and students are trained in science relevant skills from an Open Science perspective in the digital age.
At the same time, the development of new learning materials is accompanied by a change in their publication formats. The proportion of freely available, expertly curated and freely licensed teaching and learning resources is constantly increasing.[43] Free educational resources reduce financial and structural barriers for interested readers, are thus accessible to a broader public globally and make an important contribution to bridging the gap between the countries of the southern and northern hemispheres. In addition, the often digital format of these resources enables timely updates, revisions and corrections driven by current developments in the field. Prominent examples of such open learning materials are the Khan Academy[44], the Open Learning portal of Harvard University[45], the portal OER Commons[46] as well as the online course series Open Science MOOC[47], which is currently being developed. In sum, in addition to the core aspects of science, access to higher education is now also opening up.
Open Science in Science Policy
Open Science is also becoming increasingly relevant in science policy decisions. Funders and sponsors, including the German Research Foundation[48] (DFG) and the European Research Council[49], are more and more committed to the principles of Open Science. About 40% of scientific funding institutions now have Open Data guidelines.[50] Conditions and incentives are progressively created in a way that supports the move towards openness. Whether Open Science practices are used is also increasingly becoming a criterion in the hiring process of scientists, for example in the Department of Psychology of the LMU Munich.[51]
Furthermore, the list of positive effects of comprehensively practiced Open Science also includes economic and ethical aspects. After all, how much more sensibly could financial resources be used if methods and standards were established in a transparent and subject-specific way? Making methods, intermediate results and negative or unexpected results available would not only reduce the materials and time invested in research, but could also significantly reduce the number of animals used in biomedical and natural science experiments.
In addition, economic value is added when research findings can be applied commercially and medically in a timely manner.
Moreover, Open Science also contributes to improving science communication with the public. If research results are communicated transparently, interested non-scientists can gain easier access to research results. In this way, the public and science can move closer together and enter into a livelier debate. The Citizen Science Movement takes this notion further by actively involving the public in the scientific process. Non-specialists become observers, data collectors and real Citizen Scientists who use platforms such as Bürger schaffen Wissen[52] to support large-scale research projects.
Finally, Open Science holds great potential for the democratization of research and learning. At present, for example, much of the content and many of the techniques and articles are produced by Western researchers or members of privileged majority groups. To establish globally democratized conditions in knowledge exchange, institutions and participants from Africa, Latin America and Southeast Asia, as well as members of social minority groups, must also be able to participate. Open Science reduces access barriers and multiplies opportunities for participation.
Conclusion: What happens next with Open Science?
In general, Open Science is a comprehensive reform of how science can be practiced in an accessible, transparent and reusable way. This reform has numerous quantifiable positive effects, in terms of both everyday scientific practice[53] and the interaction between science and society worldwide. Open Science promises improvements not only at the system level for scientific practice in general, but also for individual scientists. Open Science can create the momentum to put the science system in a state that allows scientists to focus on what actually matters for their work: the search for reliable insight.[54] Registered Reports, for example, enable methodologically clean work to be published independently of the significance of the results and thus to become part of the public discourse, which can be made almost completely freely accessible through Open Access, Open Peer Review, Open Data, etc. By means of pre-registrations, researchers can make the thought processes behind their insights transparent and work more efficiently thanks to improved project planning. In the end, this provides freedom from the pressure to chase after surprising findings in one’s own work whatever the cost.
Open Science addresses a number of ongoing and new challenges, such as database standardization, interoperability between different data collection and processing systems, as well as transferability across disciplines. Which data should and may be openly accessible, and where do the protection of personal rights and data security apply? Who benefits from Big Data in science, and how will the technical and methodological changes affect society? These questions must be answered specifically by each working group and individually within the framework of the discipline and current research question. Establishing standards that are as comprehensive as possible, but that also allow flexibility in the interpretation and implementation of individual scientific questions, improves comparability and enables the reproduction and re-use of scientific results.
Open Science is a movement based on the principles of good scientific practice, the scope of which can only be imagined so far. We are at the beginning of the way out of the crisis: all the signs are pointing to “open” and give reason to hope for good things to come.
[1] Open Definition 2.1, https://opendefinition.org/od/2.1/en/, retrieved 23.01.2019.
[2] Press release of the German Research Foundation (23.02.2010), http://www.dfg.de/en/service/press/press_releases/2010/pressemitteilung_nr_07/, retrieved 06.02.2019.
[3] Cohen, J. (1962) The statistical power of abnormal–social psychological research: A review. Journal of Abnormal and Social Psychology, 65, 145-153.
[4] Marszalek, J. M., Barber, C., Kohlhart, J., & Cooper, B. H. (2011). Sample Size in Psychological Research over the Past 30 Years. Perceptual and Motor Skills, 112(2), 331–348. https://doi.org/10.2466/03.11.PMS.112.2.331-348
[5] For the example of clinical studies: Dickersin, K., & Min, Y. I. (1993). NIH clinical trials and publication bias. The Online Journal of Current Clinical Trials, Doc No 50, [4967 words; 53 paragraphs].
[6] Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), 696–701. https://doi.org/10.1371/journal.pmed.0020124
[7] Psychology is in crisis over whether it’s in crisis. https://www.wired.com/2016/03/psychology-crisis-whether-crisis/, retrieved 06.02.2019.
[8] https://openaccess.mpg.de/Berlin-Declaration
[9] https://osf.io/view/StudySwap/
[10] http://science.sciencemag.org/content/349/6251/aac4716, Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
[11] https://elifesciences.org/collections/9b1e83d1/reproducibility-project-cancer-biology
[12] https://experimentaleconreplications.com
[13] https://sites.google.com/site/thexphireplicabilityproject/home
[14] E.g., Bouwmeester, S., Verkoeijen, P. P. J. L., Aczel, B., Barbosa, F., Bègue, L., Brañas-Garza, P., … Wollbrant, C. E. (2017). Registered Replication Report: Rand, Greene, and Nowak (2012). Perspectives on Psychological Science: A Journal of the Association for Psychological Science, 12(3), 527–542. https://doi.org/10.1177/1745691617693624
[15] http://openhardware.science/
[16] Maia Chagas A (2018) Haves and have nots must find a better way: The case for open scientific hardware. PLoS Biol 16(9): e3000014. https://doi.org/10.1371/journal.pbio.3000014
[17] https://www.protocols.io/
[18] https://www.labfolder.com/
[20] https://www.r-project.org
[22] https://stackoverflow.com
[24] Ferguson, C. J., & Brannick, M. T. (2012). Publication bias in psychological science: Prevalence, methods for identifying and controlling, and implications for the use of meta-analyses. Psychological Methods, 17(1), 120–128. https://doi.org/10.1037/a0024445
[25] https://osf.io/registries, retrieved 06.02.2019
[27] https://www.zotero.org/groups/479248/osf/items/collectionKey/VKXUAZM7, retrieved 06.02.2019
[28] https://credlab.wharton.upenn.edu, Retrieved 06.02.2019.
[29] Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., … Nosek, B. A. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLOS Biology, 14(5), e1002456. https://doi.org/10.1371/journal.pbio.1002456
[30] Also see: https://www.ejp-blog.com/blog/2019/1/16/reflection-on-open-science-practices-in-2018, retrieved 05.02.2019.
[31] Vasilevsky, N. A., Minnier, J., Haendel, M. A., & Champieux, R. E. (2017). Reproducible and reusable research: are journal data sharing policies meeting the mark? PeerJ, 5, e3208. https://doi.org/10.7717/peerj.3208
[33] http://curatescience.org/#
[34] https://www.gesis.org/en/services/archiving-and-registering/data-archiving/
[35] http://www.datadryad.org/
[37] https://duraspace.org/dspace/
[38] https://www.re3data.org/metrics, retrieved 06.02.2019
[39] E.g., the Psychological Science Accelerator, https://psysciacc.org
[40] https://cos.io/rr/, retrieved 06.02.2019.
[41] arXiv Update January 2019, https://confluence.cornell.edu/display/arxivpub/arXiv+Update+-+January+2019, retrieved 06.02.2019.
[42] Who’s using open peer review? https://publons.com/blog/who-is-using-open-peer-review/
[43] Chiappe, Andrés, & Adame, Silvia Irene. (2018). Open Educational Practices: a learning way beyond free access knowledge. Ensaio: Avaliação e Políticas Públicas em Educação, 26(98), 213-230. Epub December 18, 2017. https://dx.doi.org/10.1590/s0104-40362018002601320
[44] https://www.khanacademy.org/
[45] https://www.extension.harvard.edu/open-learning-initiative
[46] https://www.oercommons.org/
[47] https://opensciencemooc.eu/
[48] Guidelines for handling research data, http://www.dfg.de/download/pdf/foerderung/antragstellung/forschungsdaten/richtlinien_forschungsdaten.pdf, retrieved 05.02.2019
[49] Open Research Data and Data Management Plans, https://erc.europa.eu/sites/default/files/document/file/ERC_info_document-Open_Research_Data_and_Data_Management_Plans.pdf, retrieved 05.02.2019
[50] http://v2.sherpa.ac.uk/view/funder_visualisations/1.html, retrieved 06.02.2019
[51] Hiring Policy at the LMU Psychology Department: Better have some open science track record, https://www.nicebread.de/open-science-hiring-policy-lmu/, retrieved 05.02.2019
[52] https://www.buergerschaffenwissen.de
[53] Open Science Monitor of the European Commission, https://ec.europa.eu/info/research-and-innovation/strategy/goals-research-and-innovation-policy/open-science/open-science-monitor_en, retrieved 06.02.2019
[54] Also see: https://fivethirtyeight.com/features/psychologys-replication-crisis-has-made-the-field-better/