Replicability as a Contextual Concept: Rethinking the Reproducibility Crisis in Science

Document Type: Scientific-Research (Original Research Article)

Authors

1 Student in Philosophy of Science, Sharif University of Technology, Tehran, Iran

2 Associate Professor, Department of Philosophy of Science, Sharif University of Technology, Tehran, Iran

Extended Abstract

Introduction and Objectives: Replicability has long been regarded as one of the fundamental criteria distinguishing science from other forms of knowledge. The notion that a scientific finding must be reproducible under similar conditions by independent researchers has implicitly served as a foundation for public trust in scientific results. However, over the past two decades, a growing number of reports of failures to replicate reputable studies—particularly in psychology, the biomedical sciences, and the social sciences—has triggered what is now widely known as the replicability crisis.
This paper, drawing upon philosophical analysis and a review of scholarly and media discussions, seeks to provide new insights by challenging the view of replicability as a singular, universal, and value-free concept. Instead, we propose understanding replicability as a context-dependent and value-laden construct. Based on this perspective, we offer a revised interpretation of what has been labeled a “crisis,” suggesting that it actually reflects a mismatch between discipline-specific expectations of replication and the inappropriate generalization of criteria derived from particular sciences to all others.
Accordingly, we propose a conceptual model to examine diverse forms of replication across disciplines, outlining a spectrum of replication types and suggesting corresponding guidelines for research management. The article ultimately aims to provide a conceptual framework for understanding replicability as a gradational phenomenon, with significant implications for the philosophy of science and research policy.
Method: Since the mid-20th century, critical discussions have challenged the ideal of value-free science, emphasizing that the amount of evidence deemed sufficient for theory confirmation often depends on value-based considerations. Recent findings reveal that researchers are guided—often unintentionally—by judgments influenced by various epistemic and non-epistemic values throughout scientific inquiry. Consequently, the traditional notion of a value-neutral science has lost much of its support, with most contemporary philosophers acknowledging the inevitable role of values in scientific reasoning.
Rather than debating the mere presence or absence of values in science, contemporary scholarship focuses on distinguishing between legitimate and illegitimate influences of values—a distinction referred to as the new demarcation criterion. In this paper, building on the literature concerning the role of values in science, we critically examine major responses to the replicability crisis and, in the final section, address replicability as an epistemic value. Specifically, we investigate whether replicability can be considered a universal epistemic standard across all scientific domains.
Results: Media and academic discussions surrounding the replicability crisis generally fall into two broad categories:

Statistical and methodological approaches, which focus on the deficiencies and limitations of testing instruments and statistical tools.
Institutional and sociological approaches, which emphasize the influence of structural and external incentives on declining replicability.

Three major factors contributing to the crisis can be identified. The first two relate to statistical methodology: (1) the choice of significance thresholds and (2) the determination of prior probabilities—both of which, as shown, are susceptible to value-laden judgments.
The third factor concerns bias in its various forms, arising from the influence of scientific institutions and broader social contexts. Although most scholars acknowledge that bias contributes to the replicability problem, its mechanisms of influence are often overlooked. Bias can distort the epistemic dimensions of inquiry (e.g., data interpretation) as well as the dissemination of results (e.g., publication practices).
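To make the first two statistical factors concrete, the following minimal sketch (ours, not part of the original analyses; all numbers are hypothetical) computes the expected share of true effects among findings that pass a significance threshold, given a discipline-dependent prior probability that tested hypotheses are true and a fixed statistical power. It is the standard base-rate calculation invoked in the replicability literature, and it shows why the same threshold yields very different replication prospects in fields with different priors.

    # Illustrative base-rate sketch (hypothetical numbers, not from the paper).
    # PPV = probability that a result declared "significant" reflects a true effect.

    def positive_predictive_value(prior: float, alpha: float, power: float) -> float:
        """Share of true effects among significant results, given prior, alpha, and power."""
        true_positives = power * prior            # true effects correctly detected
        false_positives = alpha * (1.0 - prior)   # null effects wrongly declared significant
        return true_positives / (true_positives + false_positives)

    if __name__ == "__main__":
        for prior in (0.5, 0.1, 0.01):       # assumed base rates of true hypotheses
            for alpha in (0.05, 0.005):      # conventional vs. stricter threshold
                ppv = positive_predictive_value(prior, alpha, power=0.8)
                print(f"prior={prior:<4}  alpha={alpha:<5}  PPV={ppv:.2f}")

With a prior of 0.5 the calculation gives a PPV of about 0.94 at a threshold of 0.05, but with a prior of 0.01 it drops to roughly 0.14, and tightening the threshold to 0.005 only raises it to about 0.62. In line with the argument above, the epistemic yield of a fixed significance threshold depends heavily on field-specific base rates, so the choice of threshold cannot be value-free or field-neutral.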
While institutional perspectives rightly highlight structural, motivational, and policy-related factors, they often neglect the conceptual and philosophical dimensions of replicability. As a result, proposed solutions tend to focus on administrative or policy reforms rather than conceptual clarification.
Discussion and Conclusion: The findings of this paper indicate that the so-called replicability crisis is not necessarily a symptom of the decay of science or the failure of the scientific method. Rather, it reflects deeper issues—namely, misunderstandings about the nature of replicability, neglect of disciplinary diversity, and the dominance of a monolithic epistemology in evaluating research validity.
In many cases, what appears as a “crisis” is actually the result of misapplied criteria—for instance, extending standards appropriate to experimental physics to disciplines such as psychology or anthropology. From a contextual standpoint, replicability should thus be viewed not as a rigid, universal benchmark but as a variable construct contingent upon the type of science, research aims, cultural context, and institutional expectations.
Reconceptualizing replicability in this way can help move beyond simplistic dichotomies (e.g., good science = replicable / bad science = non-replicable), promoting a comparative epistemology that values methodological diversity rather than enforcing uniformity.
In this study, by identifying a spectrum of replication types across multiple dimensions, we propose corresponding policy and regulatory recommendations tailored to each level. Applying such frameworks can assist researchers, reviewers, and scientific institutions in addressing replicability issues more effectively and enhancing the credibility of research outcomes within their respective contexts.
Scientific institutions, by adopting strategies and infrastructures that treat replicability as context-sensitive, can develop evaluation guidelines and indicators aligned with the specific needs of each field (along three dimensions: control, dynamics, and purpose). This approach can prevent the imposition of uniform standards across all disciplines. Similarly, journal reviewers can use such a spectrum-based framework to better assess the contextual features of submitted research and make more balanced judgments about its replicability.
Acknowledgments: The authors would like to express their sincere gratitude to all professors and colleagues who provided guidance and assistance during this research.
Conflict of Interest: The authors declare that there is no conflict of interest regarding this study.

Keywords

