Abstract
Youyou et al. (1) estimated the replicability of more than 14,000 psychology papers using a machine learning model trained on the main texts of 388 replicated studies. The authors reported mean replicability scores for psychological subfields. To validate the model, they also reported correlations between its predictions and study details it was not trained on (i.e., P value and sample size).
In attempting replication, we identified important shortcomings of the approach and findings. First, the training data contain duplicated paper entries. Second, our analysis shows that the model predictions also correlate with variables that are not causal to replicability (e.g., language style). These issues undermine the validity of the model output and thereby paint an erroneous picture of replication rates in psychological science. In this letter, we attempt to mitigate these issues and nuance the findings of the original paper.
| Original language | English |
| --- | --- |
| Journal | Proceedings of the National Academy of Sciences of the United States of America |
| Volume | 120 |
| Issue number | 33 |
| ISSN | 0027-8424 |
| DOI | |
| Status | Published - 7 Aug 2023 |
Keywords
- Replicability
- Machine learning
- Psychology
- Causal inference
- Model validation