Jan 9 · edited Jan 9 · Liked by Ben Recht

> When you write scripts to digest tens of thousands of trial reports, and you don’t look at any of

> them, no one has any idea what the content or value of all of this data is [...]

> But you shouldn’t believe anything. Especially nothing that comes from an unreproducible process

> of web crawling. Let me reiterate: this paper about reproducibility is itself unreproducible.

Ben: These are serious allegations you are making here publicly. I did the data collection very rigorously, and the Cochrane data for each systematic review is extremely well-structured XML, including metadata. I created an R package in 2021 that can import such data files. I have just given it a minor update and added more documentation. You can now download and collect the data yourself: https://github.com/schw4b/cochrane/.
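To make the "well-structured XML" claim concrete, here is a minimal sketch of what importing such a review file involves. The tag and attribute names below (`review`, `title`, `studies`, `outcome`, and the sample values) are illustrative assumptions for this comment, not the actual Cochrane schema or the API of the `cochrane` R package.

```python
# Sketch: pulling a few fields out of a Cochrane-style review XML file.
# Tag/attribute names here are hypothetical, for illustration only.
import xml.etree.ElementTree as ET

sample = """<review doi="10.1002/example.CD000000">
  <title>Example intervention for an example condition</title>
  <studies>
    <study id="S1"><outcome events="12" total="50"/></study>
    <study id="S2"><outcome events="8" total="47"/></study>
  </studies>
</review>"""

def parse_review(xml_text):
    """Return DOI, title, and per-study outcome counts as a dict."""
    root = ET.fromstring(xml_text)
    studies = [
        {
            "id": s.get("id"),
            "events": int(s.find("outcome").get("events")),
            "total": int(s.find("outcome").get("total")),
        }
        for s in root.findall("./studies/study")
    ]
    return {
        "doi": root.get("doi"),
        "title": root.findtext("title"),
        "studies": studies,
    }

rec = parse_review(sample)
```

Because the metadata travels with the outcome data in one structured file, a collection script like this can be rerun end-to-end by anyone, which is the reproducibility property under dispute in this thread.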

Author reply

Thanks for the code update, Simon. My main issue is that papers about replication must hold themselves to the same standards they seek to set. In particular, your NEJM E paper was based on a model derived in your Stat in Med paper, which in turn was based on code from an online preprint. But it was never clear how to move between these:

1. The OSF repo has no instructions for how to produce the csv file: https://osf.io/xjv9g/

2. The Stat in Med paper omits the exclusion rules that Erik notes here: https://open.substack.com/pub/argmin/p/is-the-reproducibility-crisis-reproducible?r=p7ed6&utm_campaign=comment-list-share-cta&utm_medium=web&comments=true&commentId=46899660

You could argue that these are minor issues that could be fixed by private correspondence, but it's 2024 and that shouldn't be necessary. It especially shouldn't be necessary for a paper questioning the replicability of all medical trials. My concern is that no one can achieve the Platonic ideal of reproducible science that metascientific critiques argue for.
