Reproducing errors and the replication crisis in science

Buttondown Blog:

The usual problem people raise with research is the cost: if you don’t have an institutional subscription to a journal, reading a single paper can cost 40 bucks. If you’re skimming dozens of papers, you’re suddenly paying in the thousands just to learn “is planning good”. Fortunately you can get around the paywalls with things like sci-hub. Alexandra Elbakyan has done more for society than the entire FSF. YEAH I WENT THERE
Not that more recent papers are necessarily good! I mentioned earlier that most secondary sources are garbage. So are most primary sources. Doing science is hard and we’re not very good at it! There are lots of ways to make a paper useless for our purposes.

(emphasis mine)

While this blog post reads (and sounds) like a rant, here’s something for the oncologists to munch on. How many papers are a rehash of what’s been covered previously? Why do we need tons of studies on “de-escalation” just because it appears to be the flavour of the season? Where’s the rigorous radiobiological proof that it works? Was any biological optimisation done?

While the blog post is about computer science, you can’t miss the parallels here. The “scientific-industrial complex” prevents (or discourages) replication experiments.

The author writes further:

The average developer thinks empirical software engineering is a waste of time. How can you possibly study something as complex as software engineering?! You’ve got different languages and projects and teams and experience levels and problem domains and constraints and timelines and everything else. Why should they believe your giant incoherent mess of sadness over their personal experience or their favorite speaker’s logical arguments?

I believe (and I am assuming here) that this will have a substantial impact on the AI projects being implemented. You need to check carefully, and repeatedly, what’s happening in the plumbing to ensure that the results are meaningful and apply uniformly to the use case. An end user shouldn’t blindly follow the outputs. I still believe that placing the code out in the open will prove more beneficial.