Reproducing errors and the replication crisis in science

Buttondown Blog:

The usual problem people raise with research is the cost: if you don’t have an institutional subscription to a journal, reading a single paper can cost 40 bucks. If you’re skimming dozens of papers, you’re suddenly paying in the thousands just to learn “is planning good”. Fortunately you can get around the paywalls with things like sci-hub. Alexandra Elbakyan has done more for society than the entire FSF. YEAH I WENT THERE
Not that more recent papers are necessarily good! I mentioned earlier that most secondary sources are garbage. So are most primary sources. Doing science is hard and we’re not very good at it! There are lots of ways to make a paper useless for our purposes.

(emphasis mine)

While this blog post reads (and sounds) like a rant, here’s something for the oncologists to munch on. How many papers are a rehash of what’s been covered previously? Why do we need tons of studies on “de-escalation” just because that appears to be the flavour of the season? Where’s the rigorous radiobiological proof that it works? Was any biological optimisation done?

Although the blog post is about computer science, you can’t miss the parallels. The “scientific-industrial complex” prevents (or at least discourages) replication experiments.

The author writes further:

The average developer thinks empirical software engineering is a waste of time. How can you possibly study something as complex as software engineering?! You’ve got different languages and projects and teams and experience levels and problem domains and constraints and timelines and everything else. Why should they believe your giant incoherent mess of sadness over their personal experience or their favorite speaker’s logical arguments?

I believe (and I am assuming here) that this will have a substantial impact on the AI projects being implemented. You need to iterate carefully over what’s happening in the plumbing to ensure that the results are meaningful and apply uniformly to the use case. An end user shouldn’t blindly follow the outputs. I still believe that placing the code out in the open will prove more beneficial.
