AI: Academic Fraud and Collusion

Michael Littman writes:

Collusion rings extend far beyond the field of computer architecture. I will share another data point, from artificial intelligence and machine learning….Overall, stakes are high because acceptance rates are low (15%–25%), opportunities for publishing at any given conference are limited to once a year, and publications play a central role in building a researcher’s reputation and ultimate professional success. 

Given that many conferences have to cap the number of accepted papers due to limits on the number of papers that can be presented at the conference, that means other deserving papers are being rejected to make room. Scientific research is a deeply co-operative endeavor. Researchers compete for attention and funding resources, but also build their ideas on top of those of their rivals. Most researchers see their work as a quest for deeper understanding, not just a way to pay the bills. 

I won’t go into the specifics of the “collusion rings”; they don’t merit much attention, because a variation of these “rings” exists everywhere. I call this the “academic flywheel” effect: you need to know someone before your paper can be presented as a “top pick”. It helps to align with the “partnering institution”, because most overseas publications are “reserved” for “abstracts”, however compelling the merits of a submission may be.

T. N. Vijaykumar writes with more details of the “collusion ring”:

Here is what I heard from an award-winning professor about what happened. The professor has first-hand knowledge of the investigations and has communicated directly with the investigators, but wishes to remain anonymous: There is a chat group of a few dozen authors who in subsets work on common topics and carefully ensure not to co-author any papers with each other so as to keep out of each other’s conflict lists (to the extent that even if there is collaboration they voluntarily give up authorship on one paper to prevent conflicts on many future papers). They exchange papers before submissions and then either bid or get assigned to review each other’s papers by virtue of having expertise on the topic of the papers. They give high scores to the papers. If a review raises any technical issues in PC discussions, the usual response is something to the effect that despite the issues they still remain positive; or if a review questions the novelty claims, they point to some minor detail deep in the paper and say that they find that detail to be novel even though the paper itself does not elevate that detail to claim novelty. Our process is not set up to combat such collusion.

Technically speaking, it’s “hearsay” and would not be admissible by any stretch of the imagination. Likewise, blogs are opinionated rants, yet “editorials”, once published, have a “ring of truth” to them. They are glorified blog posts too, only published. I don’t think you need to slap a peer review on each of them!

Jacob Buckman writes:

Explicit academic fraud is, of course, the natural extension of the sort of mundane, day-to-day fraud that most academics in our community commit on a regular basis. Trying that shiny new algorithm out on a couple dozen seeds, and then only reporting the best few. Running a big hyperparameter sweep on your proposed approach but using the defaults for the baseline. Cherry-picking examples where your model looks good, or cherry-picking whole datasets to test on, where you’ve confirmed your model’s advantage. Making up new problem settings, new datasets, new objectives in order to claim victory on an empty playing field. Proclaiming that your work is a “promising first step” in your introduction, despite being fully aware that nobody will ever build on it. Submitting a paper to a conference because it’s got a decent shot at acceptance and you don’t want the time you spent on it go to waste, even though you’ve since realized that the core ideas aren’t quite correct.
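The first habit in that list, running many seeds and reporting only the best few, is easy to see in numbers. Here is a minimal sketch in Python. Everything in it is made up for illustration: the "true" score of 0.70, the noise level, the 24 seeds and the function names are my assumptions, not anything from Buckman's post or from a real experiment.

```python
import random
import statistics


def simulate_scores(n_seeds, true_mean=0.70, noise=0.05, rng_seed=0):
    """Hypothetical stand-in for training one model per seed.

    Each run's score is the (assumed) true performance plus Gaussian noise.
    """
    rng = random.Random(rng_seed)
    return [rng.gauss(true_mean, noise) for _ in range(n_seeds)]


def honest_report(scores):
    """Mean and sample standard deviation over every seed that was run."""
    return statistics.mean(scores), statistics.stdev(scores)


def cherry_picked_report(scores, keep=3):
    """Mean over only the `keep` best seeds, the practice being criticised."""
    best = sorted(scores, reverse=True)[:keep]
    return statistics.mean(best)


scores = simulate_scores(24)  # "a couple dozen seeds"
mean_all, std_all = honest_report(scores)
mean_best = cherry_picked_report(scores, keep=3)

print(f"all {len(scores)} seeds: {mean_all:.3f} +/- {std_all:.3f}")
print(f"best 3 seeds only: {mean_best:.3f}")
```

The mean of the best few runs can never be lower than the mean of all runs, so the cherry-picked number flatters the method even when nothing about the method has changed. Reporting every seed along with the spread is the plain fix.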

I think it calls into question the “whole idea of science” itself. Looking beyond “peer review” is essential, because the “system is rigged and broken”. No one has a “better idea” to replace it anyway, and anyone who proposes one will only represent a new order of academic gatekeeping, setting the tone and tenor for more friction. It will eventually boil down to more acrimony in the system, which is essentially made up of our own peers. Therefore, as much as I detest the idea of science going the way venture capitalists fund their startups, the approach may have merit because it embraces inherent “risk-taking”. Venture capitalists know that most of their investments will turn out to be duds, but the few that achieve breakout success will eventually overshadow the failures. Committees are inherently risk-averse; they are where new ideas often go to die.

This often validates my own stand and assumption: the idea of AI in healthcare requires a shakedown to see which bad apples fall out. I had an intriguing discussion with a researcher about his push towards image recognition of specific patterns in pathology slides. The details don’t matter, but I was taken aback by his claim that his algorithm promised 99% accuracy. It was being validated in some trial or study somewhere. I cross-checked with another researcher, who could point out flaws in the system and promised even better accuracy through the use of other combinations of datasets and algorithms. No one has the “single version of truth”, and “good science” will eventually find its way out.

That’s why I insist on having a diversity of ideas and breaking out of our own echo chamber. I may disagree with others “fundamentally”, but it is critical to grasp the larger picture rather than stay within the narrow confines of “tunnel vision” under the “ambit of specialisation”.
