Medicine’s machine learning problem

Boston Review writes:

As a starting point, we can take five principles to heart. First, it is crucial to acknowledge that medical data—like all data—can be incomplete, incorrect, missing, and biased. Second, we must recognize how machine learning systems can contribute to the centralization of power at the expense of patients and health care providers alike. Third, machine learning designers and adopters must not take new systems onboard without considering how they will interface with a medical system that is already disempowering and often traumatic for patients. Fourth, machine learning must not dispense with domain expertise—and we must recognize that patients have their own expertise distinct from that of doctors. Finally, we need to move the conversation around bias and fairness to focus on power and participation.

I haven’t included the “patient narratives” in the argument, for one simple reason: these essays overblow individual narratives, which are singular outliers rather than representative of the population.

This essay (and its underlying principles) captures the fundamental flaws I have been pointing out: data scrubbing and the useless EMR systems. EMRs have become mainstream in certain geographies and are blamed for physician burnout. There has been no fundamental research into making them easier to use, and at present they are a mish-mash of systems.

If you run algorithms on flawed data, you will only get garbage out.
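To make that concrete, here is a minimal synthetic sketch (not from the article; the simulated data, the 30% corruption rate, and the logistic-regression model are my own illustrative choices). The same classifier is trained once on clean labels and once on partially corrupted labels, and both are scored against the clean ground truth.

```python
# Minimal synthetic sketch of "garbage in, garbage out": the same model,
# trained once on clean labels and once on labels corrupted the way a
# flawed record extract might be, evaluated against clean ground truth.
# All data here is simulated; requires numpy and scikit-learn.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulate a simple two-feature screening problem with known ground truth.
n = 5000
X = rng.normal(size=(n, 2))
y_true = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y_true, test_size=0.3, random_state=0
)

# Corrupt 30% of the training labels to mimic incorrect or missing diagnoses.
noisy = y_train.copy()
flip = rng.random(len(noisy)) < 0.30
noisy[flip] = 1 - noisy[flip]

for name, labels in [("clean labels", y_train), ("corrupted labels", noisy)]:
    model = LogisticRegression().fit(X_train, labels)
    acc = model.score(X_test, y_test)  # always scored against clean ground truth
    print(f"{name}: test accuracy = {acc:.2f}")
```

The point is not the particular numbers but the pattern: the algorithm is identical in both runs, and only the quality of what it was fed changes the outcome.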

Here’s an interesting anecdote, though:

To take another important example of the way medical datasets may systematically misrepresent reality, diagnosis delays are common for many illnesses, leading to incomplete and incorrect data at any one snapshot in time. On average, it takes five years and five doctors for patients with autoimmune diseases such as multiple sclerosis and lupus to get a diagnosis; three-quarters of these patients are women, and half report being labeled as chronic complainers in the early stages of disease. Diagnosis of Crohn’s disease takes twelve months for men and twenty months for women, while diagnosis for Ehlers-Danlos syndrome takes four years for men and sixteen years for women. Consider how many patients have not received an accurate diagnosis yet or who give up before ever finding one. This leads to incomplete and missing data.

The author fails to understand the common dictum: if you diagnose a rare disease, you are rarely correct. If there is a “delay in the diagnosis”, it is because rare diseases mimic the symptoms of common problems. The article has not been vetted by medical experts, of course, but I have included the idea nevertheless. It will only alarm lay readers, who will then push their medical caregivers towards a wild goose chase.
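Whatever the cause of the delay, the narrower mechanism the excerpt describes (records pulled at a single point in time missing patients whose diagnosis has not yet arrived) is easy to simulate. The sketch below is purely hypothetical: it assumes exponentially distributed delays around the Crohn’s averages quoted above (12 months for men, 20 months for women) and an arbitrary 18-month snapshot; the real delay distributions are not given.

```python
# Purely illustrative simulation of the "snapshot" problem in the excerpt:
# patients whose diagnosis has not yet arrived at extraction time show up
# as undiagnosed, so the missing labels are not random.
# Assumes exponentially distributed delays around the quoted averages for
# Crohn's disease (12 months for men, 20 for women); both the distribution
# and the 18-month snapshot are assumptions, not figures from the article.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                 # simulated patients per group, all truly ill
snapshot_months = 18        # time since symptom onset when data is pulled

for group, mean_delay in [("men", 12), ("women", 20)]:
    delays = rng.exponential(scale=mean_delay, size=n)
    diagnosed = delays <= snapshot_months
    print(f"{group}: {diagnosed.mean():.0%} carry a diagnosis at the snapshot")

# Every simulated patient is ill, yet the extract records a lower apparent
# prevalence for the group with longer delays: incomplete data that skews
# any model trained on it.
```

The simulation says nothing about why the delays differ; it only shows how delayed diagnoses turn into systematically missing labels at any one snapshot.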

Therefore, we need integrated approaches (a topic for another post).

Consider this:

Researchers in this area have talked about the need to move beyond explainability (seeking explanations for how an algorithm made a decision) to recourse (giving those impacted concrete actions they could take to change the outcome) and to move beyond transparency (insight to how an algorithm works) to contestability (allowing people to challenge it). In a recent op-ed for Nature, AI researcher Pratyusha Kalluri urges that we replace the question “Is this AI fair?” with the question, “How does this shift power?”

This makes for a lofty statement and only confuses readers. It has nothing to do with how systems are designed, but with how delivery takes place. It is fallacious to assume that a system will deliver the same reproducible results every time without allowing for flexibility in approach. AI will continually be maligned by these probing questions while remaining unvalidated in the larger scheme of things.