Guidelines: Evidence-based medicine eats itself

I have warned against using "statistics and the p-values" blindly. However, this post struck a chord, and I have included it in its entirety. The "guidelines" stem from the need to secure "insurance coverage" and don't represent the populations we are treating. For example, brachytherapy for prostate cancer is safe, cheaper, and, some may argue, more effective (at the expense of man-hours), but it receives a lower billable amount. For the same effort and time, a clinician may push for stereotactic radiation (the new bling bling!) and claim higher value.

The wicked system of incentives is slaying medicine. Guidelines aren't to be venerated, but then again, no one ever got indicted for following "guidelines".

Most papers demand more than a working grasp of statistics, mobilizing an ever more esoteric bunch of "jargon" for "identifying parallels". It is hair-splitting by any other name. Does it benefit patients in any way? Plucking out survival graphs for tweetorials is an exercise fraught with unripe cherry-picking. Get real. Let's examine what needs to be worked out to make medicine more widely available.

There are three commonly stated principles of evidence-based research:

1. Reliance when possible on statistically significant results from randomized trials;

2. Balancing of costs, benefits, and uncertainties in decision making;

3. Treatments targeted to individuals or subsets of the population.

Unfortunately and paradoxically, the use of statistics for hypothesis testing can get in the way of the movement toward an evidence-based framework for policy analysis. This claim may come as a surprise, given that one of the meanings of evidence-based analysis is hypothesis testing based on randomized trials.

The problem is that principle (1) above is in some conflict with principles (2) and (3). The conflict with (2) is that statistical significance or non-significance is typically used at all levels to replace uncertainty with certainty—indeed, researchers are encouraged to do this and it is standard practice. The conflict with (3) is that estimating effects for individuals or population subsets is difficult.

A quick calculation finds that it takes 16 times the sample size to estimate an interaction than a main effect, and given that we are lucky if our studies are powered well enough to estimate main effects of interest, it will typically be hopeless to try to obtain near-certainty regarding interactions. That is fine if we remember principle (2), but not so fine if our experiences with classical statistics have trained us to demand statistical significance as a prerequisite for publication and decision making.
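The "16 times" figure can be sketched with a back-of-envelope calculation. A minimal illustration (my own, under the usual assumptions for this argument: a trial with total sample size N, equal splits, common outcome standard deviation sigma, and an interaction assumed to be half the size of the main effect):

```python
import math

# Hypothetical numbers for illustration only.
sigma = 1.0  # outcome standard deviation
N = 400      # total sample size

# Main effect: compare N/2 treated vs N/2 control.
# SE = sigma * sqrt(1/(N/2) + 1/(N/2)) = 2*sigma/sqrt(N)
se_main = sigma * math.sqrt(1 / (N / 2) + 1 / (N / 2))

# Interaction: difference of treatment effects in two subgroups,
# i.e. a difference of differences across four cells of N/4 each.
# SE = sigma * sqrt(4/(N/4)) = 4*sigma/sqrt(N), twice se_main.
se_inter = sigma * math.sqrt(4 / (N / 4))

print(se_inter / se_main)  # -> 2.0: the interaction's SE is doubled

# If the interaction is also only half the size of the main effect,
# its effect/SE ratio shrinks by 2 (smaller effect) * 2 (larger SE) = 4.
# Since SE scales as 1/sqrt(N), recovering that ratio takes 4**2 = 16x N.
print((2 * 2) ** 2)  # -> 16
```

So the 16 comes from two doublings, each of which must be squared when translated into sample size.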

Evidence-based medicine eats itself « Statistical Modeling, Causal Inference, and Social Science