Adjusting for biases in electronic health record (EHR) data – Healthcare Economist

Let’s say you are interested in measuring the relationship between type 2 diabetes mellitus (T2DM) and depression. In many cases, one would use electronic health record data and run a logistic regression with depression as the dependent variable and T2DM (possibly along with demographics and other comorbidities) as the independent variables. However, the use of EHR data is potentially problematic. As noted in Goldstein et al. (2016), there is a risk of “informed presence” bias: the sample of patients in an EHR likely differs from the general population, since individuals only appear when they have a medical encounter.
Specifically, Goldstein and co-authors note that more frequent visits increase the chance of being diagnosed with a disease:
Quan et al. assessed sensitivities based on International Classification of Diseases, Ninth Revision, codes across 32 common conditions. They found that sensitivities for prevalence of a condition ranged from 9.3% (weight loss) to 83.1% (metastatic cancer). Diabetes with complications, for example, has a sensitivity of 63.6%. Therefore, the more medical encounters someone has, the more likely that the presence of diabetes will be detected.
At the same time, while more encounters reduce the risk of a false negative, they also increase the risk of a false positive due to rule-out diagnoses.
Since phenotype algorithms are typically designed to detect the prevalence of a condition via ever/never algorithms (you either have the condition or you don’t), the more health-care encounters someone has the higher the probability of a false-positive diagnosis.
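To make the ever/never logic concrete, here is a minimal sketch of how the chance of being flagged grows with the number of encounters. It assumes each encounter is an independent chance to record the code, which is a simplification, and the per-encounter rates `SENSITIVITY` and `FALSE_POSITIVE` are hypothetical values chosen for illustration. They are not taken from Goldstein et al. or Quan et al.

```python
def prob_ever_flagged(per_encounter_prob: float, n_encounters: int) -> float:
    """Chance that at least one of n independent encounters records the code."""
    return 1.0 - (1.0 - per_encounter_prob) ** n_encounters


# Hypothetical per-encounter rates, for illustration only.
SENSITIVITY = 0.30     # chance a true case is coded at a given encounter
FALSE_POSITIVE = 0.01  # chance a non-case receives a rule-out code at a given encounter

for n in (1, 2, 5, 10, 20):
    true_case = prob_ever_flagged(SENSITIVITY, n)
    non_case = prob_ever_flagged(FALSE_POSITIVE, n)
    print(f"{n:>2} encounters: true case flagged {true_case:.3f}, "
          f"non-case flagged {non_case:.3f}")
```

Under these assumptions both probabilities rise with the number of encounters: false negatives become rarer and false positives become more common, which is the trade-off described above.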
Two types of bias may arise:
- Bias due to number of physician visits. Figure 1A from this paper shows that the number of encounters may be a confounding factor. It is not evident, however, whether there is potential for M bias, that is, bias from conditioning on a collider. A collider is a variable that is an outcome of two other variables.
- Bias due to general illness. The authors note that a general illness may be the cause of both diabetes and depression. For instance, perhaps someone sustained an injury that led them to exercise less and eat less healthily (causing T2DM), and the injury itself also increased depression. While my example gives a specific injury, the “general illness” in the Goldstein paper may or may not be fully captured or identified. Thus, the authors claim that the number of encounters may be able to serve as a proxy for general illness.

In short, the authors argue that controlling for the number of visits would be useful because (i) diagnosis is correlated with the number of encounters and (ii) the number of encounters may be a proxy for general illness.
The authors then conduct a simulation exercise using EHR data from the Duke University Health System. They conduct four analyses examining the relationship between outcome and exposure, controlling for: (i) demographics only, (ii) medical encounters, (iii) Charlson Comorbidity Index (CCI), and (iv) medical encounters and CCI.
The authors summarize their findings as follows:
If the presence of a medical condition is not captured with high probability (i.e., high sensitivity), there is the potential for inflation of the effect estimate for association with another such condition. This potential for bias is exacerbated when the medical condition also leads to more patient encounters…Theory suggests, and our simulations confirm, that conditioning on the number of health-care encounters can remove this bias. The impact of conditioning is greatest for diagnoses captured with low sensitivity.
The authors note that while there is some concern about M bias (since the number of encounters is likely a collider), M bias is likely substantially less problematic than confounding bias in general. Other studies (Liu et al. 2012) have confirmed that M bias is generally smaller than confounder bias.
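The inflation and its removal by conditioning can be illustrated with a small exact calculation. This is only a sketch under strong assumptions and not the authors’ simulation: the two true conditions are independent, the number of encounters takes just two values and is unrelated to either condition (so there is no collider structure and no M bias), and every prevalence, sensitivity, and encounter count is a hypothetical value. Conditioning is done here by stratifying on encounters, where a regression would include encounters as a covariate.

```python
# All numbers below are hypothetical and chosen only for illustration.
PREV_EXPOSURE = 0.10   # true prevalence of the exposure condition
PREV_OUTCOME = 0.15    # true prevalence of the outcome condition
SENSITIVITY = 0.20     # per-encounter chance that a true case is coded
ENCOUNTER_DIST = {2: 0.5, 10: 0.5}  # encounters -> share of patients


def recorded_prob(prevalence, sensitivity, n_encounters):
    """Chance the condition appears in the record after n independent encounters."""
    return prevalence * (1 - (1 - sensitivity) ** n_encounters)


def cell_probs(p_exp, p_out):
    """2x2 cell probabilities when the two recorded diagnoses are independent."""
    return (p_exp * p_out, p_exp * (1 - p_out),
            (1 - p_exp) * p_out, (1 - p_exp) * (1 - p_out))


def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)


stratum_or = {}
pooled = [0.0, 0.0, 0.0, 0.0]
for n, share in ENCOUNTER_DIST.items():
    cells = cell_probs(recorded_prob(PREV_EXPOSURE, SENSITIVITY, n),
                       recorded_prob(PREV_OUTCOME, SENSITIVITY, n))
    stratum_or[n] = odds_ratio(*cells)            # association within one encounter level
    pooled = [total + share * cell for total, cell in zip(pooled, cells)]

crude_or = odds_ratio(*pooled)                    # association ignoring encounters

print(f"crude odds ratio: {crude_or:.3f}")
for n, value in stratum_or.items():
    print(f"odds ratio among patients with {n} encounters: {value:.3f}")
```

Although the true conditions are unrelated, the crude odds ratio exceeds 1 because patients with many encounters are more likely to have both diagnoses recorded. Within each encounter level the odds ratio returns to 1.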
An aside: Berkson’s Bias
The problem of sicker patients appearing in EHR data causes a manifestation of Berkson’s bias:
Because samples are taken from a hospital in-patient population, rather than from the general public, this can result in a spurious negative association between the disease and the risk factor. For example, if the risk factor is diabetes and the disease is cholecystitis, a hospital patient without diabetes is more likely to have cholecystitis than a member of the general population, since the patient must have had some non-diabetes (possibly cholecystitis-causing) reason to enter the hospital in the first place. That result will be obtained regardless of whether there is any association between diabetes and cholecystitis in the general population.
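The diabetes and cholecystitis example can be worked through numerically. The sketch below assumes the two conditions are independent in the general population and that each one, along with an unrelated “other” cause, independently triggers a hospital admission. All prevalences and admission probabilities are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical values, for illustration only.
P_DIABETES = 0.10
P_CHOLECYSTITIS = 0.02
ADMIT_DIABETES = 0.20       # chance diabetes alone leads to admission
ADMIT_CHOLECYSTITIS = 0.50  # chance cholecystitis alone leads to admission
ADMIT_OTHER = 0.05          # chance of admission for any unrelated reason


def admission_prob(diabetes, cholecystitis):
    """Chance of admission when each cause acts independently."""
    stays_out = 1 - ADMIT_OTHER
    if diabetes:
        stays_out *= 1 - ADMIT_DIABETES
    if cholecystitis:
        stays_out *= 1 - ADMIT_CHOLECYSTITIS
    return 1 - stays_out


def odds_ratio(cells):
    """Odds ratio from a dict keyed by (diabetes, cholecystitis)."""
    return (cells[1, 1] * cells[0, 0]) / (cells[1, 0] * cells[0, 1])


population, hospital = {}, {}
for d, c in product((0, 1), repeat=2):
    p = ((P_DIABETES if d else 1 - P_DIABETES)
         * (P_CHOLECYSTITIS if c else 1 - P_CHOLECYSTITIS))
    population[d, c] = p                         # share of the general population
    hospital[d, c] = p * admission_prob(d, c)    # share that is also admitted

population_or = odds_ratio(population)
hospital_or = odds_ratio(hospital)
# Cholecystitis rate among admitted patients without diabetes.
rate_no_diabetes = hospital[0, 1] / (hospital[0, 1] + hospital[0, 0])

print(f"odds ratio in the general population: {population_or:.3f}")
print(f"odds ratio among hospital patients:   {hospital_or:.3f}")
print(f"cholecystitis among admitted non-diabetics: {rate_no_diabetes:.3f} "
      f"(population rate {P_CHOLECYSTITIS:.3f})")
```

With these inputs the population odds ratio is 1 by construction, while the odds ratio among admitted patients falls below 1, reproducing the spurious negative association described in the quote.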