Algorithm Scours Electronic Health Records to Reveal Hidden Kidney Disease

Diagnosing chronic kidney disease, which is often undetected until it causes irreversible damage, may soon become automated with a new algorithm that interprets data from electronic medical records.

The algorithm, developed by researchers at Columbia University Vagelos College of Physicians and Surgeons, automatically scours a patient’s electronic medical record for results of blood and urine tests and, using a mix of established equations and machine learning to process the data, can alert physicians to patients in the earliest stages of chronic kidney disease.

A study of the algorithm was published in the journal npj Digital Medicine in April.

“Identifying kidney disease early is of paramount importance because we have treatments that can slow disease progression before the damage becomes irreversible,” says study leader Krzysztof Kiryluk, MD, associate professor of medicine at Columbia University Vagelos College of Physicians and Surgeons. “Chronic kidney disease can cause multiple serious problems, including heart disease, anemia, or bone disease, and can lead to an early death, but its early stages are frequently under-recognized and undertreated.”

Chronic kidney disease progresses silently

Approximately one in every eight American adults is believed to have chronic kidney disease, but only 10% of people in the disease’s early stages are aware of their condition. Among those who already have severely reduced kidney function, only 40% are aware of their diagnosis.

The reasons for underdiagnosis are complex. People in the early stages of chronic kidney disease usually have no symptoms, and primary care physicians may prioritize more immediate patient complaints.

In addition, two tests, one that measures a kidney-filtered metabolite in blood and another that measures leakage of protein in urine, are needed to detect asymptomatic kidney disease. 

“The interpretation of these tests is not always straightforward,” Kiryluk says. “Many patient characteristics, including age, sex, body mass, or nutritional status, need to be considered, and this is frequently under-appreciated by primary care physicians.” 

Algorithm automates diagnosis

The new algorithm surmounts these obstacles by automatically scanning electronic medical records for test results, performing the calculations that indicate kidney function and damage, staging the patient’s disease, and alerting physicians to the trouble.
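The calculation and staging steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the 2021 race-free CKD-EPI creatinine equation and the KDIGO category cutoffs, and the article does not say which equations the Columbia algorithm actually applies. All function names are hypothetical.

```python
# Illustrative sketch only; not the published algorithm.
# Assumption: CKD-EPI 2021 creatinine equation (race-free) for eGFR.
# Assumption: KDIGO cutoffs for GFR (G) and albuminuria (A) categories.

def egfr_ckd_epi_2021(creatinine_mg_dl: float, age_years: float, female: bool) -> float:
    """Estimated GFR in mL/min/1.73 m^2 from serum creatinine, age, and sex."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = creatinine_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    return egfr * 1.012 if female else egfr


def gfr_category(egfr: float) -> str:
    """Map eGFR to a KDIGO G category (kidney function)."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(acr_mg_g: float) -> str:
    """Map urine albumin-to-creatinine ratio (mg/g) to a KDIGO A category (kidney damage)."""
    if acr_mg_g < 30:
        return "A1"
    if acr_mg_g <= 300:
        return "A2"
    return "A3"


def stage(egfr: float, acr_mg_g: float) -> dict:
    """Combine both test results into a stage and a simple alert flag.

    Simplification: a real diagnosis also requires the abnormality to
    persist for more than three months, which this sketch ignores.
    """
    return {
        "gfr_category": gfr_category(egfr),
        "albuminuria_category": albuminuria_category(acr_mg_g),
        "meets_ckd_criteria": egfr < 60 or acr_mg_g >= 30,
    }
```

For a hypothetical patient, `stage(egfr_ckd_epi_2021(1.4, 65, female=False), 120.0)` would combine the blood and urine results into a single record that a decision support system could act on.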

The algorithm performs nearly as well as experienced nephrologists. When tested using electronic health records from 451 patients, the algorithm correctly diagnosed kidney disease in 95% of the kidney patients identified by two experienced nephrologists and correctly ruled out kidney disease in 97% of the healthy controls.
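The two percentages above correspond to what are usually called sensitivity (the share of true cases the algorithm catches) and specificity (the share of healthy people it correctly clears). A minimal sketch of the arithmetic follows; the counts in the usage lines are made up for illustration and are not the study's data, which the article does not break down.

```python
# Illustrative arithmetic only; the counts below are hypothetical,
# not taken from the study.

def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of nephrologist-confirmed cases that the algorithm also flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of healthy controls that the algorithm correctly rules out."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical example: 100 confirmed cases and 100 healthy controls.
example_sensitivity = sensitivity(true_positives=95, false_negatives=5)
example_specificity = specificity(true_negatives=97, false_positives=3)
```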

The algorithm can be used on different types of electronic health record systems, including those with millions of patients, and could easily be incorporated into a clinical decision support system that helps physicians by suggesting appropriate stage-specific medications. The algorithm can be easily updated if standards for diagnosing kidney disease are changed in the future and is freely available for use by other institutions.


One drawback of the algorithm is that it depends on the availability of relevant blood and urine tests in the medical record. The blood test is fairly routine, but the urine test is underutilized in clinical practice, Kiryluk says. 

Despite this limitation, algorithmic diagnosis could enhance awareness of kidney disease, Kiryluk says, and, with earlier treatment, potentially reduce the number of people who lose kidney function.

Powerful tool for research

The algorithm has other important benefits for researchers. Because it can be applied to EHR datasets with millions of patients and can identify all patients with chronic kidney disease, not just those who have already been diagnosed, the algorithm improves the statistical power of many research studies.

The researchers have already applied the algorithm to a database of millions of Columbia patients to find previously unrecognized associations between chronic kidney disease and other conditions. For example, depression, alcohol abuse, and other psychiatric conditions were considerably more common among patients with mild kidney disease than among patients with normal kidney function, even after accounting for differences in age and sex.

“Our analysis also confirmed that a mild degree of kidney dysfunction is often present in blood relatives of patients with kidney disease,” says Ning Shang, PhD, associate research scientist in the Kiryluk lab and the lead author of the paper. “These findings support strong genetic determination of kidney disease, even in its mildest form.”

In the future, Kiryluk says, the algorithm could be used to better understand the inherited risk of chronic kidney disease, because the algorithm empowers genetic analyses of millions of people to discover new kidney genes.


More information

The study is titled “Medical records-based chronic kidney disease phenotype for clinical care and ‘big data’ observational and genetic studies.” 

All authors (from Columbia unless otherwise noted): Ning Shang, Atlas Khan, Fernanda Polubriaginof, Francesca Zanoni, Karla Mehl, David Fasel, Paul E. Drawz (University of Minnesota), Robert J. Carrol (Vanderbilt University), Joshua C. Denny (Vanderbilt University), Matthew A. Hathcock (Mayo Clinic), Adelaide M. Arruda-Olson (Mayo Clinic), Peggy L. Peissig (Marshfield Clinic Research Institute), Richard A. Dart (Marshfield Clinic Research Institute), Murray H. Brilliant (Marshfield Clinic Research Institute), Eric B. Larson (Kaiser Permanente Washington Health Research Institute), David S. Carrell (Kaiser Permanente Washington Health Research Institute), Sarah Pendergrass (Geisinger Research), Shefali Setia Verma (University of Pennsylvania), Marylyn D. Ritchie (University of Pennsylvania), Barbara Benoit (Partners HealthCare), Vivian S. Gainer (Partners HealthCare), Elizabeth W. Karlson (Harvard Medical School), Adam S. Gordon (Northwestern University), Gail P. Jarvik (University of Washington), Ian B. Stanaway (University of Washington), David R. Crosslin (University of Washington), Sumit Mohan, Iuliana Ionita-Laza, Nicholas P. Tatonetti, Ali G. Gharavi, George Hripcsak, Chunhua Weng, and Krzysztof Kiryluk.

The study was conducted as part of the eMERGE Phase III Network, which was initiated and funded by the National Human Genome Research Institute (U01HG8680, U01HG8672, U01HG8657, U01HG8685, U01HG8666, U01HG6379, U01HG8679, U01HG8684, U01HG8673, MD007593, U01HG8676, and U01HG8664). This work was also funded by the National Institute of Diabetes and Digestive and Kidney Diseases’ Kidney Precision Medicine Project (UH3DK114926), the National Library of Medicine (R01LM013061), and the Precision Medicine Pilot from the Irving Institute/Columbia CTSA (UL1TR001873). Additional sources of funding included National Institutes of Health grants (R01DK105124, RC2DK116690, and R01LM006910).

The authors declare no competing interests.