Electronic Health Record Analysis Shows Which Diseases Run in Families

Familial relationships inferred from electronic health records can be used to study the genetics of diseases. Each subgraph in this image is a family reconstructed from EHR data: Each node represents an individual and the colors represent different health conditions. (Source: Nicholas Tatonetti, PhD, Columbia University Vagelos College of Physicians and Surgeons)

Acne is highly heritable, passed down through families via genes, but anxiety appears more strongly linked to environmental causes, according to a new study that analyzed data from millions of electronic health records to estimate the heritability of hundreds of different traits and conditions.

The findings, published today in Cell by researchers at Columbia University Irving Medical Center and NewYork-Presbyterian, could streamline efforts to understand and mitigate disease risk, especially for diseases with no known disease-associated genes.

“Knowledge of a condition’s heritability—how much the condition’s variability can be attributed to genes—is essential for understanding the biological causes of the disease and for precision medicine,” says study co-leader Nicholas Tatonetti, PhD, the Herbert Irving Assistant Professor of Biomedical Informatics at Columbia University Vagelos College of Physicians and Surgeons. “It is clinically useful for estimating disease risk, customizing treatment, and tailoring patient care.”

But estimating heritability usually involves difficult and time-consuming studies of family members, especially twins.

Instead, Tatonetti and his colleagues thought heritability could be estimated more easily by using data that is routinely included in hospitals’ electronic health records. Upon admission, patients are usually asked to provide emergency contacts, often family members who are also patients at the same hospital. “It occurred to us that this information could be used to infer family relationships and, combined with each patient and each contact’s health data, give us heritability estimates faster and less expensively than traditional methods,” Tatonetti says.

In the current study, the researchers analyzed data from 5.5 million electronic health records of patients and their emergency contacts at three academic medical centers: NewYork-Presbyterian/Columbia University Irving Medical Center, NewYork-Presbyterian/Weill Cornell Medical Center, and Mount Sinai Health System. To protect privacy, patient and contact identities were removed from the data before the information was provided to the researchers.

They used algorithms to infer 7.4 million family relationships among patients and contacts and then analyzed the incidence of some 500 different traits and conditions reported in the electronic health records to generate heritability estimates. “One algorithm identified the family relationships and a second computed heritability estimates for every available trait,” says study co-leader David K. Vawdrey, PhD, assistant professor of biomedical informatics at Columbia University Vagelos College of Physicians and Surgeons and vice president of the Value Institute at NewYork-Presbyterian Hospital.
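To make the idea of inferring relationships concrete, here is a minimal sketch in Python. It is not the study's algorithm, which the article does not describe in detail. It only illustrates one plausible step: when two patients name the same person as a parent, they can be inferred to be siblings or half-siblings, even though neither listed the other. The record format, the relationship labels, and the function name `infer_siblings` are all assumptions made for illustration.

```python
from collections import defaultdict


def infer_siblings(declared):
    """Infer sibling (or half-sibling) pairs from declared contacts.

    `declared` is an iterable of (patient_id, relation, contact_id) tuples,
    where `relation` says how the contact is related to the patient,
    e.g. ("p1", "mother", "p3") means p3 is p1's mother.
    This tuple layout is hypothetical, not the study's data format.
    """
    parent_labels = {"mother", "father", "parent"}
    child_labels = {"son", "daughter", "child"}

    # Map each parent to the set of people known to be their children.
    children_of = defaultdict(set)
    for patient, relation, contact in declared:
        if relation in parent_labels:
            children_of[contact].add(patient)
        elif relation in child_labels:
            children_of[patient].add(contact)

    # Any two children of the same parent share at least one parent.
    siblings = set()
    for kids in children_of.values():
        ordered = sorted(kids)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                siblings.add((first, second))
    return siblings


# Made-up, de-identified example records.
records = [
    ("p1", "mother", "p3"),  # p3 is p1's mother
    ("p2", "mother", "p3"),  # p3 is p2's mother
    ("p3", "son", "p4"),     # p4 is p3's son
    ("p5", "spouse", "p6"),  # not a blood relationship, ignored here
]
sibling_pairs = infer_siblings(records)
```

In this made-up example, p1, p2, and p4 all turn out to share the parent p3, so three sibling pairs are inferred from contacts that never mention one another directly. A real system would also have to handle conflicting or mistaken entries, which this sketch ignores.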

The researchers’ heritability estimates were similar across all three medical centers and were consistent with previously published estimates. For many of the conditions, however, heritability had never been estimated, and researchers found a few surprises. HDL cholesterol is significantly more heritable than LDL cholesterol, even after accounting for the use of lipid-lowering statin medications. Respiratory diseases in general appear more heritable among African-Americans, and sinus infections are highly heritable across all populations studied.

“The one about sinus infections surprised me personally,” says Tatonetti. “My family has a lot of oral history about being predisposed to sinus infections. I didn’t really believe it before, but this analysis may change my mind!”

The approach also promises to diversify the study of heritability. “Many heritability studies have focused on very specific populations, usually white Europeans,” says lead author Fernanda Polubriaginof, a PhD candidate in biomedical informatics at Columbia. “Because we used data from a very diverse group of patients in New York City, we were able to stratify disease risk for different ethnicities in ways that hadn’t been done before.”


The paper is titled “Disease heritability inferred from familial relationships reported in medical records.” The other authors are Rami Vanguri (Columbia University Irving Medical Center), Kayla Quinnies (CUIMC), Gillian M. Belbin (Mount Sinai), Alexandre Yahi (CUIMC), Hojjat Salmasian (CUIMC and NewYork-Presbyterian), Tal Lorberbaum (CUIMC), Victor Nwankwo (CUIMC), Li Li (Mount Sinai), Mark Shervey (Mount Sinai), Patricia Glowe (Mount Sinai), Iuliana Ionita-Laza (CUIMC), Mary Simmerling (NYP and Weill Cornell Medicine), George Hripcsak (CUIMC and NYP), Suzanne Bakken (CUIMC), David Goldstein (CUIMC), Krzysztof Kiryluk (CUIMC), Eimear E. Kenny (Mount Sinai), and Joel Dudley (Mount Sinai).

The study was funded by the National Institutes of Health (R01HS021816, R01GM107145, R01DK105124, R01HS022961, OT3TR002027, R01LM006910, and U01HG008680), the Herbert Irving Scholars Award, the AWS Cloud Credits for Research program, and the Open Science Grid (which is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science).

The authors declare no financial or other conflicts of interest.