Using contact networks, administrative, and linked genomic data to understand tuberculosis transmission in BC

December 4, 2018

Drs. Jennifer Gardy and Matías Salibián-Barrera were awarded funds by the Data Science Institute for their project "Using contact networks, administrative, and linked genomic data to understand tuberculosis transmission in BC." The funds will allow their team to recruit a top-tier postdoctoral fellow to help develop and implement a predictive analytics platform within the TB Services Program at the BC Centre for Disease Control (BCCDC). The project will leverage the wealth of tuberculosis data stored at the BCCDC, including complete epidemiological, demographic, and clinical data for every individual diagnosed with active TB disease in BC dating back decades; contact/transmission data; genomics data; and healthcare utilization data. A summary of their project follows below.

Tuberculosis (TB) is still a problem in British Columbia, with approximately 250 cases diagnosed each year. To meet the WHO’s goal of achieving TB pre-elimination by 2030, TB rates in BC need to decline faster, and the way we manage TB prevention and care in the province needs to change. Fortunately, because all TB-related laboratory, epidemiology, clinical, and public health activities are centralized at the BC Centre for Disease Control, we have unique access to a range of datasets that can be used to understand TB transmission in BC, which can in turn inform public health policy and action. Specifically, we have access to complete epidemiological, demographic, and clinical data for every individual diagnosed with active TB disease in BC between 2005 and 2014 (approx. 2300 cases). We also have detailed information on these individuals’ contact networks, including person-to-person transmission events confirmed by M. tuberculosis genome sequences for approximately 700 cases diagnosed between 2005 and 2014, as well as linked administrative data describing these individuals’ interactions with the health care system. For these cases, which we have grouped into clusters ranging from simple two-person transmission pairs up to large outbreaks of 50 or more people, we have both transmission networks indicating who likely infected whom and linked healthcare utilization data (e.g. physician visits, hospital admissions and discharges, prescriptions filled, and deaths). This project’s objective is to explore whether i) contact or transmission network properties, either static or over time, ii) clinical/epidemiological/demographic attributes of early cases within a network, and/or iii) genomic data can be used to predict whether a newly diagnosed case is likely to lead to a sustained outbreak.
In addition, we are interested in exploring whether patterns of interaction with the healthcare system can be used to infer potentially undiagnosed TB infections.