Data Science for Social Good (2018)

The Data Science Institute (DSI) at UBC, in partnership with the University of Washington eScience Institute, offers the Data Science for Social Good (DSSG) Summer Fellowships—modelled after successful programs at the University of Chicago, Georgia Tech and the eScience Institute. This 14-week summer program is designed to bring together undergraduate students and graduate students from diverse backgrounds with experience in data science, urban research and planning, design thinking and other domains to work on focused, collaborative research projects that have the potential to benefit society.

2018 Projects

Improving Early and Middle Childhood Outcomes

The goal of this project is to improve decision making in planning and investment across the early and middle years sector in Surrey. To achieve this goal, the project will assess the impact the City is having on the community through the use of data and measurement. For example, we will examine relationships between neighbourhood environments (e.g., community service and program registrations and access records, service work requests, built environments), the EDI (Early Development Index) and the MDI (Middle Development Index) to better understand the conditions that support childhood outcomes. Ultimately, the hope is that the project will uncover new methods to capture data and determine which indicators and measures can amplify positive change to improve outcomes for children and families.

Final Presentation | Final Report | Publication | GitHub Repo | Shiny App

Transportation Energy and Emissions Baseline and Forecast for Ongoing Modelling and Policy Analysis

Surrey has the highest transportation greenhouse gas (GHG) emissions of all municipalities in the Greater Vancouver Regional District (GVRD), representing 25% of the region’s total transportation emissions. Currently, transportation emissions represent 65% of Surrey’s community GHG emissions. The goal of this project is to analyze trends in transportation and vehicle registration to establish a baseline and business-as-usual (BAU) forecast for Surrey to 2050. For example, the project will analyze passenger vehicle data alongside demographic, transit, and property use data to help guide the design and development of targeted intermodal infrastructure (e.g., transit, fuelling infrastructure, bike lanes), and to identify areas within Surrey for potential electric vehicle adoption. Overall, a better understanding of trends in vehicle registrations and intermodal commuting options for Surrey’s current and future population can inform future development projects and improve liveability in Surrey’s neighbourhoods.

Final Presentation | Final Report

Uncovering the Hidden Universe of Rental Units in Surrey

Current information on the distribution and numbers of rental units in the City of Surrey is incomplete. This hampers the City's ability to accurately plan for community services, schools, transportation, and other infrastructure. Without a full accounting of rental units, the City also does not know the actual vacancy rate, which has policy implications in terms of affordable housing supply. The goal of this project is to assemble information (e.g., create a database and visualizations) from multiple sources and datasets to construct as complete a picture as possible. Equipped with this data, the City will review and shape policy development on a number of fronts, including parking management policies, secondary suite legalization campaigns, and affordable housing policies that specifically address the supply of rental housing.

Final Presentation

Use of Machine Learning Techniques to Classify Laboratory Test Results

While laboratory tests have been standardized to LOINC coding schemes, the results of those tests are often unstructured text or, where result codes exist, many different codes can be used to represent the same result. New result codes are often created when small differences in the associated text are needed (e.g., "test negative for WNV" vs. "test negative for WNV-convalescent sera required"). The BC Centre for Disease Control employs a data warehouse team that works with lab personnel to review and classify all result codes as either positive, negative or indeterminate. This is a time-intensive process, but it is necessary to determine whether someone meets the case definition for a reportable communicable disease. This project will explore whether and how machine learning techniques might be applied to laboratory data in order to automatically classify the results.

AMIA Presentation | Final Presentation | Final Report

2018 Fellows


Nimrah Anwar

Nimrah is a second-year Master of Chemical Engineering student at UBC. She has a breadth of experience in process engineering and has worked in the FMCG, Fertilizer and Petroleum industries. She sees data science as a ubiquitous field that can be used to bring positive change to society. Apart from her technical portfolio, Nimrah is deeply interested in women’s rights and education in underdeveloped countries and the politics surrounding them. In her free time, Nimrah enjoys cooking and eating good food.

Sizhe Chen

Sizhe is a recently graduated BSc student majoring in Statistics. She has always been interested in utilizing data to gain insight into real-world issues. Her previous undergraduate research assistant experience in an agricultural project has equipped her with hands-on skills in data manipulation, as well as enhanced her understanding of data processing in the field of social good. As a DSSG fellow, she will strive to further apply data analysis and computer programming techniques for social good.

Kenny Chiu

Kenny is a recent UBC graduate having completed his Bachelor of Science with a combined honours in Computer Science and Statistics. He has a diverse technical skill set having worked as a developer, human-robot interaction research assistant, and data analyst through the Computer Science Co-op program. With data science being at the intersection of his interests, he aims to learn more about the field and its applications to social development by working with DSSG.


Andy Fink

Andy has recently completed his fourth year as an undergraduate BSc student in the Department of Geography. He has experience in data collection through field surveying and remote sensing and enjoys analyzing spatial data with GIS. His current research interests include network optimization through GIS analysis, various applications of LiDAR and smart cities. When he’s not at school or work, Andy enjoys going to the beach, playing croquet and skiing.

Cody Griffith

Cody completed his Master's degree in Applied Mathematics at UBC in Spring 2018. His research focused on the delayed dynamical behavior within a model for ocean dynamics. His theoretical results have since been observed within our oceans and pave the way for future exploration in this field. His interests meld applied mathematics, statistics, and the broad area of machine learning, allowing for a unique mathematical perspective on real problems and models. Cody hopes to use his time as a DSSG fellow to continue to broaden his available tools and make connections in the area of consulting, where he plans to continue his career.

Andy Hong

Andy will soon be finishing his Bachelor of Science, majoring in Physics and minoring in Mathematics. After graduation, he is excited to find more opportunities to pursue his aspirations as a Data Scientist, while keeping up with his hobbies of learning foreign languages and exploring food. To Andy, the most exciting aspect of data science is finding quantitative insight and clearly communicating it to a diverse audience. His natural curiosity has led to his success in a plethora of research projects during his undergraduate training, and he now hopes to bring his experience to industry and see a meaningful impact first-hand. As a foodie, he enjoys catching the latest food trends in Vancouver and may even hop on a plane and travel half-way across the world if the food looks good enough.


Zhe Jiang

Zhe is a fourth-year UBC student in Computer Science and Mathematics. He is highly interested in machine learning and artificial intelligence techniques. He enjoys working on interesting and challenging projects in a collaborative and interdisciplinary environment. As an international student, he has studied in Italy, China and Israel and has an open perspective towards social issues. During his free time, he likes hiking, traveling, photography and Kendo.

Mia Kramer

Mia is an undergraduate in her fourth year working in the combined honours physics and astronomy program at UBC. She’s enjoyed doing data analysis and astronomy research in the past, and is very excited to be able to apply the same skills here on Earth. Outside of work, she enjoys stargazing, photography and music.

Jocelyn Lee

Jocelyn is a Bachelor of Computer Science (BCS) student in her final semester at the University of British Columbia. She completed her first BSc in Microbiology and Immunology, and her experiences in a genomics lab motivated her to explore data science. Since then, her diverse interests have included studying information security at ETH Zurich and working as a DevOps intern. Jocelyn is excited to expand her data science and statistics skills as a DSSG fellow, and contribute to data-driven policies for social good. In her free time, she enjoys travelling and baking.


Yiying (Catherine) Lin

Catherine is a third-year student at the University of British Columbia with a double major in computer science and statistics. She is interested in machine learning and data analysis. By being part of the DSSG program, she hopes to gain hands-on experience with real-world datasets and to ameliorate some social issues to the best of her ability. Outside of her studies, she loves watching movies and reading detective fiction.

William Lu

William is a third-year undergraduate student in the Honours Computer Science program with a minor in Statistics. He is interested in the application of machine learning and data science to solving healthcare problems. He also hopes to pursue graduate studies in machine learning, artificial intelligence, or computer vision. In his free time, William enjoys playing the piano and reading nonfiction books.

Varoon Mathur

Varoon is a Software Engineer by day, a Global Health Activist by night, and a Data Scientist any time in between. He is a recent graduate of UBC's Bachelor of Computer Science program with a focus on machine learning, and has a Bachelor of Science in Life Sciences from Queen's University, completing his honors thesis on neuro-degenerative disease. Having advised on global health policy and research at the World Health Organization and with UAEM, Varoon is most interested in developing health systems that address the unmet needs of vulnerable and marginalized populations.


Hyeongcheol (Tom) Park

Hyeongcheol is a Master of Science student in the Department of Statistics at UBC. He completed his Bachelor's degree in English Literature and Applied Statistics at Chung-Ang University, South Korea. He believes data science could contribute to relieving global issues such as poverty and hunger. In his free time, he loves to listen to music and go swimming.

Kevin Wong

Kevin Wong is entering fourth-year Civil Engineering this fall, with a great passion for transportation planning, economics and data analytics. From his past experiences, he has become an avid proponent of evidence-based decision-making, with the help of analytical tools and proper economic reasoning. He wishes to become a transportation analyst someday and help improve urban mobility for the social good.

Nilgoon (Nelly) Zarei

Nilgoon is a PhD candidate at UBC in the interdisciplinary program. Nilgoon’s background is in Electrical and Computer Engineering, and her current PhD research focuses on developing more affordable and reliable tools for detecting and identifying abnormalities in different tissue sites. She has developed various algorithms using Machine Learning and Deep Learning techniques to classify prostate cancer grades, locate abnormalities in the cervix and identify renal carcinoma sub-types. She believes using learning methods can improve disease diagnosis/prognosis and advanced medical devices, and has the potential to improve the life quality of patients significantly.


Kevin Zhu

Kevin recently completed his Bachelor of Science at UBC, with a double major in Mathematics and Statistics. He is interested in applying statistical and machine learning techniques to aid decision-making processes for real-life issues. Kevin is hoping to expand his statistical toolbox and further his skill set in data analysis with the opportunity provided by DSSG to do original research and tackle social problems.






This year we are hosting a series of talks called the "DSSG External Speakers Series", which is open to the public.


2017 Data Science for Social Good Program


Cascadia Urban Analytics Cooperative (CUAC)

eScience Institute, University of Washington

Applied Statistics and Data Science Group (ASDa)

City of Surrey

BC Centre for Disease Control

This program is sponsored by Microsoft.

Resources for Students:



For questions about this program, email