Data Science for Social Good

The Data Science for Social Good (DSSG) program at the University of British Columbia is an interdisciplinary, applied research training program that partners with public organizations to extract insights from open and proprietary data sets. This 14-week, full-time summer program brings together undergraduate and graduate students from diverse backgrounds, with experience in data science, urban research and planning, design thinking and other domains, to work on focused, collaborative research projects that have the potential to benefit society. The UBC DSSG program is modelled after successful programs at the University of Chicago, Georgia Tech and the University of Washington.

2023 Projects


Increasing Transparency in Canadian Politics: When Data Journalism meets Data Science

Description: The Investigative Journalism Foundation (IJF) maintains public interest datasets on Canadian politics, including political donations, government funding decisions, and the lobbying activity of special interest groups across the country. By hosting these data for public access, the IJF increases transparency and accountability in Canadian democracy. This project seeks to contribute to this end by developing various interactive visualizations that will help Canadians more easily explore the data. These include interactive network visualizations depicting relationships between many political entities, interactive text plots showing how the content of lobbying registrations has changed over time, and various other interactive dashboards that help users visualize the activity and influence of different political entities. Jointly, these visualizations will make it easier for journalists, academic researchers, and everyday Canadians to see what is shaping Canadian politics, and to hold accountable those powerful entities that attempt to steer our democracy.



Fine Tuning a Large Language Model for Multi-label Theme Classification on Canada Energy Regulator Datasets

Description: The Canada Energy Regulator (CER) is a federal agency tasked with managing the lifecycle of energy-related projects that cross provincial, territorial or international borders. Presently, the CER has a vast number of public documents and other text data that are not indexed and are therefore hard to explore. A number of documents have been manually labelled, but the process is slow and costly. This project explores alternatives that streamline the labelling process, with the aim of making the valuable knowledge in these documents more accessible to the CER and the general public. The goal is to create a multi-label classifier that identifies themes by fine-tuning a large language model (BERT) on two manually labelled datasets provided by the CER, which contain de-identified but publicly available text. The project also explores topic modelling as a method for labelling data without predefined themes.



2023 Fellows


Annabelle Purnomo

Annabelle is a third-year student majoring in Linguistics with a minor in Data Science. Her current interests include all things NLP. She is passionate about the intricacies of communication through both language and data. Through the DSSG program, she hopes to strengthen her statistical and problem-solving skills, as well as develop data science solutions that are transparent and accessible to the public. Outside of this work, Annabelle enjoys learning languages, including Japanese and French. She often participates in language and cultural exchanges, and works with English and Japanese language learners.

Chloe Curry

Chloe (she/they) is a fifth-year student pursuing a BSc in Computer Science and Environmental Science. She is passionate about leveraging her technical skills to solve real-world challenges, particularly those related to the climate crisis. Since 2019, Chloe has served on the board of UBC Women in Computer Science, which works to develop a strong community for students typically underrepresented in the major. As a DSSG Fellow, they are enthusiastic about tackling complex challenges and making information and data accessible to others. In their free time, Chloe can often be found backpacking, skiing or knitting.

Yadong Liu

Yadong is a PhD candidate who is passionate about studying speech communication from behavioural and cognitive perspectives. His research interests include human speech postures, silent speech interfaces, and detecting mental illness through speech analysis. Additionally, Yadong has expertise in natural language processing, with a focus on topics such as sentiment analysis, semantic role labelling, and sentence parsing. As a DSSG Fellow, Yadong aims to leverage his knowledge and skills to tackle complex challenges and make a positive impact on society. He values collaboration and enjoys working with individuals from diverse backgrounds to exchange ideas and learn from one another. When he's not working or studying, Yadong enjoys hiking, music, and photography.


Sebastian Santana Ortiz

Sebastian is pursuing an MSc in Population and Public Health at UBC. He completed his BA(Hons) at the University of Victoria, where he did an honours thesis at the Canadian Institute for Substance Use Research and was awarded the Jamie Cassels Undergraduate Research Award for his work on the Voices in Motion project. His past research encompasses a variety of topics (e.g., substance use, education policy, dementia, and technology adoption). Sebastian is a strong advocate for open and replicable science. As a public servant, he has leveraged data to foster evidence-informed and equitable policy measures at the federal and provincial levels. His hope as a DSSG fellow is to expand his machine-learning skills and use them to build a project that serves the community.

William Jettinghoff

Will is a PhD student in UBC’s Department of Psychology, and his research focuses on individual differences in reasoning and personal values. His current research uses a combination of experiments, structural equation modelling, and mixture modelling to identify new sources of bias in people’s beliefs. As a DSSG fellow, Will is excited to learn new methods from data science that could be applied to research questions in psychology. He is also eager to learn how he can use data analysis and computational social science to make a positive impact on society.

Jialu (Cindy) Jin

Jialu (Cindy) is in her fourth year studying Statistics with a minor in Data Science. Her current interests include model fitting and statistical inference. She is excited to meet people from different backgrounds through DSSG and to combine everyone's knowledge to build a meaningful project. She also hopes to gain more work experience and a more comprehensive understanding of data science this summer. In her spare time, she enjoys horror movies and video games.


The 2023 DSSG Program is supported by:


2023 Partners:

Investigative Journalism Foundation

Canada Energy Regulator



2022 Data Science for Social Good Program

2021 Data Science for Social Good Program

2020 Data Science for Social Good Program

2019 Data Science for Social Good Program

2018 Data Science for Social Good Program

2017 Data Science for Social Good Program


For questions about this program, email