Computer vision and machine learning techniques for video and facial understanding
In this project, Drs. Sigal and Schmidt are pursuing several research goals at the intersection of computer vision and machine learning. In part one, the team will advance automatic video summarization by exploring novel, richer joint video-linguistic and graph-structured representations that facilitate video retrieval, summarization and, potentially, action recognition. In part two, the team will develop generative models that can effectively "imagine" what images of faces or objects would look like in a canonical view (e.g., a frontal face image for face recognition) or, more broadly, in any view or unobstructed configuration. In part three, the team aims to develop much faster methods for tuning both the "parameters" and the "hyper-parameters" of the deep neural networks underlying computer vision systems (such as those applied in parts one and two). Hyper-parameter tuning, which includes choosing the structure of the network and other design choices, will be addressed by developing an automated technique. The outcomes of this project are expected to significantly improve the performance and runtime of computer vision systems, with applications in surveillance.
Trainees: Mohit Bajaj (MSc candidate), Polina Zablotskaia (MSc candidate)
This project is sponsored by the DSI-Huawei Research Program.