Research Spending & Results

Award Detail

  • Smita Krishnaswamy
Award Date: 07/20/2021
Estimated Total Award Amount: $586,187
Funds Obligated to Date: $120,442
  • FY 2021=$120,442
Start Date: 10/01/2021
End Date: 09/30/2026
Transaction Type: Grant
Awarding Agency Code: 4900
Funding Agency Code: 4900
CFDA Number: 47.070
Primary Program Source: 040100 NSF RESEARCH & RELATED ACTIVIT
Award Title or Description: CAREER: Deep representation learning for exploration and inference in biomedical data
Federal Award ID Number: 2047856
DUNS ID: 043207562
Parent DUNS ID: 043207562
Program: Info Integration & Informatics
Program Officer:
  • Wendy Nilsen
  • (703) 292-2568

Awardee Location

Street: Office of Sponsored Projects
City: New Haven
County: New Haven
Awardee Cong. District: 03

Primary Place of Performance

Organization Name: Yale University
Street: Office of Sponsored Projects
City: New Haven
County: New Haven
Cong. District: 03

Abstract at Time of Award

Biological systems are inherently complex. Increasingly sophisticated technologies are being used in biomedical science to make sense of this complexity and to understand the underlying factors that cause disease. These technologies generate vast amounts of data in many different forms, from changes in how genes and proteins are expressed in individual cells over time, to detailed clinical imaging data on large patient populations and whole genome sequencing studies across hundreds of thousands of people. These newly developed datatypes could help uncover important mechanisms and pathways that underpin health and disease. However, there is a large gap between the information contained in these datasets and the ability to extract meaningful insights. Here, the PI proposes to address this by developing new machine learning approaches based on mathematical foundations that will allow us to make sense of these complex datasets. The PI will develop deep representation learning techniques that focus on gaining overall insight into the structures, dynamics, interactions, and predictive features of the data, and will allow specific hypotheses regarding the underlying regulatory mechanisms that drive disease in different contexts to be derived. The proposal will also involve training a postdoc and a graduate student, and mentoring local high school students. In addition, it will enable the development of an online workshop to widely disseminate knowledge of unsupervised data analysis to a diverse array of participants from across the country.
This project proposes to advance biomedical data analysis via three main thrusts. The first thrust is focused on forming deep multiscale representations of the data based on data geometry, graph signal processing, and topological concepts, in combination with powerful deep learning systems. Such representations will allow for exploration of structure and meaningful, predictive abstractions of the data in a scalable fashion. The second thrust is focused on integrating multiple modalities of data and organizing multitudes of related datasets using optimal transport and generative models to gain insight into entire cohorts of patients or perturbation conditions. The third thrust is focused on learning high-dimensional stochastic dynamics of the data using neural SDE (stochastic differential equation) and graph ODE (ordinary differential equation) networks to gain insight into underlying gene regulatory networks. These approaches will be applied in the context of several specific biomedical challenges. Achieving these aims will enable integration and exploration of a large volume of data for explaining underlying regulatory mechanisms and dynamic phenotypic changes.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
