Research Spending & Results

Award Detail

Awardee:PRESIDENT AND FELLOWS OF HARVARD COLLEGE
Doing Business As Name:Harvard University
PD/PI:
  • Pragya Sur
  • (650) 285-9364
  • pragya@fas.harvard.edu
Award Date:06/11/2021
Estimated Total Award Amount: $ 170,000
Funds Obligated to Date: $ 57,108
  • FY 2021=$57,108
Start Date:07/01/2021
End Date:06/30/2024
Transaction Type:Grant
Agency:NSF
Awarding Agency Code:4900
Funding Agency Code:4900
CFDA Number:47.049
Primary Program Source:040100 NSF RESEARCH & RELATED ACTIVIT
Award Title or Description:Inference for Functionals in High-Dimensional Regression
Federal Award ID Number:2113426
DUNS ID:082359691
Parent DUNS ID:001963263
Program:STATISTICS
Program Officer:
  • Edsel Pena
  • (703) 292-8080
  • epena@nsf.gov

Awardee Location

Street:1033 MASSACHUSETTS AVE
City:Cambridge
State:MA
ZIP:02138-5369
County:Cambridge
Country:US
Awardee Cong. District:05

Primary Place of Performance

Organization Name:HARVARD UNIVERSITY
Street:One Oxford Street
City:Cambridge
State:MA
ZIP:02138-2901
County:Cambridge
Country:US
Cong. District:05

Abstract at Time of Award

Modern science and engineering applications involve large datasets with a multitude of variables or features. A key challenge in this context is to distinguish the scientifically relevant variables from the irrelevant ones, that is, the signal from the noise. The challenge is compounded by subtle nonlinear relationships among these variables. Generalized linear models are the most commonly used tools in classical statistics for discovering such nonlinear relationships, and they are routinely employed even in contemporary big data settings. Unfortunately, the classical statistical theory traditionally used to justify the validity of these methods fails in this regime. This project will develop novel approaches for inferring scientifically relevant parameters in the framework of generalized linear models, adapted to the setting of high-dimensional or big data. The theory developed will facilitate principled inference regarding the relations among observed variables in applications such as genomics, computational neuroscience, and signal and image processing. The principal investigator will also engage graduate students in the project through mentoring and will develop courses that incorporate results from this project.

This research project will develop statistical theory and methods for inferring scientifically relevant low-dimensional functionals in high-dimensional generalized linear models, organized around two broad themes: (1) frequentist inference for signal-to-noise ratio type functionals; (2) Bayesian inference for functionals under continuous shrinkage priors. The first theme will develop novel estimators for the signal-to-noise ratio and for the genetic relatedness, a generalization of the signal-to-noise ratio that measures the shared genetic basis between multiple traits in statistical genetics. The second theme will construct data-driven credible intervals for components of the underlying signal under computationally tractable continuous shrinkage priors. Both themes will develop inference procedures that are agnostic to the sparsity level of the underlying signal. To achieve this, the research will focus on the proportional asymptotics high-dimensional regime and utilize novel insights from approximate message passing theory, developed originally in probability, information theory, and statistical physics.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.