Research Spending & Results

Award Detail

  • Jing Lei
Award Date: 02/05/2016
Estimated Total Award Amount: $400,000
Funds Obligated to Date: $220,679
  • FY 2016=$56,744
  • FY 2017=$79,479
  • FY 2018=$84,456
Start Date: 08/01/2016
End Date: 07/31/2021
Transaction Type: Grant
Awarding Agency Code: 4900
Funding Agency Code: 4900
CFDA Number: 47.049
Primary Program Source: 040100 NSF RESEARCH & RELATED ACTIVIT
Award Title or Description: CAREER: Modernizing Classical Nonparametric and Multivariate Theory for Large-scale, High-dimensional Data Analysis
Federal Award ID Number: 1553884
DUNS ID: 052184116
Parent DUNS ID: 052184116
Program Officer:
  • Gabor J. Szekely
  • (703) 292-8869

Awardee Location

Street: 5000 Forbes Avenue
Awardee Cong. District: 14

Primary Place of Performance

Organization Name: Carnegie-Mellon University
Street: 5000 Forbes Avenue
Cong. District: 14

Abstract at Time of Award

The constantly increasing dimensionality and complexity of modern data have motivated many new data analysis tools in various fields, and these tools urgently call for rigorous theoretical investigation, such as robustness against different sources of model misspecification, uncertainty quantification in classification and prediction, and statistical performance guarantees for conventional methods under non-standard settings. Although most classical theory is not directly applicable to methods developed for complex data, partly because of highly specialized model assumptions and diversified algorithms, the profound statistical thinking carried in these long-established results can still provide deep theoretical insights. When combined with cutting-edge results in modern contexts such as random matrix theory, matrix concentration, and convex geometry, these classical theories will lead to novel principled methods for a general class of problems ranging from high dimensional regression and classification to network data analysis and subspace learning. All methods developed in the proposed research will be implemented as freely available standard R packages, will have high pedagogical value, and will be used to develop new courses. The proposed research has applications in astronomy and medical screening data. The proposal also provides new inference tools for applied areas in genetics, psychiatry, and brain sciences. Integrated educational activities include designing courses on new perspectives in nonparametric statistics and modern multivariate analysis.
The proposed work will further integrate classical nonparametric and multivariate analysis theory with modern elements in four major areas of statistical research: assumption-free prediction bands in high dimensional regression; a generalized Neyman-Pearson framework for set-valued multi-class classification; statistical performance guarantees for some greedy algorithms in network community detection, as well as goodness-of-fit tests for network model selection; and a unified singular value decomposition framework for structured subspace estimation formulated as a convex optimization problem. These research activities will lead to modernized nonparametric and multivariate analysis courses, featuring new theoretical frameworks such as computationally constrained minimax analysis, additional topics such as functional data analysis, and cutting-edge examples in genetics, brain imaging, traffic, and astronomy.

Publications Produced as a Result of this Research

Chen, K. and Lei, J. "Network Cross-Validation for Determining the Number of Communities in Network Data" Journal of the American Statistical Association, v.113, 2018, p.241.

Zhu, L., Lei, J., Devlin, B., and Roeder, K. "Testing High Dimensional Differential Matrices, with Application to Detecting Schizophrenia Risk Genes" Annals of Applied Statistics, v.11, 2017, p.1810.
