Research Spending & Results

Award Detail

Awardee: IOWA STATE UNIVERSITY OF SCIENCE AND TECHNOLOGY
Doing Business As Name: Iowa State University
PD/PI:
  • Xiongtao Dai
  • (515) 294-2182
  • xdai@iastate.edu
Award Date: 06/02/2021
Estimated Total Award Amount: $175,000
Funds Obligated to Date: $58,232
  • FY 2021 = $58,232
Start Date: 07/01/2021
End Date: 06/30/2024
Transaction Type: Grant
Agency: NSF
Awarding Agency Code: 4900
Funding Agency Code: 4900
CFDA Number: 47.049
Primary Program Source: 040100 NSF RESEARCH & RELATED ACTIVIT
Award Title or Description: Collaborative Research: Halfspace Depth for Object and Functional Data
Federal Award ID Number: 2113713
DUNS ID: 005309844
Parent DUNS ID: 005309844
Program: STATISTICS
Program Officer:
  • Huixia Wang
  • (703) 292-2279
  • huiwang@nsf.gov

Awardee Location

Street: 1138 Pearson
City: AMES
State: IA
ZIP: 50011-2207
County: Ames
Country: US
Awardee Cong. District: 04

Primary Place of Performance

Organization Name: Iowa State University
Street: 1138 Pearson
City: Ames
State: IA
ZIP: 50011-2103
County: Ames
Country: US
Cong. District: 04

Abstract at Time of Award

Complex data objects are increasingly being generated across science and engineering. Non-Euclidean data such as wind directions, neural connectivity networks, and phylogenetic trees draw practical interest but are challenging to analyze due to their intrinsic constraints. Functional data such as trajectories and images, which are observed on a continuous domain in time or space, are another type of highly complex data. Practitioners are generally interested in exploring the data distribution before any modeling analysis. For instance, given a sample of growth trajectories of children, a first step is to identify typical versus extreme growth patterns, where the latter can be non-trivial to uncover. Similarly, when analyzing brain connectivity matrices, it is important to find unusual brain networks and differences between healthy and diseased populations. Data-driven methods robust to anomalies are essential in these settings since little is known about the data-generating process, and outliers can affect the analysis. Because data objects lack a natural ordering, exploratory tools such as boxplots and quantiles are unavailable for these types of data. The project will address the lack of techniques for exploring non-Euclidean and functional data. Principled statistics and visualization methods will be developed based on a novel way of ranking the observations. The project will also provide training for graduate and undergraduate students.

The central research theme is to develop exploratory data analysis tools for non-Euclidean and functional data objects. To overcome the absence of a canonical ordering for object data, the PIs will develop suitable data depth notions to quantify the centrality of data points with respect to the distribution. This will provide a center-outward ranking of the data that will be used as a building block for outlier detection methods, rank tests, and robust classifiers. Analogous to Tukey's halfspace depth for the multivariate Euclidean case, the new depth notions for object data are expected to be intuitive and robust, and to have desirable properties well-grounded in theory. Specifically, the research project will investigate a depth notion for non-Euclidean objects; a data visualization and an outlier detection procedure for non-Euclidean data; halfspace depth notions for functional data, one based on theory and another from an algorithmic perspective; and a depth notion for sparsely observed longitudinal data. Key challenges that will be addressed include the lack of vector space structure when dealing with non-Euclidean objects; the infinite dimensionality and degeneracy that arise when defining depth notions for functional data; detecting trajectories and images that are outlying in shape and not just at a single time point; and the sparsity and irregularity of observations in longitudinal data. Method and theory development will draw from metric geometry, functional data analysis, empirical process theory, and M-estimation. Software implementing a suite of depth-based methods will be made available to the public as an outcome of the project.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
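
For readers unfamiliar with the Euclidean baseline the abstract refers to, the following is a minimal sketch of Tukey's halfspace depth and the center-outward ranking it induces. It is not the project's software or the PIs' method for object or functional data. The function name `halfspace_depth`, the parameter `n_directions`, and the simulated Gaussian sample are assumptions made for illustration. The minimum over all halfspaces is approximated with random directions, so the returned value can only overestimate the exact depth.

```python
import numpy as np

def halfspace_depth(x, sample, n_directions=500, seed=0):
    """Approximate Tukey halfspace depth of x with respect to sample (n x d).

    The depth is the smallest fraction of sample points contained in any
    closed halfspace whose boundary passes through x. Here the minimum is
    taken over random unit directions only (an illustrative approximation).
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    x = np.asarray(x, dtype=float)
    d = sample.shape[1]
    # Draw random unit vectors as candidate halfspace normals.
    u = rng.standard_normal((n_directions, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    # Signed projections of the sample, centered at x, onto each direction.
    proj = u @ (sample - x).T  # shape (n_directions, n)
    # Fraction of points in the closed halfspace {y : <u, y - x> >= 0}.
    frac = (proj >= 0).mean(axis=1)
    return float(frac.min())

# Hypothetical data: 200 draws from a bivariate standard normal.
rng = np.random.default_rng(1)
sample = rng.standard_normal((200, 2))

# Depth of every observation, then a center-outward ranking
# (deepest observation first, most outlying last).
depths = np.array([halfspace_depth(p, sample) for p in sample])
order = np.argsort(-depths)
```

Observations at the end of `order` have the lowest depth and are natural candidates for outlier screening, while a point far outside the data cloud, such as `[100.0, 100.0]`, receives depth 0.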