Research Spending & Results

Award Detail

Awardee:THE UNIVERSITY OF SOUTH DAKOTA
Doing Business As Name:University of South Dakota Main Campus
PD/PI:
  • Paula M Mabee
  • (605) 677-6171
  • pmabee@usd.edu
Award Date:09/17/2019
Estimated Total Award Amount: $ 425,211
Funds Obligated to Date: $ 217,642
  • FY 2019=$217,642
Start Date:10/01/2019
End Date:09/30/2021
Transaction Type:Grant
Agency:NSF
Awarding Agency Code:4900
Funding Agency Code:4900
CFDA Number:47.070
Primary Program Source:040100 NSF RESEARCH & RELATED ACTIVIT
Award Title or Description:Collaborative Research: Biology-guided neural networks for discovering phenotypic traits
Federal Award ID Number:1940340
DUNS ID:929930808
Parent DUNS ID:929538999
Program:HDR-Harnessing the Data Revolu
Program Officer:
  • Peter McCartney
  • (703) 292-8470
  • pmccartn@nsf.gov

Awardee Location

Street:414 E CLARK ST
City:Vermillion
State:SD
ZIP:57069-2307
County:Vermillion
Country:US
Awardee Cong. District:00

Primary Place of Performance

Organization Name:University of South Dakota Main Campus
Street:
City:
State:SD
ZIP:57069-2307
County:Vermillion
Country:US
Cong. District:00

Abstract at Time of Award

Unlike genetic data, the traits of organisms, such as their visible features, are not available in databases for analysis. The lack of machine-readable trait data has slowed progress on four grand challenge problems in biology: predicting the genes that generate traits, understanding the patterns of evolution, predicting the effects of ecological change, and species identification. This project will use advances in machine learning and machine-readable biological knowledge to create a new method to automatically identify traits from images of organisms. Images of organisms are widely available, and this new method could be used to rapidly harvest traits that could be used to solve the grand challenges in biology. Large image collections and corresponding digital data from fishes will be used in this study because of the extensive resources available for these organisms. The new machine learning model can be generalized to other disciplines that have similar machine-readable knowledge, and it will help in explaining the results of artificial intelligence, thus advancing the field of computer science. The new method stands to benefit society in application to areas such as agriculture or medicine, where trait discovery from images is critical in disease diagnosis. The project will support the education of students and postdocs in biology, computer science, and information science. It will disseminate its findings through workshops, presentations, publications, and open access to data and code that it produces.

This project will leverage advances in state-of-the-art machine learning to develop a novel class of artificial neural networks that can exploit the machine-readable and predictive knowledge about biology that is available in the form of phylogenies and anatomy ontologies. These biology-guided neural networks are expected to automatically detect and predict traits from specimen images, with little training data. Image-based trait data derived from this work will enable progress in gene-phenotype mapping to novel traits and understanding patterns of evolution. The resulting machine learning model can be generalized to other disciplines that have formally structured knowledge, and will contribute to advances in computer science by going beyond black-box learning and making important advances toward Explainable Artificial Intelligence. It may be extended to applied areas, such as agriculture or the biomedical domain. The research will be piloted using teleost fishes because of many high-quality data resources (digital images, evolutionary trees, anatomy ontology). Methods for automated metadata quality assessment and provenance tracking will be developed in the course of this project to ensure the results and processes are verifiable, replicable and reusable. These will broadly impact the many domains that will adopt machine learning as a way to make discoveries from images. This convergent research will accelerate scientific discovery across the biological sciences and computer science by harnessing the data revolution in conjunction with biological knowledge.

This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity, and is jointly supported by the HDR and the Division of Biological Infrastructure within the NSF Directorate for Biological Sciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
