Research Spending & Results

Award Detail

Awardee: RECTOR & VISITORS OF THE UNIVERSITY OF VIRGINIA
Doing Business As Name: University of Virginia Main Campus
PD/PI:
  • Vicente Ordonez
  • (434) 982-2225
  • vo2m@virginia.edu
Award Date: 06/15/2021
Estimated Total Award Amount: $499,760
Funds Obligated to Date: $149,760
  • FY 2021: $149,760
Start Date: 09/01/2021
End Date: 08/31/2026
Transaction Type: Grant
Agency: NSF
Awarding Agency Code: 4900
Funding Agency Code: 4900
CFDA Number: 47.070
Primary Program Source: 040100 NSF RESEARCH & RELATED ACTIVITIES
Award Title or Description: CAREER: Teaching Machines to Recognize Complex Visual Concepts in Images through Compositionality
Federal Award ID Number: 2045773
DUNS ID: 065391526
Parent DUNS ID: 065391526
Program: Robust Intelligence
Program Officer:
  • Jie Yang
  • (703) 292-4768
  • jyang@nsf.gov

Awardee Location

Street: P.O. BOX 400195
City: CHARLOTTESVILLE
State: VA
ZIP: 22904-4195
County: Charlottesville
Country: US
Awardee Cong. District: 05

Primary Place of Performance

Organization Name: University of Virginia Main Campus
Street: 85 Engineers Way
City: Charlottesville
State: VA
ZIP: 22904-4740
County: Charlottesville
Country: US
Cong. District: 05

Abstract at Time of Award

Modern computational systems for image recognition can be taught to detect objects across large sets of categories. However, teaching a machine to recognize each new category requires human operators to annotate a large number of images with categorical labels. In practice, many applications require a custom set of categories. For instance, a visual recognition model for detecting different types of furniture in an e-commerce application might require very specific categories such as ‘rocking chair’, ‘swivel chair’, ‘accent chair’, or ‘swivel accent chair’. Even an expert domain user with a clear idea of which visual characteristics distinguish each type of chair would still have to teach the system by annotating images one at a time.

The goal of this project is to enable richer modes of interaction in which ‘machine teachers’ can guide an image recognition system through direct feedback on the types of visual characteristics that matter for each new category. To this end, we plan to exploit principles of compositionality, whereby new categories can be defined in terms of basic concepts that are easier to recognize. The project will also integrate research with education and involve undergraduate students from underrepresented groups in the research.

This project will devise new models that learn to recognize visual concepts compositionally, by first discovering and then learning to recognize visual primitives that are shared across many classes. This process will also be tailored to maximize utility in an environment where a user can guide the model through natural interactions, including the use of language and direct manipulation through a visual interface. The project will 1) develop methods to learn compositionally and interactively from textual descriptions, 2) propose methods to automatically discover primitives that are composable across categories, and 3) propose models that support interactions even after deployment. These three research aims will be complemented by a comprehensive evaluation plan, a public platform that exposes our methods in an interactive environment, and broadening-participation activities. This research effort will produce novel designs for visual recognition models that offer people more expressive ways of guiding and training them.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
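
To make the compositionality principle above concrete, here is a minimal Python sketch that scores a composite furniture category by combining shared primitive detectors. It is illustrative only: the abstract does not specify a model, so the primitive names, the scores, and the min-based (AND-like) combination rule are assumptions, not the project's actual method.

# Illustrative sketch only: primitive names, scores, and the min-combination
# rule are assumptions; the award abstract does not specify a model.

# Scores a primitive-concept detector might assign to one image,
# e.g. probabilities from independent binary classifiers.
primitive_scores = {
    "chair": 0.95,
    "swivel": 0.80,
    "accent": 0.10,
    "rocking": 0.05,
}

# Composite categories defined as sets of primitives, so a new category like
# 'swivel accent chair' needs only a definition, not newly annotated images.
compositions = {
    "rocking chair": {"chair", "rocking"},
    "swivel chair": {"chair", "swivel"},
    "accent chair": {"chair", "accent"},
    "swivel accent chair": {"chair", "swivel", "accent"},
}

def composite_score(category: str) -> float:
    """Score a composite category as the weakest of its required primitives
    (a conservative AND-like rule; other combinations, e.g. products, also work)."""
    return min(primitive_scores[p] for p in compositions[category])

if __name__ == "__main__":
    for category in compositions:
        print(f"{category}: {composite_score(category):.2f}")

The point of the sketch is that adding a category such as ‘swivel accent chair’ requires only one new definition line, rather than a fresh set of annotated images.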
