Research Spending & Results

Award Detail

Awardee:ROCHESTER INSTITUTE OF TECHNOLOGY (INC)
Doing Business As Name:Rochester Institute of Tech
PD/PI:
  • Richard Zanibbi
  • (585) 475-7525
  • rlaz@cs.rit.edu
Co-PD(s)/co-PI(s):
  • Anurag Agarwal
Award Date:12/01/2017
Estimated Total Award Amount: $ 498,928
Funds Obligated to Date: $ 498,928
  • FY 2018=$498,928
Start Date:12/01/2017
End Date:11/30/2020
Transaction Type:Grant
Agency:NSF
Awarding Agency Code:4900
Funding Agency Code:4900
CFDA Number:47.070
Primary Program Source:040100 NSF RESEARCH & RELATED ACTIVIT
Award Title or Description:III: Small: Improving Technical Paper Database Search through Math-Aware Search Engines
Federal Award ID Number:1717997
DUNS ID:002223642
Parent DUNS ID:002223642
Program:INFO INTEGRATION & INFORMATICS
Program Officer:
  • James French
  • (703) 292-0000
  • jfrench@nsf.gov

Awardee Location

Street:1 LOMB MEMORIAL DR
City:ROCHESTER
State:NY
ZIP:14623-5603
County:Rochester
Country:US
Awardee Cong. District:25

Primary Place of Performance

Organization Name:Rochester Institute of Tech
Street:
City:
State:NY
ZIP:14623-5608
County:Rochester
Country:US
Cong. District:25

Abstract at Time of Award

Today's search engines make use of sophisticated techniques for searching based upon words, but are not able to make nuanced use of mathematical notation. This project aims to allow scientists, engineers, mathematicians, and students to locate technical information using words, mathematical notation, or some of each. For example, a mathematician studying graph theory could use these new capabilities to find related applications in physics, ecology, and social network analysis, despite any differences in the notation and terminology used in those disciplines. Given a large collection of technical documents, we will apply machine learning techniques to construct associations between the formulae and words used to explain mathematical ideas, and determine how to translate automatically between those two forms of expression. These associations and translations can then be used by students who write what they are looking for using words, with the search engine finding documents that express those same ideas, even if only in mathematical notation. These new math-aware search engines will accelerate innovation by allowing searchers to discover information both across technical disciplines and, by using mathematical notation as a pivot, even across human languages.

To accomplish these goals, the project will develop novel scalable techniques for indexing and retrieval of mathematical content in technical documents. These methods will accommodate a broad range of notational conventions, formats, and encodings. New context-based methods for inferring associations between formulae and related text will be used to build rich and flexible models of content equivalence. These equivalence models will be used in new ranking algorithms that integrate results found using words or using mathematical notation into a single ranked list. Open-source reference implementations will be shared publicly, and new test collections created to evaluate these implementations will be shared with other researchers. To gain experience with the use of these new capabilities, the project will add math-aware search to the CiteSeerX digital library of scientific literature. CiteSeerX is an open Web service that can be used to compare alternative retrieval methods in actual use. For further information see the project Web page: https://www.cs.rit.edu/~dprl/math-aware-search.html.
