|Awardee:||FLORIDA INTERNATIONAL UNIVERSITY|
|Doing Business As Name:||Florida International University|
|Estimated Total Award Amount:||$ NaN|
|Funds Obligated to Date:||
|Awarding Agency Code:||4900|
|Funding Agency Code:||4900|
|Primary Program Source:||040100 NSF RESEARCH & RELATED ACTIVIT|
|Award Title or Description:||Investigation of Geospatial Data Management on MapReduce Platform|
|Federal Award ID Number:||0837716|
|Parent DUNS ID:||159621697|
|Program:||CLUSTER EXPLORATORY (CLuE)|
|Street:||11200 SW 8TH ST|
|Awardee Cong. District:||26|
Primary Place of Performance
|Organization Name:||Florida International University|
|Street:||11200 SW 8TH ST|
Abstract at Time of Award
The High Performance Database Research Center at Florida International University is leveraging the Hadoop framework, which implements Google's computational paradigm MapReduce and provides distributed file system services, to serve geospatial imagery and to execute spatial queries with heterogeneous predicates. This work is laying the foundation for high-performance geospatial querying. For instance, a query such as "the percentage of Florida state's land-mass that has vegetation" can be computed using basic image processing (map operation) at each image tile, followed by a simple summation (reduce operation) across the tiles that comprise the aerial imagery of the Florida land-mass. A potentially infinite number of such semantic queries can thus be computed using the MapReduce paradigm and a large-scale raster imagery dataset. This exploratory work is providing a bridge between geospatial Web services and the MapReduce platform, which has demonstrated success in other data-intensive applications. This work is expected to produce a major impact on the field of geospatial data management, and especially on decision support based on geospatial data, by enabling decision support queries which were not previously practical. This will provide a foundation to enable critical decision support applications in fields such as disaster mitigation and environmental protection. This work is also providing a uniquely comprehensive collection of geospatial data to a broad research community.
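The vegetation query described in the abstract can be illustrated with a minimal, single-machine Python sketch. It is not the project's Hadoop implementation: the tile format (rows of `(red, green, blue, is_land)` pixels), the function names, and the "green channel dominates" vegetation test are all assumptions made for illustration only.

```python
from functools import reduce

# Hypothetical tile format: a list of rows, each row a list of
# (red, green, blue, is_land) pixels. Real imagery would differ.


def map_tile(tile):
    """Map step: count vegetated and total land pixels in one tile."""
    land = 0
    vegetated = 0
    for row in tile:
        for red, green, blue, is_land in row:
            if not is_land:
                continue  # skip water and out-of-state pixels
            land += 1
            # Assumed toy classifier: green dominates the other channels.
            if green > red and green > blue:
                vegetated += 1
    return vegetated, land


def reduce_counts(a, b):
    """Reduce step: element-wise sum of two (vegetated, land) pairs."""
    return a[0] + b[0], a[1] + b[1]


def vegetation_percentage(tiles):
    """Apply the map step to every tile, then sum the partial counts."""
    vegetated, land = reduce(reduce_counts, map(map_tile, tiles), (0, 0))
    return 100.0 * vegetated / land if land else 0.0


# Two tiny illustrative tiles (made-up pixel values).
tiles = [
    [[(10, 200, 10, True), (200, 10, 10, True)]],
    [[(10, 200, 10, True), (0, 0, 255, False)]],
]
print(vegetation_percentage(tiles))
```

Because each tile yields only a pair of counts and the summation is associative, the per-tile work could in principle run independently on many nodes, with the partial counts combined in any order.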
Publications Produced as a Result of this Research
Ariel Cary, Ouri Wolfson, and Naphtali Rishe "Efficient and Scalable Method for Processing Top-k Spatial Boolean Queries" Lecture Notes in Computer Science: Proceedings of the 22nd International Conference on Scientific and Statistical Database Management (SSDBM 2010), v.6187, 2010, p.87.
Publications Produced as Conference Proceedings
Cary, A; Sun, ZG; Hristidis, V; Rishe, N "Experiences on Processing Spatial Data with MapReduce" 21st International Conference on Scientific and Statistical Database Management, v.5566, 2009, p.302
Project Outcomes Report
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
MapReduce at Florida International University
NSF Program Director: Xiaoyang Wang
PI: Naphtali Rishe
Co-PIs: Vagelis Hristidis and Raju Rangaswami
Researchers at the NSF Industry-University Research Center CAKE at Florida International University have leveraged their Geospatial Data Server TerraFly project to deploy data and algorithms on the CLuE infrastructure and to develop new algorithms with applications in geographic information retrieval, urban improvement, and disaster mitigation.
TerraFly users visualize and query aerial imagery and data layers. Users virtually "fly" over imagery via a web browser, without any software to install or plug in. Tools include user-friendly geospatial querying, data drill-down, interfaces with real-time data suppliers, demographic analysis, annotation, route dissemination via autopilots, customizable applications, production of aerial atlases, and an application programming interface (API) for web sites.
The TerraFly project has been featured on TV news programs (including FOX TV News) and in the worldwide press, and has been covered by the New York Times, USA Today, NPR, and the journals Science and Nature.
The 40TB TerraFly data collection includes, among others, 1-meter aerial photography of almost the entire United States and 3-inch to 1-foot full-color recent imagery of major urban areas. The TerraFly vector collection includes 400 million geolocated objects, 50 billion data fields, 40 million polylines, and 120 million polygons, including: all US and Canada roads, the US Census demographic and socioeconomic datasets, 110 million parcels with property lines and ownership data, 15 million records of businesses with company stats and management roles and contacts, 2 million physicians with expertise detail, various public place databases (including the USGS GNIS and NGA GNS), Wikipedia, extensive global environmental data (including daily feeds from NASA and NOAA satellites and the USGS water gauges), and hundreds of other datasets.
In the present project, we used MapReduce to execute and benchmark massive data computations in the GIS domain.
The specific problems that FIU’s team has addressed are: