Institute of Information Science, Academia Sinica
Harvesting Geographic Features from Heterogeneous Raster Maps
*Abstract*
Maps are widely available for areas around the globe and are an 
important source of geospatial data. Due to the popularity of 
Geographic Information Systems (GIS), high-quality scanners, and the 
Internet, we can now obtain various maps in raster format. Compared 
to other geospatial data, raster maps are easily accessible and provide 
geographic features that are difficult to find elsewhere, such as 
landmarks in historical maps. Moreover, for certain types of 
geographic features, raster maps contain the most complete set of data, 
such as the United States Geological Survey (USGS) topographic maps 
that have the contour lines of the entire United States at various 
scales. We can exploit the geographic features in raster maps (e.g., 
roads and text labels) to provide additional knowledge for viewing 
and understanding other geospatial data. For example, we can produce 
map context by extracting layers of geographic features (e.g., text 
layers) and recognizing the features (e.g., text labels) from raster 
maps. The map context can then be used for fusing the maps with other 
geospatial data and further for indexing and retrieval of the maps and 
the other geospatial data that are aligned to the raster maps. For 
instance, we can create a keyword-search function for imagery by 
exploiting the recognized text labels from raster maps that are aligned 
to the imagery.

Harvesting the geographic features in raster maps is a challenging task 
because of the varying image quality (e.g., scanned maps with poor 
image quality and digitally generated maps with good image quality), the 
complexity of maps (i.e., overlapping features in maps), and the 
typical lack of metadata (e.g., map geocoordinates, map source, and 
original vector data). To overcome these difficulties, I first present 
two map decomposition techniques, each requiring a different amount 
of user input, that decompose raster maps with varying image quality 
into feature layers (i.e., a feature layer is an image of a particular 
geographic feature). Second, I present feature recognition techniques 
that convert the feature layers into machine-editable map context, 
such as extracting road vector data from a road layer. We can then fuse 
the extracted feature layers and recognized features with other 
geospatial data to generate a hybrid view and create context of the 
integrated data, such as annotating roads by aligning an extracted 
text layer from a street map to imagery. In conclusion, my approach 
enables us to make use of the geospatial information of heterogeneous 
maps locked in raster format.
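
As a rough illustration of what a feature layer is, the sketch below splits a tiny raster map into one binary image per geographic feature. It assumes the user supplies a mapping from pixel colors to layer names; the abstract does not state how the presented techniques decompose a map, so the color-based rule, the function name `decompose_by_color`, and the sample data are all hypothetical.

```python
def decompose_by_color(raster, color_to_layer):
    """Split a raster (rows of RGB tuples) into binary feature layers.

    Assumption: each feature is drawn in a known color given by the user.
    This is an illustrative stand-in, not the method from the talk.
    """
    height, width = len(raster), len(raster[0])
    # One blank binary image per layer name.
    layers = {
        name: [[0] * width for _ in range(height)]
        for name in set(color_to_layer.values())
    }
    for r, row in enumerate(raster):
        for c, pixel in enumerate(row):
            name = color_to_layer.get(pixel)
            if name is not None:  # background pixels belong to no layer
                layers[name][r][c] = 1
    return layers


WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (255, 0, 0)

# Hypothetical 3x3 map: a vertical black road and two red text pixels.
raster = [
    [WHITE, BLACK, WHITE],
    [RED, BLACK, WHITE],
    [WHITE, BLACK, RED],
]

layers = decompose_by_color(raster, {BLACK: "road", RED: "text"})
for name in sorted(layers):
    print(name, layers[name])
```

Scanned maps rarely have such clean colors, which is one reason the varying image quality mentioned above makes the real task harder than this sketch suggests.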

*BIO*

Yao-Yi Chiang is currently a Ph.D. candidate at the University of 
Southern California. He received his M.S. degree in Computer Science 
from the University of Southern California in December 2004 and his 
Bachelor's degree in Information Management from the National Taiwan 
University in June 2000. His research interests are in the automatic 
fusion of geographical data. He has worked extensively on the problem 
of automatically utilizing raster maps for understanding other 
geospatial data. He has written and co-authored several papers on 
automatically fusing maps and imagery as well as automatic map 
processing. Prior to his doctoral study at USC, Yao-Yi worked as a 
Research Scientist for the Information Sciences Institute and Geosemble 
Technologies.