
Wikipedia and the meaning of words

Word sense disambiguation (WSD) is the task of determining the meaning of a word from its context. Broad-coverage dictionaries list more than 20,000 ambiguous words in each language, which makes WSD one of the most difficult problems in natural language processing.

Using Wikipedia as a source of definitions, synonyms and translations, NSF-funded researchers built automated WSD systems with significantly reduced error rates compared to other WSD systems for English and other languages, including Spanish, Italian and German.

Although this approach has some inherent limitations, they are outweighed by the advantage that Wikipedia offers: sense-tagged data for a large number of words at virtually no cost. The approach is applicable to any language for which a Wikipedia version exists. Currently, Wikipedia versions with at least 20,000 articles are available for more than 100 languages.
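To make "sense-tagged data at virtually no cost" concrete, here is a minimal sketch of one way such data could be harvested. The article does not describe the mechanism, so the sketch assumes the tags come from piped links in Wikipedia markup, where a link such as [[Bar (music)|bar]] pairs the word as written with the article that identifies its sense. The function name, the regular expression and the sample sentences are hypothetical illustrations, not the researchers' system.

```python
import re
from collections import defaultdict

# Assumption: sense labels come from piped wiki links of the form
# [[Target article|surface text]], e.g. [[Bar (music)|bar]].
PIPED_LINK = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")


def sense_tagged_examples(wikitext):
    """Group sentences by (surface word, linked article title)."""
    examples = defaultdict(list)
    # Naive sentence split; real text would need a proper sentence tokenizer.
    for sentence in re.split(r"(?<=[.!?])\s+", wikitext.strip()):
        # Plain-text context: replace each link with its surface text only.
        plain = PIPED_LINK.sub(lambda m: m.group(2), sentence)
        for target, surface in PIPED_LINK.findall(sentence):
            examples[(surface.lower(), target.strip())].append(plain)
    return dict(examples)


# Hypothetical sample markup with one ambiguous word used in two senses.
sample = (
    "The opening [[Bar (music)|bar]] sets the tempo. "
    "They met at a [[Bar (establishment)|bar]] downtown."
)

tagged = sense_tagged_examples(sample)
for (word, sense), contexts in sorted(tagged.items()):
    print(word, "->", sense, contexts)
```

Each (word, article title) pair acts as a sense label, and the sentences collected under it could serve as training examples for a classifier. Because the same link syntax is used across Wikipedia language editions, a harvest of this kind would carry over to other languages with little change.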

WSD can be used in research applications such as machine translation and information retrieval, as well as in educational applications such as text adaptation, answer grading and assistance for language learners through synonyms or translations provided in context.

Image: A Wikipedia-based system assists language learners by providing definitions, synonyms and translations.
Credit: Rada Mihalcea, University of Michigan
