Lesk's Algorithm

In Natural Language Processing (NLP), word sense disambiguation (WSD) is the task of determining which “sense” (meaning) of a word is activated by its use in a specific context. People appear to do this mostly unconsciously.

The Lesk algorithm is a seminal dictionary-based approach to word sense disambiguation, introduced by Michael E. Lesk in 1986. It is founded on the idea that words used together in a text are related to one another, and that this relationship shows up in their dictionary definitions. To disambiguate two (or more) words, the algorithm picks the combination of dictionary senses whose definitions share the most words. Put differently, it assumes that words in a given “neighborhood” (a portion of text) tend to share a common theme. In a simplified version of the Lesk algorithm, each dictionary definition of the ambiguous word is compared with the words in its neighborhood, rather than with the definitions of those words.
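
As a rough illustration of the pairwise idea, here is a minimal Python sketch that scores every combination of senses for two words by the number of words their definitions share. The glosses for "pine" and "cone", the sense labels, the stopword list, and the function names are all invented for this example and are not taken from a real dictionary or library. There is no stemming, so "tree" and "trees" do not count as a match.

```python
import re
from itertools import product

# Hypothetical toy glosses, invented for this example (not from a real dictionary).
TOY_GLOSSES = {
    "pine": {
        "pine_tree": "a kind of evergreen tree with needle shaped leaves",
        "pine_yearn": "to waste away through sorrow or illness",
    },
    "cone": {
        "cone_shape": "a solid body which narrows to a point",
        "cone_fruit": "the fruit of certain evergreen trees",
    },
}

# Assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "to", "or", "with"}

def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}

def lesk_pair(word_a, word_b, glosses=TOY_GLOSSES):
    """Return ((sense_a, sense_b), overlap) for the sense pair whose glosses share the most words."""
    best_pair, best_overlap = None, -1
    for (sense_a, gloss_a), (sense_b, gloss_b) in product(
        glosses[word_a].items(), glosses[word_b].items()
    ):
        overlap = len(tokenize(gloss_a) & tokenize(gloss_b))
        if overlap > best_overlap:  # ties keep the first pair seen
            best_pair, best_overlap = (sense_a, sense_b), overlap
    return best_pair, best_overlap

print(lesk_pair("pine", "cone"))
```

With these toy glosses, the only shared word is "evergreen", so the tree sense of "pine" is paired with the fruit sense of "cone".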

A basic implementation of the Lesk algorithm involves the following steps:

  • For each sense of the word being disambiguated, count the words that appear both in the neighborhood of the word and in the dictionary definition of that sense.
  • Pick the sense with the highest count.
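
The two steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the glosses for "bank", the sense labels, the stopword list, and the function names are assumptions made up for the example. The whole input sentence is treated as the neighborhood, and ties go to the first sense listed.

```python
import re

# Hypothetical toy glosses, invented for this example (not from a real dictionary).
TOY_GLOSSES = {
    "bank": {
        "bank_financial": "an institution that accepts deposits of money and makes loans",
        "bank_river": "the sloping land beside a river or stream",
    }
}

# Assumed stopword list so that function words do not inflate the count.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "i", "my", "at"}

def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}

def simplified_lesk(word, context, glosses=TOY_GLOSSES):
    """Return (best_sense, overlap) for `word`, using `context` as its neighborhood."""
    neighborhood = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses[word].items():
        # Step 1: count words shared by the neighborhood and this sense's definition.
        overlap = len(neighborhood & tokenize(gloss))
        # Step 2: keep the sense with the highest count (ties keep the first sense).
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap

print(simplified_lesk("bank", "I deposited my money at the bank to repay the loans"))
print(simplified_lesk("bank", "We fished from the bank of the river"))
```

Because the sketch matches exact word forms only, "deposited" in the first sentence does not match "deposits" in the gloss. Adding stemming or lemmatization would be a natural refinement.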