WordCruncher Monthly

Vocabulary Dispersion Report

Acclivity Rings

What if we ranked vocabulary not by how often a word occurs, but by the number of texts in which it occurs? The term “acclivity” is ranked #343 in frequency (occurring 101 times) in Brandon Sanderson’s novels Skyward and Starsight, but does not even rank in the top 60,000 words in the Corpus of Contemporary American English (occurring only 9 times). Although it occurs many times in these two books, it isn’t a word you’re likely to encounter elsewhere.

Dispersion

Linguists use a measurement called “dispersion” to describe how evenly a word is spread across the texts in a corpus. A word with even dispersion occurs in many of the texts; a word with uneven dispersion occurs in only a few. One common use of dispersion is in frequency dictionaries, which sort words by both frequency and dispersion. Here is an example from Routledge’s A Frequency Dictionary of German:

Frequency Dictionary Example

This dictionary shows the normalized frequency (Freq) and a dispersion metric (1.0 being the most evenly dispersed and 0.0 being the least). The methods for calculating dispersion vary, but for the purposes of this article, I’ll use R%, which is essentially the percentage of texts that contain the word “X” (regardless of how often the word occurs in each text).
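To make R% concrete, here is a minimal Python sketch. It assumes each text is already split into a list of lowercase words; the function name range_percent and the four-text toy corpus are made up for illustration and are not how WordCruncher computes the statistic internally.

```python
def range_percent(word, texts):
    """Return the percentage of texts that contain `word` at least once.

    `texts` is a list of token lists (one list per text). How often the
    word occurs inside a text is deliberately ignored.
    """
    if not texts:
        return 0.0
    texts_with_word = sum(1 for tokens in texts if word in tokens)
    return 100.0 * texts_with_word / len(texts)


# Hypothetical toy corpus: four very short "texts".
toy_texts = [
    "the acclivity ring lifted the ship".split(),
    "the king spoke".split(),
    "her eyes met the king".split(),
    "the ship rose".split(),
]

for word in ("the", "king", "acclivity"):
    print(word, range_percent(word, toy_texts))
```

In this toy corpus, "the" appears in every text, "king" in two of the four, and "acclivity" in only one, so their R% values fall in that order.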

Ranking Words by Dispersion

In the TED Talk corpus, the word king has a frequency of 2,770 and a dispersion of 50.6. It is ranked #67 by frequency but only #178 by dispersion, because its occurrences are concentrated in fewer sections of the corpus than its frequency alone would suggest. The word eyes, on the other hand, has a frequency of 651 and is ranked #209 by frequency but #177 by dispersion. Although eyes is less frequent than king, it is found in more sections of the corpus.
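The same effect can be reproduced on a small scale. The sketch below is only an illustration under assumptions: the function rank_words and the toy sections are hypothetical, ties are broken alphabetically, and dispersion is taken to be the share of sections containing the word (the R% described above).

```python
from collections import Counter


def rank_words(texts):
    """Rank every word by total frequency and by how many texts contain it."""
    freq = Counter()       # total occurrences across all texts
    doc_count = Counter()  # number of texts containing the word
    for tokens in texts:
        freq.update(tokens)
        doc_count.update(set(tokens))  # count each word once per text

    # Sort descending by count; break ties alphabetically (an assumption).
    by_freq = sorted(freq, key=lambda w: (-freq[w], w))
    by_disp = sorted(doc_count, key=lambda w: (-doc_count[w], w))
    freq_rank = {w: i for i, w in enumerate(by_freq, start=1)}
    disp_rank = {w: i for i, w in enumerate(by_disp, start=1)}

    return {
        w: {
            "freq": freq[w],
            "r_percent": 100.0 * doc_count[w] / len(texts),
            "freq_rank": freq_rank[w],
            "disp_rank": disp_rank[w],
        }
        for w in freq
    }


# Hypothetical toy sections: "king" is frequent but confined to one section.
toy_sections = [
    "king king king king crown".split(),
    "eyes saw the crown".split(),
    "eyes closed".split(),
    "the eyes opened".split(),
]

stats = rank_words(toy_sections)
for word in ("king", "eyes"):
    print(word, stats[word])
```

Here "king" is the most frequent word but occurs in only one section, so it drops in the dispersion ranking, while "eyes" occurs fewer times overall but in more sections and moves to the top.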

That doesn't mean that frequency is inherently bad or that dispersion is better. Both are valuable pieces of information, and each tells a different story about the text. When I look at word lists, I prefer to see both frequency and dispersion to help me understand more about the words within my text. To see the Vocabulary Dispersion Report:

  1. Open a book.
  2. Go to Analyze > Book Reports > Vocabulary Dispersion Report.

See Other Articles from October 2021