Neighborhood Report
This report analyzes the words surrounding a search word. It is based on the idea that "you shall know a word by the company it keeps" (John Firth). Neighboring words, especially ones that occur commonly with your search word, can provide new insights into word usage.
To access the report, do a search and then go to Book Reports (Analyze) > Search Results Report > Neighborhood (Collocation).
Using the Report
The Neighbors (Collocates) tab displays a list of all the words that occur near a search hit. These words are the “neighbors” of the search words.
The report ranks each word based on how often the word appears near the search hit in relation to how often it appears with other words. Those that are highlighted in blue are called “friends” because they co-occur the most with the search word. The darker the blue, the stronger the friend.
Save, rest, and everyday are the strongest friends of the word life in the TED Talk Corpus.
Double-clicking on any of the neighbors will filter the Neighborhoods tab to show only the neighborhoods with that neighbor.
Options
The hamburger menu button has the following options:
Option | Description |
---|---|
Copy | Copy the selected neighbor(s) to the clipboard. |
Copy All | Copy all neighbors to the clipboard. |
Export All | Export all neighbors to a CSV or TXT file. |
Filter | Filter the list of neighbors based on any of the visible columns. |
View Selected Neighbors | View the selected neighbor(s) in the Neighborhoods tab. This filters the Neighborhoods tab to only display neighborhoods that contain the selected neighbors. |
Remove Filters | Remove any filters that are currently filtering the neighbors list. |
Sort | Sort the list of neighbors based on any of the visible columns in descending order. Alternatively, just click on any column header to sort in descending or ascending order. |
Reset Sort | Resets the sort options to show the default sort, which shows the strongest friends first. |
Columns | Show or hide additional columns, including several of the statistics used. You can also optimize the widths of all the columns in this submenu. |
The Neighborhoods tab displays the search word(s) in relation to the strongest friends. This tab is used for finding patterns between the search word(s) and neighboring words.
If you are analyzing a book tagged by part of speech, you will see the part of speech displayed above each word (as long as Ignore part of speech is not turned on in the user preferences menu).
Note: It can be useful to limit your window size to the words that the search term generally influences. For example, adjectives in English generally influence the words after them, so you may want to change your window size to only view words after the search term and not before. To change your window size, see Window Size in settings.
Options
The hamburger menu button for the Neighborhoods tab has the following options:
Option | Description |
---|---|
Copy | Copy the selected neighborhood(s) to the clipboard. |
Copy All | Copy all neighborhoods to the clipboard. |
Export All | Export all neighborhoods to a CSV or TXT file. |
Filter | Filter the list of neighborhoods based on any of the visible columns. |
Remove Filters | Remove any filters that are currently filtering the neighborhoods. |
Sort | Sort the list of neighborhoods based on any of the visible columns in descending order. Alternatively, just click on any column header to sort in descending or ascending order. |
Reset Sort | Resets the sort options to show the default sort, which shows the strongest friends first. |
Optimize Column Widths | Resizes the column widths to show all text. |
The Phrases tab lists all of the phrases with the search term that occur more than once. Frequency, size, and position columns are provided to sort and filter the phrases.
Double-clicking on any of the phrases will filter the Neighborhoods tab to show only the neighborhoods with that phrase.
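As a rough illustration of what "phrases with the search term that occur more than once" can mean, the sketch below counts every word sequence of 2 to `max_size` tokens that contains the search word and keeps only the repeated ones. The function name `repeated_phrases`, the `max_size` limit, and the whitespace tokenization are assumptions made for this example; the exact rules WordCruncher applies are not described here.

```python
from collections import Counter

def repeated_phrases(tokens, search_word, max_size=4):
    """Return {phrase: frequency} for phrases containing search_word
    that occur more than once (hypothetical helper, not WordCruncher's API)."""
    counts = Counter()
    for size in range(2, max_size + 1):
        for start in range(len(tokens) - size + 1):
            gram = tuple(tokens[start:start + size])
            if search_word in gram:
                counts[gram] += 1
    # Keep only phrases seen at least twice
    return {" ".join(gram): n for gram, n in counts.items() if n > 1}

tokens = "save a life today and save a life tomorrow".split()
print(repeated_phrases(tokens, "life"))
# {'a life': 2, 'save a life': 2}
```

The phrase size in this sketch is simply the number of tokens in the phrase, which is one way to read the size column mentioned above.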
Option | Description |
---|---|
Copy | Copy the selected phrase(s) to the clipboard. |
Copy All | Copy all phrases to the clipboard. |
Export All | Export all phrases to a CSV or TXT file. |
Filter | Filter the list of phrases based on any of the visible columns. |
View Selected Phrases | View the selected phrase(s) in the neighborhood tab. This filters the neighborhood tab to only display neighborhoods that contain the selected phrases. |
Remove Filters | Remove any filters that are currently filtering the phrases. |
Sort | Sort the list of phrases based on any of the visible columns in descending order. Alternatively, just click on any column header to sort in descending or ascending order. |
Reset Sort | Resets the sort options to show the default sort, which shows the most frequent phrases first. |
Optimize Column Widths | Resizes the column widths to show all text. |
Families are composed of two or more friends that all co-occur with each other. There is no established linguistic term for this, but a family could be considered a word network. This report is intended to give insight into word usage in clusters rather than just with individual words.
The Visit Cousins button in the upper-right portion of this tab will show the search results where the family occurs without the current search word. If there are no cousins for a family, it means that the words in the family only ever occur together with the search word, which indicates a close co-occurrence between the words.
Double-clicking on any of the families will filter the Neighborhoods tab to show only the neighborhoods with that family.
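One way to picture a family is as a group of friends that show up together in the same neighborhood. The sketch below counts such groups under that assumption. The function name `family_counts`, the `size` parameter, and the sample data are hypothetical; WordCruncher's own definition of co-occurrence within a family may differ.

```python
from collections import Counter
from itertools import combinations

def family_counts(neighborhoods, friends, size=2):
    """Count groups of `size` friends that appear in the same neighborhood.
    Assumes 'co-occur' means 'found in the same neighborhood'."""
    counts = Counter()
    friend_set = set(friends)
    for hood in neighborhoods:
        present = sorted(set(hood) & friend_set)
        for combo in combinations(present, size):
            counts[combo] += 1
    return counts

hoods = [
    ["save", "a", "life", "every", "day"],
    ["save", "your", "life", "and", "rest"],
    ["save", "a", "life", "each", "day"],
]
print(family_counts(hoods, {"save", "rest", "day"}).most_common())
# [(('day', 'save'), 2), (('rest', 'save'), 1)]
```

In this picture, a cousin would be a place in the book where the family words, for example "day" and "save", appear together but the search word does not.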
Option | Description |
---|---|
Copy | Copy the selected families to the clipboard. |
Copy All | Copy all families to the clipboard. |
Export All | Export all families to a CSV or TXT file. |
Filter | Filter the list of families based on any of the visible columns. |
View Selected Families | View the selected families in the neighborhood tab. This filters the neighborhood tab to only display neighborhoods that contain the selected families. |
Remove Filters | Remove any filters that are currently filtering the families list. |
Sort | Sort the list of families based on any of the visible columns in descending order. Alternatively, just click on any column header to sort in descending or ascending order. |
Reset Sort | Resets the sort options to show the default sort, which shows the most frequent families first. |
Columns | Show or hide additional columns. You can also optimize the widths of all the columns in this submenu. |
Visit Cousins | Show the search results of the selected family without the current search word. |
The Neighborhood, Neighbors, and Friends reports have options for user preferences. Click on the Report Preferences icon in the upper-right corner of the Neighborhood Report window.
Neighborhood Settings
By default, neighbors and neighborhoods analyze five words before and after the search word. You can set the window from 0 to 10 in either direction.
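The effect of the window size can be sketched in a few lines. The example below collects the words that fall within `before` and `after` positions of each search hit; the function name `neighbor_counts` and the whitespace tokenization are assumptions made for illustration, not part of WordCruncher.

```python
from collections import Counter

def neighbor_counts(tokens, search_word, before=5, after=5):
    """Count the words inside the window around every hit of search_word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != search_word:
            continue
        left = tokens[max(0, i - before):i]      # up to `before` words to the left
        right = tokens[i + 1:i + 1 + after]      # up to `after` words to the right
        counts.update(left + right)
    return counts

tokens = "we save a life and then rest for life every day".split()
print(sorted(neighbor_counts(tokens, "life", before=2, after=2)))
# ['a', 'and', 'day', 'every', 'for', 'rest', 'save', 'then']
print(sorted(neighbor_counts(tokens, "life", before=0, after=2)))
# ['and', 'day', 'every', 'then']
```

Setting `before=0` mirrors the case of looking only at words after the search term.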
The Neighborhood Report ignores paragraphing, so if the search term is near the beginning or end of the lowest level (usually a paragraph), it analyzes words from surrounding paragraphs. If the words before or after a paragraph break are unrelated, you may not want them in your search results:
The Do not cross lowest-level bound with neighborhoods checkbox will enforce that neighborhoods must be within the same lowest level.
Note: Loading the report will take longer with this option turned on.
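The difference between crossing and not crossing the lowest-level bound can be shown with a small sketch. Here paragraphs stand in for the lowest level, and the `cross_bounds` flag is a hypothetical stand-in for the checkbox; this illustrates the idea only and is not WordCruncher's implementation.

```python
def neighborhoods(paragraphs, search_word, before=5, after=5, cross_bounds=True):
    """Return (left, right) context lists for every hit of search_word.

    cross_bounds=True treats the whole text as one stream of words;
    cross_bounds=False keeps each window inside its own paragraph.
    """
    tokenized = [p.split() for p in paragraphs]
    if cross_bounds:
        units = [[tok for para in tokenized for tok in para]]
    else:
        units = tokenized
    contexts = []
    for tokens in units:
        for i, tok in enumerate(tokens):
            if tok == search_word:
                left = tokens[max(0, i - before):i]
                right = tokens[i + 1:i + 1 + after]
                contexts.append((left, right))
    return contexts

paragraphs = ["this is the meaning of life", "markets opened higher today"]
print(neighborhoods(paragraphs, "life", before=2, after=2, cross_bounds=True))
# [(['meaning', 'of'], ['markets', 'opened'])]
print(neighborhoods(paragraphs, "life", before=2, after=2, cross_bounds=False))
# [(['meaning', 'of'], [])]
```

With the bound enforced, the unrelated words from the next paragraph no longer appear in the neighborhood.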
Neighbors Settings
Options to ignore case, diacritics, and part of speech (if applicable) are available.
You can use the uncorrected statistics to calculate neighbors. Evert (2008) suggests using the corrected values for surface cooccurrence (e.g., in same neighborhood, L5–R5 span), and the uncorrected values for textual (in same sentence, paragraph, document, web page, …) and syntactic (e.g., verb-object [make + decision], adj. + noun [blue + coat]) cooccurrence. (See the PDF here about statistics used in WordCruncher for more information.)
Friends Settings
By default, friends (the strongest neighbors) are calculated using a statistic called Mutual Information (MI). This statistic tends to show the best results, but several other statistics are available for calculating friends. (See the PDF here about statistics used in WordCruncher for more information.)
Available Statistics: Dice, Expected, LL, Log Dice, Log Ratio, MI, MI2, MI3, MS, Percent, Rating, T-score(o), T-score(pq), Z-score(e), Z-score(pq), ΔP k→n, and ΔP k←n
To be counted as a friend, a word must meet two conditions: 1) its score on the statistic used must reach a minimum threshold (default 3 for MI), and 2) it must reach a minimum sample frequency (default 2).
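The two conditions can be sketched with the common textbook form of Mutual Information, MI = log2(observed / expected), where the expected frequency is estimated as (search word frequency × neighbor frequency) / corpus size. This is an assumption made for illustration: the exact formula WordCruncher uses, including any window-size correction, is described in its statistics PDF and may differ. The function and parameter names are hypothetical.

```python
import math

def mutual_information(pair_freq, node_freq, neighbor_freq, corpus_size):
    """Textbook MI: log2 of observed over expected co-occurrence frequency."""
    expected = node_freq * neighbor_freq / corpus_size
    return math.log2(pair_freq / expected)

def is_friend(pair_freq, node_freq, neighbor_freq, corpus_size,
              min_score=3.0, min_freq=2):
    """Apply both conditions: minimum sample frequency and minimum score."""
    if pair_freq < min_freq:
        return False
    score = mutual_information(pair_freq, node_freq, neighbor_freq, corpus_size)
    return score >= min_score

# Search word occurs 1,000 times, neighbor 500 times, together 8 times,
# in a corpus of 1,000,000 words: expected = 0.5, MI = log2(16) = 4.0
print(mutual_information(8, 1000, 500, 1_000_000))  # 4.0
print(is_friend(8, 1000, 500, 1_000_000))           # True
print(is_friend(1, 1000, 500, 1_000_000))           # False (frequency below 2)
```

The frequency check guards against rare words that score highly on MI after co-occurring with the search word only once.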
Try It Out
- Do a search.
- Find neighbors (collocates) through the neighborhood report.
- Export the report as a CSV or TXT file.