Neighborhood Report

This report analyzes the words surrounding a search word. It is based on the idea that "you shall know a word by the company it keeps" (John Firth). Neighboring words, especially ones that frequently occur with your search word, can provide new insights into word usage.

To access the report, do a search and then go to Book Reports (Analyze) > Search Results Report > Neighborhood (Collocation).

Using the Report

    The Neighbors (Collocates) tab displays a list of all the words that occur near a search hit. These words are the “neighbors” of the search words.

    The report ranks each word based on how often the word appears near the search hit in relation to how often it appears with other words. Those that are highlighted in blue are called “friends” because they co-occur the most with the search word. The darker the blue, the stronger the friend.

    Example of the Neighbors tab with the most frequent friends

    Save, rest, and everyday are the strongest friends of the word life in the TED Talk Corpus.
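    To make the idea of a neighbor concrete, here is a minimal Python sketch that counts the words found within a fixed window around each hit. It is an illustration only, not WordCruncher's implementation: the function name count_neighbors and its parameters are hypothetical, and how WordCruncher treats overlapping windows is not stated in this help page.

```python
from collections import Counter

def count_neighbors(tokens, search_word, left=5, right=5):
    """Count every word within `left`/`right` positions of each hit."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != search_word:
            continue
        start = max(0, i - left)
        end = min(len(tokens), i + right + 1)
        for j in range(start, end):
            if j != i:  # skip the hit itself
                counts[tokens[j]] += 1
    return counts

tokens = "you can save a life and save a day".split()
neighbors = count_neighbors(tokens, "life", left=2, right=2)
print(neighbors.most_common())
```

    Raw counts like these are only the starting point. The report then ranks each neighbor with a statistic that also takes into account how often the word occurs elsewhere.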



    Double-clicking on any of the neighbors will filter the Neighborhoods tab to show only the neighborhoods with that neighbor.


    Options

    The hamburger menu button has the following options:

    Option                   Description
    Copy                     Copy the selected neighbor(s) to the clipboard.
    Copy All                 Copy all neighbors to the clipboard.
    Export All               Export all neighbors to a CSV or TXT file.
    Filter                   Filter the list of neighbors based on any of the visible columns.
    View Selected Neighbors  View the selected neighbor(s) in the Neighborhoods tab. This filters the Neighborhoods tab to display only neighborhoods that contain the selected neighbors.
    Remove Filters           Remove any filters that are currently applied to the neighbors list.
    Sort                     Sort the list of neighbors based on any of the visible columns in descending order. Alternatively, click any column header to sort in descending or ascending order.
    Reset Sort               Reset the sort options to the default sort, which shows the strongest friends first.
    Columns                  Show or hide additional columns, including several of the statistics used. You can also optimize the widths of all the columns in this submenu.

    The Neighborhoods tab displays the search word(s) in relation to the strongest friends. This tab is used for finding patterns between the search word(s) and neighboring words.

    If you are analyzing a book tagged by part of speech, the part of speech is displayed above each word (unless Ignore part of speech is turned on in the user preferences menu).

    Note: It can be helpful to limit your window size to the range that the search term generally influences. For example, adjectives in English generally influence the words after them, so you may want to change your window size to view only words after the search term and not before. To change your window size, see Window Size in settings.
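    The effect of a one-sided window can be sketched in a few lines of Python. This is a hypothetical illustration (the neighborhoods function and its left and right parameters are not part of WordCruncher): setting left to 0 keeps only the words that follow each hit.

```python
def neighborhoods(tokens, search_word, left=5, right=5):
    """Return (before, hit, after) for each occurrence of search_word."""
    rows = []
    for i, token in enumerate(tokens):
        if token == search_word:
            before = tokens[max(0, i - left):i]   # empty when left is 0
            after = tokens[i + 1:i + 1 + right]
            rows.append((before, token, after))
    return rows

tokens = "the new car passed a very new bridge today".split()
rows = neighborhoods(tokens, "new", left=0, right=2)
for before, hit, after in rows:
    print(hit, "|", " ".join(after))
```

    With left=0 and right=2, the adjective "new" is shown only with the two words after each hit, so words such as "the" and "very" no longer appear as neighbors.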


    Options

    The hamburger menu button for the Neighborhoods tab has the following options:

    Option                   Description
    Copy                     Copy the selected neighborhood(s) to the clipboard.
    Copy All                 Copy all neighborhoods to the clipboard.
    Export All               Export all neighborhoods to a CSV or TXT file.
    Filter                   Filter the list of neighborhoods based on any of the visible columns.
    Remove Filters           Remove any filters that are currently applied to the neighborhoods.
    Sort                     Sort the list of neighborhoods based on any of the visible columns in descending order. Alternatively, click any column header to sort in descending or ascending order.
    Reset Sort               Reset the sort options to the default sort, which shows the strongest friends first.
    Optimize Column Widths   Resize the column widths to show all text.

    The Phrases tab lists all of the phrases with the search term that occur more than once. Frequency, size, and position columns are provided to sort and filter the phrases.
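    As a rough illustration of what the Phrases tab collects, the Python sketch below counts contiguous word sequences that contain the search term and keeps those that occur more than once. The function name repeated_phrases and the max_size limit are assumptions made for this example; WordCruncher's own phrase detection may differ.

```python
from collections import Counter

def repeated_phrases(tokens, search_word, max_size=4):
    """Count contiguous phrases of 2..max_size words that contain search_word."""
    counts = Counter()
    for size in range(2, max_size + 1):
        for start in range(len(tokens) - size + 1):
            phrase = tuple(tokens[start:start + size])
            if search_word in phrase:
                counts[phrase] += 1
    # Keep only the phrases that occur more than once.
    return {phrase: n for phrase, n in counts.items() if n > 1}

tokens = "the rest of my life is the rest of my story".split()
phrases = repeated_phrases(tokens, "rest", max_size=3)
for phrase, frequency in phrases.items():
    print(" ".join(phrase), frequency, len(phrase))
```

    Here the frequency is the number of occurrences and the size is the number of words in the phrase, which correspond loosely to the frequency and size columns described above.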


    Double-clicking on any of the phrases will filter the Neighborhoods tab to show only the neighborhoods with that phrase.


    Option                   Description
    Copy                     Copy the selected phrase(s) to the clipboard.
    Copy All                 Copy all phrases to the clipboard.
    Export All               Export all phrases to a CSV or TXT file.
    Filter                   Filter the list of phrases based on any of the visible columns.
    View Selected Phrases    View the selected phrase(s) in the Neighborhoods tab. This filters the Neighborhoods tab to display only neighborhoods that contain the selected phrases.
    Remove Filters           Remove any filters that are currently applied to the phrases.
    Sort                     Sort the list of phrases based on any of the visible columns in descending order. Alternatively, click any column header to sort in descending or ascending order.
    Reset Sort               Reset the sort options to the default sort, which shows the most frequent phrases first.
    Optimize Column Widths   Resize the column widths to show all text.

    The Families tab lists families, which are composed of two or more friends that all co-occur with each other. There is no established linguistic term for this, but a family could be considered a word network. This report is intended to give insight into word usage in clusters rather than just with individual words.

    The Visit Cousins button in the upper-right portion of this tab shows the search results where the family occurs without the current search word. If there are no cousins for a family, the words in that family only ever occur together alongside the search word, which indicates a close co-occurrence between the words.
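    The relationship between families and cousins can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it only finds families of exactly two friends, it treats each neighborhood and passage as a plain list of words, and the names find_families and find_cousins are hypothetical rather than WordCruncher functions.

```python
from collections import Counter
from itertools import combinations

def find_families(neighborhoods, friends):
    """Count pairs of friends that appear together in the same neighborhood."""
    counts = Counter()
    for words in neighborhoods:
        present = sorted(set(words) & set(friends))
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

def find_cousins(passages, family, search_word):
    """Return passages containing every family member but not the search word."""
    return [p for p in passages
            if set(family) <= set(p) and search_word not in p]

friends = {"save", "every", "rest"}
hoods = [
    ["save", "a", "every", "day"],
    ["save", "money", "every", "month"],
    ["rest", "of", "my"],
]
families = find_families(hoods, friends)

passages = [
    ["we", "save", "money", "every", "month"],
    ["save", "a", "life", "every", "day"],
]
cousins = find_cousins(passages, ("every", "save"), "life")
```

    In this toy data, "every" and "save" form a family that occurs twice, and the first passage is a cousin because both words appear there without "life".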


    Double-clicking on any of the families will filter the Neighborhoods tab to show only the neighborhoods with that family.

    Option                   Description
    Copy                     Copy the selected families to the clipboard.
    Copy All                 Copy all families to the clipboard.
    Export All               Export all families to a CSV or TXT file.
    Filter                   Filter the list of families based on any of the visible columns.
    View Selected Families   View the selected families in the Neighborhoods tab. This filters the Neighborhoods tab to display only neighborhoods that contain the selected families.
    Remove Filters           Remove any filters that are currently applied to the families list.
    Sort                     Sort the list of families based on any of the visible columns in descending order. Alternatively, click any column header to sort in descending or ascending order.
    Reset Sort               Reset the sort options to the default sort, which shows the most frequent families first.
    Columns                  Show or hide additional columns. You can also optimize the widths of all the columns in this submenu.
    Visit Cousins            Show the search results of the selected family without the current search word.

    The Neighborhood, Neighbors, and Friends reports have options for user preferences. Click the Report Preferences icon in the upper-right corner of the Neighborhood Report window.

    Neighborhood Settings

    By default, neighbors and neighborhoods analyze five words before and after the search word. You can set the window from 0 to 10 in either direction.


    By default, the Neighborhood Report ignores paragraphing, so if the search term is near the beginning or end of the lowest level (usually a paragraph), the report also analyzes words from the surrounding paragraphs. If the words before or after a paragraph break are unrelated, you may not want them in your search results.


    Selecting the Do not cross lowest-level bound with neighborhoods checkbox restricts each neighborhood to words within the same lowest level.

    Note: Loading the report will take longer with this option turned on.
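    The difference between crossing and not crossing the lowest level can be illustrated with a short Python sketch. It assumes, for the sake of the example, that the text is a list of paragraphs and that each paragraph is a list of words. The bounded_window function and its cross flag are hypothetical names, not WordCruncher settings.

```python
def bounded_window(paragraphs, para_index, word_index, left=5, right=5, cross=False):
    """Return the neighbors of one hit, optionally staying inside its paragraph."""
    if cross:
        # Flatten all paragraphs and locate the hit in the flat list.
        offset = sum(len(p) for p in paragraphs[:para_index])
        tokens = [w for p in paragraphs for w in p]
        i = offset + word_index
    else:
        # Only look at the paragraph that contains the hit.
        tokens = paragraphs[para_index]
        i = word_index
    before = tokens[max(0, i - left):i]
    after = tokens[i + 1:i + 1 + right]
    return before + after

paragraphs = [
    "that was the end of his life".split(),
    "taxes are due in april".split(),
]
crossing = bounded_window(paragraphs, 0, 6, left=2, right=2, cross=True)
bounded = bounded_window(paragraphs, 0, 6, left=2, right=2, cross=False)
```

    For the hit "life" at the end of the first paragraph, the crossing window picks up "taxes" and "are" from the next paragraph, while the bounded window contains only "of" and "his".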



    Neighbors Settings

    Options to ignore case, diacritics, and part of speech (if applicable) are available.
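    One common way to ignore case and diacritics is to normalize each word before counting it, as in the Python sketch below. This is a general technique shown for illustration. The normalize function and its flags are hypothetical, and this help page does not describe how WordCruncher performs the comparison internally.

```python
import unicodedata

def normalize(word, ignore_case=True, ignore_diacritics=True):
    """Return a comparison key for a word."""
    if ignore_diacritics:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", word)
        word = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if ignore_case:
        word = word.casefold()
    return word

key = normalize("Caf\u00e9")  # "Café" with a precomposed e-acute
print(key)
```

    With both options on, "Café", "café", and "cafe" all map to the same key and are counted as one neighbor.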

    You can use the uncorrected statistics to calculate neighbors. Evert (2008) suggests using the corrected values for surface co-occurrence (e.g., within the same neighborhood, such as an L5–R5 span) and the uncorrected values for textual co-occurrence (within the same sentence, paragraph, document, web page, …) and syntactic co-occurrence (e.g., verb + object [make + decision], adjective + noun [blue + coat]). (See the PDF about the statistics used in WordCruncher for more information.)


    Friends Settings

    By default, friends (the strongest neighbors) are calculated with a statistic called Mutual Information (MI). This statistic tends to give the best results, but several other statistics are available for calculating friends. (See the PDF about the statistics used in WordCruncher for more information.)

    Available Statistics: Dice, Expected, LL, Log Dice, Log Ratio, MI, MI2, MI3, MS, Percent, Rating, T-score(o), T-score(pq), Z-score(e), Z-score(pq), ΔP k→n, and ΔP k←n

    To be counted as a friend, a word must meet 1) a minimum score threshold for the statistic used (default 3 for MI) and 2) a minimum sample frequency (default 2).
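    The two conditions can be sketched in Python using a textbook form of Mutual Information, MI = log2(observed / expected). The exact formula WordCruncher uses is documented in its statistics PDF and may differ. The function names and the span parameter (a window-size correction applied to the expected frequency) are assumptions made for this example.

```python
import math

def mutual_information(observed, node_freq, neighbor_freq, corpus_size, span=1):
    """Pointwise MI: log2 of observed over expected co-occurrence frequency."""
    # span=1 leaves the expected frequency uncorrected for window size.
    expected = node_freq * neighbor_freq * span / corpus_size
    return math.log2(observed / expected)

def is_friend(observed, node_freq, neighbor_freq, corpus_size,
              span=1, min_score=3.0, min_frequency=2):
    """Apply both thresholds: minimum sample frequency and minimum score."""
    if observed < min_frequency:
        return False
    score = mutual_information(observed, node_freq, neighbor_freq, corpus_size, span)
    return score >= min_score

# Hypothetical counts: the pair co-occurs 8 times, each word occurs 100 times,
# and the corpus has 10,000 words, so the expected frequency is 1.
score = mutual_information(8, 100, 100, 10_000)
print(score, is_friend(8, 100, 100, 10_000))
```

    A neighbor that co-occurs only once is rejected by the frequency threshold even if its MI score is high, which helps keep rare, accidental pairings out of the friends list.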


Try It Out

  1. Do a search.
  2. Find neighbors (collocates) through the neighborhood report.
  3. Export the report as a CSV or TXT file.