Tip #12: Lemmatization in Web of Science

This week's tip is brought to you by the brilliant SRLibrarianProblems Twitter account:

 
 
What the heck is lemmatization (lemmatisation)? And why is it important to consider for sensitive (yet precise) searches?

Lemmatization is a feature in many databases (we will demonstrate other examples in later blog posts) that attempts to make a relatively simple keyword search more robust (sensitive) behind the scenes. In the SRLibrarianProblems tweet above, a Topic keyword search for aging returns significantly different results with and without quotation marks. So what is going on between the two versions of this simple search?

According to the Web of Science help documentation:

"Web of Science automatically applies lemmatization rules to search queries. Lemmatization reduces inflected forms of a word to their lexical root. With lemmatization turned on, a search term is reduced to its "lemma" and inflected forms of the word are retrieved. As a result, lemmatization can reduce or eliminate the need to use wildcards to retrieve plurals and variant spellings of a word.

For example:

  • cite finds inflected forms of the word cite, such as citing, cites, cited and citation.
  • defense finds spelling variants such as defense and defence

Lemmatization applies only to English-language search terms."

Lemmatization is turned on by default when searching the Topic (TS) or Title (TI) fields. Depending on the type of search you are doing or the specific topic, lemmatization can help or hinder the quality (precision) of your search. If you find that lemmatized terms introduce too much noise (irrelevant results), you can turn it off by: 

  1. Using quotation marks in any part of your search ("aging" finds records that contain the word aging but not ageing)
  2. Using search terms with wildcards (color* finds records that contain the words color, colors, and colorful but not colour, colours, and colourful)
  3. Having a very long search string (I wasn't able to find the exact limit in the documentation). When your search exceeds the limit, WOS returns only exact matches.
  4. Going to Advanced Search, clicking "More options," and turning on "Exact Search."
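
To make the first two rules concrete, here is a toy Python model of the behavior described above. It is only a sketch: the `LEMMA_GROUPS` table and the `matches` function are made up for illustration, built from the examples in this post, and say nothing about how Web of Science actually implements lemmatization.

```python
import fnmatch

# Hypothetical lemma groups, built only from the examples in this post.
# This is NOT Web of Science's real rule set.
LEMMA_GROUPS = [
    {"aging", "ageing"},
    {"color", "colour"},
    {"defense", "defence"},
    {"cite", "cites", "cited", "citing", "citation"},
]


def matches(term: str, word: str) -> bool:
    """Return True if a one-word search term would retrieve `word` in this toy model."""
    term = term.lower()
    word = word.lower()
    # Rule 1: quotation marks switch lemmatization off (exact match only).
    if term.startswith('"') and term.endswith('"'):
        return word == term.strip('"')
    # Rule 2: a wildcard switches lemmatization off (pattern match only).
    if "*" in term:
        return fnmatch.fnmatchcase(word, term)
    # Default: the term also retrieves every word in its lemma group.
    for group in LEMMA_GROUPS:
        if term in group:
            return word in group
    return word == term
```

In this model, `matches("aging", "ageing")` is true but `matches('"aging"', "ageing")` is false, and `matches("color*", "colorful")` is true while `matches("color*", "colour")` is false, which mirrors the examples in the list.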

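Talk to your librarian to see if lemmatization is right for you 😆...or you can use NOT to check the two versions of the search against each other: the records that the lemmatized version retrieves and the non-lemmatized version does not are the ones lemmatization added, so scanning them shows whether turning it off improves the precision of your results.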
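
As a rough sketch of that comparison, the short Python helper below builds the three search strings for a single term. The function name `comparison_queries` is hypothetical, and the one-line NOT query assumes that quotation marks only switch lemmatization off for the quoted term. If they turn out to affect the whole query, run the first two searches separately and combine the sets with NOT in your search history instead.

```python
def comparison_queries(term: str, field: str = "TS") -> dict[str, str]:
    """Build the searches needed to compare lemmatized and exact results.

    `field` is a Web of Science field tag such as TS (Topic) or TI (Title).
    """
    lemmatized = f"{field}=({term})"   # lemmatization left on
    exact = f'{field}=("{term}")'      # quotation marks turn it off
    return {
        "lemmatized": lemmatized,
        "exact": exact,
        # Records retrieved only because of lemmatization.
        "lemmatized_only": f"{lemmatized} NOT {exact}",
    }


for name, query in comparison_queries("aging").items():
    print(f"{name}: {query}")
```

Paste the generated strings into Advanced Search and skim the "lemmatized_only" set: if most of it is irrelevant, the exact version is probably the better choice for that term.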

Comments

  1. ok this is possibly life-changing... i am embarrassed to admit i have been searching WOS for 13 years without ever really learning what lemmatisation is! Bookmarking this post for the next time I search it.

    1. Yay! I'm so happy this one was helpful! They definitely don't make it easy to figure out without digging into the help documentation.
  2. Agreeing with Andy - this is a game changer. Shared with my department!