Tip #55: How and why to search previous indexing of MeSH terms in PubMed, plus the alphabet soup of PubMed's date fields

You probably know that Medical Subject Headings (MeSH) evolve over time; you can read more about why and how in the Introduction to MeSH, under Changes to MeSH Terminology. You may also have noticed that, in the MeSH Database, some MeSH headings have no dates, some have one, and some have two. "What do the dates mean in MeSH?" is a hot topic (and is explained in this brief YouTube video, if you'd like the basics in less than two minutes). 

To sum up:

  • If no date is listed, it means the term has been used since MEDLINE began in 1963. 
  • If there's a single date, it's the year the term was introduced, and no other term has ever been used for that concept.
  • If two dates are listed next to Year introduced, which looks like YYYY (earlier yyyy):
    • The first date, YYYY, is when that MeSH heading was added, but not when the concept was first added to MEDLINE.
    • The date in parentheses (earlier yyyy) is when the concept was introduced, but at the time was called something else. Sometimes the term was changed because a better name for the concept was identified; in other cases, the term began as a Supplementary Concept. (Supplementary Concepts are used for many things, but in this instance would likely be a term added because it became significant in the literature between MeSH review cycles.) 

Whatever the reason for the change, in cases where two dates appear, all uses of the earlier term have been replaced with the new MeSH heading. The earlier term now appears in that MeSH heading's entry terms (although it is not identified as such).
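If you ever need to apply these rules to a batch of headings, the logic above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_year_introduced` and the assumption that the value arrives as a plain string such as "2016 (2013)" are mine, not anything the MeSH Database provides.

```python
import re
from typing import NamedTuple, Optional


class YearIntroduced(NamedTuple):
    heading_year: Optional[int]  # year the current heading name was added
    concept_year: Optional[int]  # year the concept was first indexed, under any name


# Accepts "2016", "2016 (2013)", or "2016 (earlier 2013)" (assumed input shapes).
_PATTERN = re.compile(r"^\s*(\d{4})\s*(?:\(\s*(?:earlier\s+)?(\d{4})\s*\))?\s*$")


def parse_year_introduced(value: str) -> YearIntroduced:
    """Interpret a Year introduced value using the rules summarized above."""
    if not value.strip():
        # No date listed: the term has been in use since the beginning.
        return YearIntroduced(None, None)
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"Unrecognized Year introduced value: {value!r}")
    heading_year = int(match.group(1))
    # With a single date, the heading and the concept share the same year.
    concept_year = int(match.group(2)) if match.group(2) else heading_year
    return YearIntroduced(heading_year, concept_year)


print(parse_year_introduced("2016 (2013)"))
```

The concept year is the one that matters for searching: the current heading retrieves records back to that year, and anything earlier needs another approach.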

In cases where only a single date appears, the concept may have existed in the world, but didn't have its own MeSH heading. In those circumstances, references on that topic may have either been (a) assigned a different, existing heading, or (b) assigned a heading above the current heading in the hierarchy. It's also possible that (c) indexers simply didn't recognize the concept at all, in which case free text searching becomes even more important than usual.

The MeSH Database calls circumstance (a), where a different heading was used and that heading is still part of the MeSH thesaurus, "previous indexing." The previous indexing details--the heading and the dates when it was used--appear in the MeSH heading record. Previous indexing will be relevant in single date situations, but also in situations where two dates are listed, because some kind of indexing may still have been going on before the earlier of the two dates (earlier yyyy). Prior to earlier yyyy, (a) or (b) or (c) may apply.

Searching for cases of (a) or (b) is essential for comprehensive searching. Following is a demonstration of this process in the MeSH Database using an example term, with a special bonus explanation of the three date fields found in PubMed.

Here's the MeSH Database entry for the MeSH term Transgender Persons (screenshot taken in mid-January, 2025):

Transgender Persons MeSH database entry showing Year introduced: 2016 (2013)

Based on Year introduced, this concept was first added to MEDLINE in 2013 with an unidentified MeSH term (now appearing somewhere in the Entry Terms section of the record), and assigned the MeSH heading Transgender Persons in 2016. All records back to 2013 have been reindexed with Transgender Persons.

Thus, to search for MEDLINE records indexed with this concept as far back as 2013:

"Transgender Persons" [mesh]

Below the Entry Terms, the record shows Previous Indexing, so this is a situation where there are two dates but (a) still applies. From 2001 to 2012, the MeSH heading Transsexualism was used to index this concept, and MEDLINE records from that period still retain that indexing (screenshot taken in mid-January, 2025).

Previous indexing showing Transsexualism (2001-2012)

To ensure all references back to 2001 on our desired concept are included in the search results, we must search for Transgender Persons (which gets us references from 2013 on) OR'd with Transsexualism (which gets us references on the same concept from 2001-2012). 

Searching using previous indexing can introduce noise to the search. This is because sometimes the previous indexing MeSH term still exists but is not used for our concept or has a revised definition, or because it's now above our newer MeSH term in the hierarchy. The best way to reduce noise is to limit a search for previous indexing only to the years in which it was used as defined in the MeSH record. 

That's the case here for Transsexualism: the MeSH term still exists, but in our case, we want references that match the definition of Transgender Persons; we'll reduce noise by excluding references indexed with Transsexualism in MEDLINE since Transgender Persons was introduced. 

The PubMed date field tag that should be used in this case is [mhda], or MeSH Date. According to PubMed's documentation, MeSH Date is:

The date the citation was indexed with MeSH Terms and elevated to MEDLINE for citations with an Entry Date after March 4, 2000. The MeSH Date is initially set to the Entry Date when the citation is added to PubMed. MeSH Date is not included in All Fields retrieval; the [mhda] search tag is required.

Dates must be entered using the format YYYY/MM/DD [mhda], e.g., 2000/03/15 [mhda]. The month and day are optional (e.g., 2000 [mhda] or 2000/03 [mhda]).

To enter a date range, insert a colon (:) between each date, e.g., 1999:2000 [mhda] or 2000/03:2000/04 [mhda].
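For anyone assembling search strings in a script, here is a minimal sketch of a helper that checks a date against the format described above and appends the [mhda] tag. The function name `mhda_clause` is hypothetical, and it only checks the shape of the date (digits and slashes), not whether the date is a real calendar day.

```python
import re
from typing import Optional

# YYYY, YYYY/MM, or YYYY/MM/DD, per the documentation quoted above.
_DATE = re.compile(r"^\d{4}(/\d{2}(/\d{2})?)?$")


def mhda_clause(start: str, end: Optional[str] = None) -> str:
    """Return a MeSH Date clause for a single date or a start:end range."""
    for date in (start, end):
        if date is not None and not _DATE.match(date):
            raise ValueError(f"Expected YYYY, YYYY/MM, or YYYY/MM/DD, got {date!r}")
    if end is None:
        return f"{start} [mhda]"
    # A colon between the two dates marks a range.
    return f"{start}:{end} [mhda]"


print(mhda_clause("2000/03/15"))
print(mhda_clause("1999", "2000"))
```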

To search for records back to 2001, using PubMed's syntax:  

"Transgender Persons" [mesh] OR ("Transsexualism" [mesh] AND 2001:2012 [mhda])
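If you build searches like this often, the pattern is easy to generate. The sketch below is just one way to do it in Python; `with_previous_indexing` is a made-up name, and the heading names and years still have to be copied by hand from the Previous Indexing section of the MeSH record.

```python
from typing import Iterable, Tuple


def with_previous_indexing(
    current_heading: str,
    previous: Iterable[Tuple[str, int, int]],
) -> str:
    """OR the current heading with each previous heading, date-limited by [mhda].

    `previous` holds (heading, first_year, last_year) tuples taken from the
    Previous Indexing section of the MeSH record.
    """
    clauses = [f'"{current_heading}" [mesh]']
    for heading, first_year, last_year in previous:
        if first_year > last_year:
            raise ValueError(f"Year range for {heading!r} is reversed")
        # Limit the older heading to the years it stood in for our concept.
        clauses.append(
            f'("{heading}" [mesh] AND {first_year}:{last_year} [mhda])'
        )
    return " OR ".join(clauses)


print(with_previous_indexing("Transgender Persons", [("Transsexualism", 2001, 2012)]))
```

Because the second argument is a list, a heading with several rounds of previous indexing simply gets one more date-limited clause per entry.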

But wait, you say! What if there's no previous indexing?

Sometimes this is a circumstance (b): the concept wasn't yet defined in MeSH, but a broader term was used to index articles that could be of interest to us. In those cases, we could try ORing in the broader term for dates prior to when our MeSH term was introduced, although we'll want to look closely at those results to see if we've added a lot of noise to the search.

I'll use a different example for this one that's still in the same branch of the MeSH tree. Femininity was introduced in 2010. The MeSH entry for Femininity lists one date and no previous indexing. Above it in the MeSH hierarchy is Gender Identity, which has two dates: 1991 (1975). So we know that Gender Identity as a concept was introduced in 1975 with a different name, received the new name Gender Identity in 1991, and all references to it back to 1975 are labeled with Gender Identity. Gender Identity is the broader term in the MeSH tree for Femininity, Masculinity, and Transsexualism (screenshot taken in mid-January, 2025):

MeSH tree showing Gender Identity and the 3 narrower terms mentioned in the text

We can thus infer that references about Femininity may have been labeled with Gender Identity between 1975 and 2009.

To construct a search for Femininity that also includes references with Gender Identity before Femininity was introduced, using PubMed's syntax: 

"Femininity" [mesh] OR ("Gender Identity" [mesh] AND 1975:2009 [mhda])
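The only arithmetic here is that the broader term's date range ends the year before the narrower heading was introduced. A small Python sketch of that step follows; `with_broader_term` is a hypothetical helper, and choosing which broader heading to use remains a judgment call made from the MeSH tree.

```python
def with_broader_term(
    heading: str,
    year_introduced: int,
    broader_heading: str,
    broader_first_year: int,
) -> str:
    """OR a heading with its broader term for the years before it existed."""
    # The broader term only stands in until the year before the heading arrived.
    last_year = year_introduced - 1
    if broader_first_year > last_year:
        raise ValueError("The broader heading does not predate the narrower one")
    return (
        f'"{heading}" [mesh] OR '
        f'("{broader_heading}" [mesh] AND {broader_first_year}:{last_year} [mhda])'
    )


print(with_broader_term("Femininity", 2010, "Gender Identity", 1975))
```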

Again, this may introduce more noise than is desirable, but it is more thorough by far than using just Femininity, at least where MeSH searching alone is concerned. There is previous indexing for Gender Identity, but let's not get too into the weeds in this example. And yes, there are other branches of the tree where these terms appear, and sometimes the introduction of a new MeSH term means the tree itself is reorganized, but that's beyond the scope of this blog post.

And now the promised bonus info! 

Date fields in PubMed can feel a bit like alphabet soup. Three date fields relate to the publication described in the record, while three others have to do with MEDLINE processing. In this post, we're concerned with the processing side. Dates are applied in the following order:

  1. Create Date, or [crdt], is when a record is uploaded to PubMed.
  2. Entry Date, or [edat], "is typically set within 24 hours of the citation’s availability in PubMed" and is used for things like sorting results by date. This actually used to be called Entrez Date.
  3. MeSH Date, or [mhda], as mentioned above, is when the citation was indexed with MeSH terms. Until the citation is indexed, MeSH Date is the same as Entry Date.

Interestingly, none of these fields are included in an All Fields, or [all], search; you must add them yourself.
