Tip #24: PubMed’s Phrase Index
Many thanks to Erica Lake (Outreach Coordinator, NNLM Region 6), and Amanda Sawyer and Jessica Chan from the NCBI PubMed team for this week's tip!
Why is there a phrase index?
The PubMed database contains more than 34 million citations and abstracts of biomedical literature and is growing by more than 1 million citations each year. PubMed uses a phrase index to provide efficient, cost-effective phrase searching while preserving system speed and performance for its 3.4+ million daily visitors.
How does the phrase index work?
Many phrases are automatically recognized by the subject translation table used in PubMed's Automatic Term Mapping (ATM). For example, if you enter fever of unknown origin without enclosing it in double quotes, PubMed recognizes this phrase as a MeSH Term.
You can bypass ATM and search for a specific phrase using the following formats:
- Enclose the phrase in double quotes: "kidney allograft"
  - If you use quotes and the phrase is not found in the phrase index, the quotes are ignored and the terms are processed using automatic term mapping.
- Use a search tag: kidney allograft[tw]
  - If you use a search tag and the phrase is not found in the phrase index, the phrase will be broken into separate terms. For example, "psittacine flight" is not in the phrase index, so a search for psittacine flight[tw] is broken up and translated as: ((("psittaciformes"[MeSH Terms] OR "psittaciformes"[All Fields]) OR "psittacine"[All Fields]) OR "psittacines"[All Fields]) AND "flight"[Text Word]
- Use a hyphen: kidney-allograft
  - If you use a hyphen and the phrase is not found in the phrase index, the search will not return any results for that phrase.
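
The three formats above can be generated from a plain phrase with a few lines of Python. This is only an illustrative sketch: the helper name phrase_variants is made up for this example, and it builds query strings without contacting PubMed.

```python
def phrase_variants(phrase: str, tag: str = "tw") -> dict[str, str]:
    """Return the three phrase-search formats for a phrase.

    Hypothetical helper for illustration; it only formats strings.
    """
    words = phrase.split()
    joined = " ".join(words)
    return {
        "quoted": f'"{joined}"',         # double quotes
        "tagged": f"{joined}[{tag}]",    # search tag, e.g. [tw]
        "hyphenated": "-".join(words),   # hyphen between words
    }


for name, query in phrase_variants("kidney allograft").items():
    print(f"{name}: {query}")
```

Each resulting string can be pasted into the PubMed search box. What happens when the phrase is missing from the phrase index differs by format, as described in the list.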
When you enter search terms as a phrase, PubMed skips the automatic term mapping that would otherwise search the matching MeSH term together with any more specific terms indented under it in the MeSH hierarchy. For example, "health planning" will retrieve citations that are indexed to the MeSH term Health Planning, but not those indexed only to its more specific terms (e.g., Health Care Rationing, Health Care Reform, Health Plan Implementation), which automatic MeSH mapping would include.
If you are unsure whether it is best to double quote a phrase or allow automatic term mapping, try the search both ways (with and without double quotes), and review the search details and results to determine which one delivers the desired level of comprehensiveness versus specificity for the search. Or, start without double quotes and then add them if the results are too broad.
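
If you prefer to script the "both ways" comparison, one option is NCBI's E-utilities ESearch service. It is a separate interface from the PubMed website and is not something this tip covers. The sketch below assumes the JSON response carries "count" and "querytranslation" fields under "esearchresult". The helper names (esearch_url, summarize, compare) are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str) -> str:
    """Build an ESearch URL that asks for JSON and no record IDs."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": 0}
    return f"{ESEARCH}?{urlencode(params)}"


def summarize(response_text: str) -> tuple[int, str]:
    """Pull the citation count and query translation from a response.

    Assumes the fields sit under "esearchresult"; the translation
    falls back to an empty string if it is absent.
    """
    result = json.loads(response_text)["esearchresult"]
    return int(result["count"]), result.get("querytranslation", "")


def compare(phrase: str) -> None:
    """Run the phrase without and with double quotes and print both."""
    for term in (phrase, f'"{phrase}"'):
        with urlopen(esearch_url(term)) as resp:
            count, translation = summarize(resp.read().decode("utf-8"))
        print(f"{term}: {count} citations; translated as {translation}")


# compare("health planning")  # uncomment to run; needs network access
```

The printed translation plays the same role as the search details on the PubMed website: it shows whether the phrase was kept intact or expanded by automatic term mapping. NCBI asks users without an API key to keep request rates low, so avoid running this in a tight loop.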
What is the process for determining when/how a phrase gets added to the index?
Automated processes regularly add new phrases to the index based on standard criteria such as phrase length and how often the phrase appears in the PubMed database. New phrases are typically added to the phrase index twice per month. While new citations are added to PubMed every day, not every phrase from these citations is indexed, and therefore searchable, immediately.
In some cases, phrases will be manually added to the phrase index. If your phrase is not recognized by PubMed but you think it should be, write to the PubMed Help Desk.
How to tell if your phrase is in the phrase index
Phrases may appear in a PubMed record but not be in the phrase index. To browse indexed phrases, use the Show Index feature included in the Advanced Search Builder, which provides an alphabetical display of terms appearing in selected PubMed search fields. To do this, select a search field, enter the beginning of a phrase, and then click Show Index.
The index displays an alphabetic list of search terms and the approximate number of citations for each term (the actual citation count is returned when the search is executed). Scroll until you find a term you want to include in your search, and then highlight it to add it to the search box. Multiple terms may be selected from the list and added to the search box. Add terms from the builder to the query box to construct your search. Once you have finished adding terms to the query box, click Search (or Add to History) to run the search.
What are the best workarounds for situations where you need an exact phrase that isn't in the index?
The phrase “cancer and cardiac care” is not included in the PubMed phrase index. To retrieve more results for this specific phrase, you might try:
- Combining similar concepts that are included in the phrase index with a Boolean operator, e.g., "cancer care" AND "cardiac care"
- Tagging each of your terms individually with the field tag if you are searching within a specific field, e.g., cancer[Title/Abstract] AND cardiac[Title/Abstract] AND care[Title/Abstract]
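
The field-tagging workaround is mechanical enough to script. The sketch below is a hypothetical helper, not a PubMed feature. It assumes that connecting words such as "and" should be dropped, as in the example above, and that every remaining word gets the same field tag.

```python
# Words treated as connectors and left out of the tagged query
# (an assumption made for this example).
CONNECTORS = {"and", "or", "not"}


def tag_each_term(phrase: str, field: str = "Title/Abstract") -> str:
    """Tag every remaining word with the field and join them with AND."""
    terms = [w for w in phrase.split() if w.lower() not in CONNECTORS]
    return " AND ".join(f"{t}[{field}]" for t in terms)


print(tag_each_term("cancer and cardiac care"))
# prints: cancer[Title/Abstract] AND cardiac[Title/Abstract] AND care[Title/Abstract]
```

Keep in mind that this finds citations containing all of the words in the chosen field, in any order and position, so it is broader than a true phrase search.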
PubMed is the only literature database that I'm aware of that limits what phrases you can search. For instance, if I search "ards caused by covid 19" [ti] you would expect to retrieve the following article: Clinical features, ventilatory management, and outcome of ARDS caused by COVID-19 are similar to other causes of ARDS. Nope! Can't find it, or the other 9 articles with that phrase in the title. It works in Ovid Medline/In Process/etc., but not good ole PubMed. The supposed "work arounds" that break up the phrase into separate terms combined with AND basically ignore the fact that you're looking for a PHRASE, so they're not workarounds at all.
Another workaround is to search PubMed via Google. For example, you can search [“cancer and cardiac care” site:pubmed.ncbi.nlm.nih.gov] (w/o the brackets) in the Google search box. You can use this to see if a phrase actually exists in PubMed and then go from there. Obviously, dealing with results in Google has its own challenges, so using it depends on what you are doing.
Brilliant workaround! I may push this up into its own post to make sure everyone sees it! Thanks!