Tip #56: Tricky Truncation in ProQuest Databases

Many thanks to Jocelyn Boice, Colorado State University Libraries, for this week’s post! This work is based on the author’s lightning talk “Tricky Truncation in ProQuest Databases,” presented at the Medical Library Association UX Caucus’ event Favorite Features & Sneaky Solutions: A Database Tips Lightning Round on October 7, 2024.

The Basics

In the ProQuest interface, asterisk truncation works differently than one might expect. Instead of returning results that include any variant of a truncated word, it returns only variants with five or fewer characters after the root. This character limit can eliminate relevant results from a search rather than expanding the results set as one would anticipate.

ProQuest’s documentation describes truncation with an asterisk as follows.

"The truncation character in ProQuest is an asterisk (*) -- used to replace up to five characters [emphasis added]. For example, a search for farm* will retrieve documents with the terms farm, farms, farmer, or farming." (Source: ProQuest Search Tips)

Fortunately, there is another option for truncation in the ProQuest interface. ProQuest calls this “defined truncation” and it allows the user to designate how many letters to include, with twenty being the upper limit. Manual configuration of the search using this option can increase the comprehensiveness of the search and reduce the likelihood of excluding relevant results unintentionally.
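The difference between the two options can be sketched in a few lines of Python. This is only an illustrative model, not ProQuest's actual matching logic: it assumes that truncation stands for zero up to a set number of word characters at the end of a term, and the function names and the test word farmworkers are made up for the example.

```python
import re

ASTERISK_LIMIT = 5   # characters a bare asterisk replaces, per ProQuest's documentation
DEFINED_MAX = 20     # documented upper limit for defined truncation

def truncation_regex(root: str, limit: int = ASTERISK_LIMIT) -> re.Pattern:
    """Build a regex for `root` followed by 0..limit word characters (assumed model)."""
    if not 0 <= limit <= DEFINED_MAX:
        raise ValueError(f"limit must be between 0 and {DEFINED_MAX}")
    return re.compile(rf"{re.escape(root)}\w{{0,{limit}}}", re.IGNORECASE)

def is_retrieved(word: str, root: str, limit: int = ASTERISK_LIMIT) -> bool:
    """True if the whole word fits within the truncation limit."""
    return truncation_regex(root, limit).fullmatch(word) is not None

# Compare the default asterisk with defined truncation set to the maximum.
for word in ["farm", "farms", "farmer", "farming", "farmworkers"]:
    print(word, is_retrieved(word, "farm"), is_retrieved(word, "farm", limit=20))
```

In this model, farmworkers has seven letters after the root, so it falls outside the default limit of five but inside a defined limit of twenty.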

ProQuest explains defined truncation syntax in this way.

"Defined truncation ([*n]) replaces up to the number of characters specified, for example [*9]. The maximum number of characters that can be entered is 20." (Source: ProQuest Command Line Search)

A Sample Search

The following example demonstrates the potential impact of ProQuest’s truncation character limit on database search results. The screenshots below show several iterations of a search performed in the Aquatic Sciences and Fisheries Abstracts (ASFA) database on the ProQuest platform on February 18, 2025. The search uses the Marine Stewardship Council’s sustainable seafood certification as the topic and uses the database’s default search settings in the Colorado State University Libraries instance.

The first image shows the outcome of an initial search for the phrase marine stewardship council and the word certification, connected by the Boolean operator AND. This search returned 422 results.
 
A screenshot of a search in the Aquatic Sciences and Fisheries Abstracts database on the ProQuest interface.  The search is "marine stewardship council" AND certification.  The number of results is 422.  The search string and number of results are circled in yellow.

The next image shows a modified version of the search where the word certification has been truncated with an asterisk following the root (certif*). This search retrieved 306 results, far fewer than the initial search and completely counter to the expected behavior of truncation.
 
A screenshot of a search in the Aquatic Sciences and Fisheries Abstracts database on the ProQuest interface.  The search is "marine stewardship council" AND certif*.  The number of results is 306.  The search string and number of results are circled in yellow.


Modifying the search again to use defined truncation and specifying the inclusion of twenty characters after the root certif (certif[*20]) returns a set of 513 records, as seen in the last image.  The search now performs in the anticipated manner for truncation, returning a results set that is the same as or larger than the set of results returned from a search using the untruncated word.
 
A screenshot of a search in the Aquatic Sciences and Fisheries Abstracts database on the ProQuest interface.  The search is "marine stewardship council" AND certif[*20].  The number of results is 513.  The search string and number of results are circled in yellow.

An Explanation

The smaller results set seen when using the asterisk for truncation in the preceding example is a direct consequence of ProQuest’s five-character limit. But what exactly is happening? In this case, the limit means that truncating the word certification as certif* finds words such as certify, certified, and certificate, which all have five or fewer letters after the root, but not words such as certification, which has more than five letters after the root. The table below shows additional (but not exhaustive) examples of the words that are found or not found by this truncation.
 

Certif* finds        Certif* does not find
Certifiable          Certificated
Certificate          Certificates
Certified            Certification
Certify              Certifications
Certifying
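The split between the two columns can be checked by counting the letters that follow the root. The short Python sketch below does only that arithmetic; it assumes the five-character limit applies to a simple letter count, and the helper name extra_letters is made up for the example.

```python
ROOT = "certif"
LIMIT = 5  # characters a bare asterisk can replace, per ProQuest's documentation

WORDS = [
    "certifiable", "certificated", "certificate", "certificates", "certified",
    "certification", "certify", "certifications", "certifying",
]

def extra_letters(word: str, root: str = ROOT) -> int:
    """Number of letters in `word` beyond the truncated root."""
    return len(word) - len(root)

for word in WORDS:
    n = extra_letters(word)
    verdict = "found" if n <= LIMIT else "not found"
    print(f"{word:<15} {n} letters after root -> {verdict}")
```

Certifiable and certificate sit exactly at the limit with five letters after the root, which is why adding a single letter (certificated, certificates) moves a word into the second column.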

 


Variable Impact

The impact of the character limit on the results set will change from search to search and will fluctuate based on multiple factors, such as how much of a word is truncated, the existing forms of a word, and any additional parameters applied to the search. In some cases the impact may be very significant, while in others it may be minimal or even nonexistent.

To illustrate, the subsequent images show search histories for simple queries performed on February 18, 2025 in databases on the ProQuest platform. The searches used the default settings for the databases as provided by Colorado State University Libraries. Each example shows three iterations of a search: one using an untruncated word, one using traditional truncation with an asterisk, and one using defined truncation. The search in the ASFA database, as seen previously, shows a large impact on the results set. The search in the ERIC database shows a much more moderate impact, and the search in AGRICOLA shows no impact.
 
A screenshot of the search history for three searches in the ERIC database on the ProQuest interface.  The search string intervention AND randomized retrieved 5,191 results, the search string intervention AND random* retrieved 8,816 results, and the search string intervention AND random[*20] retrieved 8,873 results.


A screenshot of the search history for three searches in the AGRICOLA database on the ProQuest interface.  The search string corn AND cultivation retrieved 3,837 results, the search string corn AND cultivat* retrieved 6,026 results, and the search string corn AND cultivat[*20] retrieved 6,026 results.
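To put rough numbers on "large," "moderate," and "none," the result counts reported in this post can be compared directly. The sketch below assumes that every record retrieved with the asterisk is also retrieved with defined truncation, so the difference between the two counts represents records the asterisk missed. The function name share_missed is made up for the example.

```python
# Result counts reported in this post (searches run February 18, 2025):
# (asterisk truncation, defined truncation with [*20])
COUNTS = {
    "ASFA, certif": (306, 513),
    "ERIC, random": (8816, 8873),
    "AGRICOLA, cultivat": (6026, 6026),
}

def share_missed(asterisk_hits: int, defined_hits: int) -> float:
    """Fraction of the [*20] results set that the bare asterisk did not retrieve."""
    if defined_hits == 0:
        return 0.0
    return (defined_hits - asterisk_hits) / defined_hits

for label, (asterisk_hits, defined_hits) in COUNTS.items():
    print(f"{label:<20} {share_missed(asterisk_hits, defined_hits):6.1%}")
```

Under that assumption, the asterisk missed about 40% of the defined-truncation results in the ASFA search, under 1% in the ERIC search, and none in the AGRICOLA search.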

Suggestions for Use

As demonstrated, the truncation character limit in ProQuest databases has variable effects on search retrieval, so traditional truncation may be adequate for many purposes. However, the character limit introduces the risk that users may unknowingly narrow a search when they intend to expand it. Defined truncation mitigates this risk and is indispensable for projects or contexts that require comprehensive retrieval of results, such as evidence synthesis.
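For searchers who keep saved strings in a document or spreadsheet, converting asterisks to defined truncation can be scripted. The Python sketch below is one hypothetical way to do it. The function name widen_truncation is made up, it only handles an asterisk at the end of a word, and the converted string should still be tested in the database before it is relied on.

```python
import re

# An asterisk that directly follows a word character and is not followed by one,
# i.e. truncation at the end of a word. Existing [*n] operators are left alone
# because their asterisk follows a bracket, not a word character.
TRAILING_ASTERISK = re.compile(r"(?<=\w)\*(?!\w)")

def widen_truncation(query: str, limit: int = 20) -> str:
    """Replace each trailing asterisk with defined truncation, e.g. certif* -> certif[*20]."""
    if not 1 <= limit <= 20:
        raise ValueError("ProQuest documents a maximum of 20 characters for [*n]")
    return TRAILING_ASTERISK.sub(f"[*{limit}]", query)

print(widen_truncation('"marine stewardship council" AND certif*'))
```

A blanket limit of twenty is the most inclusive choice; a smaller number can be passed when a very short root would otherwise pull in unrelated words.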

For additional details about how truncation works in the ProQuest interface, please see ProQuest’s documentation.
