Search results are missing compared to the equivalent library in EndNote

I recently transferred my library of ~900 research article references with PDF attachments from EndNote to Zotero, importing one collection at a time. All of the PDF attachments are indexed as far as I can tell. I tested a couple of searches to make sure Zotero could search through all of the PDF attachments. Consistently, Zotero fails to return certain results that EndNote does return.

For example, in EndNote, searching ["Any Field + PDF with Notes" = "fibrin" And "Any Field + PDF with Notes" = "titanium"] yielded 22 references, of which only 15 show up when searching "fibrin titanium" in Zotero. Each of the 7 references that didn't show up seems properly imported into Zotero with the PDF(s) attached and indexed. I can still get the search to return each of those 7 references by searching for other phrases that are known to appear in the PDFs. I am also able to open their PDF attachments from Zotero and can find both "titanium" and "fibrin" in them using Ctrl-F. [One of the missing search results in this example is the PDF attachment of "Mesenchymal stem cell interactions with 3D ECM modules fabricated...".]

I tried to fix the problem by deleting the references that were missing from the search results and importing them by dragging the original PDF into the collection. I also tried to clear the index and reindex the library.

Can someone help? Happy to provide additional details/try other things.

Report ID: 1220457267

Edit: One of the 7 missing search results in the example above is "Strategic Design and Recent Fabrication Techniques for Bioengineered Tissue Scaffolds...". In the indexed PDF attachment, the phrase "closest proximity to the VEGF-releasing fibrin gel demonstrated" appears. Searching my library for "fibrin" does not yield this item, but searching my library for "VEGF-releasing" does yield it. How is that possible??

Report ID: 1710278935
  • Can you email the PDF to support@zotero.org with a link to this thread?

    It's possible the "fi" is a ligature in the document, which might be preventing it from being found.
  • Emailed both PDFs above.
  • edited April 13, 2021
    Searching on "fibrin" finds the PDF for me. Are you sure you're using Everything mode in the Zotero search bar?

    From the looks of it, it's the "VEGF-releasing" search that's not right. That seems to be matching on just "releasing". We'll look into that.

    If you're sure you're using Everything mode, what does it say for Indexed in the right-hand pane when you click on the attachment? Can you provide a Debug ID for clicking Reindex Item (the green arrows)? Does the search work after that?
  • edited April 13, 2021
    @Deezotee: Actually, you appear to have sent a different PDF from the one mentioned above. You sent "Strategic Design and Fabrication of Nerve Guidance Conduits for Peripheral Nerve Regeneration", not "Strategic Design and Recent Fabrication Techniques for Bioengineered Tissue Scaffolds…". "VEGF-releasing" doesn't appear in the PDF you sent. "fibrin" does, though, and matches for me.
  • I apologize @dstillman, thank you for noticing that. I've sent the correct PDF now.
  • I went ahead and did Reindex Item (green arrows) on the PDF "Rajaram et al. - 2012 - Strategic Design and Recent Fabrication Techniques.pdf" attached to my reference "Strategic Design and Recent Fabrication Techniques for Bioengineered Tissue Scaffolds" and sent a Debug ID 830064418. I still get the same results searching for "fibrin" and "VEGF-releasing".
  • That's a Report ID, not a Debug ID.
  • Oh I see, thank you. I reindexed the PDF and sent the Debug Output log. This should be the debug ID: D83117940. Please let me know if I can do anything else.
  • Hi, has anyone figured out what's causing the problems here?
  • Sorry for the delay. So, yeah, it's both of the things I said above:

    1) Hyphenated phrases are returning incorrect results matching only one of the words. I already created an issue for that, and the workaround for now is just to use a space instead of a hyphen to require both words, though they won't necessarily be part of a phrase.

    2) The extracted text includes "fibrin" with a ligature character for the "f" and "i". If you copy and paste "ﬁbrin" (written here with the ligature character) into the search bar, you'll see that it matches, though, oddly, the actual text of the PDF doesn't show a ligature being used. We'll investigate. Ultimately, Zotero should find results regardless of whether the search text or the extracted text includes a ligature.
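
    The two points above can be illustrated with a minimal Python sketch. This is not Zotero's search code; the function names (fold, query_terms, matches) and the sample text are hypothetical. It assumes that Unicode NFKC normalization is an acceptable way to fold the single-character "fi" ligature (U+FB01) into plain "f" + "i", and it uses simple substring matching rather than a real word index.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Fold compatibility characters (e.g. the U+FB01 ligature) and case."""
    return unicodedata.normalize("NFKC", text).casefold()


def query_terms(query: str) -> list:
    """Split a query on whitespace and hyphens so that every part is required."""
    return [term for term in re.split(r"[\s\-]+", fold(query)) if term]


def matches(query: str, extracted_text: str) -> bool:
    """Return True only if every query term occurs in the folded text."""
    haystack = fold(extracted_text)
    return all(term in haystack for term in query_terms(query))


# Hypothetical extracted text in which "fi" is stored as the U+FB01 ligature.
extracted = "closest proximity to the VEGF-releasing \ufb01brin gel demonstrated"

print("fibrin" in extracted)                 # False: raw text holds the ligature
print(matches("fibrin", extracted))          # True: both sides are folded first
print(matches("VEGF-releasing", extracted))  # True: both parts are present
print(matches("VEGF-titanium", extracted))   # False: "titanium" is absent
```

    Folding both the query and the extracted text means it does not matter which side contains the ligature, and splitting on hyphens requires every part of a hyphenated query instead of only one of them.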
  • edited April 25, 2021
    I did some more example searches and it appears that the ligature problem accounts for the vast majority of the unexplained differences in the search results between my Zotero and EndNote libraries. Unlucky for me because I deal with a lot of "fl" and "fi" subjects! For example, "fluorescen fibers" gives 276 references in my EndNote library and 168 in the equivalent Zotero library, while "immunofluorescen fibrous" gives 41 in EndNote and 18 in Zotero. Happy to provide more info if it helps to get it fixed. Thanks for your help.
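
    For anyone who wants to check whether a piece of extracted text is affected, here is a rough Python sketch that counts Latin ligature characters. It assumes the ligatures come from the Unicode Alphabetic Presentation Forms range U+FB00 to U+FB06 (which includes the "fi" and "fl" ligatures); the helper name ligature_counts and the sample string are made up for illustration, and how to obtain the extracted text from Zotero is not covered here.

```python
from collections import Counter

# Latin ligature code points U+FB00..U+FB06 (ff, fi, fl, ffi, ffl, long s-t, st).
LATIN_LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB07)}


def ligature_counts(text: str) -> Counter:
    """Count each Latin ligature character that appears in the text."""
    return Counter(ch for ch in text if ch in LATIN_LIGATURES)


# Hypothetical extracted text using the "fl" and "fi" ligatures.
sample = "immuno\ufb02uorescent \ufb01brous \ufb01bers"

for ch, count in sorted(ligature_counts(sample).items()):
    print(f"U+{ord(ch):04X} appears {count} time(s)")
```

    Text with a nonzero count would be missed by a plain-text search for words such as "fibrin" or "fluorescent" unless the ligatures are folded first.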