Search results are missing compared to equivalent library in Endnote
I recently transferred my library of ~900 research article references with PDF attachments from Endnote to Zotero, by importing one collection at a time. All of the PDF attachments are indexed as far as I can tell. I tested a couple of searches to make sure Zotero could search through all of the PDF attachments. Consistently, Zotero does not return certain results that Endnote does return.
For example, in Endnote, searching ["Any Field + PDF with Notes" = "fibrin" And "Any Field + PDF with Notes" = "titanium"] yielded 22 references, of which only 15 show up when searching "fibrin titanium" in Zotero. Each of the 7 references that didn't show up seems to have been imported into Zotero properly, with the PDF(s) attached and indexed. I can still get the search to return each of those 7 references by searching for other phrases that are known to appear in the PDFs. I am also able to open their PDF attachments from Zotero and can find both "titanium" and "fibrin" in them using Ctrl-F. [One of the missing search results in this example is the PDF attachment of "Mesenchymal stem cell interactions with 3D ECM modules fabricated...".]
I tried to fix the problem by deleting the references that were missing from the search results and importing them by dragging the original PDF into the collection. I also tried to clear the index and reindex the library.
Can someone help? Happy to provide additional details/try other things.
Report ID: 1220457267
Edit: One of the 7 missing search results in the example above is "Strategic Design and Recent Fabrication Techniques for Bioengineered Tissue Scaffolds...". In the indexed PDF attachment, the phrase "closest proximity to the VEGF-releasing fibrin gel demonstrated" appears. Searching my library for "fibrin" does not yield this item, but searching my library for "VEGF-releasing" does yield it. How is that possible??
Report ID: 1710278935
It's possible the "fi" is a ligature in the document, which might be preventing it from being found.
From the looks of it, it's the "VEGF-releasing" search that's not right. That seems to be matching on just "releasing". We'll look into that.
If you're sure you're using Everything mode, what does it say for Indexed in the right-hand pane when you click on the attachment? Can you provide a Debug ID for clicking Reindex Item (the green arrows)? Does the search work after that?
1) Hyphenated phrases are returning incorrect results matching only one of the words. I already created an issue for that, and the fix for now is just to use a space instead of a hyphen to require both words, though they won't necessarily be part of a phrase.
2) The extracted text includes "fibrin" with a ligature character for the "f" and "i". If you copy and paste
ﬁbrin
into the search bar, you'll see that it matches, though, oddly, the actual text of the PDF doesn't show a ligature being used. We'll investigate. Ultimately, Zotero should find results regardless of whether the search text or the extracted text includes a ligature.
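To illustrate the two points above, here is a minimal Python sketch of ligature-insensitive matching. It is not Zotero's actual indexing code: the function names (`normalize`, `tokens`, `matches`) are made up for this example, and it assumes that Unicode NFKC normalization is an acceptable way to fold the "ﬁ" ligature (U+FB01) into plain "fi". It also treats a hyphen as a word separator, so a hyphenated query requires both words rather than just one.

```python
import re
import unicodedata


def normalize(text):
    """Fold compatibility characters (e.g. the U+FB01 'fi' ligature) and case."""
    return unicodedata.normalize("NFKC", text).casefold()


def tokens(text):
    """Split normalized text into word tokens; hyphens act as separators."""
    return set(re.findall(r"\w+", normalize(text)))


def matches(query, document):
    """True only when every query token occurs as a token in the document."""
    return tokens(query) <= tokens(document)


# Extracted PDF text in which "fi" is stored as the single ligature character.
extracted = "closest proximity to the VEGF-releasing \ufb01brin gel demonstrated"

print("fibrin" in extracted)                  # False: plain substring search misses the ligature
print(matches("fibrin", extracted))           # True: both sides are normalized first
print(matches("VEGF-releasing", extracted))   # True: both "vegf" and "releasing" are present
print(matches("VEGF-binding", extracted))     # False: "binding" is missing, so no partial match
```

Normalizing both the query and the extracted text means it doesn't matter which side contains the ligature, which is the behavior described as the eventual goal above.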