Searching for keywords in PDF attachments

I am having a lot of trouble using the search bar to find specific words that appear in my attached PDFs but not in the title or author of the entry.

I tried selecting the "full text" option and debugging the cite but nothing has worked. Any suggestions??
  • In the Search pane of the Zotero preferences, do you have the PDF tools installed? Do you see "Indexed: Yes" in the right pane for PDF attachments in your item list?
  • No I have not. How do I go about doing this?
  • Actually, I do have both of the PDF tools installed. I do not, however, see "Indexed: Yes" in the right pane for PDF attachments....
  • So you see "Indexed: No"? If not, you're looking in the wrong place.
  • Has anyone followed up on this? I have PDF indexing installed and am having the same issue. I can find the keyword I'm looking for when I search within the PDFs themselves, but when I search Zotero for the same keyword, not all the PDFs containing it show up.
  • Same question though: do you see Indexed: Yes for the PDF in question?
  • It'd be interesting to play around with this a little - does Zotero find any words in the PDF? Is there something peculiar about the words it doesn't find etc. This works in general, so what's relevant is what's not working in these specific cases.
  • I thought the same thing. It does find other words in that PDF, but not the one I needed. They both are similar to one another (abbreviations for genes).

    What I wanted to search for was EF-1 (can't find it in Zotero).
    But if I search "DDB1", which is another gene in the same article, I find it! Maybe the "-" throws it off?
  • Almost certainly. Dan or Aurimas may know why without checking; otherwise we'd have to take a closer look at the article. Can you link to the article (can be gated)?
  • Here's a link to Pubmed. The article itself might not be available if you aren't at an academic institution, though.

    http://www.ncbi.nlm.nih.gov/pubmed/20033231

    The title is "Response of wild-type and high pigment-1 tomato fruit to UV-B depletion: flavonoid profiling and gene expression."
  • Your search term (EF-1) seems to occur only once in that text:

    "encoding the tomato elongation factor 1\alpha (EF-1\alpha), because of its high and stable expression in mature tomato fruit"

    where \alpha is the symbol alpha, and I guess this causes a problem in the text extraction.
  • The paper I put up is just one example of several PDFs. I have other PDFs where EF-1 is mentioned multiple times and not attached to the alpha symbol.
  • http://www.ncbi.nlm.nih.gov/pubmed/12906715

    There's another paper that says "EF-1" several times. And yes, I tried searching "LeEF-1" too.
  • Here's one more example. I have this paper (both the PDF and snapshot) saved in my Zotero database.

    http://journal.ashspublications.org/content/116/2/265.short

    I searched for "frozen storage" in Zotero and found nothing. However, it's present in the PDF.

    Something goofy is happening here.
  • My apologies, I figured it out. So I was using the general search bar on the right side of the screen and just discovered the magnifying glass (in my defense, it almost completely blends into the background).

    Now I can find all PDFs that say "EF-1" or "Frozen storage" or whatever I'm looking for.

    Thanks for the help everyone.
  • The quick search bar has three different settings. The "Everywhere" setting should find anything you can search for in the advanced search. The other two don't search attachment content. Could you confirm that?
  • Yes, I see that too. For some reason, the advanced search feature seems to work better for me, but the "Everything" option in the quick search is certainly viable.
  • What do you mean by better?
    Whatever works for you is great, but if the "Everywhere" function isn't finding things, we'd want to know. (Obviously advanced search does have a number of advantages, like combining search conditions and regular expressions, so if that's what makes the difference for you, that'd be by design.)
  • It seems to be on a case-by-case basis and might be more related to issues I'm having syncing the standalone copy on my laptop with the copy on my desktop computer. Because I save PDFs with each paper I read, I ran out of space (>300 MB), so the server doesn't sync all my files to my laptop. I'm going to start deleting PDFs and just save the snapshots that I get from where I download journal articles from.

    Weirdly, I did an "Everything" search in the quick search for "Frozen storage" and got a bunch of results. When I did it in the advanced search, I got nothing. I think the program is just being glitchy.
  • Fixed it. I didn't have PDF indexing installed on this laptop, but had it on my other computer.
  • Well, no. "Everything" searches more than the attachment content condition in advanced search (e.g. it also includes abstract fields and notes), so getting more results is to be expected. There is no equivalent of "Everywhere" in advanced search (there should be, and there likely will be).
  • edited December 18, 2014
    I'm going to start deleting PDFs and just save the snapshots that I get from where I download journal articles from.
    Snapshots also take up space (and can take up more space than a PDF).
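
For anyone who lands here with the same confusion, the difference between the search scopes discussed above can be pictured with a small toy model. This is only an illustrative sketch, not how Zotero is actually implemented: the `Item` class, the scope names (`basic`, `all_fields`, `everything`, `attachment_content`), and the field lists are all hypothetical. It shows why a mode that includes attachment content plus metadata returns more hits than an attachment-content-only condition, and why an unindexed PDF is never found by content.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical library entry; fulltext is empty when the PDF is not indexed."""
    title: str
    creator: str = ""
    year: str = ""
    abstract: str = ""
    notes: str = ""
    fulltext: str = ""


# Hypothetical scope names mapped to the fields each one looks at.
SCOPES = {
    "basic": ("title", "creator", "year"),
    "all_fields": ("title", "creator", "year", "abstract", "notes"),
    "everything": ("title", "creator", "year", "abstract", "notes", "fulltext"),
    "attachment_content": ("fulltext",),
}


def search(items, term, scope):
    """Return titles of items whose searched fields contain term (case-insensitive)."""
    needle = term.lower()
    return [
        item.title
        for item in items
        if any(needle in getattr(item, name).lower() for name in SCOPES[scope])
    ]


library = [
    Item(title="Paper A", fulltext="quality changes during frozen storage"),
    Item(title="Paper B", abstract="Effects of frozen storage on fruit"),
    Item(title="Paper C"),  # PDF attached but never indexed: no fulltext
]

print(search(library, "Frozen storage", "basic"))
print(search(library, "Frozen storage", "attachment_content"))
print(search(library, "Frozen storage", "everything"))
```

In this toy model the widest scope matches both Paper A (via PDF text) and Paper B (via abstract), the attachment-only scope matches just Paper A, and Paper C never matches by content because nothing was indexed for it.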
