Zotero doesn't recognise PDF as PDF

Hi all,

After upgrading to the latest Zotero, 2.0b4, whenever I drag a PDF file into the library, Zotero doesn't recognise it as a PDF. More importantly, I don't get the option to extract metadata from the PDF. Any help?

  • Yes, that's odd. It's happening to me too. Newly-added PDFs also aren't being indexed.
  • It is the same with me. I tried loading PDFs that were originally recognized, but it does not see them as PDFs anymore.
  • Sorry, I just found a temporary fix in another thread from a Mac user (thread: http://forums.zotero.org/discussion/7106/pdf-indexing-and-metadata-retrieving-in-b4/#Item_0 ). After drag and drop, you should open the PDF in Firefox and create a new item from the current page. Then retrieve metadata works for that new item. It's a lot of extra work if you have many documents, but it works for now.
  • CB
    edited May 21, 2009
    catalintucureanu: thanks for pointing that thread out. The workaround won't work for me, as I prefer to have PDFs open in an external reader and so don't have any way of opening them in Firefox. But it's clearly a bug anyway, so I'd imagine we can expect it to be fixed before too long.
  • I know. I also prefer opening them in an external viewer on Linux (Adobe Acrobat is horrible with memory management). But at least I can organize my library until the bug gets fixed.
  • Adobe Acrobat is just horrible ...
  • Yes, this is a bug in 2.0b4. It's been fixed in the trunk, and the fix will be available in the next beta build, which should be out sometime next week. The workaround is as described above. The PDFs saved incorrectly in the meantime will be automatically corrected when you upgrade to the next version. Sorry for the inconvenience.
  • Hi, Dan,
    Could you please tell me when the next version will be out?
    I am eager to use it, since I am looking forward to importing many PDF files to Zotero.
  • Hi Dan,
    Same thing here! I'm waiting for this fix to begin using Zotero and organize my huge collection of PDF files. Keep up the good work!
  • Any word when this is going to finally be fixed? I would really like to index all of mine, but I have a ton of them and doing it one at a time is going to take forever. Thanks all!
  • Zotero 2.0b5 is out now with the fix.
  • I'm running 2.0b5, but I see that 40% of my PDFs remain unindexed. Is there a method to prompt or promote indexing?
  • You can click the round green button for a file to index it manually.
    If the PDF is not readable - i.e. if it's a scan rather than a text document - indexing obviously won't work.
  • So 40% of my PDFs have no text whatsoever, including metadata?

    If that is the case, perhaps the Index Statistics page should either include them as completed, or indicate them as unindexable, or show how many files are pending.

    In any case, my question concerned indexing as an overall activity. When does it occur? To promote completion of the index, should one leave Firefox running?
  • I'm not sure that's what's going on - just pointing out a possibility. If your PDF doesn't have text, it doesn't have metadata. You can check, e.g., by trying to copy and paste text from an open PDF using the text selection (T) tool - if that works, indexing (at least partial) should work.

    Don't know the answer to your second question.
  • "In any case, my question concerned indexing as an overall activity. When does it occur?"
    When the PDF is first added. Generally speaking, as long as you've had the PDF tools installed for the whole time you've been adding PDFs, you should never need to manually reindex.

    The one current exception is if you're using file sync, which doesn't yet synchronize the full-text index or automatically trigger indexing on other computers. This will be addressed in a future release.

    But you're right that the statistics should reflect the difference between unindexed and unindexable PDFs. Ticket created. Thanks.
  • It would be great if it were also possible to do a search based on indexed/indexability status. For instance, an additional dropdown heading or two for each of those qualities under Advanced Search > Attachment ... is/is not ... ___
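
For anyone with a lot of files who wants to run the copy-and-paste check mentioned earlier in the thread in bulk, here is a minimal sketch. It assumes the `pdftotext` command-line tool (from Xpdf or Poppler) is on your PATH. The function names and the 20-character threshold are made up for illustration, and this is not necessarily how Zotero itself decides what can be indexed.

```python
import shutil
import subprocess


def has_text_layer(extracted: str, min_chars: int = 20) -> bool:
    """Return True if the extracted text contains at least `min_chars`
    non-whitespace characters. The threshold is an arbitrary assumption:
    scanned PDFs typically yield only form feeds and blank lines."""
    visible = sum(1 for ch in extracted if not ch.isspace())
    return visible >= min_chars


def extract_text(pdf_path: str) -> str:
    """Run `pdftotext <file> -` and return whatever it writes to stdout.
    Passing '-' as the output file makes pdftotext print to stdout."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate odd bytes in the output
        check=True,
    )
    return result.stdout


def is_probably_indexable(pdf_path: str) -> bool:
    """Hypothetical helper: True if the PDF appears to contain real text."""
    return has_text_layer(extract_text(pdf_path))
```

Calling `is_probably_indexable("paper.pdf")` on a scan-only PDF should come back `False`, which would at least separate the "unindexable" files from the ones that are simply still unindexed.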
