How to add an existing library (actual pdfs) to Zotero

Hi,

I followed the documentation for importing metadata from Papers 3 to Zotero. That worked (although some info is deficient) but what I really need is all my pdfs in Zotero.
That could perhaps be done by attaching files via "attach stored copy of file..." but with 22,500 pdfs, that's impractical. The matter is made worse by the fact that the imported .bib file doesn't seem to transfer sufficient info in many cases, so that items, even though metadata are present, cannot be used for citing.
Bottom line: How do I get Zotero populated with actual pdfs that are already sitting on my hard drive? Drag and drop? And then retrieve metadata for every item?
Any help is greatly appreciated.
  • I followed the documentation for importing metadata from Papers 3 to Zotero. That worked (although some info is deficient) but what I really need is all my pdfs in Zotero. That could perhaps be done by attaching files via "attach stored copy of file..." but with 22,500 pdfs, that's impractical.

    https://github.com/retorquere/zotero-folder-import

    The matter is made worse by the fact that the imported .bib file doesn't seem to transfer sufficient info in many cases, so that items, even though metadata are present, cannot be used for citing.

    If you install BBT before the import, the import will likely bring in more data -- in principle, nothing should get lost.

    Bottom line: How do I get Zotero populated with actual pdfs that are already sitting on my hard drive? Drag and drop? And then retrieve metadata for every item? Any help is greatly appreciated.

    The folder import will trigger "fetch metadata" automatically, but if these PDFs are already pointed to by your bib file, you will probably want to let BBT do the import.
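
Whether the .bib file actually points at the PDFs can be checked before choosing between the two routes. A minimal Python sketch, assuming the export stores attachment paths in a `file = {...}` field (common in BibTeX exports, but not confirmed for Papers 3 in this thread). The regular expressions are rough heuristics, not a full BibTeX parser, and the function names are made up for illustration:

```python
import re

# Assumption: attachments are written as  file = {...}  at the start of a line,
# with no nested braces inside the value.
FILE_FIELD = re.compile(r"^\s*file\s*=\s*\{([^{}]*)\}", re.IGNORECASE | re.MULTILINE)

# Start of a BibTeX entry such as "@article{key,", ignoring @comment, @preamble, @string.
ENTRY_START = re.compile(
    r"^\s*@(?!comment\b|preamble\b|string\b)\w+\s*\{", re.IGNORECASE | re.MULTILINE
)


def count_entries(bib_text: str) -> int:
    """Count the bibliographic entries in the text of a .bib file."""
    return len(ENTRY_START.findall(bib_text))


def file_field_values(bib_text: str) -> list[str]:
    """Return the raw contents of every file field, unparsed."""
    return [value.strip() for value in FILE_FIELD.findall(bib_text)]


def summarize(bib_text: str) -> str:
    """One-line summary: how many entries there are and how many file fields."""
    return f"{count_entries(bib_text)} entries, {len(file_field_values(bib_text))} file fields"
```

If most entries carry a file field, letting BBT handle the import is probably the better route; if few or none do, the folder import is the one to try. Reading the file is a single call, e.g. `summarize(Path("export.bib").read_text(encoding="utf-8"))` with `pathlib.Path`, where `export.bib` is a placeholder name.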

  • Hi Emiliano,
    Thank you very much. I'm now trying to import the pdfs into a clean (empty) Zotero library.
  • I have imported all those pdfs and it seems to have worked well.
    The only problem is that the folder import doesn't look up metadata; for 20,000+ items, doing that by hand is quite a bit of work. Unfortunately, there seems to be no easy way to match the imported metadata with the actual pdfs.
  • It asks Zotero to do the lookup; if no metadata appears, that should mean that Zotero can't find any.
  • That doesn't seem to be the case. Metadata was not retrieved for a single imported pdf. Manual lookup via the context menu "retrieve metadata" does work in 50%+ of cases. Hence, the lookup must have been skipped at the original import via "zotero folder import".
  • edited May 18, 2021
    Do you have a pdf for me where it should have worked?
  • Well yes. Do you want a title or the actual pdf?
  • This one worked, e.g.

    THE RELATIONSHIP BETWEEN EARLY COLONIAL MAYA NEW YEAR'S CEREMONIES AND SOME ALMANACS IN THE MADRID CODEX
    10.1017/S0956536100111034
  • Actual PDF please.
  • How can I send it to you?
  • You can send it to emiliano.heyns@iris-advies.com. You should have seen something like this: https://imgur.com/a/Gde2wLX, did you not see that? Did you import or link the files? When you choose link, metadata is not retrieved.
  • One did work? Did the ones that failed just not show up for metadata retrieval, or did they show up but get a red cross? If they showed up but got a red cross there's nothing I can do, that part is all Zotero.

    If you drag the PDF onto Zotero and you get metadata, but not when you use folder import, that's something I can look at. Anything else is out of my hands.
  • edited May 19, 2021
    Hi,
    Thank you very much for your help. To clarify:
    1) I imported the files rather than merely linking to them.
    2) While doing the folder import, Zotero did not check for metadata. Not for a single pdf.
    The box seen in the image did not appear.
    3) When I drag files to Zotero, i.e. without doing the folder import via the plug-in, Zotero does retrieve metadata and the box appears (if I drag multiple files).
    4) Yes, of course some files get a red x, mostly because they do not contain OCR'ed text or because the first pages do not contain sufficient info.

    So, during the first import via the plug-in, no metadata whatsoever were retrieved because Zotero apparently didn't attempt it.
  • edited May 19, 2021

    While doing the folder import, Zotero did not check for metadata. Not for a single pdf.

    It might seem like nitpicking but I'm trying to make sure I understand what's happening on your system. During the import itself, no metadata retrieval happens. After the last file is imported, if any pdfs were imported, the PDF metadata retrieval window ought to pop up. If you followed this scenario and got no retrieval window I'll have to debug your situation, but I'd prefer to do that on GitHub.

    Just to make sure:

    1. You clicked the green plus button and selected "Add files from Folder..."
    2. You selected a folder, and a window popped up with the file extensions in that folder and a choice to Link/Store
    3. You selected at least the pdf extension (make sure it's highlighted) and clicked OK, and Zotero started importing the selected extensions

    When I do this I consistently get metadata retrieval.

  • Not at all. Thank you for nitpicking and helping out!

    I did 1., 2., 3. (opting for "store")
    And I tried again with a folder containing far fewer files. It worked!
    So something specific must have been amiss during the first large import.
    The internet connection was cut during the import and only re-established after a few hours. Could that be the reason? If so, sorry for the bother!
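
Since a small folder worked where the large one did not, one possible workaround is to import in smaller batches. A minimal Python sketch, assuming the PDFs can be copied into numbered subfolders that are then imported one at a time; the function name and the batch size of 500 are arbitrary illustrations, and nothing in this thread confirms that folder size is the actual cause:

```python
import shutil
from pathlib import Path


def split_into_batches(source: Path, dest: Path, batch_size: int = 500) -> list[Path]:
    """Copy every PDF under `source` into numbered subfolders of `dest`.

    Each subfolder receives at most `batch_size` files. The originals are left
    untouched, so this temporarily doubles the disk space used.
    `dest` should be outside `source`, otherwise a second run re-copies the copies.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Collect first, then copy, so files created below are never picked up.
    pdfs = sorted(
        p for p in source.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )

    batches: list[Path] = []
    for start in range(0, len(pdfs), batch_size):
        batch_dir = dest / f"batch_{start // batch_size + 1:04d}"
        batch_dir.mkdir(parents=True, exist_ok=True)
        for offset, pdf in enumerate(pdfs[start:start + batch_size]):
            target = batch_dir / pdf.name
            if target.exists():
                # Same file name from a different subfolder: add a unique suffix.
                target = batch_dir / f"{pdf.stem}_{start + offset}{pdf.suffix}"
            shutil.copy2(pdf, target)
        batches.append(batch_dir)
    return batches
```

Each resulting `batch_0001`, `batch_0002`, ... folder can then be selected in turn via "Add files from Folder...", waiting for metadata retrieval to finish before starting the next one.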
  • edited May 19, 2021
    That could have some effect, but I think the popup window should still appear. The plugin gives Zotero the list of PDFs and asks it to do whatever is necessary for metadata retrieval of those -- I don't really know what happens after I ask that. The popup appearing is not my doing, that's part of what is supposed to happen when I ask by calling "Zotero.RecognizePDF.autoRecognizeItems(pdfs)".

    If you hit the problem again, can you go into Help -> Debug Output Logging -> View Output and put the result in pastebin or somesuch?

    Would you be OK with bringing the discussion to an issue on the github project? My debugging infra centers around GH's issue tracking.
  • It seems I was able to reproduce the problem by importing a folder with a large number of pdfs. I'm trying again just now to make sure. Will send the log data.
    Moving to github is fine.