769119084 import from Mendeley consumes all available disk space

Zotero error report #769119084

I was attempting to import my library from Mendeley 1.18 to a new install of Zotero 5.0.49 on Ubuntu 18.04 64-bit with 8GB RAM.

The Mendeley PDF folder contains 3,428 items, occupying 3.7GB.

The Zotero import stopped when it had consumed all the free disk space on the PC. At that point Zotero said it had 455 items and the Zotero `storage` folder (which started empty) contained 131,690 items, occupying 142GB. Looking at the subfolders of `storage`, many of them contained a single PDF with two metadata files. However, some of the subfolders each contained **all** the PDFs that were in the Mendeley PDF folder.
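
A rough way to find the affected subfolders is to count the files in each one. The sketch below is only an illustration: it assumes the data directory is `~/Zotero` (the Linux default used in this report), and the function names and the threshold of 10 files are arbitrary choices, not part of Zotero.

```python
from pathlib import Path


def count_files(folder: Path) -> int:
    """Count regular files directly inside one storage subfolder."""
    return sum(1 for entry in folder.iterdir() if entry.is_file())


def find_bloated_subfolders(storage: Path, threshold: int = 10) -> list[tuple[str, int]]:
    """Return (subfolder name, file count) pairs above threshold, largest first."""
    results = []
    for sub in storage.iterdir():
        if sub.is_dir():
            n = count_files(sub)
            if n > threshold:
                results.append((sub.name, n))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Assumption: default data directory location on Linux.
    storage_dir = Path.home() / "Zotero" / "storage"
    if storage_dir.is_dir():
        for name, n in find_bloated_subfolders(storage_dir):
            print(f"{name}\t{n} files")
```

A normal subfolder here holds one PDF plus two metadata files, so anything well above three files is worth a look.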

This sounds like the issue mentioned in this comment:

https://forums.zotero.org/discussion/comment/310629/#Comment_310629

However, my instance of Zotero did not report running out of RAM.
  • edited June 14, 2018
    @r.gayler: Can you upgrade to 5.0.50, close Zotero, delete your Zotero data directory, and then generate a Debug ID for another go at the import process? If you notice it creating storage folders with all your PDFs, you can send in the debug output even if it hasn't yet finished, and let us know the 8-character folder name of one of the affected folders.
  • Here is what I did:

    * Updated Zotero to 5.0.50
    * Deleted ~/Zotero folder
    * Started Zotero
    * Restarted Zotero with logging enabled
    * Opened the logging viewer
    * Imported from Mendeley (without placing the import in a new collection)
    * Tons of logging scrolling past rapidly
    * Logging output halts for a few minutes while Zotero is creating a copy of *all* the Mendeley PDFs in one storage subfolder
    * I estimate maybe 1/20 of storage subfolders contains all the source PDFs. The other subfolders contain 1 PDF each.
    * Two of the subfolders containing all the PDFs: EML3TKW5 4IWUSDL3
    * Clicked the cancel button on the importer. It took a few minutes to respond.
    * Clicked the submit button on the logging viewer. It took a very long time to respond.
    * Dialog box appeared saying report 1474773362 had been submitted
  • edited June 16, 2018
    That's a Report ID, I'm afraid — you would've generated that via the "Report Errors…" menu option. A Debug ID is different, and that's what we need here.

    (Note that you also don't need to use the viewer. While you can use it, for an operation this large it will slow things down. If you really want to follow the output for a very large operation like this, it's better to do so from the terminal.)
  • I followed the debug instructions.
    I couldn't cancel the import, so I had to wait until it exhausted all disk space (meanwhile, I was filling the disk with other files in the background).
    The final storage folder was 64,528 items (70.0GB), compared to the Mendeley folder being 3,428 items (3.7GB).
    Zotero wouldn't submit the debug report, so I saved the file (~35k lines), zipped it and emailed it to support@zotero.org.

    I will attempt to fill my disk to almost full and run the process again to get a smaller debug log.
  • No need for a smaller log — we'll see what we can figure out from this first.
  • edited June 16, 2018
    OK, that was (clearly) a rather unfortunate bug. For every URL-associated file in "Downloaded", we were copying all other files in "Downloaded" into the attachment directory. (Zotero always exports attachment files, including multi-file HTML snapshots, into separate directories, and so our normal import code is designed to copy the whole directory, but that's obviously not the right behavior here. This only affected some files, so we didn't notice it in our testing.)
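
    To illustrate the difference between the two behaviors, here is a minimal sketch in Python. It is not Zotero's actual import code; the function names are hypothetical and exist only to contrast "copy the whole containing directory" with "copy the one file".

```python
import shutil
from pathlib import Path


def import_attachment_buggy(source_file: Path, attachment_dir: Path) -> None:
    """Copy everything in the source file's directory (the behavior described above)."""
    # Hypothetical illustration: every sibling file is copied along with the target.
    shutil.copytree(source_file.parent, attachment_dir, dirs_exist_ok=True)


def import_attachment_fixed(source_file: Path, attachment_dir: Path) -> None:
    """Copy only the single file that the imported item refers to."""
    attachment_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source_file, attachment_dir / source_file.name)
```

    With 3,428 files in the source folder, each affected item under the first behavior adds roughly 3,428 files to its own storage subfolder, which is why disk usage grows so quickly.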

    This should be fixed in the latest Zotero beta. You can delete your Zotero data directory and import again with that, and then switch back to the release version if it succeeds. Let us know if you run into any other problems.

    We'll push this fix out in a new release version shortly, along with code to clean up these extra files where they exist. Thanks for helping us debug this, and sorry for the trouble.
  • Thanks for the rapid fix. The import is currently up to ~33%, which is much further than it ever got previously - plus I still have some free disk space (Bonus!).

    Once I get everything imported it's time to pull the plug on Mendeley.
  • (Just to add here, the fix — and the cleanup process — is now available in Zotero 5.0.51.)