769119084 import from Mendeley consumes all available disk space
Zotero error report #769119084
I was attempting to import my library from Mendeley 1.18 to a new install of Zotero 5.0.49 on Ubuntu 18.04 64-bit with 8GB RAM.
The Mendeley PDF folder contains 3,428 items, occupying 3.7GB.
The Zotero import stopped when it had consumed all the free disk space on the PC. At that point Zotero said it had 455 items and the Zotero `storage` folder (which started empty) contained 131,690 items, occupying 142GB. Looking at the subfolders of `storage`, many of them contained a single PDF with two metadata files. However, some of the subfolders each contained **all** the PDFs that were in the Mendeley PDF folder.
This sounds like the issue mentioned in this comment:
https://forums.zotero.org/discussion/comment/310629/#Comment_310629
However, my instance of Zotero did not report running out of RAM.
* Updated Zotero to 5.0.50
* Deleted ~/Zotero folder
* Started Zotero
* Restarted Zotero with logging enabled
* Opened the logging viewer
* Imported from Mendeley (without putting the import in a new collection)
* Tons of logging output scrolled past rapidly
* Logging output halted for a few minutes while Zotero was creating a copy of *all* the Mendeley PDFs in one storage subfolder
* I estimate maybe 1/20 of the storage subfolders contain all the source PDFs. The other subfolders contain 1 PDF each.
* Two of the subfolders containing all the PDFs: EML3TKW5 4IWUSDL3
* Clicked the cancel button on the importer. It took a few minutes to respond.
* Clicked the submit button on the logging viewer. It took a very long time to respond.
* Dialog box appeared saying report 1474773362 had been submitted
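For anyone wanting to check the 1/20 estimate on their own machine rather than eyeballing it, here is a rough sketch that counts the PDFs sitting directly inside each `storage` subfolder and lists the ones holding more than a handful. The `~/Zotero/storage` path and the threshold of 10 are my assumptions, and the function names are made up for illustration; nothing here is part of Zotero itself.

```python
from pathlib import Path


def pdf_counts(storage_dir):
    """Map each subfolder name to the number of PDFs directly inside it."""
    counts = {}
    for sub in Path(storage_dir).iterdir():
        if sub.is_dir():
            counts[sub.name] = sum(
                1 for f in sub.iterdir()
                if f.is_file() and f.suffix.lower() == ".pdf"
            )
    return counts


def bloated(counts, threshold=10):
    """Names of subfolders holding more PDFs than the (arbitrary) threshold."""
    return sorted(name for name, n in counts.items() if n > threshold)


if __name__ == "__main__":
    # Assumed default data directory; adjust if yours lives elsewhere.
    storage = Path.home() / "Zotero" / "storage"
    if storage.is_dir():
        counts = pdf_counts(storage)
        bad = bloated(counts)
        print(f"{len(bad)} of {len(counts)} subfolders hold more than 10 PDFs")
        for name in bad:
            print(name, counts[name])
```

A normal attachment subfolder should show a count of 1, so anything far above that (such as the two subfolders named above) is a candidate for the duplicated copies. The script only reads the directory; it does not delete anything.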
(Note that you also don't need to use the viewer. While you can use it, for an operation this large it will slow things down. If you really want to follow the output for a very large operation like this, it's better to do so from the terminal.)
I couldn't cancel the import, so had to wait until it exhausted all disk space (while I was trying to fill the disk with other stuff in the background).
The final storage folder contained 64,528 items (70.0GB), compared with 3,428 items (3.7GB) in the Mendeley folder.
Zotero wouldn't submit the debug report, so I saved the file (~35k lines), zipped it and emailed it to support@zotero.org.
I will attempt to fill my disk to almost full and run the process again to get a smaller debug log.
This should be fixed in the latest Zotero beta. You can delete your Zotero data directory and import again with that, and then switch back to the release version if it succeeds. Let us know if you run into any other problems.
We'll push this fix out in a new release version shortly, along with code to clean up these extra files where they exist. Thanks for helping us debug this, and sorry for the trouble.
Once I get everything imported it's time to pull the plug on Mendeley.