Importing takes very long + fails

fb14-hpi · March 1, 2019

Hi there,

for a migration to Zotero, I am trying to import a bib file with ~14000 entries (~10 MB file size). Unfortunately, it's very slow (500 entries take around 60 minutes to process on an SSD). I've tried disabling synchronization for now, but this doesn't seem to help.

Additionally, what's really bothering me is that the import sometimes just doesn't continue. I think that's because there are some broken entries in the bib file (it's from like 2003 and has evolved with several people using it), but is there a way to ignore broken entries or restart/continue imports without creating duplicates and waiting for everything to import again? I've now started it with debug output to see where the error lies, but it would be amazing to be able to continue the import where it left off so I don't have to wait 10 hrs just to start it again.

I would be glad if you could help me out with the speed and error issues. Thank you so much.

dstillman · March 1, 2019

Try with the Zotero beta and let us know if that helps. There's a bug in 5.0.60 that can cause these problems.

fb14-hpi · March 1, 2019

Thank you for your fast response! I will try tomorrow with a full import, but it didn't seem to speed it up significantly.

Maybe it could also have something to do with this issue: https://github.com/zotero/zotero/issues/1574 as we do have characters like ä, ö, ü, ß in our PDF filenames. I will be able to tell you what the debug output says tomorrow. But the issue reporter says it just silently fails, which does not match my problems 1:1.

To what bug exactly were you referring? And what problems? The speed, the duplicates, continuing import where you "paused" it?

dstillman · March 1, 2019

To what bug exactly were you referring?

Speed and hanging (though import speed can definitely still be improved).

Repeated file imports always create duplicates (except when using the Mendeley import option), so if an import process is interrupted you'd want to either first delete those items and empty the trash or edit the BibTeX file to remove the entries that were imported successfully before continuing.
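For the second option (trimming the BibTeX file), a small script can save some manual editing. The following is only a sketch under stated assumptions: it assumes every entry starts on its own line with `@type{`, and that Zotero processed the entries in file order, which this thread does not confirm. The function name `drop_first_entries` is made up for illustration. Check in Zotero which items actually arrived before relying on the count.

```python
import re

# Assumption: each entry starts at the beginning of a line with "@type{" or "@type(".
ENTRY_START = re.compile(r"^[ \t]*@(\w+)\s*[{(]", re.MULTILINE)

# These blocks are not bibliography items; later entries may depend on @string macros,
# so they are always kept and never counted.
KEEP_ALWAYS = {"string", "preamble", "comment"}


def drop_first_entries(bibtex_text, already_imported):
    """Return the BibTeX text without its first `already_imported` entries."""
    starts = [(m.start(), m.group(1).lower()) for m in ENTRY_START.finditer(bibtex_text)]
    if not starts:
        return bibtex_text

    pieces = [bibtex_text[:starts[0][0]]]  # anything before the first entry
    seen = 0
    for i, (pos, entry_type) in enumerate(starts):
        end = starts[i + 1][0] if i + 1 < len(starts) else len(bibtex_text)
        chunk = bibtex_text[pos:end]
        if entry_type in KEEP_ALWAYS:
            pieces.append(chunk)
            continue
        seen += 1
        if seen > already_imported:
            pieces.append(chunk)
    return "".join(pieces)
```

To use it, read the original file into a string, call the function with the number of entries that were imported, and write the result to a new file so the original stays untouched. A field value that happens to contain a line starting with `@word{` would be mistaken for a new entry, so spot-check the output around the cut.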

fb14-hpi · March 2, 2019

Thank you for your answer.

I just ran it with the latest beta + debug output and it seems to be pretty slow. If I turn off the debug output, it's definitely faster. Is that intended behaviour? That the import speed depends on the "print speed" of the debug window? I didn't want to close it in case it stops again, but for now I've closed it. Let's see how fast it goes now.

dstillman · March 2, 2019

Yes, having the debug output window open can definitely slow things down. Simply leaving debug output logging enabled should have less of an effect, and that's all that's necessary to generate a Debug ID, at least as long as you can still access the menus.

fb14-hpi · March 2, 2019

Indeed, it did stop again at item 1781. How do I generate a Debug ID? The latest output I see is:

[JavaScript Error: "Discarding invalid field 'place' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'publisher' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'place' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'publisher' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'place' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'publisher' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 33 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 33 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'publisher' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 33 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 33 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'issue' for type 33 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 4 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'series' for type 15 for item 1/null"]

[JavaScript Error: "Discarding invalid field 'type' for type 2 for item 1/null"]

version => 5.0.61-beta.20+ad27e0c5f, platform => Linux x86_64, oscpu => Linux x86_64, locale => de, appName => Zotero, appVersion => 5.0.61-beta.20+ad27e0c5f, extensions => Zotero LibreOffice Integration (5.0.14.SA.5.0.61-beta.20+ad27e0c5f, extension)

dstillman · March 2, 2019

https://www.zotero.org/support/debug_output#debug_output_logging

fb14-hpi · March 2, 2019

Ah, didn't find that one. Unfortunately, Zotero now is just plain black after clicking on the UI. I'll try on a Windows machine, maybe it doesn't like my Arch / i3 setup...

dstillman · March 2, 2019

OK, so some of your BibTeX entries are missing citekeys, which is breaking Zotero's import. We should handle that better, but for now you'll need to fix this in the file. You can search for "{author" to find the entries with this problem.

I've created an issue to address the hanging.
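For a file this size, the search for entries without citekeys can also be scripted. The sketch below is not how Zotero or any other tool does it; it simply looks for entries where the opening brace is followed directly by `field =` (the "{author" case above) or by a bare comma. The function names and the `nokey` prefix are made up for illustration, and the regex can misfire on unusual files, so review the result before importing.

```python
import re

# Entry start followed by either an empty key ("@book{,") or directly by
# "field =" ("@book{author = ..."), i.e. no citekey before the first field.
MISSING_KEY = re.compile(r"@(\w+)\s*\{\s*(?:,\s*|(?=\w+\s*=))")

# @string{name = "value"} legitimately looks like "field =", so skip these types.
NON_ENTRY_TYPES = {"string", "preamble", "comment"}


def find_missing_citekeys(bibtex_text):
    """Return (line_number, entry_type) for each entry that seems to lack a citekey."""
    hits = []
    for match in MISSING_KEY.finditer(bibtex_text):
        entry_type = match.group(1).lower()
        if entry_type in NON_ENTRY_TYPES:
            continue
        line_number = bibtex_text.count("\n", 0, match.start()) + 1
        hits.append((line_number, entry_type))
    return hits


def add_placeholder_citekeys(bibtex_text, prefix="nokey"):
    """Insert generated keys (nokey1, nokey2, ...) and return (new_text, count)."""
    counter = 0

    def fix(match):
        nonlocal counter
        if match.group(1).lower() in NON_ENTRY_TYPES:
            return match.group(0)  # leave @string and friends unchanged
        counter += 1
        return f"@{match.group(1)}{{{prefix}{counter},\n  "

    new_text = MISSING_KEY.sub(fix, bibtex_text)
    return new_text, counter
```

Running `find_missing_citekeys` first gives the line numbers to inspect by hand. `add_placeholder_citekeys` is the blunt option when the keys do not matter; it does not check whether a generated key already exists in the file, so pick a prefix that is not in use.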

emilianoeheyns · March 5, 2019

BBT will import entries without keys.

dstillman · March 5, 2019

It is technically invalid though, right?

emilianoeheyns · March 5, 2019

Oh yes, but Endnote (IIRC) used to export bibtex without citekeys, so I added that to the import.

emilianoeheyns · March 5, 2019

If you don't care about the citekeys, JabRef can be used to quickly add them: https://sites.google.com/a/york.ac.uk/ref-import/latex/jabref-keys

emilianoeheyns · March 5, 2019

Come to think of it, if you do care about them, JabRef ought to be able to sort on citekeys, which should put the empty ones at the front, making select-and-fix easy enough to do.

dstillman · March 5, 2019

Oh yes, but Endnote (IIRC) used to export bibtex without citekeys, so I added that to the import.

Ah, OK. Sounds like we should accept this, then. Issue created.

emilianoeheyns · March 5, 2019

For completeness' sake, I think Endnote stopped doing that a while ago, but old bib files persist.