Zotero 5.0 beta: Fail on large import, bibtex format
I am trying to import thousands of items into Zotero 5.0 (5.0-beta.111+aa78387) and am running into a consistent problem. I have turned off ZotFile, and the problem persists.
I am importing from a BibTeX UTF-8 file. If the file is larger than a certain size (somewhere around 1,000 items or 20,000 lines), I get the message
[JavaScript Application]
An error occurred while trying to import the selected file. Please ensure that the file is valid and try again.
At least once, after getting this error I could not import any files without quitting Zotero and relaunching it.
I can work around this error by breaking the file into blocks of about 15,000 lines (being careful, of course, to break between BibTeX entries).
Two issues: first, this seems to be a bug. Second, if the error message gave some context, it would be easier to figure out how to resolve it.
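For reference, the splitting workaround can be sketched as a short script. Everything here is illustrative: the function name and the 15,000-line default are mine (found by trial and error), not anything Zotero-specific. The key point is that chunks are cut only at lines that begin a new BibTeX entry (`@...`):

```javascript
// Split a large .bib file's text into chunks of roughly maxLines lines,
// breaking only at entry boundaries (lines that start with "@").
function splitBib(text, maxLines = 15000) {
  const lines = text.split("\n");
  const chunks = [];
  let current = [];
  for (const line of lines) {
    // Start a new chunk only at an entry boundary, once the current
    // chunk has reached the size limit.
    if (line.trimStart().startsWith("@") && current.length >= maxLines) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}
```

Each chunk can then be written out as its own .bib file and imported separately.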
If you try to import too many items from a BibTeX UTF-8 file, you get a JavaScript error (pasted below). When this error has happened before, I have found that breaking the BibTeX file into blocks of about 15,000 lines, or fewer than 1,000 items, lets each block import, even though the entire file will not. I'll update this post if that fix doesn't work in this case.
It has been suggested that this might be an out-of-memory error. I didn't monitor memory pressure while the import was running, but now that it has terminated I see no large memory usage by Zotero or my system. I have 32 GB of memory and am using about 13 GB.
I can supply the .bibtex file if desired.
Zotero 5.0-beta.111+aa78387, macOS Sierra 10.12.2.
From the error log:
[JavaScript Error: "Discarding invalid field 'publisher' for type 4 for item 1/null"] (repeated 4 times)
[JavaScript Error: "Discarding unknown JSON field 'backupPublisher' for item 1/null"]
[JavaScript Error: "Discarding invalid field 'series' for type 7 for item 1/null"]
[JavaScript Error: "Discarding unknown JSON field 'backupPublisher' for item 1/null"]
[JavaScript Error: "Discarding invalid field 'series' for type 7 for item 1/null"]
[JavaScript Error: "Discarding invalid field 'publisher' for type 4 for item 1/null"] (repeated 13 times)
[JavaScript Error: "Item 13488 not loaded" {file: "chrome://zotero/content/xpcom/data/items.js" line: 525}] (repeated 4 times)
Particularly with 32 GB of RAM, it's much more likely that this is a problem with a specific item — or combination of items — than that it's an issue with importing a large file.
If you don't mind sending the BibTeX file to support@zotero.org with a link to this thread, that's probably easiest. Otherwise, a Debug ID (different from a Report ID) for an import attempt that fails might be enough.
@dstillman Is there something specific I should target for rewrite? Should I split off the import translator so it doesn't carry the weight of the export translator?
@alex.mitrani could I get a copy of that file for testing? It should be possible to attach it to an issue at https://github.com/retorquere/zotero-better-bibtex/issues
https://github.com/zotero/translators/pull/1354/
Without this, the items have to be processed and queued synchronously (which can hang the UI) and then saved to disk together in a single (potentially huge, memory-intensive) transaction.
* Does async: true relate only to import? If a translator does both import and export, export is still sync, right?
* When configOptions.async is set to true, item.complete() always returns a promise, correct?
* Promise.coroutine does not seem to be available in translators. Can it be made available?
* If Promise.coroutine were available, would it be largely sufficient to yield on item.complete() in doImport, instead of just calling it, to make a translator async?
* How do async translators import collection info? The collection info is tied to the imported references via itemIDs that are only meaningful for the duration of the import session. Will this mechanism stay unchanged, even though items will no longer be saved in bulk at the end of the session?
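As a rough illustration of the fourth question above, here is what an async doImport loop might look like. All names here are stubs invented for the sketch, not the real translator API (the PR was still in flux at the time of this thread):

```javascript
// Self-contained sketch of an async import loop with a stubbed per-item
// save. In real Zotero this would be a small database write per item,
// rather than one huge end-of-session transaction.
const saved = [];

// Stand-in for item.complete() in async mode: returns a promise that
// resolves once the item has been "saved".
function completeItem(item) {
  return Promise.resolve().then(() => {
    saved.push(item);
    return item;
  });
}

async function doImport(entries) {
  for (const entry of entries) {
    // Awaiting each save keeps memory bounded and yields control back
    // to the event loop, instead of queuing thousands of items for a
    // single giant transaction.
    await completeItem(entry);
  }
  return saved.length;
}
```

The design point is the one from the PR description: saving per item avoids both the UI hang from synchronous queuing and the memory-intensive single transaction at the end.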
In flux meaning "don't bother for a while"? I'm fine with that.