Errors importing bibtex .bib files

I'm trying to get started with Zotero, and I have several large .bib files. Every time I try to import one, I just get a message that there's an error and that I should verify that my .bib file is valid.

How should I go about debugging this? I don't see anything to help me figure out where the error is in my .bib file, and AFAICT Zotero just throws the whole file on the floor when it fails to parse, so I don't see any prefix of the file that was successfully processed. I looked at the debug log, too, but nothing in there jumps out at me, either.

Sorry -- this seems like it's a FAQ, but Google isn't helping me find previous answers.

BTW, I'm using the Better Bibtex plugin.
  • We'd want a Report ID to start, but to report it here you'd first have to confirm that it happens with BBT disabled and translators reset from the Advanced → Files and Folders pane of the Zotero preferences. Otherwise you'd need to report it to the BBT developer on GitHub (who won't have access to Report IDs, so you'd have to provide the actual error).
  • I am not claiming that this is necessarily a Zotero bug -- it could quite well be a bug in my bib files. I just don't know what to do about it, if it is. Is there some way to determine what line of a bib file is causing an error on import? The debug log just informs me that it's something wrong with an "@incollection" entry, but there are several of them.

    So I'm just asking what I hope is a simple question: assuming that I have a bib file that is buggy from the PoV of Zotero, how do I localize the bugs?

  • Not really anything else to tell you, I'm afraid. You can dig into the code, or you can provide a Report ID or Debug ID after reproducing the problem with BBT disabled and translators reset.
  • OK, I'll have a look at the code, and maybe I can see where the file slurping happens. It just seems odd that parsing failures don't yield error messages tied to lines. But I don't understand the internals, so...
  • If you're determined to do this entirely on your own, the easiest first option is just to do a binary search of the BibTeX file by splitting it in halves until you find the problematic entry.
  • (We might be able to improve logging for import failures, but malformed entries generally shouldn't cause failures, so we always start with a Report ID.)
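
The binary search suggested above can be made less tedious with a small script. What follows is only a sketch under stated assumptions, not anything Zotero or BBT provides: it assumes every entry starts on a line beginning with `@`, it drops any text before the first entry, and it copies `@string` and `@preamble` blocks into both halves so that macros stay defined. The names `split_entries`, `halve`, and `write_halves` are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: an entry starts at a line whose first non-blank character is "@".
ENTRY_START = re.compile(r"^[ \t]*@", re.MULTILINE)
# @string / @preamble blocks are needed by both halves.
SHARED = re.compile(r"\s*@(string|preamble)\b", re.IGNORECASE)

def split_entries(text: str) -> list[str]:
    """Split BibTeX source into chunks, one per '@' block."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    if not starts:
        return []
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]

def halve(text: str) -> tuple[str, str]:
    """Return two BibTeX sources, each holding half of the entries."""
    chunks = split_entries(text)
    shared = [c for c in chunks if SHARED.match(c)]
    entries = [c for c in chunks if not SHARED.match(c)]
    mid = len(entries) // 2
    head = "".join(shared)
    return head + "".join(entries[:mid]), head + "".join(entries[mid:])

def write_halves(path: str) -> tuple[Path, Path]:
    """Write <name>.first.bib and <name>.second.bib next to the input file."""
    src = Path(path)
    first, second = halve(src.read_text(encoding="utf-8"))
    out1 = src.with_name(src.stem + ".first.bib")
    out2 = src.with_name(src.stem + ".second.bib")
    out1.write_text(first, encoding="utf-8")
    out2.write_text(second, encoding="utf-8")
    return out1, out2
```

Import each half, keep the one that fails, and run `write_halves` on it again. A field value that happens to contain a line starting with `@` would fool the splitter, so treat the output as a starting point.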
  • OK, I disabled Better Bibtex and tried again with my ai.bib file. It hung importing (the little "Importing... ai.bib" window continued to show, but stopped updating, and the debug window stopped showing new input).

    The last item in the debug log that showed a string I could recognize seemed to refer to this entry in the import file:

    author = {John McCarthy and Patrick J. Hayes},
    booktitle = {Machine Intelligence},
    volume = 4,
    address = {Edinburgh},
    year = {1969},
    editor = {B. Meltzer and D. Michie},
    publisher = {Edinburgh University Press},
    title = {Some philosophical problems
    from the standpoint of artificial intelligence}

    but that's pretty much a wild guess. If it would help, please contact me, and I will send you the bibtex file.
  • If you want us to look at this, we would need a Debug ID.
  • Sorry -- I'm an idiot and forgot to paste it in. Here it is: D390155511
  • I wonder if anything in the above suggested ways I could try to fix my bibliography file to be importable. Thanks!
  • edited January 10, 2018
    Not that I know BibTeX at all, but just from looking at that code, it looks like the volume is missing the curly brackets, while all the other entries have them. Worth a shot at least.
  • The bibtex above imports fine for me (curly brackets are not required for fields).

    @dstillman could you check the debug?

    But if this were me, I'd do the binary search for problems as Dan suggests above. My recollection is that Zotero doesn't do good logging about where exactly in a file it fails.
  • @adamsmith Yes, as far as I can tell, there's nothing in the log that gives line numbers in a failing import.

    Given how crufty hand-written bib files are, I'm actually surprised that there aren't more problems like mine. bibtex will typically run at least kind of ok against files with errors in them...

    I have 66K lines of .bib (not all in one file, but still) acquired over something like 25 years, so the option of debugging by binary search seems pretty painful to me. This is only one error; I have no way of estimating how many times I'd have to repeat the binary search.
  • "I have no way of estimating how many times I'd have to repeat the binary search."
    That's the point of the binary search — absolute worst case (if you keep guessing the wrong half), it wouldn't be more than 16 times.
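
As a rough check on that figure, the worst-case number of halvings for a single bad item is the base-2 logarithm of the number of candidates, rounded up. The sketch below uses the 66K line count from the thread; the figure of about ten lines per entry is purely an assumption for illustration.

```python
def max_halvings(n_candidates: int) -> int:
    """Worst-case halving steps needed to isolate one bad item among n."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, with no float rounding.
    return (n_candidates - 1).bit_length() if n_candidates > 1 else 0

print(max_halvings(66_000))  # every line treated as a candidate -> 17
print(max_halvings(6_600))   # hypothetical: about 10 lines per entry -> 13
```

Since the split is done on entry boundaries rather than on lines, the real count is likely closer to the second number.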
  • I believe I have found the point in my Bibtex file where things go awry. I have canonical names for authors as BibTex "@string" definitions. So my file has something that looks like this in it:

    author = MCDERMOTT # { and Gerald J. Sussman},
    year = {1972},
    month = May,
    institution = {MIT Artificial Intelligence Laboratory},
    title = {The {\sc CONNIVER} Reference Manual}

    That's legitimate bibtex, but it seems like it could blow up an importer, and indeed this causes what looks like the same hang when I put it in a file by itself.

    This is actually a difficult issue for Bibtex import, because it's quite normal to do something like this, and it's not required to have the string definitions in the same file (indeed, for some abbreviations you wouldn't *want* to have them in the same file, because you would want to expand journal names, for example, differently in conference papers -- abbreviate -- and journal articles -- write out in full).

    So this seems to be the problem. I could see that Zotero might not be able to import these, but it seems like a bug that it causes it to hang.
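
One way to locate entries like this ahead of an import is to scan for macro names that have no `@string` definition in the same file. The following is a line-based heuristic, not a real BibTeX parser: it assumes each field starts on its own line as `name = value`, treats macro names as case-insensitive, and skips the three-letter month abbreviations that the standard styles predefine. The function names are hypothetical.

```python
import re

STRING_DEF = re.compile(r"@string\s*[{(]\s*([A-Za-z][\w:.+-]*)\s*=", re.IGNORECASE)
FIELD = re.compile(r"^\s*[A-Za-z][\w-]*\s*=\s*(.*)$")
NAME = re.compile(r"[A-Za-z][\w:.+-]*")
MONTHS = {"jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"}

def bare_part(value: str) -> str:
    """Drop text inside {...} or "..." so only macros, numbers and '#' remain."""
    out, depth, in_quotes = [], 0, False
    for ch in value:
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
        elif depth == 0 and not in_quotes:
            out.append(ch)
    return "".join(out)

def undefined_macros(text: str) -> list[tuple[int, str]]:
    """Return (line_number, macro_name) for macros with no @string in this text."""
    defined = {m.group(1).lower() for m in STRING_DEF.finditer(text)} | MONTHS
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        field = FIELD.match(line)
        if not field:
            continue
        for name in NAME.findall(bare_part(field.group(1))):
            if name.lower() not in defined:
                hits.append((lineno, name))
    return hits
```

Run against the fields quoted above, this would flag `MCDERMOTT` on the `author` line and leave `May` alone. Prepending the file that holds the `@string` definitions before scanning should make the reports disappear for macros defined there.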
  • @dstillman The absolute worst case is 16 times if you are looking for a single entry. If there are multiple entries then the log bound doesn't obviously apply...
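
For the multiple-entry case, the same halving idea can be applied recursively: test a group, and only split it further if it fails. This is a sketch assuming each bad entry fails on its own regardless of its neighbours; `imports_ok` is a hypothetical stand-in for "this group imports cleanly", which in practice might be a manual import attempt or some other parser.

```python
from typing import Callable, Sequence

def find_bad(entries: Sequence[str],
             imports_ok: Callable[[Sequence[str]], bool]) -> list[str]:
    """Return every entry that fails by itself, testing whole groups first."""
    if not entries or imports_ok(entries):
        return []                      # nothing wrong anywhere in this group
    if len(entries) == 1:
        return [entries[0]]            # isolated a single failing entry
    mid = len(entries) // 2
    return (find_bad(entries[:mid], imports_ok)
            + find_bad(entries[mid:], imports_ok))
```

With k bad entries among n, the number of tests grows roughly in proportion to k times log2(n) rather than n, so a handful of bad entries stays manageable even in a large file.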
  • Sure, but if this is a bug in Zotero (which it appears to be), you only need to find one, so that would just make finding it quicker.
    Timestamp: 1/10/18, 2:49:24 PM
    Error: TypeError: rawValue is undefined
    Source File: BibTeX
    Line: 323
    This is the error I get trying to import that entry above.

    I can't look at this at the moment, but someone more familiar with the BibTeX translator might have more to say on it. Obviously this shouldn't cause a hang regardless, so at the very least this should stop the import with an error.
  • Could I get a copy of that bib file? BBT should not be blowing up on bibtex imports.
  • This happens in stock Zotero. You can reproduce the error with the entry provided above.
  • Is that valid bibtex? With the round braces?
  • I guess it is, because BBT imports it :)
  • Is it importing it correctly for you, though? BBT produces a messy import and a note with an import error for me.
  • This page states that parens are ok to delimit bibtex entries, and they have always worked for me:
  • @adamsmith I tried this with stock Zotero after BBT failed to parse this file, just raising an error that, AFAICT, simply said "there's something bad somewhere in this file."

    Possibly stupid question: is there some reason the bibtex importers couldn't count input file lines?
  • @adamsmith: It's importing without throwing an error dialog -- I'm looking at importing the content properly.

    @rpgoldman: thanks, I'll look into that.
  • @emilianoheyns It's importing without throwing an error dialog, but it's not completing the import -- the window showing the progress circle never gets erased. I'm not sure what that means in terms of the status of Zotero.
  • The import progress window is not BBT's -- I don't know what triggers it to come up or to go away.
  • Got it -- BTW, string definitions are imported by BBT.
  • @rpgoldman there's no reason the parser couldn't count lines, but the one I use doesn't. Yet, at least.
  • @emilianoheyns The hang happened on Zotero without BBT. Without BBT I get no error, some but not all of the bib file is successfully imported, and the import hangs (at least that's how I interpret the fact that the import progress window never goes away).

    With BBT, I get an error message, and *nothing* is imported.

    In both cases, it would be great to have a line count in the error window to help debug.
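
On the line-count question: carrying a line number through a scan is cheap. The sketch below is not the Zotero translator or the parser BBT uses; it is a standalone pre-check that assumes entries start on a line beginning with `@`, counts every `{` and `}` character, and reports the starting line of any entry whose braces do not balance. The name `check_entries` is made up.

```python
def check_entries(text: str) -> list[tuple[int, str]]:
    """Return (start_line, message) for entries whose braces do not balance."""
    problems = []
    depth = 0
    start_line = None                  # line where the current entry began

    def close_entry(next_line: int) -> None:
        if start_line is not None and depth != 0:
            problems.append(
                (start_line,
                 f"brace depth {depth} when reaching line {next_line}"))

    lines = text.splitlines()
    for lineno, line in enumerate(lines, start=1):
        if line.lstrip().startswith("@"):
            close_entry(lineno)        # judge the previous entry first
            start_line, depth = lineno, 0
        depth += line.count("{") - line.count("}")
    close_entry(len(lines) + 1)        # judge the final entry
    return problems
```

It only catches brace mismatches, so it would not have found the `@string` problem in this thread, but it shows how an error report can point at a line.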