BibTeX Import Issues

Hi,

I have just started using your add-in and it looks like it will be very useful. I'm a LaTeX user and am trying to figure out how to get your program to integrate well with my use of BibTeX.

I have started by importing a BibTeX database into Zotero, but have noticed that all of the formatting commands within my titles and abstracts, such as \emph{TEXT}, get changed, e.g. to \emphTEXT (the braces are stripped). I just wanted to make sure you were aware of this problem and was wondering if there is an easy solution. I'm not familiar with databases, so I'm hoping I don't have to become familiar with them in order to resolve this relatively minor problem.
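To make the expected behaviour concrete, here is a minimal Python sketch of one way an importer could treat \emph{TEXT}: keep the text and turn the command into markup instead of dropping the braces. The `<i>` tags and the name `emph_to_italics` are assumptions made for illustration; nothing in this thread confirms how Zotero stores or renders such markup, and nested braces are not handled.

```python
import re

# Matches \emph{...} where the braces contain no nested braces
# (a simplifying assumption; real BibTeX fields can nest braces).
EMPH = re.compile(r"\\emph\{([^{}]*)\}")

def emph_to_italics(text: str) -> str:
    """Replace each \\emph{TEXT} with <i>TEXT</i> (hypothetical target markup)."""
    return EMPH.sub(r"<i>\1</i>", text)

print(emph_to_italics(r"A study of \emph{E. coli} growth"))
```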

In the same vein, I'm wondering if there's an easy way to have my Zotero and BibTeX files talk to one another, so that if I update one the change propagates to the other.

Thanks again for an interesting piece of software.

Sincerely,

Mike
  • edited May 13, 2008
    Hi,

    I'm in a similar position. I want to use Zotero, but my primary citation database is in BibTeX and I would rather import it than regenerate it in Zotero.

    Because you picked a good thread name I'd like to propose we use this thread to document the issues we hit with bibtex importing so that they are more visible to developers. We should restrict discussion to the mapping of bibtex data to fields in the zotero database as much as possible. Would you agree?

    Based on some other specific threads:
    http://forums.zotero.org/discussion/2107/bibtex-book-chapter-title/
    It might be worthwhile for some of us to write out a mapping of bibtex fields to zotero fields, so that writers of import and export filters have a resource to work from.

    Here is another example of an issue in waiting:

    BibTeX @inproceedings:
    Required fields: author, title, booktitle, year
    Optional fields: editor, pages, organization, publisher, address, month, note, key

    Zotero: Conference Paper
    Doesn't provide separate fields for conference organiser and proceedings publisher (bad) and separately stores the "conference name" which will typically be highly redundant with the proceedings title (confusing); there is also no provision for editor information (import ->export may lose important information).

    Small collection of related threads:
    http://forums.zotero.org/discussion/1950/
    http://forums.zotero.org/discussion/2134/
    http://forums.zotero.org/discussion/2125/
    http://forums.zotero.org/discussion/1952/
    http://forums.zotero.org/discussion/1904/
    http://forums.zotero.org/discussion/1789/
  • I'll try and do one entry type at a time. Here is a test techreport entry

    @techreport{ key,
    author = {First von Last and von Last, First and von Last, Jr, First},
    title = {The Title},
    type = {Example Report},
    number = {EG123},
    institution = {The Institution},
    year = {2007},
    month = {October},
    address= {Earth},
    note = {Preprint},
    abstract = {not in bibtex ``standard'' but quite common},
    keywords = {also in common use},
    doi = {will become important},
    leftoffield = {some other data we are not going to worry about}
    }

    Import problems:
    --------------------
    1. First von Last name format is not correctly converted. The von part is grouped with the first name.
    2. Names with a Junior part are not correctly converted. The first name is lost
    3. LaTeX-style double quotes are not converted to an ASCII or Unicode replacement
    4. Institution is lost (field is empty in zotero)
    5. a single key-phrase is broken into words (comma separated phrases are not)
    6. doi is lost (field is not present in zotero Report type)

    Discussion
    ------------
    The name handling is a big issue that will obviously affect all types, so it is a bit of a priority. Name handling in bibtex has been hardened against the various difficult forms. A good description can be found here: http://artis.imag.fr/~Xavier.Decoret/resources/xdkbibtex/bibtex_summary.html#names
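    As a rough illustration of the three name forms used in the test entry (First von Last; von Last, First; von Last, Jr, First), here is a Python sketch of how the parts could be separated. It is a simplification of the BibTeX rules, not the translator's code: it assumes names are split on commas, that a word starting with a lowercase letter belongs to the von part, and that there are no brace-protected groups. `Name`, `parse_name` and `parse_authors` are names invented for this example.

```python
from typing import List, NamedTuple, Tuple

class Name(NamedTuple):
    first: str
    von: str
    last: str
    jr: str

def _split_von_last(words: List[str]) -> Tuple[str, str]:
    """Leading lowercase words form the von part; the rest is the last name."""
    i = 0
    # Always leave at least one word for the last name.
    while i < len(words) - 1 and words[i][0].islower():
        i += 1
    return " ".join(words[:i]), " ".join(words[i:])

def parse_name(name: str) -> Name:
    parts = [p.strip() for p in name.split(",")]
    if len(parts) == 1:
        # "First von Last": the first name runs up to the first lowercase
        # word, but never swallows the final word.
        words = parts[0].split()
        i = 0
        while i < len(words) - 1 and not words[i][0].islower():
            i += 1
        von, last = _split_von_last(words[i:])
        return Name(" ".join(words[:i]), von, last, "")
    von, last = _split_von_last(parts[0].split())
    if len(parts) == 2:
        # "von Last, First"
        return Name(parts[1], von, last, "")
    # "von Last, Jr, First" (anything after a third comma is ignored here)
    return Name(parts[2], von, last, parts[1])

def parse_authors(field: str) -> List[Name]:
    """Split a BibTeX author field on ' and ' and parse each name."""
    return [parse_name(n) for n in field.split(" and ")]

for n in parse_authors("First von Last and von Last, First and von Last, Jr, First"):
    print(n)
```

    With four separate parts like this, none of the first name, von or Jr information has to be dropped or merged on import, whatever fields it ends up being stored in.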

    I know there are threads on converting to and from latex style non-ascii character specifications.

    The only complete loss of data is the institution field.


    Export problems: (sorry should put them elsewhere but I have them right now)
    --------------------
    1. Exported as @misc instead of @techreport
    2. Report number is lost

    Regards,
  • Here is today's installment: a conference paper.

    @inproceedings{pkey,
    author = {First von Last and von Last, First and von Last, Jr, First},
    title = {paper-title},
    booktitle = {proceedings-title},
    year = {2008},
    editor = {von Last, Jr, First},
    pages = {200--205},
    organization = {Organiser},
    publisher = {Publisher},
    address = {Address},
    month = {Month},
    note = {note, useful for storing the authors affiliations},
    key = {key},
    abstract = {not in bibtex ``standard'' but quite common},
    keywords = {also in common use},
    doi = {will become important},
    leftoffield = {some other data we are not going to worry about}
    }


    Import problems
    --------------------
    1. First von Last name format is not correctly converted. The von part is grouped with the first name.
    2. Names with a Junior part are not correctly converted. The first name is lost
    3. LaTeX-style double quotes are not converted to an ASCII or Unicode replacement
    (as above)
    4. Conference organiser (organization) is lost (field is not present in zotero Conference paper entry)
    5. a single key-phrase is broken into words (comma separated phrases are not)


    Discussion
    -------------
    Problems 1-3,5 are shared with other publication types and represent general bibtex import problems. In this case doi information is preserved.

    Significant to this publication type is 4: the fact that zotero does not provide storage for the conference organiser. The organiser is not always the same as the publisher. Springer is an example of a publisher that publishes proceedings without organising conferences.

    Export problems: (sorry should put them elsewhere but I have them right now)
    ---------------------
    1. Proceedings title is exported to "journal" but should be exported as "booktitle"
    2. Month is not exported unless it is recognised as a real month name. A better behaviour would be to export as a string.

    The second point is again a "general" issue
  • Sure hope this is useful to someone...

    @ARTICLE{pkey,
    author = {First von Last and von Last, First and von Last, Jr, First},
    title = {paper-title},
    journal= {journal-name},
    year = {2008},
    volume = {15},
    number = {5},
    pages = {200--205},
    month = {March},
    note = {note, useful for storing the authors affiliations},
    key = {key},
    abstract = {not in bibtex ``standard'' but quite common},
    keywords = {also in common use},
    doi = {will become important},
    leftoffield = {some other data we are not going to worry about}
    }

    Import problems
    --------------------
    1. First von Last name format is not correctly converted. The von part is grouped with the first name.
    2. Names with a Junior part are not correctly converted. The first name is lost
    3. LaTeX-style double quotes are not converted to an ASCII or Unicode replacement
    (as above)
    4. a single key-phrase is broken into words (comma separated phrases are not)

    Discussion
    -------------
    Problems 1-4 are shared with other publication types and represent general bibtex import problems. In this case doi information is preserved.


    Export problems: (sorry should put them elsewhere but I have them right now)
    ---------------------
    1. issue/number is lost
    2. Range hyphenation (en-dash) is not used in the page range: it should be "first--last" but is "first-last". I might have missed this previously (checks... yes, the same problem occurs for inproceedings, as expected)
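
    On the page-range point, the export-side fix could look roughly like the following. This is a Python sketch rather than the translator's own JavaScript, and `bibtex_pages` is a made-up name. It assumes page numbers themselves contain no hyphens, so something like "A-12" would be mangled.

```python
import re

# One or more hyphens or Unicode en-dashes (U+2013), with optional
# surrounding whitespace, between the first and last page.
_RANGE = re.compile(r"\s*[-\u2013]+\s*")

def bibtex_pages(pages: str) -> str:
    """Return the page field with '--' as the range separator."""
    return _RANGE.sub("--", pages.strip())

print(bibtex_pages("200-205"))
print(bibtex_pages("200--205"))
print(bibtex_pages("227"))
```

    Because runs of dashes are collapsed before being rewritten, a range that already uses "--" passes through unchanged, and a single page number is left alone.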
  • So to take stock here is a list of bibtex import and export issues that should (?) have a bug/ticket/whatever raised for them. I have left out a couple of smaller issues that, in retrospect, may not be bugs (such as no doi for reports):

    1. Zotero author data fields insufficient to handle names containing "von" and "Jr" parts
    2. Author information is lost when importing from bibtex where names contain von and Jr parts
    a. "First von Last" name format is not correctly converted. The von part is grouped with the first name.
    b. "von Last, Jr, First" name format is not correctly converted. The first name is lost (and Jr used in its place)
    3. latex double quotation marks are not correctly imported from bibtex. eg. ``text''. The start and end quote marks comprise 2 ascii characters and need to be converted to an appropriate unicode character within zotero. (may be an issue in export also but not tested)
    4. When importing from bibtex, if the keyword field contains a single key-phrase, it is broken into single words in zotero. Import should probably always assume comma delimited keywords to prevent this.
    5. Institution is lost (field is empty in zotero) when importing from a bibtex @techreport entry
    6. When exporting a Report entry to bibtex, an @misc entry is generated; it should be a @techreport. Possibly as a consequence, the report number is lost.
    7. zotero does not provide storage of the conference organiser in the conference paper entry type. Consequently, conference organiser (organization) is lost when importing from bibtex @inproceedings
    8. When exporting a conference paper to bibtex, the proceedings title is exported to "journal" but should be exported as "booktitle"
    9. When exporting a journal article to bibtex, "issue" should be exported as "number" but is not, resulting in loss of information.
    10. When exporting page ranges to bibtex, an en-dash "--" should be used, but a single dash "-" is used.

    I'll stop at that for a while unless anyone asks me to generate test data etc for any other specific bibtex entry types.
  • Re. 1 and 2: Zotero can have a single-field creator & this should probably be used more often--I'd rather have middle names, vons, JR/IIIs, etc. preserved than discarded. There was a past discussion on why names are hard.

    I'd think 2 could be fixed (by using the single field) well before name storage+GUI were improved somehow to separate semantically all possible parts of a name.


    Re. 4: This has been raised before.

    I think one problem might be that some data providers don't comma-separate their keywords & a single super-long tag was found to be very annoying and to screw things up. So you may have to choose which annoying thing you end up with. (I could be wrong about this & I don't recall what format(s) this might have been the case for....perhaps BibTeX isn't a problem with most data providers?)

    I think you can work around it by appending a comma (with or without some other useful keyword--I usually add ', BibTeX import' to mine).
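
    To show why the trailing comma helps, here is a small Python sketch of the splitting behaviour as it has been described in this thread (commas win if present, otherwise whitespace). It is a guess at the logic from the outside, not the actual translator code, and `split_keywords` is a made-up name.

```python
from typing import List

def split_keywords(field: str) -> List[str]:
    """Split on commas if any are present, otherwise on whitespace."""
    if "," in field:
        # Empty pieces (e.g. after a trailing comma) are dropped.
        return [k.strip() for k in field.split(",") if k.strip()]
    return field.split()

print(split_keywords("also in common use"))                 # broken into words
print(split_keywords("also in common use,"))                # kept as one phrase
print(split_keywords("also in common use, BibTeX import"))  # two phrases
```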
  • edited May 20, 2008
  • I've patched 3,5,6,8,9,10.
    Patch committed. The updated translator will only be available in a 1.0 branch dev build (and Zotero 1.0.5 when it comes out) using Reset Translators and Styles from the Advanced Pane of the Zotero prefs. (This only applies to the BibTeX translator, and it won't be necessary in Zotero 1.5.)

    Thanks to andre for the detailed bug reports and noksagt for the quick patch.
  • Thanks heaps. That's truly amazing turnaround :-)

    Take the point on 4. Keywords is non-standard in bibtex anyway

    Re 1,2 I understand the complexity of this. I've been using last,first in all my bibtex because it is the least ambiguous and results in fewer formatting errors, especially in cases where bibliography styles dictate short form names. I think in the longer term 4 part names will probably be required in zotero but ... like I said, I understand it's a relatively complex implementation issue. (The link I posted on names previously in this thread is useful)

    7 is not going to bite me any time soon. It's more for completeness.

    Stoked!

    When will 1.0.5 come out BTW?
  • just looking over the patch. Not sure how the mappingTable structure is used but mine contains:

    "\u201C":"{\\textquotedblleft}", // LEFT DOUBLE QUOTATION MARK
    "\u201D":"{\\textquotedblright}", // RIGHT DOUBLE QUOTATION MARK

    and your patch seems to add (not replace the above)

    "\u201D":"''''", // RIGHT DOUBLE QUOTATION MARK (could be a double prime)
    "\u201C":"``", // LEFT DOUBLE QUOTATION MARK (could be a reversed double prime)

    are those doubled up entries going to cause problems?

    Also this page http://www.bitjungle.com/~isoent/ may be a useful resource
  • yes... had to remove duplicates. Might be just a problem with me not using the latest repository version.

    At least I know where to go now to fix things :-)
  • are those doubled up entries going to cause problems?
    I don't think there should be doubled-up entries--there are two separate tables (one for bibtex->zotero & one for zotero->bibtex). I modified the bibtex->zotero table to import your entries (``,'') to curly double quotes (as expected).

    On export, the curly double quotes should be mapped (as they had been) to textquotedblleft and textquotedblright (how they SHOULD be mapped is debatable).

    If this isn't the case, can you provide a patch & describe both current & desired behavior?
  • In the diff view here:
    https://www.zotero.org/trac/attachment/ticket/1020/scrapers.diff

    the new / corrected entries look like they are out of order in the table. I guess that's where my confusion arose. And yes I was looking at the zotero->bibtex table... so double confusion.

    I didn't have or post a test case where the zotero entry contained the unicode quotation characters but again my expectation would be that they would map to the latex equivalent.

    Having patched my tables manually (in the most obvious way) it seems to behave perfectly.
  • I didn't have or post a test case where the zotero entry contained the unicode quotation characters but again my expectation would be that they would map to the latex equivalent.
    \textquotedblleft is a LaTeX equivalent & a quick search through comp.text.tex suggests that it might be more robust than using the (very common) double single quote shorthand; in the past, the kerning of the special characters was correct in some fonts where the ligatures were not kerned correctly.

    I don't know whether the technical correctness of this outweighs the slight obscurity. Is there any other advantage that I'm missing that the shorthand has over the longer form?
  • Only readability as far as I know. I personally had not encountered the verbose form so you are starting to stretch my knowledge as a latex user. I had initially assumed it was an XML hybrid thing but it looks like it is a newer latex thing. Looks sensible in fact! And is probably easier for other applications to parse/import/export than the obscure latex forms. It's probably good to leave as is.

    Can the importer handle both the short and long forms though?
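
    For what it's worth, accepting both forms on import while exporting only the long form could be sketched like this. It is Python rather than the translator's JavaScript, the two tables only mirror the idea of separate bibtex->zotero and zotero->bibtex mappings discussed above, and the names `TO_UNICODE`, `TO_BIBTEX`, `import_quotes` and `export_quotes` are invented for the example. Only the braced long form is recognised.

```python
# bibtex -> zotero: accept the long command form and the shorthand.
TO_UNICODE = [
    ("{\\textquotedblleft}", "\u201C"),   # LEFT DOUBLE QUOTATION MARK
    ("{\\textquotedblright}", "\u201D"),  # RIGHT DOUBLE QUOTATION MARK
    ("``", "\u201C"),
    ("''", "\u201D"),
]

# zotero -> bibtex: always write the long form.
TO_BIBTEX = {
    "\u201C": "{\\textquotedblleft}",
    "\u201D": "{\\textquotedblright}",
}

def import_quotes(text: str) -> str:
    """Convert LaTeX double quotes (either form) to Unicode curly quotes."""
    for latex, char in TO_UNICODE:
        text = text.replace(latex, char)
    return text

def export_quotes(text: str) -> str:
    """Convert Unicode curly double quotes to the long LaTeX form."""
    return "".join(TO_BIBTEX.get(ch, ch) for ch in text)

sample = "not in bibtex ``standard'' but quite common"
print(import_quotes(sample))
print(export_quotes(import_quotes(sample)))
```

    Keeping the two directions in separate tables means each Unicode character has exactly one export form, so there are no doubled-up keys, while the import side is free to recognise as many spellings as needed.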
  • I have an issue when exporting my Logos Bible Software ver. 6 library to BibTeX and then importing the result into Zotero.
    When the source is a chapter in a book, the BibTeX export looks as follows:

    @misc{Grounds_1967,
    place={Grand Rapids, MI},
    title={Counseling the Bereaved},
    journal={Baker’s dictionary of practical theology},
    publisher={Baker Book House},
    author={Grounds, Vernon},
    editor={Turnbull, Ralph G.Editor},
    year={1967},
    pages={227}}

    Note that the book title is given as "journal={Baker’s dictionary of practical theology},"

    Issue: Entry turns up blank in Zotero.
    If I manually change the @misc header to @inbook, the Zotero entry shows correctly as a Book Section, but the rest of the entry is still blank.

    Can anyone help?
  • Follow-up: I notice that my Logos-Zotero export-import issue relates to items # 6 and 8 in andre's list.
  • First, please start a new thread. Any issues in this thread are no longer relevant, since the BibTeX translator has undergone numerous revisions in the last 7 years.

    But the entry you post seems to import ok for me (as book in the form you have it and as chapter if I change to @inbook). If you copy the BibTeX entry from above and use Import from Clipboard in Zotero, do you still get a blank entry?

    Do you have any third-party plugins installed, like zotero-better-bibtex? Try disabling them if that's the case. Also, go to Preferences -> General and click Update Now.

    If things are still not working at that point, please generate a Debug log for an attempt to import the above BibTeX entry from clipboard. https://www.zotero.org/support/debug_output
  • Your post doesn't really relate to the thread. The issue is with the export from Logos. BibTeXing doesn't list journal as an optional field for inbook or for misc.

    There are a few threads on the Logos support site about this. This one has a link to a way of fixing their BibTeX export and also a suggestion for a workaround:
    https://community.logos.com/forums/p/103224/713491.aspx