The surprisingly bad performance of Zotero in creating accurate references in this:
has led me to look at some of the reasons. The biggest issue is, unsurprisingly, poor import quality - I think some of it can be fixed.

Specifically, the arXiv translator puts "doi:" in front of the DOI, puts the arXiv e-pub number in the publication field, and puts the journal title, volume, and issue in the Extra field. That seems unfortunate...
  • I've fixed the DOI issue, but the other problems you note can't be fixed, and probably shouldn't be. Remember that arXiv is a preprint site, and so publication information is likely to be incomplete and in any case does not represent the artifact at hand. It's merely a pointer to where the actual published article resides, if in fact it was ever published. Because arXiv doesn't provide us with anything structured, even if we wanted to use this data -- and I don't think we do -- we would have to use a messy regular expression to try to extract it...
  • Thanks for the DOI fix.

    I understand what you say about the other issue, but then arXiv papers shouldn't be treated as journal articles - because arXiv is not a journal. For most citation styles treating arXiv:0704.0001 (or the like) as a journal title will produce wrong citations.
    I would say report is probably the best fit, as these are essentially working papers with a clear identifier.
  edited September 13, 2011
    The ISTL paper is a little less careful than I'd like to see for something in an academic journal -- Sean is completely right that the publication info for a preprint (or manuscript, which is what an arXiv paper often is) is not the same as that for the related published article.

    It does underline the importance of getting journal abbreviation lists working, but Frank is fortunately making progress on that front.
  • I agree that their methodology is maybe questionable, but I thought it'd be worthwhile to take advantage of the work they've done. I got them to send me a scan of the citations they looked at and the errors they found, and that has allowed me to fix a couple of small things in styles.

    I'm still wondering, though, whether treating arXiv preprints as journal articles is a good idea.
  • We could switch to manuscript, but I'm afraid that might not be what people are expecting.
  • As I said above, I'd go for report - but the status quo clearly isn't right, no?
  • The only data we can get from arXiv are:
    - paper title
    - description fields (?)
    - tags / subject descriptors
    - DOIs, URLs (these may refer to the published version)
    - article ID (currently used as the journal name)
    - authors
    - date

    My preferred mapping is to manuscript, with the article ID referring to the Location in Archive, and the archive set to arXiv.

    But, as always, our decision here will likely hinge on what the citation guidelines are for preprints. Are they even covered in style guides?
  • yes, I found a couple - I'll post with that tomorrow.
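
For illustration, the manuscript mapping proposed above, together with the "doi:" prefix cleanup mentioned earlier in the thread, could be sketched roughly as follows. This is a minimal sketch and not the actual arXiv translator: the function names, the input dict keys (`id`, `title`, `authors`, `date`, `subjects`, `doi`), and the output field names are assumptions chosen for the example, and the sample title and DOI are placeholders.

```python
import re

# Matches a leading "doi:" label (any case) plus surrounding whitespace.
DOI_PREFIX = re.compile(r"^\s*doi:\s*", re.IGNORECASE)


def clean_doi(raw):
    """Strip a leading 'doi:' label so only the bare DOI remains."""
    return DOI_PREFIX.sub("", raw).strip()


def arxiv_to_manuscript(record):
    """Map a dict of arXiv metadata onto a manuscript-style item.

    Field names here are hypothetical and only loosely modeled on
    reference-manager fields; they are assumptions for this sketch.
    """
    item = {
        "itemType": "manuscript",
        "title": record["title"],
        "creators": list(record.get("authors", [])),
        "date": record.get("date", ""),
        # The archive is fixed; the article ID becomes the location in it,
        # instead of being used as a journal name.
        "archive": "arXiv",
        "archiveLocation": record["id"],
        "tags": list(record.get("subjects", [])),
    }
    # The DOI may point to the published version, so keep it only as a
    # cleaned identifier and do not derive journal data from it.
    if record.get("doi"):
        item["DOI"] = clean_doi(record["doi"])
    return item


# Placeholder record: the ID format follows the example in the thread,
# the title and DOI are made up for illustration.
example = arxiv_to_manuscript({
    "id": "arXiv:0704.0001",
    "title": "Placeholder title",
    "authors": ["A. Author"],
    "date": "2007-04-02",
    "doi": "doi:10.1000/example",
})
```

With this shape, a citation style never sees the article ID as a journal title; whether styles then format the archive fields sensibly still depends on what the style guides say about preprints.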
