[translator bug] DOI translator handles Crossref "posted content" in unexpected ways

edited February 8, 2024
The Crossref "posted content" category is often used for preprints but can also hold things like scholarly blog posts, as in material syndicated through Rogue Scholar.

The Crossref specification for posted content includes elements like `group title` (for groupings into categories and domains) and `institution` ("Container for information about an organization that sponsored or hosted an item but is not the publisher").

Rogue Scholar-supplied DOIs for blog posts include the name of the originating blog in `institution` (sensibly I think) and some content categories in `group title`. Example DOI to try this with: 10.59350/rgtg4-17t30 (for a blog post on the Front Matter blog).

However, when importing such a DOI into Zotero:
* "group title" ends up in the institution field
* whatever was in "institution" is lost
* report type is set to 'other'

The problem: this means such material cannot be correctly cited, as some information from Crossref is lost upon import into Zotero.

Expected behaviour:
* institution would be kept (so that the blog name appears there, and shows up correctly in citations)
* whatever is in "group title" goes to Zotero's Extra field
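
To make the expected behaviour concrete, here is a minimal Python sketch of the mapping I have in mind. It is not the translator's actual code: the function name `map_posted_content`, the output keys `institution` and `extra`, and the "Group title:" label are all assumptions made for illustration. The input keys `institution` and `group-title` are the ones that appear in the Crossref API output quoted further down in this thread.

```python
def map_posted_content(message: dict) -> dict:
    """Sketch of the expected mapping for a Crossref 'posted-content' record.

    The returned keys ('institution', 'extra') are illustrative stand-ins
    for Zotero fields, not the translator's real data model.
    """
    item = {}

    # Keep the hosting organisation (e.g. the blog name) instead of dropping it.
    institutions = message.get("institution") or []
    names = [inst["name"] for inst in institutions if inst.get("name")]
    if names:
        item["institution"] = "; ".join(names)

    # Preserve the category in Extra rather than letting it overwrite
    # the institution. The "Group title:" label is an assumption.
    group_title = message.get("group-title")
    if group_title:
        item["extra"] = f"Group title: {group_title}"

    return item


# Values taken from the Crossref record quoted later in this thread.
record = {
    "institution": [{"name": "The Ideophone"}],
    "group-title": "Languages and literature",
}
print(map_posted_content(record))
```

With this mapping the blog name would survive the import and the category would still be available, just not in a field that affects citations.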


(Oh, and I would love it if the DOI field came to all item types, including this one.)
  • Little bump. If this is indeed a bug, which I believe it is, is there another place where I should file it?

    Concrete example: this blog post on the etymology of Zotero:
    https://doi.org/10.59350/mp3sz-q1k17

    Crossref API output looks like this:


    {
    "status": "ok",
    "message-type": "work",
    "message-version": "1.0.0",
    "message": {
    "institution": [
    {
    "name": "The Ideophone"
    }
    ],
    "indexed": {
    "date-parts": [
    [
    2024,
    2,
    5
    ]
    ],
    "date-time": "2024-02-05T00:22:13Z",
    "timestamp": 1707092533002
    },
    "posted": {
    "date-parts": [
    [
    2008,
    1,
    25
    ]
    ]
    },
    "group-title": "Languages and literature",
    "reference-count": 0,
    "publisher": "Front Matter",
    "license": [
    {
    "start": {
    "date-parts": [
    [
    2008,
    1,
    25
    ]
    ],
    "date-time": "2008-01-25T00:00:00Z",
    "timestamp": 1201219200000
    },
    "content-version": "vor",
    "delay-in-days": 0,
    "URL": "https://creativecommons.org/licenses/by/4.0/legalcode"
    },
    {
    "start": {
    "date-parts": [
    [
    2008,
    1,
    25
    ]
    ],
    "date-time": "2008-01-25T00:00:00Z",
    "timestamp": 1201219200000
    },
    "content-version": "tdm",
    "delay-in-days": 0,
    "URL": "https://creativecommons.org/licenses/by/4.0/legalcode"
    }
    ],
    "content-domain": {
    "domain": [],
    "crossmark-restriction": false
    },
    "short-container-title": [],
    "abstract": "<p>If you’ve read yesterday’s post (Zotero, an Endnote alternative) or come across Zotero elsewhere, you may have been wondering about its name.</p>",
    "DOI": "10.59350/mp3sz-q1k17",
    "type": "posted-content",
    "created": {
    "date-parts": [
    [
    2024,
    1,
    30
    ]
    ],
    "date-time": "2024-01-30T21:23:15Z",
    "timestamp": 1706649795000
    },
    "source": "Crossref",
    "is-referenced-by-count": 0,
    "title": [
    "The etymology of Zotero"
    ],
    "prefix": "10.59350",
    "author": [
    {
    "given": "Mark",
    "family": "Dingemanse",
    "sequence": "first",
    "affiliation": []
    }
    ],
    "member": "31795",
    "container-title": [],
    "original-title": [],
    "link": [
    {
    "URL": "https://ideophone.org/zotero-etymology",
    "content-type": "text/html",
    "content-version": "vor",
    "intended-application": "text-mining"
    }
    ],
    "deposited": {
    "date-parts": [
    [
    2024,
    2,
    4
    ]
    ],
    "date-time": "2024-02-04T22:16:27Z",
    "timestamp": 1707084987000
    },
    "score": 1,
    "resource": {
    "primary": {
    "URL": "https://ideophone.org/zotero-etymology"
    }
    },
    "subtitle": [],
    "short-title": [],
    "issued": {
    "date-parts": [
    [
    2008,
    1,
    25
    ]
    ]
    },
    "references-count": 0,
    "URL": "http://dx.doi.org/10.59350/mp3sz-q1k17",
    "relation": {},
    "published": {
    "date-parts": [
    [
    2008,
    1,
    25
    ]
    ]
    },
    "subtype": "other"
    }
    }


    But Zotero turns it into the following: the "name" of the publication (The Ideophone) is gone, and the "group-title" value has been turned into the dc:publisher.


    <rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:z="http://www.zotero.org/namespaces/export#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:bib="http://purl.org/net/biblio#"
    xmlns:dcterms="http://purl.org/dc/terms/">
    <bib:Report rdf:about="https://ideophone.org/zotero-etymology">
    <z:itemType>report</z:itemType>
    <dc:publisher>
    <foaf:Organization>
    <foaf:name>Languages and literature</foaf:name>
    </foaf:Organization>
    </dc:publisher>
    <bib:authors>
    <rdf:Seq>
    <rdf:li>
    <foaf:Person>
    <foaf:surname>Dingemanse</foaf:surname>
    <foaf:givenName>Mark</foaf:givenName>
    </foaf:Person>
    </rdf:li>
    </rdf:Seq>
    </bib:authors>
    <dc:identifier>
    <dcterms:URI>
    <rdf:value>https://ideophone.org/zotero-etymology</rdf:value>
    </dcterms:URI>
    </dc:identifier>
    <dc:date>2008-1-25</dc:date>
    <dc:description>DOI: 10.59350/mp3sz-q1k17</dc:description>
    <dcterms:dateSubmitted>2024-02-18 10:55:43</dcterms:dateSubmitted>
    <z:type>other</z:type>
    <z:libraryCatalog>DOI.org (Crossref)</z:libraryCatalog>
    <z:language>en</z:language>
    <dcterms:abstract>If you’ve read yesterday’s post (Zotero, an Endnote alternative) or come across Zotero elsewhere, you may have been wondering about its name.</dcterms:abstract>
    <dc:title>The etymology of Zotero</dc:title>
    </bib:Report>
    </rdf:RDF>

  • Is there some way to find out whether this has been noticed? Should I file this somewhere else? Looking over on GitHub, it seems the translator wizards create the issues there based on threads here, so I'll just gently bump this again.
  • edited March 4, 2024
    Yes, I've seen this. I don't think we currently account for the 'posted-content' type at all, but it's also going to be a bit tricky to work with, so might take a bit.
    Added issue, thanks for the nudge: https://github.com/zotero/translators/issues/3266
  • @mark this has now been merged and should import posted-content pretty nicely. Although I'm not sure what other types of content are in there, I'm currently calling everything that's 'other' (i.e., not a preprint) a blogpost. Happy to reconsider if that subtype gets used more (though hopefully it doesn't).
  • nice, I can confirm this works now — thank you so much!

    I think blogposts are the most common form of non-preprint 'posted content' so for now it will work.
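
For anyone reading along, the rule described in the reply above (preprints stay preprints, every other 'posted content' subtype becomes a blog post) can be sketched roughly as follows. This is only an illustration in Python, not the merged translator code: the function name `guess_item_type` and the choice to return `None` for other Crossref types are assumptions. `preprint` and `blogPost` are Zotero's item type names.

```python
from typing import Optional


def guess_item_type(message: dict) -> Optional[str]:
    """Pick a Zotero item type for a Crossref record (illustrative only)."""
    # Other Crossref types are out of scope for this sketch.
    if message.get("type") != "posted-content":
        return None

    # Preprints keep their own item type.
    if message.get("subtype") == "preprint":
        return "preprint"

    # Everything else, including subtype "other", is treated as a blog post.
    return "blogPost"


# The record quoted earlier in this thread has subtype "other".
print(guess_item_type({"type": "posted-content", "subtype": "other"}))
```

If the 'other' subtype starts being used for material that is not a blog post, the last branch is the part that would need revisiting.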