Theses repositories where Zotero Connector detects Journal Articles

  • There unfortunately isn't anything in the page's machine-readable metadata marking it as a thesis — they should be setting citation_dissertation_institution. We're working on major improvements to the "generic" translators that we use for sites without site-specific support, though, and we should be able to address this as part of that effort.
  • @AbeJellinek Thanks for answering. I see in the Full Item page link (https://unbscholar.lib.unb.ca/items/568c48b7-9dbd-4bea-90ae-8154f3220524/full):
    dc.type master thesis

    and I believe this is enough machine-readable metadata marking the item as a thesis.
  • I wish we could use that metadata, but it isn't present on the main page, only on /full, and even there, it's just in a human-readable table with no machine-readable semantic markup. I'm not really sure what the point of having it is.
  • The interface seems to be the front end of a DSpace 7 server; I can see some typical API URLs in my browser's network log.

    There is probably a way to write a translator, maybe one that calls an existing translator, but I can't tell how much work that would involve, as I'm not sure what is available at the moment. There's a significant backlog of new translators waiting for review on GitHub as well...
  • Yeah, it's DSpace, but DSpace is way too diverse to write a single translator for all sites powered by it. The only structured metadata I'm seeing on that site in particular is DataCite XML, which we unfortunately don't have a translator for. (JSON, yes, but XML, no.)
    "There's a significant backlog of new translators waiting for review on GitHub as well..."
    True, although many of those have pending comments that were never addressed by the authors (or are just no longer necessary and should be closed). As you can see from the commit history, we regularly merge new translator PRs!
  • I agree, DSpace instances can be quite different from each other. My unverified hypothesis was that there might be a minimal core on which one could rely. But embedded metadata of passable quality would be easier to deal with, of course, and it is not an unreasonable requirement for repository admins.

    Sorry if my comment was too general: I am aware that updates to at least some existing translators are processed efficiently enough; my perception of the new-translator case is perhaps biased by my own experience. One of my PRs has been waiting for any kind of action for over a year ;-)
  • Another example, although in this case Zotero Connector detects just a webpage rather than a journal article: https://www.theseus.fi/handle/10024/875160
  • Same issue. The metadata is actually pretty decent here, but there's nothing thesis-specific, and you can see the broken 'type' field (DC.type in the metadata) after import, where they try to put three different languages in a single string. There's just no way to reasonably parse stuff like this.
  • Thanks. Anyway, I will continue posting these sites, even if this is not a Zotero issue.

    Some questions:
    - Are these websites badly designed?
    - Is the DataCite schema implemented incorrectly on those pages?
    - Is there a standard (ISO or otherwise) for using DataCite correctly in web pages?
    - On @AbeJellinek's comment about DataCite XML: why can Zotero translate only JSON and not XML? Wouldn't this be a feature worth adding to Zotero, if DataCite XML is as valid as DataCite JSON?
    - If there is a correct way to use DataCite on web pages and some sites (like the examples here) are not following it, is there a way to put pressure on them to fix it? I mean some declaration, or a foundation that oversees these implementations?
  • "Why can Zotero translate only JSON and not XML? Wouldn't this be a feature worth adding to Zotero, if DataCite XML is as valid as DataCite JSON?"
    Because we haven't needed DataCite XML for anything in the past. I took a look at what would be involved in implementing it, and it seems pretty straightforward: just a 1:1-ish mapping to JSON.

    The "dc" on that page stands for Dublin Core, not DataCite. I think you may (understandably) be getting the two confused. Zotero supports importing Dublin Core metadata, but it needs to be in a machine-readable format, not just a table on the page. The actually machine-readable metadata made available by UNB Scholar is DataCite XML, which Zotero unfortunately doesn't yet support.

    In any case, we might be able to start building a translator for relatively standard DSpace sites that handles things like the UNB Scholar Dublin Core metadata. I'll keep this thread updated.
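To illustrate the "1:1-ish mapping" idea, here is a minimal Python sketch that reads a DataCite XML record and produces a JSON-like dictionary. It is not Zotero translator code. The sample record is made up, the kernel-4 namespace is assumed for the sample, and the output keys (`doi`, `creators`, `titles`, `types`) are only loosely modeled on DataCite JSON and should be treated as assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for the sample record below.
NS = {"d": "http://datacite.org/schema/kernel-4"}

# Hypothetical record; not taken from any real repository.
SAMPLE_XML = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.1234/example</identifier>
  <creators>
    <creator><creatorName>Doe, Jane</creatorName></creator>
  </creators>
  <titles><title>An Example Thesis</title></titles>
  <publisher>Example University</publisher>
  <publicationYear>2023</publicationYear>
  <resourceType resourceTypeGeneral="Dissertation">master thesis</resourceType>
</resource>
"""


def datacite_xml_to_dict(xml_text):
    """Map a few DataCite XML elements to a JSON-like dict (illustrative only)."""
    root = ET.fromstring(xml_text)

    def text(path):
        # Return the stripped text of the first match, or None if absent.
        node = root.find(path, NS)
        return node.text.strip() if node is not None and node.text else None

    def texts(path):
        # Return the stripped text of every match that has text.
        return [n.text.strip() for n in root.findall(path, NS) if n.text]

    rtype = root.find("d:resourceType", NS)
    return {
        "doi": text("d:identifier[@identifierType='DOI']"),
        "creators": [{"name": n} for n in texts("d:creators/d:creator/d:creatorName")],
        "titles": [{"title": t} for t in texts("d:titles/d:title")],
        "publisher": text("d:publisher"),
        "publicationYear": text("d:publicationYear"),
        "types": {
            "resourceTypeGeneral": rtype.get("resourceTypeGeneral") if rtype is not None else None,
            "resourceType": text("d:resourceType"),
        },
    }


record = datacite_xml_to_dict(SAMPLE_XML)
print(record["types"])
```

In a record like this one, the `resourceType` element is the place where a thesis could be recognized, which is the piece missing from the page's other metadata.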
  • "The 'dc' on that page stands for Dublin Core, not DataCite. I think you may (understandably) be getting the two confused."
    Indeed. Sorry. My fault.
    "In any case, we might be able to start building a translator for relatively standard DSpace sites that handles things like the UNB Scholar Dublin Core metadata. I'll keep this thread updated."
    Thanks!
  • Then my last question becomes whether there is some way to ask sites that expose Dublin Core information to provide it in a machine-readable format. Could we say that Dublin Core that is not machine-readable is not useful at all?
  • Yeah, any metadata (Dublin Core, or other formats like Highwire, which is the name for the citation_title etc. tags) should be in meta tags in the page header.
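As a rough illustration of why meta tags matter, the Python sketch below collects `<meta>` tags from a page and guesses an item type from them. The function `guess_item_type`, its rules, and both sample pages are hypothetical; this is not how Zotero's own detection is implemented. It only shows that the same "master thesis" value is usable when it sits in a meta tag and invisible to this kind of parser when it sits in a table.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content:
            # Store names in lowercase; repeated tags accumulate in a list.
            self.meta.setdefault(name.lower(), []).append(content)


def guess_item_type(html):
    """Hypothetical heuristic, not Zotero's actual detection logic."""
    collector = MetaTagCollector()
    collector.feed(html)
    meta = collector.meta
    if "citation_dissertation_institution" in meta:
        return "thesis"
    if any("thesis" in value.lower() for value in meta.get("dc.type", [])):
        return "thesis"
    if "citation_journal_title" in meta:
        return "journalArticle"
    return "webpage"


# Hypothetical page with metadata in the header.
WITH_META = """
<html><head>
<meta name="citation_title" content="An Example Thesis">
<meta name="citation_dissertation_institution" content="Example University">
<meta name="DC.type" content="master thesis">
</head><body></body></html>
"""

# Hypothetical page with the same information only in a visible table.
TABLE_ONLY = """
<html><head><title>An Example Thesis</title></head>
<body><table><tr><td>dc.type</td><td>master thesis</td></tr></table></body></html>
"""

print(guess_item_type(WITH_META))   # thesis
print(guess_item_type(TABLE_ONLY))  # webpage
```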
  • Another example where only a webpage is detected:
    https://jyx.jyu.fi/jyx/Record/jyx_123456789_100515
  • @iagogv: That page has COinS metadata that gets prioritized over Embedded Metadata (for mostly historical reasons). Right-click the Zotero Connector toolbar button -> Save to Zotero -> Embedded Metadata.
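For readers unfamiliar with COinS: it embeds an OpenURL ContextObject as a URL-encoded string in the `title` attribute of a `<span class="Z3988">`. The Python sketch below shows how such a span can be decoded. The sample span is made up and is not copied from the page linked above, and the class name `CoinsCollector` is hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsCollector(HTMLParser):
    """Collect decoded COinS records from <span class="Z3988" title="...">."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attr.get("title"):
            # HTMLParser has already unescaped &amp; in the attribute value,
            # so the title can be parsed directly as a query string.
            self.records.append(parse_qs(attr["title"]))


# Hypothetical COinS span, for illustration only.
SAMPLE = (
    '<span class="Z3988" title="ctx_ver=Z39.88-2004'
    '&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc'
    '&amp;rft.title=An+Example+Thesis'
    '&amp;rft.type=Thesis"></span>'
)

collector = CoinsCollector()
collector.feed(SAMPLE)
record = collector.records[0]
print(record["rft.title"])  # ['An Example Thesis']
print(record["rft.type"])   # ['Thesis']
```

A COinS record usually carries only a handful of fields like these, so the page's embedded meta tags may contain more detail.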