Handle other blog post sources

edited August 23, 2023
Currently, the Zotero connector's behavior for **any** site which doesn't have:

<meta name=generator content="Wordpress">

in <head> is to assume it is a webpage. This means that the item type is Web Page instead of Blog Post. Since there are now **many** alternative generators of blog posts which can use the same metadata, it makes sense to update this behavior.

I myself use Hugo, but other SSGs (static site generators) like Jekyll are also popular. I am currently forcing the generator meta tag to say Wordpress in my theme, but this is silly. It would make more sense for the Zotero connector to accept a wider range of generator values.

EDIT: There are actually a few more supported generators: https://github.com/zotero/translators/blob/b6eb8802779a538752435f567a6c1461d87cdfac/Embedded Metadata.js#L306-L313 but they are still inadequate.

This PR fixes things: https://github.com/zotero/translators/pull/3116
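
For illustration, here is a rough sketch of the kind of generator check being discussed. It is not the actual Embedded Metadata translator code: the function names, the regex-based parsing, and the Hugo version string are all made up for this example. The first list contains the generators named in this thread as already recognized; the second list is only an assumption about what a wider list might include.

```javascript
// Hypothetical sketch; NOT the actual Embedded Metadata translator code.
// Generators mentioned in this thread as already treated as blog platforms.
const KNOWN_BLOG_GENERATORS = ["blogger", "wordpress", "wooframework"];
// Assumption: static site generators that a wider list might add.
const PROPOSED_BLOG_GENERATORS = ["hugo", "jekyll"];

// Return the lowercased content of <meta name="generator" ...>, or null.
// A regex is enough for a sketch; a real translator would query the DOM.
function getGenerator(html) {
  const tags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of tags) {
    // Accept quoted and unquoted attribute values.
    if (!/\bname\s*=\s*["']?generator["']?(?=[\s>\/])/i.test(tag)) continue;
    const m = tag.match(/\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i);
    if (m) return (m[1] ?? m[2] ?? m[3]).toLowerCase();
  }
  return null;
}

// Choose a Zotero item type from the generator value alone.
function guessItemType(html, blogGenerators = KNOWN_BLOG_GENERATORS) {
  const generator = getGenerator(html);
  if (generator && blogGenerators.some((g) => generator.includes(g))) {
    return "blogPost";
  }
  return "webpage";
}

// Example: the same Hugo page under the narrow and the wider list.
const hugoPage = '<head><meta name="generator" content="Hugo 0.115.4"></head>';
console.log(guessItemType(hugoPage)); // webpage
console.log(
  guessItemType(hugoPage, [...KNOWN_BLOG_GENERATORS, ...PROPOSED_BLOG_GENERATORS])
); // blogPost
```

Note that a generator like Hugo also builds sites that are not blogs, so matching on the generator value alone could mislabel some pages; a real change would likely need additional checks.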


EDIT 2: Reworked the discussion to clarify that "less metadata" is not being saved
  • Currently, the Zotero connector for **any** site which doesn't have:

    <meta name=generator content="Wordpress">

    in <head> is to assume it is a webpage.
    I mean, that's obviously not true — Zotero has over 700 translators, and the Embedded Metadata translator itself will save high-quality metadata on countless sites that provide it. If you're seeing a problem with a particular site, please provide an example URL.
  • Note that the PR actually just modifies the Embedded Metadata translator to correctly recognize blog posts.

    Consider the following (same theme, inspect the HTML to see the meta tag):
    1. https://64e0caf1c4664500080e4fe5--rgoswami.netlify.app/posts/fortran-oop-python/
    ^--- Without the meta tag, or with Hugo as the generator.
    2. https://64e57527c3e56b000884e3d4--rgoswami.netlify.app/posts/fortran-oop-python/
    ^--- With one of the "blog post" generators (WooFramework), but Blogger or WordPress work equally well (as defined in the code, see earlier link).

    If you try to import (1) and (2) the difference should be clear.

    (1) is of item type Web Page, with metadata populated accordingly
    (2) is of item type Blog Post, with (similar) metadata


    The issue here is that (2) is the correct item type, and so the PR fixes that.
  • edited August 23, 2023
    Right, I understand what the patch does, but that wasn't your description — you said Zotero was saving "much less metadata". Obviously there's a limited set of generator values currently, and we can add more so that "Blog Post" is used for more things (though as Zoë notes on GitHub, that could be incorrect on some sites, so we have to do it carefully), but the only difference in your example is the item type. So if you have an example where "much less metadata" is saved, you should share that.
  • I see, apologies for the confusion, I've updated the original post.