Parsing of author from URLs submitted to ZoteroBib

Hello,

How does zbib.org parse the author from a submitted URL?

I tried submitting my blog post URL to it: https://scripter.co/splitting-an-org-block-into-two/, but the author field as parsed by zbib stays empty.

I use Microformats2 h-card to store the author metadata on my pages.

Would you guys add support for microformats2: http://microformats.org/wiki/microformats-2 ?

If not, how should the author metadata be present on web pages?

Thanks.
  • edited September 18, 2018
    We have some code under review to add microformats, but we haven't seen much useful metadata in microformats, and there's a bit of a challenge in getting the right metadata from microformat. The most straightforward way to add authors that's picked up by Zotero/zbib is DC.creator or DC.contributor.
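
As a rough illustration of the DC.creator route, here is a minimal Python sketch that collects author names from `<meta name="DC.creator">` and `<meta name="DC.contributor">` tags in a page's HTML. It is not how Zotero or zbib implements this; the class name `DCAuthorParser`, the helper `dc_authors`, and the sample author "Jane Doe" are made up for the example.

```python
from html.parser import HTMLParser


class DCAuthorParser(HTMLParser):
    """Collect the content of DC.creator / DC.contributor <meta> tags."""

    def __init__(self):
        super().__init__()
        self.authors = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Compare case-insensitively so "DC.Creator" and "DC.creator" both match.
        name = (attr.get("name") or "").lower()
        if name in ("dc.creator", "dc.contributor") and attr.get("content"):
            self.authors.append(attr["content"])


def dc_authors(html):
    """Return the author strings found in the given HTML text (hypothetical helper)."""
    parser = DCAuthorParser()
    parser.feed(html)
    return parser.authors


# Hypothetical page head carrying a Dublin Core author tag.
sample = (
    '<html><head>'
    '<meta name="DC.creator" content="Jane Doe">'
    '</head><body></body></html>'
)
print(dc_authors(sample))
```

On the publishing side, this means a page only needs one such `<meta>` tag per author in its `<head>`.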
  • > there's a bit of a challenge in getting the right metadata from microformat.

    Is it a coding challenge? Otherwise, microformats2 is a well-defined standard.

    You can find a variety of mf2 parsers here: http://microformats.org/wiki/parsers. I think the most up-to-date one is the PHP version: php-mf2 ( https://github.com/microformats/php-mf2 ).

    h-card allows the author information to be embedded in rich detail. Here's how the h-card gets parsed from the same link: https://indiewebify.me/validate-h-card/?url=https://scripter.co/splitting-an-org-block-into-two/

    Would you please update this thread if you end up adding the h-card parsing?

    > DC.creator or DC.contributor

    OK. This is the first time I've heard of that metadata. I'll consider adding that too. It's crazy how everyone creates their own standards... I have opengraph + twitter cards + microformats2 metadata, and now I'm adding DC to the family :P
  • I just added the DC.Creator metadata and can confirm that it works. Thanks!
  • Dublin Core is one of the most widely used and oldest metadata formats on the web, but I agree the current situation isn't ideal.
    Zotero does also try to parse og metadata; not sure why we didn't pick up authors there.

    The problem with microformats is not that they aren't well defined (as you point out, they are) but that they refer to _elements_ not to _pages_. So it's possible a microformat tag would refer to the authors of a linked or previewed post or even a cited work and it's tricky to tell which it is.
  • > Zotero does also try to parse og metadata; not sure why we didn't pick up authors there.

    Sorry, I am using og only for images and media. I will look into improving the og article:author metadata (trying to figure out what a "profile array" means at the moment.)

    > The problem with microformats is not that they aren't well defined (as you point out, they are) but that they refer to _elements_ not to _pages_. So it's possible a microformat tag would refer to the authors of a linked or previewed post or even a cited work and it's tricky to tell which it is.

    The microformats2 metadata is hierarchical. Here's a detailed algorithm for determining the article authorship: https://indieweb.org/authorship#How_to_determine
  • Thanks, that's useful, we'll look at that.
  • edited September 19, 2018
    You are welcome.

    For example, on my page I linked above, these steps are followed:

    1. Finds h-entry
    2. Parses h-entry
    3. Finds u-author inside that h-entry
    4. [skipped]
    5. (author property i.e. u-author found)
    5.1. [skipped] not an h-card
    5.2. author property is u-author and thus a URL. The URL is found (my root domain) and author-page is set to that.
    6. [skipped]
    7. (author page present)
    7.1. Parses author-page
    7.2. Finds h-card on author-page

    Uses that h-card for author info.
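
The steps above can be sketched in Python. This is a simplified reading of the walk-through, not the full algorithm from indieweb.org: it only looks at top-level items, takes the first author value, and accepts an h-card on the author page only if its `url` property matches the author URL. The input is assumed to be already-parsed microformats2 data in the canonical JSON shape (a dict with an `items` list), such as a parser like mf2py should produce. The function names, the `fetch_parsed` callback, and the example.com data are hypothetical.

```python
def find_first(items, mf_type):
    """Return the first top-level item of the given type (no nested search)."""
    for item in items:
        if mf_type in item.get("type", []):
            return item
    return None


def determine_author(page, fetch_parsed):
    """Return the author's h-card dict, or None if it cannot be determined.

    page         -- parsed mf2 data for the post (dict with an "items" list)
    fetch_parsed -- hypothetical callable: URL -> parsed mf2 data for that URL
    """
    # Steps 1-2: find and inspect the h-entry.
    entry = find_first(page.get("items", []), "h-entry")
    if entry is None:
        return None

    # Step 3: look for an author property inside the h-entry.
    authors = entry.get("properties", {}).get("author", [])
    if not authors:
        return None
    author = authors[0]

    # Step 5.1: an embedded h-card is used directly.
    if isinstance(author, dict) and "h-card" in author.get("type", []):
        return author

    # Step 5.2: a plain URL (u-author) becomes the author page.
    if isinstance(author, str) and author.startswith(("http://", "https://")):
        # Steps 7.1-7.2: parse the author page and look for a matching h-card.
        author_page = fetch_parsed(author)
        for item in author_page.get("items", []):
            urls = item.get("properties", {}).get("url", [])
            if "h-card" in item.get("type", []) and author in urls:
                return item
    return None


# Hypothetical parsed data: a post whose u-author points at the site root.
post = {"items": [{"type": ["h-entry"],
                   "properties": {"name": ["A post"],
                                  "author": ["https://example.com/"]}}]}
home = {"items": [{"type": ["h-card"],
                   "properties": {"name": ["Jane Doe"],
                                  "url": ["https://example.com/"]}}]}

card = determine_author(post, lambda url: home)
print(card["properties"]["name"][0])
```

Passing the fetch step in as a callback keeps the sketch free of network code, so it can be tried with canned data as shown.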