Feeds: Quality of metadata

I have tested the feeds of different journals in my field. For some of them, the main metadata displayed in the columns in the centre pane is extracted nicely (Title, Creator, Date, Publication, ...). But in most cases, a large part of the extracted metadata is not correctly attributed to the different fields, and usually ends up all together in the "Abstract" field. In the worst case, only the title is extracted correctly.

Is there a way to improve the metadata extraction for each journal feed, as with the Zotero Connector?
Or is the problem coming from the publisher of the feed? In that case, what is the best way to report the problem to them? Is there any standard that they should follow so that it works nicely with Zotero?
  • The metadata that's displayed for unsaved items for feeds isn't the same as the data that Zotero saves. The former is just standard feed metadata that comes from the feed, and you should see the same in any feed reader. When you save, it should be more or less equivalent to going to the page and saving using the Zotero Connector.
  • Thank you for your explanations. The metadata I was discussing is the feed metadata displayed in the centre pane of the Zotero Feeds. It is good to know that the metadata used when importing to Zotero relies on the Zotero translators and therefore should be more reliable.

    Then the problem comes from the publishers. It seems that they format the data themselves so that it looks OK in some feed readers, without correctly filling in the feed metadata that is used by Zotero.
    I will contact some publishers to see if they can fix their feeds.

    I can probably use the feed of Journal of Fluid Mechanics as a good example to follow (except the inlineFormula that cannot be displayed in Zotero) to explain what is needed:
    https://www.cambridge.org/core/rss/product/id/1F51BCFAA50101CAF5CB9A20F8DEA3E4

    Elsevier seems to use its own ad hoc formatting to put all the metadata in the description element, without filling in the standard metadata elements.
    For example, the RSS feed of the Journal of Computational Physics: https://rss.sciencedirect.com/publication/science/00219991

    The AIP journal Physics of Fluids also uses its own formatting for the information put in the description element:
    https://aip.scitation.org/action/showFeed?type=etoc&feed=rss&jc=phf

    The feed of the Springer journal Experiments in Fluids does not mention the authors at all:
    https://link.springer.com/search.rss?facet-content-type=Article&facet-journal-id=348&channel-name=Experiments in Fluids
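
To illustrate the difference between a well-tagged feed item and one that puts everything in the description, here is a minimal sketch that checks which common metadata elements are filled in for each item. It makes several assumptions: the sample feed, the `FIELDS` mapping, and the `audit_feed` helper are all hypothetical, and the exact set of elements that Zotero's feed reader looks at is not confirmed here. Dublin Core elements such as `dc:creator` and `dc:date` are used only as a plausible example of separately tagged metadata.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, commonly used in RSS feeds for creator/date elements.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical feed: one item with separate metadata elements, and one item
# that stuffs everything into <description> (as described in the post above).
SAMPLE_FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Journal</title>
    <item>
      <title>A well-tagged article</title>
      <link>https://example.org/a</link>
      <dc:creator>A. Author</dc:creator>
      <dc:date>2022-08-29</dc:date>
      <description>Abstract text only.</description>
    </item>
    <item>
      <title>A poorly tagged article</title>
      <link>https://example.org/b</link>
      <description>Author(s): A. Author. Publication date: 2022. Abstract text.</description>
    </item>
  </channel>
</rss>"""

# Assumed mapping from a displayed field to the elements that could supply it.
FIELDS = {
    "title": ["title"],
    "creator": ["dc:creator", "author"],
    "date": ["dc:date", "pubDate"],
    "abstract": ["description"],
}


def has_text(item, tag):
    """Return True if the item has a direct child `tag` with non-empty text."""
    element = item.find(tag, NS)
    return element is not None and bool((element.text or "").strip())


def audit_feed(feed_xml):
    """Map each item title to the list of fields with no populated element."""
    root = ET.fromstring(feed_xml)
    report = {}
    for item in root.iter("item"):
        title = (item.findtext("title") or "(untitled)").strip()
        report[title] = [
            field
            for field, tags in FIELDS.items()
            if not any(has_text(item, tag) for tag in tags)
        ]
    return report


if __name__ == "__main__":
    for title, missing in audit_feed(SAMPLE_FEED).items():
        print(title, "-> missing:", ", ".join(missing) or "nothing")
```

A report like this, run on a saved copy of a publisher's feed, could be attached when contacting the publisher to show exactly which elements are empty.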
  • edited August 29, 2022
    In my experience, what comes through with the publisher RSS feed is _not_ the metadata that Zotero captures when the items are moved from the feed view into a Zotero collection. As @dstillman said above, the metadata you get from moving a feed item is almost always identical to what you would get if you went to the article itself on the publisher's website and clicked the Zotero import button in your browser. In my opinion the purpose of a feed is to allow a viewer to learn what is new and to go to the source to read the full article. Zotero eliminates at least one step for you by saving you the visit to the publishers' websites to download the desired article metadata yourself.

    Items inside the RSS feed "collections" are not yet fully integrated into your true Zotero library collections. Zotero feeds are great for capturing newly published articles. However, Zotero recognizes that a user will seldom want to add everything in every journal issue to their library. Zotero lets you select only the items you need and allows you to place them in the collection you choose.