Initial questions about article abstracts

Hello Zotero crew!

I have a gazillion questions about article abstracts in Zotero. I'll start with just a few questions.

Q1) Is the only place where article abstracts show up in the "Abstract" area, below the "Info" area, above the "Attachment" area, on the far right side of the Zotero app, immediately to the right of the list of titles?

I'm using 7.0 as a Flatpak and it looks more or less like what I see when I use Zotero in a browser via the website. I just see the abstract in plain text without formatting.

Q2) How can I know whether the abstract data for an article is coming from Crossref?

Q3) Do abstracts usually come from Crossref, or from other sources?

Q4) Are the .js files in github.com/zotero/translators the first point of code where abstract data enters the Zotero system? How does Zotero decide which translator to use? Are these run locally when I do a search to add a new reference item to my Zotero library?

Q5) Is github.com/zotero/zotero/blob/main/chrome/content/zotero/elements/abstractBox.js the code that actually renders/outputs the abstract box I see in Q1?

Q6) Does 5 count as a "few questions"? Are they even really 5 questions? Or more like 8?

Cheers,
Castedo

PS I love y'all's work! Zotero is one of the few services that I am willing to pay a subscription for because I like Zotero so much, at so many levels.
  • Happy to answer these, but could you provide some background about what you are actually after? The type of answer that's going to be helpful varies depending on what you are trying to achieve.
  • Ah yes, I should clarify that my question is not as a user (even though I am one).

    I am developing an authoring/archiving/self-publishing tool (https://try.perm.pub/baseprinter/) that one can think of as more or less like a special kind of preprint for a distributed archive federation. Sometimes these "preprints" would get registered with Crossref, sometimes not.

    I want to help authors have an idea what their abstract will end up looking like in destinations like Zotero. I also want to help them NOT archive/self-publish abstracts that will look terrible due to using unsupported/obsolete formatting features.

    So for example, I'm planning to NOT allow tables in an abstract. I'm now systematically going through the JATS features that do or do not make sense to archive in an abstract based on today's levels of support/implementation.
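
A minimal sketch of that kind of check, assuming the abstract arrives as a well-formed JATS XML string. The function name `disallowed_elements` and the `DISALLOWED` set are hypothetical names for illustration only. The set holds just the two JATS table elements, since tables are the only exclusion decided so far.

```python
import xml.etree.ElementTree as ET

# Hypothetical deny-list for illustration: only the JATS table elements,
# because tables are the one feature named as excluded so far.
DISALLOWED = {"table-wrap", "table"}


def disallowed_elements(abstract_xml):
    """Return the sorted names of disallowed elements found in the abstract."""
    root = ET.fromstring(abstract_xml)
    # Strip any "{namespace}" prefix so namespaced and plain tags compare equal.
    names = {el.tag.rsplit("}", 1)[-1] for el in root.iter()}
    return sorted(names & DISALLOWED)


sample = (
    "<abstract><p>Results are in the table.</p>"
    "<table-wrap><table><tbody><tr><td>1</td></tr></tbody></table></table-wrap>"
    "</abstract>"
)
print(disallowed_elements(sample))
```

An empty list would mean the abstract passes this particular check. Other elements could be added to the set as the review of JATS features progresses.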

    An example of a preview (via GitHub) is https://castedo.github.io/baseprint-example/ So my idea is to have another button or something showing what the abstract would more-or-less look like, say as plain text in Zotero. Or perhaps as HTML if Zotero implements showing abstracts in HTML.
  • OK, thanks, that's helpful.
    1) Yes
    2) The Library Catalog field contains, in most cases, the source of the metadata.
    3) Depends on how you import: when you add items by dragging the PDF to Zotero or by using the DOI in the magic wand field, the metadata and abstract will almost always come from CrossRef. On the other hand, if you're using the Browser connector, the abstract would typically come from the site you're importing from, whether that's the journal's publisher (e.g. Sciencedirect, Cambridge Core, etc.) or a catalog like Pubmed or WebOfScience.
    4) Yes. The translators are run based on their priority (lower = first) when their target applies. In other words, web translators (the majority of translators in that folder) are only run when the target (regex) matches the URL in the browser and are only ever run for web import. For dragging PDFs to Zotero, you're going to see mostly the DOI translator, yes. If you have Zotero installed, all translators are run locally, yes.
    5) I think so, yes.
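
The selection rule in answer 4 can be sketched roughly as follows. This is an illustration of the idea only (filter by target regex, then order by priority), not Zotero's actual code: the `Translator` class, the `candidates_for_url` function, and the example targets and priorities are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Translator:
    # Hypothetical stand-in for a translator's metadata, not Zotero's data model.
    label: str
    target: str    # regex matched against the page URL
    priority: int  # lower number = tried first


def candidates_for_url(translators, url):
    """Return translators whose target matches the URL, lowest priority number first."""
    matching = [t for t in translators if re.search(t.target, url)]
    return sorted(matching, key=lambda t: t.priority)


# Made-up translators and URLs, purely for illustration.
translators = [
    Translator("Generic fallback", r"^https?://", 300),
    Translator("Example Journal Site", r"^https?://journal\.example\.org/", 100),
    Translator("Other Site", r"^https?://other\.example\.com/", 100),
]

ordered = candidates_for_url(translators, "https://journal.example.org/article/42")
print([t.label for t in ordered])
```

With these made-up entries, the site-specific translator is ordered ahead of the generic one because its priority number is lower, and the translator for the other site is skipped because its target does not match the URL.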


    To your larger question: Zotero doesn't import or use any HTML or any other markup in abstracts (it does import and handle a very limited HTML subset in titles). Abstracts are stored and displayed in plain text.
  • I've discovered that Clarivate WebOfScience also shows abstracts as just plain text and seems to truncate them too. I'm not sure if this is typical or in how many other places abstracts appear.

    A conclusion I'm drawing is that abstracts are most likely normally shown in a plain text format after redistribution (in contrast to all the pretty formatting they might have in the original paper).

    I have collected a list of cases, at many different levels of severity, of abstracts where the JATS is not converted to text in a desirable way. I will post the most severe cases in separate forum discussions (unless you have a different preference).

    After a few of these I might like to document what the expected conversions from JATS to plain text are. And then at some point I'd like to offer this expected conversion as a preview in my authoring tool.
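
As a starting point for such a preview, here is a minimal sketch of one possible JATS-to-plain-text conversion. It is an assumption about what a reasonable conversion might look like, not what Zotero or any other destination actually does. It assumes the abstract is a fragment using the `jats:` prefix without a namespace declaration (as in abstracts served by Crossref), so the fragment is wrapped in a root element that binds the prefix. The function name `jats_abstract_to_text` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI conventionally bound to the "jats:" prefix.
JATS_NS = "http://www.ncbi.nlm.nih.gov/JATS1"


def jats_abstract_to_text(fragment):
    """Flatten a JATS abstract fragment to plain text, one line per top-level block."""
    # Wrap the fragment so the undeclared "jats:" prefix can be parsed.
    root = ET.fromstring(f'<root xmlns:jats="{JATS_NS}">{fragment}</root>')
    if len(root) == 0 or (root.text or "").strip():
        units = [root]      # bare text, no block wrappers: flatten everything
    else:
        units = list(root)  # one output line per top-level element
    # Drop all markup and collapse runs of whitespace within each unit.
    lines = [" ".join("".join(u.itertext()).split()) for u in units]
    return "\n".join(line for line in lines if line)


sample = (
    "<jats:title>Abstract</jats:title>"
    "<jats:p>Water is H<jats:sub>2</jats:sub>O,\n"
    "   as <jats:italic>everyone</jats:italic> knows.</jats:p>"
)
print(jats_abstract_to_text(sample))
```

Note that the subscript in the sample is flattened to "H2O" and the italics disappear entirely, which is exactly the kind of loss a preview could surface to authors before they archive.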

    Thanks for the answers! That really helps me narrow down the direction I plan to explore.