Automatically filling in the 'Abstract' field

Hello,
I like to have the abstract field of my references filled in. I always opt to add citations with abstracts whenever possible (some research databases, like IEEE and Scopus, offer this option when exporting citations from their own sites).

So, I wanted to know whether a feature could be added to fill in this field automatically when Zotero indexes PDFs. I think it would be relatively straightforward to do that with a regular expression on the extracted text.
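A rough sketch of what such a regular expression could look like, in Python. It is not anything Zotero does; the pattern and the function name `guess_abstract` are hypothetical, and it assumes the extracted text contains a literal "Abstract" heading followed by a "Keywords", "Index Terms", or "Introduction" heading. Many PDFs will not fit that assumption.

```python
import re
from typing import Optional

# Hypothetical pattern: assumes an "Abstract" heading, optionally followed by
# punctuation (":", ".", "-", en/em dash), and that the abstract ends right
# before a line starting with "Keywords", "Index Terms", or "1 Introduction".
ABSTRACT_RE = re.compile(
    r"\bAbstract\b\s*[:.\-\u2013\u2014]*\s*(?P<body>.+?)"
    r"(?=\n\s*(?:Keywords|Index Terms|(?:1|I)\.?\s+Introduction)\b)",
    re.IGNORECASE | re.DOTALL,
)


def guess_abstract(extracted_text: str) -> Optional[str]:
    """Return a best-guess abstract from extracted PDF text, or None."""
    match = ABSTRACT_RE.search(extracted_text)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces left over from extraction.
    return " ".join(match.group("body").split())
```

Words hyphenated across line breaks, two-column layouts, and titles that happen to contain the word "abstract" would all need extra handling, so the result should be treated as a guess to review rather than as reliable metadata.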
  • How are you importing to Zotero? From most decent sources (including Scopus and IEEE), you should get an abstract already.
    Grabbing it from the PDF after the fact isn't a good idea, no. PDF text extraction is a bit problematic (PDF is a surprisingly bad text format), regular expressions can misfire, etc. The general idea is to eventually be able to go to article pages and grab missing information, including abstracts, but that's still a bit off.
  • Hi,
    Yeah, I meant when I download the PDF files first rather than the citation file.

    So, given your answer, creating a tool for automatically highlighting words in a PDF file based on some criteria would be a bad idea?
  • No, individual words or short phrases would likely work well, if that's something you're interested in. Larger blocks of text less so.

    (You should generally avoid going through Retrieve Metadata if you can: it gives you much worse data than importing via the Save to Zotero icon, which in many cases will also include the PDF as an attachment.)
  • Hi,
    What is the correct way to add PDF files, then?

    What I currently do is add the citation (when available) from a research database (e.g. Scopus or Google Scholar). It provides very tidy info. Later, if (and when) I need the actual PDF file, I drag and drop its link from Firefox's location bar onto the corresponding Zotero item.

    When I only have the PDF file (no citation), I use the Save to Zotero option available in Firefox's download dialog. However, this only saves the file, without adding any item metadata. When I mark the 'retrieve metadata' checkbox, it does create item metadata, but the data is usually incomplete.

    So, what would the most productive way of doing this be?
  • The first is fine. What I usually do is go to the actual publisher and import via the URL bar icon -- that will import the data and attach the PDF in one go. It also gives you pretty good data, including the abstract almost all of the time.
  • So, would the reverse process work? That is, starting from an item's DOI, access the reference online and refill all of the item's data fields so they match the correct ones provided by the publisher? It would be extremely useful to have this in Zotero for rebuilding items you mistakenly added the wrong way (the user would only need to add the DOI for all the data to be fetched again).
  • Updating references with new metadata is a planned feature in general, but I wouldn't expect it in the near future.
  • Is there an update on this?
  • Much closer, but still no specific ETA.
  • Can I implement hydration for you on a contract basis?
  • This is effectively done, just waiting on some minor touch-ups (see https://github.com/zotero/zotero/pull/1582 if you're interested in the implementation and some of the complexity that went into getting this right).
  • Can I somehow do this now to hydrate abstracts?
  • Generally speaking, yes (you'll note some people in the issue have done this): you'd want to check out that branch and then build Zotero from source. Instructions are in the dev section of the documentation and directly on the Zotero GitHub repo.
    Clearly, that's not a workflow intended for general usage, so you're mostly on your own for getting it to run (or you can wait until it lands in regular Zotero, or at least the beta version).
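
For anyone who only wants to look up an abstract from a DOI in the meantime, here is a minimal Python sketch that works outside Zotero. It is not the implementation in the pull request above; it assumes the public Crossref REST API at https://api.crossref.org/works/, which returns an "abstract" field (as JATS-tagged text) only when the publisher has deposited one. The function names are made up for illustration.

```python
import html
import json
import re
import urllib.parse
import urllib.request
from typing import Optional

CROSSREF_WORKS = "https://api.crossref.org/works/"


def fetch_crossref_record(doi: str) -> dict:
    """Download the Crossref record for a DOI (needs network access)."""
    url = CROSSREF_WORKS + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as response:
        # Crossref wraps the actual record in a "message" object.
        return json.load(response)["message"]


def abstract_from_record(record: dict) -> Optional[str]:
    """Return the abstract as plain text, or None if none was deposited."""
    raw = record.get("abstract")
    if not raw:
        return None
    # Abstracts arrive with JATS tags such as <jats:p>; strip them crudely.
    text = re.sub(r"<[^>]+>", " ", raw)
    return " ".join(html.unescape(text).split())
```

Calling `abstract_from_record(fetch_crossref_record(doi))` with an item's DOI gives text that can be pasted into the item's Abstract field by hand. Many records have no abstract at all, in which case the function returns None.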