Pubmed abstract missing for retrieved PDF metadata

RaimundE · August 2, 2013

I finally started using Zotero after looking into it again and again over the years, and I really think it's great - thanks to everyone who put work into this program!

I just have one problem: I filled most of my DB by throwing hundreds of PDFs into Zotero and retrieving the metadata automatically. That's a great feature, but I just realized that the abstracts are missing for those entries.
I learned on the forum that Google Scholar doesn't store abstracts, and that's probably where the metadata was retrieved from.

Is there a way to
- add my missing pubmed abstracts in a batch operation
- find a way of retrieval for the future that would include abstracts?

Thanks for your comments

Raimund

adamsmith · August 2, 2013

Unfortunately, neither is possible at this time.
Metadata comes from CrossRef or Google Scholar, neither of which includes abstracts, and there is currently no option in Zotero to complete metadata from other sources. It might happen at some point, but not anytime soon.

adamsmith · August 2, 2013

Obviously, in the future, if you import data with the URL bar icon from either PubMed or most publishers' sites, it will include the abstract.

RaimundE · August 3, 2013

Adam,

thanks for your fast reply! That missing abstract is a pity as it makes the "retrieve metadata" functionality a lot less useful.

- Is there any database that includes both, PMID AND DOI and that I could use in a script to get my abstracts?

- It seems Endnote is able to identify metadata, including the abstract, for a PDF you import. Do you know where they get the data from?

- It also has an option "Find reference updates" that can be used to fill in missing abstracts. I'll write items for both in the feature request section.

Thanks again!
Raimund

adamsmith · August 3, 2013

- Is there any database that includes both, PMID AND DOI and that I could use in a script to get my abstracts?

Pubmed.
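For anyone who wants to script this, here is a minimal sketch. It assumes the NCBI E-utilities `esearch` and `efetch` endpoints and the `[AID]` (article identifier) search tag for DOIs. The helper names (`esearch_url`, `parse_abstract`, `fetch_abstract`) are made up for illustration; this is not something Zotero does itself, and NCBI's usage limits would apply to any batch run.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL of the NCBI E-utilities (assumed reachable from your network).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(doi):
    """Build a PubMed search URL that looks a DOI up as an article identifier."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": f"{doi}[AID]", "retmode": "xml"}
    )
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(pmid):
    """Build a URL that fetches the full PubMed record for one PMID as XML."""
    query = urllib.parse.urlencode({"db": "pubmed", "id": pmid, "retmode": "xml"})
    return f"{EUTILS}/efetch.fcgi?{query}"


def parse_pmids(esearch_xml):
    """Return the list of PMIDs contained in an esearch XML response."""
    root = ET.fromstring(esearch_xml)
    return [node.text for node in root.findall("./IdList/Id")]


def parse_abstract(efetch_xml):
    """Join all AbstractText sections of an efetch XML response."""
    root = ET.fromstring(efetch_xml)
    parts = []
    for node in root.iter("AbstractText"):
        text = "".join(node.itertext()).strip()
        label = node.get("Label")  # structured abstracts label each section
        parts.append(f"{label}: {text}" if label else text)
    return "\n".join(parts)


def fetch_abstract(doi):
    """Return the abstract for a DOI, or None unless exactly one PMID matches."""
    with urllib.request.urlopen(esearch_url(doi)) as resp:
        pmids = parse_pmids(resp.read())
    if len(pmids) != 1:
        return None
    with urllib.request.urlopen(efetch_url(pmids[0])) as resp:
        return parse_abstract(resp.read()) or None
```

Calling `fetch_abstract` with the DOI stored in an item would return the abstract text, or `None` when PubMed has no unambiguous match. Writing the result back into the Zotero item is a separate step not covered here.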

It seems Endnote is able to identify metadata, including the abstract, for a PDF you import. Do you know where they get the data from?

My guess would be PubMed, which would make their feature work for far fewer papers but with better data. There's no way to know for sure, though.

Don't bother writing these up as feature requests. Devs read all threads anyway. The reference update feature is planned for the medium to long term, and if we find a way to improve the data retrieved for PDFs, it'll be implemented.

RaimundE · August 3, 2013

Thanks again. What would be great is an option to rank the lookup engines for the metadata search, so it could be set to start with PubMed and only use Google Scholar when PubMed returns nothing.

adamsmith · August 3, 2013

We could maybe consider using PubMed before CrossRef where we get a DOI, but it won't work as a replacement for Google Scholar (ever), since we take advantage of the fact that GS indexes the full text of articles, which PubMed doesn't.
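To make the ordering idea concrete, here is a rough sketch of a ranked lookup chain. Everything in it is hypothetical: the three `*_stub` functions are stand-ins that return canned data, not real clients, and nothing here reflects how Zotero's Retrieve Metadata is actually implemented.

```python
from typing import Callable, List, Optional, Tuple

# An engine takes whatever was extracted from the PDF (a DOI, some text)
# and returns a metadata dict, or None when it finds no match.
Engine = Callable[[dict], Optional[dict]]


def pubmed_stub(hints: dict) -> Optional[dict]:
    # Hypothetical stand-in: needs a DOI and only knows one example record.
    if hints.get("doi") == "10.1000/in-pubmed":
        return {"title": "Example", "abstract": "Example abstract."}
    return None


def crossref_stub(hints: dict) -> Optional[dict]:
    # Hypothetical stand-in: matches any DOI but returns no abstract.
    if hints.get("doi"):
        return {"title": "Example"}
    return None


def scholar_stub(hints: dict) -> Optional[dict]:
    # Hypothetical stand-in: works from extracted full text, no DOI needed.
    if hints.get("text"):
        return {"title": hints["text"][:20]}
    return None


# Order matters: engines with richer data first, the broadest fallback last.
ENGINES: List[Tuple[str, Engine]] = [
    ("pubmed", pubmed_stub),
    ("crossref", crossref_stub),
    ("scholar", scholar_stub),
]


def first_match(hints: dict, engines: List[Tuple[str, Engine]] = ENGINES):
    """Try each engine in ranked order and return (name, record) for the first hit."""
    for name, engine in engines:
        record = engine(hints)
        if record is not None:
            return name, record
    return None
```

With this ordering, a PDF that yields a DOI known to PubMed gets the record with an abstract, other DOIs fall through to CrossRef, and PDFs with no DOI still reach the full-text engine.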

RaimundE · August 3, 2013

that would be very helpful!

kopo · December 21, 2013

Most of the time, there is a URL (e.g. http://www.sciencedirect.com/science/article/pii/S092058619800073X) for the resource that contains the desired information.
By opening the link, it's possible to save the information into the database, but it's saved as a new entry rather than into the original one. So in the end there are two almost identical entries for the same paper: one that contains everything (including the original PDF file) except the abstract, and one that contains the abstract but not the original PDF file.
Would it be possible to follow the URL in batch/automatic mode and fill in the missing values? If not, is there a workaround to fill in the missing values (e.g. the abstract) manually without creating a new database entry?

aurimas · December 21, 2013

This is planned in general, but nothing of the sort is currently implemented. (It is possible to merge the duplicate items into one, but it's a rather messy workaround.)

lig-hls · August 3, 2017

I am quite satisfied with Zotero, but its weakest point, in my view, is its inability to get the abstracts when retrieving metadata for imported PDFs.
This matters because the abstract contains a lot of keywords, which makes finding the correct reference a lot easier. Right now the abstract is not populated for new references imported via PDF (my standard workflow).

So an ideal pdf workflow would work as follows:

Drag PDF into Zotero
Retrieve metadata (include the option of renaming the PDF according to the metadata already at this stage)
Offer an additional option: add abstract via DOI, URL, Web of Science

Hope this can be realized!

lig-hls · October 3, 2017

I now have a workaround to get all the metadata for a PDF.

1. Import the PDF, followed by "retrieve metadata from pdf"
2. Check whether the metadata is correct
3. Go to the URL (present in most cases) displayed in the Info box
4. Save the webpage to Zotero (a new item with more metadata, e.g. tags and abstract, is generated)
5. Look for duplicates and merge the "pdf-item" with the "URL-item"
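What step 5 has to achieve can be sketched as a "fill in only what is missing" merge. This is a toy illustration on plain dicts with made-up field names, not Zotero's actual duplicate-merging logic or data model.

```python
def fill_missing(pdf_item, url_item):
    """Return a copy of pdf_item with empty fields filled in from url_item.

    Values already present on the PDF item win; tags from both are combined.
    """
    merged = dict(pdf_item)
    for field, value in url_item.items():
        if field == "tags":
            existing = merged.get("tags", [])
            # Keep existing tags and append only the ones not already there.
            merged["tags"] = existing + [t for t in value if t not in existing]
        elif not merged.get(field):
            # Field is absent or empty on the PDF item, so take the web value.
            merged[field] = value
    return merged


# Hypothetical example records standing in for the two near-duplicate items.
pdf_item = {"title": "A paper", "abstract": "", "attachments": ["paper.pdf"], "tags": ["x"]}
url_item = {"title": "A paper (web)", "abstract": "Abstract text.", "tags": ["x", "y"]}
merged = fill_missing(pdf_item, url_item)
```

The merged record keeps the PDF attachment and the checked title from the first item while picking up the abstract and the extra tag from the second.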

Done!

Please make an automatic workflow for this ;)

azanderson · October 4, 2017

I would be happy if I could just edit the metadata associated with an imported PDF! Right now all I’m getting is Title, URL, Filename, Accessed, Modified, Indexed, Related, and Tags. No options are provided to add Author, Date, etc. This is really surprising.

adamsmith · October 4, 2017

If Retrieve Metadata fails, use right-click --> Create Parent item. This is unrelated to the rest of this thread, though, so please follow up in a new thread with any further questions.