Pubmed ID

jlanders · January 17, 2009

Is there a way for Zotero to import the PubMed ID (from NCBI PubMed) into a separate field? Right now it is stored in the "Extras" field.

Tjowens · January 20, 2009

Currently no. I'm not sure what the best option is for these kinds of identifiers. Would it make sense to have a drop-down like the contributor list, with each of these as an option?

jlanders · January 21, 2009

When importing from PubMed, Endnote puts this number under a field labeled "Accession Number". I think this is very important since the NIH now requires you to include this number in your references in any government grant you submit. I am currently writing one and I was looking forward to switching from Endnote to Zotero. Unfortunately, because of this, I have to hold off on the switch.

Tjowens · January 23, 2009

The extra field is mapped to note in CSL. So you could tweak your citation style to include Pubmed ID.

jlanders · January 23, 2009

This is true. The other issue though is that I have a large Endnote library (~4000 articles). When I try to import the articles into Zotero, it puts a bunch of different info into the "Extras" field. I guess this should be listed as an import problem.

On a smaller note, in the "Extras" field, Zotero puts in the text "PMID:". Not a major problem, but this may be an issue in some situations.

elodie · March 12, 2009

Has anything been done about this Pubmed ID issue?
I'm new to Zotero and I really like it, but I find the way PMIDs are stored very inconvenient. As an 'Extra', the field doesn't even get exported in RIS format. It really needs its own 'accession number' field.
I'd love to add that to the translator but I am new to the system and I'm not sure how to proceed.
Is anyone working on it? Or could anyone help me get started?

dstillman · March 12, 2009

There's been talk of adding support for an arbitrary number of "citation keys" per item, and Pubmed ID could conceivably be one. Citation keys were originally discussed in relation to BibTeX keys, which are somewhat different from PubMed IDs in that they're user-defined, but both do fall into the general category of non-Zotero identifiers that are sometimes used in citations.

dstillman · March 12, 2009

On the other hand, you would probably want the PubMed ID to be exposed in the canonical representation of an item displayed online, which suggests that it should be an independent field like ISBN or DOI and citation keys should be strictly user-specific.

Rintze · March 12, 2009

Yes, exposing PubMed IDs would be quite helpful, especially for locating sources and for use in citing. There are many more common identifiers like it though: some major ones like the PubMed Central ID (different from the 'normal' PubMed ID), and the identifiers used by Arxiv (http://arxiv.org/abs/0903.1843). Of course, other journal repositories/databases often have other (unique) identifiers.

Tjowens · March 12, 2009

It might be worth trying to work with these sorts of identifiers in the same way we work with contributors. Instead of cluttering the interface with a bunch of individual fields, we could create an Identifiers section with a drop-down for the different kinds of identifiers. Just like contributors, you could hit the plus sign to add more identifiers, specify the ID type (PubMed ID, PubMed Central ID, BibTeX key, etc.) and plug in the identifier. If we took this sort of approach we could be much more catholic in allowing an abundance of different identifiers; there could well be twenty different types of IDs that change in availability depending on item type.

elodie · March 12, 2009

I think this is a very good idea. Now, who is going to implement it?
It involves changing more than the translator, doesn't it?
I know the whole thing is open-source but how does one actually contribute?

sjimon · April 12, 2009

I also think this is a good idea. The lack of a versatile ID field is a weakness of Zotero at the moment.

1. The PMID and PMCID identifiers definitely need to have separate fields, rather than being lumped into the "Extras" field. If this can be implemented in the way Tjowens suggests, that would be ideal.
In my research field, the vast majority of references have a PMID associated with them. My preference is to rename all of my PDF attachments by PMID and to use the PMID for citations while writing a manuscript. Other research fields would prefer a different ID, so the ability to select and add more identifiers would be brilliant.

2. It would also be really handy to have these ID fields available in the middle column (i.e. the middle pane of Zotero that lists the articles found within a specific folder; you can customize which columns appear using the button at the top right of the panel). This would allow the articles to be sorted by these ID numbers, and would make it quick to identify articles that are missing ID numbers in the 'My Library' folder.

3. Perhaps in the future it would be possible to automatically search the various databases (repositories) to pull down the unique IDs? (i.e. press the PMID search button to look up the ID for a specific reference at NCBI, or the PMC button to pull down the PMCID). I suppose this would mean expanding the 'repositories' field in the same way as Tjowens is suggesting for the ID field? Because of the relationship between the repositories and ID fields, should these be combined in some way?

Keep up the great work!

sjimon · April 12, 2009

...still thinking about Tjowens's idea:
For journal articles, would it be worth having a separate tab for ID (along with the tabs for Info, Notes, Attachments, Tags and Related) where the multiple identifiers could be stored?

The Info tab would focus more on information about the journal article (authors, date, citation etc) while the ID tab would display information about the ID links to external repositories, such as PMID, PMC, Ovid, ISI, Arxiv etc.

amacom73 · April 13, 2009

+1 on PMID, though I'm mostly agnostic on how to implement it in terms of data structure, i.e., accession number vs extra vs dedicated field or tab. When I share references with colleagues or leave placeholders in grants or manuscripts I always use PMID, since it's the easiest unique identifier to search within PubMed or from Endnote.

One difficulty is that if you create an item from one of the publisher websites like ScienceDirect, PMID does not get filled in. So there's currently no way to get both the full text AND the PMID in one item in an automated fashion. If I understand the foregoing discussion here, someone has proposed a "fetch PMID function" for existing items, which would be useful.

I'd also like to suggest a (slightly OT) feature (I will put it in feature requests thread as well) that the DOI field and whatever becomes of the PMID field are "clickable" from the right pane info tab in the same way that the URL field is.

bbraun · May 16, 2009

I want to reactivate this thread, and point out that PMCID is absolutely essential for those of us with any interaction with NIH. I will be ruefully stuck with an inferior product (rhymes with 'rend-goat') until Zotero can handle PMCID gracefully.

NCBI has some useful tools for scripting:
1. esearch: get PMID using a general query, returned in XML
example, to translate doi into PMID (doi is stored in an 'AID' field):
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=%2210.1016/j.ccr.2009.02.016%22[AID]

2. efetch: get a complete record using PMID; various report formats available
example:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=19345332&report=medline

3. conversion of PMID to PMCID
http://www.ncbi.nlm.nih.gov/sites/pmctopmid

General info can be found at:
http://www.ncbi.nlm.nih.gov/entrez/query/static/advancedentrez.html
http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helppubmed

In principle, these can be used to get the PMCID, when it exists, even when the paper originates from a publisher's site and only the DOI is known.
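As a rough sketch of item 1 above, the DOI-to-PMID lookup could be scripted along these lines in Python. The function names are made up for illustration, the base URL and the [AID] tag are the ones from the esearch example, and the sample XML is a trimmed-down assumption about the shape of the response (with a placeholder ID), so check a real response before relying on it.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Base URL as given in the esearch example above.
ESEARCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url_for_doi(doi):
    """Build an esearch URL that looks the DOI up in the [AID] field."""
    term = '"%s"[AID]' % doi
    query = urllib.parse.urlencode({"db": "pubmed", "term": term})
    return ESEARCH + "?" + query


def pmids_from_esearch(xml_text):
    """Return the IDs found under <IdList><Id>...</Id></IdList>."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./IdList/Id")]


# Hypothetical, trimmed response; 12345 is a placeholder, not a real lookup result.
SAMPLE = (
    "<eSearchResult><Count>1</Count>"
    "<IdList><Id>12345</Id></IdList></eSearchResult>"
)

if __name__ == "__main__":
    print(esearch_url_for_doi("10.1016/j.ccr.2009.02.016"))
    print(pmids_from_esearch(SAMPLE))
```

Fetching the URL (for example with urllib.request.urlopen) and passing the body to pmids_from_esearch would complete the round trip; that step is left out here so the sketch runs without network access.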

Finally, just a reminder that PMCID != PMID.

Thanks to the Zotero team and congrats on the progress so far!

dsquared · July 15, 2009

I'm a mathematician and don't ever use PubMed, but in mathematics, we have a version of the PubMed ID: the MathReviews number (MR number). Just like those above who said it's essential to have the PMID stored by Zotero in a usable, flexible way, it's essential for me and other mathematicians to have the MR number. There's also arXiv eprint IDs.

I'd like to see the MR number and arXiv ID on equal footing with the DOI field, and be exported to BibTeX as mrnumber and eprint fields. (I don't use anything other than BibTeX, so I don't know how those fields should be exported to RIS or whatever.) I don't know the best way to do that, in terms of UI or programming, but it's essential for mathematical references. Without it, Zotero is probably of limited use to me.

BTW, here's an example: if you know the article has MR number 2394455, you just append that to the canonical URL (just like you use dx.doi.org) and get http://www.ams.org/mathscinet-getitem?mr=2394455. Many articles have a list of references and also a link to forward references.

dsquared · August 26, 2009

Another identifier that should be supported is Zentralblatt Math (http://www.zentralblatt-math.org) which roughly speaking is the "European MathReviews". Just as above, knowing the identifier Zbl 1154.94303 gives you the URL http://www.emis.de/zmath-item/?1154.94303.
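To make the "identifier plus canonical URL" idea concrete, here is a minimal Python sketch. The URL patterns are copied from the MR and Zbl examples in my posts, plus the dx.doi.org convention; the table and function names are hypothetical and not anything Zotero provides.

```python
# Hypothetical lookup table: identifier type -> URL pattern.
# The MR and Zbl patterns come from the examples in this thread.
RESOLVERS = {
    "MR": "http://www.ams.org/mathscinet-getitem?mr={id}",
    "Zbl": "http://www.emis.de/zmath-item/?{id}",
    "DOI": "http://dx.doi.org/{id}",
}


def identifier_url(kind, value):
    """Turn an (identifier type, value) pair into a clickable URL."""
    return RESOLVERS[kind].format(id=value)


if __name__ == "__main__":
    print(identifier_url("MR", "2394455"))
    print(identifier_url("Zbl", "1154.94303"))
```

A table like this is one way a clickable identifier field could stay extensible: supporting a new ID type would only mean adding one more pattern.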

Pankaj Jaiswal · November 6, 2009

You may want to use the cross reference field which has two components
source ID/acronym = PMID/MR
source record ID =000000000

rdbrown0au · January 23, 2010

As someone interested in Zotero as an aid to generating Wikipedia cite journal template output, having all relevant identifiers populated (automatically if possible) would be good.
If the cross reference field mentioned by Pankaj above is a single instance, then it wouldn't suit - populating PMID, PMC and DOI should be possible.
Similarly Zotero users may find it useful to have the identifier as a clickable link (as in dsquared's examples or expanded wikipedia cite journal templates).

The following is the list from the current Template:Cite_journal
# pmid: The document's PubMed Unique Identifier, such as 15128012
# pmc: The document's PubMed Central article number (PMCID) for full-text free repository of an article, such as 246835
# doi: A digital object identifier for the document, such as 10.1130/0091-7613(1990)018<1153:TAFSIA>2.3.CO;2.
# bibcode: The document's bibcode in the Astrophysics Data System, e.g., 1924MNRAS..84..308E
# id: A unique identifier, used if none of the above are applicable. In this case, you need to specify the kind of identifier you are using, preferably with a template like {{US patent}}, {{MR}} / {{MathSciNet}}, {{Zbl}}, {{arXiv}}, {{JSTOR}} or {{JFM}}. (Use one of the more specialized parameters if possible; they are linked automatically. In other words, don't use id = PMID 15128012 anymore. Use pmid = 15128012.)

Also a number of the BioMed journals provide good citation metadata for their online abstracts or articles. Zotero doesn't seem to be mining this to populate the citation fields.
A number of journals use the citation_ form, with additional Dublin Core dc. elements, and Nature adds prism elements, but standardization is limited. arXiv doesn't provide more than the title as metadata and PLoS XHTML is different again.

#meta_field_name value
citation_volume 7
dc.title SOMAP: a novel interactive approach to multiple protein sequences alignment
citation_authors Parry-Smith, D.J.; Attwood, T.K.
citation_id 7/2/233
citation_issue 2
citation_date 04/01/1991
citation_firstpage 233
citation_title SOMAP: a novel interactive approach to multiple protein sequences alignment
citation_mjid bioinfo;7/2/233
citation_journal_title Bioinformatics
dc.contributor Parry-Smith, D.J.
dc.contributor Attwood, T.K.
dc.identifier 10.1093/bioinformatics/7.2.233
citation_publisher Oxford Univ Press
citation_doi 10.1093/bioinformatics/7.2.233
citation_abstract_html_url http://bioinformatics.oxfordjournals.org/cgi/content/abstract/7/2/233
citation_issn 1367-4803
citation_issn 1460-2059
citation_pdf_url http://bioinformatics.oxfordjournals.org/cgi/reprint/7/2/233.pdf
dc.date 04/01/1991
robots NOARCHIVE

Hacky Perl for dumping this is available on request.
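For anyone who wants to try this without the Perl, a rough Python equivalent using only the standard library might look like the following. It assumes the page exposes the fields as `<meta name="..." content="...">` tags; the class and function names are made up, and the sample HTML is hand-written from a few rows of the list above rather than copied from the journal's page.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags, keeping repeats."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            self.fields.append((name, content))


def collect_meta(html_text):
    """Return every named meta field in document order."""
    parser = MetaCollector()
    parser.feed(html_text)
    return parser.fields


# Hand-written sample built from a few rows of the list above (assumed markup).
SAMPLE = """<html><head>
<meta name="citation_journal_title" content="Bioinformatics">
<meta name="citation_volume" content="7">
<meta name="citation_issn" content="1367-4803">
<meta name="citation_issn" content="1460-2059">
<meta name="citation_doi" content="10.1093/bioinformatics/7.2.233">
</head><body></body></html>"""

if __name__ == "__main__":
    for name, value in collect_meta(SAMPLE):
        print(name, value)
```

The fields are kept as a list of pairs rather than a dict because some names, such as citation_issn and dc.contributor, legitimately occur more than once.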

fbennett · January 23, 2010

@rdbrown0au: There has been recent discussion of this in the CSL tracker, and the BIBO (RDF) group. There is also an (inadequate and now dated) proposal concerning institutional identifiers in the zotero-legal group.

Agree that a solution to this is needed. It looks like there need to be simultaneous commitments in two projects to get things moving: BIBO and Zotero, with CSL following suit afterward with processor support. There has been a suggestion, at least, of an extensible namespace for identifiers in BIBO. Would be great to hear whether team Zotero is willing to take a look at this post-2.0. It certainly is an important item.

smsaladi · December 1, 2010

I wish to add to this chain. I hope in the new version of Zotero that import via PMCID and via patent accession number will be possible. Those two would greatly increase functionality.

jastirn · March 1, 2012

This is an old thread, but the issue of PMID placement in the Zotero record and RIS export is still unresolved, at least from my perspective. I'm using Zotero Standalone 3.0.3 for Mac.

When importing directly from a PMID, Zotero stores the PMID [MEDLINE:pmid] in the Extra field [ZOTERO:Extra]. This wouldn't be as big an issue if ZOTERO:Extra were included in the default RIS export from Zotero, but it's not.

Please include PMID in a ZOTERO:AccessionNumber or ZOTERO:PMID field and map it to RIS:AN as ########[pmid]

We could speculate on the progression of standards and where it fits best for another 2 years, or do something that *just works* for the majority of users now without them having to write custom translators. The current state is broken.

adamsmith · March 1, 2012

We hope there will be some version of a PMID/citation id field in Zotero 3.5.

I don't think there's great enthusiasm to hack the RIS translator/export. The problem with RIS is that it's a bad standard; if you want a better solution, find one that involves a decent standard (MODS or Bibliontology RDF, e.g.). At this point my position (which is in no way Zotero's official position) is that it's not Zotero's problem that crappy proprietary citation managers don't support modern bibliographic standards (or in the case of Endnote any standard).

antikorpo · January 21, 2013

tl;dr but +1 to PMID field

adamsmith · January 21, 2013

There'll almost certainly be a PMID field and IIRC there is now a RIS spec that tells us where to map it to.

bbraun · January 21, 2013

I thought the CSL standard had been changed to allow multiple article identifiers. If so, it would be great if Zotero could accommodate this.

Also, I will point out that, for life sciences, there are 2 identifiers that are needed:
- PMID (PubMed ID)
- PMCID (PubMed Central ID)

The PMCID should be imported from a PMCID record, of course, but it should also be imported automatically from any Pubmed record that has an associated entry in PubMed Central (and thus a PMCID).

Finally, PMCID needs to be available for inclusion in bibliographic styles, to satisfy NIH reporting requirements.

Thanks to all.

adamsmith · January 21, 2013

see here: https://github.com/ajlyon/zotero-bits/issues/10
there is broad agreement on this, it's just a question of having the field changes implemented in Zotero, on that see the thread pinned at the top of the forum.
edit: and yes, CSL has both PMID and PMCID variables now.

Saheli · November 19, 2013

Hi, I was wondering if there is any progress on how to easily convert existing ID numbers (doi, PMID, etc) to PMCID if available? NIH is really slamming down on any inclusion of references without PMCID, and for investigators who already have Zotero reference libraries it would be nice to be able to batch add the information robustly. Thanks!

adamsmith · November 19, 2013

no there isn't - that part also won't happen very quickly.
You'll have to manually go to the pubmed entry, look up the PMCID, and add it as PMCID: PMC123456 to the extra field. Zotero does extract that information and it is used in the National Library of Medicine Grant style.

tgliou · January 4, 2014

I had problems with The Library of Medicine Grant style. To address the NIH requirements for PMCID numbers, I have put into the Extra field:

PMID: 12345 PMCID: PMC12345

When using the Library of Medicine Grant style, I got output that looked like this:

PMID: 12345

I am not good at XML, but I looked at the macro "pmcid" and changed the line:

<text variable="PMCID" prefix=" PMCID: "/>

to read:

<text variable="note"/>

which then gave me:

PMID: 12345 PMCID: PMC12345 PMID: 12345

so I deleted the line:

<text variable="PMID" prefix=" PMID: "/>

and got:

PMID: 12345 PMCID: PMC12345

Having the PMID in there too doesn't hurt, so I think I'm happy. I'm sure, however, that this is quite clumsy, but now it works for my situation. Do you have a suggestion for a better fix?

Also, I noticed that for articles that were internet citations, I was missing a space between the "Available from" entry and the PMID entry, thus:

Available from: http://www.ncbi.nlm.nih.gov/pubmed/12345PMID: 12345 PMCID: PMC12345

[Sorry, but the comment editor interprets the above line as an actual link. It is not. Same caution applies below.]

I changed the macro call near the end of the file from:

<text macro="access"/>

to

<text macro="access" suffix=" "/>

and I got:

Available from: http://www.ncbi.nlm.nih.gov/pubmed/12345 PMID: 12345 PMCID: PMC12345

Again, I hope that this is helpful, but I am clumsy with XML.

adamsmith · January 4, 2014

This isn't working as intended - Zotero is supposed to get them both, right now it only gets whichever comes first. We'll look into it.
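To illustrate the intended behaviour (this is only a sketch in Python, not Zotero's actual code), pulling both identifiers out of an Extra value like the one above could look like this. The pattern and function name are assumptions made for the example.

```python
import re

# Assumed format: a label, a colon, optional spaces, then the value up to
# the next whitespace. Matches every occurrence, not just the first.
ID_PATTERN = re.compile(r"\b(PMID|PMCID):\s*(\S+)")


def ids_from_extra(extra):
    """Return every PMID/PMCID label found in the Extra text, keyed by label."""
    return {label: value for label, value in ID_PATTERN.findall(extra)}


if __name__ == "__main__":
    print(ids_from_extra("PMID: 12345 PMCID: PMC12345"))
```

Because findall scans the whole string, it should not matter whether the two entries share a line or sit on separate lines, or which one comes first.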