Enable citation directly from Bioconductor

Hi,

The Bioconductor website hosts 1700+ software packages, many of them highly cited.

However, Zotero cannot extract the authors and other information correctly from the Bioconductor software package webpages. For example,

https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html

So I would like to make a feature request for Zotero to extract this information from Bioconductor package webpages directly. I would appreciate any response.

Best,
Qiang
  • I don't know if the bibliographic information is easily retrievable from those package webpages.

    (R packages can provide citation metadata within the R environment, see https://bioconductor.org/help/faq/#citation-faq and https://stat.ethz.ch/R-manual/R-devel/library/utils/html/citation.html; e.g.:

    > citation('tidyverse')

    To cite package ‘tidyverse’ in publications use:

    Hadley Wickham (2017). tidyverse: Easily Install and Load the 'Tidyverse'. R package version 1.2.1.
    https://CRAN.R-project.org/package=tidyverse

    A BibTeX entry for LaTeX users is

    @Manual{,
    title = {tidyverse: Easily Install and Load the 'Tidyverse'},
    author = {Hadley Wickham},
    year = {2017},
    note = {R package version 1.2.1},
    url = {https://CRAN.R-project.org/package=tidyverse},
    }
    )
  • We do import from CRAN directly so that works.
    Also, on the bioconductor pages, right-click --> Save to Zotero (DOI) will at least get you the basic info (for packages with DOI).

    Not sure if we want to build a separate scraper here that needs to be maintained.
  • edited June 28, 2019
    @adamsmith @zuphilip Bioconductor lists a DOI (as far as I can tell, for all packages) and is conscientious about registering them. A site translator could simply be told to use the DOI translator.
  • Given currently limited resources, I think I'd just wait until the improved/combined generic translator arrives (which will just pick up DOI info). If that doesn't work well enough, we can look at a specific translator.
  • Hi @adamsmith, thanks for your quick reply! I also tried "Save to Zotero (DOI)". It could not obtain the programmer (2 out of 4 authors were read in as 1 "programmer" for SummarizedExperiment), abstract, version, or date correctly.
  • That would indicate that incorrect data was probably registered for the package's DOI. You should report that incorrect data to Bioconductor.
  • Hi @bwiernik and @hubuntu ,

    Thanks for looking into this issue. I'll report this to the Bioconductor core team (of which I am a member) and see if there is anything we can do about the DOI and related data. I hope this is just an issue that can be fixed on our end. I will follow up later.

    ~ Qian
  • Thank you Qian! That is very helpful. I wonder if in this case, there were only two authors when the DOI was registered, but more authors were subsequently added?
  • Hi @bwiernik, as @hubuntu mentioned regarding the error reading the authors, the 2 authors were actually read in as 1 "programmer" (split across the first name and last name fields). So if this error was introduced earlier, in the DOI record, it could be a serious problem! However, I still need to check with the core team members, because I am not familiar with this part...
  • Ah, I misread hubuntu's post.
  • Below is what we get from DataCite using:
    $ curl -LH "Accept: application/vnd.datacite.datacite+xml" https://doi.org/10.18129/B9.bioc.SummarizedExperiment

    Every author should have their own creator tag. The other info noted by hubuntu as missing is also not in there.

    <?xml version="1.0" encoding="UTF-8"?>
    <resource xmlns="http://datacite.org/schema/kernel-4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
    <identifier identifierType="DOI">10.18129/B9.BIOC.SUMMARIZEDEXPERIMENT</identifier>
    <creators>
    <creator>
    <creatorName>Martin Morgan, Valerie Obenchain, Jim Hester, Hervé Pagès</creatorName>
    </creator>
    </creators>
    <titles>
    <title>SummarizedExperiment</title>
    </titles>
    <publisher>Bioconductor</publisher>
    <publicationYear>2017</publicationYear>
    <resourceType resourceTypeGeneral="Software"/>
    </resource>
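
The point about creator tags can be checked mechanically. The following is a minimal Python sketch, not part of Zotero or Bioconductor tooling: it parses a trimmed copy of the record quoted above and counts the creatorName elements. The helper names `creator_names` and `split_merged_creators` are hypothetical, and splitting on commas is an assumption that only fits this malformed record, since a well-formed DataCite creatorName is often written as "Family, Given".

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the DataCite record quoted above (schemaLocation omitted).
DATACITE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.18129/B9.BIOC.SUMMARIZEDEXPERIMENT</identifier>
  <creators>
    <creator>
      <creatorName>Martin Morgan, Valerie Obenchain, Jim Hester, Hervé Pagès</creatorName>
    </creator>
  </creators>
  <titles>
    <title>SummarizedExperiment</title>
  </titles>
  <publisher>Bioconductor</publisher>
  <publicationYear>2017</publicationYear>
  <resourceType resourceTypeGeneral="Software"/>
</resource>
"""

# Namespace used by the record; the "dc" prefix is only a local alias.
NS = {"dc": "http://datacite.org/schema/kernel-4"}


def creator_names(xml_text):
    """Return the text of every <creatorName> element in a DataCite record."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    path = "dc:creators/dc:creator/dc:creatorName"
    return [el.text.strip() for el in root.findall(path, NS)]


def split_merged_creators(names):
    """Heuristic (assumption): treat commas inside a 'Given Family' style
    creatorName as separators between people merged into one field."""
    people = []
    for name in names:
        people.extend(part.strip() for part in name.split(",") if part.strip())
    return people


names = creator_names(DATACITE_XML)
print(len(names), "creatorName element(s) found")
for person in split_merged_creators(names):
    print(" -", person)
```

A single creatorName holding four people is what a DOI-based import then has to guess at; registering one creator element per author removes the need for any such heuristic.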


  • Hi @adamsmith ,

    Thanks for the info; the command above helps! We did find that the Bioconductor website metadata and the DOI record contain incomplete information. We are working to update them so that Zotero can import from Bioc package webpages directly.

    Best,
    Qian
  • Hi @adamsmith @bwiernik,

    I am helping to test the standardized metadata for Bioconductor using Dublin Core.
    I set DC.type to "Software", but Zotero does not recognize it as "Computer Program". The item type is still "Web Page". Some DC types do work, such as article and film.

    Any suggestions for making Zotero work with "Software" in the metadata? Thanks!

    Here is the test page.
    http://awesome-davinci-6f193c.netlify.com/

    <html>
    <head>
    <title>SummarizedExperiment: SummarizedExperiment container</title>
    <meta name="DC.creator" content="Morgan, Martin"/>
    <meta name="DC.creator" content="Obenchain, Valerie"/>
    <meta name="DC.creator" content="Hester, Jim"/>
    <meta name="DC.creator" content="Pagès, Hervé"/>
    <meta name="DCTERMS.dateAccepted" content="2015-01-01T23:18:49Z" scheme="DCTERMS.W3CDTF"/>
    <meta name="DCTERMS.available" content="2015-01-01T23:18:49Z" scheme="DCTERMS.W3CDTF"/>
    <meta name="DC.identifier" content="10.18129/B9.bioc.SummarizedExperiment"/>
    <meta name="DC.identifier" content="https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html" scheme="DCTERMS.URI"/>
    <meta name="DC.description" content="The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples."/>
    <meta name="DC.title" content="SummarizedExperiment:SummarizedExperiment container"/>
    <meta name="DC.type" content="Software"/>
    <meta name="DC.version" content="0.1">
    </head>
    <title>Bioconductor - SummarizedExperiment</title>
    <body>
    [1]M. Morgan, V. Obenchain, J. Hester, and H. Pagès, “SummarizedExperiment:SummarizedExperiment container.” [Online]. Available: http://127.0.0.1:8081/test.html. [Accessed: 15-Jul-2019].
    </body>
    </html>
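
To see exactly which Dublin Core fields a test page like this exposes, independent of how any Zotero translator interprets them, the meta tags can be listed with a short script. This is a minimal Python sketch using only the standard library; the class name `DublinCoreParser` and the function `extract_dublin_core` are hypothetical, and the sample is an abridged copy of the page above.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collect <meta name="DC.*"> and <meta name="DCTERMS.*"> tags
    into a dict mapping each name to a list of content values."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <meta .../> tags, because the default
        # handle_startendtag delegates to handle_starttag.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        if name.startswith(("DC.", "DCTERMS.")):
            self.fields.setdefault(name, []).append(attr.get("content") or "")


def extract_dublin_core(html_text):
    """Return all Dublin Core meta fields found in an HTML string."""
    parser = DublinCoreParser()
    parser.feed(html_text)
    parser.close()
    return parser.fields


# Abridged copy of the test page shown above.
SAMPLE = """<html><head>
<meta name="DC.creator" content="Morgan, Martin"/>
<meta name="DC.creator" content="Obenchain, Valerie"/>
<meta name="DC.creator" content="Hester, Jim"/>
<meta name="DC.creator" content="Pagès, Hervé"/>
<meta name="DC.title" content="SummarizedExperiment:SummarizedExperiment container"/>
<meta name="DC.type" content="Software"/>
<meta name="DC.version" content="0.1">
</head><body></body></html>"""

fields = extract_dublin_core(SAMPLE)
for name, values in fields.items():
    print(name, values)
```

If all four DC.creator values appear here but not in the saved item, the page metadata is fine and the difference lies in how the importer handles that item type.
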
  • Is there any way to make the metadata work? Thanks!
  • Is "software" the correct DC.type for that? What happens if you use "computerProgram" instead?
  • Hi @zuphilip, thanks for your reply. The type is recognized correctly with "computerProgram", but none of the "DC.creator" values are read for this item type (Computer Program). They were read for "Web Page".

    I have decided to write a translator instead. The documentation is great. I made one and it works well in my manual tests, but it failed the pull request checks. I am new to JavaScript... Any help would be appreciated. Thanks!
    https://github.com/zotero/translators/pull/1983