Google Search not indexing ANY Zotero group libraries

galaxyproject · November 15, 2017

Hi All,

I created a Zotero group back in September and the group's library is still not being indexed by Google. It's been a "Public, Closed Membership group" either since creation, or shortly thereafter. Is there something I need to turn on to make this happen?

The group:
https://www.zotero.org/groups/1732893/galaxy
https://www.zotero.org/groups/1732893/galaxy/items

What Google finds:
https://www.google.com/search?q=https://www.zotero.org/groups/1732893/galaxy/items

Thanks,

Dave C

galaxyproject · December 21, 2017

Hi All,

Just a ping to refresh this thread. The group library is still not being searched by Google. Any ideas?

Thanks,

Dave C

adamsmith · December 21, 2017

I don't think the content of Zotero libraries is supposed to be indexed by Google.

Not quite sure why you would expect this? E.g., Google also doesn't index the contents of library catalogs.

galaxyproject · December 21, 2017

A spot check of several other random group libraries in Zotero indicates that *none* of them are being indexed by Google. The robots.txt for zotero.org is:

User-agent: *
Disallow: /trac/
Disallow: /people/
Disallow: /*/following
Disallow: /*/followers

Which I don't think will prevent group libraries from being indexed.
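
One way to double-check that reading of the robots.txt is to feed the rules quoted above to Python's standard-library parser. This is only a rough sketch: `is_allowed` is a helper name made up here, the `/people/someone` URL is a made-up test path, and the standard-library parser may not interpret the `*` wildcard the way Google's crawler does (none of the wildcard rules target group paths anyway).

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trac/
Disallow: /people/
Disallow: /*/following
Disallow: /*/followers
"""


def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the quoted rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The group library URL from this thread.
group_items_allowed = is_allowed("https://www.zotero.org/groups/1732893/galaxy/items")
# A made-up path under /people/, used only to show a rule that does block.
people_allowed = is_allowed("https://www.zotero.org/people/someone")
print(group_items_allowed, people_allowed)
```

If the group URL comes back as allowed, robots.txt is not what keeps the library out of the index.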

I'm not a JavaScript person, but it looks like the group library pages are rendered with jQuery. The source for the pages doesn't include any of the papers that are shown on the page, so I'm guessing the publication names, other info, and links are dynamically loaded by jQuery.

Is it possible that this is what's causing the problem with Google not indexing group libraries in Zotero?
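
The "is it in the page source?" check can be scripted. Below is a minimal sketch that tests whether a phrase appears in the visible text of raw HTML, ignoring anything inside `<script>` or `<style>`. The two HTML snippets and the paper title are invented for illustration and are not taken from zotero.org. The sketch says nothing about whether a given crawler executes JavaScript; it only shows what a non-rendering fetch would see.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def appears_in_static_html(html: str, phrase: str) -> bool:
    """Return True if phrase occurs in the markup's visible text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in text.lower()


# Hypothetical pages: one filled in by script, one rendered on the server.
JS_RENDERED = (
    '<html><body><div id="library"></div>'
    '<script>loadItems("An Example Paper Title")</script></body></html>'
)
SERVER_RENDERED = "<html><body><ul><li>An Example Paper Title</li></ul></body></html>"

in_js_page = appears_in_static_html(JS_RENDERED, "An Example Paper Title")
in_server_page = appears_in_static_html(SERVER_RENDERED, "An Example Paper Title")
print(in_js_page, in_server_page)
```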

galaxyproject · December 21, 2017

@adamsmith Thanks for the response.

I was completely unaware of Google's decision to generally not index sites like Zotero's group library pages. My assumption was based on my experience with CiteULike, which Google does index. Any idea if there is a way to find out, definitively, whether Zotero group libraries have been excluded from Google's searches?

Thanks,

Dave C

DWL-SDCA · December 22, 2017

I manage an online bibliographic database that _is_ indexed by Google Scholar but not by vanilla Google search. GS generally does very limited (if any) indexing of sites that echo or shadow material that is available from a publisher's official site. [Unless your Zotero library has mostly unique material, this "rule" probably applies to you.]

The items in a Zotero library are really composed of a clump of individual database fields that have been assembled into a display showing the metadata for an article, report, etc. in a human-readable format on your screen (or into a format that can cite the record in a chosen bibliographic style). The webpages you see when you view your library don't exist until you ask for them. Google doesn't understand the database table structure of every SQL database online, so Google can't easily be expected to recognize what is in your library without some extra help (see my final paragraph).

GS indexes my site because we provide additional material in the form of categories / index terms assigned to articles included in the database. Also, we add our own abstracts to articles that don't have author abstracts, plus we edit author abstracts so that someone from a professional discipline other than the authors' discipline can understand what the article is about -- we explain the jargon. This means that much of the content of records on my site is different from (or more complete than) what is available from the publisher.

Also, a _requirement_ for indexing is to include GS/Highwire formatted terms in the header of each webpage generated by a database query. Since GS doesn't actually query a database to find a webpage for each record, they also require an XML-based sitemap that lists/points to each record in the database as though it were a separate webpage. We have over 700000 records, so we have 700000+ lines of XML in files structured by a GS standard and with size limits imposed by their rules. I know this seems overly complex, but it is actually more complex than it seems. I get a call from a GS representative 3 to 5 times a year with a request to make some sort of change to conform to new page-structure or header requirements, or to ask for my participation in some experiment or beta test. Sometimes GS pays for requested temporary experimental changes. Usually they don't offer support. The only reason they ever offer to pay is that my site doesn't have a subscription fee and we aren't supported by advertisements.
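
For readers who have not seen these two pieces, here is a rough sketch of what they can look like. It is not the code behind my site. The `citation_*` names are the Highwire-style tags that Google Scholar documents. The sitemap layout and the 50,000-URL cap come from the general sitemaps.org protocol, which is assumed here to stand in for the "GS standard" mentioned above. The function names and the example.org URLs are made up.

```python
from html import escape
from xml.sax.saxutils import escape as xml_escape

# Per-file URL cap from the sitemaps.org protocol (an assumption about
# which size limit applies; the post does not spell out the exact rules).
MAX_URLS_PER_SITEMAP = 50_000


def citation_meta_tags(title, authors, publication_date):
    """Build Highwire-style <meta> tags for one record's page header."""
    tags = [("citation_title", title)]
    tags += [("citation_author", author) for author in authors]
    tags.append(("citation_publication_date", publication_date))
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in tags
    )


def sitemap_files(urls):
    """Yield sitemap XML documents, each listing at most the capped number of URLs."""
    for start in range(0, len(urls), MAX_URLS_PER_SITEMAP):
        chunk = urls[start:start + MAX_URLS_PER_SITEMAP]
        entries = "\n".join(f"  <url><loc>{xml_escape(u)}</loc></url>" for u in chunk)
        yield (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>\n"
        )


# Hypothetical record and record URLs, for illustration only.
header = citation_meta_tags("Hazards & Outcomes", ["Doe, Jane"], "2017/12/22")
record_urls = [f"https://example.org/record/{i}" for i in range(700_000)]
sitemap_count = sum(1 for _ in sitemap_files(record_urls))
print(header)
print(sitemap_count)
```

The point of the chunking is that every record gets a stable URL listed in some sitemap file, so a crawler can reach each record without ever running a database query.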

galaxyproject · December 22, 2017

@DWL-SDCA thanks for the detailed explanation. The Zotero library in question is just a collection of stuff that exists elsewhere on the web. Thus that rule is probably catching me.

And, well, dang. :-)

I'll look for other ways to get this material indexed. If I find anything I'll post it here.

Thanks again,

Dave C

aad verbunt · July 30, 2018

@DWL-SDCA your information is of value to me, managing a similar bibliographic database of books, journal articles and cases. Each record has its own abstract.
My question to you is: is your database managed with Zotero?

DWL-SDCA · July 30, 2018

The short answer is no, but Zotero is essential for getting records into the database. My contact information is available on the website. Write to me and I will explain.

galaxyproject · September 6, 2018

Hi all,

We never did get Google to index our Zotero library (as was predicted above). The best we could do was to tweak our wrapping of Google Custom Search, adding a top-level, parallel tab to our wrapper that uses the Zotero API to do the search.

The results are here:
- https://galaxyproject.org/search/?q=hox#gsc.tab=0&gsc.q=hox&gsc.page=1

The code backing that is here:
- https://github.com/galaxyproject/galaxy-hub/blob/master/layouts/search.pug
- https://github.com/galaxyproject/galaxy-hub/blob/a628c2c450dfc5a825ce6fe830eac818a9176597/src/js/index.js

This was implemented by Dannon Baker (https://github.com/dannon).
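
For anyone who wants to try the same idea without reading the linked JavaScript, here is a minimal Python sketch of a quick search against a public group library through the Zotero Web API (version 3). It is not the Galaxy implementation. The function names are made up, and the network call is defined but not run here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.zotero.org"
GROUP_ID = 1732893  # group ID taken from the library URL earlier in this thread


def build_search_url(query, group_id=GROUP_ID, limit=25):
    """Build an items URL using the API's quick-search parameter `q`."""
    params = urlencode({"q": query, "format": "json", "limit": limit})
    return f"{API_BASE}/groups/{group_id}/items?{params}"


def search_group_library(query, group_id=GROUP_ID, limit=25):
    """Fetch matching items and return their titles (requires network access)."""
    request = Request(
        build_search_url(query, group_id, limit),
        headers={"Zotero-API-Version": "3"},
    )
    with urlopen(request, timeout=30) as response:
        items = json.load(response)
    # Each item carries its metadata under "data"; notes may have no title.
    return [item["data"].get("title", "") for item in items]


search_url = build_search_url("hox")
print(search_url)
```

Calling `search_group_library("hox")` should return the titles of matching items, which a site search page can then show next to its regular web results.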

Cheers

gianlucamis · April 30, 2022

Could this be a viable solution: export your library as 'Wikidata QuickStatements', then import it into wikidata.org (this tool will help with bulk imports: https://quickstatements.toolforge.org/#/).
Opinions and further suggestions are welcome.
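
To give an idea of what such an import looks like, here is a rough sketch that writes QuickStatements (V1, tab-separated) commands for one item. Zotero's own export already produces this format, so the sketch is only illustrative. The helper name and the sample title and DOI are made up, and the property and item IDs (P31, Q13442814, P1476, P356) should be checked on Wikidata before any real import.

```python
def quickstatements_for(title, doi=None, language="en"):
    """Return QuickStatements V1 commands that create one article item."""
    # Tabs separate fields and double quotes delimit strings, so both are
    # replaced in the title (a simplification made for this sketch).
    safe_title = title.replace("\t", " ").replace('"', "'")
    lines = [
        "CREATE",
        f'LAST\tL{language}\t"{safe_title}"',       # label in the given language
        "LAST\tP31\tQ13442814",                     # instance of: scholarly article
        f'LAST\tP1476\t{language}:"{safe_title}"',  # title (monolingual text)
    ]
    if doi:
        lines.append(f'LAST\tP356\t"{doi.upper()}"')  # DOI
    return "\n".join(lines)


# Hypothetical item, for illustration only.
commands = quickstatements_for("An Example Paper", doi="10.1234/example")
print(commands)
```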