Google indexing and the look of the items pages
Dear Zotero,
I recently posted that I started to use Google CSE to search our Zotero libraries: Zotero Forums - Search Zotero with Google
It is convenient and works fast, but there is a problem with the result list. Please have a look here:
Search (with Google) Zotero library From Some Psychologists for "unspoken"
There is an entry in the result list there saying "Community Treatment Disorder". The snippet for this says:
"Community treatment orders
https://www.zotero.org/groups/from_some.../items/.../A93JSK45
Peter Levine: In an Unspoken Voice - references. Tips 2011. Violence. Trash. _RIS import. *Behavior An… *Behavioral … *CORRUPTION*. 1-Propanol..."
Unfortunately most of that is unrelated to the entry, which you can see here: Zotero | Groups > From Some Psychologists > Library > Community treatment orders: current evidence and the implications ;-)
The reason for this is the content Zotero gives Google when it indexes the library. I would suggest totally revamping the look of the items pages. I would even be glad to help with this, since it is a major problem! (The look and feel could of course be much improved for human beings too.)
Kind regards,
L
It's possible that something like adding a <nav> tag around the collections list would make Google exclude the text inside it from search results, but there's no guarantee.
* I'm not sure if it's technically forbidden to show less content to Google — they're generally concerned with showing more content to bots than to humans. But I'd rather not risk our being penalized for serving different content to bots.
I did a quick search to see if Google handles
However, to me that seems like a much smaller problem than the current one, with a lot of false hits when searching.
A button to show the table of contents could easily be added. (In most cases I guess you will not want to show the table of contents.)
In the long term, a structured search through Google would of course be much better. (I sent them such a suggestion, which they will perhaps never notice. ;-) )
1) The title that Google displays in the hit list is now taken from the <title>...</title>. This could be changed to the title of the reference.
2) The reference title is now in <h2>...</h2>. This could be changed to <h1>...</h1>. I do not remember for sure, but I think Google will use the <h1> if it is available (and has some text in it).
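To make suggestions 1) and 2) concrete, here is a minimal sketch in Python of an item page that puts the reference title in both <title> and <h1>. The function name render_item_page and the "title | library" layout are my own assumptions for illustration. This is not how Zotero actually builds its pages.

```python
from html import escape


def render_item_page(reference_title: str, library_name: str) -> str:
    """Return a minimal HTML page (hypothetical layout) for one reference.

    The reference title goes into both <title> and <h1>, so a crawler
    sees it as the main heading of the page.
    """
    safe_title = escape(reference_title)
    safe_library = escape(library_name)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"  <title>{safe_title} | {safe_library}</title>\n"
        "</head>\n"
        "<body>\n"
        f"  <h1>{safe_title}</h1>\n"
        "</body>\n"
        "</html>\n"
    )


page = render_item_page(
    "Community treatment orders: current evidence and the implications",
    "From Some Psychologists",
)
print(page)
```

Whether Google then really picks the <h1> or the <title> for the hit list is something that would have to be tested.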
schema.org FAQ - Webmaster Tools Help https://support.google.com/webmasters/answer/1211158
Perhaps that is enough to solve the problem here?
https://support.google.com/webmasters/answer/99170
[The following formats are supported by Google]
> * Microdata (recommended)
> * Microformats
> * RDFa
The Structured Data Testing Tool helps with testing microdata / microformats / RDFa: http://www.google.com/webmasters/tools/richsnippets
Structured Data Markup Helper helps with generating structured data markup: https://www.google.com/webmasters/markup-helper/
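As an illustration of the microdata format mentioned above, here is a small Python sketch that wraps one reference in schema.org markup. The function name item_to_microdata, the choice of the ScholarlyArticle type and the placeholder values are my assumptions. A real item page would probably need more properties, and the output should be checked with the testing tool linked above.

```python
from html import escape


def item_to_microdata(title: str, authors: list[str], year: str) -> str:
    """Render one reference as HTML with schema.org microdata attributes.

    itemscope/itemtype mark the outer element as a ScholarlyArticle;
    each itemprop names one property of it.
    """
    lines = ['<div itemscope itemtype="https://schema.org/ScholarlyArticle">']
    lines.append(f'  <h1 itemprop="name">{escape(title)}</h1>')
    for author in authors:
        # Each author is a nested Person item with its own name property.
        lines.append(
            '  <span itemprop="author" itemscope '
            'itemtype="https://schema.org/Person">'
        )
        lines.append(f'    <span itemprop="name">{escape(author)}</span>')
        lines.append("  </span>")
    safe_year = escape(year)
    lines.append(
        f'  <time itemprop="datePublished" datetime="{safe_year}">'
        f"{safe_year}</time>"
    )
    lines.append("</div>")
    return "\n".join(lines)


# Placeholder values only, not data from a real library entry.
markup = item_to_microdata("Example reference title", ["A. Author"], "2000")
print(markup)
```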
Learn More - Webmasters — Google Developers
What the user sees, what the crawler sees
https://developers.google.com/webmasters/ajax-crawling/docs/learn-more
(Just putting this note here now.)
I will give Google content that it can index the way I want. I use two PHP scripts for this. One presents the items like this:
http://ourcomments.org/zformat/g/56508/i/WQG7QNXW
The other gives Google a sitemap:
http://ourcomments.org/zformat/g/56508/sitemap.php
Before submitting this to some new Google CSEs, I wonder if you have any opinion on this. Is there something missing in the formatted output above? (It has markup for schema.org, Facebook and Twitter.)
(I think it would be pretty cool if Zotero could handle this. I mean if users would be able to upload some definition of output views. It can be made safe.)
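For anyone who wants to try the sitemap part, here is a rough sketch in Python of what such a script could produce. It is not the PHP script mentioned above. The function name build_sitemap is made up, and the /i/<item key> URL pattern is only inferred from the example item link in this post. The XML namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url: str, item_keys: list[str]) -> str:
    """Return sitemap XML with one <url><loc> entry per item key.

    Assumes item pages live at <base_url>/i/<item key>.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for key in item_keys:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/i/{key}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap = build_sitemap("http://ourcomments.org/zformat/g/56508", ["WQG7QNXW"])
print(sitemap)
```

In practice the list of item keys would come from the library itself, which this sketch leaves out.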