How to make Zotero-friendly websites?
I'm working on a large research database website and I can see the benefit of making it easy for users to extract citations and export them to other formats. Are there any pointers for content developers who want to help expand the number of Zotero-friendly websites? What format should the data be in?
Thanks,
- s
<http://ocoins.info/>
Zotero also reads embedded RDF, so that is a possible alternative. Zotero also has plans to eventually implement UnAPI support. If you fail to implement any of these embedded formats, at least allow RIS export (also useful for Endnote)--Zotero can catch those if users manually export them from your site.
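For anyone who wants to start with the RIS fallback, here is a minimal sketch of writing one record. The function name and the dictionary keys are made up for illustration; the only things taken from the RIS format itself are the tag layout (two-letter tag, two spaces, hyphen, space), the leading TY line and the closing ER line. Only a handful of common tags are covered.

```python
def to_ris(record):
    """Serialise one reference as an RIS record.

    `record` is a plain dict with hypothetical keys (type, authors, title, ...).
    """
    lines = ["TY  - " + record.get("type", "JOUR")]  # TY must come first
    for author in record.get("authors", []):
        lines.append("AU  - " + author)              # one AU line per author
    optional = (
        ("TI", "title"),
        ("JO", "journal"),
        ("PY", "year"),
        ("VL", "volume"),
        ("SP", "start_page"),
        ("EP", "end_page"),
    )
    for tag, key in optional:
        if key in record:
            lines.append(f"{tag}  - {record[key]}")
    lines.append("ER  - ")                           # ER closes the record
    return "\r\n".join(lines) + "\r\n"
```

Serving the result as a download with the MIME type application/x-research-info-systems is the usual convention, but check what your users' reference managers expect.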
refbase is a GPLed MySQL/PHP project which has COinS and UnAPI support:
<http://refbase.sourceforge.net/>
I also can't get it to respond to DC.Type tags to recognize a page as anything other than a general text document.
So, is there a good example of Zotero-friendly embedded RDF people can copy into their own pages?
http://arc.nucapt.northwestern.edu/refbase/show.php?author=seidman
COinS is currently embedded in MANY more sites and used by many more applications than embedded RDF, so it is a good idea to support it anyway.
You are right that the absence of abstracts and keywords is a major limitation of COinS. And you are also right that the COinS parser for Zotero could still be improved.
Embedded RDF is described, with examples, at http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml
For example, say I have a page of PDFs of papers and the citations that go with them. I'd like Zotero users to be able to grab what they want, as with the JSTOR scraper, which grabs the papers as attachments along with the bibliographic entry.
http://unapi.info/
Since the Zotero folks seem to be committed to supporting open standards, it is my hope that we'll eventually see unAPI support in Zotero. This would allow Zotero to grab any metadata, bibliographic format (Endnote, RIS, BibTeX, RDF, etc) and/or (PDF) file that's associated with a record displayed on an unAPI-enabled website. Using unAPI, Zotero would also be able to grab any given abstracts and keywords.
For anyone with some web programming skills, implementing the unAPI service is rather straightforward. The unAPI spec and help notes all fit on a single page:
http://unapi.info/specs/
For people interested in implementing an unAPI service for their own site, I've written some info about an existing unAPI implementation which also includes some usage examples:
http://unapi.refbase.net
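To give an idea of how little is involved, here is a rough sketch of the three request shapes as I read the unAPI spec: no parameters (list the formats the site offers), `id` only (list the formats for that record), and `id` plus `format` (return the record). The record store, the identifiers and the function names are invented for the example, and the status codes (300, 404, 406) are my reading of the spec, so check them against http://unapi.info/specs/ before relying on them.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical site-wide format list: format name -> MIME type
FORMATS = {
    "bibtex": "text/plain",
    "ris": "application/x-research-info-systems",
}

# Hypothetical record store: identifier -> {format name: serialised record}
RECORDS = {
    "rec:1": {
        "bibtex": "@article{rec1, title={A Title}}",
        "ris": "TY  - JOUR\r\nTI  - A Title\r\nER  - \r\n",
    }
}


def formats_xml(names, identifier=None):
    """Build the <formats> document, optionally scoped to one identifier."""
    id_attr = f" id={quoteattr(identifier)}" if identifier else ""
    rows = "".join(
        f"<format name={quoteattr(n)} type={quoteattr(FORMATS[n])}/>"
        for n in sorted(names)
    )
    return f'<?xml version="1.0" encoding="UTF-8"?><formats{id_attr}>{rows}</formats>'


def unapi(params):
    """Return (status, content_type, body) for already-parsed query parameters."""
    identifier = params.get("id")
    fmt = params.get("format")
    if identifier is None:
        # Request 1: no parameters, list the formats the site offers
        return 200, "application/xml", formats_xml(FORMATS)
    record = RECORDS.get(identifier)
    if record is None:
        return 404, "text/plain", "unknown identifier"
    if fmt is None:
        # Request 2: id only, list the formats available for this record
        return 300, "application/xml", formats_xml(record, identifier)
    if fmt not in record:
        return 406, "text/plain", "format not available"
    # Request 3: id and format, return the record itself
    return 200, FORMATS[fmt], record[fmt]
```

On the page side, the spec also asks for a `<link rel="unapi-server" ...>` element in the head pointing at the service and an `<abbr class="unapi-id" title="...">` element next to each record, so that a client can discover which identifiers to ask for.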
If many sites supported standard retrieval mechanisms such as COinS, embedded RDF and/or unAPI, this would also significantly reduce the need for the Zotero guys to develop site-specific scrapers (which will surely break at some point and will thus need permanent maintenance).
I too am interested in making our pages easily digestible for Zotero.
I notice from the "compatible standards and software" page (http://www.zotero.org/documentation/compatible_standards_and_software) that Zotero supports Dublin Core.
Does this mean that making a web page Zotero-compatible is just a case of adding "DC.Whatever" meta tags to the page?
(If this is the case it seems preferable to using the COinS system - I think additional meta tags should be relatively easy to automatically drop into page headers, but I'm sure I'm missing something as it reads as if the COinS method is generally preferred.)
Thanks,
Jim
Apologies if it's considered impolite to reply to your own messages on this forum, but I went ahead and tried my suggestion about adding some DC metadata to a webpage and Zotero seems to be able to grab it just fine.
I used the following subset of the Dublin Core which seems to work well:
<meta name="DC.Title" content="A webpage by Jim">
<meta name="DC.Creator" content="Jim">
<meta name="DC.Subject" content="stuff, not much">
<meta name="DC.Description" content="Just a bit of a test to see how Zotero grabs DC metadata">
<meta name="DC.Publisher" content="Jim">
<meta name="DC.Type" content="Text">
<meta name="DC.Format" content="text/html">
<meta name="DC.Language" content="en-GB">
<meta name="DC.Rights" content="Copyright Jim">
I think Zotero put the description metadata in its "Extra" field rather than "Abstract", where I guess it should be, but other than that everything looked properly aligned.
I hope that's useful.
Regards,
Jim
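If the pages are generated rather than hand-written, tags like the ones above can be dropped into the header automatically. A small sketch, assuming the metadata is already available as a dictionary; the function name is made up, and the only real work is escaping the values so that quotes or ampersands in a title do not break the attribute.

```python
from html import escape


def dc_meta_tags(fields):
    """Render Dublin Core <meta> tags from a dict such as {"Title": "..."}."""
    return "\n".join(
        f'<meta name="DC.{name}" content="{escape(str(value), quote=True)}">'
        for name, value in fields.items()
    )


print(dc_meta_tags({"Title": "A webpage by Jim", "Creator": "Jim", "Language": "en-GB"}))
```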
It is by no means impolite to answer your own question; it adds to the collective wisdom here in these forums. I meant to comment before. If you want to add metadata to individual pages, Dublin Core is in fact a great way to get the basics in. But keep in mind that Zotero will only grab one DC item from a page.
COinS works well if you want to add metadata for a list of items. So if you wanted to allow someone to capture an entire bibliography or several search results, COinS is probably a better way to go.
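For a list of items, each entry gets its own empty span with class Z3988 whose title attribute holds an OpenURL ContextObject in key/encoded-value form. A minimal sketch for journal articles follows; the function name and the input dictionary keys are assumptions for the example, and only a few of the available rft.* keys are filled in.

```python
from html import escape
from urllib.parse import quote, urlencode


def coins_span(item):
    """Build one COinS span for a journal article described by a plain dict."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", item["title"]),
        ("rft.jtitle", item["journal"]),
        ("rft.date", str(item["year"])),
    ]
    pairs += [("rft.au", author) for author in item.get("authors", [])]
    # URL-encode every value, then HTML-escape the joined string for the attribute
    title = escape(urlencode(pairs, quote_via=quote), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


bibliography = [
    {"title": "A Title", "journal": "Some Journal", "year": 2007, "authors": ["Doe, Jane"]},
    {"title": "Another Title", "journal": "Other Journal", "year": 2006},
]
print("\n".join(coins_span(entry) for entry in bibliography))
```

Note the two layers of encoding: the values are percent-encoded first, and the ampersands joining the pairs then become `&amp;` inside the HTML attribute.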
I ask because the "Create New Item from Current Page" feature ignores them -- and what if, as in many pages, no other format of meta data exists?
The META HTML tag is probably considered deprecated or something, but it would be (or should be) trivial to add some code to support them if they exist and no other similar meta tags exist.
As it is, for many pages I will be adding to Zotero, I will have to manually add information that Zotero should be adding itself.
On the other hand, I suppose it's easier to ignore/remove data than add it manually...
It is really easy, in my opinion. You just need to reply to three requests from Zotero.
"I also can't get it to respond to DC.Type tags to recognize a page as anything other than a general text document."
Has anyone figured this out? I'd like to eventually get people to use COinS, but in the short term I think I'll have better luck with Dublin Core. But, no matter what I do, I can't get the type detected.
hAtom is easy to parse when storing a snapshot and offers all the basic information for an article list or an individual article.
Link to draft specification: http://microformats.org/wiki/hatom
Adding to the above, could someone clarify how I could override multiple descriptors in favour of COinS, e.g. DC descriptors in the META HEAD with COinS from the content of a page?
Concretely this means:
On http://www.londonmobilelearning.net/#outputs.php?state=0 I have a nice bibliography generated with Zotero, including COinS, which I want Zotero to prefer and display in the address bar. In the META of the website there are DC descriptors.
Now I want my Zotero to override the DC descriptors and recognise the COinS in order to display the proper icon and information in the address bar.
For the difference, see the print URL without DC descriptors at http://www.londonmobilelearning.net/print.php?printurl=inc_books_issues.html
Thanks for your help
Klaus
In the short term, you can write a site-specific translator that calls the COinS translator explicitly. It will take priority over the RDF translator.
More generally, I would like to see the embedded metadata reordered. Right now, the priorities are:
RDF: 100
unAPI: 200
COinS: 300
This makes some sense, since RDF and unAPI are more expressive than COinS. That said, I think that when both RDF and unAPI, or RDF and COinS, are present on a single page, we can assume that unAPI or COinS is likely to carry the more relevant data. This is exacerbated by, as Klaus notes, the growing use of DC descriptors.
I'd like to see:
unAPI: 100
COinS: 200
RDF: 300
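To make the effect of the two orderings concrete, here is a toy model of the selection. This is not Zotero's actual code; it only assumes, as the numbers above imply, that when several translators detect something on a page, the one with the lowest priority number is used. The dictionary and function names are made up.

```python
# Priorities as listed above: current ordering and the proposed one
CURRENT = {"RDF": 100, "unAPI": 200, "COinS": 300}
PROPOSED = {"unAPI": 100, "COinS": 200, "RDF": 300}


def pick_translator(detected, priorities):
    """Return the detected format with the lowest priority number, or None."""
    candidates = [name for name in detected if name in priorities]
    return min(candidates, key=priorities.get) if candidates else None


# A page with DC descriptors in the head (picked up as RDF) and COinS in the body
page_formats = {"RDF", "COinS"}
print(pick_translator(page_formats, CURRENT))   # RDF
print(pick_translator(page_formats, PROPOSED))  # COinS
```

With the current numbers, the DC descriptors win on Klaus's page; with the proposed ones, the COinS bibliography would be offered instead.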
Ultimately, there should be some reworking of the translator system so that a document can combine formats: an annotated bibliography might be well described via RDF, but its entries could use COinS. Zotero would then support saving both the bibliography and its constituent entries.
It turns out that creating a site-specific translator probably is the only way. I tried to remove most of the RDF data, but even the slightest trace of RDF prevents Zotero from recognising COinS.
But still: Rock on and thumbs up for ZOTERO !
I continued playing around: I created a site-specific translator for my site on the basis of the COinS translator, but RDF was still preferred over COinS.
Then I removed all of the RDF data, keeping the site-specific translator, but now DOI was preferred over COinS.
Now I guess that either my translator doesn't work or this is a real bug in Zotero.
Anyway I'll now post a feature request that asks to be able to put some code into the HEAD tag that tells Zotero ultimately which descriptor or translator to use. That can't be too hard.
Best, Klaus
There really ought to be a way to specify what metadata to look at, or to request that all the metadata is looked at. Maybe it would be reasonable to run DOI, unAPI, COinS and RDF on all pages? Or perhaps a single translator could be developed that tried to do all four in a single pass through the document?
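As a rough illustration of the single-pass idea (in Python rather than as a real Zotero translator, which would be JavaScript), the sketch below walks the HTML once and collects DC meta tags, COinS spans, unAPI markers and anything that looks like a DOI. The class name, the attribute names and the DOI pattern are all assumptions made for the example; what to do when the sources disagree is left open.

```python
import re
from html.parser import HTMLParser
from urllib.parse import parse_qs

# Loose, illustrative DOI pattern; real DOI matching needs more care
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


class EmbeddedMetadata(HTMLParser):
    """Collect DC, COinS, unAPI and DOI hints in one pass over a page."""

    def __init__(self):
        super().__init__()
        self.dc = {}              # e.g. {"title": ["A webpage by Jim"]}
        self.coins = []           # one parsed ContextObject dict per span
        self.unapi_ids = []       # identifiers from <abbr class="unapi-id">
        self.unapi_server = None  # href of <link rel="unapi-server">
        self.dois = set()         # DOI-like strings found in text

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        classes = (a.get("class") or "").split()
        name = a.get("name") or ""
        if tag == "meta" and name.lower().startswith("dc."):
            self.dc.setdefault(name[3:].lower(), []).append(a.get("content") or "")
        elif tag == "span" and "Z3988" in classes:
            # Attribute values arrive already unescaped, so "&amp;" is "&" here
            self.coins.append(parse_qs(a.get("title") or ""))
        elif tag == "abbr" and "unapi-id" in classes:
            self.unapi_ids.append(a.get("title") or "")
        elif tag == "link" and "unapi-server" in (a.get("rel") or "").split():
            self.unapi_server = a.get("href")

    def handle_data(self, data):
        self.dois.update(DOI_RE.findall(data))


def scan(html_text):
    """Parse a page and return the populated collector."""
    collector = EmbeddedMetadata()
    collector.feed(html_text)
    collector.close()
    return collector
```

Calling `scan(page_source)` returns an object whose `dc`, `coins`, `unapi_ids`, `unapi_server` and `dois` attributes hold everything that was found, which a combined translator could then merge or offer to the user as a choice.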
I don't really see the point in creating a site-specific translator for my site as I do have valid and rich data in it already, and I guess many others have the same problem.
BTW, collecting all available data is a clever idea.
unAPI: 200
COinS: 300
Embedded RDF: 400
unAPI: 100
COinS: 200
RDF: 300
DOI: 400
This works for me, except that you apparently stripped your site of COinS, so your site still shows only RDF.
Until a combined embedded metadata translator arrives (and I think that's the only way to go long-term), such a rearranged set of priorities would cover common cases like yours.
Edit: Dan is right... I was confusing "RDF.js" with "Embedded RDF.js"
unAPI: 200
COinS: 300
DOI: 400
Embedded RDF: 500
Again, we (I) really need to make a single translator that combines the embedded data scrapes to do them all in one pass.