Export tree of library
Hi all
I've been thinking about how to visualise my Zotero library and I wondered if it's possible to export it in a reasonable format (say JSON) while retaining its hierarchy, to enable importing into visualisation (mind-mapping) tools.
I've played around with the exporting options, but they seem only to work with one level. I suppose doing it programmatically through the API might be sensible, but I thought I'd ask here in case anyone has achieved this kind of thing already.
Thanks!
I do want to add JSON-LD support to Zotero, and I think (but am not certain) that it would include the collection hierarchy too. It's quite ambitious, though, so I'm not going to give an ETA.
A map of connections between Tags would be nice, but may be too complex?
The citation manager Docear is built around the idea of a mindmap -- I think it's a really intuitive way to get a sense of one's library, although Docear itself isn't a particularly great piece of software and can't really compete with Zotero in terms of functionality.
If I manage to put something together that's workable I'll of course make it available and submit to Zotero. I'm doing my Ph.D. though, so it all depends on what time I can commit to it :)
https://s16.postimg.org/i2xul9pw5/Capture.png
The code is very rough at the moment, but it works. It needs to be refactored according to the DRY principle.
I've added in some bits that are specifically for my workflow (colouring of tags that I use, and icons for read/unread) so I should probably strip that stuff out, or make it sufficiently generic that it can apply to other people's workflows.
Watch this space...
It only has the fields I'm interested in though (title, contributors, year, and tags), since mind maps don't need more than that.
It would be possible to add in the code for the other fields from other translators to extend it in order that it's a full export.
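For anyone wondering what "only the fields I'm interested in" might look like in practice, here is a minimal sketch. It is not the code from the translator itself: `toMindMapNode` is a hypothetical helper, and the item shape (a `creators` array with `firstName`/`lastName`, a `date` string, and `tags` as strings or `{ tag: ... }` objects) is an assumption made for illustration.

```javascript
// Hypothetical helper: reduce an item to title, contributors, year and tags.
// The item shape used here is an assumption, not a documented guarantee.
function toMindMapNode(item) {
    // Take the first four-digit number in the date string as the year
    const yearMatch = /\b\d{4}\b/.exec(item.date || "");
    return {
        title: item.title || "",
        contributors: (item.creators || []).map(function (c) {
            // Join first and last name, tolerating a missing part
            return [c.firstName, c.lastName].filter(Boolean).join(" ");
        }),
        year: yearMatch ? yearMatch[0] : null,
        tags: (item.tags || []).map(function (t) {
            // Accept either plain strings or { tag: "..." } objects
            return typeof t === "string" ? t : t.tag;
        })
    };
}

// Made-up sample item; abstractNote is dropped from the output
const sample = {
    title: "An example paper",
    creators: [{ firstName: "Ada", lastName: "Lovelace", creatorType: "author" }],
    date: "2016-03-01",
    tags: [{ tag: "unread" }, "methods"],
    abstractNote: "Not needed in a mind map"
};

console.log(JSON.stringify(toMindMapNode(sample), null, 4));
```

Extending this to a full export would mostly mean adding more keys to the returned object.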
I'll upload the translator as it stands to bitbucket in the next few days.
https://bitbucket.org/laurenced85/zotero-translators/raw/master/Hierarchical JSON.js
EDIT: repo has been moved to https://github.com/melat0nin/zotero-translators, and this translator to https://raw.githubusercontent.com/melat0nin/zotero-translators/master/Hierarchical JSON.js
I had been trying to parse the export output files from existing translators (e.g. Zotero RDF) not realising that a custom translator can be written and just dropped into the translators folder in the manner described.
I'm a beginner at developing translators. One point that still puzzles me is why there are so many (400+) translators in the translators folder but only a few (18) in the export dropdown menu. And why is there a zipped translators folder in zotero.jar?
But moving on ...
I found this reference to translators.
https://www.zotero.org/support/dev/how_to_write_a_zotero_translator_plusplus
I was pursuing the suggestion of @adamsmith to parse Zotero RDF export.
In fact I can do this to some degree by using EasyRDF converter.
http://www.easyrdf.org/
and
http://www.easyrdf.org/converter
But your starter translator script now makes it much easier.
My aim is to add interactivity to individual library nodes (add events).
Then visualise tags and relations between nodes (i.e. beyond a simple hierarchical tree structure).
Here is a basic example of what I have in mind using d3.js
http://bl.ocks.org/d3noob/8375092
But this is too basic for showing inter relationships of nodes.
We then move on to visualising "many-to-many" relationships from a Zotero library.
Here is an example.
http://www.global-migration.info/
and
https://github.com/d3/d3/wiki/tutorials
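As a rough illustration of the "many-to-many" idea, tags can be linked whenever they appear on the same item. The sketch below assumes each item is a plain object with a `tags` array of strings (an assumption about the exported JSON, not something the translator guarantees) and produces a `{ nodes, links }` object of the kind often fed to d3 force layouts. `tagCooccurrence` is a hypothetical name.

```javascript
// Build a { nodes, links } graph in which two tags are linked whenever
// they appear on the same item. The input shape is assumed for illustration.
function tagCooccurrence(items) {
    const counts = new Map(); // "tagA\u0000tagB" -> number of shared items
    const tags = new Set();
    items.forEach(function (item) {
        // De-duplicate and sort so that each pair has one canonical key
        const unique = Array.from(new Set(item.tags || [])).sort();
        unique.forEach(function (t) { tags.add(t); });
        for (let i = 0; i < unique.length; i++) {
            for (let j = i + 1; j < unique.length; j++) {
                const key = unique[i] + "\u0000" + unique[j];
                counts.set(key, (counts.get(key) || 0) + 1);
            }
        }
    });
    return {
        nodes: Array.from(tags).sort().map(function (t) { return { id: t }; }),
        links: Array.from(counts, function (entry) {
            const pair = entry[0].split("\u0000");
            return { source: pair[0], target: pair[1], value: entry[1] };
        })
    };
}

// Made-up items for demonstration
const graph = tagCooccurrence([
    { title: "Paper A", tags: ["ethics", "methods"] },
    { title: "Paper B", tags: ["methods", "ethics", "theory"] },
    { title: "Paper C", tags: ["theory"] }
]);

console.log(JSON.stringify(graph, null, 4));
```

The `value` on each link counts how many items share the two tags, which could drive line thickness in a visualisation.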
...
Incidentally, I have now learned to embed Zotero.debug() calls in a translator under development.
And to check that the JavaScript is valid here: http://www.jslint.com/
Glad you worked it out. This is precisely the sort of visualisation I want to achieve.
The question of the tree depends on one's usage of Zotero -- I tend to categorise everything in deeply-nested collections (including having the same paper in more than one), so the tree approach works to an extent for me. Others have a flatter structure and rely more on tags, in which case this basic output won't be that helpful.
Ideally the end result will be to allow each of these options, building as you say on this basic initial output. I would envisage showing tag relations and, perhaps (for me) more importantly, the related items option.
The repository is public, so if you fancy contributing please feel free. (Perhaps I should move it to github, for better publicity?)
i.e. to place child collections in an array. "children": [{ ... }]
Meanwhile, to "beautify" the exported JSON I'm using this.
/* beautify by using stringify obj with extra options */
/* idea from here ... http://stackoverflow.com/questions/2614862/how-can-i-beautify-json-programmatically */
Zotero.write(JSON.stringify(collections, null, 4));
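To make the "children" array idea concrete, here is a minimal sketch of nesting a flat list of collections. The flat input with `key`, `name` and `parentKey` fields is a hypothetical shape chosen for illustration (it is not necessarily what Zotero hands to a translator), and `console.log` stands in for `Zotero.write` so the snippet runs outside Zotero.

```javascript
// Turn a flat list of collections into a nested tree with "children" arrays.
// The { key, name, parentKey } input shape is assumed for illustration.
function nestCollections(flat) {
    const byKey = new Map();
    flat.forEach(function (c) {
        byKey.set(c.key, { name: c.name, children: [] });
    });
    const roots = [];
    flat.forEach(function (c) {
        const node = byKey.get(c.key);
        const parent = c.parentKey ? byKey.get(c.parentKey) : undefined;
        // Collections without a known parent become top-level nodes
        (parent ? parent.children : roots).push(node);
    });
    return roots;
}

// Made-up collections for demonstration
const flat = [
    { key: "A", name: "Thesis", parentKey: null },
    { key: "B", name: "Methods", parentKey: "A" },
    { key: "C", name: "Interviews", parentKey: "B" },
    { key: "D", name: "Teaching", parentKey: null }
];

const tree = nestCollections(flat);

// Same beautifying trick as above: indent the output by 4 spaces
console.log(JSON.stringify(tree, null, 4));
```

Because every node carries its own `children` array, the result can be handed straight to tree layouts that expect that key.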
I've migrated the repo to Github, you can see it here: https://github.com/melat0nin/zotero-translators
Please submit some pull requests as you improve on the original script :)
@dragonfly most of the translators are used while importing items into Zotero. For example, the Wikipedia translator gets used when you save a Wikipedia page. The translator tells Zotero how to grab the metadata of the page to create an item in Zotero.
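As far as I understand it, the split between web, import and export translators is recorded in the `translatorType` number in each translator's metadata header, which acts as a set of bit flags. The flag values below (1 import, 2 export, 4 web, 8 search) are my reading of the translator documentation, so treat them as an assumption; `describeTranslatorType` is a hypothetical helper.

```javascript
// Assumed bit flags for the translatorType field in a translator's header
const TRANSLATOR_TYPES = { import: 1, export: 2, web: 4, search: 8 };

// List the roles encoded in a translatorType value
function describeTranslatorType(translatorType) {
    return Object.keys(TRANSLATOR_TYPES).filter(function (name) {
        return (translatorType & TRANSLATOR_TYPES[name]) !== 0;
    });
}

console.log(describeTranslatorType(4)); // a web translator only
console.log(describeTranslatorType(3)); // both import and export
console.log(describeTranslatorType(2)); // export only
```

Under that assumption, only translators with the export flag set would be candidates for the export dropdown, which would explain the small number shown there.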
Also, I only use Standalone, so I can't speak to the behaviour of the Firefox addon (although I expect it is the same).
One wild idea I have parked is to use Zotero to extract a training corpus to be fed into IBM Watson, i.e. the training corpus would be a collection of items in Zotero. About 500-1000 samples of scraped (and annotated) text in the domain would be needed. Does this mean that I have to create translators to scrape the training corpus in the knowledge domain?
I accept that this usage is straying away from Zotero as citation manager but it seems a reasonable use of Zotero to capture and manage each training corpus.
This guy recently did a similar thing using totally different tech. His approach, although somewhat overcomplicated I think, is v interesting and the outcome would be incredibly useful: https://mystudentvoices.com/scraping-google-scholar-to-write-your-phd-literature-chapter-2ea35f8f4fa1#.7i94whml2
His code is here: https://github.com/jimmytidey/bibnet-google-scholar-scraper. It might be feasible to adapt some of his Scholar-querying code, although I've not looked at it in any depth.
Also, of course, many of Zotero's competitors like Mendeley offer recommendations.
On the recommendation front, I feel that's the only major feature Zotero lacks; if it weren't for (almost) everything else about it being superior, that would be a dealbreaker.
"Zotero meets Watson".
Searching this forum I couldn't find any reference to Watson.
The "existing references" (above scenario) would represent the Watson AlchemyAPI training corpus.
However one missing step is a front end scraped text annotator which in IBM is called Watson Knowledge Studio (WKS).
It can be expensive to use, though (after a free 30-day trial), so another avenue I'm exploring is open source annotators that might be used instead of IBM WKS.
Here is an expansion of my scenario.
Create a free user account at Bluemix. https://new-console.ng.bluemix.net/
Note: I have done that already, but after 30 days your credit card details are required to continue. However you can stay within free usage if you are careful. Do not ask for Technical Support since that would be charged to your credit card account.
Look at Catalog site.
Go down the list of services to Watson > AlchemyAPI. Stay within the IBM AlchemyAPI Free Plan.
An AlchemyAPI key is required .. that is free.
So the end goal (in theory) is to use Zotero to generate the training corpus.
Annotate the scraped text from the training corpus using an open source annotator.
Fire the training corpus to AlchemyAPI to train Watson in the field of research.
Pose questions to IBM Watson in natural language.
i.e. Watson is the "recommendation engine".
I reiterate that one has to be careful not to incur charges on your credit card account. Stay within Free Plan and do not raise any IBM Support Tickets since there is a minimum charge.
I just downloaded freemindv2.js and saved it to my translators folder for Zotero Standalone on Mac. I closed Zotero, reopened it and tried Export, and the .js file was deleted from Finder right before my eyes.
Tried the same for freemind.js. Ditto.
Any help gratefully received. Thanks.
I had been going to this page and using Save Link As ...
https://github.com/melat0nin/zotero-translators/find/master
However, on opening the file in Atom I noticed that it didn't correspond to the code on this page:
https://github.com/melat0nin/zotero-translators/blob/master/Freemindv2.js
Once I used the latter, the export worked.
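A quick way to catch this kind of mix-up is to check that a downloaded file actually starts with a translator metadata block rather than, say, an HTML page. The sketch below is a rough check only: it assumes the header is a JSON object whose opening brace is alone on the first line and whose closing brace is alone on its own line. `readTranslatorHeader` is a hypothetical helper, and the translatorID in the sample is a placeholder.

```javascript
// Rough check: does this text start with a JSON metadata block?
// Assumes "{" alone on the first line and "}" alone on the closing line.
function readTranslatorHeader(source) {
    const lines = source.split(/\r?\n/);
    if (lines[0].trim() !== "{") {
        return null; // e.g. an HTML page saved by mistake
    }
    const end = lines.findIndex(function (line) {
        return line.replace(/\s+$/, "") === "}";
    });
    if (end === -1) {
        return null;
    }
    try {
        return JSON.parse(lines.slice(0, end + 1).join("\n"));
    } catch (e) {
        return null; // header present but not valid JSON
    }
}

// Made-up file contents for demonstration
const goodFile = [
    "{",
    '    "translatorID": "00000000-0000-0000-0000-000000000000",',
    '    "label": "Example Export",',
    '    "translatorType": 2',
    "}",
    "",
    "function doExport() {}"
].join("\n");
const htmlFile = "<!DOCTYPE html>\n<html><body>Not a translator</body></html>";

console.log(readTranslatorHeader(goodFile));
console.log(readTranslatorHeader(htmlFile));
```

Reading the file contents from disk is left out here; any text editor will also show at a glance whether the first line is a `{`.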
However, importing to FreeMind brings up the notification that "the mind map you are trying to open was created with an older version of FreeMind, stored in an old format." It offers to convert to the new format and open.
On doing so, all I get is: "Error while parsing file:freemind.main.XMLParseException: XML Parse Exception during parsing of the XML definition at line 1: Unexpected end of data reached."
Can a dev confirm?
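This is not a diagnosis of the error above, since that would need a look at the exported file, but for reference here is a minimal sketch of turning a nested tree into FreeMind-style XML. It assumes the format is a `<map>` element containing nested `<node TEXT="...">` elements; the `version` value passed in is an assumption and may need to match whatever your FreeMind release writes. All function names are hypothetical. Escaping special characters in titles matters, because a stray `&`, `<` or `"` is enough to break an XML parser.

```javascript
// Escape characters that are not allowed raw inside an XML attribute value
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Recursively render a { name, children } node as nested <node> elements
function nodeToXml(node) {
    const children = (node.children || []).map(nodeToXml).join("");
    return '<node TEXT="' + escapeXml(node.name) + '">' + children + "</node>";
}

// Wrap the root node in a <map> element; the version string is an assumption
function toFreeMind(root, version) {
    return '<map version="' + escapeXml(version) + '">' + nodeToXml(root) + "</map>";
}

// Made-up tree for demonstration
const xml = toFreeMind({
    name: "My Library",
    children: [{ name: 'R&D "notes" <draft>', children: [] }]
}, "1.0.1");

console.log(xml);
```

Opening the exported .mm file in a text editor and comparing it with a map saved by FreeMind itself is probably the quickest way to see where the two differ.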