Export tree of library

laurence80386 · October 13, 2016

Hi all

I've been thinking about how to visualise my Zotero library and I wondered if it's possible to export it in a reasonable format (say JSON) while retaining its hierarchy, to enable importing into visualisation (mind-mapping) tools.

I've played around with the exporting options, but they seem only to work with one level. I suppose doing it programmatically through the API might be sensible, but I thought I'd ask here in case anyone has achieved this kind of thing already.

Thanks!

adamsmith · October 13, 2016

Check the Zotero RDF exporter, which does export the collection hierarchy.
I do want to add JSON-LD support for Zotero and I think (but am not certain) that'd include the collection hierarchy, too, but it's quite ambitious, so I'm not going to give an ETA.

laurence80386 · October 13, 2016

Thanks, I just came across that as I posted. What's the best way to parse it? I tried playing around with Gephi but it seems to come unstuck on the vocabulary Zotero RDF uses.

adamsmith · October 13, 2016

I don't think any tool would be able to parse this into a visualization natively, sorry. There'd have to be some scripting involved. That's also partly due to the fact that the hierarchy in Zotero can't be easily represented in a simple hierarchical data structure, since the same item can belong to multiple collections (i.e. you're looking at a many-to-many relationship, whereas classic hierarchies can only depict one-to-many).
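To make the many-to-many point concrete, here is a minimal sketch in plain JavaScript. It is not Zotero API code: the data shapes and the names `collections`, `items`, `memberships` and `toTree` are all hypothetical. It stores membership as separate item/collection pairs and builds a tree by repeating an item under every collection that contains it, which is one possible workaround rather than the only one.

```javascript
// Hypothetical data: not the Zotero data model, just an illustration.
const collections = [
  { id: "c1", name: "Methods", parent: null },
  { id: "c2", name: "Surveys", parent: "c1" },
  { id: "c3", name: "Theory", parent: null },
];
const items = { i1: "Paper A", i2: "Paper B" };
// Many-to-many: Paper A sits in two collections.
const memberships = [["i1", "c2"], ["i1", "c3"], ["i2", "c3"]];

// Build a plain tree, duplicating an item under each collection that holds it.
function toTree(parentId) {
  return collections
    .filter((c) => c.parent === parentId)
    .map((c) => ({
      name: c.name,
      children: [
        // Sub-collections first, then the items filed directly in this collection.
        ...toTree(c.id),
        ...memberships
          .filter(([, colId]) => colId === c.id)
          .map(([itemId]) => ({ name: items[itemId] })),
      ],
    }));
}

const tree = toTree(null);
console.log(JSON.stringify(tree, null, 2));
```

The duplication is what a mind map would show anyway; a graph format with explicit links would avoid it at the cost of no longer being a tree.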

laurence80386 · October 14, 2016

Thanks for the pointers. I'll have a go at mashing up the Zotero RDF and XML translators to generate a mindmap file -- they're pretty simple XML, so it should be possible. I'll tackle the problem of items in multiple collections later!

Gurdas_Sandhu · October 14, 2016

I'm curious about what can be accomplished with a mind-map, and if what you develop can be shared? Do you have any screenshots or cartoons showing what a Zotero library mind-map would look like? I've occasionally used the Timeline function, but not for serious digging in.

A map of connections between Tags would be nice, but may be too complex?

laurence80386 · October 14, 2016

Initially I just want to visually map the hierarchy of my collections, so I can quickly see the titles which relate to a particular folder. Tags would be the next step, as would the 'related items' options in Zotero. How easy the latter two are will depend on the richness of the internal javascript API, although I suppose that could be extended if necessary.

The citation manager Docear is built around the idea of a mindmap -- I think it's a really intuitive way to get a sense of one's library, although Docear itself isn't a particularly great piece of software and can't really compete with Zotero in terms of functionality.

If I manage to put something together that's workable I'll of course make it available and submit to Zotero. I'm doing my Ph.D. though, so it all depends on what time I can commit to it :)

laurence80386 · October 15, 2016

Okay so I've managed to output some XML that's hierarchical and compatible with Freemind -- here's a mind map of my PhD literature (zoomed out of course, there are ~400 papers on there!):

https://s16.postimg.org/i2xul9pw5/Capture.png

The code is very rough at the moment, but it works. It needs to be refactored according to the DRY principle.

I've added in some bits that are specifically for my workflow (colouring of tags that I use, and icons for read/unread) so I should probably strip that stuff out, or make it sufficiently generic that it can apply to other people's workflows.

Watch this space...

laurence80386 · October 17, 2016

I've done a bit (a lot) of refactoring and now I have a nested JSON export, from which I'll build the XML.

It only has the fields I'm interested in though (title, contributors, year, and tags), since mind maps don't need more than that.
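As a rough illustration of the JSON-to-XML step, here is a minimal sketch that turns a nested object into FreeMind-style `<map>`/`<node TEXT="...">` markup. It is not the translator's actual code: the `{ name, children }` input shape, the function names, and the `version` value are assumptions made for the example, and FreeMind may still offer to convert the file depending on its version. Escaping the attribute text matters because titles often contain `&` or quotes, which would otherwise produce XML that fails to parse.

```javascript
// Escape characters that cannot appear raw inside an XML attribute value.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Recursively turn { name, children } into nested <node TEXT="..."> elements.
function toNodeXml(node, depth = 1) {
  const pad = "  ".repeat(depth);
  const kids = node.children || [];
  if (kids.length === 0) {
    return `${pad}<node TEXT="${escapeXml(node.name)}"/>\n`;
  }
  const inner = kids.map((k) => toNodeXml(k, depth + 1)).join("");
  return `${pad}<node TEXT="${escapeXml(node.name)}">\n${inner}${pad}</node>\n`;
}

// Assumption: the version string is illustrative only.
function toFreeMind(root) {
  return `<map version="1.0.1">\n${toNodeXml(root)}</map>\n`;
}

// Hypothetical library with awkward characters in the titles.
const library = {
  name: "My Library",
  children: [
    { name: "Law & Technology", children: [{ name: 'A paper on "code"' }] },
    { name: "Unfiled" },
  ],
};
const xml = toFreeMind(library);
console.log(xml);
```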

It would be possible to add in the code for the other fields from other translators to extend it in order that it's a full export.

I'll upload the translator as it stands to bitbucket in the next few days.

laurence80386 · October 17, 2016

Here's the hierarchical JSON translator. As mentioned above, it doesn't include all the metadata, so will need to be extended to be a "true" export.

https://bitbucket.org/laurenced85/zotero-translators/raw/master/Hierarchical JSON.js

EDIT: repo has been moved to https://github.com/melat0nin/zotero-translators, and this translator to https://raw.githubusercontent.com/melat0nin/zotero-translators/master/Hierarchical JSON.js

Gurdas_Sandhu · October 18, 2016

Thanks for sharing this. How do I use the .js script?

adamsmith · October 18, 2016

It's just a Zotero translator. You place it in the translators directory of the Zotero data folder (https://www.zotero.org/support/zotero_data), restart Zotero and it'll show up when you use right-click --> Export.

dragonfly · October 18, 2016

@laurence.diver Thank you for that translator script as a baseline for developing other ideas for navigating Zotero libraries.

I had been trying to parse the export output files from existing translators (e.g. Zotero RDF) not realising that a custom translator can be written and just dropped into the translators folder in the manner described.

I'm a beginner in developing a translator. One point which still puzzles me is why there are many (400+) translators in the translators folder but only a few (18) seen in the export dropdown menu. And why is there a zipped translators folder in zotero.jar?

But moving on ...

I found this reference to translators.

https://www.zotero.org/support/dev/how_to_write_a_zotero_translator_plusplus

I was pursuing the suggestion of @adamsmith to parse Zotero RDF export.
In fact I can do this to some degree by using EasyRDF converter.

http://www.easyrdf.org/

and

http://www.easyrdf.org/converter

But your starter translator script now makes it much easier.

My aim is to add interactivity to individual library nodes (add events).
Then visualise tags and relations between nodes (i.e. beyond a simple hierarchical tree structure).

Here is a basic example of what I have in mind using d3.js

http://bl.ocks.org/d3noob/8375092

But this is too basic for showing inter relationships of nodes.

We then move on to visualising "many-to-many" relationships from a Zotero library.

Here is an example.

http://www.global-migration.info/

and

https://github.com/d3/d3/wiki/tutorials

...

Incidentally I have now learned to embed Zotero.debug() into a translator under development.

And to test JS validity here: http://www.jslint.com/

adamsmith · October 18, 2016

(unfortunately there is no good manual for writing translators. https://www.zotero.org/support/dev/how_to_write_a_zotero_translator_plusplus is clumsy and also outdated. Best I can recommend is https://www.zotero.org/support/dev/translators/coding in combination with looking at existing translators )

laurence80386 · October 19, 2016

@dragonfly

Glad you worked it out. This is precisely the sort of visualisation I want to achieve.

The question of the tree depends on one's usage of Zotero -- I tend to categorise everything in deeply-nested collections (including having the same paper in more than one), so the tree approach works to an extent for me. Others have a flatter structure and rely more on tags, in which case this basic output won't be that helpful.

Ideally the end result will be to allow each of these options, building as you say on this basic initial output. I would envisage showing tag relations and, perhaps (for me) more importantly, the related items option.
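For the tag relations, one simple starting point is a co-occurrence count: two tags are linked whenever they appear on the same item, and the link weight is the number of items they share. The sketch below assumes a hypothetical item shape with just `title` and `tags`; the `source`/`target`/`weight` field names are chosen for illustration because graph tools commonly accept something similar, not because any particular tool requires them.

```javascript
// Hypothetical items: only title and tags, as in the reduced export discussed above.
const taggedItems = [
  { title: "Paper A", tags: ["law", "code", "regulation"] },
  { title: "Paper B", tags: ["code", "law"] },
  { title: "Paper C", tags: ["regulation"] },
];

// Count how often each unordered pair of tags appears on the same item.
function tagLinks(items) {
  const counts = new Map();
  for (const item of items) {
    // De-duplicate and sort so that (a, b) and (b, a) share one key.
    const tags = [...new Set(item.tags)].sort();
    for (let i = 0; i < tags.length; i++) {
      for (let j = i + 1; j < tags.length; j++) {
        const key = JSON.stringify([tags[i], tags[j]]);
        counts.set(key, (counts.get(key) || 0) + 1);
      }
    }
  }
  return [...counts].map(([key, weight]) => {
    const [source, target] = JSON.parse(key);
    return { source, target, weight };
  });
}

const links = tagLinks(taggedItems);
console.log(links);
```

Items with a single tag contribute no links, so a library that is tagged sparsely will give a sparse graph.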

The repository is public, so if you fancy contributing please feel free. (Perhaps I should move it to github, for better publicity?)

dragonfly · October 19, 2016

I'm still experimenting on how to create collection children nodes in exported json to be compatible with D3.js json input as posted earlier.

i.e. to place child collections in an array. "children": [{ ... }]
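A minimal sketch of that reshaping, assuming a hypothetical export in which collections are keyed by name and each holds `collections` and `items` (the real translator output may well differ). It produces the `{ name, children: [...] }` nesting that D3's tree examples typically read; the names `exported`, `toD3Children` and `d3Root` are made up for the example.

```javascript
// Hypothetical input: collections keyed by name, each with sub-collections and item titles.
const exported = {
  PhD: {
    collections: { "Chapter 1": { collections: {}, items: ["Paper A"] } },
    items: ["Paper B"],
  },
};

// Convert one keyed level into an array of { name, children } nodes.
function toD3Children(level) {
  return Object.entries(level).map(([name, col]) => ({
    name,
    children: [
      // Child collections become nested nodes; items become leaf nodes.
      ...toD3Children(col.collections || {}),
      ...(col.items || []).map((title) => ({ name: title })),
    ],
  }));
}

const d3Root = { name: "My Library", children: toD3Children(exported) };
console.log(JSON.stringify(d3Root, null, 4));
```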

Meanwhile, to "beautify" the exported JSON I'm using this.


    /* beautify by using stringify obj with extra options */
    /* idea from here ... http://stackoverflow.com/questions/2614862/how-can-i-beautify-json-programmatically */
    Zotero.write(JSON.stringify(collections, null, 4));

laurence80386 · October 19, 2016

That's a great addition.

I've migrated the repo to Github, you can see it here: https://github.com/melat0nin/zotero-translators

Please submit some pull requests as you improve on the original script :)

Gurdas_Sandhu · October 19, 2016

I saved the .js file in the translators folder but it fails to appear in the "Format: " drop-down menu under Export. I have 17 items in that menu starting with BibLaTeX and ending with Zotero RDF.

@dragonfly most of the translators are used while importing items into Zotero. For example, the Wikipedia translator gets used when you save a Wikipedia page. The translator tells Zotero how to grab the metadata of the page to create an item in Zotero.
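The kind of each translator is recorded in the `translatorType` field of the JSON header at the top of its file. The sketch below decodes that field, assuming the bit values given in the translator documentation (1 import, 2 export, 4 web, 8 search, summed when a translator does more than one job); please check those values against the docs before relying on them. The helper names `describeType` and `showsInExportMenu` are made up for the example.

```javascript
// Assumed bit flags for the translatorType header field (verify against the docs).
const TRANSLATOR_TYPES = { import: 1, export: 2, web: 4, search: 8 };

// List the roles encoded in a translatorType value.
function describeType(translatorType) {
  return Object.keys(TRANSLATOR_TYPES).filter(
    (name) => (translatorType & TRANSLATOR_TYPES[name]) !== 0
  );
}

// Only translators with the export bit set would be offered under Export.
const showsInExportMenu = (translatorType) =>
  (translatorType & TRANSLATOR_TYPES.export) !== 0;

console.log(describeType(4), showsInExportMenu(4)); // a web (site-scraping) translator
console.log(describeType(3), showsInExportMenu(3)); // an import + export translator
```

Under that assumption, most of the files in the folder are web translators, which would explain why only a handful show up in the export menu.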

laurence80386 · October 19, 2016

@gurdas did you restart Zotero? It scans the translators directory on start-up.

Also, I only use Standalone, so I can't speak to the behaviour of the Firefox addon (although I expect it is the same).

dragonfly · October 19, 2016

@gurdas I am learning more about translators as I read the docs.

One wild idea I have parked is to use Zotero to extract a training corpus to be injected into IBM Watson, i.e. the training corpus is a collection of items in Zotero. About 500-1000 samples of scraped (and annotated) text in the domain would be needed. Does this mean that I have to create translators to scrape the training corpus in the knowledge domain?

I accept that this usage is straying away from Zotero as citation manager but it seems a reasonable use of Zotero to capture and manage each training corpus.

laurence80386 · October 19, 2016

One thing I'd like to achieve (in an ideal world), which is sort of related to @dragonfly 's idea, would be a recommendation engine that queries Google Scholar using the existing references in a library, suggesting new ones based on what's already there.

This guy recently did a similar thing using totally different tech. His approach, although somewhat overcomplicated I think, is v interesting and the outcome would be incredibly useful: https://mystudentvoices.com/scraping-google-scholar-to-write-your-phd-literature-chapter-2ea35f8f4fa1#.7i94whml2

His code is here: https://github.com/jimmytidey/bibnet-google-scholar-scraper. It might be feasible to adapt some of his Scholar-querying code, although I've not looked at it in any depth.

adamsmith · October 19, 2016

Related/potentially of interest: http://papermachines.org/

Also, of course, many of Zotero's competitors like Mendeley offer recommendations.

laurence80386 · October 19, 2016

Yeah I played with Paper Machines but I couldn't get any output that was actually of use to me :)

On the recommendation front I feel that's the only major feature Zotero lacks; if it wasn't for (almost) everything else about it being superior that would be a dealbreaker.

dragonfly · October 19, 2016

I have also tried papermachines plugin but it does not meet my requirements.

.. in an ideal world .. would be a recommendation engine that queries Google Scholar using the existing references in a library, suggesting new ones based on what's already there.

That scenario matches what I have in mind, but using IBM Watson cognitive processing engine to query the knowledge base.

"Zotero meets Watson".

Searching this forum I couldn't find any reference to Watson.

The "existing references" (above scenario) would represent the Watson AlchemyAPI training corpus.

However one missing step is a front end scraped text annotator which in IBM is called Watson Knowledge Studio (WKS).

This can be expensive though to use (after a free 30 days trial) and another avenue of research I'm exploring is looking for open source annotators which might be used instead of IBM WKS.

Here is an expansion of my scenario.

Create a free user account at Bluemix. https://new-console.ng.bluemix.net/

Note: I have done that already, but after 30 days your credit card details are required to continue. However you can stay within free usage if you are careful. Do not ask for Technical Support since that would be charged to your credit card account.

Look at Catalog site.

Go down the list of services to Watson > AlchemyAPI.

An AlchemyAPI service that analyzes your unstructured text and image content.

Stay within IBM AlchemyAPI Free Plan.

An AlchemyAPI key is required .. that is free.

So the end goal (in theory) is to use Zotero to generate the training corpus.
Annotate the scraped text from the training corpus. Using an open source annotator.
Fire the training corpus to AlchemyAPI to train Watson in the field of research.
Pose questions to IBM Watson in natural language.

i.e. Watson is the "recommendation engine".

I reiterate that one has to be careful not to incur charges on your credit card account. Stay within Free Plan and do not raise any IBM Support Tickets since there is a minimum charge.

myeBR · September 22, 2017

Sorry for the naive question, I'm a new learner. I would like to know how to use this code. Is there any plugin that I can use? Mind maps with tags are extremely useful! Thank you for all the replies!

sheldrake · December 20, 2017

Thanks for all the work here.

I just downloaded freemindv2.js and saved it to my translators for Zotero standalone on Mac. I close Zotero. I open Zotero and Export and the .js file is deleted from Finder right before my eyes.

Tried the same for freemind.js. Ditto.

Any help gratefully received. Thanks.

sheldrake · December 21, 2017

I think I've solved my own problem.

I had been going to this page and Save Link As ...
https://github.com/melat0nin/zotero-translators/find/master

However, on opening the file in Atom I noticed that it didn't correspond to the code on this page:
https://github.com/melat0nin/zotero-translators/blob/master/Freemindv2.js

Once I used the latter, the export worked.

However, importing to FreeMind brings up the notification that "the mind map you are trying to open was created with an older version of FreeMind, stored in an old format." It offers to convert to the new format and open.

On doing so, all I get is: "Error while parsing file:freemind.main.XMLParseException: XML Parse Exception during parsing of the XML definition at line 1: Unexpected end of data reached."

ruzz · July 17, 2018

Hi! Has anyone found any solution that works with the standalone version?

emilianoeheyns · July 21, 2018

BBT's debug translator (BetterBibTeX JSON) outputs pretty complete references in JSON format, which I'd argue is easier to read and parse than XML. It needs BBT installed though; the translator does not work without it.

laurence80386 · May 20, 2020

Been trying to look at this again and it seems Zotero.nextCollection() (or Z.nextCollection() in v5?) no longer works.

Can a dev confirm?

emilianoeheyns · May 20, 2020

Zotero.nextCollection must work or the BBT tests would fail.
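For anyone debugging this, a minimal sketch of the usual loop pattern, written against a stand-in object so it runs outside Zotero. `makeStub`, `ZoteroStub` and `drain` are hypothetical names; the only behaviour assumed of the real `Zotero.nextCollection()` is that it returns one collection per call and a falsy value when there are none left. As far as I know, collections are only passed to an export translator when its header sets `getCollections` under `configOptions`, as the bundled Zotero RDF translator appears to do, so that is worth checking if the call returns nothing; treat this as an assumption to verify.

```javascript
// Stand-in for the translator sandbox so the sketch runs outside Zotero.
// Assumption: nextCollection() yields one collection per call, then a falsy value.
function makeStub(collections) {
  const queue = collections.slice();
  return { nextCollection: () => queue.shift() || false };
}

// Drain any "next..." style function into an array.
function drain(next) {
  const out = [];
  let value;
  while ((value = next())) {
    out.push(value);
  }
  return out;
}

const ZoteroStub = makeStub([{ name: "PhD" }, { name: "Side project" }]);
// In a real translator this would be drain(() => Zotero.nextCollection()).
const names = drain(() => ZoteroStub.nextCollection()).map((c) => c.name);
console.log(names);
```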