Subset a library by export/import

I want to reduce the size of my Zotero library. It's grown pretty big because I've used Zotero since the beginning, and there's a lot of crap in there (duplicates and spurious items resulting from tests, imports, different usage patterns over time, etc.). This makes Zotero slower than it might be but, more importantly, makes my search results large and somewhat redundant.

I could go through manually, but that would take longer than I have time for now. My original idea had been to start with a new library and just export/re-import the particular collections I'm most interested in for current purposes. But the RDF export/import (which I wanted to use because I wanted to carry my storage files over too) seems very fragile. If the export doesn't break (which it usually does), the import does.

Any other ideas for how I might achieve this?
  • How does Zotero RDF export break? If you have test cases of Zotero RDF imports that fail, send them to support@zot....org.

    At least for duplicate detection, you might want to hold on a bit longer, as there's already (rudimentary, merge-less, and therefore hidden) duplicate detection support in the trunk, which should be improved in the near future.
  • CB
    edited April 26, 2009
    A typical export breakage results in an empty RDF file and this error:

    [JavaScript Error: "[Exception... "Component returned failure code: 0x80520008 (NS_ERROR_FILE_ALREADY_EXISTS) [nsILocalFile.create]" nsresult: "0x80520008 (NS_ERROR_FILE_ALREADY_EXISTS)" location: "JS frame :: chrome://zotero/content/xpcom/translate.js :: anonymous :: line 2069" data: no]" {file: "file:///D:/crispin/content/zotero/translators/Zotero%20RDF.js" line: 0}]
    Looking at that, I wonder if the problem is that I'm exporting collections with the same items in one or more subcollections? Perhaps I could flatten the collection out, and then export that without error? I'll check this out.

    I've only had a couple of examples of imports breaking (because exports generally fail). They 'break' by hanging -- the progress bar just stays there. I've left it overnight. When I eventually kill FF and restart, there is a new empty import collection.

    Edit: I look forward to the duplicate detection. But I want to clean up my library more generally, as Zotero's become pretty central to the way I work.
  • I wonder if the problem is that I'm exporting collections with the same items in one or more subcollections?
    Yup, that was it. If a linked file (not an imported file) appeared in two subcollections, it threw an error on export. Fixed on the trunk—thanks. Let us know if you find any other examples.
  • The "unfiled" search condition, once it's implemented, might also help you avoid using export/import.
  • Thanks Dan.

    I'm sending an RDF that hangs on import on to support. My upload's very slow, so it will take a while ...
  • Having spent a little while using a cut-back version of my old library for a recent project, I think a robust way of subsetting a library would be very useful, particularly for those of us who use relatively slow computers (there are many of these left in the world).

    My new, slimmer Zotero library is dramatically faster to use than my original one, particularly for adding new items. I hadn't really noticed that I had developed a habit of being reluctant to add new PDFs because I didn't want to break the flow of work with the long wait.

    Of course, the problem is that my online Zotero library now has only the small set of collections I'm currently working with.

    Actually this harks back to a discussion I vaguely remember here from a long time ago, when a couple of us asked for the ability to easily switch libraries. You guys weren't keen, for what I thought were defensible reasons. But now I think a more flexible way to prune and split items between libraries, and work with multiple libraries, would be most useful.
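
For anyone wanting to try the "flatten the collection out" idea from earlier in the thread, here is a minimal sketch of the logic in Python. It assumes a made-up in-memory model of a collection tree (the `library` dict, the `walk` and `flatten` helpers, and the file names are all hypothetical, not Zotero's actual data structures or API). It only shows how one might list each item once and spot items that sit in more than one subcollection, which is the case that triggered the export error above.

```python
from collections import defaultdict

# Hypothetical collection tree for illustration only; this is NOT
# Zotero's real data model or API.
library = {
    "name": "Thesis",
    "items": ["smith2001.pdf"],
    "children": [
        {"name": "Methods", "items": ["jones1999.pdf", "shared-link.pdf"], "children": []},
        {"name": "Theory", "items": ["shared-link.pdf"], "children": []},
    ],
}


def walk(collection, path=()):
    """Yield (item, collection_path) for every item in the tree, depth-first."""
    here = path + (collection["name"],)
    for item in collection["items"]:
        yield item, "/".join(here)
    for child in collection["children"]:
        yield from walk(child, here)


def flatten(collection):
    """Return unique items in first-seen order, plus the items found in several places."""
    seen = defaultdict(list)
    for item, where in walk(collection):
        seen[item].append(where)
    unique = list(seen)
    repeated = {item: places for item, places in seen.items() if len(places) > 1}
    return unique, repeated


unique_items, repeated_items = flatten(library)
print(unique_items)
print(repeated_items)
```

The `repeated_items` map is the useful part for this thread: anything listed there appears in two or more subcollections, so it would be a candidate to check before exporting on a build that doesn't yet have the fix.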