create a human-readable folder-structure from zotero-storage

Hi.

I am using several devices where I cannot (or wouldn't want to) run the zotero software. In my case this is an e-reader & a phone. To access my library on those devices it would be helpful, if zotero could at least export the storage to a human-readable folder-structure.
  • Nice. This seems to be what I was looking for. But I can't really get it to work for me. When i run:

    #!/usr/bin/perl
    use DBI;

    my $zoterostorage="/home/goiken/.mozilla/firefox/f20c4v0e.default/zotero";
    my $target="/home/goiken/Dropbox/lib";

    my $dbh = DBI->connect("dbi:SQLite:dbname=$zoterostorage/zotero.sqlite","","");

    # Query all PDF attachments

    my $sth = $dbh->prepare('SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.sourceItemID=i.itemID AND ia.linkMode=1 AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID' ) or die "Couldn't prepare statement: " . $dbh->errstr;
    $sth->execute() or die "Couldn't execute statement: " . $sth->errstr;

    # Up to here this is a copy of the linked script

    @data = $sth->fetchrow_array();
    print @data;


    I get no return, from which i conclude that @data is empty.
  • edited January 29, 2011
    Interesting. The script works for me.

    The best way to troubleshoot is to try running the SQL query outside the script and see what you get. Execute this in the folder where the Zotero database is located at.

    echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.sourceItemID=i.itemID AND ia.linkMode=1 AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID;' | sqlite3 zotero.sqlite
  • edited January 29, 2011
    When I do this i get:

    The program 'sqlite3' is currently not installed.

    I installed "sqlite" when I was setting up the script. After I install sqlite3 it executes without errors. When I then go on & run the full script again, I get:
    problem connecting to "localhost", port 4242: Connection refused at /usr/local/share/perl/5.10.1/MozRepl/Client.pm line 144 Sry: I ran the wrong script ;-)
    … I get nothing again.
  • So you do get output with the command line sqlite, but not when run inside the perl script?

    One thing that comes to my mind is that the SQL query in the perl script might need to be terminated with ; on linux but not on mac. (I assume you are on linux based on the /home directory)

    So this line

    my $sth = $dbh->prepare('SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.sourceItemID=i.itemID AND ia.linkMode=1 AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID' ) or die "Couldn't prepare statement: " . $dbh->errstr;

    needs to be changed to

    my $sth = $dbh->prepare('SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.sourceItemID=i.itemID AND ia.linkMode=1 AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID;' ) or die "Couldn't prepare statement: " . $dbh->errstr;

    Does it work if you do this change? If it does, I will update the original post accordingly.
  • edited January 29, 2011
    No.
    I get no output with both the command line sqlite and with the perl script. The additional ';' doesn't help either.

    I'm on Ubuntu 11.04, using FF 3.6.13 and zotero 2.09, if that helps…
  • If you know SQL, the best way to figure out why it does not work is to simplify the SQL query to see what goes wrong.

    I would start by first dropping all conditions from the SQL statement

    echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia;' | sqlite3 zotero.sqlite

    Then add gradually conditions to see where it breaks. E.g.

    echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID;' | sqlite3 zotero.sqlite

    echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID;' | sqlite3 zotero.sqlite
  • My first time using SQL. I can only gues what the querry actually means ;-).

    The first statement gives me this:

    $ echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia;' | sqlite3 zotero.sqlite
    Error: near line 1: database is locked

    I tried changing permissions of zotero.sqlite, but it didn't help.
  • When I disable the add on, I can execute the first three sql-queries and i get output that looks like this:
    Zotero Quick Start Guide|Zotero Quick Start Guide|ABCD2345|storage:1279666769962.pdf
    Zotero Quick Start Guide|Zotero Quick Start Guide|ABCD2345|storage:1279666769962.pdf
    Zotero Quick Start Guide|Zotero Quick Start Guide|ABCD2345|storage:1279666769962.pdf
    Zotero Quick Start Guide|Zotero Quick Start Guide|ABCD2345|storage:1279666769962.pdf
    Zotero Quick Start Guide|Zotero Quick Start Guide|ABCD2345|storage:1279666769962.pdf

    (Each item comes up ~100 times)
    The full query still doesn't produce any output for me.
  • edited January 29, 2011
    I think I found the place, where it hangs up:

    This works:
    echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID AND ia.sourceItemID=i.itemID;' | sqlite3 zotero.sqlite

    and gives output that looks like this:
    1993-00-00 1993|Parliamentary Affairs|KQAB2RXR|storage:333.pdf
    1983-00-00 1983|Geometriae Dedicata|ECWJG6VT|storage:fulltext(14).pdf
    1992-00-00 1992|PROCEEDINGS-IEEE|DEFB7VMG|storage:shermer1992.pdf


    This one does not work: (Runs, but no output)
    echo 'SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID AND ia.sourceItemID=i.itemID AND ia.linkMode=1;' | sqlite3 zotero.sqlite

    The Difference is in the last "AND ia.linkMode=1". So what does that mean?
  • If I change the script to

    my $sth = $dbh->prepare('SELECT idv1.value, idv2.value, i2.key, ia.path FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2, items i, items i2, itemAttachments ia WHERE i.itemID=id1.itemID AND i.itemID=id2.itemID AND idv1.valueID=id1.valueID AND id1.fieldID=14 AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND ia.sourceItemID=i.itemID AND ia.mimeType="application/pdf" AND i2.itemID=ia.itemID;' ) or die "Couldn't prepare statement: " . $dbh->errstr;
    $sth->execute() or die "Couldn't execute statement: " . $sth->errstr;

    It runs fine. *happy* Now I only need to make copies instead of sym-links and see how i get the structure to be

    /$target/author/title_from_zotero.pdf
  • I got the copy-part and I think I know how I would change the target path, but I have still no Idea how I get the "author" and the "title" from the database.
  • The linkmode-column tells if the item is an external item that is linked to zotero or an imported item. Basically it tells where the file is located. (If I remember correctly) I was assuming that no-one wants to copy/link files that are not inside Zotero's data directory.

    Why do you want to make copies of the same file? If you do not like the symlinks, change the symlink function to link. That way you will get hard links and don't waste disk space with duplicate files.
  • I want to include this script in a piece, that syncs external devices for me. (Reader, phone) Hence I need Copies instead of sym-links.

    Can you tell me how I get the author and the title from an Item out of the database?
  • If your DropBox folder is on the same filesystem as your Zotero data directory, hard links are better. They will sync to other DropBox devices like normal files. (I assume that you know what hard links are.)

    The field IDs (e.g. id1.fieldID=14) tell which fields the script retrieves from the database. I do not remember how the mapping works, but it is recorded somewhere in the database.
  • One way to "reverse engineer" the field type IDs is to run the following query

    echo "select * from itemData, itemDataValues where itemData.valueId=itemDataValues.valueId;" | sqlite3 zotero.sqlite

    The second column in the listing is the field id and last column is content. You can probably figure out which ID is the date and which is the author by looking at the field content.
  • From this output I can figure that the title-ID is "110". But I can't find any authors in this output.

    How do i make the SQL-query if I know the IDs?
  • You can just change the IDs 12 and 14 in the original query to what you like.

    Authors are actually stored in a different table called creatorData. I will take a look at that and post back shortly.

    If I knew how to run a script every time an item is edited/added/deleted in Zotero, I would make this little script a plugin.
  • Here is a commented version of the SQL script


    'SELECT
    /* Any fields that we use to buld the path name from */
    idv1.value, idv2.value,
    /* fields that are needed to locate the file inside zotero data directory */
    i2.key, ia.path
    /* Tables needed for the fields that we use for path name */
    FROM itemDataValues idv1, itemDataValues idv2, itemData id1, itemData id2,
    /* Tables for parent item, child item and attachment data for the child item */
    items i, items i2, itemAttachments ia
    WHERE
    /* Link the first field that we need for path name */
    idv1.valueID=id1.valueID AND id1.fieldID=14 AND i.itemID=id1.itemID
    /* Link the second field that we need for path name */
    AND idv2.valueID=id2.valueID AND id2.fieldID=12 AND i.itemID=id2.itemID
    /*Links the attachment to the parent item*/
    AND ia.sourceItemID=i.itemID AND i2.itemID=ia.itemID
    /* Only items stored within Zotero */
    AND ia.linkMode=1
    /* Only PDFs */
    AND ia.mimeType="application/pdf";'


    You can can get the author data by changing the query like this

    'SELECT
    /* Any fields that we use to buld the path name from */
    idv1.value, cd.lastName,
    /* fields that are needed to locate the file inside zotero data directory */
    i2.key, ia.path
    /* Tables needed for the fields that we use for path name */
    FROM itemDataValues idv1, itemData id1, creators c, creatorData cd, itemCreators ic,
    /* Tables for parent item, child item and attachment data for the child item */
    items i, items i2, itemAttachments ia
    WHERE
    /* Link the first field that we need for path name */
    idv1.valueID=id1.valueID AND id1.fieldID=14 AND i.itemID=id1.itemID
    /* First author */
    AND i.itemID=ic.itemID AND ic.creatorID=c.creatorID AND cd.creatorDataID=c.creatorDataID AND ic.orderIndex=0
    /*Links the attachment to the parent item*/
    AND ia.sourceItemID=i.itemID AND i2.itemID=ia.itemID
    /* Only items stored within Zotero */
    AND ia.linkMode=1
    /* Only PDFs */
    AND ia.mimeType="application/pdf";'
  • edited January 29, 2011
    Thank you so much!

    This is the winning candidate for the particular problem, in case anybody else's looking for it. (again I had to comment the "ia.linkMode=1". wouldn't give me output if i don't)

    echo 'SELECT /* Any fields that we use to buld the path name from */ idv1.value, cd.lastName, /* fields that are needed to locate the file inside zotero data directory */ i2.key, ia.path /* Tables needed for the fields that we use for path name */ FROM itemDataValues idv1, itemData id1, creators c, creatorData cd, itemCreators ic, /* Tables for parent item, child item and attachment data for the child item */ items i, items i2, itemAttachments ia WHERE /* Link the first field that we need for path name */ idv1.valueID=id1.valueID AND id1.fieldID=110 AND i.itemID=id1.itemID /* First author */ AND i.itemID=ic.itemID AND ic.creatorID=c.creatorID AND cd.creatorDataID=c.creatorDataID AND ic.orderIndex=0 /*Links the attachment to the parent item*/ AND ia.sourceItemID=i.itemID AND i2.itemID=ia.itemID /* Only items stored within Zotero AND ia.linkMode=1 Only PDFs */ AND ia.mimeType="application/pdf";'| sqlite3 zotero.sqlite
Sign In or Register to comment.