Any safe way to make global edits now?

I don't know a lot about databases, but the fact that Zotero uses SQLite, which has other clients, makes me wonder whether there wouldn't be a way to 'clean up' some of my entries even now (before global editing functionality is officially added). I know that messing with your database apart from the Zotero API is discouraged, but is it also stupid?

My questions:

(1) Is there a SQLite client which would let me do search-and-replace edits within certain fields (to remove trailing punctuation titles, Publisher and place; remove database URIs from the URL field; add a period after lonely first and middle initials)

(2) How high are the chances of messing up my data in a way that wouldn't be immediately obvious? Of course I would backup thoroughly and then test my resulting database with the database integrity checker.
  • edited September 18, 2007
    If you're an advanced user who wants to get their hands dirty, the Zotero API should still allow you to make these kinds of edits & you might ask questions or describe limitations about the API (as that will help everyone). An additional benefit of using the API is that your edits can be more easily repeated in the future & you might be able to share the utility if others would find it useful.

    If you really wish to eschew javascript, other apps are certainly technically able to edit the database. However, they aren't supported by the Zotero project & you should make sure you backup your database before using them & realize that mistakes can screw up your database.
  • edited September 18, 2007
    Thanks. Unfortunately the startup cost of learning javascript is probably a little too much for me just now, in the middle of a dissertation.

    Reading the doc page more carefully, I see the following.
    And even if Firefox is shut down before accessing the file, modifying the database directly bypasses the data validation and referential integrity checks performed by Zotero that are required for Zotero to function properly (and which aren’t provided by SQLite itself)
    Should I take this to mean that even, say, search-and-replace type edits performed, say, via the SQL Database Browser would be a Pretty Bad Idea? Is it that easy to corrupt a database? And if so, how about my second question above. How easy is it to mess something up and just not know about it for a while (long enough to do a significant amount more work) which would be lost if I revert to a backup?

    I would rather use the Zotero API as well as Make Something Reusable, but I won't manage it, I'm afraid. And I'd like to clean up my data (1100 records or so, with various artifacts of importing). Thanks again for all your work on this.
  • Should I take this to mean that even, say, search-and-replace type edits performed, say, via the SQL Database Browser would be a Pretty Bad Idea?
    Here's what that warning means:

    1) SQLite doesn't provide a way of enforcing that, say, a row in the itemAttachments table has a corresponding row in the items table. (Technically speaking: it doesn't enforce foreign key constraints.) However, the schema definitions do include the foreign key definitions, so by looking at the schema you can see what precautions you'd need to take. (Deleting a row in items would be an example of a Bad Idea, since 5 or so other tables reference items(itemID)).

    Similarly, itemData references rows in itemDataValues, so deleting rows in itemDataValues without deleting the corresponding rows in itemData would be Bad. (And no, these things aren't going to corrupt your database, per se, which is all the integrity checker checks—they just might break Zotero in weird and non-immediate ways and require you to go back into the database and delete remaining rows that reference the deleted rows, and the necessary steps might not be clear from the errors messages or odd behavior in Zotero.)

    2) Zotero does certain cleanup steps when it removes things. For example, after removing a row from itemData, it deletes any rows from itemDataValues that are no longer used. You run the risk of skipping these steps when you do things manually. That's not necessarily terrible, and those cleanup steps might happen automatically the next time you remove things in Zotero, but there's a chance you could leave some unused data behind if you're not careful.

    3) Certain fields have to conform to particular formats. Off the top of my head this is limited to the Date field, but there may be others. However, this is less of an issue when removing fields (as opposed to adding fields, which I very much wouldn't recommend doing outside of Zotero).

    Aside from the above issues, you'd need a decent grasp of SQL and our schema to do it right. For example, if you try to change a field by changing a value in itemDataValues, you might hit a primary key constraint on another row, and you'd need instead to delete the incorrect row and change itemData rows pointing to the deleted row to point to the other row with the correct value. Et cetera.

    It's also worth noting that SQLite isn't the most featureful database and lacks various things that you might want/need, like advanced string functions or built-in regex support. (You can add regex support via a user-defined function, and some third-party SQLite tools might provide this, but I'm not aware of any.)

    You're probably on your own other than that.
  • Thanks very much for this, Dan. I'll post here if I have any sucess.
Sign In or Register to comment.