Library Maintenance questions

Hello. I have been using Zotero for quite some time and have a fairly large library. So far everything is working fine, but I am having difficulty with both the size of the library and some of its content. The difficulty arises when I back up the library over my network. I use a backup utility called Allway Sync, which allows scheduled backups of individual folders and programs. Once Allway Sync has created the first image of a folder, all further backups cover only the new information. This works fine except with the Zotero folder. According to the file backup log, the number of files exceeds 500,000. Why so many? I realized that the bulk of the files come from the snapshots that Zotero saves from any journal article webpage. I have a few questions. Is there some way to limit what is saved locally in a snapshot? Can I stop saving a snapshot of the webpage and instead just retain the URL and any downloaded PDF file? Finally, is there any way to clean the Zotero database, eliminate the folders associated with the snapshots, and in turn delete the snapshots from the Zotero records? I figure that this would reduce the number of files in the library by over two-thirds. Thanks very much.
  • 1. Is there some way to limit what is saved locally in a snapshot?

    No, not at this time, I'm afraid. Some thought has gone into how the snapshot feature could be made leaner (websites were much smaller when it was introduced), but I don't think anything will happen very soon.

    2. Can I stop saving a snapshot of the webpage and instead just retain the URL and any PDF file download?

    Yes, there's an "Automatically attach Snapshots" option in the general tab of the Zotero preferences that should do this for most cases.

    3. Finally is there any way to clean the zotero database and eliminate the folders associated with the snapshots and in turn delete the snapshot from the zotero records?

    Advanced search:
    match all
    Attachment File Type -- is -- Web Page
    Title -- contains -- Snapshot

    Save this as a saved search. If you select all (Ctrl+A) in the saved search, only the attached snapshots will be highlighted. Right-click --> Move to Trash, then empty your trash.

    Note that this will delete all snapshots, even for items that have no PDF.
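
Before trashing anything, it can be worth checking how much of the file count the snapshots really account for. The following is only a minimal sketch, not a Zotero feature: it assumes that attachments live in subfolders of the "storage" folder inside the Zotero data directory, and the function name `tally_storage` is made up for illustration. It only reads the folder tree and never modifies it.

```python
from collections import Counter
from pathlib import Path


def tally_storage(storage_dir):
    """Count files per attachment folder and per file extension.

    storage_dir is assumed to be the "storage" folder of a Zotero data
    directory, with one subfolder per attachment. Nothing is modified.
    """
    root = Path(storage_dir)
    per_folder = Counter()
    per_extension = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 2:
            # A file sitting directly in storage_dir belongs to no
            # attachment folder, so it is skipped.
            continue
        per_folder[parts[0]] += 1
        per_extension[path.suffix.lower() or "(none)"] += 1
    return per_folder, per_extension


# Example use (the path is a placeholder, adjust it to your own setup):
# per_folder, per_extension = tally_storage("/path/to/Zotero/storage")
# print(per_folder.most_common(20))     # folders holding the most files
# print(per_extension.most_common(10))  # most frequent file types
```

Folders with hundreds of small image, script, and stylesheet files are most likely snapshots, whereas a PDF attachment folder normally holds very few files.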
  • edited January 17, 2018
    @adamsmith What file types are covered under "Web Page"? .html, .htm, .mht, .maf ...?
  • I don't know, honestly, but I'd expect just html and htm
  • Thanks. Any way to search for attachments with a specific file type extension? I thought that used to be possible, but might be mistaken.
  • If you search for 'Title' 'contains' '.pdf' or whatever extension, it will return the attachments with that extension.
  • Thanks. That helped very much. But I still have a fair number of folders that have "stuff" in them related to downloaded webpages. These folders, it seems, have been severed from the Zotero database. I can find an alternate record with the PubMed link and PDF file attached. This situation has likely built up over time and is related to how I have deleted, merged, and otherwise maintained the reference list. I can delete them by hand, but based on a count of the records in my reference list and the number of folders in the storage location, I still have several thousand unaccounted-for folders. Is there any way to match a folder ID in the storage location with a particular record? Better still, can this be done as a batch search?
  • Each attachment has its own folder (so a snapshot and a PDF for an item will have their own folders). You can find an attachment in Zotero by typing the eight-character folder name (e.g., XO2019AC) into Zotero's search field. But you generally don't need to do this: files corresponding to items that are deleted in Zotero will be removed from your computer in time. (Be sure that you empty the trash in Zotero.)
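
For a batch version of that lookup, the folder names can be compared against the item keys stored in the database. This is a rough sketch under stated assumptions, not an official Zotero tool: it assumes the database has an `items` table with a `key` column and that each storage folder is named after its attachment's key. The function names are hypothetical. It only lists candidates and deletes nothing, and it should be pointed at a copy of zotero.sqlite rather than the live file.

```python
import sqlite3
from pathlib import Path


def load_item_keys(db_path):
    """Return the set of item keys found in a copy of zotero.sqlite.

    Assumption: the database has an `items` table with a `key` column.
    The file is opened read-only, so it cannot be changed by accident.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return {row[0] for row in conn.execute("SELECT key FROM items")}
    finally:
        conn.close()


def find_orphan_folders(storage_dir, known_keys):
    """List storage subfolders whose name matches no known item key."""
    return sorted(
        entry.name
        for entry in Path(storage_dir).iterdir()
        if entry.is_dir() and entry.name not in known_keys
    )


# Example use (both paths are placeholders, adjust them to your own setup):
# keys = load_item_keys("/path/to/copy-of-zotero.sqlite")
# for name in find_orphan_folders("/path/to/Zotero/storage", keys):
#     print(name)
```

A folder that shows up in this list is only a candidate. It is worth opening a few of them by hand and keeping a backup before removing anything.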
  • Thanks. I have tried that approach and have found quite a few folders with no representation in the database, i.e., Zotero's search does not recognize the folder ID string. Some of the folders are related to snapshots or webpages, some are simply empty, and some are, I think, artifacts from the use of ZotFile. Based on their dates, these folders have been around for quite some time, so if there are any housekeeping tools I can use to eliminate them, I would appreciate a pointer. My concern is backing up the database and, of course, improving response. Right now Zotero is working OK even with all of the chaff, but my suspicion is that this can only go on so long before performance becomes an issue.
  • You really don't have to worry about this. Zotero will clean up deleted files in time. If they don't show up in a search, they might be in a group library or in the Zotero trash. As I said, Zotero will delete them in time, and files in storage have no impact on Zotero's performance (which is only affected by the items in the Zotero database).
  • Thanks. From the point of view of Zotero, you are providing good advice. There is nothing broken, so there is nothing to fix. But from the point of view of my network setup, backups, etc., I do have some problems. If there were some way I could refresh the database and start again, I would.