Proposal: Pause and resume file syncing

I just started uploading my files to BingoDisk through a sync operation. The first 200 MB took nearly 20 minutes; I have 4.8 GB of attachments to go, and at the current speed of about 10 MB/minute this will probably take another eight hours. [edit: the files are zipped, so the total space needed will be less]

Thing is, I may not have eight hours today. I need to be able to pause the file syncing process and resume it later. And this is not just about me: the issue will come up frequently once 1.5 is finally released and more users with sizable libraries start syncing their files.

Proposal: make it possible to pause and resume file syncing. If there exists a paused file sync from client 1, client 2 will simply report 'Cannot sync files currently. There is a file sync in progress at another client'.
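To make the proposal concrete, here is a minimal sketch of a pausable upload queue. It is an illustration only, not Zotero's sync code: the `FileSync` class, `check_can_start`, and the `upload` and `should_pause` callbacks are all hypothetical names. The idea is that files already transferred are remembered, so a resumed sync only handles what is still pending, and a second client is refused while an unfinished sync exists.

```python
class FileSync:
    """Hypothetical pausable upload queue; not Zotero's actual sync code."""

    def __init__(self, client_id, files):
        self.client_id = client_id
        self.pending = list(files)   # files still to upload
        self.uploaded = []           # files already transferred

    def run(self, upload, should_pause):
        """Upload files one by one; stop early when should_pause() is true.

        Returns True if every file was uploaded, False if the sync was paused.
        """
        while self.pending:
            if should_pause():
                return False
            upload(self.pending[0])      # if this raises, the file stays pending
            self.uploaded.append(self.pending.pop(0))
        return True


def check_can_start(active, client_id):
    """Refuse to start while another client holds an unfinished file sync."""
    if active is not None and active.pending and active.client_id != client_id:
        raise RuntimeError(
            "Cannot sync files currently. "
            "There is a file sync in progress at another client"
        )
```

Calling `run` again on the same object after a pause simply continues with the remaining files, which is the "resume" half of the proposal.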
  • (I don't have debug logs, sorry - I feared that would slow down the operation.)

    Okay, so here's what happened when I had to stop the sync operation. It had been running for more than five hours, but at a certain point it no longer seemed to transfer files, and the progress bar got stuck at 50%.

    On next startup, I was told that the last sync was last Friday (so apparently this one wasn't counted). Syncing started, and the items added since last Friday got duplicates.

    What I'd like to know is: In which order are item syncs and file syncs put through? From this experience, it looks like Zotero only ever got to the file sync this morning and didn't do anything with the item sync. I think it would make sense to prioritize item sync, and then, in a separate syncing operation, see about syncing files. That way, even if a huge uploading/downloading operation is cancelled (preferably paused), the library items will have synced alright.

    Secondly, it does seem that all attachment files are on my WebDAV storage right now (the number of zip files on the WebDAV space is roughly the same as the number of folders in my local /storage/). So what caused the progress bar to get stuck at 50%, and why did the item sync not come through? Could it have to do with the fact that I added a few PDF attachments to my library during the sync operation?
  • "Syncing started, and the items added since last Friday got duplicates."
    What do you mean by "got duplicates"?
    "On next startup, I was told that the last sync was last Friday (so apparently this one wasn't counted)."
    Last sync time only includes item sync, not storage sync (which doesn't have an individual timestamp).
    "What I'd like to know is: In which order are item syncs and file syncs put through?"
    Currently, storage sync -> item sync -> storage sync. This is because new files need to be uploaded first so that their timestamps can be sent with the item metadata, and then any files marked as updated from the newly downloaded item metadata need to be synced. But we can probably add an item sync at the beginning as well (and optimize it to only do the following syncs if necessary).
    "So what caused the progress bar to get stuck at 50%, and why did the item sync not come through?"
    The item sync only occurs if the storage sync completes cleanly, and there are still a few things that can prevent that from occurring (which also cause the progress meter to be incorrect). If you haven't yet restarted Firefox, do you see anything in Report Errors?
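The ordering described in this reply can be sketched as a chain of steps that stops at the first failure. This is a toy model under stated assumptions, not the real sync code: `sync_current`, `sync_item_first`, and the boolean-returning step callbacks are hypothetical. It shows why a storage sync that does not complete cleanly also blocks the item sync in the current order, and how an extra item sync at the beginning would avoid that.

```python
def _run(steps):
    """Run (name, step) pairs in order; stop at the first step returning False."""
    completed = []
    for name, step in steps:
        if not step():
            break
        completed.append(name)
    return completed


def sync_current(storage_sync, item_sync):
    """Order described in the thread: storage -> item -> storage."""
    return _run([
        ("storage", storage_sync),   # upload new files so timestamps are known
        ("item", item_sync),         # send and receive item metadata
        ("storage", storage_sync),   # fetch files marked as updated
    ])


def sync_item_first(storage_sync, item_sync):
    """Suggested variant: an additional item sync before anything else."""
    return _run([
        ("item", item_sync),
        ("storage", storage_sync),
        ("item", item_sync),
        ("storage", storage_sync),
    ])
```

With a storage step that fails, `sync_current` completes nothing, while `sync_item_first` still gets the first item sync through.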
  • 'Got duplicates' -> 'were duplicated'. I.e. everything added since last Friday (some ten items) was now double.

    Alas, I already restarted Firefox. Will see if I can reproduce it later.
  • "'Got duplicates' -> 'were duplicated'. I.e. everything added since last Friday (some ten items) was now double."
    That absolutely shouldn't happen. Did this happen to items that were added on another machine or on this machine?
  • Just now I got an error on the other side (2005418299). And again (254593060).
    I've not switched on file syncing on client 2. Does it expect me to do so? I was hoping I could postpone downloading the 3 GB...

    Items have not synced on client 2 since last Friday.

    The duplicated items were added on client 1 (the side from which I reported yesterday).
  • "Just now I got an error on the other side"
    You hit a slightly misconfigured upper limit. To avoid the error, you'll need to either manually apply the change, switch to the trunk XPI, or wait for the next build (which will be out by Monday).
  • edited November 26, 2008
    "// Can only handle 999 bound parameters at a time"

    Shouldn't this become

    "// Can only handle 990 bound parameters at a time"

    as well? (comment two lines above the change of changeset 3829 in storage.js)
  • edited November 26, 2008
    No, the default SQLite limit is still 999. It just wasn't accounting for the parameters already used in the rest of the query. I've added a comment to the 990 line to clarify. Thanks.
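The point about the two numbers can be illustrated with a small sketch using Python's standard `sqlite3` module. It is not the code from storage.js: the `items` table, its columns, and the `find_items` helper are hypothetical. The limit of 999 is the default cited in this thread (other SQLite builds may be configured differently); the chunk size of 990 leaves headroom for the parameters the rest of the query already binds.

```python
import sqlite3

SQLITE_MAX_PARAMS = 999   # default limit cited in the thread
CHUNK_SIZE = 990          # headroom for the query's other bound parameters


def chunked(values, size):
    """Yield consecutive slices of at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def find_items(conn, library_id, keys):
    """Hypothetical lookup: one fixed parameter plus a chunked IN (...) list."""
    rows = []
    for chunk in chunked(keys, CHUNK_SIZE):
        placeholders = ",".join("?" * len(chunk))
        sql = (
            "SELECT itemKey FROM items "
            f"WHERE libraryID = ? AND itemKey IN ({placeholders})"
        )
        # 1 fixed parameter + at most 990 keys stays below the 999 limit
        rows.extend(conn.execute(sql, [library_id, *chunk]).fetchall())
    return rows
```

Chunking at exactly 999 would push this query to 1000 bound parameters, which is the kind of off-by-a-few overflow the reply describes.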
  • edited November 26, 2008
    I'd rather wait till Monday, then.

    I am hoping, though, that I can revert to the state in which everything worked fine without file syncing. Right now I get an error message on client 2 (745489739) saying that 'reconciliation is not implemented for collections'. Presumably this means that the collections differ between the two sides. Regardless of collections, any items added yesterday on client 1 don't get through to client 2.

    On client 2, I do get a conflict resolution box for one item I changed last Friday on client 1. However, selecting the remote version and clicking OK doesn't update it; instead the sync operation seems to halt and I get an error (2142646859) saying 'Existing item ... exists in cache in Zotero.DataObjects.reload()'.

    If I now reset server data, do I risk duplicating the whole thing on either side?
  • "Regardless of collections, any items added yesterday on client 1 don't get through to client 2."
    Metadata syncing is atomic: either everything gets through or nothing gets through.
    "If I now reset server data, do I risk duplicating the whole thing on either side?"
    No, that shouldn't happen. But, depending on your data, you may get conflicts. For the moment, unless you have data you need (and can't export to Zotero RDF) on both sides, it's better to clear one side. That plus disabling file sync on both sides should get you back to a working metadata sync.
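As an illustration of what "atomic" means here, the sketch below applies a batch of items inside a single database transaction using Python's `sqlite3` module. It is an analogy, not a description of how Zotero's server or client implements metadata sync: the `items` table and the `apply_items` helper are hypothetical. If any item in the batch fails, none of them is stored.

```python
import sqlite3


def apply_items(conn, items):
    """Insert all (itemKey, title) pairs in one transaction.

    Returns True if the whole batch was stored, False if it was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO items (itemKey, title) VALUES (?, ?)", items
            )
        return True
    except sqlite3.Error:
        return False
```

This all-or-nothing behavior matches the symptom reported above: one unresolved problem in the batch keeps every item from arriving on the other client.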
  • edited November 26, 2008
    Okay, that fixed it. I'll wait for the next release before I start syncing files again. Hope the 3 GB of data on my WebDAV account will still be useful then (I haven't reset storage history).

    By the way, what do 'purge deleted storage files' and 'purge orphaned storage files' mean? Delete stuff from the server that has been deleted locally?

    /edit
    Back to what got this thread started: wouldn't it make sense to have an option for pausing and resuming file sync operations?
  • "Purge deleted storage files" deletes files from the server that have been deleted locally (more than 30 days ago, currently).

    "Purge orphaned storage files" deletes files on the server (of any age, currently) that don't have corresponding items locally.

    Of course, these are debugging options that will likely change, particularly the time-related parts.
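The difference between the two purge options can be sketched as two selection rules. This is a simplified model built only from the description above: the function names, the data shapes (file names, a mapping of deletion times), and the `min_age_days` parameter are hypothetical, and the 30-day cutoff is the value described as current at the time of this thread.

```python
from datetime import datetime, timedelta


def files_to_purge_deleted(server_files, deleted_locally, now, min_age_days=30):
    """Server files whose local deletion happened more than min_age_days ago.

    deleted_locally maps a file name to the time it was deleted locally.
    """
    cutoff = now - timedelta(days=min_age_days)
    return sorted(
        name for name in server_files
        if name in deleted_locally and deleted_locally[name] < cutoff
    )


def files_to_purge_orphaned(server_files, local_items):
    """Server files of any age that have no corresponding local item."""
    return sorted(set(server_files) - set(local_items))
```

Note that in this model the orphan rule is the broader one: it also catches recently deleted files, since they no longer have a local item either.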
  • "Back to what got this thread started: wouldn't it make sense to have an option for pausing and resuming file sync operations?"
    Yes.
  • edited November 26, 2008
    By the way, on yet another system (client 3, starting with a blank slate) I've been able to download all my attachments and sync the whole library just fine. This completes my BingoDisk.com experiment.