Synching with large imports

Hey (probably Dan)

My colleagues and I are doing a systematic review and we are having serious syncing issues probably brought on by the sheer volume of imports. My Zotero has not been able to sync in 16 days or so, which is creating a larger and larger backlog of documents (I have a couple thousand documents that the server does not and vice versa). I have tried to sync in Firefox but it basically hangs. I then tried to open standalone and let it sync for two whole days (without any other program running) and I never got past the blue "processing data" symbol. I even checked on it whenever I could to make sure the computer had not fallen asleep, but it still did not sync. I do not think I can sacrifice more time for syncing than two whole days, especially when it did not progress at all. Help!

Thanks!
  • There's certainly no need to wait two days. If a sync is still going after an hour, it's almost certainly not actually running.

    Your client is trying to sync over 60,000 items, including 47,500 deleted items. That probably won't work. (Zotero 4.1 will have a new sync architecture that's better able to handle huge syncs like these.)

    If there's a computer that can sync successfully—which would presumably be the computer where all of these (failed?) imports happened—empty the trash and sync. You should then have better luck syncing other computers.

    If you're not able to empty the trash on the computer where these items exist (and unfortunately it's not currently possible to do that on zotero.org without going page by page), we can clear the trash in the group library for you, but you should first see if you can.
  • (Or by systematic review maybe you mean you're downloading huge numbers of items from sites and then deleting all but the relevant ones? Zotero has never really been designed with that use case in mind, but if that's what you do, I would recommend reducing the trash auto-empty delay to one or two days instead of the default 30 in the General pane of the Zotero preferences. That will reduce the chances that someone who doesn't sync for a few days will have a huge number of unsynced (but trashed) items to download.)
  • Actually, I can empty the trash for you fairly easily—I forgot that the last time this happened to someone I wrote a tool to do it quickly. Let me know if that would be helpful.
  • Hi Dan,

    would you go ahead with the tool to empty the huge trash in the group library "3ie Externalities Sys Rev", owned by the 3ie.externalities account?

    I can sync successfully, but I'm having trouble emptying the trash.

    Thanks.
  • I emptied the trash for that group.

  • Thanks Dan,

    but I'm encountering another issue now.

    Earlier today, Firefox kept crashing shortly after I opened the Zotero-console.
    It is now successfully syncing, but it is taking a while - probably due to the huge change in the library from clearing the trash.

    I'm fairly confident that I would be able to complete my sync, but I'm unsure whether the same would be true for others who share the library on laptops. (I'm on a desktop with 16 GB of RAM installed, and Firefox is using 660 MB of memory while Zotero is syncing.)

    Would you happen to have any suggestions if we're having trouble syncing on laptops due to Firefox crashing?

    I'm guessing that the Firefox crashes occur when Zotero requests more memory than Firefox is allowed to work with, or can handle.


    p.s. we will lower the trash auto-empty delay to 2-days on all of our zotero-consoles
           to prevent something like this from happening again.
  • Do you mean that it was crashing or freezing? It will quite likely freeze with a deletion that large, but it should finish if left to run. If it's being killed by the OS that's a bit more problematic, but there's not much to be done about it in the current sync architecture, other than to restart the computer and try the sync with no other programs running. With nothing else running, a process using 660 MB shouldn't be anywhere close to being killed by the OS on a semi-modern computer.

  • While syncing on zotero, Firefox freezes intermittently and eventually crashes. I'm not sure if it's being killed by the OS, but below is a sample 'Mozilla Bug Report' that I get after Firefox crashes.

    I'll try syncing again after restarting, closing any other programs, and running Firefox by itself.

    Thanks again.

    AdapterDeviceID: 0x11c0
    AdapterVendorID: 0x10de
    Add-ons: %7B8b86149f-01fb-4842-9dd8-4d7eb02fd055%7D:0.26,mozrepl%40hyperstruct.net:1.1.2,scaffold%40zotero.org:3.0.0,%7B8620c15f-30dc-4dba-a131-7c5d20cf4a29%7D:3.6,zotero%40chnm.gmu.edu:4.0.12,SQLiteManager%40mrinalkant.blogspot.com:0.8.0,%7B0113D088-8ED1-468C-B225-585A9C53B5E3%7D:1.0,zotfile%40columbia.edu:3.0.2,%7B81BF1D23-5F17-408D-AC6B-BD6DF7CAF670%7D:8.5.1,%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D:23.0.1,firebug%40software.joehewitt.com:1.12.1
    AvailablePageFile: 26864222208
    AvailablePhysicalMemory: 10707189760
    AvailableVirtualMemory: 3342749696
    BuildID: 20130814063812
    CrashTime: 1379447257
    EMCheckCompatibility: true
    InstallTime: 1376854675
    Notes: AdapterVendorID: 0x10de, AdapterDeviceID: 0x11c0, AdapterSubsysID: 26623842, AdapterDriverVersion: 9.18.13.2049
    D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+
    ProductID: {ec8030f7-c20a-464f-9b0e-13a3a9e97384}
    ProductName: Firefox
    ReleaseChannel: release
    SecondsSinceLastCrash: 30466
    StartupTime: 1379444499
    SystemMemoryUsePercentage: 37
    Theme: classic/1.0
    Throttleable: 1
    TotalVirtualMemory: 4294836224
    URL: about:sessionrestore
    Vendor: Mozilla
    Version: 23.0.1
    Winsock_LSP: MSAFD Tcpip [TCP/IPv6] : 2 : 1 : %SystemRoot%\system32\mswsock.dll
    MSAFD Tcpip [UDP/IPv6] : 2 : 2 :
    MSAFD Tcpip [RAW/IPv6] : 2 : 3 : %SystemRoot%\system32\mswsock.dll
    MSAFD Tcpip [TCP/IP] : 2 : 1 :
    MSAFD Tcpip [UDP/IP] : 2 : 2 : %SystemRoot%\system32\mswsock.dll
    MSAFD Tcpip [RAW/IP] : 2 : 3 :
    RSVP TCPv6 Service Provider : 2 : 1 : %SystemRoot%\system32\mswsock.dll
    RSVP TCP Service Provider : 2 : 1 :
    RSVP UDPv6 Service Provider : 2 : 2 : %SystemRoot%\system32\mswsock.dll
    RSVP UDP Service Provider : 2 : 2 :

    This report also contains technical information about the state of the application when it crashed.
  • That doesn't look like it's running out of memory.

    Can you go to about:crashes in the Firefox address bar, click through on one of the recent crashes, and provide the Mozilla URL?
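
As a rough check on the figures in the crash report above, the memory fields can be converted to more readable units. This is only a sketch: it assumes the memory fields are byte counts and that CrashTime and StartupTime are Unix timestamps in seconds, and the helper name `to_gib` is made up for illustration.

```python
# Values copied from the crash report above (assumed to be byte counts).
report = {
    "AvailablePageFile": 26864222208,
    "AvailablePhysicalMemory": 10707189760,
    "AvailableVirtualMemory": 3342749696,
    "TotalVirtualMemory": 4294836224,
}
# Assumed to be Unix timestamps in seconds.
crash_time = 1379447257
startup_time = 1379444499

GIB = 1024 ** 3


def to_gib(n_bytes):
    """Convert a byte count to GiB (hypothetical helper, for illustration)."""
    return n_bytes / GIB


for name, value in report.items():
    print(f"{name}: {to_gib(value):.2f} GiB")

# Address space already in use by the Firefox process at crash time.
used_virtual = report["TotalVirtualMemory"] - report["AvailableVirtualMemory"]
print(f"Virtual memory in use: {to_gib(used_virtual):.2f} GiB")

# How long Firefox had been running before it crashed.
minutes_running = (crash_time - startup_time) / 60
print(f"Minutes between startup and crash: {minutes_running:.0f}")
```

Under those assumptions, roughly 10 GiB of physical memory and about 3 GiB of the process's address space were still free at crash time, which fits the reading that this crash was not caused by running out of memory.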
  • Here's the report on the most recent crash:

    https://crash-stats.mozilla.com/report/index/ed8b09c6-9a7b-4e85-833e-118b62130918
  • Any others? There's no info in that report, unfortunately.
  • Here are the next 4 reports:

    https://crash-stats.mozilla.com/report/index/27a74b69-c3a5-4cac-a6c4-785422130917

    https://crash-stats.mozilla.com/report/index/a38a4a09-c21f-41d7-9d61-2528a2130917

    https://crash-stats.mozilla.com/report/index/3c5b2513-f011-483d-8c9f-dd34a2130917

    https://crash-stats.mozilla.com/report/index/12ddd706-5017-4c07-b7f8-144b32130916
  • Nothing in there, really, but Firefox 24 just came out. You can try that.

  • Thanks Dan, for giving it a look.

    I upgraded Firefox and am syncing now.

    If it crashes again, I'll try sending you the crash-stats link. Hopefully it'll be a bit more descriptive than the crash-stats from the older version of Firefox that I had.
  • Hi Dan,

       sorry for bugging.

    I restarted my computer after upgrading Firefox,
    and tried to sync again, but it just crashed again.

    The crash-report doesn't look very different from the others, but I had some ideas and wanted to run them by you.

    =======================================================
    Currently, the group-library (3ie Externalities Sys Rev) is owned by an independent-account (3ie.externalities), and shared to personal-accounts (Eugene Konagaya / jadebc / queborific / jaynal).
    -------------------------------------------------------------------------
    To un-clog the sync,

    Would it make sense to do the following?
    1) copy-over the library contents to a different-group library
    2) un-share the old group-library w/ the other members
    3) re-share the new-copy of the library w/ the members.

    I'm certain that the trash that was emptied was substantially larger than the useful contents of the library.

    From what I can count, there are 29,159 references total in the library, and I saw ~70K references in the trash - it got pretty out of hand.


    I'm not sure how Zotero handles un-sharing, and wasn't sure if this would even lessen the volume of references being synced.
    -------------------------------------------------------------------------
    Another option may be to 'start from scratch'
    and have all members:

    1) create a new Firefox profile,
    2) re-install Zotero (on Firefox)
    3) and sync.

    I figure syncing 29,159 references would be easier on zotero/firefox than 70K+, though I'm not certain how zotero would handle syncing 29K references at once either - might run into Firefox crashes even at this volume.

    But some of our members prefer using Zotero Standalone,
    and I'm not sure if there is a way to 'reset' the profile on Standalone.

    =======================================================
    Sorry for the long e-mail,
    and we really appreciate your help.
    -Eugene
  • Are you all having trouble syncing, or just you? Do all the items still exist in the trash on your computer?

    The only thing syncing to your account at the moment is the list of deleted items, and that should sync pretty quickly unless those items actually still exist in your library, in which case they would need to be deleted, which with 50K items could indeed cause a freeze.

    So if other people don't have the trashed items and haven't tried syncing, they should just try. It might be fine.

    For you, assuming you do have those items in the trash locally, you should delete the items in the trash manually in batches by going into the trash, selecting a bunch at a time, and pressing Delete. (Disable auto-sync before you do this.) Once those are all gone, the sync might go through quickly, since it will just download the list of deleted items, see that none of them exist, and go about its merry business.

    I can also clear the delete log for the group on the server. The items then wouldn't be deleted from any computers where they existed, but if they're all only on one computer, and all in the trash, that shouldn't be a big deal. But this may not be necessary.

    I'd recommend against doing anything more dramatic at the moment such as creating a new library. This should be easy to fix.
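
The reasoning above can be illustrated with a toy model. This is not Zotero's sync code: the function `apply_delete_log` and all item keys are hypothetical. It only shows why a delete log is cheap to process when none of the listed items exist locally, and costly when most of them do.

```python
def apply_delete_log(local_items, delete_log):
    """Remove every locally present item named in the delete log.

    Returns the number of local deletions actually performed.
    Hypothetical helper: a toy model, not Zotero's implementation.
    """
    deletions = 0
    for key in delete_log:
        if key in local_items:
            local_items.discard(key)  # stands in for the slow per-item delete
            deletions += 1
    return deletions


# Server-side log of 50,000 deleted item keys (made-up keys).
delete_log = [f"ITEM{i}" for i in range(50_000)]

# Computer A never had the trashed items: nothing to delete locally.
computer_a = {f"KEEP{i}" for i in range(1_000)}
print(apply_delete_log(computer_a, delete_log))

# Computer B still holds all of them: every log entry triggers a deletion.
computer_b = computer_a | set(delete_log)
print(apply_delete_log(computer_b, delete_log))
```

In this model the first computer performs no deletions and the second performs one per log entry, which is the difference between a sync that finishes quickly and one that appears to freeze.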
  • Hi Dan,

    thanks for that tip,

    I see that my local trash is still full
    and it takes a long time for a small batch (19 items)
    of trash-deletion - it'd take 12 full days to delete everything!

    I drafted SQLite code to try to expedite this,
    but not sure if it'll do the trick.

    Could you take a quick look at the code below
    and let me know if I missed anything?


    Thanks again.

    =====================================
    DELETE * FROM collectionItems
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM fulltextItemWords
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM fulltextItems
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM itemAttachments
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM itemCreators
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM itemData
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM itemNotes
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM itemTags
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE * FROM items
    Where itemID IN
    (SELECT itemID FROM deletedItems)
    =====================================

    **My Zotero data-folder's backed up in case anything goes wrong.
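
On the syntax alone: SQLite's DELETE statement does not take a column list, so `DELETE * FROM ...` is rejected, while `DELETE FROM ... WHERE itemID IN (...)` is accepted. The sketch below shows this with Python's built-in `sqlite3` module on a throwaway in-memory database. The two-table schema is a made-up stand-in, not Zotero's real schema, and nothing here should be read as a supported way to modify a Zotero database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in tables (NOT the real Zotero schema).
conn.executescript("""
    CREATE TABLE deletedItems (itemID INTEGER PRIMARY KEY);
    CREATE TABLE itemTags (itemID INTEGER, tagID INTEGER);
    INSERT INTO deletedItems VALUES (1), (2);
    INSERT INTO itemTags VALUES (1, 10), (2, 11), (3, 12);
""")

# The form with '*' is not valid SQLite syntax.
try:
    conn.execute(
        "DELETE * FROM itemTags "
        "WHERE itemID IN (SELECT itemID FROM deletedItems)"
    )
    star_form_accepted = True
except sqlite3.OperationalError as err:
    star_form_accepted = False
    print("Rejected:", err)

# The same statement without '*' runs and removes the matching rows.
with conn:  # commits on success, rolls back on error
    cur = conn.execute(
        "DELETE FROM itemTags "
        "WHERE itemID IN (SELECT itemID FROM deletedItems)"
    )
print("Rows deleted:", cur.rowcount)
remaining = conn.execute("SELECT itemID FROM itemTags").fetchall()
print("Remaining:", remaining)
```

Running statements like these inside a single transaction, and only against a backup copy, keeps a half-finished script from leaving the tables partly cleaned.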
  • edited September 20, 2013
    "it takes a long time for a small batch (19 items) of trash-deletion - it'd take 12 full days to delete everything!"

    This certainly doesn't sound right, and if this is the case something should be fixed. Can you provide a Debug ID for a sample deletion?

    I'm afraid we can't provide any support regarding direct SQL access, and I couldn't tell you off-hand anyway if those statements would be sufficient. They might be, and in theory you shouldn't be able to get your database into an inconsistent state, but you'd be totally on your own for trying. It'd be better for us to figure out why deleting from the trash is so slow for you. That sounds like the root problem here (behind the sync freezing as well).
  • Hi Dan,

    here's the Debug ID: D1340408438

    While debug logging was on, I deleted 20 items from the trash in the "3ie Externalities Sys Rev" group-library, on the Zotero-console in Firefox, on my desktop.
  • And how long did that take, approximately? (Just making sure I'm looking at the right thing in the debug output, since it looks like there are several periods logged of entire minutes passing without any output.)
  • Also be aware that you probably want to leave the auto-empty days settings at 0 (disabled) until this is resolved, or else Zotero will try to delete your trashed items automatically.
  • Hi Dan,

    it took approx. 3~5 min.

    I've set auto-empty trash to 0.

    Also, I've gone ahead with the SQL code in
    SQLite Manager and it seemed to clear my trash
    but my sync's still not going through.

    MASS TRASH-REMOVE SQLite CODE
    =====================================
    DELETE FROM deletedItems

    DELETE FROM collectionItems
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM fulltextItemWords
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM fulltextItems
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM itemAttachments
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM itemCreators
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM itemData
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM itemNotes
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM itemTags
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM items
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM annotations
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM groupItems
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM highlights
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    DELETE FROM itemSeeAlso
    Where itemID IN
    (SELECT itemID FROM deletedItems)

    =====================================
    ** Additions to the previous set: the deletedItems, annotations, groupItems, highlights, and itemSeeAlso statements
    *** Note: removed * for SQLite Manager syntactic reasons


    I wanted to send you a Debug ID from a 10-min. sync, but it seems to be stuck at a screen with a progress bar labeled "Processing updated data from sync server".

    It's been stuck on that screen for at least 30 min.


    Thanks again, and I'll send you the debugID as soon as zotero's freed up from the 'processing...' screen.
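
One thing worth double-checking in the script above is statement order, assuming the statements were run in the order listed. Every later statement selects its rows through `deletedItems`, so if `deletedItems` is emptied first, the subqueries match nothing and the other tables are left untouched. The sketch below demonstrates this on a made-up two-table schema using Python's `sqlite3` module. It illustrates SQL semantics only and makes no claim about what happened in the actual database; the helper `run_script` is hypothetical.

```python
import sqlite3

# Toy stand-in tables (NOT the real Zotero schema): 3 items, 2 of them trashed.
SETUP = """
    CREATE TABLE deletedItems (itemID INTEGER PRIMARY KEY);
    CREATE TABLE items (itemID INTEGER PRIMARY KEY);
    INSERT INTO items VALUES (1), (2), (3);
    INSERT INTO deletedItems VALUES (1), (2);
"""
CLEAR_TRASH_LIST = "DELETE FROM deletedItems"
DELETE_TRASHED = (
    "DELETE FROM items WHERE itemID IN (SELECT itemID FROM deletedItems)"
)


def run_script(statements):
    """Run statements in order on a fresh toy database; return rows left in items."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    with conn:  # one transaction for the whole script
        for sql in statements:
            conn.execute(sql)
    remaining = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return remaining


# Trash list emptied first: the subquery is empty, so no items are removed.
print(run_script([CLEAR_TRASH_LIST, DELETE_TRASHED]))

# Trash list emptied last: the two trashed items are removed.
print(run_script([DELETE_TRASHED, CLEAR_TRASH_LIST]))
```

In this toy case the first ordering leaves all three rows in `items`, while the second leaves only the one row that was never trashed.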
