Error report ID 1275409819: deleted tags come back again

Using the Mac application, version 4.0.8.
Several of us are working on a collection with about 8571 docs in it. We merged several tags together by renaming them to the same thing. There were thousands of these tags, so syncing was difficult: two of us had to remove ourselves from the group, sync, then re-subscribe and sync again to stop the beach ball from hanging the machine completely. (We tried waiting overnight a couple of times, but that failed.)

THIS bug: the new tags created by the merge-by-renaming are not behaving well. I deleted many of them from some docs, then synced; then a collaborator synced his Zotero application, and the tags reappeared on my instance, just as though he had added them all back, but he didn't. This is repeatable on current instances. To move forward, I believe I can probably export the whole collection as XML and recreate it, but this will be painful with so many docs. These tags are acting like they have different IDs in different instances, so sometimes they behave as separate tags from separate origins (names), and sometimes as if they are all the same tag (since they have the same name). This is pretty irksome; I really liked the renaming-to-same-name merging feature.
  • To move forward, I believe I can probably export the whole collection as xml, and recreate it, but this will be painful with so many docs.
    Yeah, don't do that.

    Tags come back when there are conflicts, meaning that data changes on both sides between syncs. Unlike with item conflicts, where Zotero prompts the user, if a tag changes on two computers between syncs, Zotero will play it safe and automatically keep both versions. A "change" to a tag can mean renaming/deleting or assigning/removing items, so if you merge a tag (meaning a delete of one) and someone else assigns the old tag to an item before pulling your change, the deleted tag will be restored.
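The keep-both rule described above can be illustrated with a minimal Python sketch. It assumes a tag is modeled as the set of item IDs attached to it, with an empty set standing in for a deleted tag. The function name `sync_tag` and the three-way comparison against the last synced state are illustrative assumptions, not Zotero's actual code.

```python
def sync_tag(base, local, remote):
    """Return the item set a tag ends up with after a sync (hypothetical model).

    base   -- items attached to the tag at the last successful sync
    local  -- items attached to the tag on this computer now
    remote -- items attached to the tag on the server now
    """
    local_changed = local != base
    remote_changed = remote != base
    if local_changed and remote_changed:
        # Conflict: both sides changed since the last sync, so keep both versions.
        return local | remote
    # No conflict: take whichever side changed (or the unchanged state).
    return remote if remote_changed else local


# Someone merged "old-tag" away remotely (modeled as an empty set) while
# this computer assigned it to item3 before pulling that change.
base = {"item1", "item2"}
remote = set()
local = {"item1", "item2", "item3"}
restored = sync_tag(base, local, remote)
print(sorted(restored))
```

In this model the remotely deleted tag comes back attached to all three items, which matches the "deleted tag will be restored" case. Had the local side been unchanged, the remote deletion would simply have been applied.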

    There are some rarer situations where Zotero can end up having to assume that all local data has changed and compare all local data to all remote data, which would have the result of restoring any remotely deleted tags that existed in the local library. This generally shouldn't happen, and the next major version of Zotero will also use a different syncing architecture that should minimize such problems.

    The best way to avoid this is to make sure everyone in the group is using auto-sync and isn't making changes before ensuring that syncing has completed successfully.

    In the meantime, if you can reliably reproduce a problem, let us know. We'd need more details in that case (e.g., the other user account in question).

    (Also, you're still on Zotero 4.0.8. You should upgrade to 4.0.9, though it shouldn't affect this.)
    these tags are acting like they have different IDs in different instances, so sometimes they behave as separate tags from separate origins (names), and sometimes as if they are all the same tag (since they have the same name).
    Not sure what you mean by this.
  • upgraded to 4.0.9
    I believe I can create a simple repeatable case in the future, but first I am going to save my current collection. I have tried unsubscribing and syncing (for hours), then resubscribing and syncing (for hours); this has not worked. So I then decided to export to RDF and recreate the collection. However, I CANNOT export the collection anymore: it hangs partway through. I have found that I can export about 400 docs at a time safely (but I would rather not export 8600 docs in groups of 400). I tried 500 or 600 docs at a time, and some chunks don't quite make it and hang. This is surprising to me, since it always used to export fine, and I have a second collection of about 8000 docs without these corrupted tags that still exports fine.

    I checked the Activity Monitor while the export was going on, and there were over 2 GB of memory in use, PLUS over 3 GB of virtual memory. This is very large compared to other similar exports I have monitored, so I am thinking I should delete my local Zotero database, reinstall, resync to the cloud, and cross my fingers.

    By the way, we had several tag instances with about 7000 instances each, and merged these all to the same name (i.e., about 28000 tag instances being merged into about 7000 tag instances); this seems to have caused the problem. The person who changed all the tags in the first place seems to have no problems, but everyone else is pretty broken: even the slightest change in data from one person takes minutes of sync beach-balling for everyone else.
  • edited August 2, 2013
    Exporting and wiping your library is not the solution here. Generally speaking, libraries don't break in any sort of way that starting from scratch would fix (and tags don't get "corrupted"). If there's a problem, we'll debug it and help you fix it.
  • we had several tag instances with about 7000 instances each
    What do you mean by this? Attached to 7000 items, you mean?
    everyone else is pretty broken: even the slightest change in data from one person takes minutes of sync beach-balling for everyone else
    I'll need an example account. In your account I see only small quick syncs for the last few days. There were some larger downloads from your account a few days ago, presumably after someone else made large changes, but those were still very quick, at least on the server side.

    For libraries with many file attachments there's currently an issue where syncs can freeze briefly as Zotero checks for locally modified files, but that's unrelated to data syncing. We're working on a fix for that for 4.0.10. You could see if that was the issue by temporarily disabling file syncing in the Sync pane of the Zotero preferences.
  • Thanks Dan, I truly appreciate your attention on this, and I would definitely rather figure out what is going on here than just blow things away and cross my fingers.

    (0) We actually do not have any attached files, just metadata, tags, and notes. The URLs link to PDFs hosted on another site.

    (1) "7000 instances": sorry to be confusing. Yes, I meant "attached to 7000 items". What happened is that someone selected about 7000 docs and dragged them onto a tag, waited, then created several more tags and did the same. A few weeks later, we decided this was unworkable, and he merged all the new tags back into one tag with a new name. Yes, the whole thing was a bad idea in the first place, but that is what happened. (By the way, I just checked, and it looks like the 7000 number is a little low: closer to 8300 of the 8600 docs in the collection were tagged this way.)

    (2) "only small quick syncs lately": this tag merging happened a couple of weeks ago. After the other two of us realized that neither of us could sync even when letting it run overnight, we decided to remove ourselves from the collection, sync, then re-subscribe and sync again. We were able to sync, but now some tags come back after being deleted. It seems like most of our syncing time was spent in "Processing updated data from the sync server" with its progress bar.

    (3) I just went to the Zotero web interface, deleted a couple of problem tags from several of the documents, and clicked save. Then I went back to the Zotero app on my computer and clicked sync: the green circular arrow spun around twice, then the "Processing updated data from sync server" progress bar popped up for 26 seconds, and then the tags were updated. The point being, I think the server is responding quickly, and the problem may be in how the app is handling these updates.

    (4) Reproducing the bug requires one of us to delete a tag, then sync, then another person syncs (after doing nothing) and then the first person syncs again, then tag re-appears. So we will be meeting on Tuesday morning and can try to document some exact cases where this happens, to see if it behaves differently for different tags, or different user scenarios etc.

    (5) the collection is called 1st_Accession_01102011 and is owned by "keylargomusic" (an account which I also manage). The three people working on this collection (including me) are the three invited members who have admin privileges. The trouble tags are called WC-WORKING and TO_TAG...

    (6) by the way, this is currently a private collection, but we have recently made the pdfs public, and are hoping to make the zotero collection public as soon as we get the metadata cleaned up.
  • Reproducing the bug requires one of us to delete a tag, then sync, then another person syncs (after doing nothing) and then the first person syncs again, then tag re-appears.
    If the second person is actually fully in sync before the deletion, indicated (usually) by their being able to click sync and have it sync instantaneously and stop turning, this is extremely unlikely. But if you can reproduce it, provide a Debug ID for the two syncs on Computer A and for the first sync after the deletion on computer B. (It will be helpful to disable auto-sync while reproducing this so you can be sure you're capturing the right sync.)
  • Dan,
    We did a bunch of tests and found a repeatable bug:
    Looks like the sync can get mixed up if two users delete a tag of the SAME NAME from TWO DIFFERENT docs, and then the syncing overlaps!

    I did not read your note before we tested, so I did not capture a Debug ID, but here are the Report IDs from our two computers (we generated the error many times this morning): 606191022, 377107470

    I think this description may identify the source of the bug for you.

    GENERALLY, there is NO PROBLEM with the following sequence if TAG_X and TAG_Y are different:
    Assume Doc1 has TAG_X, and Doc2 has TAG_Y…
    (1) Computers A and B both sync to cloud
    (2) Computer A deletes TAG_X from Doc1
    (3) Computer A syncs (all ok)
    (4) Computer B deletes TAG_Y from Doc2
    (5) Computer B syncs (all ok)
    (6) both Computers sync and show TAG_X removed from Doc1 and TAG_Y removed from Doc2 (as expected)

    OK, here's the REPEATABLE bug:
    say both Doc1 and Doc2 have TAG_X...
    (1) Computers A and B both sync to cloud
    (2) Computer A deletes TAG_X from Doc1
    (3) Computer A syncs (all ok)
    (4) Computer B deletes TAG_X from Doc2
    (5) Computer B syncs -> gets console error during sync:
    "Remote machine added TAG_X in Doc2 "
    -- note that Computer A did not touch Doc2…
    (6) TAG_X always reappears in Doc2 on Computer A and B
    (7) sometimes TAG_X reappears in Doc1 after everyone resyncs (intermittent)

    OF COURSE, if you insert another sync in the steps above as follows, it works fine
    (1) Computers A and B both sync to cloud
    (2) Computer A deletes TAG_X from Doc1
    (3) Computer A syncs (all ok)
    (3.5) Computer B syncs again (after not doing anything)
    (4) Computer B deletes TAG_X from Doc2
    (5) Computer B syncs (all ok)

    The reason this hit us hard is that we added a tag like "IN_PROGRESS" to all (8000 or so) docs, and then we were each deleting this tag one by one as we finished double-checking the metadata on each doc. Hence the syncs were overlapping a lot, and sometimes big groups of deleted tags were getting UN-deleted if someone opened Zotero and deleted an IN_PROGRESS tag on any doc before syncing first!!
  • Right, that's not a bug. That's just the behavior I explained above. Currently, tags are implemented as independent objects, with items attached to them. Since Computer B isn't syncing before making changes, that's a conflict, because, from Computer B's perspective, Tag_X has been modified remotely and has also been modified locally since the last sync. Rather than ask the user about each of thousands of tag changes, Zotero just merges the tag sets from each side. If your workflow involves many people making many tag deletions simultaneously, that might not work super well for you. But it's working as it should given the current design, and the best solution is just to make sure everyone has auto-sync enabled and is syncing before making changes.
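To make the conflict concrete, here is a rough Python simulation of the reported sequence under the design described above: one tag object holding a set of attached items, with the two sides combined when both changed since the last sync. The `Client` class, the `replay` helper, and the dict standing in for the server are all hypothetical names for this sketch. The real client is certainly more involved, and this simple model restores the tag on both docs, whereas the report saw Doc2 always and Doc1 only intermittently.

```python
class Client:
    """One computer's view of a single tag, modeled as a set of item names."""

    def __init__(self, server):
        self.server = server                # shared dict: {"items": set}
        self.base = set(server["items"])    # state at the last sync
        self.local = set(server["items"])   # current local state

    def remove(self, item):
        self.local.discard(item)

    def sync(self):
        """Sync with the server; return True if a conflict had to be merged."""
        remote = self.server["items"]
        local_changed = self.local != self.base
        remote_changed = remote != self.base
        conflict = local_changed and remote_changed
        if conflict:
            merged = self.local | remote    # combine the sets from each side
        elif local_changed:
            merged = set(self.local)
        else:
            merged = set(remote)
        self.server["items"] = set(merged)
        self.local = set(merged)
        self.base = set(merged)
        return conflict


def replay(extra_sync):
    """Replay the reported steps; extra_sync inserts step (3.5)."""
    server = {"items": {"Doc1", "Doc2"}}    # TAG_X is on both docs
    a, b = Client(server), Client(server)   # (1) both computers in sync
    a.remove("Doc1")                        # (2) A deletes TAG_X from Doc1
    a.sync()                                # (3) A syncs
    if extra_sync:
        b.sync()                            # (3.5) B syncs before editing
    b.remove("Doc2")                        # (4) B deletes TAG_X from Doc2
    conflict = b.sync()                     # (5) B syncs
    a.sync()                                # everyone resyncs
    return conflict, sorted(server["items"])


print(replay(extra_sync=False))
print(replay(extra_sync=True))
```

Without the extra sync, Computer B sees both a local and a remote change to the same tag object, so the two sets are combined and the deletions are undone. With the extra sync, B's deletion is the only change since its last sync and goes through cleanly.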

    In Zotero 4.1 we'll be switching to a new syncing architecture that should help avoid problems like this, since it will treat tags as properties of items instead of as separate objects. (The downside is that modifying a tag across thousands of items will require syncing updates for each one of those items instead of syncing a single tag.)
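The trade-off between the two models can be sketched in a few lines of Python. This is only an illustration of the counting argument: the data layouts and function names below are assumptions for the sketch, and the tag name `TO_CHECK` is made up.

```python
def rename_with_tag_objects(tags, old, new):
    """Tags are separate objects holding item IDs: one object changes."""
    tags[new] = tags.pop(old)
    return 1  # number of objects that would need to be synced


def rename_with_item_properties(items, old, new):
    """Tags are properties of items: every tagged item changes."""
    changed = 0
    for item_tags in items.values():
        if old in item_tags:
            item_tags.remove(old)
            item_tags.add(new)
            changed += 1
    return changed  # number of items that would need to be synced


n = 8000
tags = {"IN_PROGRESS": set(range(n))}           # tag -> attached item IDs
items = {i: {"IN_PROGRESS"} for i in range(n)}  # item ID -> its tags

print(rename_with_tag_objects(tags, "IN_PROGRESS", "TO_CHECK"))
print(rename_with_item_properties(items, "IN_PROGRESS", "TO_CHECK"))
```

The first call touches a single tag object, while the second touches all 8000 items. The flip side, in this model, is that two people removing the same tag from different items would be changing different objects, so those edits would no longer collide on one shared tag.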
    "Remote machine added TAG_X in Doc2 "
    That's not the actual message. It's this: "One or more Zotero tags have been added to and/or removed from items on multiple computers since the last sync. The different sets of tags have been combined."
  • "...tags are implemented as independent objects, with items attached to them..." Aha! I missed this subtlety before. This also explains why adding a tag to a doc does not update the "Last Modified" date! It may also explain why the "Export_to_RDF" is taking so long now, since, for each document, it has to search through all the attachments to all the tags... and several tags have 8000 documents attached.

    Of course, given that the expected use of tags is to ADD them and keep them, rather than delete them one at a time, I guess keeping all conflicts is generally the right thing to do. Apparently, our group just needs a new process. For example, if we had added a DONE tag to docs when they were done, instead of deleting an "IN_PROGRESS" tag, we would never have encountered the problem. Thanks again for sticking with me until I understood what was going on.

    Scott
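The difference between the two workflows can be shown with a tiny Python sketch. It assumes, as a simplification of the behavior discussed in this thread, that a conflict is resolved by combining the item sets from both computers; `keep_both` is a made-up name for that rule.

```python
def keep_both(local, remote):
    """Assumed conflict rule: combine the tag's item sets from both sides."""
    return local | remote


# Workflow 1: every doc starts with IN_PROGRESS and each person removes it.
in_progress_a = {"Doc2"}   # Computer A removed the tag from Doc1
in_progress_b = {"Doc1"}   # Computer B removed the tag from Doc2
print(sorted(keep_both(in_progress_a, in_progress_b)))

# Workflow 2: no doc starts tagged and each person adds DONE when finished.
done_a = {"Doc1"}          # Computer A finished Doc1
done_b = {"Doc2"}          # Computer B finished Doc2
print(sorted(keep_both(done_a, done_b)))
```

Both calls return the tag on Doc1 and Doc2. In the first workflow that undoes both removals, while in the second it is exactly the intended result, which is why an add-only process sidesteps the problem.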
  • on this subject:
    "The downside is that modifying a tag across thousands of items will require syncing updates for each one of those items instead of syncing a single tag"--

    I see that changing the name of a tag should be easy (assuming tags have a unique index and a separate table for name lookup). However, if you change a tag name to an EXISTING name, you actually need to keep one index and change all references to the dead index to the one you kept.

    I am working on an entirely unrelated project but have a similar data structure issue, with frequent merging of entities. I finally decided to allow multiple indexes for the same entity, i.e., the entity is still unique and points to its attributes from one place; it just has an array of indexes that happen to point to it. The upside for me is that the merge is trivial; the downside is that when you compare entities you need to check against the possible array of indexes. Anyway, an idea...
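The multiple-indexes idea might look roughly like the following Python sketch. The `EntityStore` class and its method names are invented for illustration, and discarding the dead entity's attributes on merge is an arbitrary choice made to keep the example short.

```python
class EntityStore:
    """Entities that can be reached through several indexes (hypothetical sketch)."""

    def __init__(self):
        self._by_index = {}  # index -> entity record shared by all its indexes

    def add(self, index, **attrs):
        self._by_index[index] = {"indexes": [index], "attrs": attrs}

    def get(self, index):
        return self._by_index[index]["attrs"]

    def merge(self, keep, dead):
        """Fold the entity behind `dead` into the one behind `keep`."""
        kept, gone = self._by_index[keep], self._by_index[dead]
        if kept is gone:
            return  # already the same entity
        for index in gone["indexes"]:
            kept["indexes"].append(index)
            self._by_index[index] = kept  # old index now reaches the survivor

    def same(self, a, b):
        """Compare by entity rather than by index, since one entity has many."""
        return self._by_index[a] is self._by_index[b]


store = EntityStore()
store.add(1, name="WC-WORKING")
store.add(2, name="TO_TAG")
doc_tags = {"Doc1": 1, "Doc2": 2}   # references held elsewhere, never rewritten

store.merge(keep=1, dead=2)
print(store.same(doc_tags["Doc1"], doc_tags["Doc2"]))
print(store.get(2)["name"])
```

The merge only repoints the dead entity's own indexes, so nothing that refers to index 2 has to be updated. The cost shows up in `same`: equality has to go through the store instead of comparing raw indexes.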
  • Hi!

    We are experiencing very much the same problem. My question is: what kind of members (member, administrator, or ...?) were you in this group when the tags kept reappearing? We assume it might have something to do with this.
  • edited February 26, 2014
    No, it wouldn't have anything to do with that. This is just how the current sync architecture works in some complicated situations. As I say above, this will improve with the next major version of Zotero.

    In the meantime your best bet is just to make sure that everyone has auto-sync enabled and is fully in sync before working more with tags.
  • Hi Dan,

    Thanks. Okay, since we work a lot with tags, we will sync before we start working in Zotero and make sure we have auto-sync on.

    Any other tips?

    Furthermore, do you have a clue when the next major version will come out? This year, next year? Summer, fall, winter?
  • This year. Can't really be any more specific than that, but it's my current top priority.