@emilianoeheyns: Are you testing an actual migration of the Extra field, or just your own code?
@warguelles is being very generous with his time, but I think it's going to be hard to figure out what exactly is going on here without testing an actual migration. Something is still causing DB transaction timeouts, and since this is Windows, it's hard to get a full debug log.
Do you have a database you can test this with, or can you create one with thousands of Extra values? I can explain how to trigger the migration if you don't have a pre-upgrade copy.
Also, probably better to discuss on zotero-dev, but are you calling item.loadAllData() every time you work with an item (e.g., to generate a key)? That's going to slow things way down — currently, reload isn't respected there, so that's going to indiscriminately reload all item data. That should only be necessary for items in a library that the user hasn't yet opened in the current Zotero session (and wouldn't be necessary for any of these items, since Zotero loads data before migrating the fields). We can look into avoiding unnecessary reloads if data is already loaded, but you're definitely not meant to call loadAllData() every time you pick up an item to work with.
Don't worry, Zotero is an incredible tool for our work, and I'm happy to help!
Do you have a database you can test this with, or can you create one with thousands of Extra values? I can explain how to trigger the migration if you don't have a pre-upgrade copy.
Is that a question for me?
Last night I made a backup of the entire pre-migration folder at C:\Users\XYZ\Zotero, including the whole “storage” directory. I can work with a copy of it, deleting all items that don't have an Extra field... I would be left with a database of more than 3,000...
Batches of 100, but using a single notifier queue that's not triggered until the end
So that wouldn't trigger BBT during the migration, just after. I have a test that's running now.
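To make the timing concrete, here is a minimal sketch of "batches of 100, one notifier queue flushed at the end". It is not Zotero's migration code; `migrateInBatches`, `migrateOne`, and `notifyObservers` are hypothetical stand-ins, and the only details taken from the thread are the batch size and the single deferred queue.

```javascript
// Hypothetical stand-ins: none of these names come from Zotero's code.
function migrateInBatches(itemIDs, migrateOne, notifyObservers, batchSize = 100) {
  const queue = []; // one queue shared by every batch
  let batches = 0;
  for (let start = 0; start < itemIDs.length; start += batchSize) {
    const batch = itemIDs.slice(start, start + batchSize);
    for (const id of batch) {
      migrateOne(id);
      // Record the change, but do not tell observers yet.
      queue.push({ event: 'modify', id });
    }
    batches++;
  }
  // Observers (plugins) hear nothing until all batches are done.
  if (queue.length > 0) {
    notifyObservers(queue);
  }
  return batches;
}
```

With this shape, a slow observer cannot slow down the migration itself, but all of its work lands in one burst after the last batch.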
but are you calling item.loadAllData() every time you work with an item (e.g., to generate a key)?
Not every time, but I am quite sure I am calling it too often. A while ago I tried to find out whether that is preventable, but between key generation and auto-exports, BBT often does work on items that the user hasn't opened yet, and I was running into "data not loaded" without a way to do targeted data loads.
Sorry if this is a stupid question. If I understand correctly, is BBT generating (new) citekeys for each item, even if it already has one? Wouldn't it be less problematic to generate keys only if the item doesn't have one?
I ask because in my case I am sure that almost 100% of the records have their citekey, and generating a new one during migration consumes more resources and time (apart from causing problems if the citekey of items already being used is changed).
The migration I ran with ALL plugins disabled took only around 15 seconds.
would be nice to have a progress bar for the extra migrations.
There is a progress bar — that's the circular progress bar in my first screenshot above. But as I say, after all migrations are complete (and the progress bar is by definition at 100%), it changes back to the journal article icon, and that's when the Notifier events go out. So if plugins then take lots more time, it stalls in that state.
Certainly not a stupid question. If there is a pinned key, BBT does nothing more than register that fact. I may be misdetecting the lifted key, which would cause key generation. I'm looking into that.
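For reference, detecting a key in Extra could look roughly like the sketch below. This is an assumption-laden illustration, not BBT's or Zotero's actual parser: the function name `splitCitationKey` is hypothetical, and the rules (case-insensitive match, first matching line wins, a line with an empty value is left alone) are my own choices.

```javascript
// Hypothetical helper: pulls the first "Citation Key: ..." line out of Extra.
function splitCitationKey(extra) {
  const kept = [];
  let citationKey = null;
  for (const line of (extra || '').split(/\r?\n/)) {
    // Only the first matching line is lifted; later ones stay in Extra.
    const match = citationKey === null
      ? line.match(/^\s*citation key\s*:\s*(\S.*?)\s*$/i)
      : null;
    if (match) {
      citationKey = match[1];
    } else {
      kept.push(line);
    }
  }
  return { citationKey, extra: kept.join('\n') };
}
```

If a detector like this misses a key that is actually there (different casing, stray whitespace), the item would look unpinned and a new key would be generated, which is the failure mode being investigated.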
but between key generation and auto-exports, BBT often does work on items that the user hasn't opened yet
Items don't need to be opened — only the library itself. Zotero loads all item data when a user clicks on a library. I wouldn't think you would need to generate keys or perform exports except in libraries that either were being synced or the user had opened, both of which would result in all item data being loaded.
But that would mean there is no circumstance where I would ever have to call loadAllData, and I would not even have to try to detect whether items are loaded -- yet I was running into these problems. I don't know what the exact circumstances were, but if there can't be any that would require a loadall, I'm OK with having it default-off. That way, if a user runs into issues, I can at least turn it back on in the field.
The Extra lift is still running and has been for an hour so far. I see occasional BBT activity in the log, but it's few and far between. This is all I'm seeing -- which is the progress bar?
Can you get me a database after all? I thought I had a database properly prepped with every item having an extra: line, but I am getting notifications for items that have neither an extra: line nor a citationKey field. If every item has an extra: line before start, there should never be a time where exactly one of the two isn't present, right? I'd need a database where every regular item has a line in the extra: field.
WITH x AS (
  SELECT item.itemID, item.key AS itemKey, item.libraryID,
    MAX(CASE WHEN f.fieldName = 'extra' AND idv.value LIKE '%citation key:%' THEN idv.value END) AS extra,
    MAX(CASE WHEN f.fieldName = 'citationKey' THEN idv.value END) AS citationKey
  FROM items item
  LEFT JOIN itemData id ON item.itemID = id.itemID
  LEFT JOIN fields f ON id.fieldID = f.fieldID AND f.fieldName IN ('extra', 'citationKey')
  LEFT JOIN itemDataValues idv ON id.valueID = idv.valueID
  WHERE item.itemID NOT IN (SELECT itemID FROM deletedItems)
    AND item.itemTypeID NOT IN (SELECT itemTypeID FROM itemTypes WHERE typeName IN ('attachment', 'note', 'annotation'))
    AND item.itemID NOT IN (SELECT itemID FROM feedItems)
  GROUP BY item.itemID
)
SELECT * FROM x WHERE extra IS NULL AND citationKey IS NULL
and got no results -- I take that to mean all items had the one or the other
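The same check, restated over plain in-memory records, may make the condition easier to read. This is only a sketch: the record shape (`deleted`, `isFeedItem`, `typeName`, `fields`) is hypothetical and chosen to mirror the query's filters, not Zotero's object model. The lower-casing mirrors SQLite's default LIKE, which is case-insensitive for ASCII.

```javascript
// Hypothetical record shape: { deleted, isFeedItem, typeName, fields: { extra, citationKey } }
const SKIPPED_TYPES = ['attachment', 'note', 'annotation'];

function itemsMissingKey(items) {
  return items.filter((item) => {
    // Same exclusions as the query: deleted items, feed items, non-regular items.
    if (item.deleted || item.isFeedItem) return false;
    if (SKIPPED_TYPES.includes(item.typeName)) return false;
    const extra = item.fields.extra || '';
    const hasKeyInExtra = extra.toLowerCase().includes('citation key:');
    const hasKeyField = item.fields.citationKey != null;
    // Report only items that have neither form of the key.
    return !hasKeyInExtra && !hasKeyField;
  });
}
```

An empty result means every regular item carries the key in one place or the other; it does not tell you which of the two.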
I wasn't offering a database — just asking if you had one or could make one to test the performance. If you can reproduce items with "Citation Key:" in Extra before the migration ending up with an empty Extra and no Citation Key field after the migration, without BBT enabled, we'd obviously want to know about that…
Ah I don't know what the final situation would be, I have so far interrupted the test before it could finish.
I am seeing loads of stalled transactions. I'm not sure yet how I cause these, but if there is indeed the guarantee that after Zotero.initializationPromise and Zotero.unlockPromise all items are loaded, there is some early DB access I can skip. Is this indeed the case?
@emilianoeheyns: We've been holding the Z7 fields update until you resolve the performance issues. We don't want to push a release that results in a 40-minute upgrade. Though we could conceivably update the progress window to make clear that Zotero is done and it's waiting on plugins…
if there is indeed the guarantee that after Zotero.initializationPromise and Zotero.unlockPromise all items are loaded, there is some early DB access I can skip
No, items aren't loaded after those, but all of the migrated items will be loaded after the migration is done (with the test I explained on the dev list).
I would also prefer to stop calling loadAllData unnecessarily, and for that I need to be able to establish that an item or all items in a library have already been loaded.
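Until there is a supported way to ask whether an item is loaded, one stopgap is for the plugin to remember which items it has already loaded itself. The sketch below assumes only that an item has an `id` and a promise-returning `loadAllData()` (the call named in this thread); `ensureLoaded` and `loadedByPlugin` are hypothetical names. It cannot see loads that Zotero did on its own, so it only prevents the plugin's repeated reloads.

```javascript
// Hypothetical plugin-side cache: itemID -> Promise of the in-flight or finished load.
const loadedByPlugin = new Map();

function ensureLoaded(item) {
  let pending = loadedByPlugin.get(item.id);
  if (!pending) {
    // First request for this item: do the (expensive) full load once.
    pending = Promise.resolve().then(() => item.loadAllData());
    // Forget failed loads so a later call can retry.
    pending.catch(() => loadedByPlugin.delete(item.id));
    loadedByPlugin.set(item.id, pending);
  }
  // Later callers share the same promise instead of reloading.
  return pending;
}
```

Usage would be `await ensureLoaded(item)` wherever the code currently calls `item.loadAllData()` directly. Keeping the promise rather than a boolean means concurrent callers also share one load.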
https://github.com/zotero/zotero/blob/faf8f2510ecf38d24580d74771542ef4506375e9/chrome/content/zotero/xpcom/schema.js#L716-L742
https://ibb.co/tPY2J6wv
https://forums.zotero.org/discussion/comment/505033/#Comment_505033
That's when it's waiting for Notifier observers to finish.
The new features in Zotero 8 are awesome (Thanks!) but I do need the pinned keys so I will wait to upgrade.