How to split a tag

fentonh · August 10, 2018

I need to split a single tag which contains two words into two different tags. Can this be automated?

dstillman · August 10, 2018

No easy way to do that in Zotero at the moment, but it's a reasonable feature request. Issue created.

monika.barget · February 3, 2020

Dear fentonh, I have the same issue and would like to propose a work-around.

I have imported thousands of references for data analysis from different library catalogues. The tags came in various languages and styles, and many of them are long strings separated by a double dash:

"Earthquakes -- England -- London -- Early works to 1800"

So in my case, the separator is "--".

I will now try and write a Python script that adds new tags for all keywords in the old tag and deletes the original one. There are some interesting discussions re: Pyzotero in this forum.
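
The splitting step on its own could look roughly like this in Python. This is only a sketch and not the script linked later in this thread: `split_tag` is a made-up helper name, and the default separator is taken from the example tag above.

```python
def split_tag(tag, separator="--"):
    """Split one compound tag into its trimmed, non-empty parts."""
    parts = []
    for part in tag.split(separator):
        part = part.strip()
        # Skip empty pieces (e.g. from a trailing separator) and repeats.
        if part and part not in parts:
            parts.append(part)
    return parts


old_tag = "Earthquakes -- England -- London -- Early works to 1800"
print(split_tag(old_tag))
# ['Earthquakes', 'England', 'London', 'Early works to 1800']
```

A tag without the separator comes back as a one-element list, so the same function can be run over every tag in a library.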

If it all works out for me, I will be glad to share my script here.

Best wishes, Monika

monika.barget · February 7, 2020

Not sure this is still of interest, but here is the script that solved the problem for me: https://github.com/MonikaBarget/DigitalHistory/commit/249fa457a04db8111d3135d8478438d437eade50

Please make sure to read my commit on GitHub before running it on your libraries. To avoid accidental deletion of data, the script adds the necessary new tags but does NOT delete the old ones. Those can be deleted in the Zotero tag selector.

monika.barget · February 8, 2020

PS: In my larger libraries with multiple tags per item, HTTP error 412 came up more frequently. As explained on my GitHub, this error occurs when the item version is not updated correctly before the Python script adds another new tag. I ran some tests and found that giving the server 3 seconds to respond after each added tag solved the problem. I will post a link to an updated Python script here soon.

dstillman · February 8, 2020

(But as I explained on zotero-dev, there's no reason you should need to delay requests to the API, so it'd be better not to advise other people to do so. If there's a problem with the API, we'd want to fix it for everyone. So if this is still something you're seeing without the delay, we'd want either a full HTTP log (with API keys removed) or a script that reproduces the problem — e.g., a script that creates items with tags and then processes them until this error occurs.)

[Edit: I think I figured out what's happening here. See my follow-up in the zotero-dev thread. As I note there, a delay isn't really the proper way to do this, though it does do the trick.]
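
One possible way to avoid writing to the same item several times in a row is to work out all new tags for an item first and then send a single update. The sketch below shows only that idea and is not necessarily what the zotero-dev follow-up recommends. It assumes items shaped like the Zotero API JSON (a `data.tags` list of `{"tag": ...}` entries); `with_split_tags` is a hypothetical helper, and the key and version in the example are made up.

```python
def with_split_tags(item, separator="--", keep_original=True):
    """Return a new tag list for one item, or None if nothing would change."""
    old_entries = item["data"]["tags"]
    # Keep existing entries untouched (compound tags only if keep_original).
    kept = [e for e in old_entries if keep_original or separator not in e["tag"]]
    seen = {e["tag"] for e in kept}
    new_entries = list(kept)
    for entry in old_entries:
        if separator not in entry["tag"]:
            continue
        for piece in entry["tag"].split(separator):
            piece = piece.strip()
            if piece and piece not in seen:
                seen.add(piece)
                new_entries.append({"tag": piece})
    return new_entries if new_entries != old_entries else None


# Example item, trimmed to the fields used here.
item = {
    "key": "ABCD2345",
    "version": 7,
    "data": {"tags": [{"tag": "Earthquakes -- England"},
                      {"tag": "England", "type": 1}]},
}
new_tags = with_split_tags(item)
if new_tags is not None:
    item["data"]["tags"] = new_tags
    # With Pyzotero one would now send a single write for this item,
    # e.g. zot.update_item(item), instead of one request per added tag.
print(item["data"]["tags"])
```

By default the original compound tag is kept, so nothing is deleted. Running the function again on an already processed item returns `None`, which makes it easy to skip items that need no write at all.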

dstillman · February 8, 2020

Also, if you're seeing tags with separators imported from library catalogs, you should report those in separate forum threads here with example URLs so that we can fix the translators for everyone.

monika.barget · February 9, 2020

Dear Dan, thanks so much for your replies here and in the dev-group. I will send a full log of the HTTP error in the zotero-dev thread later. I will also try and follow your coding advice that items should be updated in a different order to avoid the delay workaround.

dstillman · February 9, 2020

No need for the HTTP log — I'm pretty sure it's just the issue I describe in my follow-up on zotero-dev.

Nolwenn Le.Goff · April 8, 2020

Dear Monika,

I have exactly the same issue with tags containing "--". It's not a translator problem: these tags are intentionally written like this on Sudoc by library staff.
This issue matters a lot when you have a big Zotero library.

So I'm very interested in your script.

I tried it:
- The first steps work on my own library. I get a list of old tags, then a list of split tags, then a flattened list of new tags.
But it doesn't work with automatically added tags (only with my own user tags). Would you know how to change that?

- I have a problem with the last step (after "replacetags=[]"). There is no error message and I get "Update completed", but nothing actually changes in my Zotero library.
So I added a line to find out how many items were affected (I'm new to Python, so I just try easy things):

print("there are", len(affecteditems), "affected items")

This printed: "there are 1 affected items". So I suppose there is a problem here.

Would you have any idea about this?

Best wishes,
Nolwenn

PS: Sorry for my English... I left school 20 years ago, and I'm French.

monika.barget · March 8, 2021

Dear Nolwenn,

sorry I missed your post last year.

I am aware that my original script did not perform well on larger collections because using the "sleep" function to delay response was merely a work-around.

I have since followed the developers' advice and written a script that iterates through the individual items several times, to get around the problem that one item cannot be updated / tagged repeatedly within a few seconds.

So what my script does now is give each replacement tag I want to add an index number. It then adds the tag at a given index to all items before the whole function starts again with the next index.
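
The ordering described here could be sketched like this. It is one reading of the description, not the code from the linked script: `plan_rounds` and the item keys are invented for illustration, and the actual API calls are left out.

```python
def plan_rounds(new_tags_by_item):
    """Group pending (item_key, tag) additions into rounds by tag index."""
    rounds = []
    index = 0
    while True:
        # Round `index` holds the index-th new tag of every item that has one.
        batch = [(key, tags[index])
                 for key, tags in new_tags_by_item.items()
                 if index < len(tags)]
        if not batch:
            break
        rounds.append(batch)
        index += 1
    return rounds


pending = {
    "ITEM0001": ["Earthquakes", "England", "London"],
    "ITEM0002": ["Floods", "England"],
}
for number, batch in enumerate(plan_rounds(pending)):
    print(number, batch)
```

Each item appears at most once per round, so two writes to the same item are always separated by the writes to all other items in that round.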

This has worked well on my trial libraries.

I have put the new script in my GitHub repo for you:

https://github.com/MonikaBarget/DigitalHistory/blob/master/SplitReplaceTagsPerITEM_Zotero.py

Let me know if you have any other questions.

Bonne chance! :-)

s1968258 · May 14, 2024

Hi all,

I consolidated several automatic tags into one big tag to demonstrate a pitch deliverable. Having since added new records to the collection, I've got new aggregates of tags and want to split that consolidated tag back into its constituent parts so I can run frequency-of-use counts in NVivo, etc. Any idea how to do this other than the obvious hard way of reimporting / cross-tabulating? The sample I've affected is 42 records, all of which had food + (something else) that I consolidated into FOOD* AND

Thanks in advance!

dstillman · May 15, 2024

@s1968258: In the Zotero 7 beta, you can right-click on a tag in the tag selector and select "Split".