Programmatic handling of DOCX citations is hard.
I am trying to help someone who originally used Word with Zotero but wants to move to Zotero with LaTeX. Their thesis contains Zotero citations, and the field codes in the DOCX hold JSON data like the following (sadly they didn't use Better BibTeX, so citation-key is missing):
````````````
{"citationID":"QxROkizv","properties":{"formattedCitation":"[1]","plainCitation":"[1]","noteIndex":0},"citationItems":[{"id":2486,"uris":["http://zotero.org/users/17530164/items/8AFQZBYA"],"itemData":{"id":2486,"type":"chapter","container-title":"Computing Meaning","DOI":"10.1007/978-1-4020-5958-2_5","ISBN":"978-1-4020-5958-2","page":"87–124","publisher":"Springer Netherlands","publisher-place":"Dordrecht","title":"Segmented Discourse Representation Theory: Dynamic Semantics With Discourse Structure","URL":"https://doi.org/10.1007/978-1-4020-5958-2_5","author":[{"family":"Lascarides","given":"Alex"},{"family":"Asher","given":"Nicholas"}],"editor":[{"family":"Bunt","given":"Harry"},{"family":"Muskens","given":"Reinhard"}],"issued":{"date-parts":[["2007"]]}}}],"schema":"https://github.com/citation-style-language/schema/raw/master/csl-citation.json"}
``````````````
Pandoc can read these citations out of the DOCX, but there are multiple IDs: citationID (QxROkizv), id (2486), and the end of uris (...8AFQZBYA). Pandoc copies only id (2486), which becomes the BibTeX citekey, but this number seems impossible to search for in Zotero: it is hidden from view, and from any API search as well? I wanted to write a script that finds each id in the LaTeX output and replaces it with the corresponding citation key. "citationID" also does not seem to be searchable. The end of "uris" is, so a search for 8AFQZBYA shows the reference in Zotero.
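To make the search-and-replace idea concrete, here is a minimal sketch of the LaTeX side. It assumes Pandoc wrote the numeric id as the key inside `\cite`-style commands, and that a mapping from id to citation key already exists. The function name `replace_cite_keys`, the `key_map` dictionary, and the key `lascarides2007` are all hypothetical.

```python
import re

# Matches \cite, \citep, \autocite, \textcite, ... with optional [..] arguments.
# Assumption: the Pandoc-generated ids appear inside such commands.
CITE = re.compile(r"(\\[A-Za-z]*cite[A-Za-z]*\*?(?:\[[^\]]*\])*\{)([^}]*)(\})")


def replace_cite_keys(latex, key_map):
    """Swap each key found in key_map; leave unknown keys untouched."""
    def fix(match):
        keys = [k.strip() for k in match.group(2).split(",")]
        new_keys = ",".join(key_map.get(k, k) for k in keys)
        return match.group(1) + new_keys + match.group(3)
    return CITE.sub(fix, latex)


# Hypothetical mapping from the Pandoc id to a real citation key.
key_map = {"2486": "lascarides2007"}
print(replace_cite_keys(r"See \cite{2486} and \citep[p.~3]{2486,foo}.", key_map))
```

Keys that are not in the mapping are left as they are, so the script can be rerun as the mapping grows.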
I would rather just search and replace on id by parsing the LaTeX, but it seems the only solution is to unzip the DOCX, parse the uris, and then inject citation keys into the JSON in the field codes in the DOCX XML? Is there a better solution? Thank you!
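For reference, the "unzip and parse the uris" step could look roughly like this. It is a sketch under assumptions: that the field instruction starts with `ADDIN ZOTERO_ITEM CSL_CITATION` followed by the JSON shown above, that the fields live in `word/document.xml` (citations in footnotes would be in `word/footnotes.xml` and are not handled), and that fields are not nested. The function names are my own. It only builds the id-to-URI-suffix mapping (2486 to 8AFQZBYA); it does not inject anything back into the DOCX.

```python
import json
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
MARKER = "ADDIN ZOTERO_ITEM CSL_CITATION"  # assumed field prefix


def field_instructions(document_xml):
    """Yield the joined instruction text of each field, in document order."""
    root = ET.fromstring(document_xml)
    parts = None  # None while outside a field's instruction region
    for el in root.iter():
        if el.tag == W + "fldChar":
            kind = el.get(W + "fldCharType")
            if kind == "begin":
                parts = []
            elif kind in ("separate", "end") and parts is not None:
                yield "".join(parts)
                parts = None
        elif el.tag == W + "instrText" and parts is not None:
            # Word may split one instruction across several runs.
            parts.append(el.text or "")


def id_to_item_key(document_xml):
    """Map each citation item's id to the last segment of its first URI."""
    mapping = {}
    for instr in field_instructions(document_xml):
        instr = instr.strip()
        if not instr.startswith(MARKER):
            continue
        citation = json.loads(instr[len(MARKER):])
        for item in citation.get("citationItems", []):
            uris = item.get("uris") or []
            if uris and "id" in item:
                mapping[str(item["id"])] = uris[0].rstrip("/").rsplit("/", 1)[-1]
    return mapping


def id_to_item_key_from_docx(path):
    """Read word/document.xml from the DOCX archive and build the mapping."""
    with zipfile.ZipFile(path) as docx:
        return id_to_item_key(docx.read("word/document.xml"))
```

Calling `id_to_item_key_from_docx("thesis.docx")` (file name hypothetical) would give a dictionary whose values can be searched in Zotero, which is the part that id alone does not allow.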
-
bwiernik: The Better BibTeX (BBT) plugin can convert a Word document with Zotero citations into Markdown/TeX. Install the plugin and generate citation keys, then refresh the document in Word, then convert it with BBT.