Exporting CSV: notes and annotations
I'm trying to help a faculty member use notes in a meaningful way when exporting to CSV and importing into Excel. I select everything in the collection contents, select Format > CSV, and click the radio button for "Export Notes."
After importing into Excel, there's text in the notes fields, but it's all wrapped in HTML div tags, and there's about 30x as much HTML as there is text in the note itself.
It's hard to imagine why that's the default for exporting notes to CSV. Why isn't it set up to export just the text itself, i.e. the part that sits between quotation marks near the end of 1 1/2 pages of HTML? Is there a way to do that in the export process that I somehow missed, or do I need to figure out how to strip out everything in the HTML except what's between the "" marks, in Excel?
Personally, I'd pipe this through an R or Python script (which any LLM will readily provide if you don't want to code it yourself); it shouldn't take more than 5 minutes for the prompt and the script. You can also look for options within Excel: there are some VBA scripts online, e.g. here: https://stackoverflow.com/questions/47366923/remove-html-tags-from-string-in-excel-vba/47367704 (I haven't tested those, but they look good). The last option would be a regex-capable text editor like Notepad++ or VS Code, which would let you run the equivalent of the VBA macros as a simple search & replace.
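For the Python route, here's a rough, untested-on-your-data sketch of what such a script could look like, using only the standard library. It assumes the exported column is called "Notes" (check the header row of your CSV and adjust), and the names strip_html and clean_csv are just ones I made up for illustration. It drops all tags and attributes and keeps only the visible text of each note.

```python
import csv
import sys
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent paragraphs don't run together.
BLOCK_TAGS = {"p", "div", "br", "li", "blockquote", "h1", "h2", "h3", "tr"}


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turns &amp; into & etc.
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def clean_csv(src, dst, columns=("Notes",)):
    """Copy src to dst, stripping HTML from the named columns.

    The column name "Notes" is an assumption; change it to match your export.
    """
    # utf-8-sig reads files with or without a BOM, and writes one so that
    # Excel detects the encoding when opening the result.
    with open(src, newline="", encoding="utf-8-sig") as fin, \
         open(dst, "w", newline="", encoding="utf-8-sig") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for col in columns:
                if row.get(col):
                    row[col] = strip_html(row[col])
            writer.writerow(row)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python clean_notes.py export.csv export_clean.csv
    clean_csv(sys.argv[1], sys.argv[2])
```

Run it on a copy of the export and open the cleaned file in Excel. If an item has several notes in one cell, they'll end up joined into one run of text, so you may want to tweak that part.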