Export customization - difficulties with tags/keywords

Hi there,

I'm working on a narrative literature review and am in the data collection phase. I'm currently casting a very wide net with my search criteria and saving almost every article that seems related to my research topic. I use Zotero's tag feature to quickly specify study populations, instruments, methodologies, variables and results as I skim through potentially relevant research. This way, I can quickly see which articles are most relevant.

Because Zotero is not data analysis software, I want to migrate the Zotero data to Excel via CSV or Nvivo via RIS to conduct the narrative analysis. However, upon exporting my Zotero library to CSV or RIS and importing it into either Excel or Nvivo, I encounter two problems:
1) There are many columns (Excel) or attributes (Nvivo) that are empty or simply irrelevant to my research.
2) There is only one column (Excel) or attribute (Nvivo) for Zotero tags. This is unhelpful, as I want to be able to analyze data (or not) based on their specific tags. I would need each tag to be its own column or attribute, ideally with a user-defined label (population, instruments, results, etc.) but something as simple as "tag1, tag2, tag3, etc." would do.

Is it possible to offer an option to do this when exporting Zotero data? In my mind, it would look something like:
File > Export Library > Translator Options > Advanced Translator Options > Choose Fields to Export > Tag Options > Export tags individually? [if yes; define fields for each tag]

Cheers!
  • Tags are exported individually to RIS, so I'm not clear why that's not an option?

    For CSV, I guess I'm not conceptually clear how "Export tags individually" would even look. CSV is a tabular format, so every row (i.e. Zotero item) needs to have the same number of columns. With items having different numbers of tags, I don't see how this could reasonably work. The only two even theoretically viable options -- get the maximum number of tags per item and use that to determine the width of the table, or have a column for every different tag in the whole library -- aren't technically feasible, nor do they make a lot of sense to me as an export format.

    The tags in the tag column are comma separated, so there would probably be ways to explode them with relative ease after export.
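
    For anyone who wants to try the post-export route, here is a minimal Python sketch of one way to "explode" the tag column into one 0/1 column per tag. It is only an illustration: the column name "Manual Tags", the "; " separator, and the sample rows are assumptions, so check the header and delimiter in your own file and pass them in as needed.

```python
import csv
import io

def split_tags(cell, sep=";"):
    """Split one tag cell into a list of trimmed, non-empty tags."""
    return [t.strip() for t in cell.split(sep) if t.strip()]

def add_tag_indicators(rows, tag_column="Manual Tags", sep=";"):
    """Return (all_tags, new_rows); each new row gains one 0/1 column per tag."""
    # Collect every distinct tag that appears anywhere in the export.
    all_tags = sorted(
        {t for row in rows for t in split_tags(row.get(tag_column, ""), sep)}
    )
    new_rows = []
    for row in rows:
        present = set(split_tags(row.get(tag_column, ""), sep))
        new_row = dict(row)
        for tag in all_tags:
            new_row["tag: " + tag] = 1 if tag in present else 0
        new_rows.append(new_row)
    return all_tags, new_rows

# Tiny in-memory stand-in for an exported file (hypothetical data and header).
sample = io.StringIO(
    '"Title","Manual Tags"\n'
    '"Paper A","Canada; LGBTQ; Qualitative"\n'
    '"Paper B","Homeless; Canada"\n'
)
rows = list(csv.DictReader(sample))
all_tags, expanded = add_tag_indicators(rows)
```

    The expanded rows can be written back out with csv.DictWriter and opened in Excel, where each "tag: ..." column can be filtered on its own.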
  • In Excel you can split text into different columns with a few clicks.
  • Could Zotero export in a tab-separated value format? This could keep existing commas as punctuation or as a secondary separation for individual tags/keywords.
  • We could, but it doesn't really make a difference, and Excel handles CSV more conveniently.

    As per standard practice, all fields are enclosed in quotation marks so that internal commas are preserved.
  • See also https://tools.ietf.org/html/rfc4180:
    6. Fields containing line breaks (CRLF), double quotes, and commas
    should be enclosed in double-quotes.
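
    To illustrate the quoting rule, here is a small Python sketch using the standard csv module. The record is made up for the example; the point is only that a quoted field with internal commas survives a write/read round trip as a single field.

```python
import csv
import io

# Hypothetical record: the second field contains commas of its own.
record = ["Paper A", "Canada, LGBTQ, Qualitative"]

# Write the record with every field enclosed in double quotes.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerow(record)
written = buffer.getvalue()

# Reading the line back yields the original two fields, commas intact.
parsed = next(csv.reader(io.StringIO(written)))
```

    A tab-separated file could be produced the same way by passing delimiter="\t" to csv.writer, but as noted above the quoting already keeps internal commas safe.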
  • @adamsmith - Thanks for your quick reply.

    With regard to RIS, I believe the problem may be confined to Nvivo. When an RIS file is opened in Nvivo, it generates a classification sheet usable in Nvivo that is, essentially, a CSV file. Therefore, in an Nvivo classification sheet, there is one column labeled "keywords", with tags separated by semicolons, just like in the Zotero CSV export.

    Now, because the tags are comma separated, it is indeed very easy to explode them in Excel. The problem, however, is that tags do not export in the same order as they appear in Zotero. In Zotero, tags are listed alphabetically. However, in the CSV export, they appear in a seemingly random order. Therefore, when tags are exploded by comma, each column does not represent the same concept. For example, tagged populations do not end up in the same column; instead they end up sprawled across different columns, which ideally would be dedicated to other concepts (methods, results, instruments, etc.).

    Thinking of my own research and interests, I personally think of two options for what "Export tags individually" might look like:

    1) Having dedicated columns for every tag in my library, with rows labelled 0 (tag is not present) or 1 (tag is present). This would allow me to filter library items based on their Zotero tags. This would be cool because then, just like in the Zotero app, I would easily be able to query articles that respond to different search requests, but then also be able to perform more sophisticated analyses in Excel or Nvivo.

    2) Being able to user-define tag 'concepts' so that in the CSV file, tags are grouped in columns according to their assigned concept. Similar to regular Zotero fields. Items that are missing certain tags would be left blank or would be filled with any other missing data value just like regular bibliographic fields.

    Let's say my library tags are "LGBTQ", "Canada", "Homeless", "Meta-analysis", "Barriers to care", "Infection rates", "Qualitative" and "United States."

    If there were a way that I could specify in Zotero that the "LGBTQ" and "Homeless" tags are related to a study's population, "Canada" and "United States" to country of publication, "Meta-analysis" and "Qualitative" to the methodology, and "Barriers to care" and "Infection rates" to the results, then I imagine I would be able to export my Zotero data to a CSV file wherein, instead of having 1 Keywords column, I would have 4 columns, one for each tag concept: population, country, methodology and results.
    If a particular Zotero item has multiple tags for one concept (let's say one article studied homeless LGBT youth and is therefore tagged "homeless" "LGBT" and "youth"), then the relevant tags would be inserted comma-separated in the same column together (similar to how all tags are currently being exported under 'Keywords' in CSV).

    Basically, I'm already doing 2) by hand, but it would be absolutely phenomenal if this process could be automated within Zotero. Would truly save a lot of time!
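
    Option 2) can be approximated outside Zotero after export. The following Python sketch is one hedged way to do it: the CONCEPTS mapping is hypothetical (built from the example tags above) and would have to be maintained by hand, and group_tags_by_concept is an illustrative helper, not anything Zotero provides.

```python
# Hypothetical, hand-maintained mapping from concept to the tags that belong to it.
CONCEPTS = {
    "population": {"LGBTQ", "Homeless"},
    "country": {"Canada", "United States"},
    "methodology": {"Meta-analysis", "Qualitative"},
    "results": {"Barriers to care", "Infection rates"},
}

def group_tags_by_concept(tags, concepts=CONCEPTS, sep=", "):
    """Return one string per concept, joining the item's matching tags.

    A concept with no matching tag gets an empty string (a blank cell).
    """
    grouped = {}
    for concept, members in concepts.items():
        matches = sorted(t for t in tags if t in members)
        grouped[concept] = sep.join(matches)
    return grouped

# Tags of one item, in arbitrary order as they might come out of an export.
item_tags = ["Qualitative", "Homeless", "Canada", "LGBTQ"]
columns = group_tags_by_concept(item_tags)
```

    Each key of the returned dict would become a column, so an item with several population tags ends up with them together in the "population" cell, and tags that are not in the mapping are simply ignored.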

  • Yeah, sorry, neither of the options are doable with reasonable effort, I'm afraid.

    Alphabetically sorting the tags before writing them to the CSV, however, should be pretty easy and makes sense to me. @zuphilip what do you think?
  • @adamsmith thanks again.

    Any idea whether or not these are features one could expect in the eventual future?

    I had thought about alphabetically sorting out the tags before writing them to the CSV but have not tested it yet. I'll give it a try and update here when I do. Cheers.
  • I'm pretty sure idea 1) doesn't have wide enough interest/usability to make it worthwhile. Some people have been interested in hierarchical tags, which would mimic a version of concepts, but I don't see this happening in the foreseeable future (i.e. 1-2 years), no, sorry.

    The alphabetic tag export was intended for us to change, not for you, though we'd obviously take a pull request if you're able to do this yourself.
  • Thanks @adamsmith

    Whoops, my bad. What I wanted to try was to alphabetically sort the comma-separated tags in Excel row by row once they're imported, then explode the column -- not before writing them to the CSV! Don't have the technical ability to do quite that yet, haha.
    If you guys manage to implement alphabetical tag sorting upon export though, that'd be great. Keep me posted.
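
    Until sorting happens at export time, the row-by-row sort described here can be sketched in a few lines of Python. This is an assumption-laden illustration: sort_tag_cell is a made-up helper, the ";" separator is a guess to be adjusted to your file, and the case-insensitive ordering may not match how Zotero orders tags in its own interface.

```python
def sort_tag_cell(cell, sep=";"):
    """Sort the tags inside one cell alphabetically (case-insensitive)."""
    tags = [t.strip() for t in cell.split(sep) if t.strip()]
    return (sep + " ").join(sorted(tags, key=str.casefold))

# One exported cell with tags in arbitrary order (hypothetical data).
unsorted_cell = "Qualitative; canada; LGBTQ; Homeless"
sorted_cell = sort_tag_cell(unsorted_cell)
```

    Sorting makes the order predictable, but on its own it does not put each concept in a fixed column once the cell is exploded, because items with different tags will still fill the columns differently.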
  • @adamsmith Agree on alphabetical sorting during CSV export in the Zotero translator.

    Did you try out the filters in Excel? You should be able to filter for a specific tag even when it is contained in a field with more tags. Moreover, an AND/OR combination of two (?) filters should be possible in Excel.
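
    The same kind of filtering can also be done outside Excel. Below is a minimal Python sketch of AND/OR filtering on a multi-tag field; the helper names, the "Manual Tags" column name, the ";" separator, and the sample rows are all assumptions for illustration.

```python
def has_tag(cell, tag, sep=";"):
    """True if `tag` is one of the tags in the cell (exact match, trimmed)."""
    return tag in {t.strip() for t in cell.split(sep)}

def filter_rows(rows, tags, mode="all", tag_column="Manual Tags", sep=";"):
    """Keep rows carrying all of `tags` (mode="all") or any of them (mode="any")."""
    combine = all if mode == "all" else any
    return [
        row for row in rows
        if combine(has_tag(row.get(tag_column, ""), t, sep) for t in tags)
    ]

# Hypothetical rows, shaped like what csv.DictReader would return.
rows = [
    {"Title": "Paper A", "Manual Tags": "Canada; LGBTQ; Qualitative"},
    {"Title": "Paper B", "Manual Tags": "Homeless; Canada"},
    {"Title": "Paper C", "Manual Tags": "United States; Meta-analysis"},
]
both = filter_rows(rows, ["Canada", "LGBTQ"], mode="all")     # AND
either = filter_rows(rows, ["LGBTQ", "Homeless"], mode="any")  # OR
```

    Matching on whole tags rather than substrings avoids false hits when one tag's text happens to be contained in another tag.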
  • Hi @zuphilip,

    Can't seem to be able to do that with basic filters. Trying my hand at advanced filters but it's a bit beyond my regular Excel usage. Once I get over the learning curve I'll get back to you.