QDA and Zotero
Let me try this again as a new thread. I am looking for a way to make annotations IN attachments in Zotero (PDFs, html, etc.) that can then subsequently be extracted in such a way that they can be cross-tabulated and analyzed.
Let me elaborate. Imagine you have a corpus in the form of a Zotero collection. And you want to analyze how certain topics/themes/ideas/etc. have changed over time. Or how they are different across subsets of the corpus - e.g. how are these topics/themes/ideas/ different in this country/discipline/school of thought/... The idea would then be to develop some coding scheme and to start coding the actual content of the items in your corpus. But ultimately, you'd want to be able to cross-tabulate the data.
Does anybody have an idea about how I could do this? I.e. either WITHIN Zotero (through a ), or maybe even in such a way that we could do the coding in Zotero and then extract the coding results from sqlite?
Any ideas would be greatly appreciated.
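To make the target concrete, here is a minimal sketch of the cross-tabulation being asked for, with the 'drill-down' kept possible by storing the passage text next to each code. The excerpt tuples, the grouping attribute (publication year) and the function names are all hypothetical; where the excerpts actually come from (Zotero notes, a QDA export, etc.) is left open.

```python
from collections import Counter

# Hypothetical coded excerpts: (document attribute, code, passage text).
# In practice these would come from whatever tool holds the annotations.
excerpts = [
    ("2005", "security", "passage A"),
    ("2005", "trade", "passage B"),
    ("2015", "security", "passage C"),
    ("2015", "security", "passage D"),
]

def cross_tabulate(rows):
    """Count how often each code occurs for each value of the attribute."""
    return Counter((group, code) for group, code, _ in rows)

def drill_down(rows, group, code):
    """Return the passages behind one cell of the table."""
    return [text for g, c, text in rows if g == group and c == code]

table = cross_tabulate(excerpts)
groups = sorted({g for g, _, _ in excerpts})
codes = sorted({c for _, c, _ in excerpts})

# Print a small group-by-code table of counts.
print("group".ljust(8) + "".join(c.ljust(10) for c in codes))
for g in groups:
    print(g.ljust(8) + "".join(str(table[(g, c)]).ljust(10) for c in codes))
```

Because a `Counter` returns 0 for missing keys, empty cells need no special handling, and `drill_down(excerpts, "2015", "security")` gives back the passages behind that count.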
http://chrisjr.github.io/papermachines/ might have some interesting analysis methods, but I think it would do best with tags, not notes (I could be wrong). Other than PaperMachines, I'm not aware of any categorical data analysis plugins for Zotero, which means that you would want to export your annotations to some other software. There would probably be ways to do this if we knew what format the other software can read.
If you give some more concrete examples of what kind of annotations you want to make in the PDFs and what kind of output you expect (or what analysis software you intend to use), we can probably assist you more.
And so yes, we could add those as tags in Zotero, but then you miss the 'drill-down' option afterwards. The nice thing about a program like NVivo (which I hate by the way, but I'm just giving it as a well-known example) is that once you've marked up the articles, you get the stats AND you can drill down to the precise passages in the text that have been marked up with whatever code you're interested in.
And mind you - it'd be great to be able to do this WITHIN Zotero. But even if that is not possible (as I suspect), I was hoping that somebody might know of a way to work with the underlying sqlite database (with another software program) outside of Zotero, in such a way that we could mark up the attachments etc. without 'breaking' the Zotero structure.
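As a rough sketch of what reading the database from outside Zotero could look like: the safest route is probably to open a copy of the file read-only, so the Zotero structure cannot be 'broken'. The table and column names below (`tags`, `itemTags`) are an assumption about the Zotero schema and should be checked against your own version; the in-memory database is only a stand-in so the example runs anywhere.

```python
import sqlite3

def open_read_only(path):
    """Open a SQLite file read-only so nothing in it can be changed.

    Works for plain paths; paths with characters such as '?' or '#'
    would need URI escaping first.
    """
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)

def tags_by_item(conn):
    """Return {itemID: list of tag names, alphabetically sorted}.

    ASSUMPTION: tables `tags(tagID, name)` and `itemTags(itemID, tagID)`;
    check the schema of your own Zotero version before relying on this.
    """
    result = {}
    query = (
        "SELECT itemTags.itemID, tags.name "
        "FROM itemTags JOIN tags ON tags.tagID = itemTags.tagID "
        "ORDER BY itemTags.itemID, tags.name"
    )
    for item_id, name in conn.execute(query):
        result.setdefault(item_id, []).append(name)
    return result

# Stand-in database so the sketch runs without a Zotero installation.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE tags (tagID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE itemTags (itemID INTEGER, tagID INTEGER);
    INSERT INTO tags VALUES (1, 'code:security'), (2, 'code:trade');
    INSERT INTO itemTags VALUES (10, 1), (10, 2), (11, 1);
""")
print(tags_by_item(demo))
```

With a real library you would call `tags_by_item(open_read_only(...))` on a copy of the database file. Note that this only reaches item-level tags, not codes attached to specific passages.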
Qiqqa seems to be the pdf organizer/reference manager that is the closest to allowing this (by allowing tagging of pdf annotations). Their business model doesn't appeal, however, as they lock you into using their cloud storage system, which is expensive.
@sdspieg
Any success with your work? Have you checked whether ATLAS.ti, Citavi or MAXQDA could be of any help?
I'm sure it will come of course, it's just way too obvious :) We can even do automated coding with papermachine, so why shouldn't we be able to do hand-coding?
And if you're interested in my longer-term view, I think more and more of these 'middle-men' (like Zotero or Dedoose) will be 'cut out' and will be fully appified. People will be able to 'annotate' text in something like Diigo (which, incidentally, also allows this for pdfs). These annotations will also get a URI, just like all DOIs, and that's what will be 'quoted' - probably not in a footnote, but just as a hyperlink. Think of 'active citation', for instance. At some point, all of this human hand-coding will then be put into some deep learning system and from then on all of this will be done algorithmically. But hey, that's just my layman perspective on the future of 'knowledge sourcing' :)
I once tried to code PDFs in MAXQDA but did not feel this would get me anywhere. Qiqqa's annotation snippets are nice but have their limitations too.
I personally cling to the idea that human sense-making cannot be replaced by mechanical algorithms. Deep learning sounds to me like too much science fiction, but who knows.
Also you said it doesn't YET integrate into Zotero - does that mean somebody is thinking about this for the future? Because that certainly WOULD make a difference for me. I also like the idea that this might push us a bit closer to the ideal of "Wissenanhäufung"... Although I'd still have to see how 'self-structuring' the annotations are. Because it is nice to see (as we also do on our Kindles) which passages of text are highlighted by the 'wisdom of the crowd', but that is still a far cry from a (self-)structured debate about those passages. Plus I also don't quite see yet, how we can overcome the copyright hurdles for socially annotating articles from academic journals or books.
Still - thanks for the tip, Sebastian!
Citavi is a program that may be closer to what you want. It is basically like Zotero, with a browser plug-in and a Word add-in, but with the added feature of a more fully-functional "knowledge item" manager for quotations and comments linked to specific text passages in pdfs. Minuses are: Windows-only, not free or open-source, and I find their workflow for managing citations slower than Zotero's. Plusses are: comes with unlimited cloud storage, great customer support, and a built-in knowledge manager for sorting/comparing quotes.
I was hesitant to switch from my current workflow, however, because of the sunk cost of learning a new program, and Citavi has some quirks that bothered me (although these are UI issues that wouldn't bother everyone).
I do have access to NVivo-- can you tell me why you don't like that program? Is it just that you would rather use only one program where possible?
Also, I'm interested in knowing what you are unable to do in Zotero using notes, tags, and Zotfile. I'm imagining a workflow like this:
- Enter reference into Zotero and attach PDF
- Annotate the PDF in Acrobat with highlights and comments
- Use Zotfile to automatically put quotations and comments in an attached note
The next steps would depend partly on how many quotations and comments you would have per file. If not many, you can just tag the note with any additional keywords for later searching. If you have a lot, you could break the note into several smaller notes with individual tags.
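The "break the note into smaller, individually tagged pieces" step could be sketched roughly as follows. It assumes a purely hypothetical convention in which the annotator types codes as hashtags (e.g. `#policy/sanctions`) in the PDF comment, and that the extracted note is available as plain text with one excerpt per line; neither is something Zotfile prescribes.

```python
import re

# Hypothetical convention: codes are written as #hashtags, with "/" for nesting.
CODE_PATTERN = re.compile(r"#([\w/-]+)")

def split_note(note_text):
    """Split a plain-text note into (excerpt, codes) pairs, one per non-empty line."""
    pairs = []
    for line in note_text.splitlines():
        line = line.strip()
        if not line:
            continue
        codes = CODE_PATTERN.findall(line)
        # Remove the hashtags so only the quoted passage remains.
        excerpt = CODE_PATTERN.sub("", line).strip()
        pairs.append((excerpt, codes))
    return pairs

note = """
"States increasingly rely on sanctions." #security #policy/sanctions
"Trade volumes fell sharply." #trade
"""
for excerpt, codes in split_note(note):
    print(codes, "->", excerpt)
```

Each pair could then become its own child note with the codes as tags, which keeps the excerpt searchable by code.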
One other thing you may not realize is that you can have standalone notes in Zotero that you link to a reference using the "related" field. See: https://www.chronicle.com/blogs/profhacker/taking-better-notes-in-zotero/36561
@sdspieg, I'm very interested in hearing your feedback given your experience with actually dealing with this sort of data.
The tagging feature of annotations is really nice, as is the ability to link to them.
Hypothesis is working with a lot of publishers on the annotation/paywall question. I think that can be done (within the limits set by the system -- you still won't be able to see articles you don't have access to).
And yes, I've talked to one of their devs and we think a plugin that brings Hypothesis annotations into a Zotero note is super doable -- but I'm not going to make any promises about timing.
For full disclosure, Hypothesis is on a grant with my group at SU:
https://qdr.syr.edu/qdr-blog/qualitative-data-repository-teams-hypothesis-develop-annotation-transparent-inquiry-ati
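To give a feel for what bringing a Hypothesis annotation into a Zotero note might involve, here is a minimal sketch that turns one annotation record into note HTML. It is not the plugin discussed above. The field names (`text`, `tags`, `target`, `selector`, `TextQuoteSelector`, `exact`) follow my understanding of the JSON that Hypothesis returns for an annotation and should be verified against their API documentation; the sample record is made up.

```python
from html import escape

def annotation_to_html(annotation):
    """Render one annotation as HTML: quoted passage(s), comment, then tags."""
    quotes = []
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            # ASSUMPTION: the highlighted passage lives in a TextQuoteSelector.
            if selector.get("type") == "TextQuoteSelector":
                quotes.append(selector.get("exact", ""))
    parts = [f"<blockquote>{escape(q)}</blockquote>" for q in quotes]
    if annotation.get("text"):
        parts.append(f"<p>{escape(annotation['text'])}</p>")
    if annotation.get("tags"):
        parts.append("<p>Tags: " + escape(", ".join(annotation["tags"])) + "</p>")
    return "\n".join(parts)

# Made-up record in the assumed shape, so the sketch runs offline.
sample = {
    "text": "Key claim & worth checking.",
    "tags": ["security", "method"],
    "target": [
        {"selector": [{"type": "TextQuoteSelector", "exact": "an example passage"}]}
    ],
}
print(annotation_to_html(sample))
```

Keeping the tags in the note text is only one option; they could equally be added as Zotero tags on the note so that they show up in the tag selector.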
Thanks @adamsmith . I'm definitely interested. I've always been annoyed at the lack of transparency in qualitative research. Or should I say embarrassed about my own work.
I guess things will start changing there soon. At an academically glacial pace, I reckon. In the meantime, we'll just keep experimenting with various ways to hook up Zotero libraries to various textmining tools (as with ITMS - on which progress is also slower than we had anticipated).
PS - in your proposal, you IMO understate the problems in quantitative approaches as well: very few people really appreciate (or just decide to ignore) the many many caveats that come with ALL datasets.
@quickfold11
Thanks - I'll check it out.
No. In fact we do use Dedoose for manual coding. I find NVivo too 'Microsoft Office'-like, whereas Dedoose is more 'Googley'. The main thing is that we sometimes work with quite large research teams (>20), and so we don't want to have to deal with version control issues etc. But if you do a search on Dedoose here, you should find more info on the whys and hows.
And how do you do that? I.e. get the highlighted/marked text excerpts automatically from Acrobat into the notes? I should maybe take another look at Zotfile, which I do find great for other functionalities. But again - it's the real-time collaborative aspect that we'd miss. We are typically talking about 100s of documents, very richly marked up with sometimes 100s of (nested) codes. And this workflow would also leave out the usual QDA-stuff: developing coding schemes, easily applying codes to excerpts, color codes, 'seeing' the codes vertically to the right of the text excerpts etc.
But that's just attaching notes to items, no? And not to specific excerpts within the html/pdf/...?