Zotero and Systematic Reviews

heddam · November 18, 2014

I am a library student, and I'm trying to explore how useful Zotero is for systematic reviews with the ultimate goal of creating a guide on how to use Zotero as a tool for systematic reviews. Would anyone who has done a systematic review using Zotero be willing to share with me what features of Zotero they found helpful and what was less helpful? Or if someone found that Zotero wasn't the best tool for systematic reviews, that would also be awesome information. I figure it's all well and good for me to try to understand Zotero as thoroughly as possible, but input from someone who has actually done a systematic review, rather than just my "pseudo systematic reviews" to mess with Zotero would be awesome.

Thanks in advance!

adamsmith · November 18, 2014

I haven't done any systematic reviews myself, but having advised people on the forums who have, I'd say the biggest problems have been:
1. Importing massive search results into Zotero takes longer than into Z39.50-capable tools like EndNote. There are ways to speed things up depending on the database (e.g. on PubMed you can export all search results as XML on the left and then import into Zotero) but overall it's going to be slower.
2. While Zotero can detect duplicates, you can't tell it to automatically merge/delete them.
3. Zotero slows down more than some other tools with massive databases (>30k entries).

The main advantage would be that Zotero works better than pretty much any other tool with the native search interfaces of the various databases, so you can optimize your search strategy for each respective database (and it'll work better than other tools with databases without Z39.50 support).
Other advantages - e.g. tagging, collections, saved searches - depend on how exactly you plan to use it.
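
One possible workaround for point 2 is to deduplicate records outside Zotero before importing. The following is only a minimal sketch, not a Zotero feature: it assumes the records have already been parsed into dictionaries with `doi`, `title`, and `year` keys (hypothetical field names), and `normalize_title` and `dedupe` are illustrative helpers.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def dedupe(records):
    """Keep the first record per DOI, or per (title, year) when no DOI exists.

    Assumption: each record is a dict with optional 'doi', 'title', 'year'.
    """
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            key = ("doi", doi)
        else:
            key = ("title", normalize_title(rec.get("title", "")), rec.get("year"))
        if key in seen:
            continue  # an earlier record already covers this item
        seen.add(key)
        unique.append(rec)
    return unique
```

Note that a record with a DOI and a second copy of the same article without one would not be matched by this sketch, so a manual pass through Zotero's duplicate detection would still be needed.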

heddam · November 18, 2014

Thank you for pointing out the problems with Zotero for systematic reviews. I'm aware of all of the advantages you listed; it's just been difficult for me to identify some of the disadvantages.

DWL-SDCA · November 18, 2014

I have found that it is often better to work within the various databases to narrow down the records that meet the project criteria than to download large numbers of items to be screened later. I set up collections titled after each db searched, with subcollections by year(s) of publication. I find that this method also facilitates other kinds of analyses: looking at the index terms used in each db across years, finding records that are available in a db but aren't indexed in a way that makes them easy to find, etc. The notes or extra fields are useful here. For these purposes, having duplicates is actually a plus instead of a disadvantage. I use a new Firefox profile for this so that everything in the library is directly related to the systematic review. I could go on and on but I think you have the idea.

bwiernik · November 19, 2014

Most of my research is systematic reviews and meta-analyses, and I certainly find that Zotero is extraordinarily helpful in this area. I've used both EndNote and Zotero for meta-analyses and find Zotero to be easier overall, due to the tight integration with all of the various databases adamsmith mentions. I don't tend to rely on automated searches for the reviews I do, so I never found EndNote's Z39.50 functionality terribly useful.

The other major advantage is the flexibility that tags, collections, saved searches, etc. offer. Rather than specifying a certain organization scheme, these features allow me to adapt the structure and organization of my saved studies to fit the needs of the current project. Exactly how I need to classify and code each study varies a lot across substantive domains, and the flexibility that Zotero offers here has been very helpful.

The group library and collaboration functions have also been a major advantage. I usually have teams of 10-20 research assistants working on a review with me, and shared group libraries have helped me with tasks like assigning articles to different RAs for review and making appropriate materials available to my RAs while limiting their write capabilities.

heddam · November 22, 2014

Thanks everyone for the feedback, it is enormously helpful.

Gurdas_Sandhu · November 22, 2014

Disclaimer: I often do a literature review for a specific question or fact. I wasn't sure what systematic reviews are, but having read the Wikipedia description, I don't exactly do systematic reviews, but something along that direction.

I find the tags feature to be very helpful. I used to create collections, but moved to tags and saved searches. With tags, I miss having an OR function; selecting two tags is an AND function. Another feature that would help is an automated function to merge (or suggest merging) similar tags.

My typical workflow is to search a database for keywords, scan the results and select items based on relevance, and pull the selected items into Zotero. Once I have the PDFs, I'd LOVE if Zotero could create a temporary collection of the references in the PDF, compare it to my existing collection, highlight those references I do not have, and then allow me to get the full metadata and PDFs for the references. I think having this "branching" capability will significantly improve literature review while also drastically reducing the time. Even a small step in this direction will help.

I did not understand what adamsmith meant by "Zotero works better than pretty much any other tool with the native search interfaces of the various databases, so you can optimize your search strategy for each respective database." Am I missing some trick here?

adamsmith · November 23, 2014

> I did not understand what Adamsmith meant by "Zotero works better than pretty much any other tool with the native search interfaces of the various databases, so you can optimize your search strategy for each respective database." Am I missing some trick here?

No, I probably just didn't express myself well: Using Zotero, you can go to EBSCO or ProQuest or Ovid on the web and use _their_ search interface, taking advantage of all the features they offer. Zotero's import quality from those web interfaces (what I call native search interfaces) is unmatched--we do a lot of post-processing, fixing up little things etc.
With other reference managers this either works a lot less well (e.g. Mendeley) or you have to go through the built-in search (via Z39.50), which restricts the types of searches you can do.

> I'd LOVE if Zotero could create a temporary collection of the references in the PDF,

Both EndNote and Mendeley experimented with this and gave up on it. The technology for extracting references from a formatted bibliography at the end of a PDF is just not there (yet?).

Gurdas_Sandhu · November 26, 2014

I am already quite thankful that Zotero exists. And based on adamsmith's clarification, I guess all I need to do to be even more appreciative is try one of the other reference managers :)

Here's what my brain is saying are the steps to extracting a PDF bibliography at the end of a JOURNAL paper. I am curious which of these steps are proving to be roadblocks:

1. Auto-find bibliography start and end. Or, user highlights the text to simplify this step.
2. User specifies the journal/citation type to make interpretation easier.
3. Zotero pulls the text, or the user can push it using ZotFile.
4. Scan the text for journal names, volume, issues, and page numbers. Or, at the very least, DOIs. Remember, user has specified journal type so format of bibliographic entries is somewhat known, though errors are always going to be there.
5. Create a list of suggested references in a special temporary collection. Highlight those that are already in user's database using duplicates.
6. After the user has cleaned up any errors, get the articles using the extracted information.

It may not work flawlessly, but even a decent start could potentially save dozens, maybe hundreds, of hours.

aurimas · November 26, 2014

If this is going to happen, this has to be completely automatic, so...

> Auto-find bibliography start and end. Or, user highlights the text to simplify this step.

This has to be almost entirely automatic, which is a huge challenge, because bibliographies look very different in different journals. Additionally, some PDFs will contain bibliographies from preceding articles (e.g. Nature's News and Views section) or, even worse, could contain a whole additional article (with its own bibliography) as the last page. Having said that, CrossRef has developed a tool that claims to do this fairly well: http://labs.crossref.org/pdfextract/ I haven't played around with it too much, though.

> User specifies the journal/citation type to make interpretation easier.

I don't think we would want users to have to identify each journal.

> Scan the text for journal names, volume, issues, and page numbers. Or, at the very least, DOIs. Remember, user has specified journal type so format of bibliographic entries is somewhat known, though errors are always going to be there.

http://anystyle.io/ looks like a promising resource for this. Though for references containing DOIs, you can just copy paste them into the Add by Identifier tool.

Even given the above resources, someone still has to put in the time to bring them all together in a cross-platform compatible manner and develop the user interface. It's not a small undertaking and there are other, more pressing bugs/features that require development time.

Gurdas_Sandhu · November 26, 2014

Promising projects. Thanks for answering and sharing, aurimas.

I will take back that users need to specify the journal. If the PDF is being opened from within Zotero, then the journal is easily known. I disagree that this should be entirely automatic to begin with. That is a nice goal, but why let it get in the way of first steps? I am not saying any of this is easily done. All I am saying is that even a semi-automatic tool can save a significant amount of time, if it does a few things and does them well. There are some steps a human can do easily, while a tool will struggle. On the other hand, some steps are tedious for a human, but easy for a tool. My idea is that a good enough tool is one which combines these two. A perfect tool will do everything automatically.

Take the case of DOIs. Why not have a tool that only scans a PDF for DOIs and creates items for those DOIs? Sure, I can copy-paste or click each DOI to get the items, but when there are 30-50 of them per article, even with only 10 articles to go through, it is no longer an ordinary task for a human since we are talking of ~1000+ clicks. But a breeze for a tool and less than 10 clicks for a human, right?
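
The DOI-only idea can be sketched in a few lines. This is a minimal illustration, not an existing Zotero feature: it assumes the PDF text has already been extracted into a plain string, and `extract_dois` and `missing_from_library` are hypothetical helper names. The pattern is a common heuristic for modern DOIs and will miss DOIs that are broken across lines in the PDF.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix
# running up to the next whitespace, quote, or angle bracket.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text):
    """Return the DOIs found in text, in order, without repeats."""
    found = []
    seen = set()
    for match in DOI_PATTERN.finditer(text):
        # Strip punctuation that usually belongs to the sentence, not the DOI.
        # Caveat: a DOI that really ends in ')' would be truncated here.
        doi = match.group(0).rstrip(".,;)]")
        if doi.lower() not in seen:
            seen.add(doi.lower())
            found.append(doi)
    return found


def missing_from_library(dois, library_dois):
    """Return the DOIs that are not already in the library (case-insensitive)."""
    have = {d.lower() for d in library_dois}
    return [d for d in dois if d.lower() not in have]
```

The resulting list could then be pasted into Add by Identifier in one go, rather than clicking through each reference.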

aurimas · November 27, 2014

Yes, I don't disagree with incremental changes. But I think the small steps should be towards the goal outlined above. That is, I don't see much point in investing time in features that are not going to contribute to the automatic solution, e.g. "user highlights the text to simplify [finding bibliography]". There's a significant overhead (in terms of UI design and user "training") associated with such temporary workarounds, and ultimately, manually indicating the start and end of the bibliography is not something we want the user to do (though I understand that it was just a quick suggestion and I'm not trying to say that your ideas were bad). As a first incremental step, I can see a Zotero add-on that extracts a bibliography from a PDF and displays it to the user. It would then be easier for the user to copy-paste it either into anystyle.io or into the Add by Identifier tool. Alternatively, one could tackle the other end of the project first, where a plain text bibliography could be automatically imported into Zotero.

bjohas · February 21, 2015

Slightly on a tangent: For systematic reviews you might want to store structured data alongside the publication, such as effect size, number of participants, etc. Some tools (such as the EPPI Reviewer) do that.

How would you store such structured data in Zotero?

Of course, you could do this in principle in notes, but the problem is that the data needs to be comparable between different records, so ideally would be structured.

Any thoughts?

adamsmith · February 21, 2015

Can't be done (beyond notes or (ab)using other fields, that is). It's tricky: there is a real danger of mission creep when adding functionality in directions that other, more specialized software (qualitative research, systematic review, etc.) already covers, so I'm not sure Zotero will add functionality for this.
The way this would likely have to look is some sort of custom fields. They're certainly not on the short-term agenda, but still may happen at some point in the future. Absolutely no promises, though.

bwiernik · February 21, 2015

@bjohas

The method I use to do that sort of sorting and analysis is to store that information in Excel with each study being on a new line. I store the connection to the Zotero item using the Zotero Select Quick Copy function (from here).

If you want other Zotero metadata (like journal title) directly in the spreadsheet, you could always start the spreadsheet by Exporting your data to a CSV first.
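
As a rough sketch of how such a spreadsheet could be combined with a Zotero CSV export: the code below assumes that both files share a `Key` column holding the Zotero item key, and that the coding sheet has `effect_size` and `n` columns. Those column names, and the helpers `read_csv` and `merge_by_key`, are illustrative assumptions rather than anything Zotero prescribes.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_by_key(zotero_rows, coding_rows, key_field="Key"):
    """Attach coded study data to exported Zotero rows that share an item key.

    Assumption: key_field is present in both inputs.
    Rows without coded data are kept unchanged.
    """
    coded = {row[key_field]: row for row in coding_rows}
    merged = []
    for row in zotero_rows:
        extra = coded.get(row[key_field], {})
        combined = dict(row)
        combined.update({k: v for k, v in extra.items() if k != key_field})
        merged.append(combined)
    return merged
```

With real files, the text would come from reading the exported CSV and the coding sheet (saved as CSV) instead of from inline strings.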

karim.chellaoui · February 25, 2015

@bjohas & @bwiernik

I tried the Excel option but things started to get messy for the fields that allow multiple entries.
I've personally opted for Access, I'm querying the sqlite database using this script: http://royce.kimmons.me/tutorials/zotero_to_excel

I copy the resulting list into Access, where I have tables linked to the Zotero UID (which I believe is not visible in the other solutions mentioned, correct me if I'm wrong), and perform my analysis there.

This proved to be an effective approach for me as I perform reviews regularly on specific topics.
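
For readers who want to see what querying the database might look like, here is a hedged sketch in Python rather than the linked script. It assumes the table and column names `items`, `itemData`, `itemDataValues`, and `fields` (with `key`, `fieldName`, and `value` columns), which may differ between Zotero versions; `items_by_key` is a hypothetical helper. It is safest to run this against a copy of the database file while Zotero is closed.

```python
import sqlite3

# Assumed schema: items(itemID, key), fields(fieldID, fieldName),
# itemData(itemID, fieldID, valueID), itemDataValues(valueID, value).
QUERY = """
SELECT items.key, fields.fieldName, itemDataValues.value
FROM items
JOIN itemData ON itemData.itemID = items.itemID
JOIN fields ON fields.fieldID = itemData.fieldID
JOIN itemDataValues ON itemDataValues.valueID = itemData.valueID
WHERE fields.fieldName IN ('title', 'publicationTitle', 'date')
ORDER BY items.key
"""


def items_by_key(conn):
    """Return {item key: {field name: value}} for a few common fields."""
    result = {}
    for key, field, value in conn.execute(QUERY):
        result.setdefault(key, {})[field] = value
    return result
```

A connection would be opened with `sqlite3.connect(path_to_copy)`, where `path_to_copy` points at the copied database file. The item key returned here is the identifier that external tables can link to.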

kheskett · October 18, 2019

Capturing the PDF's bibliography - if it has a strong health sciences list of articles, try copying and pasting them into HubMed Citation Finder
https://git.macropus.org/citation-finder/

It tries to match citations found in PubMed. Results can be exported in RIS or BibTeX format. It won't catch everything, but sometimes even a little head start is good.