Tips for speeding up duplicate merging?
I have a large Zotero database (around 35K references) and probably 3,000 to 4,000 duplicates. This is by design because we need to search multiple databases with similar search terms for a meta-analysis and then weed out duplicates.
I've emptied the trash and also hidden the tag window to speed up switching between folders (based on another forum discussion). I'm curious:
1) In general, are there any other things I should do to help Zotero work more quickly (experiencing slow opening of the database and slow navigating between folders)?
2) Specifically for duplicates, is there any way to speed up duplicate merging? Currently it is taking 30-40 seconds after clicking merge to complete the request and pull up the next duplicate. We have the staff to manually merge the duplicates, but the lag time between merges is maddening and is slowing down the progress of the project.
We'd like to stick with Zotero for these types of projects but I'm wondering if the size of the library (and future libraries) is always going to be a hindrance or if there are other things I could be doing to improve the process.
Thanks for all your work!
https://www.zotero.org/support/kb/item_count
One thing that would likely _massively_ improve speed is to delete all the notes automatically generated during RIS import and then delete the corresponding tag. See the Tip at the bottom here:
https://www.zotero.org/support/kb/importing_records_from_endnote#importing_into_zotero
I'm working on deleting the notes with the _RIS import tag, but it's taking a while to even load them, which makes sense. As I looked through, I found several PsycINFO references with 50+ notes, including a journal article with 147 notes. I can send the citations for a few of these larger ones if needed. Is this normal? The majority of my library is PsycINFO references.
From which version (i.e. which database provider) of PsycINFO did you import and how exactly? And yes, a reference would be helpful.
Reference #1 (147 notes):
There's an elephant in the room: The impact of early poverty and neglect on intelligence and common learning disorders in children, adolescents, and their parents.
Bigelow, Brian. Developmental Disabilities Bulletin 34.1-2 (2006): 177-215.
Reference #2 (130 notes):
Early Relationships and Their Internalization.
Akhtar, Salman. In The American psychiatric publishing textbook of psychoanalysis, edited by Person, Ethel S., Cooper, Arnold M., Gabbard, Glen O., 39-55. Arlington, VA, US:American Psychiatric Publishing, Inc, 2005.
ProQuest puts every reference in the bibliography in a separate note (N1 tag). We really have no way to filter those out during regular RIS import, since that's where notes are supposed to go. When using the URL bar icon, we can customize import for specific sites, but not for generic import from RIS etc. It doesn't look like ProQuest allows you to customize RIS export either (e.g. to not contain those references).
@aurimas - I don't think you have access to this, I've put a sample RIS here:
https://gist.github.com/adam3smith/11381032
maybe you have an idea.
edit: FWIW, using the URL bar icon does fix this, but I assume you don't want to do that for systematic review.
edit2: haven't looked, but I assume going through EBSCO you'd also avoid this particular issue.
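To see how many notes a given ProQuest RIS export would create before importing it, something like the following rough Python sketch could be used. It assumes the standard RIS line layout (two-character tag, two spaces, a hyphen, then the value) and that each N1 line becomes one note, as described above; the function name is made up for illustration and is not part of Zotero.

```python
import re
from typing import Iterable, List, Tuple

# Assumed RIS layout: two-character tag, two spaces, hyphen, optional value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def count_notes_per_record(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (title, number of N1 fields) for each record in an RIS file."""
    results: List[Tuple[str, int]] = []
    title, notes, in_record = "", 0, False
    for raw in lines:
        match = TAG_LINE.match(raw.rstrip("\r\n"))
        if not match:
            continue  # continuation or blank line
        tag = match.group(1)
        value = (match.group(2) or "").strip()
        if tag == "TY":  # start of a new record
            title, notes, in_record = "", 0, True
        elif not in_record:
            continue
        elif tag == "N1":
            notes += 1
        elif tag in ("T1", "TI") and not title:
            title = value
        elif tag == "ER":  # end of record
            results.append((title, notes))
            in_record = False
    return results


def worst_offenders(path: str, top: int = 10) -> List[Tuple[str, int]]:
    """Read an RIS file and return the records with the most N1 fields."""
    with open(path, encoding="utf-8") as handle:
        counts = count_notes_per_record(handle)
    return sorted(counts, key=lambda item: item[1], reverse=True)[:top]
```

Calling worst_offenders() on one of the saved export files (the path is whatever you named the file) would list the records likely to end up with dozens of notes.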
@jenlnorvell, depending on whether you need any of your notes or not, you could just delete them all using a saved search (though it would take some time and you probably want to do it in batches of a few thousand, emptying the Trash each time).
Alternatively, if you don't need any of your notes and you have not used any of the citations in a document, you can export all of your citations into Zotero RDF without notes, clear your library (we'll guide you through this, if that's your choice), and re-import it.
(change N1:"notes" to N1:"__ignore" on line 185)
I was thinking that if there is an option to export all, then we could offer that also via URL bar. We could probably do it for PubMed as well.
Our best recommendation would be to use the URL bar icon for import instead of RIS export. You can disable the "attach PDF" option if you don't need the full text, which will speed things up further.
The 2nd best option would be to modify the RIS import translator as specified by aurimas above. Drastically reducing the number of notes in your library should speed things up a _ton_.
ProQuest is really our only option (which is very unfortunate for a number of reasons) because it is our university's provider for several of the databases we are using. It also took me ages to get the references out of ProQuest: I had to go page by page, selecting all records on a page before exporting, and pages frequently failed to load. I don't see another option for choosing the contents of the export other than "Citation, Abstract, and Indexing". I have all the RIS files saved, though, so maybe this idea about pre-processing the RIS file before importing would work if I start over?
Also, I don't think I need the notes and haven't used the citations in any documents.
It would be great to salvage this existing library if possible. We've already spent a good bit of time merging duplicates. Can you tell me more about the option to export to Zotero RDF?
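For the pre-processing idea, a minimal Python sketch along these lines might work. It assumes that none of the N1 fields are needed (it drops all of them, not just the bibliography entries ProQuest adds) and that the files follow the usual RIS layout of a two-character tag, two spaces, and a hyphen. The function names and file names are hypothetical.

```python
import re
from typing import Iterable, Iterator

# Assumed RIS layout: a field starts with a two-character tag, two spaces, a hyphen.
TAG_LINE = re.compile(r"^[A-Z][A-Z0-9]  -")


def strip_n1_fields(lines: Iterable[str]) -> Iterator[str]:
    """Yield RIS lines with every N1 field (and its continuation lines) removed."""
    skipping = False
    for line in lines:
        if TAG_LINE.match(line):
            # A new field starts here; skip it only if it is an N1 field.
            skipping = line.startswith("N1  -")
        if not skipping:
            yield line


def strip_file(src: str, dst: str) -> None:
    """Write a copy of the RIS file at src to dst without N1 fields."""
    with open(src, encoding="utf-8") as infile, \
            open(dst, "w", encoding="utf-8") as outfile:
        outfile.writelines(strip_n1_fields(infile))


# Example call (hypothetical file names):
# strip_file("psycinfo_export.ris", "psycinfo_export_no_notes.ris")
```

The original export is left untouched, so the cleaned copy can be compared against it before importing anything.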
https://www.zotero.org/support/getting_stuff_into_your_library#web_translators
That won't help you now, though. About RDF: have you been syncing your computer? That would make this harder.
To prepare for using Zotero RDF, click on "Export Library" in the gears menu, select "Zotero RDF" as the format and make sure to unselect notes and, probably, files for the export. The export will take some time and you may get an unresponsive script message (if you do, click continue), so I suggest you let it run while you do something else. Once you have the export, I'd want to hear what aurimas thinks would be the best way to get it back in.
The library you are working on is a group library, correct? Do you have other group libraries? Do you have anything in your personal library?
@Dan, does resetting to/from server work for group libraries?