Very Big Citation Libraries - is there a limit?

Hi, I had a conversation on Twitter recently with @adamsmith about a very big group library - I am @classicslib. I want to add approximately 80,000 citations to a Zotero group library and have them be available via the web to other group members (as well as the general public). No PDFs or documents of any kind are attached. I've got the citations in RIS format in .txt files, ca. 4000 citations per file. I'm using Standalone 3.0.8 for Mac.

Things started off well, but when I got to ca. 40,000 entries in the group library I started having trouble with the import process into Standalone - it would hang on import, I'd get a "script not processing" message, and I'd have to close Zotero to do anything. I broke the .txt files into smaller chunks (ca. 2000 citations each) and got another year's worth in, but now pretty much anything I do freezes Zotero (syncing, trying to import anything, at the moment just opening it) and I have to force quit.

Have I simply run up against the limits of a group library with syncing? @adamsmith said there was no limit - but have other people successfully hosted libraries this large, either in Standalone or synced? Many thanks for any suggestions.
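For anyone wanting to try the same chunking step, here is a minimal sketch of splitting a RIS export into files of ca. 2000 records each. It is not a Zotero tool and it does nothing about the library-size problems discussed in this thread. It assumes that every record ends with a line starting with `ER  -` (the standard RIS end-of-record tag) and that the files are UTF-8; the function names and the `_partNNN.ris` output naming are made up for illustration.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def iter_ris_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one RIS record at a time as a newline-terminated string."""
    record: List[str] = []
    for line in lines:
        if not record and not line.strip():
            continue  # skip blank lines between records
        record.append(line.rstrip("\r\n"))
        # Assumption: the end-of-record tag starts at column 0.
        if line.startswith("ER  -"):
            yield "\n".join(record) + "\n"
            record = []
    if record:
        # Trailing record with no ER line; keep it rather than drop data.
        yield "\n".join(record) + "\n"


def chunk_records(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group records into lists of at most `size` items."""
    chunk: List[str] = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def split_ris_file(src_path: str, out_dir: str, size: int = 2000) -> List[Path]:
    """Write src_path out as numbered files of at most `size` records each."""
    src = Path(src_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with src.open(encoding="utf-8-sig") as fh:
        chunks = chunk_records(iter_ris_records(fh), size)
        for i, chunk in enumerate(chunks, start=1):
            target = out / f"{src.stem}_part{i:03d}.ris"
            # Records already end in a newline, so joining with "\n"
            # leaves one blank line between records.
            target.write_text("\n".join(chunk), encoding="utf-8")
            written.append(target)
    return written
```

Calling `split_ris_file("citations_1990.txt", "chunks", size=2000)` (hypothetical file name) would write `chunks/citations_1990_part001.ris` and so on, which can then be imported one at a time.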
  • Between your various libraries, you have 137K items up on the server now (meaning, presumably, that you have that number in Zotero Standalone as well). Very few people have databases anywhere near that large, and I suspect sailing will be pretty choppy for you at the present time.

    Part of the problem is that, even if you're able to upload all the data, you're essentially going to break Zotero for anyone you invite to that group. While you can upload the data in parts as you import it, due to the way syncing currently works other members would need to download the entire library at once, and that probably won't be possible server-side.

    (Right now you yourself have a massive queued download, though I'm not sure if that's a download of the uploaded data on another computer or if the original computer is redownloading the uploaded data for some reason. Do you happen to know which? I've given it a nudge, but it might be tough to get that to go through, and a sync this large isn't really something that others should attempt.)

    We're working on some syncing improvements that will offer better handling of large libraries, and we're hoping to roll those out by the end of the year. We haven't ever tested a Zotero client with that many items, though, so there may be some other performance optimizations that need to be done just for adequate local usage of a database that size.

    If you do get the library uploaded, it might be accessible via the web. But I wouldn't try to have others access it through the client for the time being.
  • Thanks for all this, Dan. The second sync queue was for a different machine, and I've shut down Zotero on that machine for now. It looks like right now Zotero isn't the answer for the specific project I wanted to accomplish, but in general it remains a very valuable tool!
  • Has any of this changed since 2012? Specifically, I'm looking at several collections of citations that are each approaching, but still under, 90,000 records. Could those be shared out with a Group? Then, I have a collection with just under 14 million citations. Could 14 million citations be shared out through a Zotero Group?
  • Dan, I have a question similar to the one randtke raised.

    I recently became part of a 'global assessment' project that involves about 70 researchers working over 2 years to synthesize the literature on a particular set of topics. The team created a group library for its use. At last count, the number of references in the database has reached 42,000+! Now, since I use Zotero for all my 'regular' research, in which I am a member of another 15 group libraries (each with a much more modest number of refs--say 100 to 1000), the sudden addition of 42k references is really slowing Zotero down, both when it launches and when I do a search--whether in the big library or the smaller ones. Writing papers in Word is becoming really painful, because each time I pop up the Add Citations window and enter keywords, it takes ~10 seconds to find the matching refs.

    I'm wondering whether there is a way to 'separate out' this 42k library, so that I only deal with it when writing for that project and my other writing goes on without it. I could stop syncing this library, but I don't think that will really help speed up the search, given that the 42k refs are already in there. Please suggest.
  • And I am not talking about 14 million citations ;-)
  • It's not ideal, but for the Word add-on, I've found that the classic view (though clunkier otherwise) helps when you have many groups, including large ones, because you can easily restrict the search to one group.

    (IMO that also means that the ability to restrict search to one group in the standard dialog, as you suggest, is an important feature, but the above gives you a workable solution in its absence.)
  • Thanks, adamsmith, for this workaround. It definitely helps me in my 'global assessment' project, when I am citing from only that group library. Unfortunately, it may be of only limited help in my 'regular' work, because there my refs tend to be scattered across multiple group libraries.

    And for some reason, the simple search within Zotero (Title, Creator, Year) remains terribly slow, even though it is a 'library-level' search only (not cross-library).
  • 14 million citations amounts to about one-half of the total number of records in the PubMed database. The computing (and sophisticated software) resources to support that number are extraordinary.
  • FWIW, this _has_ gotten better since 2012. 137k items across multiple libraries is significantly more feasible than it was back then, although some things may still be slow (but e.g. syncing will now generally work).

    14 million items is definitely not possible (in any meaningful sense of the word) with Zotero, though, no.
  • @adamsmith: what is the macro command in Word for opening the classic menu? I'm wondering whether I can set up a keyboard shortcut so that every time I want to insert a citation, I can jump directly to the classic menu.
  • You can set that in the Cite pane of the Zotero settings.