Zotero best practices and pitfalls to avoid for best performance

If there is a post or page that answers these questions, please pardon the repetition and include the page link in your response.

I would like to know the recommended best practices and the typical pitfalls to avoid in order to get the best performance out of Zotero. Also, what are Zotero's limitations regarding database size?

Specifically, some of the questions I have are:

1. What kinds of PDF attachments can slow down Zotero? What if I have a PDF that is partly selectable text and partly text stored as an image?

2. How many items (each with one PDF attachment) is too many items? 1000? 5,000? 10,000?

3a. Regarding the 'Maximum pages to index per file' setting in Preferences > Search: what if I set it to 5,000? My total library has 2,000 PDFs, of which 200 have more than 100 pages and about 50 have more than 1,000 pages.

3b. Does it matter if the PDF is attached (to an item) as a file or attached as a link? Does the location of the PDF file matter?

4. Since Zotero FF works as a tab in the Firefox (FF) browser, what factors outside Zotero can slow it down in a way different than they would slow down other FF tabs?

5. Do duplicates affect performance?

6. MS Word's Track Changes was known to cause problems and I believe this is completely solved. Which means I can happily have Track Changes turned ON and still insert/edit inline citations. Right?

7a. When working on a large document with numerous citations, e.g. 500 pages with over 1,000 citations, how can I avoid long delays when inserting or editing in-text citations? I've heard of the chapter-wise approach and believe it means writing each chapter in a separate Word file and simply copying and pasting them into one file at the very end. I assume that a Zotero refresh will then reformat everything, so that no duplicates remain. This could take a long time, but it is a one-time process since the document is being compiled for submission/sharing. Right? Any other approaches that have worked as well or better?

7b. Typically, up to what page length and citation count can a person work in one document before needing to split the chapters/pages?

8. Any other recommendations?
  • 1.) types of PDFs don't matter afaik.
    2.) Depends on your computer and on which operations you're concerned with. On a reasonably fast computer, libraries with 30-40k total items (items, attachments, notes) should still be performing quite smoothly, though with some perceptible lags. The number of PDFs only affects operations involving the full text index (such as a quick search everywhere).
    3.a) it will significantly increase potential lag on import of large PDFs and slow full text searches.
    3.b) no
    4.) not that we know of.
    5.) not more or less than two different items
    6.) to the best of my knowledge, yes, though it's possible that certain operations of the Word add-on will register as a lot of changes
    7a) yes, chapters is the right approach and yes, citations will auto-refresh
    7b) no hard-and-fast rules, and it depends on the word processor (Word for Mac is by far the worst). I wouldn't author anything >100 pages with a normal number of references in a single file
    8) collapsing the tag selector on the left can speed up performance quite a bit. Having file sync enabled and being on or over your quota may cause intermittent freezes.
  • edited August 27, 2014
    Thanks, adamsmith! A few further questions:

    2. Currently my library has about 1200 parent items. Including all attachments, the total is 3000. Each time I create an item from a scholarly database or journal page, Zotero creates a snapshot attachment. Over time, I have not found any serious need for these snapshots, since most of the useful information is pulled in and included in the main entry. I never bothered to remove them because I did not see any 'harm' in keeping them. Snapshots of generic webpages are a different issue and are certainly required. Would it be a good idea to delete the scholarly database snapshot attachments?

    3a. While adding new items to Zotero, I keep the pages to index set to zero. That way, Zotero is not trying to 'import' my PDFs immediately after I add them. Thanks to Dan for the tip! However, I do wish to utilize the full text search feature. So, what I'd like to do is set the characters and pages to index spec to very high values and hit rebuild index before I go to bed. Then, next day, set the pages to index back to zero. Any potential dangers here? I haven't rebuilt my index in a while, so there may be over 300 PDFs that have not been indexed.
  • 2) well, with that size library it shouldn't matter greatly, but reducing the number of items certainly speeds things up generally speaking.

    3) I think that should work. IIRC the rebuild index function lets you pick whether to index only unindexed items or all items, and you'll obviously want to pick the former. I don't know if this makes a huge difference, but since the index is, by default, syncing, I'd imagine it's a bad idea to recreate it from scratch frequently.
  • Thanks, adamsmith. On (3), I meant to index only unindexed items and agree re-indexing the whole thing is a bad idea.

    Wait, did you say the index is syncing with my online Zotero account? Is there a file in my Zotero profile that tells me the size of my index (in bytes)? Just curious how it will increase when I hit reindex next time around.
  • edited August 29, 2014
    As for track changes, in Word 2010 and Word 2013 for Windows, if you use Fields rather than Bookmarks for the Zotero citations (which is the default), changes to citations only register as a single change ("Field Change"). They work just fine.
  • Wait, did you say the index is syncing with my online Zotero account?
    yes.
    Is there a file in my Zotero profile that tells me the size of my index (in bytes)?
    no, it's part of the SQLite database, but obviously you can check the size of that before and after indexing.
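A minimal sketch of that before-and-after check, in Python. The function names are illustrative, and the path in the usage comment is hypothetical: point it at wherever your own Zotero database file lives. File size is only a rough proxy for index size, since SQLite can reuse free pages inside the file instead of growing it.

```python
from pathlib import Path


def database_size_bytes(db_path):
    """Return the on-disk size of the given database file in bytes."""
    return Path(db_path).stat().st_size


def format_growth(before_bytes, after_bytes):
    """Describe the change in size between two measurements."""
    delta = after_bytes - before_bytes
    return f"{delta:+,d} bytes ({delta / 1_000_000:+.2f} MB)"


# Usage (hypothetical path; adjust to your own data directory):
#   before = database_size_bytes("/path/to/zotero.sqlite")
#   ... rebuild the index in Zotero, then close Zotero ...
#   after = database_size_bytes("/path/to/zotero.sqlite")
#   print(format_growth(before, after))
```

Taking both measurements with Zotero closed avoids reading the file while it is still being written.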
  • Wait, did you say the index is syncing with my online Zotero account?
    yes.
    But to be clear, you can disable it in Preferences -> Sync and it does _not_ count towards your storage quota.