Best way/practice to implement reference sharing in a large structure

Hello everybody
I am seeking a solution for the management of thousands of bibliographical references in a structure comprising hundreds of people working in geosciences. Different teams are working on various regions and/or different topics related to earth sciences.
After years, thousands of papers have accumulated in various places in the tree of our shared drives, most of them duplicated. Indeed, it is such a mess that when a new team forms to work on a specific subject or region, they find it easier to create a new folder and copy the relevant articles there than to remember where they already exist.
Our goals are:
1. Save disk space and clean up our tree by removing the duplicated PDFs.
2. Gather all the PDF files in a single place.
3. Allow each member of the different teams to search quickly by keyword and any metadata to find relevant papers.
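
To get a first idea of how much is duplicated before importing anything, the shared drives could be scanned for byte-identical PDFs. This is only a minimal sketch and not part of Zotero: the function names `file_digest` and `find_duplicate_pdfs` are made up for illustration, and it assumes that duplicates are exact copies of the same file. The same article downloaded twice from a publisher may differ by a few bytes and would not be caught.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_pdfs(root: Path) -> dict[str, list[Path]]:
    """Group the PDFs under `root` by content hash.

    Only hashes shared by two or more files are returned.
    """
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        # Match the extension case-insensitively (.pdf, .PDF, ...).
        if path.is_file() and path.suffix.lower() == ".pdf":
            by_digest[file_digest(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Something like `find_duplicate_pdfs(Path("/path/to/shared/drive"))` (the path is a placeholder) would return, for each hash, the list of identical files, which could then be reviewed by hand before anything is deleted.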

Having gone through the Zotero documentation, it seems to me that it could be a good tool for setting up an efficient solution. Could you confirm this and give me some advice on how best to implement it?
My understanding of the way Zotero works is:
4. “Normal” Zotero libraries cannot be shared. Each of them belongs to only one user. They can optionally be synced through the Zotero servers (Zotero Storage) or through WebDAV, so that a library can be accessed from different computers, but only with a single user account.
5. Groups are specific libraries that are shared between group members. They are synced and shared only using the Zotero storage solution (i.e., multi-user libraries (=Groups) cannot be synced with WebDAV).

In our case, the solution I imagined would consist of:
6. Creating one large private Group in which all the articles we find in our folders would be collected and stored (let’s call it the Main Group). Their metadata would be found and downloaded from the publishers’ websites.
7. The different teams could create dedicated Groups into which they would drag the papers of interest for their work from the Main Group (and, conversely, feed the Main Group with new references of interest).
8. Each member of each team and each individual of the organisation could have his own library.
The main drawback I initially saw in this solution is that each time a Library (or a Group) is synced, its content is duplicated on the local computers. This implementation would therefore not save disk space unless only the metadata are synced. It is especially not applicable to the Main Group, which will contain thousands of references.
The idea would then be to synchronize only the metadata of the Main Group and to download only the PDFs needed in each Team Group, using the “as needed” option (“à la demande” in the French version). The members of each team would then compose their own Group by retrieving the metadata and the PDFs from the Main Group.
Do you think this is the correct way of working?
9. Both the Main and Team Groups (metadata and PDF attachments) would be saved on network drives only accessible through Zotero to avoid corruption of the database. The advantage would be automatic daily/weekly backups.
10. Each individual in the organisation would have their own Library and belong to the Main Group and to their Team Group.
11. I am even wondering whether Team Groups are really useful. Can anyone tell me the advantages and drawbacks of having several Groups in the same organisation? Instead, it could be even lighter to use specific collections for each team in the Main Group. This would make it possible to identify the references of interest for a specific theme or region, and would be easier to manage once the project has come to an end and the team is dissolved (just remove the collection). It would also save disk space since the PDFs would not be duplicated. But the risk is to end up with a really heavy Main Group.
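
To compare the two layouts in terms of disk space, here is a rough sketch with invented numbers. It assumes that each Group stores its own copy of every attachment it contains, whereas collections inside a single Group all point to the same copy. The function names, paper identifiers and sizes are hypothetical.

```python
def storage_separate_groups(groups: dict[str, set[str]],
                            sizes_mb: dict[str, float]) -> float:
    """Total size if every Group keeps its own copy of each attachment."""
    return sum(sizes_mb[p] for papers in groups.values() for p in papers)


def storage_single_group(groups: dict[str, set[str]],
                         sizes_mb: dict[str, float]) -> float:
    """Total size if teams are collections in one Group (one copy per paper)."""
    unique_papers = set().union(*groups.values())
    return sum(sizes_mb[p] for p in unique_papers)


# Hypothetical example: three papers, a Main Group and two Team Groups.
sizes_mb = {"paper_a": 2.0, "paper_b": 3.0, "paper_c": 5.0}
groups = {
    "Main": {"paper_a", "paper_b", "paper_c"},
    "Team volcano": {"paper_a", "paper_b"},
    "Team basin": {"paper_b", "paper_c"},
}

print(storage_separate_groups(groups, sizes_mb))  # 23.0
print(storage_single_group(groups, sizes_mb))     # 10.0
```

With real numbers, the gap depends on how much the teams overlap with each other and with the Main Group.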

Sorry for being so long. I tried to be as clear as possible to define my needs and facilitate your answer by numbering the different ideas.
Any help would be greatly appreciated.
Thank you for your precious time!

  • -1- Save disk space and clean up our tree by removing the duplicated
    PDFs.

    Zotero will do that.

    -2- Gather all the PDF files in a single place.

    Same.

    -3- Allow each member of the different teams to search quickly by
    keyword and any metadata to find relevant papers.

    Same.

    -4- “Normal” Zotero libraries cannot be shared. Each of them belongs
    to only one user. They can optionally be synced through the Zotero
    servers (Zotero Storage) or through WebDAV, so that a library can be
    accessed from different computers, but only with a single user account.

    Correct.

    -5- Groups are specific libraries that are shared between group
    members. They are synced and shared only using the Zotero storage
    solution (i.e., multi-user libraries (=Groups) cannot be synced with
    WebDAV).

    Correct. But unlimited storage for unlimited groups for unlimited
    users costs $120/year, total.

    -6- Creating one large private Group in which all the articles we
    find in our folders would be collected and stored (let’s call it the
    Main Group). Their metadata would be found and downloaded from the
    publishers’ websites.

    That would work.

    -7- The different teams could create dedicated Groups into which they
    would drag the papers of interest for their work from the Main Group
    (and, conversely, feed the Main Group with new references of interest).

    Possible, but mind that files are not shared between groups. If an
    item “exists in two groups”, that would mean each group has a private
    copy of the item + attachments, and changes/new attachments to one would
    not be reflected in the other. If you do want that, you would have to
    use collections within a group, rather than separate groups.

    The main drawback I initially saw in this solution is that each time
    a Library (or a Group) is synced, its content is duplicated on the
    local computers.

    I believe a change is planned that would make it possible to only
    sync attachments in use (I suppose ones you opened in Zotero), and that
    attachments that fall out of use would be locally removed, ready to sync
    again. This would work per computer, so not all systems would need to
    have the same files synced.

    This implementation would therefore not save disk space unless only
    the metadata are synced. It is especially not applicable to the Main
    Group, which will contain thousands of references. The idea would then
    be to synchronize only the metadata of the Main Group and to download
    only the PDFs needed in each Team Group, using the “as needed” option
    (“à la demande” in the French version). The members of each team would
    then compose their own Group by retrieving the metadata and the PDFs
    from the Main Group.

    I don’t know details of the new sync option of only in-use files (I
    remember it being named somewhere on the forums, but cannot find it
    right now), but that would work for your case yeah?

    -9- Both the Main and Team Groups (metadata and PDF attachments) would
    be saved on network drives only accessible through Zotero to avoid
    corruption of the database. The advantage would be automatic
    daily/weekly backups.

    Can’t speak on this. I think I’ve seen the Zotero team mention
    they’re not keen on setups like these. It’d mean Zotero can’t be used
    offline, and you don’t want to point all installs of Zotero to the same
    network drive location I think. That would work better in a linked-files
    situation, but here too that’s harder for the Zotero team to
    support.

    -11- I am even wondering whether Team Groups are really useful. Can
    anyone tell me the advantages and drawbacks of having several Groups in
    the same organisation? Instead, it could be even lighter to use specific
    collections for each team in the Main Group. This would make it possible
    to identify the references of interest for a specific theme or region,
    and would be easier to manage once the project has come to an end and
    the team is dissolved (just remove the collection). It would also save
    disk space since the PDFs would not be duplicated. But the risk is to
    end up with a really heavy Main Group.

    Groups have the advantage of (coarse-grained) edit-permission
    management. But if that’s not your concern, one big group with partial
    sync (which isn’t available today, and I don’t know if and when it will
    be available) seems like exactly what you want.

  • Just to add to that
    re: 9 -- it really depends on the exact setup of the network drive. It's generally better to have Zotero locally to prevent access errors, and the network drive would have to be individual to the person accessing it (like a roaming profile, e.g.). You can't have multiple people trying to access a Zotero database stored on a network drive at the same time.

    11 -- I'd go with team libraries if you have somewhat well-defined teams, yes. Having a giant library can be overwhelming and, e.g., search will be easier and faster in somewhat smaller libraries, as will word processor integration.
  • Thank you emilianoeheyns and adamsmith for taking the time to read my novel and giving detailed answers.

    Regarding item 9, what I had in mind is, I think, what adamsmith mentioned: the "Main Group" would be saved on a network drive but only accessed by its owner (who would have admin permissions). The other members would have it synced on their local drives (but without the attached files downloaded, otherwise their local drives would run out of space immediately). Indeed, one advantage of network drives is that they are large enough to host all these PDF files in the same place. In my mind (but I may be wrong), it would also ease the handover of the Group when the person in charge of it changes position in the organisation. It would also be a safer place, as network drives are regularly backed up.
    Would it work like that?

    Regarding item 11, I'm glad to have it confirmed that my question was not so stupid after all. It really depends on the use we'll make of this bibliographic database. I do not think people will share comments or write detailed notes on publications. They could flag some papers of interest using the colour codes or specific keywords, but nothing more. If this is done in a TEAM GROUP, it will not be shared in the MAIN GROUP, but that should not be a problem. Anyway, this detail is worth noting, thank you.
    Above all, if we decide to implement such a solution and want people to use it, it must be smooth. If it is slow, people will again copy the PDFs into various folders on the network drive and we'll be back to the present situation. In that case, TEAM GROUPS should be the best alternative.
    We just need to design the TEAM GROUPS so that they are stable over time, as suggested by adamsmith.
    Thanks to you, it is clearer in my mind.
    Thanks a lot!
  • I'd test 9 out. Running a database off a network drive can often be frustratingly slow (even on a very fast network), and there are some cases when Zotero doesn't properly recognize it on start, but basically if it works it's safe to use, so you can just test it.