Best way/practice to implement reference sharing in a large structure
Hello everybody
I am seeking a solution for the management of thousands of bibliographical references in a structure comprising hundreds of people working in geosciences. Different teams are working on various regions and/or different topics related to earth sciences.
Over the years, thousands of papers have accumulated in various places in the tree of our shared drives, most of them duplicated. It is such a mess that when a new team forms to work on a specific subject or region, its members find it easier to create a new folder and copy the relevant articles there than to remember where they already exist.
Our goals are:
1. Save disk space and clean up our tree by removing the duplicated PDFs.
2. Gather all the PDF files in a single place.
3. Allow each member of the different teams to search quickly by keywords and any metadata to find relevant papers.
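For goal 1, a first pass over the shared drives could be scripted before anything is imported into Zotero. The following is only a minimal sketch in Python, assuming that "duplicate" means byte-identical files; the function names `file_digest` and `find_duplicate_pdfs` are made up for illustration. The same article downloaded twice from different sources may differ at the byte level and would not be caught this way.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_pdfs(root: Path) -> dict[str, list[Path]]:
    """Group the PDFs under root by content hash; keep groups of 2+ files."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        # Match the extension case-insensitively (.pdf, .PDF, ...).
        if path.is_file() and path.suffix.lower() == ".pdf":
            by_hash[file_digest(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicate_pdfs(Path(...))` on the root of a shared drive returns one entry per set of identical files, which can then be reviewed by hand before anything is deleted.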
Going through the documentation of Zotero, it seems that it could be a good tool to set up an efficient solution. Could you confirm and give me some advice to implement it in the best way?
My understanding of the way Zotero works is:
4. “Normal” Zotero libraries cannot be shared. Each of them belongs to only one user. They can optionally be synced to a Zotero server (Zotero Storage) or via WebDAV, so that a library can be accessed from different computers, but only with a single user account.
5. Groups are specific libraries that are shared between group members. They are synced and shared only using the Zotero storage solution (i.e., multi-user libraries (=Groups) cannot be synced with WebDAV).
In our case, the solution I imagined would consist in:
6. Creating one large private Group in which all the articles we find in our folders would be collected and stored (let’s call it the Main Group). Their metadata would be retrieved from the publishers’ websites.
7. The different teams could create dedicated Groups into which they would drag the papers relevant to their work from the Main Group (and, conversely, feed the Main Group with new references of interest).
8. Each member of each team and each individual of the organisation could have his own library.
The main drawback I initially saw in this solution is that each time a library (or a Group) is synced, its content is duplicated on the local computers. This setup would therefore not save disk space unless only the metadata are synced. It is especially impractical for the Main Group, which will contain thousands of references.
The idea would then be to sync only the metadata of the Main Group and to download only the PDFs that are needed in each team library, using the “as needed” option (“à la demande” in the French version). The members of each team would then build their own Group by retrieving the metadata and the PDFs from the Main Group.
Do you think this is the correct way of working?
9. Both the Main and Team Groups (metadata and PDF attachments) would be saved on network drives accessible only through Zotero, to avoid corruption of the database. The advantage would be automatic daily/weekly backups.
10. Each individual in the organization would have his own library and belong to the Main Group and to his Team Group.
11. I am even wondering whether Team Groups are really useful. Can anyone tell me the advantages and drawbacks of having several Groups in the same structure? Instead, it could be even lighter to use specific collections for each team in the Main Group. This would make it possible to identify the references of interest for a specific theme or region, and would be easier to manage once the project has come to an end and the team is dissolved (just remove the collection). It would also save disk space since the PDFs would not be duplicated. But the risk is ending up with a really heavy Main Group.
Sorry for being so long. I tried to be as clear as possible to define my needs and facilitate your answer by numbering the different ideas.
Any help would be greatly appreciated.
Thank you for your precious time!
Zotero will do that.
Same.
Same.
Correct.
Correct. But unlimited storage for unlimited groups for unlimited
users costs $120/year, total.
That would work.
Possible, but mind that files are not shared between groups. If an
item “exists in two groups”, that would mean each group has a private
copy of the item + attachments, and changes/new attachments to one would
not be reflected in the other. If you do want that, you would have to
use collections within a group, rather than separate groups.
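To make the difference concrete, here is a toy model in Python. It is only an illustration of the behaviour described above, not Zotero's actual data model or API; the `Item` and `Group` classes and the `copy_to_group` helper are invented for the example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    attachments: list[str] = field(default_factory=list)


@dataclass
class Group:
    name: str
    items: list[Item] = field(default_factory=list)
    # A collection is a named list of references to items of this group.
    collections: dict[str, list[Item]] = field(default_factory=dict)

    def add_to_collection(self, collection: str, item: Item) -> None:
        self.collections.setdefault(collection, []).append(item)


def copy_to_group(item: Item, target: Group) -> Item:
    """Copying across groups creates an independent item with its own attachments."""
    clone = copy.deepcopy(item)
    target.items.append(clone)
    return clone


main = Group("Main Group")
paper = Item("Some paper", ["paper.pdf"])
main.items.append(paper)

# Option A: a separate team group gets its own private copy.
team = Group("Team Group")
team_copy = copy_to_group(paper, team)
team_copy.attachments.append("notes.pdf")
# paper.attachments is still ["paper.pdf"]

# Option B: a collection inside the Main Group points at the same item.
main.add_to_collection("Team A", paper)
main.collections["Team A"][0].attachments.append("supplement.pdf")
# paper.attachments is now ["paper.pdf", "supplement.pdf"]
```

In option A the new attachment stays in the team's copy only; in option B there is a single item, so everyone looking at it through any collection sees the same attachments.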
I believe a change is planned that would make it possible to only
sync attachments in use (I suppose ones you opened in Zotero), and that
attachments that fall out of use would be locally removed, ready to sync
again. This would work per-computer, not all systems would need to have
the same files synced.
I don’t know details of the new sync option of only in-use files (I
remember it being named somewhere on the forums, but cannot find it
right now), but that would work for your case yeah?
Can’t speak on this. I think I’ve seen the Zotero team mention
they’re not keen on setups like these. It’d mean Zotero can’t be used
offline, and you don’t want to point all installs of Zotero to the same
network drive location I think. That would work better in a linked-files
situation, but here too that’s harder for the Zotero team to
support.
Groups have the advantage of (coarse-grained) edit-permission
management. But if that’s not your concern, one big group with partial
sync (which isn’t available today, and I don’t know if and when it will
be available) seems like exactly what you want.
re: 9 -- really depends on the exact setup of the network drive. It's generally better to have Zotero locally to prevent access errors, and the network drive would have to be individual to the person accessing it (like a roaming profile, e.g.). You can't have multiple people trying to access a Zotero database stored on a network drive at the same time.
11 -- I'd go with team libraries if you have somewhat well-defined teams, yes. Having a giant library can be overwhelming and, e.g., search will be easier and faster in somewhat smaller libraries, as will word processor integration.
Regarding item 9, what I had in mind is, I think, what adamsmith mentioned: the "Main Group" would be saved on a network drive but only accessed by its owner (who would have admin permissions). The other members would have it synced on their local drives (but without the attached files downloaded, otherwise their local drives would run out of space almost immediately). Indeed, one advantage of network drives is that they are large enough to host all these PDF files in the same place. In my mind (but I may be wrong), it would also ease the handover of the Group when the person in charge of it changes position in the organisation. It would also be a safer place, as network drives are regularly backed up.
Would it work like that?
Regarding item 11, I'm happy to have it confirmed that my question was not so stupid after all. It really depends on how we will use this bibliographic database. I do not think people will share comments or write detailed notes on publications. They could flag some papers of interest using the color codes or specific keywords, but nothing more. If this is done in a TEAM GROUP, it will not be shared in the MAIN GROUP, but that should not be a problem. Anyway, this detail is worth noting, thank you.
Above all, if we decide to implement such a solution and want people to use it, it must run smoothly. If it is slow, people will again copy the PDFs into various folders on the network drive and we'll be back to the present situation. In that case, TEAM GROUPS should be the best alternative.
We just need to design the TEAM GROUPS so that they are stable over time, as suggested by adamsmith.
Thanks to you, it is clearer in my mind.
Thanks a lot!