Zotero Sync: what about security and privacy?

Dear,

It's clear that Zotero is a great «free, easy-to-use tool to help you collect, organize, cite, and share your research sources.»

But what about security and privacy?

I think it would be great if Zotero implemented the Firefox Sync approach and encrypted all data locally by default <http://gregoryszorc.com/blog/2012/04/08/comparing-the-security-and-privacy-of-browser-syncing/>.

Don't you think so?

Jean-Martial
  • See https://forums.zotero.org/discussion/14254/sync-password-sent-in-clear-text-or-encrypted/

    For local storage, wouldn't it be better to just encrypt your entire hard drive?
  • Hello Rintze,

    Thanks for your answer, but «All sync and API traffic—credentials and data—is sent encrypted» means «sent through an encrypted tunnel, but stored on the server exactly as the client sent it».

    With the Firefox Sync mechanism, users don't have to worry about their privacy or need the skills to encrypt an entire hard drive.

    Anyway, what matters most to me is the server side, and according to the Firefox Sync Privacy Policy the rules are clear:

    • Your data is only used to provide the Firefox Sync service.
    • Firefox Sync on your computer encrypts your data before sending
      it to us so the data isn’t sitting around on our servers in a
      usable form.
    • We don’t [can't] sell your data or use ad networks on the Firefox Sync webpages or service.
    That's the kind of policy implementation I want to talk about.

    Jean-Martial
  • Not sure what you mean by "policy implementation" — we're obviously not going to sell your data.

    But client-side encryption is another matter. The thing that people tend to overlook is that client-side encryption is incompatible with web-based access to data. You can't — and couldn't, securely — access your bookmarks online with Firefox Sync. It'd be the same with web-based access to your Zotero libraries, which is a big part of the appeal of Zotero for many people. (Some services offer selective web-based sharing, but the only way to do that is by sharing your key with the service.)

    I'd personally love to offer the ability to enable client-side encryption for particular libraries, with the understanding that doing so would disable web access for those libraries. That'd be a huge amount of work, though, and ultimately the demand for such an undertaking just isn't there. So short of lots more interest, grant funding, or someone taking it upon themselves to hack this into the client and server, it's pretty unlikely. (And any such implementation would also require review by a cryptography expert.)
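A minimal sketch of the trade-off described above, assuming a hypothetical `Server` class and a toy cipher (neither is Zotero or Firefox Sync code). The key never leaves the client, so the server holds only opaque bytes and has nothing it could render in a web interface. The SHAKE-256 keystream is for illustration only: it provides no authentication, and a real implementation would use a vetted library and expert review.

```python
import hashlib
import os


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHAKE-256 keystream. NOT for real use."""
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse of encrypt(): split off the nonce and XOR again."""
    nonce, body = blob[:16], blob[16:]
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class Server:
    """Hypothetical sync server: stores whatever bytes it is given."""

    def __init__(self) -> None:
        self.items: dict[str, bytes] = {}

    def put(self, item_id: str, blob: bytes) -> None:
        self.items[item_id] = blob

    def get(self, item_id: str) -> bytes:
        return self.items[item_id]


client_key = os.urandom(32)  # generated and kept on the client only
record = b'{"title": "Example item", "itemType": "book"}'

server = Server()
server.put("item-1", encrypt(client_key, record))

stored = server.get("item-1")             # what the server (and a web UI) sees
recovered = decrypt(client_key, stored)   # only possible where the key lives
```

Web access to such a library would require giving the key to the service, which is the selective-sharing case mentioned in the post.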
  • Hello Dan,

    Thanks for your answer. I understand your technical choices, given the huge amount of work it would take for CHNM to offer client-side encryption.

    Do you know why we chose Zotero to be the [bibliographic] knowledge manager of our organisation? Because the features are great AND it's libre software AND also because of Roy Rosenzweig's democratic sensibility in humanistic scholarship and teaching.

    Do you know why, for example, we'll never choose to use Google services and products? Because although the features are great, our privacy is definitely not guaranteed AND, decades after the «Don't be evil» slogan, we are very far from the original spirit…

    Nowadays, we don't want restrictions, and we think it's «naïve» to presume what people are interested in on their behalf: more and more people don't want «Customized recommendations».

    (Sorry, there are a lot of things to say about this, but my English is very poor and it is difficult to develop my thoughts fully and precisely…)

    The question is *very* interesting.

    Work in progress…

    Jean-Martial
  • edited November 6, 2013
    I support the idea of having the option of client encryption. I believe I would trade web-access for encryption for many of my libraries. Privacy on the net is really a concern nowadays. I'd like to have control of what personal information I share with others.
  • Another thing to take into account is that locally encrypted data introduces a big hurdle for group sharing. Essentially, data stored in groups would still have to be left unencrypted.

    @Dan, out of curiosity, does Zotero employ any sort of de-duplication strategy for metadata (probably not) or file storage on the server?
  • > Essentially, data stored in groups would still have to be left unencrypted.
    I'm not sure that's true, but certainly it'd be a lot more complicated.
    > @Dan, out of curiosity, does Zotero employ any sort of de-duplication strategy for metadata (probably not) or file storage on the server?
    No for metadata. For file storage, yes, identical files are only uploaded once, globally for all users, based on the file hash. So the user advantage is that repeat uploads or uploads of files that other people have uploaded are instantaneous. The privacy implication (if that's what you were getting at) is that it's technically possible to know if a specific given file has been uploaded by anyone previously.
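The hash-based deduplication described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Zotero's server code: the `DedupStore` class is hypothetical, and SHA-256 is assumed because the thread does not say which hash is used.

```python
import hashlib


class DedupStore:
    """Hypothetical global file store keyed by content hash."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}      # hash -> file contents
        self.owners: dict[str, set[str]] = {}  # hash -> users referencing it

    def has(self, digest: str) -> bool:
        """True if some user has already uploaded a file with this hash."""
        return digest in self.blobs

    def upload(self, user: str, data: bytes) -> bool:
        """Register the file for `user`; return True only if bytes were stored."""
        digest = hashlib.sha256(data).hexdigest()
        self.owners.setdefault(digest, set()).add(user)
        if digest in self.blobs:
            return False  # already present: the upload is effectively instant
        self.blobs[digest] = data
        return True


store = DedupStore()
pdf = b"%PDF-1.4 example file contents"

first = store.upload("alice", pdf)   # stored for the first time
second = store.upload("bob", pdf)    # deduplicated against alice's copy

# The privacy implication: anyone who can query by hash can learn
# whether a specific known file has been uploaded before.
known = store.has(hashlib.sha256(pdf).hexdigest())
unknown = store.has(hashlib.sha256(b"some other file").hexdigest())
```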
  • > Essentially, data stored in groups would still have to be left unencrypted.
    > I'm not sure that's true, but certainly it'd be a lot more complicated.
    The only way to share encrypted data with someone else is to also share the encryption key (I don't think asymmetric encryption would work in this case). For this system to work, zotero.org could not serve as the middleman that distributes the key. So in order to receive the key, you would have to establish a direct secure connection to the library owner's Zotero client. The alternative is that the library owner gives you the key through other channels, perhaps in the form of a group library password.

    I think this is probably the main reason why SpiderOak's file sharing functionality is so poor.
    > For file storage, yes, identical files are only uploaded once, globally for all users, based on the file hash [...] The privacy implication (if that's what you were getting at) is that it's technically possible to know if a specific given file has been uploaded by anyone previously.
    No, but I was thinking how this functionality would have to be sacrificed for locally encrypted files. Not a huge deal I suppose, but still.
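One way the "group library password" idea from this comment might look, as a rough sketch only: every member derives the same key locally from a password exchanged through another channel, so the server never handles the key. The function name `derive_group_key`, the iteration count, and the idea of storing a public salt alongside the library are all assumptions for illustration; a real design would need review by a cryptography expert, as noted earlier in the thread.

```python
import hashlib
import os


def derive_group_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key from a shared password using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is not secret; in this sketch it could sit next to the group
# library on the server. The password travels only through other channels.
group_salt = os.urandom(16)

owner_key = derive_group_key("correct horse battery staple", group_salt)
member_key = derive_group_key("correct horse battery staple", group_salt)
outsider_key = derive_group_key("a wrong guess", group_salt)
```

The owner and the member end up with identical keys without the server ever seeing one, while anyone lacking the password derives something different.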
  • > So in order to receive the key you would have to establish a direct secure connection to the library owner's Zotero client. The alternative is that the library owner can give you the key through other channels in a form of group library password perhaps.
    Right. I'm not saying the usability would necessarily be great, but at least for some uses — for example, in a lab/office — exchanging a setup token out-of-band (as in Firefox Sync's key exchange) would be pretty feasible. I imagine there are even better options — we wouldn't be the first people to want a group key exchange protocol. But that's where the cryptography experts would have to step in, and a small indication of how serious an undertaking this would be.

    Realistically, this isn't going to happen without grant funding, but I have no idea how interested funding organizations would be in a project like this. One concern I have is that it's not clear to me that client-side encryption of data that's then synced to Zotero servers would even be compatible with institutional policies that currently prevent certain groups from using Zotero (and that motivate some of the attempts to run the Zotero dataserver code). And I don't know how the population of users under such policies compares to the population of casual users who simply don't want to upload unencrypted data to an external server.
    > I think this is probably the main reason why SpiderOak's file sharing functionality is so poor.
    I haven't used SpiderOak, but my understanding is that you actually decrypt the content and store it with them in order to share. So they're basically avoiding the problem.
    > No, but I was thinking how this functionality would have to be sacrificed for locally encrypted files. Not a huge deal I suppose, but still.
    True. Encrypted libraries would definitely cost us more, though I'd say we see deduplication largely as a nice optimization (for us and for users) rather than as a fundamental design requirement.
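To illustrate why deduplication would be lost for encrypted libraries, here is a small sketch under the same caveats as any toy example: the SHAKE-256 keystream cipher is a stand-in for a real cipher and must not be used in practice, and SHA-256 is an assumed choice of hash. Two users uploading the same file produce the same plaintext hash but different ciphertext hashes, so the server can no longer match them.

```python
import hashlib
import os


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher with a random nonce. Illustration only, NOT secure."""
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def file_hash(data: bytes) -> str:
    """Hash the server would use as its deduplication key (assumed SHA-256)."""
    return hashlib.sha256(data).hexdigest()


pdf = b"%PDF-1.4 the same article, downloaded by two users"

alice_key = os.urandom(32)
bob_key = os.urandom(32)

# Unencrypted uploads: identical hashes, so the second upload is skipped.
plain_match = file_hash(pdf) == file_hash(pdf)

# Client-side encrypted uploads: the server only ever sees ciphertext.
alice_blob = encrypt(alice_key, pdf)
bob_blob = encrypt(bob_key, pdf)
encrypted_match = file_hash(alice_blob) == file_hash(bob_blob)

# Even one user re-uploading the same file gets a new ciphertext,
# because each encryption uses a fresh random nonce.
reupload_match = file_hash(alice_blob) == file_hash(encrypt(alice_key, pdf))
```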