Zotero development strategy, markdown, batch editing, and pdf reader notes—long!

Apologies in advance for the length of this essay. I am really hoping that the following comments are taken in the constructive manner they are intended and that the team responds. In another thread, @dstillman made this comment about the future plans/strategy for Zotero:

"We spent a lot of time thinking about this when building the new note editor and ended up deciding that standalone notes should be the mechanism for outlining. You can drag items, annotations, non-annotation PDF text, and other notes into a standalone note and then drag them around as blocks to rearrange them, with all the benefits that a note editor gives you of being able to type around them, flesh them out, add lists and other formatting, etc. The idea is to make it possible to go from annotations to notes to a draft that can be inserted into a document with active Zotero citations....(There is, however, much more we want to do with notes, so what's there now shouldn't be considered the final feature set.)"


Zotero is, in my mind, clearly the best-of-class reference manager in addition to the benefits of its being free and open-source. There are other excellent apps for notetaking, outlining, drafting, and revising text, and Zotero wisely integrates with many of them. Is Zotero's ambition to expand its functionality to be a platform for some of these functions? It seems from @dstillman's comment that the answer is yes, and it would be fantastic if it was able to do these things as well as existing apps.

My concern is this: there has been, over the last several years, an absolute explosion of activity and interest in notetaking, outlining, and Zettelkasten tools that are neither reference managers nor traditional word processors, but fill the gap between them, and offer functionality that most researchers never even realized they wanted but now can't imagine living without. Most of these offer immensely useful customizability, automation, search/filter, and linking and backlinking features that Zotero should never aim to duplicate, because they don't make sense for a reference manager and because, given the speed of development of these other platforms, Zotero would never catch up in any case.

When I first learned about Zotero's development of an internal pdf reader and new note system, honestly, I was unhappy. I thought, "Why is Zotero spending development time reinventing the wheel? There are many other tools that let users annotate pdfs and search notes, and Zotero has a lot of other things to improve." Of course it is nice in principle to have everything in one tool, but users will always need multiple apps, so needing fewer apps is a nice-to-have at best. Now that I have seen the pdf reader and note system, I have mixed feelings. On the one hand, it does seem great in many ways. On the other hand, Zotero is potentially at a turning point and it may go down (in my mind) a suboptimal road.

I am concerned that the Zotero dev team may have a slightly "siloed" view of user needs, which is *not* a criticism, but is just something that happens to any team focused on making a great product. Zotero had to integrate with word processors, did it early, and they are now talking about closing the gap between Zotero managing references and word processing an article by adding outlining.

The problem is that notetaking and outlining capabilities have radically changed in the last several years and Zotero will never catch up in those areas. So in my view it makes much more sense for Zotero to do with notes and annotations what it previously did with citations: focus on seamless and maximally efficient interoperability with other programs that are focused on and better suited for manipulating, organizing, and searching notes, as word processors are for drafting and revision. Zotero's annotation system is much better off seamlessly integrating with existing notetaking and outlining apps than trying to duplicate those functions.

AND VERY IMPORTANTLY, there are many functions central to reference management that Zotero has not had the capacity to improve even though they have been requested for many years. These are things that users *can't* do through other programs, at least not without command-line tools. Things such as: proper syncing and merging between personal and group libraries, better group library permissions, batch editing of item data (like changing all `Place: N.Y.` to `Place: NY`), and other requests that devs have not rejected but have simply answered with "no one is able to do that now" or "no one is interested in that now". I would rather have batch data editing 100x more than an additional way to annotate pdfs, which I can already do in many ways. This is not a slam about the hard work that went into the new pdf reader, which I like and have already given many suggestions about improving, but it is hard to see a new "hot" feature like this when I assume that adding it took 100s of hours away from potentially figuring out how to batch edit data. As someone who has tried to work using group libraries, and spent dozens of hours manually editing data item by item, I can't understand how a reference manager dev team doesn't see these as solvable, priority problems (of course, I'm not on the team, so I may be missing something).

The devs have made it very clear over the many years I've used Zotero that they do not want to pick features based on user "votes", and I get their position and have come to terms with it, but I am hoping that they will agree that Zotero should focus on features most central to Zotero's ability to be a great, flexible, easy-to-use *reference manager* first, and focus on *interoperability* second, and only *then* focus on adding functionality to Zotero such as outlining or drafting that is already available and widely used in other free and/or open-source programs.

Finally, perhaps Zotero developers are regularly interacting with vibrant notetaking/annotating/outlining/knowledge management communities, such as those around Roam Research, Athens, Obsidian, Logseq, Checkvist, Freeplane, Transno, Readwise, Hypothes.is, etc. that live on Discord servers and dedicated forums, but if not, I strongly encourage them to do so. There, they will find lots of intelligent contributors who, like the Zotero team, have expertise in knowledge management, and huge numbers of users who want to use these apps with Zotero. Many of them are already doing so, and I would love to see Zotero move forward with them in a coordinated way to meet user needs rather than in an isolated way focused on the early workflow of just Zotero + word processor.

To be clear, I know Zotero has lots of export options, a public API, answers questions on the forum, and is happy to have contributors who would like to add these features. Nonetheless, the question is not whether there is an alternate way for others to solve core problems like personal/group library sync or batch export by learning to program, the question is should the core Zotero team focus on becoming an outliner/drafting program before addressing some of the key, known, agreed-upon gaps in its reference management capabilities. I'm not trying to tell the Zotero team what to do, I am just asking the team to address these decade-old gaps and to develop the notetaking system towards interoperability with markdown-based editors rather than building out the notetaking feature set in a less-interoperable way.

...continued next message!
  • ...continued

    A final example: one request that was discussed elsewhere and rejected was automatically writing Zotero-made notes into the pdf, which would be essential for users who want to annotate pdfs in other programs in addition to Zotero. Currently, users who want to edit Zotero pdf annotations must manually export and reimport each source. Elsewhere, @dstillman has given some good reasons not to allow external annotations and automatically write Zotero notes back to the file, such as potential file conflicts and an increase in files that must be synced. From my perspective, the ability to annotate pdfs using multiple programs (mobile, tablet, computer) is a high-priority item because without it people will (in practice) have to either 1) make pdf annotations with Zotero and nowhere else or 2) continue their current workflows and thus get no benefit from the pdf viewer/notes function that I assume took tons of time to develop. Of course people can just ignore the new pdf reader, but what a waste. It's just a fact that lots of people consider multi-device editing a requirement. I'm not even one of them! I do everything on PC. But I know it's important to others.

    So to sum up this long essay, here's what I would love to see:
    1) prioritization of adding long-requested functions that are central to Zotero's reference manager role and that currently cannot be done non-manually or which require difficult workarounds
    2) prioritization of interoperability in development of the new pdf reader/notetaking functions, with the assumption that many users will be using multiple apps and platforms as part of their research workflows
    3) deprioritization of developing new functions for workflows (such as outlining or drafting in Zotero) that are already available with existing apps, including free and open-source options

    It is just not possible for one app to deliver all of the amazing functionality available for research these days, and no app should try. Thanks for reading.
  • Oby
    edited June 25, 2021
    Thank you for a fantastic essay that also reflects many of my thoughts as a regular Zotero user for more than a decade. I was also a bit puzzled by the choice to prioritize a PDF editor and better notetaking functionality over longstanding issues such as item types etc. For what seems like years we have been hearing that additional item types, field improvements, etc. are just on the horizon (in the form of Zotero 5.1). If I were calling the shots, I would rather have poured hundreds of dev hours in that direction than into "reinventing" the PDF editor and notetaking app.

    I would also have made a serious effort at integrating some of Frank Bennett's work (Juris-M) into mainline Zotero. Juris-M is an amazing project, but it is a one-man show and thus not sustainable in the long run. It also makes little sense to have a Zotero fork providing additional features that could just as well be part of mainline Zotero. Of course it requires a lot of development effort to smoothly integrate such additional features in Zotero, which is probably mostly used by non-lawyers and unilingualists. And that's why I have never tried to "nag" on the forums for it to happen, but have instead been patiently waiting and hoping. Now that I see that there are apparently a lot of dev resources available, at least for PDF/notetaking development, I can't help but bring it up, however.

    A broader issue here is that it is really difficult for users to get any idea of where Zotero development is going. Years ago there used to be a roadmap on the Zotero wiki that was kept up to date. That is no longer the case. Then there was the blog, but that is hardly updated these days, and usually only when a feature is already rolled out. The zotero-dev mailing list does not contain much information about the direction the project is taking, and neither do the GitHub pages (milestones and projects appear not to be in use anymore). I fully understand that "real" development discussion is best done in a private channel, but users should also have some high-level insight into what the team is planning. It seems as if a lot of very interesting things are happening, and it would be exciting to see a rough outline of what is planned. And perhaps be able to give some input via the forums on how the planned features could be designed.

    (Sorry if this reads like a complaint - I love Zotero, and the new PDF/notetaking features are probably going to be fantastic. I just wish there was a bit more openness with regard to what direction Zotero is taking, which in turn would enable discussion of whether that is the right direction.)
  • Thanks for your comment. I do want to be careful that this thread not focus on general complaints re: Zotero communicativeness or roadmap, because those discussions have happened on these forums many times before, and they rarely end up being constructive. Users always want more information and voice in development; developers always want to feel free to make their design choices without pressure from others who are farther from and know less about the product. I suspect that if lots of people ask for a roadmap the dev team will feel it as unwelcome pressure. So I hope we can keep things focused on the specific issues around what key functions should be prioritized, such as your good example of Juris-M.
  • @realtime99 What is the state of the art for pdf annotations and notetaking?

    Each time I've done a survey in the past, I found the tools massively lacking.
    I am especially looking for a way to have documents and text side by side and linked, so that I can easily jump between annotations and the original document. I am looking for a strictly offline tool with no cloud requirements. The best tool I have found is org-noter, which allows the use of formulas, images, etc. but overall is still a bit too flaky for my liking.

    I am definitely looking forward to trying out the new note editor in Zotero!

    In my opinion, Zotero is just too big a program, and the user community too diverse, for the organization behind Zotero to support every use case. As I am not even a paying customer, I have zero say and feel bad even writing this comment.

    Zotero is a fantastic tool, and I am not in a position to make any suggestions for the development, but my biggest wish is to have a local API that would allow regular users to extend Zotero.

    For example, there is still no good way to do regular one-way contributions from one library to another. With a local API, it would be trivial to write such an add-on and even maintain the original creation data of each document.

    Similarly, I just spent many hours cleaning up automatic tags. I will have to do all of the work again eventually, as new articles accumulate. If I could just maintain a list of tag replacements externally, I could write a script that automatically does this for me.
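
    A minimal sketch of what such a script might look like, assuming the replacement list is a plain text file with one `old tag -> new tag` rule per line. The file format and the function names `load_replacements` and `apply_replacements` are my own assumptions for illustration, not anything Zotero provides; reading tags from a library and writing them back is deliberately left out.

```python
from typing import Dict, List


def load_replacements(text: str) -> Dict[str, str]:
    """Parse 'old tag -> new tag' rules (hypothetical format).

    Blank lines, '#' comments, and lines without '->' are skipped.
    An empty right-hand side means "delete this tag".
    """
    mapping: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        old, sep, new = line.partition("->")
        if not sep:
            continue  # malformed rule, ignore it
        mapping[old.strip()] = new.strip()
    return mapping


def apply_replacements(tags: List[str], mapping: Dict[str, str]) -> List[str]:
    """Return the item's tags with rules applied, without duplicates, in order."""
    seen = set()
    result: List[str] = []
    for tag in tags:
        new = mapping.get(tag, tag)  # tags without a rule are kept unchanged
        if new and new not in seen:
            seen.add(new)
            result.append(new)
    return result


rules = load_replacements(
    "# automatic tags to clean up\n"
    "Machine learning -> ML\n"
    "machine-learning -> ML\n"
    "Junk ->\n"
)
cleaned = apply_replacements(
    ["Machine learning", "machine-learning", "Junk", "MRI"], rules
)
```

    Because the rules live in an external file, the same cleanup could be rerun whenever new articles bring in new automatic tags.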

    Another serious limitation in my opinion is the lack of a hierarchical folder structure that sorts tags into collections automatically. Zotero has search folders, but they are not hierarchical, so currently there is no way to automatically sort something that is tagged with e.g. ML, medical_imaging into the medical_imaging subcollection of ML, but also into the Medicine/imaging/ML category.
    This has been discussed for many years, but so far no one has come up with a good solution (https://forums.zotero.org/discussion/comment/379233#Comment_379233, https://forums.zotero.org/discussion/comment/381436#Comment_381436).
    I am sure there are very good reasons that such a system couldn't be implemented, and I understand that the Zotero organization comes from a different field. If a hierarchical tag system was implemented, it might not fit my personal system anyway.

    My point is that with an API, all of these things could be implemented very easily in a few lines of external code. Read all of the tags, parse them, read an external txt file that describes the desired hierarchy, check if the collections exist, and then place the items into these collections. Not real-time, but good enough for my purpose.
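
    To make those steps concrete, here is a rough sketch that works on plain in-memory data. Everything in it is an assumption for illustration: the rule format (`tag, tag => Collection/Path`, one rule per line), the function names, and the idea that items arrive as a mapping from item key to a set of tags. Actually fetching items and creating collections would still need whatever API Zotero exposes, which is not shown.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (required tags, collection path)


def parse_rules(text: str) -> List[Rule]:
    """Parse 'tag1, tag2 => Parent/Child' lines (hypothetical format)."""
    rules: List[Rule] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tags_part, sep, path = line.partition("=>")
        if not sep:
            continue  # not a rule line
        tags = frozenset(t.strip() for t in tags_part.split(",") if t.strip())
        path = path.strip()
        if tags and path:
            rules.append((tags, path))
    return rules


def assign_collections(
    items: Dict[str, Set[str]], rules: List[Rule]
) -> Dict[str, List[str]]:
    """Map each collection path to the item keys carrying all of the rule's tags."""
    placement: Dict[str, List[str]] = {}
    for tags, path in rules:
        bucket = placement.setdefault(path, [])
        for key in sorted(items):
            if tags <= items[key] and key not in bucket:
                bucket.append(key)
    return placement


def missing_collections(
    placement: Dict[str, List[str]], existing: Set[str]
) -> List[str]:
    """Collection paths the rules need that do not exist yet."""
    return sorted(path for path in placement if path not in existing)


rules = parse_rules(
    "ML, medical_imaging => ML/medical_imaging\n"
    "ML, medical_imaging => Medicine/imaging/ML\n"
    "ML => ML\n"
)
items = {"A": {"ML", "medical_imaging"}, "B": {"ML"}, "C": {"history"}}
placement = assign_collections(items, rules)
to_create = missing_collections(placement, existing={"ML"})
```

    Here an item tagged with both ML and medical_imaging lands in both paths, which is the behaviour I described above; the script would be run periodically rather than in real time.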

    Importing is another difficult area. Many users are struggling with bulk imports. The Zotero metadata retrieval is great, but not foolproof. If I could write an external script to call this method, I could e.g. generate a list with the original path and the extracted metadata, see if it makes sense, and potentially revert the import.

    Finally, with a complete API, it would be easy to integrate many of the external note-taking tools you mentioned and create a playground for users to try out new things before integrating them into Zotero directly. In the past, I have e.g. written an org-mode integration that allows side-by-side annotations of pdfs using interleave or org-noter.
    If a better annotation tool came around, I would be happy to throw org-mode out.
  • @realtime99: I appreciate your perspective here, but I hope you'll understand that it's very much that — the perspective of a very advanced longtime Zotero user.

    There are plenty of extremely important features that we're either still planning or actively working on, but lack of a built-in PDF reader was by far the most common complaint about Zotero and the number one thing keeping people from using it. If you can't see that…well, you were presumably happily using Zotero with an external PDF reader! But there's simply no argument that this wasn't the most important development we could make for Zotero's future, and the overwhelmingly positive public response to the beta has made that pretty clear.

    Similarly, the idea that we should focus on first-party integration with Zettelkasten tools — something that the vast majority of people have never heard of — rather than having a clear story for a research workflow from PDFs through to a Zotero-enabled draft in Word, LibreOffice, or Google Docs (and extensibility for other use cases) is frankly just completely misunderstanding the needs of the average Zotero user. It's also based on a misconception that I tried to dispel in the thread that presumably prompted this — that we've done anything whatsoever to impede interoperability with third-party note-taking tools. The new PDF reader and note editor will enable more advanced functionality with external tools, not less. Some things just haven't been implemented yet, by us or by plugin developers. This stuff is new. Give it time.
    "It's just a fact that lots of people consider multi-device editing a requirement. I'm not even one of them! I do everything on PC. But I know it's important to others."
    We agree, which is why we also built an iOS app that works seamlessly with the new PDF functionality. If you don't use iOS, this indeed probably isn't very important to you, but it made a lot of people very happy.

    For the rest, the big longstanding things that haven't been done, such as updated item types/fields and batch editing, are often blocked for specific technical reasons unrelated to the actual functionality. That's not an excuse — solving technical problems is our job, and we've certainly made mistakes in prioritization over the years — but it's just to say that the reasons specific things happen before others aren't always easy to perceive from the outside. In any case, we're well aware of the importance of some of these features, and we expect to get to many of them in the Zotero 6 cycle.

    Finally, I'll note that one nice side effect of rolling out popular features is that it lets you hire more people to work on things. We have more developers on staff than we've ever had (including someone working full time to improve saved metadata quality, which we feel is a pretty core quality-of-life issue). Things are happening. Zotero's future is bright. Stay tuned.

    (On Jurism, I've been pretty clear that we don't plan to integrate most of its functionality.)
  • @dstillman, thanks for the comment. Before I respond more specifically, I have an honest question. You made a number of claims about user preferences, such as lack of a built-in pdf reader being the number one thing keeping people from using Zotero. Is this belief based on forum comments, informal discussions with librarians, random people emailing you, etc? I'm asking in a 100% non-sarcastic way—if you did an actual survey on this among people who use Zotero competitors, and my intuitions are simply wrong, I'll grant the point without hesitation. But discussions like this often get stuck over contrasting intuitions that are heavily biased due to the different groups that individuals tend to hang out in (like me with notetaking apps), and I do think it's important to match confidence in claims to the validity of the methods used to collect data.

    Also, is it the case that Zotero's feature development is heavily influenced by discussions with institutional leaders who are trying to choose between Endnote, Mendeley, or other reference managers that (I am guessing) include pdf readers, and you see a primary goal of Zotero as winning over these people to increase Zotero's userbase and thus impact? I'm not criticizing that possible goal, just trying to understand things.
  • I don't know if Zotero does additional surveys/user research, but I suspect dstillman and colleagues see the same thing as me, i.e. thousands of comments on the forum, which they all read, as well as a saved search for Zotero on Twitter which has dozens of daily hits.

    I think the combination of both does give you a very broad and quite representative picture of users' problems and demands and the PDF reader did/does, in fact, come up all the time.
  • @adamsmith , thanks for participating! I would be interested to hear your response to my essay.

    @dstillman , on rereading your comment multiple times, I am now more confused than before about how Zotero sets its development goals. In previous discussions on the forum over the years, I got the clear sense that the team tried to make decisions about functionality *not* based on user demand, but based on what made logical sense for a reference manager to do, just considering it as a tool in itself and what was technically possible. This was provided as a reason to not create something like a "votebox" to gather data about user wants. Granted, that was years ago, so opinions may have changed. But your comment makes reference to e.g. making iOS users "very happy" and meeting demands of Zotero users who complained about lack of a PDF reader. You also talk about what prevents people from using Zotero. Without surveys of non-Zotero users, I have no idea how anyone could have any idea about what might be preventing people from using Zotero (I assume they don't create forum accounts to post about their not using Zotero). Perhaps I misunderstood.

    I also see a new factor implied when you talk about "Zotero's future", but I'm not sure what it is. What does it mean for a feature to be important for Zotero's future? Why would a new pdf reader be important for Zotero's future, but batch editing wouldn't?
  • I don't have that much to add here. I think some of Zotero's governance/development is more secretive than I'd do it (e.g., I think a public roadmap makes sense), I obviously don't think they always get all their priorities right (but I'm also not part of all the relevant technical discussions, so can't judge, e.g., where there are actual trade-offs and where there's parallel development), but in terms of the PDF reader in particular, I think that's a no-brainer and you're just wrong that it's not core functionality -- it's possible that's philosophical differences in part (although I generally err on the side of modular design and interoperability), but I also don't think you fully understand either the limitations of the current status quo (which are significant) or the possibilities opened up by the new reader (ditto).
  • I would like to say that I am with Zotero on the PDF reader integration. I am a fairly proficient 18-month Zotero user at this point, but have been struggling to find a PDF application with exactly the right level of sophistication to allow annotation, without it being an overall complex client. The PDF integration has allowed me to stop my Adobe subscription which I am very happy about.

    However I do agree with you, that there is a risk of Zotero trying to be too much in the future. I do think that maybe an expansion of the plugin framework to give native in-text functionality to something like Scrivener, if that is possible, would do the job it needs to - and possibly maintaining an exportable format for the PDF citations.
  • edited October 5, 2021
    Adding another voice from a not-so-new customer (4 years):

    I am very happy with the recent development. While I think I get the thoughtful criticism from @realtime99 about features like batch editing, and so on — for me, the new features provide a much more positive impact.

    I already had a rather elaborate Zotero setup, with Zotfile and Dropbox-based sync. For years, I wanted to involve a tablet for focused, comfortable reading but the Zotero GUI was horrible for touch input, and even with a decent PDF reader (Drawboard), Windows tablets have a subpar experience compared to iPads. I probably spent more time on my Windows tablet setup than actually reading with it. The Zotero iOS-App already works great for reading and highlighting.

    Regarding the notes feature, I share some of @realtime99’s mixed feelings: Some other note-taking apps are way more efficient for Markdown input. I personally like Notion, where for example # automatically converts to an H1 heading, * to bullet points, etc. If the developers haven't checked this out yet, I strongly recommend they do.

    Despite the note editor's shortcomings, I find the new note-taking in the Zotero Desktop Beta to be extremely useful for its linked references. Reviewing a pile of papers and collecting all the key messages (and graphics) in one overarching note — with seamless links to the page in the source! — works so much better in Zotero than with external note-taking tools. Enabling this great workflow for all users (and not only for 0.1% power users with the capability to implement this with external tools) to me is a perfect justification for the built-in PDF viewer.

    My personal priorities/wishes would be:

    1. iOS: Stand-alone notes in a sidebar, with dragging and referencing similar to the desktop version.
    2. Better UI for collections and tagging:
    a. sorting/filtering by tags
    b. tags in the library list (not just the colored markers)
    c. quicker tagging on iOS from the library and the document view
    d. being able to see which collection an item belongs to
    3. Support for drawing/handwriting annotations.
    4. iOS WebDav support (but I still find it fair to pay — as I now do for the storage option which I find reasonably priced).

    When I click on a parent collection that has subcollections, I sometimes want to see the items in the subcollections and sometimes I do not. I think with a better tagging mechanism, this issue would be solved because I would use tags in some of the situations where I now use collections.

    Thank you very much to the developers. Also, I find it great to see this constructive discussion started by @realtime99 and the thoughtful comments by @danb and @adamsmith.
  • @tjochmann
    "Some other note-taking apps are way more efficient for Markdown input. I personally like Notion, where for example # automatically converts to an H1 heading, * to bullet points, etc. If the developers haven't checked this out yet, I strongly recommend they do."
    The official documentation hasn't been updated yet, but take a look at the Markdown input rules that are already supported: https://github.com/zotero/zotero/issues/1976#issuecomment-877072198

    Is anything missing?

    Markdown export will soon be implemented as well.

  • Thank you @martynas_b, these shortcuts are indeed new to me and they are a big improvement!

    Here are two issues that I just experienced:
    1. (Major) Adding text between two blocks is cumbersome. Here is how to replicate a particularly annoying example: Have one block with text dragged from a paper, ending with the autogenerated reference. Have a second block with a screenshot. Now if you want to add text in between, you typically highlight the reference and overwrite it. A plus sign to add a new block would be a solution.
    2. (Minor) After re-arranging blocks, the cursor vanishes (Windows version).