forking zotero

Hi,

I have some ideas about how to implement variations on the current zotero standalone app. Can anyone give me some brief pointers about:
- which code / files deal with the way it implements the database architecture
- how to build and test hypothetical variations I manage to implement?

I realize this is somewhat of an open ended question, and also that my prior knowledge is pretty minimal, so please let me know if these points make no sense, or if you need additional info to reply.

I also did my best to check this had not been asked before, but if it has, please feel free to simply point me in that direction. Thank you very much in advance for thinking about it!

All the best,

ejt
  • That would probably be a better question for zotero-dev
    https://groups.google.com/forum/#!forum/zotero-dev

    though to me, at least, it does seem too vague to allow much helpful input. I think the expectation is that you spend at least a little time looking at the source code as well as the developer documentation
    https://www.zotero.org/support/dev/client_coding
    so that you can ask more specific questions.
  • Thanks! I've looked at the source and honestly I'm not sure where to start. I'll admit this probably has much to do with my lack of familiarity with javascript, but even something as simple as identifying the part of the source responsible for maintaining the pdf database is hard for me. Is it db.js? Anyways, thank you for pointing me to the dev forum, I had not realized it existed - I'll go through that and see if I can learn more there from previous posts.
  • Before you try something like forking Zotero -- there's hardly anything that can't be accomplished by plugins.
  • edited January 31, 2019
    even if I want to change the way zotero implements its file structure? I'd like to have it keep everything exactly the same as I currently have it setup, with ~3000 pdfs labeled author year title.pdf, with no subfolders.
  • Even if you want to do that. My own plugin augments a lot of behavior by live-patching existing behavior (aka "monkey-patching"). You have to be careful or it will be fragile, but it's less work than maintaining a forked Zotero. You probably only have to patch a few functions in chrome/content/zotero/xpcom/data/item.js to have the attachmentPath behave differently.

    Mind that if you do this, whether by plugin or by forking, you likely can't expect the Zotero devs to stand for what happens on your system... this is pretty tricky territory you're wading into. I get it that people don't like the random directory names for the attachments, but personally, a single folder with ~3000 pdfs doesn't sound any more appealing. Before we talk the how of solutions, what problem are you looking to solve to begin with? Maybe there's other ways to achieve your goals.
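
    To make the monkey-patching idea concrete, here is a minimal sketch in plain JavaScript. It does not touch Zotero's real code: `patchMethod`, `fakeAttachment` and `getFilename` are made-up names for illustration, and which functions in item.js you would actually need to wrap is untested.

```javascript
// Generic monkey-patching helper. Nothing here is Zotero-specific:
// `target` and `methodName` stand in for whatever object and method you patch.
function patchMethod(target, methodName, wrapper) {
  const original = target[methodName];
  if (typeof original !== 'function') {
    throw new TypeError(methodName + ' is not a function on the target');
  }
  // Replace the method; the wrapper receives the original so it can delegate.
  target[methodName] = function (...args) {
    return wrapper.call(this, original.bind(this), ...args);
  };
  // Return an undo function so the patch can be removed cleanly.
  return function restore() {
    target[methodName] = original;
  };
}

// Hypothetical stand-in object, used only to demonstrate the helper.
const fakeAttachment = {
  title: 'Smith 2010 Some Title',
  getFilename() {
    return 'ABCD1234.pdf';
  },
};

const restore = patchMethod(fakeAttachment, 'getFilename', function (original) {
  // Fall back to the original behavior when there is no title to use.
  return this.title ? this.title + '.pdf' : original();
});
```

    Keeping the `restore` function around is what makes this less fragile: a plugin can undo its patch on unload instead of leaving the altered method behind.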
  • You may only have to mess with assignments to attachmentFilename. Not tested in any kind of way. Very tricky territory.
  • Yeah, I'd look at ZotFile (zotfile.com) instead. You can have links to attachments in a single folder from Zotero already and ZotFile helps automate the process.

    I'd stay away from Zotero's internal storage. There's a reason Zotero hasn't changed this although many people dislike the internal organization. It's embedded in all sorts of design decisions within Zotero, and changing it properly will be a ton of work, has a good chance of breaking things in the long term for you (including making re-merges with Zotero in the future impossible), and requires familiarizing yourself with large parts of the Zotero codebase.
  • first off, thanks for taking the time to reply in such detail. knowing that item.js is maybe what I want to look at and its filepath is super helpful.

    second, what I'm trying to achieve is twofold:

    i - a version of zotero focused on managing existing large collections of pdfs, with the nice tagging / collection abilities currently in zotero; without modifying said collections. The short, silly version is "itunes, but for pdfs" (i.e. just show the files in a folder and let you build arbitrary subgroups of the files). I don't need any of the metadata extraction or online storage functions, so if I learn enough I'd also cut those.

    ii- a quotes manager. this is a bit more abstract, but here's a basic workflow:
    1. annotate a pdf, using hashtags e.g. #titleofpaperidea
    2. use a zotfile-ish process to extract the annotations with the annotated text
    3. autogenerate a zotero collection to keep track of all the documents that have an annotation with a unique #titleofpaperidea
    4. when needed, autogenerate a LaTeX document containing all the annotated text corresponding to a specific #titleofpaperidea, with all the proper LaTeX \citep{entrykey} citations after each quote. (assumption being I've entered a LaTeX entry for each document in Zotero, so it can just pull the entrykey from that).
    5. along with that, autogenerate the .bib file with all the corresponding full biblatex entries.

    i- is basically something I thought would be simpler and a good test of whether or not I could handle tackling ii-

    Does this make any sense?
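
    For steps 1 and 3 of the workflow above, the grouping logic could look roughly like this. It is only a sketch in plain JavaScript: the `{ itemKey, page, text }` annotation shape and the function names are assumptions for illustration, not anything Zotero or ZotFile provides.

```javascript
// Pull "#hashtag" markers out of annotation text.
function extractHashtags(text) {
  const matches = text.match(/#[A-Za-z0-9_]+/g) || [];
  // Lowercase and de-duplicate so "#Idea" and "#idea" count as one topic.
  return [...new Set(matches.map((tag) => tag.slice(1).toLowerCase()))];
}

// Group annotations by hashtag. Each annotation is a plain object
// { itemKey, page, text }; this shape is an assumption for illustration.
function groupByHashtag(annotations) {
  const groups = new Map();
  for (const annotation of annotations) {
    for (const tag of extractHashtags(annotation.text)) {
      if (!groups.has(tag)) groups.set(tag, []);
      groups.get(tag).push(annotation);
    }
  }
  return groups;
}

const sampleAnnotations = [
  { itemKey: 'smith2010', page: 12, text: 'A first quote. #titleofpaperidea' },
  { itemKey: 'jones2005', page: 3, text: 'Another quote. #titleofpaperidea #methods' },
];
const groups = groupByHashtag(sampleAnnotations);
```

    Each key of `groups` would then correspond to one auto-generated collection, and the annotations under it to the documents that belong there.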
  • @adamsmith sorry, you replied while I was writing my answer to emiliano
    thanks for the thought as well, I'll look into modding zotfile instead of forking zotero... your links were also very helpful. Appreciate it!
  • i - it's funny that you call it iTunes for PDF -- iTunes has historically pushed the idea hard that you shouldn't care where your music files are as you are supposed to access them using iTunes. In any case, from a UI point of view, Zotero already allows you to do this if you don't care about the file layout. If you do care about the file layout, zotfile can fix that.

    ii.1/2 zotfile can already extract pdf annotations
    ii.3 is close to trivial with a plugin
    ii.4/5 can be done with a plugin. I auto-generate bibtex files in my own plugin, this would be more of the same
  • edited January 31, 2019
    Thank you. I appreciate all the details.

    i - I'll mess around with Zotfile and come back if I have clarifications on what I'm trying to achieve / take this to the zotfile devs if that's more appropriate.

    ii. 1/2 right - do you happen to know the part of zotfile that takes care of that?
    ii. 3 awesome - should be a fun project for a beginner then, right? :) any tips on how to get started for this particular idea?
    ii. 4/5 same as 3: what is your plugin? Any advice on how I can auto-generate the .tex file with all the quotes and the proper citation command? Also, best case scenario - can zotfile extract the page number of the citation as well, to make it do \citep[pagenumberofquote]{entrykey}?
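
    As a rough illustration of the .tex output described in ii. 4/5, the sketch below turns a list of quotes into LaTeX with \citep[page]{entrykey} after each one. The `{ text, citekey, page }` input shape and the function names are assumptions, and \citep itself assumes the document loads natbib (or something compatible with it).

```javascript
// Escape characters that LaTeX treats specially in running text.
function escapeLatex(text) {
  const replacements = {
    '\\': '\\textbackslash{}',
    '{': '\\{',
    '}': '\\}',
    '#': '\\#',
    '$': '\\$',
    '%': '\\%',
    '&': '\\&',
    '_': '\\_',
    '~': '\\textasciitilde{}',
    '^': '\\textasciicircum{}',
  };
  // Single pass, so already-escaped output is never escaped twice.
  return text.replace(/[\\{}#$%&_~^]/g, (ch) => replacements[ch]);
}

// Build one quote block per entry. `quotes` is an array of
// { text, citekey, page } objects; that shape is assumed for illustration.
function quotesToLatex(topic, quotes) {
  const blocks = quotes.map(({ text, citekey, page }) => {
    // Include the optional page argument only when a page is known.
    const cite = page != null ? `\\citep[${page}]{${citekey}}` : `\\citep{${citekey}}`;
    return `\\begin{quote}\n${escapeLatex(text)} ${cite}\n\\end{quote}`;
  });
  return [`\\section*{${escapeLatex(topic)}}`, ...blocks].join('\n\n') + '\n';
}
```

    Calling `quotesToLatex('titleofpaperidea', quotes)` returns a string that a plugin could write to a .tex file; the page number is simply left out when it is not available.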


  • ii.1/2 best ask jlegewie (the zotfile dev)

    ii.3 On https://www.zotero.org/support/plugins, under "Zotero Development", you'll find two ways of getting started writing a plugin. In this case, you will probably want to create a Zotero notification handler (Zotero.Notifier.registerObserver) that listens on item changes to add and remove items to a collection on any criteria that you want... but a saved search can likely achieve something similar without any coding if you have the item listener just manipulate tags based on annotation extraction

    ii.4/5 I write BBT (Better BibTeX for Zotero). BBT does a lot, but conceptually, for auto-generation I just have an item listener and generate a file when it is triggered and the right condition is met. What's in the file is up to you -- I generate bibtex and a few other formats. I don't know whether zotfile can do page number extraction but I've done some PDF work in the past and the information is there.
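
    A minimal sketch of the item-listener idea, assuming the observer shape from the Zotero developer docs (an object with a `notify(event, type, ids, extraData)` method passed to Zotero.Notifier.registerObserver). `makeItemObserver` and the `changedIds` handler are hypothetical names, and this is not how BBT itself is written.

```javascript
// Observer object in the shape Zotero's notifier calls back into:
// notify(event, type, ids, extraData).
function makeItemObserver(onItemsChanged) {
  return {
    notify(event, type, ids, extraData) {
      // React only to item additions and modifications.
      if (type !== 'item') return;
      if (event !== 'add' && event !== 'modify') return;
      onItemsChanged(ids);
    },
  };
}

// Hypothetical handler: a real plugin would re-run annotation extraction
// or regenerate an output file here.
const changedIds = [];
const observer = makeItemObserver((ids) => changedIds.push(...ids));

// Register only when running inside Zotero, so this file also loads elsewhere.
if (typeof Zotero !== 'undefined') {
  const observerID = Zotero.Notifier.registerObserver(observer, ['item']);
  // Call Zotero.Notifier.unregisterObserver(observerID) when the plugin unloads.
}
```

    Keeping the filtering logic in a plain function like this means it can be exercised outside Zotero by calling `observer.notify(...)` by hand.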
  • Amazing - I was really not expecting the dev for BetterBibTex (really appreciate you making and maintaining that) to show up right away and help me out - thank you both again!

    I'll follow up if need be.
  • Oh BTW the BBT citation keys are accessible to other plugins, and Zotero is moving towards having a formal citekey field, and in the interim, having "Citation Key: something" in the extra field (not coincidentally also how BBT pins its keys) will do the same job.
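
    For the interim "Citation Key: something" convention, reading the key back out of the extra field is a small string-parsing job. This sketch works on the extra text as a plain string; how you fetch that text from an item is left out, and `citationKeyFromExtra` is just an illustrative name.

```javascript
// Read a "Citation Key: something" line from the text of an item's Extra field.
function citationKeyFromExtra(extra) {
  if (!extra) return null;
  for (const line of extra.split(/\r?\n/)) {
    // Match case-insensitively and tolerate surrounding whitespace.
    const match = line.match(/^\s*citation key\s*:\s*(\S+)\s*$/i);
    if (match) return match[1];
  }
  return null;
}
```

    It returns `null` when no such line is present, so a caller can decide whether to skip the item or fall back to some other key.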
  • Thank you for the pointer. I'm not totally sure what that means as I haven't played with BBT in a while but I will experiment and come back here if I have follow up questions I don't find the answers to.

  • @bruit I came across this thread while looking for improved ways to organize quotes in Zotero. Have you made any progress on that part of your plan?

    You might be interested in this thread on that topic:
    https://forums.zotero.org/discussion/77809/best-practices-for-organizing-quotes-in-zotero

    I'd be interested to hear your current strategy, as mine is not working great right now in terms of relating quotes to each other by topic.
  • Hi Realtime99,

    I'm still working on this but dissertation writing and various other school related things have taken over most of my time, especially with weird metaprojects like this one. I've figured out that Zotfile will extract annotations made in the google drive pdf annotation tool (weird preference, I know), so my next step is to find a way to implement the hashtag topic classifier. Then it should be as simple as making Zotero display all notes with that tag, and the highlighted text they correspond to. Not there yet.

    How about you - do you have a kludgey solution, or just doing things by hand?

    All the best -
  • Hi @ejt, I'm not sure if it's a solution or not, but I described my workflow in the discussion I linked above. Basically, quotes I want to connect to a concept are hand copy/pasted into a child note for a placeholder item with the topic name.

    One suggestion I have is implementing something like zotfile's auto-link to the pdf location of the quote. I'd actually prefer auto-linking to the Zotero parent item myself.

    I'd also like a way to easily add similar links to sources cited in the quote that are also in Zotero. So if a quote is in article Smith, 2010, and the quote mentions another article Jones, 2005, I'd like a way to have a link to Smith 2010 automatically put under the quote and a way to easily generate a link to Jones, 2005.

    While I love the idea of being able to put tags right in my PDF annotations and have Zotero automatically organize those, one issue is that if you want to create collections of quotes organized by anything other than very general topics, you can easily imagine generating hundreds of tags which would be very hard to remember. I'm wondering if there is a way to utilize Zotero's built-in tagging system, which can tag notes as well as other items, to create tagged notes in a way that would allow a user to combine tag combinations.

    You also might be interested to look at Citavi (https://www.citavi.com) which is a commercial program that combines reference management with the kind of quote tagging and organization I'm discussing. I considered moving to it just for the quote feature but decided against it because Zotero is more customizable and open.

    Let me know if there's any way I can help with this project (aside from programming, unfortunately). FYI, I use Win 10 and MS Word 2016.
