Zotero PDF reader and Annotations to Markdown Workflow

edited March 10, 2021
I am posting this new topic as requested by @dstillman...

---

Hi, your new PDF reader interface is really good (although it does seem to have a minor bug in missing the occasional letter from the highlighted text).

I would love to see better markdown export capability.

My use case is that I want to annotate my PDFs and get the annotations into Obsidian (and really any other app that handles markdown) with links back to my Zotero reference library to open the specific annotation locations in the source PDF in my Zotero library.

---

The way I do it today is with the zotfile and mdnotes plugins. These are the steps I currently go through and it works great:

1. Annotate the PDF
2. Extract the annotations from the PDF using Zotfile
3. Export the annotations note(s) to markdown using mdnotes
4. Open the markdown file and copy and paste it into the notes app (in this case Obsidian, but this would work for Bear, Roam, etc.)
5. Have a nicely formatted note in the notes app (Obsidian) with all the links correct so that when I click on them, it opens the PDF inside Zotero at the location of the annotation

---

With the current beta of the PDF reader, this no longer works. I have tried two ways of getting the annotation note that I created in the PDF reader into Obsidian:

1. Select all the text in the annotations note created by the Zotero PDF Reader and paste it into Obsidian. It looks decent, but all the links to the original PDF are lost, which is the main issue here. Also, any images captured using the Select Area tool copy over and render okay in Obsidian, but in raw markdown mode the image description is hundreds of lines of gobbledygook, a subset of which is:

![(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAdoAAACqCAYAAAAKsY7aAAAgAElEQVR4nO3deVxN+eM/8NtmH5NiZN/NZGc0EyJrNINSI7IMY4mQQeEzYcjYZswMM8aStT

...and it goes on...

2. Use mdnotes to export the annotations note created from within PDF Reader to a markdown file. This file is a mess format-wise and also loses all the links back to the pdf.

---

Current workaround, which is very undesirable:

1. Annotate in Zotero PDF Reader
2. Export the PDF so that the PDF Reader highlights get embedded into the PDF
3. Reimport the PDF with highlights into Zotero
4. Follow my original workflow: extract the annotations with Zotfile, export them to markdown with mdnotes, and then copy and paste into Obsidian.

---

Desired workflow:

1. Annotate in Zotero PDF Reader
2. Add Item Note from Annotations
3. Right-click on the item note and either copy as markdown (so I can then paste into Obsidian without creating a new file) or save as markdown so that it shows up as a new item under the parent in my library (which I could subsequently copy/paste into Obsidian).

Key requirements for this desired workflow:

- The pasted/created markdown should include the reference links back into the PDF where the annotation came from. So in Obsidian, I should be able to click on the link for the highlight and it should open up the PDF file in Zotero on the right page (which is basically the "Go to Page" link in the Zotero PDF Reader Annotation note).
- Also, this should export any images that are in the Zotero Item note so they show up in Obsidian (these are the images captured by Zotero PDF Reader in the item note when using the Select Area function). But somehow these need to come over in a way where they are image files and not hundreds of lines of gibberish.
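
To illustrate the second requirement, here is a minimal Python sketch of one way embedded images could be turned into real image files. It assumes the pasted markdown contains well-formed image links of the form `![alt](data:image/png;base64,...)`; the function name `externalize_images` and the attachments folder layout are hypothetical and not part of Zotero, mdnotes, or Obsidian.

```python
import base64
import hashlib
import re
from pathlib import Path

# Matches Markdown images whose target is an inline base64 data URI,
# e.g. ![alt](data:image/png;base64,iVBORw0...)
DATA_URI_IMAGE = re.compile(
    r"!\[(?P<alt>[^\]]*)\]"
    r"\(data:image/(?P<ext>png|jpe?g|gif);base64,(?P<data>[A-Za-z0-9+/=\s]+)\)"
)


def externalize_images(markdown: str, attachments_dir: Path) -> str:
    """Write each embedded image to attachments_dir and link to the file instead.

    Hypothetical helper: the folder layout is an assumption for illustration.
    """
    attachments_dir.mkdir(parents=True, exist_ok=True)

    def replace(match: re.Match) -> str:
        # Remove any line breaks inside the payload before decoding.
        raw = base64.b64decode("".join(match.group("data").split()))
        # Name files by content hash so re-running does not create duplicates.
        name = f"{hashlib.sha1(raw).hexdigest()[:12]}.{match.group('ext')}"
        (attachments_dir / name).write_bytes(raw)
        # Link relative to the note, using only the folder's own name.
        return f"![{match.group('alt')}]({attachments_dir.name}/{name})"

    return DATA_URI_IMAGE.sub(replace, markdown)
```

The rewritten markdown then contains a short relative link per image instead of hundreds of lines of encoded data, and the image files can live next to the note in the vault.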

---

Even more desired: create an Obsidian plugin to do all this automatically.
  • This sounds like it's mainly an issue for mdnotes (https://github.com/argenos/zotero-mdnotes), which is designed for notes in the old format and annotations extracted by ZotFile.

    I don't think there's anything fundamental that would preclude using the same workflow as ZotFile otherwise.

    The image issue may be tricky to solve though. Those are embedded encoded images, which would be the only way to have an image in html/markdown without creating a separate file.
  • I guess the key request is for Zotero PDF Reader to be able to export its annotation notes as markdown while retaining backlinks to the original location in the source PDF in the Zotero library. I had heard somewhere that this was on the roadmap, but I am looking to confirm that.

    Solving the image issue would just be a bonus.
  • Backlinks in the format created by Zotfile Extracted Annotations still work in the new note editor - i.e. they open the pdf at the page number in the annotation link. This bodes well for converting mdnotes to use the new editor.

    The image issue may need to be solved by using externally hosted images rather than the new Zotero clipper. For example, adding an Imgur HTML link to a PDF note in the new Zotero editor displays in the new note editor (after using Add Item Note from Annotations) like this:

    https://i.imgur.com/p2bqS2d.png

    So an updated version of mdnotes could parse the externally hosted image URL and present it in markdown format. Obviously this is not as slick as the new Zotero image clipper, but it is a possible solution for accessing images from outside of Zotero via markdown.
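
As a rough illustration of the parsing step described above, the following Python sketch rewrites HTML `<img>` tags that point at externally hosted images into markdown image syntax. It is not how mdnotes works; `img_tags_to_markdown` is a hypothetical name, and the regex only handles double-quoted `src` attributes with http(s) URLs.

```python
import re

# An <img> tag with a double-quoted http(s) src attribute.
IMG_TAG = re.compile(
    r'<img\b[^>]*?\bsrc="(?P<src>https?://[^"]+)"[^>]*>', re.IGNORECASE
)
# An optional alt attribute anywhere inside the matched tag.
ALT_ATTR = re.compile(r'\balt="(?P<alt>[^"]*)"', re.IGNORECASE)


def img_tags_to_markdown(html: str) -> str:
    """Replace externally hosted <img> tags with Markdown image links."""

    def replace(match: re.Match) -> str:
        alt = ALT_ATTR.search(match.group(0))
        alt_text = alt.group("alt") if alt else ""
        return f"![{alt_text}]({match.group('src')})"

    return IMG_TAG.sub(replace, html)
```

For example, passing `<img src="https://i.imgur.com/p2bqS2d.png">` through this function yields `![](https://i.imgur.com/p2bqS2d.png)`.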
  • I am also quite interested in this workflow. If you need any roam user beta-testing this workflow, please let me know.
  • There are thousands of people who use Zotero with other editors and notetaking systems, and to efficiently use Zotero's new notetaking system with those apps, it seems that we will need to regularly export every annotated pdf, one-by-one, to write Zotero notes into them, and reimport them into Zotero every time we open a pdf and add annotations, in order to get those notes written to the pdf so that Zotfile or mdnotes can read and extract them. That seems like a big obstacle. Let me know if I am misunderstanding things.
  • @realtime99: Yes, you're misunderstanding things. As we've said repeatedly, including in this thread, if you want mdnotes to export zotero://open-pdf links for annotations from the new PDF reader, that's something you'd have to ask the mdnotes developer to support. Annotations added to notes from the built-in PDF reader contain all the information necessary for plugin authors to do so. (Also relevant: Zotero.Notes.getExportableNote()) If they have questions about how best to do this, they can post to zotero-dev.

    But as I've said elsewhere, if there's sufficient demand, we could consider offering a built-in way to export notes with zotero://open-pdf links. I'm not totally clear on the desired outputs here, though, so links to example documents would be helpful. Also, files with zotero://open-pdf links would inherently be local-only documents, and I'm not clear on the workflows people would be using to create final, public documents from them, so details on that would be helpful as well.
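
To make the plugin idea above concrete, here is a hedged Python sketch of turning annotation data stored in a note into zotero://open-pdf links. It rests on assumptions that are not confirmed in this thread: that each annotation in the note HTML is a `<span>` carrying a `data-annotation` attribute holding URL-encoded JSON, that this JSON has an `attachmentURI` ending in the attachment's item key and a zero-based `position.pageIndex`, and that links take the ZotFile-style form `zotero://open-pdf/library/items/<key>?page=<n>` for a personal library. All function names are hypothetical.

```python
import json
import re
from urllib.parse import unquote

# ASSUMPTION: annotations appear as <span ... data-annotation="...">text</span>,
# where the attribute value is URL-encoded JSON.
ANNOTATION_SPAN = re.compile(
    r'<span[^>]*\bdata-annotation="(?P<data>[^"]+)"[^>]*>(?P<text>.*?)</span>',
    re.DOTALL,
)


def open_pdf_link(annotation: dict) -> str:
    """Build a ZotFile-style link from the (assumed) annotation fields."""
    # ASSUMPTION: the item key is the last path segment of attachmentURI.
    item_key = annotation["attachmentURI"].rstrip("/").rsplit("/", 1)[-1]
    # ASSUMPTION: pageIndex is zero-based, while the link's page is one-based.
    page = annotation["position"]["pageIndex"] + 1
    return f"zotero://open-pdf/library/items/{item_key}?page={page}"


def annotations_to_markdown(note_html: str) -> list:
    """Return one Markdown blockquote line per annotation found in the note."""
    lines = []
    for match in ANNOTATION_SPAN.finditer(note_html):
        annotation = json.loads(unquote(match.group("data")))
        # Drop any nested tags so only the highlighted text remains.
        text = re.sub(r"<[^>]+>", "", match.group("text")).strip()
        page = annotation["position"]["pageIndex"] + 1
        lines.append(f"> {text} ([p. {page}]({open_pdf_link(annotation)}))")
    return lines
```

A real plugin would run inside Zotero and work with the note data directly rather than with regular expressions; the sketch only shows that the page and item key are enough to reconstruct a link.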
  • (I think the use cases are not public-facing docs but advanced note-taking and workflow apps like Roam, Obsidian, or Zettlr)
  • @dstillman thanks for clarifying. Glad to know that Zotero plug-ins can access Zotero-made annotations that are not in the pdf but are in the Zotero database.

    @adamsmith is right - the use case is people who are trying to use Zotero as a reference manager and want to output pdf annotations into interoperable formats such as markdown with very high efficiency (because they are dealing with many sources and notes) and into note-taking and outlining apps such as those mentioned. Currently, the workflow for most is that mentioned in some other posts on this forum: annotate pdf outside of Zotero, put annotations into a standalone note using a plug-in, export the annotations into another app for outlining, linking, drafting, etc., and then (much later) into a word processor for final revision and output.
  • "Glad to know that Zotero plug-ins can access Zotero-made annotations that are not in the pdf but are in the Zotero database."

    I mean, they can, but that's not actually what I'm saying. When you use mdnotes, it's not doing anything with annotations in PDFs. Those have already been extracted to HTML notes by ZotFile, with zotero://open-pdf links. I'm saying that Zotero's built-in PDF reader puts all the annotation information, including the page number, into the note data (which is what Zotero itself uses to take you back to the annotation from the note), and a plugin could use that data to create zotero://open-pdf links in exported notes.

    "and then (much later) into a word processor for final revision and output"

    But, to be clear, you'd have lots of zotero://open-pdf links in your notes when you did this. So you'd have to go through and remove all of these links (or remove all links in the document) for a final digital output.
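
The cleanup step mentioned above can be automated. The following Python sketch, with the hypothetical name `strip_zotero_links`, removes markdown links whose target uses the `zotero://` scheme while keeping the link text and leaving ordinary web links untouched. It assumes standard inline `[text](url)` markdown links.

```python
import re

# An inline Markdown link (not an image) whose target starts with zotero://
ZOTERO_LINK = re.compile(r"(?<!!)\[(?P<text>[^\]]*)\]\(zotero://[^)\s]*\)")


def strip_zotero_links(markdown: str) -> str:
    """Replace each zotero:// link with its plain link text."""
    return ZOTERO_LINK.sub(lambda m: m.group("text"), markdown)
```

Running this over a draft before sharing it turns `[p. 5](zotero://open-pdf/library/items/ABCD1234?page=5)` into plain `p. 5`.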
  • @dstillman Re: zotero://open-pdf links in the notes

    That's perfectly fine and is exactly what we (at least me, but presumably also @jdinning and @realtime99) are looking for. The current workflow - i.e., annotating in an external PDF reader, extracting the annotations as markdown using Zotfile/mdnotes, pasting them into Obsidian - results in these local open-pdf links. The idea is that your markdown-based notes app will open the PDF at the correct location. The notes are not intended to be shared or published, so the links being local is not a problem.

    The new PDF reader (which is great) has a nice way to extract annotations as Zotero notes with links, but when copied to markdown these links are lost (copied as plain text). It would be nice if there was an option to generate open-pdf links instead, so that we can use the built-in PDF reader in this workflow.
  • We're working on a solution for this workflow.
  • edited September 30, 2021
    @dstillman I would like to add my vote for this suggestion. Zotero's in-built PDF reader is awesome. I would love to see the Zotero (beta) play nicely with the other tools. Therefore, it is great to read that the development team is on top of this. :)

    @jdinning Thank you for posting your detailed request. I spent an hour trying to find out why this function was not working for me. I searched the Internet and your post appeared in the search results. Thanks for including information on a workaround in the meantime. You are right, the current workaround is very undesirable. But it is only a workaround while the development team finds a solution. :)
  • Hi everyone, I would just like to add an upvote for this, since getting this workflow working would be golden.
  • + 1

    Thanks for developing a wonderful PDF reader! The ability to extract both highlight and text annotations (bonus for inking annotations; an awesome new feature by the way), which can then be copied as a markdown file (with viable built-in PDF reader links) would be fantastic.
  • It would be nice if one could right-click selected text in a note and choose to copy that to the clipboard in markdown format. (This is just a variant of the above.)
  • Any update on this issue, please?
  • edited November 19, 2021
    I'm also on board with this. I'm trying to get into a workflow of highlighting/annotating on my iPad, extracting the annotations on my desktop to Markdown, then taking those Markdown annotations into Zettlr. This is the workflow that the creator of Zettlr uses, although he doesn't use the beta version (https://www.hendrik-erz.de/post/how-i-work-part-iv-reference-management-reading-literature). Currently, his workflow does not work with the beta because Zotfile and mdnotes do not work entirely correctly with it.

    Are there options in the Advanced Config that are similar to zotfile.pdfExtraction options for styling/customizing the notes? That would be extremely useful.

    In short, some way to annotate PDFs in the new PDF reader, extract those annotations to Markdown, and get them into Zettlr/Obsidian would be greatly appreciated!
  • Thanks for such an amazing implementation of the PDF Viewer. This Markdown Notes Export functionality would be very much appreciated.