RTF/ODF Scan for Zotero

kithairon · May 27, 2015

Although this should not affect your issue: I suggest you use Scrivener's .rtf export option and then convert the file to .odt in LibreOffice. Scrivener's .rtf output was more solid when I last compared the two a few months back.

Make sure that after you open the scanned .odt file in Libre you actually set the Doc prefs – as adamsmith suggests – and do a refresh.

adamsmith · May 27, 2015

this mainly matters if you're using footnotes. I think footnote export is simply broken for Scrivener's ODT.

kithairon · May 27, 2015

You're right. I just looked up the relevant thread in Scrivener's forum that explains this: Footnotes and odt-export. It's on the developers' radar and may be addressed in the near future.

mikkelgabriel · May 27, 2015

And you're right - tada - shows up like a prince after kissing a frog...! Thanks :)

Laurentalarie · June 3, 2015

Hi, first I want to thank you for your excellent work. You simplify our lives tremendously! I recently started using the RTF/ODF Scan and I find it very convenient. I converted a dissertation from LibreOffice to Google Docs and back.

However, I have a small problem. When I go from LO to Google Docs, the author-date text is multiplied for no apparent reason. Here is an example:

Originally, the citations in the LO file:
(Benford, Snow, and Plouchard 2012; Snow and Benford 2000; Snow and Benford 1988; Snow et al. 1986)

And the same after the conversion to Google docs:

{ | (Benford, Snow, and Plouchard 2012; Snow and Benford 2000; Snow and Benford 1988; Snow et al. 1986) | | |zotero://select/items/0_PR3NNRHF}{ | (Benford, Snow, and Plouchard 2012; Snow and Benford 2000; Snow and Benford 1988; Snow et al. 1986) | | |zotero://select/items/0_VVHTMDJ6}{ | (Benford, Snow, and Plouchard 2012; Snow and Benford 2000; Snow and Benford 1988; Snow et al. 1986) | | |zotero://select/items/0_WV94SAIQ}{ | (Benford, Snow, and Plouchard 2012; Snow and Benford 2000; Snow and Benford 1988; Snow et al. 1986) | | |zotero://select/items/0_8BZVGIGR}

I want to show my Google Doc to my director, but this makes it very difficult to read. I would like to find a way to reduce these citations to a minimum.

Is there a way to solve this problem and how?

Thank you very much and keep up the good work!

adamsmith · June 3, 2015

See my exchange with Ilya Flamer one page back -- there's unfortunately no easy solution as of now, though note that the scan doesn't actually care about that part of the marker (i.e. whatever is between the 1st and second |), so you can (though I understand that's tedious) remove the duplicate authors.
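The manual cleanup described above could be scripted. What follows is only a minimal sketch, not part of the plugin: `MARKER` and `dedupeReadable` are hypothetical names, the five pipe-separated fields are inferred from the markers quoted in this thread, and it assumes (based on the remark that the scan ignores the text between the first and second `|`) that leaving that field blank is harmless. It has not been tested against the scanner itself.

```javascript
// Assumed marker shape, inferred from the examples in this thread:
// { prefix | readable text | locator | suffix | uri }
const MARKER = /\{([^|{}]*)\|([^|{}]*)\|([^|{}]*)\|([^|{}]*)\|([^|{}]*)\}/g;

// Blank the readable text of a marker when it directly follows another
// marker carrying exactly the same readable text.
function dedupeReadable(text) {
  let lastEnd = -1;        // index just past the previous marker
  let lastReadable = null; // trimmed readable text of the previous marker

  return text.replace(MARKER, (match, prefix, readable, locator, suffix, uri, offset) => {
    // Markers count as adjacent when only whitespace separates them.
    const adjacent = lastEnd >= 0 && text.slice(lastEnd, offset).trim() === "";
    const duplicate = adjacent && readable.trim() === lastReadable;

    lastEnd = offset + match.length;
    lastReadable = readable.trim();

    if (!duplicate) return match;
    // Keep prefix, locator, suffix and URI untouched; only the readable part goes.
    return `{${prefix}| |${locator}|${suffix}|${uri}}`;
  });
}

const sample =
  "{ | (Snow and Benford 2000; Snow et al. 1986) | | |zotero://select/items/0_PR3NNRHF}" +
  "{ | (Snow and Benford 2000; Snow et al. 1986) | | |zotero://select/items/0_VVHTMDJ6}";
console.log(dedupeReadable(sample));
```

Only the first marker in a run keeps its readable text, so the document stays legible while every item URI is preserved for the scan.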

Laurentalarie · June 3, 2015

Thank you Adam!

I was hoping that there was an alternative to manually removing the duplicate authors, but it is the lesser evil. I know it is not a priority for you, but if you get a chance to look into this issue, it would be nice. For large documents (e.g. a thesis), it becomes a tedious process.

Again, thank you for your help and your good work.

adamsmith · June 3, 2015

you do need the actual authors there, though, right? Otherwise you could switch to a numerical style before converting and just get (1-4).

cimmerian · June 10, 2015

Hey!

First of all, thanks Adam and Frank for the great plug-in. It's helping me a lot.

My question is this: Is there a code I could use so that the doc is scanned for all citations, creating an auto bibliography?

Thanks, guys.

adamsmith · June 10, 2015

sorry, I don't understand the question. Maybe run through an example of what you'd like to do?

cimmerian · June 10, 2015

Sure. Sorry about that.

With the old RTF scan, if I inserted the code {bibliography} somewhere in my text, Zotero would automatically add every source of citation it found in my text as a list of references.

That would be very useful for me as my Zotero library has many a source that ends up being cut or never used.

adamsmith · June 10, 2015

Not automatically, no, but you can use "Insert Bibliography" in LibreOffice after running the scan.

cimmerian · June 11, 2015

Yes! It worked like a charm.

I never used LibreOffice before, so sorry for the dumb question.

And thanks!

matt_price · June 17, 2015

Hi Folks,

Like everyone else says, many thanks for this plugin, it's great.

I am currently using it in conjunction with emacs org-mode and zotxt; on odt export, zotxt generates the scannable citations and inserts them into the odt document for me. This is great and seems to work flawlessly.

It would be a nice touch if I could run the scanner from the command line after I generate the odt and before I open it in libreoffice. Is there any way to address the odf scanner directly from the command line? I'm working on linux though I imagine that's not particularly important.

Thanks again!

Matt

adamsmith · June 17, 2015

not in this version, no, at least not with reasonable effort (mozrepl might be able to help, but it'd take significant effort) -- that said, Frank had an old python version of this that you may be able to make work more easily (and that would of course run from the commandline).

fbennett · June 17, 2015

Conversion via the command line would certainly smooth out the workflow. Most of the scanner code should run fine as a node.js script, but the sticking point is some ID conversions that are needed to construct (or deconstruct) URIs embedded in live Zotero references: for that, we need access to the running Zotero instance.

I haven't looked at Erik's zotxt code in quite a while, but it must have a mechanism for interacting with Zotero. The path of least resistance might be to hook the converter into his utility, to avoid the need to reimplement the communication layer in a separate tool. I don't have time to work on it, but that might be an angle worth investigating.

(From your description of the existing workflow, it sounds like the result would be instantly available live Zotero references in the ODT export from org-mode, which would be pretty amazing.)

matt_price · June 18, 2015

Doing this straight from zotxt would of course be great, since the whole workflow already assumes that zotxt is installed. I guess there would have to be a new function in the zotxt plugin that provides an interface to the rtf scanner. I can't quite understand Erik's code, but am I right that the new function would do something like:

x = new ZoteroRTFScan
x.inputFile = path/to/odt/file
x.outputFile = path/to/odt/file
x._ODFScan()

?

Thanks again,
Matt

matt_price · June 19, 2015

Hi Frank and Adam,

So, with Erik's help I have figured out how to make zotxt run code via the API; it is actually pretty cool. Now I want to write a basic function that will trigger a scan. I had hoped something like this would work (I hardcoded the paths for simplicity):

let scanData = function(data) {
    try {
        let x = Zotero_RTFScan();
        x.inputFile = "/home/matt/GTD/Reference.odt";
        x.outputFile = "/home/matt/GTD/Reference_munged.odt";
        x._scanODF();
    } catch (e) {
        return e;
    }
};

This throws an error "Zotero_RTFScan is not a function" so at the very least I assume I am not addressing the ZoteroRTFScan object properly. Do you have any advice about next steps? I feel success is tantalizingly close.

thanks,
matt

fbennett · June 20, 2015

I haven't worked with the code, but if you can provide a link to the repo with the zotxt code involved, I can take a look (IIRC there is "zotxt" and "zotxt-emacs", not sure which is relevant).

If by API you mean the API to the Zotero servers, you may have to recast the ODF/RTF Scan plugin code to run in node.js, either as a command-line utility or as a local server instance. It would take a little work, but doing it that way would probably make things less cumbersome for you in the long run - the conversion code hasn't changed in quite a long time, so you wouldn't be looking at a maintenance headache with a separate copy of the code.

matt_price · June 20, 2015

I've just put my code online here:

https://github.com/titaniumbones/zotxt

with my (quite minor) changes all in this commit:

https://github.com/titaniumbones/zotxt/commit/29fea8f9c4aedb0264195964101698dd865c59fe

Zotxt runs a small server on http://localhost:23119/, and attaches a number of endpoints to that server; search results, formatted bibliographies, etc. are returned via those endpoints. That's the "API" I was talking about. So I've written a new endpoint that in principle ought to be able to at least trigger the insertion of citations into a file. The Zotero object is mapped to z, but I'm not sure whether that allows access to rtfscan.

I really appreciate you taking the time to look at this. Thanks,
matt

matt_price · June 20, 2015

I should add some examples. Here's a working search URL:

http://localhost:23119/zotxt/search?q=latour

retrieval of an item:

http://localhost:23119/zotxt/items?easykey=SuchmanAffiliative2005&format=bibliography

And here's the new endpoint, which throws an error at present:

http://localhost:23119/zotxt/scanODF

Also, I had trouble building the xpi, so I have just installed the plugin and am working directly with the installed bootstrap.js in the current release:

https://addons.mozilla.org/en-US/firefox/addon/zotxt/
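For anyone who wants to script against these endpoints, the URLs above can be assembled with a small helper. This is only a sketch: `ZOTXT_BASE` and `zotxtUrl` are hypothetical names, the endpoint and parameter names are just the ones quoted in this post, and `scanODF` is the experimental endpoint that currently throws an error.

```javascript
// Base address of the local zotxt server, as given in the post above.
const ZOTXT_BASE = "http://localhost:23119/zotxt/";

// Build a zotxt URL from an endpoint name and optional query parameters.
// The URL class takes care of escaping the parameter values.
function zotxtUrl(endpoint, params = {}) {
  const url = new URL(endpoint, ZOTXT_BASE);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

console.log(zotxtUrl("search", { q: "latour" }));
console.log(zotxtUrl("items", { easykey: "SuchmanAffiliative2005", format: "bibliography" }));
console.log(zotxtUrl("scanODF")); // experimental endpoint, errors at present
```

The resulting strings can be passed to any HTTP client while Zotero and zotxt are running.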

fbennett · June 20, 2015

In the code at ...

https://github.com/titaniumbones/zotxt/commit/29fea8f9c4aedb0264195964101698dd865c59fe

... in the browser, Zotero_RTFScan is a separate object in the window context, not a segment of the Zotero object ("z", in the plugin mapping). It needs to be loaded after the Zotero object (so named) itself.

You can trace out the rigamarole that does that from the ODF Scan plugin's popup page markup:

https://github.com/Zotero-ODF-Scan/zotero-odf-scan/blob/master/plugin/chrome/content/rtfScan.xul#L12

The include.js file just pulls in the running Zotero object, and that's followed by the load of Zotero_RTFScan.

Looking at the code of Zotero_RTFScan, I see that my thought of recasting it to run in node.js would not have worked so well - it has dependencies on Firefox infrastructure, so it really needs to run in the browser.

matt_price · June 20, 2015

Hmm, do you mean these lines here:

<wizard xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"
title="&zotero.rtfScan.title;" width="700" height="580"
id="zotero-rtfScan">

<script src="include.js"/>
<script src="rtfScan.js"/>

Do you know how I can access those files from a JS file that lives in a different plugin? I know very little about FF/Zotero plugin development. Since I'm not really building a web page here, just accepting an argument from a POST or GET, I don't know how to include the rtfScan.js code -- maybe it's not strictly possible? Maybe I have to actually replicate all of your code in the bootstrap.js of zotxt itself?

fbennett · June 20, 2015

That's the ones.

Other-plugin files can be addressed from within JS by their chrome:// URLs (as mapped in the manifest of the plugin where they live). It looks like the Zotero_RTFScan code is tangled up with some DOM operations, but I should be able to help out by isolating the string-parsing and zip operations in a separate file, so they can be loaded as a module. I don't have time to work on it right away - Masters theses are due here in a few days, and I'm chock-a-block with queries and consultations for the coming week - but ping me in a couple of weeks if you don't hear anything.

It shouldn't be necessary to replicate any code in the zotxt plugin; the necessaries can be loaded as modules, and that should work anywhere JS will run.

matt_price · June 22, 2015

Hi Frank,

I'm not having much luck with my usual trial-and-error approach to programming, so I will put this aside for a bit and ping you in a week or so. At present I can't get the rtfscan code to load via its chrome URL. I get an "EXPORTED_SYMBOLS is not an array" error when trying:

Components.utils.import("chrome://rtf-odf-scan-for-zotero/content/include.js");

And even if I just put all your code inside the zotxt bootstrap.js, I can't seem to figure out how to invoke it properly. Thanks for your help, I will definitely be back in touch presently!

bobdodds · July 1, 2015

Hi all, I wanted to say that I am very grateful for all of your work on this project. I am having a small issue that I was hoping someone has seen before. I have exported a doc from Scrivener that contains a series of footnotes with Zotero links. They show up in the .odt doc correctly; however, once I put the doc through the scan, some info is lost.

The cases I am worried about have to do with prefixes and suffixes. For instance, I have several cites that are formatted as

{ My Text blah blah blah| Edbauer, 2005 | | |zu:774115:TMTSFJ67}

and post scan all I see in the footnote is
Edbauer, 2005

I have also set up some cites in the following ways, all with the same results.
{ MY NOTE BLAH BLAH| author, 2005 | | |zu:774115:TMTSFJ67}{ | author, 2010 | | |zu:774115:2ECN4A5B}{ | author, 1989 | | |zu:774115:56WGG3XW}

{ MY NOTE BLAH BLAH| author, 2005 | | |zu:774115:TMTSFJ67}{ | author, 2010 | | SUFFIX FOR THIS NOTE BLAH BLAH ETC |zu:774115:2ECN4A5B}{ | author, 1989 | | |zu:774115:56WGG3XW}

I am trying to create footnotes that reference several authors in one footnote and allow me to add commentary between the citations using the prefix and suffix fields in the exportable cite from Zotero. Also, in these grouped cites I would like to mute the author and just have the title of the work show up within the footnote, while still allowing a full cite in the bibliography.

My doc is set up using the APA full note style. Is there something I am missing? Thanks in advance for your help.

adamsmith · July 1, 2015

The prefixes, suffixes, and page numbers only become visible once you've actually selected a citation style in LibreOffice. When you open the document right after the scan, you'll just see placeholders in the basic author, year form for citations. Those don't correspond to any citation style.
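To check that a prefix or suffix actually sits in the right slot before running the scan, a marker can be split into its fields. This is a rough sketch only: `parseMarker` is a hypothetical helper, not part of the plugin. The positions of the prefix (first field), placeholder text (second) and suffix (fourth) follow the markers quoted in this thread; treating the third field as the locator (page number) is an assumption.

```javascript
// Split one scannable cite marker into its five pipe-separated fields.
// Returns null when the string does not look like a marker.
function parseMarker(marker) {
  const pattern = /^\{([^|{}]*)\|([^|{}]*)\|([^|{}]*)\|([^|{}]*)\|([^|{}]*)\}$/;
  const found = pattern.exec(marker.trim());
  if (!found) return null;

  // Drop the full match, trim surrounding whitespace from each field.
  const [prefix, readable, locator, suffix, uri] = found.slice(1).map((s) => s.trim());
  // "locator" for the third field is an assumption, see the note above.
  return { prefix, readable, locator, suffix, uri };
}

console.log(parseMarker("{ My Text blah blah blah| Edbauer, 2005 | | |zu:774115:TMTSFJ67}"));
```

If the prefix and suffix come back in the expected fields, they should become visible once a citation style has been set in LibreOffice and the document refreshed, as described above.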

takai · July 28, 2015

Now this may have already been covered, but I use BibTeX alongside RTF Scan for different purposes. Is it possible at all to get RTF Scan to reference the citekey rather than just the author/date fields (quite a few of my references have the same author/year/etc.)?

adamsmith · July 29, 2015

@takai -- short answer is no, but also this thread is about the ODF-Scan add-on, which does create unique identifiers, albeit a bit more complicated than bibtex strings, so you could consider having a look at that. Otherwise, for further RTF-scan questions, please start a new thread.

takai · July 29, 2015

@adamsmith ah, perhaps the title should be updated then, as it is clearly listed as "RTF/ODF Scan for Zotero"