RTF/ODF Scan for Zotero
Frank Bennett (who did most of the work) and I (who did most of the cheerleading) are excited to announce Zotero RTF/ODF Scan, a plugin that allows you to use Zotero (properly) with any word processor capable of saving or exporting to ODF format, including Google Docs and Scrivener.
Go here for download and instructions
http://zotero-odf-scan.github.io/zotero-odf-scan/
See here for more extended instruction with images:
http://zoteromusings.wordpress.com/2013/05/06/announcing-rtfodf-scan-for-zotero/
Please post all problems and questions here
We're keen to see users get the most out of this new bridge between Zotero and the various writing platforms in use out there. If you run into snags, please do let us know with a post to this thread.
I'm new to Scrivener and have been trying to figure out how to use it with Zotero. This is definitely an improvement over the old system. Thanks for this!
Is there any way of hiding the Zotero markers when working on text in Scrivener? I find the markers are a distraction, especially the ones with multiple references. They can be quite long.
Thanks!
It's possible to shorten the markers - e.g. you can manually remove the title of the item and it won't have any effect. We can also tell you how to customize the translator to only print author-date (currently you'd have to re-do this after each update of the plugin - if there's sufficient demand, we could consider letting users define that part via a preference).
Would appreciate it if you could share with us how to have the translator print only author-date ... anything to get the markers shorter. They're quite a chunk, especially multiple references.
Thanks!
leaving this for historical reasons
In your Zotero data folder:
http://www.zotero.org/support/zotero_data
there is a folder called "Translators"
Open the folder and find the file "Scannable Cites.js" (Windows may not show you the .js)
Open the file with any text editor - notepad, TextEdit, etc. (don't use a word processor like WordPad, Word, or Scrivener)
In line 45 you'll see
mem.set(item.title,",","(no title)");
Just comment that line out by putting two forward slashes at the beginning:
//mem.set(item.title,",","(no title)");
Save the file. That's it. The change is effective immediately, no need to restart.
Note that you'll have to do this every time the plugin is updated (we overwrite the translator when we update).
As I say above we might allow this to be specified via user preference - I'd imagine that might happen in the next feature release (i.e. anything that contains more than bug fixes).
It doesn't do anything, of course, but I think he's attached to the fact that it is a valid URI for a Zotero item. Currently you can't take it out because of the way the scan works, but that would be trivial to change.
I'd also be interested in seeing if empty separators could be removed, and if it could generally do more free-form parsing. I'm less up on the requirements here, but I would think that the parsing could be smarter about figuring out what was intended.
(I'm not even sure the item identifiers should be there, though obviously there would be some significant downsides—prompting at parse time, like (I think?) RTF Scan does—to removing them. But at export time it could at least ensure that there was enough item data included to uniquely identify it given the current state of the database (if the export was moved out of the translator architecture).)
So from my perspective I definitely want a unique identifier in there (checking on export isn't enough - later additions can cause ambiguity). I'm at a minimum concerned about messing with the delimiters - the more complex the regex, the more likely it is to cause problems. The simplicity of the current parsing is, in my opinion, a core strength.
We can certainly introduce a more concise identifier: I agree that the zotero://select url does add quite a bit of clutter. In some environments it can be useful, though, so support should continue and it should remain as an option. When using Zotero with an external note-taking tool (as opposed to a finished document), the user's priority might be to have quick access to the original Zotero item.[1]
About item identifiers generally, as adamsmith says it is nice that conversion just works. It also allows us to cover legal resources, which don't lend themselves to freehand referencing. As one example, citing statutory law is tricky[2] -- think of cites as pointers to pull requests and tagged versions in a git archive, which need to be referenced with precision. As a legacy from the print-only era, legal citation styles have rules of citation that capture the necessary detail, but the cite forms vary between jurisdictions. Then there are the quandaries associated with foreign language materials ...
I see ODF Scan as one more alternative. Users who prefer to enter cite markers freehand will use RTF Scan instead.
[1] That was a requirement for the original converter and MLZ extension, written with Paul Troop, on which the plugin is based.
[2] See this post by Thomas Bruce, director of the Cornell LII (scroll down to "Status, tracing, versioning and parallel activity").
In the current design, dropping separators would introduce ambiguity: in a marker with three slots, the first two could be a prefix followed by a human-readable cite, a human-readable cite followed by a locator, or a human-readable cite followed by a suffix. I think we'll keep things as they are for the present.
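To make the ambiguity concrete, here is a minimal JavaScript sketch. It assumes the slot order prefix | cite | locator | suffix ahead of the identifier, which is my reading of the layout discussed in this thread; the function name `interpretations` is hypothetical and not part of the plugin. It lists every way a run of non-empty fields could be mapped onto named slots once the empty separators are gone.

```javascript
// Hypothetical sketch, not plugin code.
// Assumed slot order (before the identifier): prefix | cite | locator | suffix
const SLOT_ORDER = ["prefix", "cite", "locator", "suffix"];

// List every assignment of `fields` to slots that keeps the slot order
// and always includes the cite slot.
function interpretations(fields) {
  const results = [];
  // Each bit of `mask` says whether the corresponding slot is present.
  for (let mask = 0; mask < 16; mask++) {
    const chosen = SLOT_ORDER.filter((_, i) => mask & (1 << i));
    if (chosen.length !== fields.length || !chosen.includes("cite")) continue;
    const reading = {};
    chosen.forEach((slot, i) => { reading[slot] = fields[i]; });
    results.push(reading);
  }
  return results;
}

// Two fields ahead of the identifier can be read three different ways.
console.log(interpretations(["see", "Smith (2012)"]));
```

With all four fields present there is exactly one reading, which is why keeping the empty separators removes the guesswork.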
With a more sophisticated parser, you could adapt Andrea Rossato's markup mechanism from pandoc + citeproc-hs, which is nicely designed, and used by at least one other project (Erik Hetzner's zot4rst). That syntax assumes a human-readable local identifier, though, which would require some sort of mapping table. It could get complicated.
A nice thing about the ODF Scan marker layout is that it's very simple to explain and to use. We'll see what users report back to this thread from the field, but so far it looks like the main issue is a desire for a more compact identifier.
With 1 section (this can be tricky and I think probably should be left unsupported), try to use the content as an item key. If that fails, use it as a freeform cite. If that fails, ask the user to either pick an item from the library or skip it.
For the remaining cases, check if the last section is an item key. If it is, don't count it towards the number of sections.
With 1 (remaining) section, assume it's a freeform cite.
With 2: freeform cite plus prefix. If the freeform cite cannot be matched, try freeform cite plus suffix.
With 3: assume prefix, freeform cite, suffix.
I would also say the locator can be in brackets at the end of the freeform cite. This would avoid suffix/locator ambiguity both for the plugin and for the user.
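The fallback order proposed above could look roughly like the following JavaScript sketch. Everything in it is an assumption made for illustration: the `library` map stands in for a real Zotero lookup, the item keys are made up, the exact-match `matchFreeform` is far cruder than a real matcher would need to be, and none of the function names come from the plugin.

```javascript
// Hypothetical sketch of the fallback order proposed above; not plugin code.
// `library` stands in for a Zotero lookup: item key -> freeform cite text.
const library = new Map([
  ["ABCD1234", "Smith 2012"],
  ["WXYZ5678", "Jones 2010"],
]);

// Exact-match lookup; a real matcher would have to be far more tolerant.
function matchFreeform(text) {
  for (const [key, cite] of library) {
    if (cite === text) return key;
  }
  return null;
}

// Split "Smith 2012 [p. 45]" into cite text and an optional bracketed locator.
function splitLocator(text) {
  const m = /^(.*?)\s*\[([^\]]*)\]$/.exec(text);
  return m ? { cite: m[1], locator: m[2] } : { cite: text, locator: "" };
}

function resolve(marker) {
  const parts = marker.split("|").map((s) => s.trim());
  // One section: try it as an item key before treating it as a freeform cite.
  if (parts.length === 1 && library.has(parts[0])) {
    return { key: parts[0], prefix: "", locator: "", suffix: "" };
  }
  // A trailing item key does not count towards the number of sections.
  let key = null;
  if (parts.length > 1 && library.has(parts[parts.length - 1])) {
    key = parts.pop();
  }
  // Candidate layouts for the remaining sections, in the proposed order.
  const layouts = {
    1: [["cite"]],
    2: [["prefix", "cite"], ["cite", "suffix"]],
    3: [["prefix", "cite", "suffix"]],
  }[parts.length] || [];
  let fallback = null;
  for (const layout of layouts) {
    const slots = { prefix: "", cite: "", suffix: "" };
    layout.forEach((name, i) => { slots[name] = parts[i]; });
    const { cite, locator } = splitLocator(slots.cite);
    const candidate = { key, prefix: slots.prefix, locator, suffix: slots.suffix };
    const matched = matchFreeform(cite);
    if (matched && (key === null || matched === key)) {
      return { ...candidate, key: matched };
    }
    // With an explicit key, keep the first layout even if the text does not match.
    if (key !== null && fallback === null) fallback = candidate;
  }
  return fallback; // null means: ask the user to pick an item or skip
}

console.log(resolve("see | Smith 2012 [p. 45]"));
console.log(resolve("Jones 2010 | emphasis added"));
```

Note how the two-section case only resolves because one of the two readings happens to match an item. If both readings matched, or the library changed later, the result could silently differ, which is the reliability concern raised elsewhere in this thread.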
Frank: I'm not arguing for freehand referencing specifically, just to consider whether having fixed fields is really necessary. Doing away with those would certainly require a more sophisticated parser, but that's what I'm arguing for. Why, though? Shouldn't the goal be to integrate this into a single interface in Zotero proper that can take a document—in different possible formats, and perhaps even with different kinds of possible markers, both freehand and Zotero-generated—and output a document with formatted citations? If there's enough embedded info that no disambiguation is required, great. If there is, an interface is available.
If this is never merged into Zotero then this doesn't particularly matter to me, but I assumed this was meant to be a stopgap measure until it could be worked into the existing interface.
Simon, who wrote the RTF Scan code, also may have other thoughts on this.
I'm also with Dan on this, though I'm a bit skeptical about our being able to support very unstructured freeform citations.
As things stand, the plugin either produces a valid Zotero citation or leaves the glaring marker in the document. The risk of getting a well-formed citation with (harder to spot) unintended locators or affixes is pretty low. The appearance of the pipe markers in the draft may be less than tidy, but trust in the output reduces distraction of another kind.
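A rough JavaScript sketch of that all-or-nothing behaviour follows. It is not the plugin's code: the curly-brace marker with five pipe-separated slots is my assumption about the layout, `formatCitations` and `knownKeys` are hypothetical names, and the parenthesised output is a stand-in for real citation formatting.

```javascript
// Hypothetical sketch; the marker layout and all names are assumptions.
// Assumed marker: { prefix | cite | locator | suffix | item id }
const MARKER = /\{([^{}]*)\}/g;

function formatCitations(text, knownKeys) {
  return text.replace(MARKER, (whole, body) => {
    const slots = body.split("|").map((s) => s.trim());
    // Wrong number of slots: leave the marker visible in the document.
    if (slots.length !== 5) return whole;
    const [prefix, cite, locator, suffix, id] = slots;
    // Unknown item: again leave the marker untouched rather than guess.
    if (!knownKeys.has(id)) return whole;
    // Stand-in for real formatting: join the non-empty parts.
    const pieces = [prefix, cite, locator, suffix].filter((s) => s !== "");
    return "(" + pieces.join(" ") + ")";
  });
}

const keys = new Set(["ABCD1234"]);
console.log(formatCitations("A { see | Smith 2012 | p. 45 | | ABCD1234 } B", keys));
console.log(formatCitations("A { Smith 2012 | ABCD1234 } B", keys));
```

The second call returns its input unchanged, so a marker that cannot be read with certainty stays in the text where it is easy to spot.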
A lot of work has been done in legal circles lately on parsing human-readable citations out of free text. Citation marker schemes provide a little more structure to work with, but I guess some of the stories leave me feeling there are benefits to maintaining some distinction between the human-readable and the machine-readable.
Adaptive parsing to cover both RTF and ODF Scan markers (and maybe others) is certainly something to keep in view. The plugin code provides a nice starting point for exploring more flexible approaches. I'm sure there will be further experiments, but we're pretty happy with this for the present.
We're looking into reducing the bulk of the markers, and will post to the thread again when there's something to show.
I find the thought of trying to parse what's a delimiter in a marker and what isn't really unattractive. I also agree with what Frank says about the problem of citations that don't come out right. In a 300-page document I need this to work 100% reliably, not 98% reliably. We're going to keep it that way for our add-on.
I'm happy to chime in if/when you're looking at incorporating this into Zotero. My general position is that I think you underestimate the degree to which you will put off folks in the humanities - most notably in history! - if you make things fragile or unreliable for citations involving complex affixes.
I don't really see the downside of the pipes if we're going to keep the ID, since at that point we require a Zotero generated marker anyway and the pipes are printed by that.
I believe that the use of pipes helps to make the new plug-in's insertions _more_ human friendly. With pipes it becomes quite clear what is what. With commas or semicolons everything seems to run together.
I urge everyone to do a use-test. Write a paragraph using the plug-in as it currently stands. Copy that paragraph and edit it to replace the pipes with your choice of other delimiter. Compare the two. I find the pipe version more friendly. Another test would be to ask a naive person to look at the two paragraphs and ask which more clearly defines the parts of the inserted citation markers.
It's a tempting thought, but I'll step slowly with it: the RTF Scan syntax for suppress-author cites, in particular, may be tricky to handle against ODF markup. Would be handy if it could be made to work, though.
The change is backwards compatible (i.e. Zotero continues to recognize the old markers) and there are hidden prefs to use the old behavior; documentation will go up tomorrow.
Now I - and no doubt many others - will be scrambling to re-enter citations in the Scannable Cite format. Time well spent, and a great relief because now we have confidence that this cross-platform combination (a) works and (b) is supported. My only request is that if this system is updated (i.e. made prettier), ODF Scan will continue to be able to read citations entered in the current Scannable Cite format.
Back to work, smiling broadly ... cheers Evan