RTF scan problems with disambiguation in Zotero 2.1 RC1


I love Zotero, so I want to describe some problems/bugs with Zotero 2.1 RC1. I'm on a Mac: OS X Snow Leopard, Firefox 3.6, Word 2011.

I have two works, same author, same year, different titles:
1. Turcan, "Cioran or the Excess as Philosophy", 2008 (book) and
2. Turcan, "Cioran and Religion", 2008 (journal article).
I put them in the footnotes like this:
1. {Turcan, 2008, 44-45}
2. {AnotherAuthor, 1995}
3. {Turcan, 2008}

A. FIRST PROBLEM. When the disambiguation screen is shown, I can make ONLY ONE choice for both titles, so in the scanned RTF there is just a single reference:
1. Turcan, Cioran or the Excess as Philosophy, 2008, 44-45.
2. AnotherAuthor, Title, 1995.
3. Turcan, Cioran or the Excess as Philosophy. (HERE I want the other title, Turcan "Cioran and Religion", 2008)
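To make problem A concrete, here is a minimal Python sketch of why an {Author, Year} marker alone cannot tell the two Turcan items apart. It is not Zotero's actual JavaScript implementation; the marker pattern, the function names, and the library records are assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical marker pattern: {Author, Year} or {Author, Year, pages}.
MARKER = re.compile(r"\{([^,{}]+),\s*(\d{4})(?:,\s*([^{}]+))?\}")


def find_markers(text):
    """Return (author, year, locator) tuples for every marker found in text."""
    return [
        (m.group(1).strip(), m.group(2), (m.group(3) or "").strip() or None)
        for m in MARKER.finditer(text)
    ]


def ambiguous_keys(library):
    """Group items by (author, year) and keep the keys that match several items."""
    groups = defaultdict(list)
    for item in library:
        groups[(item["author"], item["year"])].append(item["title"])
    return {key: titles for key, titles in groups.items() if len(titles) > 1}


# Illustrative records taken from the example above.
library = [
    {"author": "Turcan", "year": "2008", "title": "Cioran or the Excess as Philosophy"},
    {"author": "Turcan", "year": "2008", "title": "Cioran and Religion"},
    {"author": "AnotherAuthor", "year": "1995", "title": "Title"},
]

footnotes = "{Turcan, 2008, 44-45} {AnotherAuthor, 1995} {Turcan, 2008}"
markers = find_markers(footnotes)
clashes = ambiguous_keys(library)
```

Both Turcan markers reduce to the same (author, year) key, so a scanner that asks only once per key has no way to map footnote 1 and footnote 3 to different items. It would have to ask once per marker occurrence instead.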

B. THE SECOND PROBLEM. If I don't have another title between footnote 1 and footnote 3, the scanned document looks like this (the second citation remains unscanned):
1. Turcan, Cioran or the Excess as Philosophy, 2008, 44-45.
2. {Turcan, 2008}

C. THE THIRD PROBLEM. Why doesn't the form with the title in quotation marks work at all, as a way to avoid disambiguation? For example, neither of these forms works (the scan process doesn't see them):
1. Turcan, "Cioran or the Excess as Philosophy", 2008, 44-45.
2. Turcan "Cioran and Religion", 2008.

Thank you very much,

  • No answers? I'm trying not to use EndNote, but without a decent RTF scan feature in Zotero that is impossible, because I work with Scrivener...
  • my (completely unofficial) sense is that the RTF scan is, at least currently, not much of a priority for developers - it hasn't been worked on for ages. Personally, I don't think it's usable for serious work exactly because of the type of issues you note.

    The way it's implemented (there's a separate .js file doing most of the work) it would be very much possible for a third party developer to improve this. Without that (unless devs say otherwise), I wouldn't get my hopes up for this to see any improvements soon.
  • Thank you, Adam.
    I'll wait for a third-party developer or teach myself JavaScript... In the meantime, I work with EndNote and its RTF scanning feature, because I'm working on a big thesis and after 500 footnotes Zotero becomes slower...

  • This might get some of my attention soon unless the core devs warn me that it's about to be overhauled anyway. The code doesn't look too bad.
  • It would be nice to have a good rtf scan! Thanks,

  • Avram, that would be great. This has been on my to-do list for a while, but between getting out Zotero 2.1 and working on Standalone, I haven't gotten around to it.
  • edited March 9, 2011
    @ajlyon: this is great news! I cannot contribute code, but I would be happy to run a few tests and help with debugging if it comes to that. Thanks from another grateful Scrivener user. kithairon
  • edited March 9, 2011
    @ajlyon: now that Papers also uses CSL (and seems to heavily depend on RTF Scan-like codes, see http://vimeo.com/20816946), it might be worthwhile coordinating for compatibility.

    P.S. I also remember some discussions about RTF scan codes with regard to CSL/Pandoc, but I'm not sure where those took place.
    P.P.S. I think I was thinking of this: https://groups.google.com/d/topic/pandoc-discuss/8wz2U2dzQJM/discussion
  • I don't have a Mac to test it out, but I don't see how the Papers folks handle locators.

    I think that improved RTF scan and the wonderful Gnotero (with appropriate small changes) would be just as smooth as what Papers2 is offering. And maybe the word processor plugins could then be convinced to detect and convert RTF scan codes as well...

    I won't give this a try for a couple more weeks at least, but it should be fun.
  • Has there been any improvement in the RTF scan feature in the months since the last post?

    I am trying out the RTF scan feature in the hope of being able to move to Scrivener for my writing, but I can't find advice on how to handle same-author-same-year ambiguities. If RTF scan has been improved and ambiguity handling is now possible, guidance would be much appreciated.
  • nope, the feature hasn't been touched for over a year.
  • It would be nice if RTF scan worked. I'd abandon EndNote. Some features would be nice:
    1. A style with no ambiguity: {Author, Year, "Either Title or Short title or some words of the title"#volume} (The volume is essential for disambiguation.) - I can do that.
    2. A menu item on right-clicking a Zotero item, named "Copy for rtf scan" or "Copy unformatted". - I can't do that.
    3. An improvement of the RTF scan function. - I can't do that, but I would like to learn. Unfortunately I haven't found a good tutorial for beginners. I have a little programming experience in Java, with the NetBeans IDE. What IDE do I need for programming Zotero?

    A propos, no news about rtf scan improvements in Zotero 3? I hate EndNote, but I love Scrivener (on Mac), and Zotero doesn't work with it.

    Thank you,
  • 2. It would be very easy to code a csl citation style for the rtf scan and then use quick-copy (i.e. drag&drop) from Zotero. I don't think rtf scan is currently good enough to actually make that worth it, but if there's demand I can certainly put in the time for that.
    3. The RTF scan is written in JavaScript, like most of Zotero, so you don't need an IDE at all. But NetBeans would certainly work.
  • Thank you, adamsmith!
  • Another vote for an improved rtf-scan in zotero.
    I have another use case in mind, though: a big problem for people who switch from EndNote to Zotero is the incompatibility of Word documents written with EndNote. It has been discussed in the forum for years without any solution. (http://forums.zotero.org/discussion/1824/)
    My idea is (given that I have imported my EndNote library into Zotero):
    1. Write a simple output style in EndNote that just puts author and year in {} brackets, as required for the RTF scan.
    2. Remove the field codes in Word and save the document as RTF.
    3. Let Zotero do the RTF scan.

    Nos. 1 and 2 are easy tasks, done in 15 minutes.
    No. 3 has so far failed for various reasons:
    a) I had problems with, e.g., authors who have a second given name. When I rewrote the style to use only the last name, I ran into the disambiguation problems mentioned in previous posts.
    b) I had problems with multiple references in one citation (e.g. two books by two different authors). It would be useful to extend the RTF scan syntax with a separator, e.g. ";", for different works within one citation.
    c) And even with the authors correctly detected, I had to learn that the citation style of the resulting document cannot be changed, as is possible with "normal" documents that have citations inserted by Zotero. That is a pity, because in many cases I re-use fragments of old texts for new ones. I think this should be possible, since Zotero needs to detect the entry of the cited work anyway to create the bibliography.
    So I think it would be great and helpful for many people if the RTF scan function of Zotero improved.
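The ";" separator suggested in (b) could be parsed roughly as follows. This is only a sketch of a hypothetical syntax extension: RTF scan does not support it as far as this thread indicates, and the function name and the returned field names are assumptions chosen for illustration.

```python
def split_citation(marker):
    """Split a hypothetical multi-work marker such as
    '{Smith, 2001, 12-14; Jones, 1999}' into one record per cited work."""
    inner = marker.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("not a citation marker: %r" % marker)

    works = []
    for part in inner[1:-1].split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing or doubled ';'
        fields = [f.strip() for f in part.split(",")]
        works.append({
            "author": fields[0],
            "year": fields[1] if len(fields) > 1 else None,
            # Anything after the year is treated as a locator (pages etc.).
            "locator": ", ".join(fields[2:]) or None,
        })
    return works


works = split_citation("{Smith, 2001, 12-14; Jones, 1999}")
```

Splitting on "," this naively would break on authors whose names contain a comma, which is close to the second-given-name problem in (a), so a real implementation would need a stricter grammar.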
  • I've just been playing around with EndNote for the first time in years in order to experience how its rtf scan feature behaves. In short, it's excellent. Seeing how well it can work made me wonder why Zotero hasn't gone down the route of requiring unique identifiers in the temporary citations, e.g.{Smith, 2001 #456}. Both Endnote and Papers do this and I assume that's why their scan works so well and Zotero's works relatively poorly. In addition, wouldn't a requirement for unique identifiers make the coding of the rtf scan function substantially easier?

    When you do get round to improving RTF scan, in addition to implementing a requirement for unique identifiers, I would like to suggest adding a feature to the Word and OOo plugins that converts back and forth between temporary citations (used by RTF scan) and formatted citations (used by cite-while-you-write). This is a useful feature in the EndNote plugin for Word (try the "update citations and bibliography" and "convert to unformatted citations" functions in the EndNote plugin of Word for Mac).
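As a rough illustration of how an identifier in the marker simplifies matching, here is a Python sketch that resolves markers of the form {Smith, 2001 #456} by ID and falls back to author and year when no ID is given. The pattern, the record layout, and the ID values are assumptions for illustration; this is not how Zotero, EndNote, or Papers actually store or look up items.

```python
import re

# Hypothetical pattern: {Author, Year} with an optional ' #id' suffix.
ID_MARKER = re.compile(r"\{([^,{}]+),\s*(\d{4})(?:\s*#(\w+))?\}")


def resolve(marker, library):
    """Return the library items a marker could refer to."""
    m = ID_MARKER.fullmatch(marker)
    if m is None:
        return []
    author, year, item_id = m.group(1).strip(), m.group(2), m.group(3)
    if item_id is not None:
        # With an identifier the lookup is exact: author and year are only
        # there for the human reader.
        return [item for item in library if item["id"] == item_id]
    # Without one, every same-author-same-year item is a candidate.
    return [item for item in library
            if item["author"] == author and item["year"] == year]


# Made-up records with made-up IDs.
library = [
    {"id": "456", "author": "Smith", "year": "2001", "title": "First work"},
    {"id": "457", "author": "Smith", "year": "2001", "title": "Second work"},
]

exact = resolve("{Smith, 2001 #456}", library)
candidates = resolve("{Smith, 2001}", library)
```

With the ID present there is exactly one match and no disambiguation dialog is needed; without it, the scanner is back to asking the user which of the two candidates was meant.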
  • _required_ unique identifiers may not be desirable (think - co-authoring).

    Allowing to switch back and forth is likely not possible, because this won't work with multiple users working on a document - Zotero uses globally unique ids.

    That said, some version of an identifier (likely optional) will definitely be part of an improved scan solution - note that for purists this already exists in the form of Zotero plain: https://bitbucket.org/egh/zotero-plain
    So the RTF scan is for people who _do_ want to use Word etc. but _don't_ want to use CWYW (which has become much faster & nimbler in the 3.0b version). IMHO that just makes it a lot less pressing.
  • edited October 30, 2011
    I don't understand why unique identifiers necessarily prohibit co-authoring. How is it different to the scenario of me adding a citation with CWYW and giving the document to a co-author to work with? Could you please explain what globally unique ids are?

    Anyway, optional identifier sounds good.

    I disagree with your final comment about RTF scan being for Word users who don't want to use CWYW. The ability to switch back and forth between temporary citations and the CWYW citation objects within the Word plugin is a feature that affords a great deal of flexibility in how you write. You can choose to primarily write in any writing program using temporary citations (it would be Scrivener for me) and then when I want to share with co-authors I could export and convert the temporary citations to formatted CWYW objects with a formatted bibliography in Word (so they make more sense to my co-authors than temporary citations). After their comments/amendments come back to me I would convert back to temporary citations and import the document back into Scrivener. [EDIT: This is the workflow described on various forums by a number of EndNote users who don't use Word as their primary writing program but do write co-authored papers].

    Off topic: Is it possible to quote from previous comments on this forum?
  • edited October 29, 2011
    And just to be clear. EndNote's implementation of RTF scan does not require the Word plugin. The conversion feature in the Word plugin is just an extra feature. The RTF scan can be done independently by the EndNote program too (like Zotero). This obviously makes it useful for people who don't want to go anywhere near Word or OpenOffice.
  • edited October 29, 2011
    Can't help with the headline issue, but to quote stuff you can use:
    <blockquote>Words of great import and significance.</blockquote>
  • In case any developers are reading: I still wonder why there is no way to build an almost unambiguous checksum, using the same code that the duplicate detection uses right now.
    The checksum could be ignored if there is only one matching reference entry, or if it no longer finds the correct one because the entry has been altered.

    Together with a drag-and-drop RTF scan style, this would solve the problem of integrating Zotero with Google Docs or other software for the time being, until someone comes up with a more sensible solution.
  • A major use for a good RTF Scan is Google Docs. Gdocs is a terrific way to collaborate on papers, especially with several authors, as one doesn't have to worry about conflicting versions and so on (and the comment facility is very good as well). But it doesn't handle bibliographies at all. If one could just enter {Author, date} in the Gdoc document as one wrote, then export the final product and run Zotero on it, it would be a happy resolution. Theoretically this is possible now, but the weaknesses of the current RTF scan make it less happy.
  • Any opportunity to have/create a unique identifier would be fantastic! (especially if tied to notes too)

    RTF Scan on {ID, pg} would be so simple for lone researchers, or collaborators savvy enough to use a common database.

    Further, something like {AuthorLastName_FirstTitleNonStopword_Date, pg} would disambiguate an awful lot of references, even for non-shared databases. Is it feasible to make RTF Scan detect such a concatenation?
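A concatenated key like the one proposed above could be generated along these lines. This is a Python sketch under stated assumptions: the stopword list, the function name, and the exact key format are invented for illustration and are not part of RTF scan.

```python
import re

# A small, assumed stopword list; a real one would be longer and per-language.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "in", "on", "as", "for", "to"}


def make_key(author_last, title, year):
    """Build AuthorLastName_FirstTitleNonStopword_Date from item fields."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    first = next((w for w in words if w.lower() not in STOPWORDS), "")
    return f"{author_last}_{first}_{year}"


key = make_key("Smith", "A Study of Things", "2001")

# The two items from the first post in this thread:
key_book = make_key("Turcan", "Cioran or the Excess as Philosophy", "2008")
key_article = make_key("Turcan", "Cioran and Religion", "2008")
```

Note that the two Turcan titles from the top of the thread both start with "Cioran", so they still collide under this scheme. It would cut down on ambiguity, but a fallback (an extra title word, or an optional ID) would still be needed for cases like that.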
  • nturcan, if you are still around, perhaps you have already figured this out, but you could enter something like {Turcan, Ex2008, 44-45} and {Turcan, Re2008, xx} to help in the disambiguation process. Zotero will kick these out nicely and separately (in the latest standalone beta at least), and the selection is just a matter of searching for turcan 2008 or some such in the selection window that pops up.
  • Dear mbruffey,
    I've tried to do what you suggest, but I'm not sure I got it right, since it did not work. I'm using Zotero standalone.
    I've added the prefix to the date in the RTF code, and tried both editing the item date accordingly and leaving it as it was: no luck. Not only does it not recognize the tweaked inline ref, but it also ignores the author for all other refs I may have in the rest of the text.
    What did you mean exactly by "Zotero will kick these out nicely and separately"?

  • I'm afraid, brunus, that I have not played with RTF for quite some time. Still waiting for improvements, and also not in the critical documenting/writing stage of the dissertation yet. So, unfortunately, I can't recall enough to help with your question. I just keep waiting with fingers crossed that some good samaritan will soon kick out easy-to-use disambiguation features in RTF.
  • check the discussion of odf scan at the bottom of this thread: http://forums.zotero.org/discussion/18064/2/please-add-better-integration-with-scrivener/
    Currently that's your best bet, it'll be a while until we get proper disambiguation implemented in vanilla Zotero/RTF scan
  • Thanks adamsmith. I just hopped out of that thread and into an exploration of MLZ Zotero a moment or two ago. I'm extremely impressed and excited at the work Mr. Bennett is doing! m.
  • Just another Scrivener user hoping for better disambiguation, hopefully with some kind of unique identifier for Zotero items.