ProCite to Zotero Conversion: Translator, RIS, and Testing

  • @mronkko First of all thanks so much for trying to help. I appreciate your time and help in elucidating what the issues are.

    @adamsmith and others:

    Can you link to any documentation on how to write a translator file? Would this basically be a new "RIS"-type file that I would load into Procite, use with "print bibliography" in that new translated style, and then import the resulting text file into Zotero?

    If anyone can do this for $50-$100, that would be doable. I am basically offering my own money, because I think it is important that my employer get onto a new database system that allows quicker day-to-day imports, since he is always coming across new citations. He wants to upgrade, but the switching costs are high if I have to go through every item (5,000+ citations) and edit it manually in Zotero. That would also be a barrier for others who might switch to Zotero but don't have someone to do this for them.

    I am sure there are many other academic ProCite users who would appreciate this kind of service.

    This is a time-sensitive issue, as I have put a hold on adding new citations to Procite in favor of adopting Zotero. Is there anyone who is able to write the translator?
  • Note:

    There appears to be a problem translating the Procite "Book Long Form" workform. In RIS it is exported as a Book Section (CHAP):

    Procite Screenshot Book Long Form Record:
    http://dl.dropbox.com/u/19141190/ExBookLongForm in Procite.png

    RIS Output:
    TY - CHAP
    N1 - Record ID: 2648
    A1 - Van Gennep, A.
    A2 - Vizedom, M. B.
    A2 - Coffee, G. L.
    N1 - Author Role: translated by
    T2 - The Rites of Passage
    N1 - Author, Subsidiary: Kimball S. T.
    N1 - Author Role: with an introduction by
    CY - Chicago
    PB - University of Chicago Press
    PY - 1960
    KW - anthropological theory
    ER -

    Zotero result after RIS import (note: I manually changed two authors to translators and one to contributor; with the RIS file alone, they all import as authors or as comments!):
    http://db.tt/C5o2Tzhv
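    For anyone who wants to inspect an export like the one above before importing it, here is a minimal Python sketch that splits RIS text into records of (tag, value) pairs. It is not the Zotero translator (that is JavaScript); the function name `parse_ris` and the tolerant tag pattern are assumptions made for illustration. Standard RIS puts two spaces before the dash, so the pattern accepts any run of spaces.

```python
import re

# Two-character tag, spaces, a dash, then the value. Standard RIS writes
# "TY  - "; the spacing shown in this thread is narrower, so any run of
# spaces before the dash is accepted.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")


def parse_ris(text):
    """Split RIS text into records; each record is a list of (tag, value)."""
    records, current = [], []
    for raw in text.splitlines():
        line = raw.strip()
        match = TAG_LINE.match(line)
        if match is None:
            # Not a tag line: treat it as a continuation of the previous
            # value (ProCite notes can span several lines).
            if current and line:
                tag, value = current[-1]
                current[-1] = (tag, (value + "\n" + line).strip())
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":
            # End of record.
            records.append(current)
            current = []
        else:
            current.append((tag, value))
    if current:
        records.append(current)
    return records


SAMPLE = """\
TY - CHAP
N1 - Record ID: 2648
A1 - Van Gennep, A.
A2 - Vizedom, M. B.
A2 - Coffee, G. L.
N1 - Author Role: translated by
T2 - The Rites of Passage
N1 - Author, Subsidiary: Kimball S. T.
N1 - Author Role: with an introduction by
CY - Chicago
PB - University of Chicago Press
PY - 1960
KW - anthropological theory
ER -
"""

records = parse_ris(SAMPLE)
for tag, value in records[0]:
    print(tag, "|", value)
```

    Note that the record has no T1 line at all, only T2, which is why the title ends up in the wrong place when the type is CHAP.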
  • The file contains some typos, so a generic solution would not work unless they are fixed first. Fortunately, there are not many.

    E.g.:
    Author Role: edtied by

    Also, I am uncertain whether an "editor and translator" should result in the same author being included twice in the item. Can you provide an example of a citation in a published work that actually lists the same person twice?

    I checked the code and this might be easy to implement. I might take a look at it some time later. (Most likely not before the weekend.)
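    One way to cope with misspelled role strings without editing the export by hand is fuzzy matching against a short list of known roles. The sketch below uses Python's standard `difflib`; the role list, the function name `normalize_role` and the 0.8 cutoff are illustrative assumptions, not something the translator is known to do.

```python
import difflib

# Roles seen in this thread; a real list would be longer (assumption).
KNOWN_ROLES = ["edited by", "translated by", "with an introduction by"]


def normalize_role(raw, cutoff=0.8):
    """Return the known role closest to `raw`, or None if nothing is close."""
    cleaned = " ".join(raw.lower().split())
    if cleaned in KNOWN_ROLES:
        return cleaned
    close = difflib.get_close_matches(cleaned, KNOWN_ROLES, n=1, cutoff=cutoff)
    return close[0] if close else None


print(normalize_role("edtied by"))
print(normalize_role("Translated by"))
print(normalize_role("compiled by"))
```

    Strings that are not close to any known role come back as None, so they can be reported and fixed manually rather than guessed.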
  • I'll try to put something together, but it will take a day or two. Thanks for all the screenshots and the manual, maevepotter; those will be very helpful.

    Regarding the Book Long Form, that looks to be a bug in ProCite (probably the export filter), but I have no experience using ProCite, so I wouldn't know how to help you fix this. You could probably change all book long form entries to book and re-export, unless that would make you lose some info.
  • I don't think the person's name would appear twice, just "Ed. and trans." followed by the name. See below:

    "To cite a scholarly edition and/or translation of a work, begin with the author and title, followed by
    the editor and/or translator:
    Dante Alighieri. Inferno. Ed. Giuseppe Mazzotta. Trans. Michael Palma. New York:
    Norton, 2007. Print.

    If the editor and translator are the same person, follow the title with “Ed. and trans.” (Edited and
    translated by). For example: Dante Alighieri. Inferno. Ed. and trans. Louis Cypher."
    http://www.ithacalibrary.com/research/MLA2009books.pdf

    @mronkko: Thanks, do you need me to edit those files, and do a new RIS? Would they be the same files as are listed in your spreadsheet?

    @Aurimas: Thanks, anything would be great. I have some other people in the department expressing interest.

    Book Long Form: It might make me lose some info, like "Series Volume" (which did not transfer to Zotero from the RIS; it went into notes). See screenshot: http://dl.dropbox.com/u/19141190/procite book long form ex 2.png

    The problem with changing these in Procite is finding them. I don't think I can sort by source type, so I would have to find those types of records by going through all the records.
  • Another bug: Conference Proceedings

    See screenshot of record in Procite: http://dl.dropbox.com/u/19141190/conference proceeding procite.png

    Issue:
    Proceedings Title and Place of Meeting fields dumped into notes in Zotero, even though Zotero has corresponding fields. See Zotero screenshot: http://dl.dropbox.com/u/19141190/Zoteroresult of conference proceeding.png

    RIS File output was:

    TY - CONF
    N1 - Record Number: 330
    A1 - Allard, Francis
    T1 - Investigating the Bronze Age of Khanuy Valley, Central Mongolia
    N1 - Connective Phrase: In
    A2 - Hanks, Bryan
    A2 - Linduff, Kathy
    N1 - Proceedings Title: New Research Directions in Eurasian Steppe Archaeology: The Emergence of Complex Societies in the Third to First Millennia BCE
    Y2 - 2006/02/10-2006/02/10
    N1 - Place of Meeting: University of Pittsburgh
    CY - Pittsburgh
    PB - Department of Anthropology & Center for Russian and East European Studies
    PY - 2006
    N1 - Notes: have, seen
    Allard 2006

    [notes redacted here for clarity]

    KW - Mongolia
    KW - pastoralism
    ER -


    Also, as you can see, Hanks and Linduff are represented under A2, which went into Zotero as a contributor instead of Editor/Compiler, as they were in Procite (see prev. screenshot).
  • Looking at certain records in Zotero, it seems that it might be good for the program to add some additional fields. For example, Conference Proceedings could have a date for the publication of the proceedings as well as a date for the actual meeting of the conference.

    Is this possible for the future?
  • I will let Aurimas handle this.
  • same here - you're in good hands with Aurimas
  • "Book Long Form: It might make me lose some info, like "Series Volume", (which did not transfer to zotero from the RIS, went into notes). See screenshot: http://dl.dropbox.com/u/19141190/procite book long form ex 2.png

    The problem with changing these in Procite is finding them. I don't think I can sort by source type, so I would have to find those types of records by going through all the records."

    You might be able to fix this by modifying the RIS export style (I assume you're using RIS-Endnote style) in ProCite. If you follow the manual starting at page 417, you might be able to figure out what you need to tweak. I don't have ProCite, so I can't tell you exactly, but under Book Long Form you need to change CHAP to BOOK somewhere.

    Keep in mind that Book Long Form can be used to cite practically anything in ProCite, and their suggestions include books, book sections/chapters, atlases, and a couple others, so you may end up converting some things that are meant to be book sections into books.

    "Issue:
    Proceedings Title and Place of Meeting fields dumped into notes in Zotero, even though Zotero has corresponding fields."

    We should be able to handle this in the modified RIS translator.

    "same here - you're in good hands with Aurimas"

    Thanks for having so much confidence in me :-) Now I just need to not disappoint. I'll start working on this when I get off of work.
  • edited March 27, 2012
    "You might be able to fix this by modifying the RIS export style (I assume you're using RIS-Endnote style) in ProCite. If you follow the manual starting at page 417, you might be able to figure out what you need to tweak. I don't have ProCite, so I can't tell you exactly, but under Book Long Form you need to change CHAP to BOOK somewhere."

    I don't know how to modify the RIS export style myself, other than manually changing the output. Procite doesn't appear to give you much in the way of options when you create the file. Just to include all fields...

    Thanks Aurimas, let me know what you come up with. I appreciate your help.
  • So I folded and downloaded the Demo version of ProCite.

    Turns out you can sort items by their item type. When you have your database open in ProCite, go to View->Configure Record List... Under Layout, you have a list of fields that display. Check an additional field to enable it, then in the drop-down box scroll to the top and find Workform. You can also change the title to whatever you want. Now you can sort by item type. This may also be used to sort by most other fields.

    As far as editing the RIS export format, I will assume that you are using directions similar to the ones posted here http://www.refworks.com/rwsingle/help/Exporting_from_Bibliographic_Programs_and_Importing_into_RefWorks.htm#ProCite

    That would mean that you already have the RIS-Endnote.pos style file. To edit it, you can open this file in ProCite just like you would open a library. In the Open dialog, select Output Style for "Files of Type" and find the RIS-Endnote.pos file (if you don't know where it is, try ProCite5/Styles/Standard). This will open the style configuration dialog. For your purposes, you will want to select Book Long Form, then go to the Bibliography tab and change the first line from TY - CHAP to TY - BOOK. Save the file (you can save it under a different name to keep a backup of the original) and re-export your library using this new output style.

    Either of these options will do what you want. You'll probably have more control using the first one, since you'll be able to decide whether the item should be a book or a book chapter, but the second option should be much faster.

    Finally, is the RIS file that you linked to in your second post (https://raw.github.com/gist/2206031/1d3e4be1b6b0fadd0ef86f6567f11a97d50c86f7/March26ProciteRISdata) still the original file that you exported, or is it now the version that you modified over the course of this discussion?
  • I'm calling it a night. This is still a work in progress, but it seems to work quite well.

    https://github.com/aurimasv/translators/blob/RIS/RIS.js

    You can save this file as RIS.js and copy it to the Zotero translator folder (http://www.zotero.org/support/zotero_data)

    Some things that I modified:

    Added handling for author roles defined in notes immediately following author names.

    Made RIS skip over some notes like 'Record ID', 'Record Number', 'Connective Phrase'

    It now parses some other values set in notes like 'Language', 'Call Number', 'ISBN', etc.

    I think I wrote the modifications in a way that would make it pretty easy to add any other fields that we're still not handling.

    Use it at your own risk, this has not been extensively tested and the code changes are quite large. But if you do test, please post any problems you encounter.
  • [there is also no reason not to test this - the worst that can happen is that you'll need to restore the old translator, which takes one click].
  • edited March 30, 2012
    Good news, I think this is getting close to being done. Here's another update (same link):

    https://raw.github.com/aurimasv/translators/RIS/RIS.js

    Here is a list of fields that we now handle (these are placed in N1 by ProCite):

    These fields determine the creatorType(s) for the authors that precede them:
    Author Role
    Author Role, Analytic
    Editor/Compiler Role
    Series Editor Role
    Artist Role
    Cartographer Role
    Composer Role
    Director Role
    Performer Role
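    To make the "role follows the names" rule concrete, here is a rough Python sketch (the translator itself is JavaScript, and this is not its code). It assumes a role note applies to the run of creators with the same tag immediately before it, which matches the Van Gennep record posted earlier. The role-to-creatorType table, the default type of "author" and the name `assign_roles` are assumptions for illustration.

```python
CREATOR_TAGS = {"A1", "AU", "A2", "ED", "A3"}

# Assumed mapping from ProCite role text to Zotero creator types.
ROLE_TO_TYPE = {
    "translated by": "translator",
    "edited by": "editor",
    "with an introduction by": "contributor",
}


def note_value(value, label):
    """Return the text after 'label:' if the note starts with it, else None."""
    prefix = label + ":"
    if value.startswith(prefix):
        return value[len(prefix):].strip()
    return None


def assign_roles(fields):
    """Walk (tag, value) pairs in file order and type each creator."""
    creators = []
    run = []  # indexes of the creators a following role note would apply to
    for tag, value in fields:
        if tag in CREATOR_TAGS:
            if run and creators[run[-1]]["tag"] != tag:
                run = []  # a different tag starts a new run
            creators.append({"name": value, "tag": tag, "creatorType": "author"})
            run.append(len(creators) - 1)
            continue
        subsidiary = note_value(value, "Author, Subsidiary") if tag == "N1" else None
        role = note_value(value, "Author Role") if tag == "N1" else None
        if subsidiary is not None:
            # A creator stored in a note forms a run of its own.
            creators.append({"name": subsidiary, "tag": "N1", "creatorType": "author"})
            run = [len(creators) - 1]
        elif role is not None:
            creator_type = ROLE_TO_TYPE.get(role.lower())
            if creator_type:
                for index in run:
                    creators[index]["creatorType"] = creator_type
            run = []
        else:
            run = []  # any other field ends the current run
    return creators


FIELDS = [
    ("TY", "CHAP"),
    ("N1", "Record ID: 2648"),
    ("A1", "Van Gennep, A."),
    ("A2", "Vizedom, M. B."),
    ("A2", "Coffee, G. L."),
    ("N1", "Author Role: translated by"),
    ("T2", "The Rites of Passage"),
    ("N1", "Author, Subsidiary: Kimball S. T."),
    ("N1", "Author Role: with an introduction by"),
]

creators = assign_roles(FIELDS)
for creator in creators:
    print(creator["creatorType"], "|", creator["name"])
```

    With these assumptions, Van Gennep stays an author, Vizedom and Coffee become translators, and Kimball becomes a contributor, which is what the manual correction in the earlier screenshot did.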

    These fields are parsed into creators (also taking into account the above fields):
    Author, Monographic
    Author, Subsidiary
    Director
    Editor
    Editor/Compiler
    Producer
    Series Editor

    These fields determine either numPages or numberOfVolumes depending on what Packaging Method contains (handles page(s), pp, volume(s), vol(s)):
    Extent of Work
    Packaging Method
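    As a small illustration of the Extent of Work / Packaging Method rule, the Python sketch below picks the target field from the packaging text. Only the unit spellings listed above are recognized; accepting a trailing period and mixed case is an assumption, as is the function name `extent_to_field`.

```python
import re

# Unit spellings from the list above: page(s), pp, volume(s), vol(s).
PAGES = re.compile(r"^(pages?|pp)\.?$", re.IGNORECASE)
VOLUMES = re.compile(r"^(volumes?|vols?)\.?$", re.IGNORECASE)


def extent_to_field(extent, packaging):
    """Return (zotero_field, value), or None if the unit is not recognized."""
    unit = packaging.strip()
    if PAGES.match(unit):
        return ("numPages", extent.strip())
    if VOLUMES.match(unit):
        return ("numberOfVolumes", extent.strip())
    return None


print(extent_to_field("198", "pp."))
print(extent_to_field("3", "vols"))
print(extent_to_field("1", "box"))
```

    Anything with an unrecognized unit returns None here, so the caller can keep it in a note instead of guessing.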

    These are mapped directly to their equivalents in Zotero:
    Call Number
    Edition
    ISBN
    Language
    Place of Meeting
    Place of Publication
    Proceedings Title
    Publisher Name
    Series Title
    Copyright Date
    Date of Copyright
    Issue ID
    Issue Identification
    Page(s)
    Volume ID
    Scale

    This is still stored in a note, but the "Notes:" part is now dropped:
    Notes

    These are ignored, since they are meaningless outside of ProCite:
    Connective Phrase
    Record ID
    Record Number


    Please let me know if something doesn't seem right. It will be easy to fix.

    Any fields that are not mentioned above are stored in attached notes.
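    The overall routing of N1 lines (ignore, map directly, or keep as a note) can be sketched as follows. This is a Python illustration only, not the translator's code; the target field names in `DIRECT` are my assumptions about plausible Zotero fields, the table is deliberately incomplete, and `route_note` is a hypothetical name.

```python
# Labels that are meaningless outside ProCite.
IGNORED = {"Connective Phrase", "Record ID", "Record Number"}

# Assumed targets for a few of the directly mapped labels (incomplete).
DIRECT = {
    "Proceedings Title": "proceedingsTitle",
    "Place of Meeting": "place",
    "Call Number": "callNumber",
    "Edition": "edition",
    "ISBN": "ISBN",
    "Language": "language",
}


def route_note(value):
    """Classify one N1 value as ('skip', None), (field, text) or ('note', text)."""
    label, separator, rest = value.partition(":")
    label, rest = label.strip(), rest.strip()
    if not separator:
        return ("note", value.strip())
    if label in IGNORED:
        return ("skip", None)
    if label == "Notes":
        # Keep the note, but drop the "Notes:" prefix.
        return ("note", rest)
    if label in DIRECT:
        return (DIRECT[label], rest)
    # Unknown label: keep the whole line so nothing is lost.
    return ("note", value.strip())


for line in [
    "Record Number: 330",
    "Place of Meeting: University of Pittsburgh",
    "Notes: have, seen",
    "Series Volume ID: 12",
]:
    print(route_note(line))
```

    Splitting on the first colon only matters for values such as the Proceedings Title in the conference example, which contains a colon of its own.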

    Here is a list of fields that we still don't handle. If someone has some ideas on what to do with them, please let me know:

    Author Affiliation
    Author Affiliation, Ana.
    Author E-mail
    Document Type
    Medium Designator
    Original Pub Date
    Section Title
    Series Issue ID
    Series Volume ID
    Abstract Journal Date
    Abstract Journal Issue
    Abstract Journal Title
    Abstract Journal Volume
    Acquisition Number
    Address
    CODEN
    Column Number
    Computer Program
    Extent of Letter
    First Page
    Frequency of Publication
    History
    Location in Work
    Matrix Number
    Medium
    Medium (Data File)
    Plate Number
    Recipient E-mail
    Recording Title
    Registry Number
    Related Document No.
    Report Identification
    Reproduction Ratio
    Size
    Title
    Title Monographic
    Title, Long Form
    Title, Monographic
    Title, Short Form
    Translated Title

    EDIT: Updated field mappings
  • Hi Aurimas,

    Wow what a lot of work, thank you so much for all of your efforts. Sorry for the slow response, as I was not at work for several days.

    Here is a link to the original database so you can work with it in Procite. The .pdt file is the database that you would import. I'm honestly not sure what the other file does, but it is part of the output when you save the database, so there it is.

    http://dl.dropbox.com/u/19141190/Rogers-ArchaeologyupdatedMar26.pdt
    http://dl.dropbox.com/u/19141190/Rogers-ArchaeologyupdatedMar26.pdx

    Let me know if you still need the original RIS, or the one mronkko made.

    I am going to review what you wrote a little more and try to import it.

    Am I understanding correctly, that what you wrote is a translator, and that I will copy it to the Zotero translator folder, and then import the database? Will I need to export a fresh RIS file, and/or edit the RIS style file like you mentioned above? Or will I simply let Zotero read the ProCite database file?

    Thanks.

    Meghan
  • Also, as far as record ID, I had kind of wanted to keep that field, just so that in case of an error, it would be possible to refer directly to the original record in Procite. I suppose it is not a huge thing though, as you would be able to look up the author or title and find it that way.

  • "Am I understanding correctly, that what you wrote is a translator, and that I will copy it to the Zotero translator folder, and then import the database?"

    exactly.

    "Will I need to export a fresh RIS file, and/or edit the RIS style file like you mentioned above? Or will I simply let Zotero read the ProCite database file?"

    the translator works with an RIS file, not the ProCite database - I believe you should be able to just use the one you have already created, Aurimas would have to confirm.
  • "the translator works with an RIS file, not the ProCite database - I believe you should be able to just use the one you have already created, Aurimas would have to confirm."

    I'm not sure how much you modified your original RIS file. I was doing all the testing with the file you posted on gist, so it should import better with a fresh export.

    "Also, as far as record ID, I had kind of wanted to keep that field, just so that in case of an error, it would be possible to refer directly to the original record in Procite. I suppose it is not a huge thing though, as you would be able to look up the author or title and find it that way."

    If you want to keep it, simply open up the RIS.js file and comment out lines 317 and 318 (they should have record ID and record identifier) by placing // at the beginning of each line. I wasn't sure which way was preferred, but I figured that in most cases record ID would not be necessary, like you pointed out.
  • I did not modify the original RIS file at all; that was the output straight from Procite.

    Regarding the Record ID comment out: that sounds fine. Thanks.

    "That would mean that you already have the RIS-Endnote.pos style file. To edit it, you can open this file in ProCite just like you would open a library. In the Open dialog select Output Style for "Files of Type" and find the RIS-Endnote.pos file (if you don't know where it is, try ProCite5/Styles/Standard). This will open style configuration dialog. For your purposes, you will want to select Book Long Form, then go to the Bibliography tab and change the first line from
    TY - CHAP
    to
    TY - BOOK
    . Save the file (you can save as a different file to have a backup of the original) and re-export your library using this new output style.

    Either of these options will do what you want. You'll probably have more control using the first one, since you'll be able to decide whether the item should be a book or a book chapter, but the second option should be much faster."


    Will I still need to do the above to ensure a complete transfer, or have you coded that into the translator?

    Do you foresee any problems I should check for when I am testing it?

    Also, Zotero seems to have problems when I import the whole RIS at once: scripts occasionally stop and prompt me about it, or the import just takes a very long time (20-30+ minutes). Should I break up the RIS into smaller files?

    Finally, what html do you use to make the quotes from other people show up outlined in the dotted line? I tried < quote > < /quote > but no luck.
  • "Will I still need to do the above to ensure a complete transfer, or have you coded that into the translator?"

    You will still need to do this. I don't think there is a way for us to figure out what is supposed to be a book and what is a book chapter once it has been exported as a chapter.

    "Do you foresee any problems I should check for when I am testing it?"

    I would pay most attention to authors and their roles, as that is the most complicated piece of code.

    "Also, Zotero seems to have problems when I import the whole RIS at once, with occasionally scripts stopping, and prompting me about it, or just taking a very long time (+20-30 minutes). Should I break up the RIS into smaller files?"

    It should work fine if you split it up. I would imagine that it will be easier for you to do it in pieces anyway.

    "Finally, what html do you use to make the quotes from other people show up outlined in the dotted line? I tried < quote > < /quote > but no luck."

    Use <blockquote> and format comment as HTML
  • Thanks.

    I made a tutorial of what we have currently (perhaps the translator will change locations or be updated elsewhere). Hope it is helpful to someone. I thought it might be nice to summarize the process. I tried making it into a webpage hosted on dropbox, but I can't get the images to load, and the code is a bit beyond me.

    PDF of tutorial: http://dl.dropbox.com/u/19141190/Tutorial Pro to Zot/TutorialProcitetoZoterobyMulkerin.pdf

    Meghan
  • edited April 2, 2012
    Looks like line 770 has a bug. It has been stopping the import process: it goes really slowly and never completes. I split the txt into two files; should I split it further?

    Error while importing:

    A script on this page may be busy, or it may have stopped responding. You can stop the script now, or you can continue to see if the script will complete.

    Script: C:\Users\MulkerinM\AppData\Roaming\Zotero\Zotero\Profiles\x27d89mc.default\zotero\translators\RIS.js:770
  • Large imports will take a long time. The busy-script warning does not indicate a bug; Firefox produces it automatically when a script doesn't finish quickly. I'd suggest chopping the import file into more chunks, but if you don't want to, you can likely get it to work by hitting "continue" a couple of times for that script.
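    Chopping the file by hand is error-prone, because a cut in the middle of a record would break it. Here is a minimal Python sketch that splits RIS text only at end-of-record (ER) lines. The function name `split_ris` and the default of 500 records per chunk are arbitrary choices for illustration.

```python
import re

# An end-of-record line: "ER", spaces, a dash, and nothing else.
END_OF_RECORD = re.compile(r"^ER\s+-\s*$")


def split_ris(text, records_per_chunk=500):
    """Split RIS text into chunks, cutting only after ER lines."""
    chunks, lines, count = [], [], 0
    for line in text.splitlines():
        lines.append(line)
        if END_OF_RECORD.match(line.strip()):
            count += 1
            if count == records_per_chunk:
                chunks.append("\n".join(lines) + "\n")
                lines, count = [], 0
    # Keep any remaining records (ignore a tail of blank lines only).
    if any(line.strip() for line in lines):
        chunks.append("\n".join(lines) + "\n")
    return chunks


# Five tiny made-up records, two per chunk.
sample = "".join(f"TY - BOOK\nT1 - Title {i}\nER -\n\n" for i in range(5))
chunks = split_ris(sample, records_per_chunk=2)
print(len(chunks), "chunks")
```

    Each chunk can then be written to its own file and imported separately; comparing the total number of ER lines with the original is a quick check that no record was lost.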
  • Ah I see, ok. I am using standalone, not Firefox though, just in case that makes a difference.
  • edited April 2, 2012
    I just ran your entire file through the translator. That pop-up came up about 15 or so times, but if you keep clicking continue, you will eventually get through it all.

    After the translation, I did not see any errors and it managed to import all 4719 items. I'm just not sure how correct all of them were.

    EDIT: actually it might take you even longer. I wasn't technically importing the items into the library, just parsing them.
  • How long did it take you to import all the items? Just curious if this computer isn't working well. Does a zotero rdf usually load faster (like this one takes a while because the translator is hard at work)?

    Also, I have cancelled (by closing the program) several times after it seemed like it had frozen. Is there some kind of cache I should clear on Zotero, in case of a partial import that doesn't show up in the library?

    Thanks aurimas, really appreciate your work on this. Do you consider the translator complete at this point, pending any issues?
  • "How long did it take you to import all the items? Just curious if this computer isn't working well. Does a zotero rdf usually load faster (like this one takes a while because the translator is hard at work)?"

    Took me about 2.5 minutes using Zotero Firefox extension. Not sure how it compares to other formats.

    "I have cancelled (by closing the program) several times after it seemed like it had frozen. Is there some kind of cache I should clear on Zotero, in case of a partial import that doesn't show up in the library?"

    That's a good question about stopping imports somewhere in the middle. There may be some database corruption, but someone else would have to answer this. There is a feature in Zotero to check your database integrity. Preferences -> Advanced -> Check Database Integrity.

    "Do you consider the translator complete at this point, pending any issues?"

    I noticed that we import GEN as presentation, which leaves out some fields. I'll need to take a look at this, but I can't right now. Right now this is the only bug I'm aware of, but it has not been extensively tested. You were going to be my test subject :-)
  • edited April 2, 2012
    I updated the translator with some modified item type mappings. It should yield better results.

    If you have not re-exported your ProCite library using the modified Book Long Form style, you will notice a lot of chapters missing titles. You should be able to safely convert those to books and the title will be filled in.
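    To get a sense of how many records are affected before importing, a short Python sketch like the one below can list CHAP records that have no primary title. It assumes the primary title would appear as T1 (or TI) and the book title as T2, as in the Van Gennep example; the function name `untitled_chapters` is made up for illustration.

```python
import re

TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")


def untitled_chapters(text):
    """Return the T2 value of every CHAP record that has no T1/TI line."""
    found, record = [], {}
    for raw in text.splitlines():
        match = TAG_LINE.match(raw.strip())
        if match is None:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":
            is_chapter = record.get("TY") == "CHAP"
            if is_chapter and "T1" not in record and "TI" not in record:
                found.append(record.get("T2", ""))
            record = {}
        else:
            # Keep the first value seen for each tag; enough for this check.
            record.setdefault(tag, value)
    return found


SAMPLE = """\
TY - CHAP
A1 - Van Gennep, A.
T2 - The Rites of Passage
PY - 1960
ER -

TY - CHAP
A1 - Example, Author
T1 - A made-up chapter title
T2 - A made-up book title
ER -
"""

print(untitled_chapters(SAMPLE))
```

    Records listed this way are the likely Book Long Form exports; a CHAP record that does have its own title is probably a genuine chapter and should be left alone.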

    @adamsmith (or someone else who knows): could you take a look at the field mappings? https://github.com/aurimasv/translators/blob/RIS/RIS.js#L104

    EDIT: I got the item type info from ProCite's website (but I can't find the link now). I think they're all listed at http://en.wikipedia.org/wiki/RIS_(file_format)#Type_of_reference