ProCite to Zotero Conversion: Translator, RIS, and Testing

  • Hi Aurimas,

    Did you update the translator since my last post? Should I download a new RIS.js file and reimport the library again?

    I did make the modification, changing CHAP to BOOK in the Book Long Form section.

    Note the tutorial I made above, which basically outlines what I did.
    One other question, is there a way to make truly separate libraries in Zotero, not collections, without a separate account?
  • Unrelated to transfer:
    Why does a Zotero "Report" not display "Series Editor" under the creator field in the summary list view of all the citations?
  • edited April 3, 2012
    ERROR:

    EDIT: The problem occurs in every Book Section record I have checked so far. It omits the chapter title altogether; it does not even go into the notes field. This happened before Aurimas gave the coded RIS, so it was an existing problem that I did not know about.

    I found an error, not sure how widespread it is. For a book chapter in ProCite, the chapter's title (i.e. "Title, Analytic") is not being mapped to Zotero's "Title" field in Book Section.

    RIS Output:

    TY - CHAP
    A1 - Adams, Robert McC.
    T2 - Contexts of Civilizational Collapse: A Mesopotamian View
    N1 - Connective Phrase: In
    A2 - Yoffee, Norman
    A2 - Cowgill, George L.
    N1 - Author Role: editors
    T2 - The Collapse of Ancient States and Civilizations
    CY - Arizona
    PB - University of Arizona Press
    PY - 1988
    SP - 20-43
    N1 - Notes: have, seen
    KW - Asia
    KW - empire
    KW - archaeological theory
    KW - social organization
    ER -

    Screenshot to compare outputs:
    http://dl.dropbox.com/u/19141190/errorbookchapter.png
  • This is happening because both Analytic and Monographic titles are placed in T2 instead of T1 and T2 respectively. This is probably best fixed in the ProCite RIS export style. Fairly simple to do:

    Open up the RIS-EndNote style in ProCite as previously.
    Select Book Chapter
    Go to Bibliography tab
    Change line 4 from:
    T2 - <04 Title, Analytic>|
    to:
    T1 - <04 Title, Analytic>|

    I'm thinking that all of these bugs we're finding in the export style will eventually find their way into some sort of tool that will fix ProCite's style. I don't think ProCite's license allows us to host modified pos files, but I think we may be able to provide a web script that would fix the flaws.

    Thanks for going through this long debugging process.

    P.S. I also screwed up that last update. GEN is now mapped to Document, which is no better than Presentation. I will update it to Journal Article, so that we preserve as many fields as possible. I'm thinking of adding a tag when we encounter GEN, so that the user can easily find and fix these to whatever is the appropriate type after import. Give me a few minutes to implement this

    P.P.S. If you're not doing so already, I would really recommend splitting up your library into smaller pieces, at least while we're going through this debugging process. If I were you, I would export/import 100 items at a time. This way, if you get a couple hundred items to import correctly and you make some minor tweaks, they are not overwritten when you do another import with a modified translator. I hope that makes sense. I would just feel pretty bad if you went through the items and then we found some flaw in the translator and you had to go through all of it again.
  • @aurimas - I'll take a closer look at the type fields - I'd be inclined to include those into the main RIS importer.

    I'm also with you on the import script/procedure. In my ideal world, every product would just export rich and spec-compliant data. Since that's never going to happen, in my slightly-less-ideal version, Zotero will just develop import scripts for all major ref-managers. Some will require almost no work, and some quite a bit, but I think having the info for each program in one place would be great.
  • @adamsmith - I would love to be able to handle this inside a script, but it just gets so messy when you have cases like two T2 fields.

    Idk, I think we can come up with some logic like if publication/book title is already set and there is no title, then set title to publication/book title and set publication/book title to the new T2.

    I think we can also use the "Connective Phrase: in" to help us with some of the logic too, I just haven't come around to using it yet.
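
    The rule proposed above for two T2 fields can be sketched roughly as follows. This is only an illustration in Python, not the translator's actual JavaScript; the function name `assign_titles` and the tag-matching pattern are assumptions made for the example.

```python
import re

# Matches lines such as "T2 - Some title" or "ER -" (one or more spaces before the dash).
TAG_RE = re.compile(r"^\s*([A-Z][A-Z0-9])\s+-\s?(.*)$")

def assign_titles(lines):
    """Return (title, book_title) for one RIS record given as a list of lines."""
    title = None
    book_title = None
    for line in lines:
        m = TAG_RE.match(line)
        if not m:
            continue
        tag, value = m.group(1), m.group(2).strip()
        if tag in ("T1", "TI"):
            title = value
        elif tag == "T2":
            if book_title is not None and title is None:
                # Second T2 and no title yet: the earlier T2 becomes the title.
                title = book_title
            book_title = value
    return title, book_title

record = [
    "TY - CHAP",
    "A1 - Adams, Robert McC.",
    "T2 - Contexts of Civilizational Collapse: A Mesopotamian View",
    "T2 - The Collapse of Ancient States and Civilizations",
    "ER -",
]
title, book_title = assign_titles(record)
print("title:", title)
print("book title:", book_title)
```

    A record that has only one T2 and no T1 would still end up with no title under this rule, so it does not cover every malformed record.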
    I'd be inclined to include those into the main RIS importer.
    Technically, the modified version does not break imports of standard RIS files at least from what I tested. But I can see how there may be a lot of hesitation to merge this into the main RIS due to all the extra code. Besides, I would probably agree that we can just make this a new translator and call it RIS-ProCite.
  • I have been deleting the full library each time from Zotero standalone to ensure that there are no duplicates etc.

    If I were to split it into 100 items at a time it would be like 47 different files. I think that would be a bit too much. Also I am using Notepad (though I am sure there is something better, like Notepad++ or whatever), so there is no reliable way that I know of to split it into a certain number of records without counting each one. Similarly in ProCite, I can't seem to select all "journal articles" workforms without checking all of the boxes manually. Am I missing something obvious?

    Can you link to the translator RIS when it is updated and I will try again? Should changing T2 to T1 solve all the problems with Book Chapter? Are there any other types I should know about?

    Is there a reliable way to count how many of each type of workform I have in the library?

    Also question from above:
    Unrelated to transfer:
    Why does a Zotero "Report" type not display "Series Editor" under the creator field in the summary list view of all the citations?
    I have a person who is an editor of a series of reports, and he is published elsewhere, and should be listed alphabetically with his other publications, not a blank creator field in the list view. In Zotero under Report there is only the option for "series editor" which doesn't seem to show up in the creator field.
  • post unrelated questions to a new thread, please. You, Aurimas, and I are probably the only ones reading along here by now ;-).
  • edited April 3, 2012
    P.S. I also screwed up that last update. GEN is now mapped to Document, which is no better than Presentation. I will update it to Journal Article, so that we preserve as many fields as possible. I'm thinking of adding a tag when we encounter GEN, so that the user can easily find and fix these to whatever is the appropriate type after import. Give me a few minutes to implement this
    Can you link to the translator RIS when it is updated and I will try again?
    Updated. Same link https://raw.github.com/aurimasv/translators/RIS/RIS.js
    Should changing T2 to T1 solve all the problems with Book Chapter? Are there any other types I should know about?
    It will fix the problem with blank titles. I'm not sure what other types do this. I'm not too inclined to go through the entire RIS-EndNote export style and try to debug it just by looking at the code. But if you do find more bugs, we will work with you to fix them. Also, see below for how to make this debugging thing not waste all of your time.
    If I were to split it into 100 items at a time it would be like 47 different files. I think that would be a bit too much. Also I am using Notepad (though I am sure there is something better, like Notepad++ or whatever), so there is no reliable way that I know of to split it into a certain number of records without counting each one. Similarly in ProCite, I can't seem to select all "journal articles" workforms without checking all of the boxes manually. Am I missing something obvious?
    This is how I would approach it:

    • Export ~100 items from ProCite. Import into Zotero.


    • Go through the records and make sure they look OK. If you find a bug with the import, we might be able to fix it in the translator. If it's not a very big bug, you can probably just fix it manually and complete looking over the 100 items.


    • Now you move on to the next 100 items and you don't have to ever worry about the first 100, you know they are right. Now you can use a possibly modified RIS import translator to make your next 100 items need even less tweaking.

      Keep in mind that every time you do an import from a RIS file, it will import into a new collection, so you will not be mixing your new imports with the old. If you want to delete a collection, make sure to delete all items within the collection first, and only then delete the collection. Otherwise your items will just end up in the Unfiled category.
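
    If exporting in batches from ProCite is awkward, an already exported RIS file could also be split outside ProCite. The following is a minimal Python sketch, assuming every record ends with an `ER  -` line and that the file is UTF-8 (a ProCite export may well use a different encoding, so adjust as needed). The function names and the output file naming are made up for this example.

```python
import re

# An "ER" tag line closes each RIS record; tolerate one or two spaces before the dash.
END_RE = re.compile(r"^\s*ER\s+-\s*$")

def split_records(text):
    """Split RIS text into a list of records (each record is one string)."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip() and not current:
            continue  # skip blank lines between records
        current.append(line)
        if END_RE.match(line):
            records.append("\n".join(current))
            current = []
    if current:  # trailing lines without a closing ER tag
        records.append("\n".join(current))
    return records

def chunk(records, size=100):
    """Group records into lists of at most `size` records."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def write_chunks(in_path, out_prefix, size=100):
    """Write files such as export_001.ris, export_002.ris; return their paths."""
    with open(in_path, encoding="utf-8") as f:
        records = split_records(f.read())
    paths = []
    for n, group in enumerate(chunk(records, size), start=1):
        path = f"{out_prefix}_{n:03d}.ris"
        with open(path, "w", encoding="utf-8") as out:
            out.write("\n\n".join(group) + "\n")
        paths.append(path)
    return paths
```

    Calling something like `write_chunks("library.ris", "library_part")` (both names are placeholders) would then produce one file per 100 records, each of which can be imported and checked on its own.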

    Now to answer your question about exporting a subset of items in ProCite. There are several ways you can do this.

    1) You can sort items by various fields, like Workform and Record ID. I've explained how to do this above:
    Turns out you can sort items by their item type. When you have your database open in ProCite, go to View->Configure Record List... Under Layout, you have a list of fields that display. Check an additional field to enable it, then in the drop-down box scroll to the top and find Workform. You can also change the title to whatever you want. Now you can sort by item type. This may also be used to sort by most other fields.
    Once you have the items sorted, you can click on the first item in the range, then, while holding down Shift key, click on the last item in the range and it will select all the items in between. Now just click the Mark Selected button. You can proceed to exporting marked items.

    2) You can use the search feature (Search tab on the bottom of the database window). It allows you to search for certain values in almost every field. E.g. Record ID>20 AND Record ID<120 will return records from 21 to 119. They have a list of fields and operators available from the menu, so it should be easy to do what you want.

    This also answers your other question:
    Is there a reliable way to count how many of each type of workform I have in the library?
    Use the search and look for Workform. Take a look at the "Insert Field", "Operators", and "Insert Term" buttons towards the top.
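
    As a cross-check outside ProCite, the record types in an exported RIS file can also be tallied with a few lines of Python. This is only a sketch: it counts RIS `TY` codes, not ProCite workforms, so workforms that export to the same code (for example Book Long Form and Book Short Form both exporting as BOOK) are counted together. The function name `count_types` is invented for the example.

```python
import re
from collections import Counter

# Captures the type code from lines such as "TY - CHAP".
TY_RE = re.compile(r"^\s*TY\s+-\s*(\S+)")

def count_types(text):
    """Count how many records of each RIS type appear in the given RIS text."""
    counts = Counter()
    for line in text.splitlines():
        m = TY_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = """TY - BOOK
T1 - First book
ER -

TY - CHAP
T1 - A chapter
ER -

TY - BOOK
T1 - Second book
ER -
"""
for ris_type, n in sorted(count_types(sample).items()):
    print(ris_type, n)
```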
  • Sorry for the double post.
    I made a tutorial of what we have currently (perhaps the translator will change locations or be updated elsewhere). Hope it is helpful to someone. I thought it might be nice to summarize the process. I tried making it into a webpage hosted on dropbox, but I can't get the images to load, and the code is a bit beyond me.

    PDF of tutorial: http://dl.dropbox.com/u/19141190/Tutorial Pro to Zot/TutorialProcitetoZoterobyMulkerin.pdf
    This looks great. A few points I want to make.

    The location of the translator will most definitely change.

    Book Long Form is actually often meant to be used as a chapter, but can be used as a book as well. ProCite gives a LOT of flexibility for deciding which type to use for different citations, which is not necessarily a great thing. So changing Book Long Form from CHAP to BOOK is not always desirable.

    At the end of all of this, it might make sense to post a similar tutorial on Zotero's wiki. Perhaps you would be willing to transfer your tutorial over there. We'll see how much tweaking will be needed once we're done with this.
  • Ok, I will try this again a bit later. Thanks for the help.

    I would be willing to post to the wiki.
  • I imported Book Chapters only, and modified T2 to T1 for the first title, which worked.

    However, it is not importing Series Title at all. It is in N1 below, but does not even transfer into notes in Zotero. It just disappears. What should I change it to so it transfers?

    Thanks!

    TY - CHAP
    N1 - 6903
    A1 - Baines, John
    A1 - Yoffee, Norman
    T1 - Order, legitimacy, and wealth: setting the terms
    N1 - Connective Phrase: In
    A2 - Richards, Janet
    A2 - Van Buren, Mary
    N1 - Author Role: editors
    T2 - Order, Legitimacy, and Wealth in Ancient States
    RP - In File
    CY - Cambridge
    PB - Cambridge University Press
    PY - 2000
    SP - 13-17
    N1 - Series Title: New Directions in Archaeology
    N1 - Notes: have, seen
    KW - empire
    KW - imperialism
    KW - ideology
    KW - social organization
    ER -
  • series title should be T3, but wait for Aurimas - he may want to change this in the translator instead.
  • I think all the associated fields have problems mapping, e.g. series volume. Changing it to T3 worked for the title, but the "Series Title:" label also transferred, so that in Zotero we have "Series: Series Title: New Directions in Archaeology." This could cause problems with citations.

    N1 - Series Editor Role: <31 Series Editor Role>|
    N1 - Series Title: <32 Series Title>|
    N1 - Series Volume ID: <33 Series Volume Identification>|
    N1 - Series Issue ID: <34 Series Issue Identification>|
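
    One way an importer could deal with these labeled N1 lines is to strip the label and route the value to a field, keeping everything else as a note. The Python sketch below only illustrates that idea and is not how the RIS translator is written; the label-to-field mapping and the function name `route_note` are assumptions chosen for the example.

```python
# Assumed mapping from ProCite note labels to item fields (illustrative only).
LABEL_TO_FIELD = {
    "Series Title": "series",
    "Series Volume ID": "seriesNumber",
    "Call Number": "callNumber",
}

def route_note(value, item):
    """Store a labeled N1 value in its mapped field, or keep it as a note."""
    label, sep, rest = value.partition(":")
    field = LABEL_TO_FIELD.get(label.strip())
    if sep and field and rest.strip() and field not in item:
        # Known label: store the value without the "Label:" prefix.
        item[field] = rest.strip()
    else:
        # Unknown label, no label, or field already filled: keep the full text.
        item.setdefault("notes", []).append(value.strip())
    return item

item = {}
for n1 in [
    "19043",
    "Series Title: Bonn Contributions to Asian Archaeology",
    "Series Volume ID: 4",
    "Notes: have, seen",
]:
    route_note(n1, item)
print(item)
```

    Because a field that is already set is never overwritten, a value that loses out (for instance when T3 has already supplied the series) stays visible in the notes rather than disappearing.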
  • Volume ID in Book Section is also not transferring at all. It disappears.

    TY - CHAP
    N1 - 10763
    A1 - Perttula, Timothy K.
    T1 - European Contact and Its Effects on Aboriginal Caddoan Populations Between A.D. 1520 and A.D. 1680
    A2 - Thomas, David Hurst
    N1 - Author Role: editor
    T2 - Columbian Consequences: The Spanish Borderlands in Pan-American Perspective
    CY - Washington, D.C.
    PB - Smithsonian Institution Press
    PY - 1991
    N1 - Volume ID: 3
    SP - 501-518
    N1 - Notes: have, seen
    KW - Caddoan arch.
    KW - culture contact
    ER -
  • Error:
    Book Long Form:

    "Title, monographic" is not transferring. The "Record ID" is also not going into the notes. Can you make all record IDs transfer into notes, to make it easier to cross-check for errors? Thanks very much.

    TY - BOOK
    N1 - Record ID: 1
    A1 - Abel, Annie H.
    N1 - Author Role: editor
    T2 - Chardon's Journal at Fort Clark, 1834-1839
    CY - Pierre
    PB - South Dakota State Department of History
    PY - 1932
    N1 - Notes: seen
    KW - Arikara
    KW - ethnohistory
    KW - fur trade
    KW - Upper Missouri
    ER -

    TY - BOOK
    N1 - Record ID: 2
    A1 - Abel, Annie H.
    N1 - Author Role: editor
    T2 - Tabeau's Narrative of Loisel's Expedition to the Upper Missouri
    CY - Norman
    PB - University of Oklahoma Press
    PY - 1939
    N1 - Notes: have, seen
    KW - Arikara
    KW - ethnohistory
    KW - Upper Missouri
    ER -
  • Sorry, I had a typo. Not sure how it got through in the first place, but almost none of the labeled items in notes would have transferred.

    It's fixed now, please try again.
  • Hold one second, I'll fix the IDs
  • Hi Aurimas, I think we were posting at the same time. Did you see the recent posts above as well?

    Thanks. Let me know when I should re-download the translator.
  • oops and again. I will wait.
  • Ugh, the monographic title is a bit of a pain to fix. You'll have to wait a bit until I take care of those. I'm pretty sure book title for BOOK type should be in T1 or TI though. You could just change that in the ProCite export file if you don't want to wait.

    Otherwise the ID and the typo are fixed so you can update

    I'll come up with something to make sure titles are mapped right later in the day.
  • Thanks, I will wait till later then to re-upload. I appreciate it!
  • Lower priority:

    "Book short form" error:

    Call number does not map to the appropriate Zotero field. Once fixed, it will probably also duplicate the label, as with "series: series title: name" above, since the field contains the "Call Number:" label. See the example below.

    TY - BOOK
    N1 - Record ID: 11353
    A1 - Blaut, J. M.
    T1 - The Colonizer's Model of the World: Geographical Diffusionism and Eurocentric History
    RP - Not in File
    CY - New York
    PB - Guilford Press
    PY - 1993
    N1 - Call Number: D 16.9 B49 AFA
    KW - colonialism
    KW - culture contact
    ER -
  • Sorry for the delay. The updated translator is up. https://raw.github.com/aurimasv/translators/RIS/RIS.js

    I would caution to pay extra attention to all titles (i.e. title, book title, publication title, series title). I think these should import as expected, but I'm not 100% sure. There will be a problem if let's say for a book chapter, the book title is given, but the chapter title is omitted. This was the case in a couple of your example records in the first few posts on this thread. I'm hoping that those have been converted to books now with the change in Book Long Form export. If they have, titles should import just fine. Otherwise, the "book title" will be set as chapter title.

    We were also not handling some fields (e.g. "RP - Not in File") and these were being discarded. They are now stored in notes.

    @adamsmith The way I ended up handling titles is basically by not trusting title levels (i.e. T1 vs T2 vs T3). I push such titles, as they appear in the file, onto their individual stacks. Then, at the end of the record, I concatenate these and assign them to title, book title, and series title in order, without overwriting anything that may have been set by fields like TI, JF, BT, etc. Can you think of some cases where this may break? I pointed out one case above, but I feel that these kinds of cases are just bad records and we can't deal with everything that is poorly formatted.
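
    For readers following along, the stack approach described above might look roughly like this. It is a Python sketch written from the description, not the code in RIS.js; the per-type slot order in `SLOTS`, the handling of only `TI` as an explicitly set field, and the name `map_titles` are all assumptions for illustration.

```python
import re

TAG_RE = re.compile(r"^\s*([A-Z][A-Z0-9])\s+-\s?(.*)$")

# Assumed slot order per RIS type; not taken from the actual translator.
SLOTS = {
    "CHAP": ["title", "bookTitle", "series"],
    "BOOK": ["title", "series"],
}
# Tags treated as setting a field directly (only TI here, to keep the sketch short).
EXPLICIT = {"TI": "title"}

def map_titles(lines):
    """Assign T1/T2/T3 values to title slots in order of appearance."""
    item = {}
    stacks = {"T1": [], "T2": [], "T3": []}
    ris_type = None
    for line in lines:
        m = TAG_RE.match(line)
        if not m:
            continue
        tag, value = m.group(1), m.group(2).strip()
        if tag == "TY":
            ris_type = value
        elif tag in stacks:
            stacks[tag].append(value)
        elif tag in EXPLICIT:
            item[EXPLICIT[tag]] = value
    # Concatenate the stacks, then fill only the slots that are still empty.
    pending = stacks["T1"] + stacks["T2"] + stacks["T3"]
    free = [s for s in SLOTS.get(ris_type, ["title"]) if s not in item]
    for slot, value in zip(free, pending):
        item[slot] = value
    return item

book = [
    "TY - BOOK",
    "T2 - Yuman Pottery Making",
    "T3 - San Diego Museum Papers",
    "ER -",
]
print(map_titles(book))
```

    In this sketch a chapter record that carries only the book title gets that title assigned to the chapter title slot, which is the bad-record case mentioned above, and any titles beyond the available slots are silently dropped.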
  • I didn't check through the code super thoroughly but it makes sense to me, don't see how this would break anything.
  • Thanks Aurimas, I appreciate it. You have done a great job. I understand how wonky a lot of these fields are, and how un-standardized. I am still in process of uploading with the new translator today. I found a few things so far:

    Book Long Form:
    Series Title and Series Volume ID did not transfer to the correct fields in Zotero. Volume ID was transferred to the notes, and so preserved but series title was lost.

    TY - BOOK
    N1 - Record ID: 1748
    A1 - Rogers, M. J.
    T2 - Yuman Pottery Making
    CY - San Diego
    PY - 1936
    T3 - San Diego Museum Papers
    N1 - Series Volume ID: No. 2
    N1 - Notes: referenced in Ceramics and Man, F.R. Matson, ed. Aldine Pub, Chicago p. 61, 1965
    KW - ceramics
    ER -

    Many others in Book Long form have worked perfectly though, so not sure why that happened.

    This one (book section type) was almost perfect, but didn't transfer series volume ID to the right field. It did get preserved in the notes field though, which was good.

    TY - CHAP
    N1 - 19043
    A1 - Brosseder, Ursula
    T1 - Xiongnu terrace tombs and their interpretation as elite burials
    N1 - Connective Phrase: In
    A2 - Bemmann, Jan
    A2 - Parzinger, Hermann
    A2 - Pohl, Ernst
    A2 - Tseveendorzh, Damdinsüren
    N1 - Author Role: editors
    T2 - Current Archaeological Research in Mongolia: Papers from the First International Conference on "Archaeological Research in Mongolia"held in Ulaanbaatar, August 19th-23rd, 2007
    CY - Bonn
    PB - Bonn University Press
    PY - 2009
    SP - 247-280
    N1 - Series Title: Bonn Contributions to Asian Archaeology
    N1 - Series Volume ID: 4
    KW - archaeology
    KW - mortuary analysis
    KW - Mongolia
    KW - empire
    KW - pastoralism
    ER -

    Thanks!
  • Same error with Book Long Form: Series Title and Series Volume ID are not transferring to their fields, though the series volume does show up in notes.

    TY - BOOK
    N1 - Record ID: 96
    A1 - Baerreis, David A.
    A1 - Dallman, John E.
    T2 - Archaeological Investigations Near Mobridge, South Dakota
    PB - Society for American Archaeology & University of Wisconsin Press
    PY - 1961
    T3 - Archives of Archaeology
    N1 - Series Volume ID: No. 14
    N1 - Notes: have, seen; I have on fiche. Includes Spiry-Eklo (39WW3) and Bamble (39CA6).
    KW - Arikara archaeology
    ER -

    Should I change anything in the RIS on the procite end?
  • Updated to not drop series titles and include series volume ID (under series number)
  • Thanks Aurimas! that seems to work.

    Then just a minor thing (same as what was already done with book short form).. in the series title for book chapters, "series title" gets transferred into the field, i.e. "Series: Series Title: Bonn Contributions to Asian Archaeology"

    TY - CHAP
    N1 - 19043
    A1 - Brosseder, Ursula
    T1 - Xiongnu terrace tombs and their interpretation as elite burials
    N1 - Connective Phrase: In
    A2 - Bemmann, Jan
    A2 - Parzinger, Hermann
    A2 - Pohl, Ernst
    A2 - Tseveendorzh, Damdinsüren
    N1 - Author Role: editors
    T2 - Current Archaeological Research in Mongolia: Papers from the First International Conference on "Archaeological Research in Mongolia"held in Ulaanbaatar, August 19th-23rd, 2007
    CY - Bonn
    PB - Bonn University Press
    PY - 2009
    SP - 247-280
    N1 - Series Title: Bonn Contributions to Asian Archaeology
    N1 - Series Volume ID: 4
    KW - archaeology
    KW - mortuary analysis
    KW - Mongolia
    KW - empire
    KW - pastoralism
    ER -
  • I'm not sure I understand what is not working right. The entry you posted imports correctly on my end. I might be overlooking something.