RIS Export
I'm trying to bring the RIS translator up to date and get it to maintain data integrity across export/import.
Running into a few issues that I would like some input on:
(The following chart may be useful for this discussion: https://github.com/aurimasv/translators/wiki/RIS-Tag-Map)
1) Not all item types have a 1:1 RIS type mapping. Listed below are the Zotero item types and the RIS TY term each one is exported as. On import, the itemType would default to the bold entry.
letter:"PCOMM",
interview:"PCOMM",
email:"ICOMM",
instantMessage:"ICOMM",
forumPost:"ICOMM",
film:"MPCT",
tvBroadcast:"MPCT",
audioRecording:"SOUND",
radioBroadcast:"SOUND",
podcast:"SOUND",
presentation:"SLIDE"
(This might not be appropriate)
document:"GEN"
(imported as journalArticle, since it has more fields)
My thought on this is that we could do something like what ProCite does on RIS export. Immediately following the TY entry, we can include an N1 field with something like "Item Type: interview". These sorts of fields will be supported in the new RIS translator and would allow it to faithfully reconstitute the proper item types. Naturally, this is not part of the official spec, but I don't think it does any harm. Even if other software imports it as a note, it would be descriptive enough to be helpful.
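To make the N1 idea concrete, here is a rough sketch of how the round trip could work. All names here (exportTypeMap, importDefaults, exportTypeLines, importItemType) are made up for illustration and are not taken from RIS.js. The import defaults for the shared TY values are placeholders I picked; only GEN to journalArticle comes from the list above.

```javascript
// Hypothetical sketch, not the actual translator code.
// Zotero item type -> RIS TY value, as listed above.
const exportTypeMap = {
  letter: "PCOMM", interview: "PCOMM",
  email: "ICOMM", instantMessage: "ICOMM", forumPost: "ICOMM",
  film: "MPCT", tvBroadcast: "MPCT",
  audioRecording: "SOUND", radioBroadcast: "SOUND", podcast: "SOUND",
  presentation: "SLIDE",
  document: "GEN"
};

// ASSUMPTION: placeholder defaults used when no usable hint is present.
const importDefaults = {
  PCOMM: "letter", ICOMM: "email", MPCT: "film",
  SOUND: "audioRecording", SLIDE: "presentation", GEN: "journalArticle"
};

const HINT_PREFIX = "Item Type: ";

// Export: emit TY, then an N1 line carrying the exact Zotero item type.
function exportTypeLines(itemType) {
  const ty = exportTypeMap[itemType] || "GEN";
  return ["TY  - " + ty, "N1  - " + HINT_PREFIX + itemType];
}

// Import: use the N1 hint only if it names a type that maps to the same TY;
// otherwise fall back to the default for that TY.
function importItemType(lines) {
  let ty = null;
  let hinted = null;
  for (const line of lines) {
    const m = /^([A-Z][A-Z0-9])  - (.*)$/.exec(line);
    if (!m) continue;
    if (m[1] === "TY") {
      ty = m[2].trim();
    } else if (m[1] === "N1" && m[2].startsWith(HINT_PREFIX)) {
      hinted = m[2].slice(HINT_PREFIX.length).trim();
    }
  }
  if (hinted && exportTypeMap[hinted] === ty) return hinted;
  return importDefaults[ty] || importDefaults.GEN;
}
```

Checking the hint against the TY value means a stray note that merely starts with "Item Type:" cannot turn the item into an unrelated type. Other software would simply see an extra note.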
2) Some Zotero fields don't seem to have a corresponding RIS field. Here is a list of things I could not map. Some suggestions are given in parentheses based on the typical value for these RIS fields:
archive
archiveLocation
extra
libraryCatalog
rights
websiteType (M3) //blogPost, webpage
postType (M3) //forumPost
letterType (M3) //letter
presentationType (M3)
artworkMedium (M3, but that may conflict with RIS Type of Work. Are these the same?)
interviewMedium (M3, technically RIS Type for PCOMM) //interview
audioFileType
company (PB) //computerProgram
programmingLanguage (would it be ok to use LA for this?)
docketNumber (is this equivalent to callNumber? can we use CN?)
country (is this what RIS Designated States [C3] refers to?) //patent
filingDate (C1 or M2, both unassigned for patent)
forumTitle (T2, but that is designated as Series Title for ELEC, T3 is unassigned)
websiteTitle (same as above)
journalAbbreviation (J2, this is RIS Alternate Title)
//tv/radio broadcast
programTitle (T2/T3 for Series Title? These differ for audiovisual material and film/broadcast)
episodeNumber (M1) //also podcast
network (PB)
studio (PB) //videoRecording
meetingName (T2?) //presentation
We can either find fields to assign these to, or go the ProCite route and create N1 fields with descriptive labels. Assigning them to existing fields will probably increase the chances of them (accidentally) ending up in the right place when other software imports the file.
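The two options could also be combined: use an existing tag where one looks reasonable and fall back to a labeled N1 otherwise. Below is a minimal sketch of that. The function names are hypothetical, and the tag assignments are just some of the suggestions from the list above, none of them settled.

```javascript
// Hypothetical sketch: suggested tag where one exists, labeled N1 otherwise.
// The assignments are the tentative suggestions from the list, not decisions.
const fieldToTag = new Map([
  ["websiteType", "M3"], ["postType", "M3"],
  ["letterType", "M3"], ["presentationType", "M3"],
  ["company", "PB"], ["network", "PB"], ["studio", "PB"],
  ["journalAbbreviation", "J2"],
  ["episodeNumber", "M1"]
]);

// Export every non-empty field, never dropping data.
function exportFields(fields) {
  const lines = [];
  for (const [name, value] of Object.entries(fields)) {
    if (!value) continue;
    const tag = fieldToTag.get(name);
    lines.push(tag ? `${tag}  - ${value}` : `N1  - ${name}: ${value}`);
  }
  return lines;
}

// Import: recognize "N1  - fieldName: value" only for field names we know,
// so ordinary notes containing a colon are left alone.
function importLabeledNote(line, knownFields) {
  const m = /^N1  - ([A-Za-z]+): (.*)$/.exec(line);
  return m && knownFields.includes(m[1]) ? { field: m[1], value: m[2] } : null;
}
```

With this, something like archive or rights would survive a Zotero round trip as a labeled note, while company would land in PB, where other software has a chance of picking it up.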
Misc. questions:
Can manuscript have a page range? I know Zotero only supports number of pages, which makes sense, but is there a use case for a page range for manuscript?
There are some annoying "errors" in the RIS spec. For example, Y2 is Date Accessed for almost all item types, except for:
Statute, where it is Date Enacted (but in fact Date Enacted is already DA)
Web Page, where Date Accessed is M1 and Y2 is not assigned
For some reason Book Section's "Number of Volumes" is assigned to IS instead of NV, which the rest of the item types use.
There are inconsistencies for seriesTitle, which is T3 for some types and T2 for others for no apparent reason.
Editor and Series Editor are for some reason flipped for Book.
Should I just "fix" these in the Zotero translator, or stick to the spec? I think date accessed and number of volumes would be OK to fix, but seriesTitle, editor and seriesEditor may cause some problems with other software.
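One way to keep both options open would be to store the spec's odd cases as per-type exceptions on top of a uniform default, so that "fixing" is just a matter of ignoring the exceptions. This is only a sketch: tagFor, defaultTags and specExceptions are hypothetical names, and leaving statute's access date unmapped under the spec is my assumption, since Y2 is taken by Date Enacted there.

```javascript
// Hypothetical sketch: uniform defaults plus the spec's per-type exceptions.
const defaultTags = { accessDate: "Y2", numberOfVolumes: "NV" };

const specExceptions = {
  webpage: { accessDate: "M1" },          // spec puts Date Accessed in M1
  bookSection: { numberOfVolumes: "IS" }, // spec uses IS instead of NV
  statute: { accessDate: null }           // ASSUMPTION: unmapped, Y2 is Date Enacted
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// followSpec = true  -> honor the spec's exceptions
// followSpec = false -> use the "fixed", consistent mapping everywhere
function tagFor(itemType, field, followSpec) {
  const exceptions = followSpec && has(specExceptions, itemType)
    ? specExceptions[itemType]
    : null;
  if (exceptions && has(exceptions, field)) return exceptions[field];
  return has(defaultTags, field) ? defaultTags[field] : null;
}
```

On import the translator could accept both variants for these fields regardless of which one it writes, which would limit the damage either way.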
There's a whole other mess with creator types, which I will ask for help with later.
Not dropping data obviously feels like the right thing to do. But when it just results in mostly useless data that doesn't follow any spec, and that either a user would have to deal with by hand for thousands of items or another developer would be tempted to try to use, I'm not sure it's really to anyone's benefit.
I'd like to send you my modified version so that you can have a look at it and (hopefully) include the modifications in your version.
You can find my comments by searching the file for //WH:
I modified RIS.js once again and added type specific import and export of document type (M3), see
https://gist.github.com/2972868#file_ris_wh.js
The differences between the RIS.js currently distributed and my modified version are shown in the document
http://www.filefactory.com/file/yyu22y099od/n/RIS_js_File_Comparison_pdf
which was created using the free comparison tool DiffMerge from
http://www.sourcegear.com/diffmerge/downloads.php