Import problems

jcw · November 19, 2008

I am trying to import an Endnote 7 database of approximately 1000 entries, including abstracts, URLs, etc. According to what others have said in this forum, it seems best to use the RIS output style and a text file. One problem I was encountering was that accented characters (like ö, é, ü etc.) were not imported correctly and showed up with question marks in Zotero. I verified that the text file created by Endnote was correctly using the accented characters, so this is definitely a problem with Zotero's import filter. Also, Greek characters (such as sigma, theta etc) and special formats like superscripts could not be transferred correctly, but this problem already showed up before the import (the text file did not contain these characters either). Text files don't seem to be the right format for such characters, but html (which did not import into Zotero) or rtf files (which contained all the right characters, but only imported them as code) should provide the necessary coding.

Another observation was that the Endnote fields 'URL' and 'date' were not properly filled in on Zotero's side, but ended up in the 'Extra' field. The Endnote field 'label' (which we use as an article identifier) was completely lost.

It seems to me that a proper import function is of central importance for the success of Zotero - I could imagine that many of my colleagues across the globe would opt for Zotero if the transition of their references from Endnote to Zotero (which is pretty much *the* standard used by everyone in the Physical Sciences) was completely seamless. However, if this requires manual re-editing of thousands of entries (incl. abstracts which often burst with special characters), nobody is going to make the switch. Which would be a shame, given all the functionality that Zotero has to offer...

noksagt · November 19, 2008

I am trying to import an Endnote 7

This version preceded their UTF-8 support, so I don't know how some characters would be interpreted. It might be the case that you would not have this issue in the branch build or in version 1.0.8 (when it comes out).
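
If the export does turn out to be in a legacy single-byte encoding rather than UTF-8, one possible workaround is to re-encode the text file before importing it. Below is a minimal Python sketch. It assumes the file is Windows-1252 (`cp1252`); that is only a guess, since nothing in this thread says which encoding EndNote 7 actually writes. The function names are made up for illustration.

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from an assumed legacy encoding and re-encode them as UTF-8."""
    # The default strict error handling raises UnicodeDecodeError on bytes that
    # are undefined in source_encoding, instead of silently replacing them.
    text = data.decode(source_encoding)
    return text.encode("utf-8")


def convert_file(src_path: str, dst_path: str, source_encoding: str = "cp1252") -> None:
    """Read src_path as raw bytes and write a UTF-8 copy to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(reencode_to_utf8(data, source_encoding))
```

If the accented characters still look wrong after conversion, the source encoding guess is probably off (Mac Roman, `mac_roman` in Python, would be another candidate to try).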

Text files don't seem to be the right format for such characters, but html (which did not import into Zotero) or rtf files (which contained all the right characters, but only imported them as code) should provide the necessary coding.

Importing a standard format that is plain-text in a markup language seems like a fairly ugly hack. Are the symbols present in EndNote XML format? There is an open ticket to support EndNote XML in the future.

Another observation was that the Endnote fields 'URL' and 'date' were not properly filled in on Zotero's side, but ended up in the 'Extra' field. The Endnote field 'label' was completely lost (which we use as an article identifier).

Past threads have indicated what fields can and can't be differentiated in RIS & how to improve EndNote's export filter and/or how to modify the exported text file before you import it into Zotero.
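
As a rough illustration of modifying the exported text file before import, the Python sketch below renames RIS tags line by line. It relies only on the general RIS line layout (a two-character tag, two spaces, a hyphen, a space, then the value). The function name and the tags in the usage example are placeholders, not a statement of what EndNote exports or of what Zotero's translator expects.

```python
import re

# A RIS line: two-character tag, two spaces, hyphen, optional space, value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def remap_ris_tags(ris_text: str, tag_map: dict) -> str:
    """Return ris_text with every tag found in tag_map replaced by its mapped tag."""
    out = []
    for line in ris_text.splitlines():
        m = RIS_LINE.match(line)
        if m and m.group(1) in tag_map:
            out.append(f"{tag_map[m.group(1)]}  - {m.group(2)}")
        else:
            # Untagged continuation lines and unmapped tags pass through unchanged.
            out.append(line)
    return "\n".join(out)
```

For example, `remap_ris_tags(text, {"LB": "N1"})` would move a hypothetical `LB` field into a notes field. Which tags actually need remapping has to be checked against your own export file.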

jcw · November 20, 2008

Thanks for your response, noksagt. I will wait until importing accented characters and special formatting is fully supported before switching to Zotero; it has taken us several years to build our Endnote library with more than 500 pages of abstract text alone, most of which was entered manually since online abstract services did not fully support special characters like Greek, Umlauts, or special formatting like superscripts and subscripts - and there is simply no time to redo the work.

The xml file correctly contains the accented characters, but not the special formatting like subscripts/superscripts or Greek letters (which were entered using Endnote's symbol font).

Again, I think that Zotero is certainly a step in the right direction, and I commend the work of all the developers. But most people don't start building reference libraries using Zotero, but with commercial software (often acquired through campus-wide licenses) - and therefore it is absolutely critical for Zotero's widespread use to make the transition as smooth and *automatic* as possible. I hope that future versions of Zotero will address this critical issue (as far as it can be addressed from the import side).

noksagt · November 20, 2008

Unfortunately, I don't see how Zotero will ever be able to import what isn't present in files output by EndNote.

Newer versions of EndNote may have (slightly) better export. You might try to use a colleague's copy of a newer version and/or use a trial copy to see if the newer export functionality would suit your needs.

jcw · November 26, 2008

Noksagt,
just to clarify this again: accented characters *were* correctly exported by Endnote, but failed to import correctly into Zotero. Also, an html file created by Endnote *contained* all the accented characters, Greek characters, and specially formatted characters, but Zotero was not able to import the file.

Of course, Zotero can't import what isn't present in files output by Endnote, but this is simply not the issue here; the problem lies with the import filter of Zotero.

Also, it would be nice if Zotero could import html files, which, as far as I can tell, constitutes a perfect transition file format since it supports all the special characters and formatting that commonly appear in a reference database.

noksagt · November 26, 2008

Also, an html file created by Endnote *contained* all the accented characters, Greek characters, and specially formatted characters, but Zotero was not able to import the file.

Can EndNote import the file? RIS and BibTeX use plain text, not HTML. I am aware of no program that can import an HTML RIS-like file & it seems hackish to me.

Jborche · November 28, 2008

First of all, I really appreciate what has been created with ZOTERO, and I would be the next to switch from our standard commercial literature program. However, as was said before:

"It seems to me that a proper import function is of central importance for the success of Zotero - I could imagine that many of my colleagues across the globe would opt for Zotero if the transition of their references from Endnote to Zotero (which is pretty much *the* standard used by everyone in the Physical Sciences) was completely seamless. However, if this requires manual re-editing of thousands of entries (incl. abstracts which often burst with special characters), nobody is going to make the switch. Which would be a shame, given all the functionality that Zotero has to offer..."

Without a proper import function, this is just not an option. I use Refman, and my library file contains some 7,000 entries. I tried exporting in RIS format, but the import into ZOTERO had a lot of mistakes, especially concerning which field from the export file ends up in which ZOTERO field. Maybe I used some Refman fields in ways they were not intended, but that is how the library was built, and I have no chance of changing it without some basic information about the RIS import format in Zotero.
Why is it not possible to provide a list of all the abbreviations used by the ZOTERO RIS import format, like the one I have from Refman ( http://www.refman.com/support/risformat_intro.asp )? This would give me the opportunity to solve the majority of my problems without losing important information.
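
A first step toward sorting out field mappings could be to list which RIS tags an export file actually uses and how often. The short Python sketch below does that. It assumes only the standard RIS line layout (two-character tag, two spaces, a hyphen), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the tag at the start of a RIS line, e.g. "AU  - ".
RIS_TAG = re.compile(r"^([A-Z][A-Z0-9])  -")


def count_ris_tags(ris_text: str) -> Counter:
    """Count how many lines start with each RIS tag."""
    counts = Counter()
    for line in ris_text.splitlines():
        m = RIS_TAG.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Calling `count_ris_tags(text).most_common()` on the contents of an export file gives a list of tags sorted by frequency, which can then be compared against whatever the importing program does with each tag.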

On the other hand, what does work perfectly from Refman to ZOTERO via RIS is the case named above of special characters (like ö, é, ü etc.). Problems do exist when italics are used in the title. In the RIS export file there are some special characters before and after the text that should be italic, and in ZOTERO it then looks like this: " perch (Perca fluviatilis) predating ". Is there any solution for this problem?

Again, many thanks to the developers of ZOTERO, you are not far away from a perfect solution. But that should include some options for data transfer which can be used by the majority of users, and not only by some experienced people that are able to build new source code.

Tjowens · December 1, 2008

I might be able to help with the first part. Even if you don't understand JavaScript, the RIS translator's code makes a considerable amount of sense. You can see which fields become what, and the comments do a nice job of explaining why Zotero does what it does. You can browse the translator source at https://www.zotero.org/trac/browser/extension/trunk/translators/RIS.js

Jborche · December 10, 2008

Hi Tjowens
thanks for your help, that is something I was looking for and it will be helpful to solve the first part of my problems.

Is there anybody who could help with the second problem:
"Problems exist when italics is used in the title. In the RIS export file there are some special characters in front and at the end of what should be italic, and in ZOTERO then it looks like this " perch (Perca fluviatilis) predating ". Is there any solution for this problem?"