Importing EndNote files into Zotero

bbolker · November 6, 2008

mark: Good luck with that! It would probably be a good idea
if people didn't explain exactly which platform they were trying
to export to -- I can imagine that Thomson/Reuters is not exactly
falling over themselves to facilitate data transfer to Zotero at
the moment ...

SimonCropper · November 6, 2008

Contacting Thomson/Reuters to talk about how I can export my data and stop using their product is unlikely to elicit a response.

Follow-up questions...

1. Is there a functional limit to how many references a 'collection' can have?
2. Does increasing the number of references increase the load on memory?
3. Is there an ideal format for importation into Zotero? I notice in the export dialog there is a Zotero RDF option. What format is this in? Why isn't there a comparable option in the import dialog?
4. Endnote allows you to create an output style. Is there documentation anywhere of the 'ideal' import format (that is, a 1:1 list of data/type:fields/type) so such a style can be created? Once this style was created, the issue of importing Endnote bibliographies would disappear.

noksagt · November 6, 2008

As above, their export is broken enough that it does not allow other EndNote users to import data. Complaints about defects in the export format are perfectly valid & proper export is needed for more than just a migration to a particular competitor's application. Such complaints may still fall on deaf ears--they have for years.

Re. 3:
Zotero does allow you to import RDF. The Zotero RDF uses a vocabulary that is used by few (if any) applications outside of Zotero. It can probably carry more of Zotero's data through an export/import round trip than the other formats can.

I am a big fan of MODS XML right now, as it is a rich standard that others already do use.

There are efforts to make an RDF ontology that consider the needs of programs and users other than Zotero.

The best format that EndNote exports and Zotero imports is probably RIS currently.

Re. 4:
EndNote's data and export templates are fundamentally broken enough not to allow something that is completely satisfactory. A MODS XML export from EndNote would be very useful, but it seems impossible to implement a filter that would make perfect MODS XML.

EndNote does have their own XML schema (they actually have a few conflicting versions (different in import & export and different across different versions) and their XML sometimes does not validate). An EndNote XML importer for Zotero is an open item in trac.

SimonCropper · November 6, 2008

Thanks for the background. I have been playing around with Zotero and came to the conclusion that the only export format that imports entirely is Zotero RDF.

I created a book and book section item in an empty library then exported and reimported the two dummy references. The data in the references were just descriptions of the field name. I was quite surprised.

Quickly...
MODS Export -- lost data in fields # of Volumes, Language, Short Title, Repository and Extra, and lost all the attachments (URLs to files or websites).
Refer/BibIX -- totally corrupted the records, with an extra webpage reference being created from the book reference.
RIS Import -- lost data in fields series number, # of Volumes, edition, language, call number, loc. in archive, repository, rights and extra, and lost all attachments (URLs to files and websites)
Zotero RDF -- author full names were transferred to last names, lost URLs to local files, URLs to websites retained.

On inspection of the RDF file created, it is obvious that it is not going to be easy for anyone outside Zotero to create a translator of any type.

I am wondering if a modified RIS format could be accepted as an import option (called Zotero RIS), which has the missing tags present to make a complete Zotero record. This information could then be used to make a modified Endnote Output Style. For example, RE - for repository. RI - for rights, etc.

Another difference worth noting between Endnote and Zotero is how the data is stored. In Endnote, if you change between reference types, say from a book to an article, fields are just renamed and the data stays intact. Change it back and all the original data returns. In Zotero, if you change to a reference type with fewer fields, you get a warning about the potential loss of data in the fields that are no longer required.

For the record: I am using XP Pro SP3, FireFox 3.0.3 and Zotero 1.0.7.

noksagt · November 6, 2008

the only export format that imports entirely is Zotero RDF.

Even this imports with subtle differences, which is why you are encouraged not to rely on it for backups. You commented on some such differences, yourself.

Many of the issues with MODS XML can be improved--the format is rich & has support for many of those things.

It is unclear what issues you had with REFER. However, this & RIS & BibTeX are rather limited formats.

RIS does not support several of the fields that you had problems with. Expanding RIS seems like a bad idea to me. It isn't an extensible format. If we really need a workaround for one of the flat file formats, we can put formatted information into one of the user-definable fields (e.g. 'U5 - <rights>cc-by-sa-3.0</rights>' or a similar way of using an existing field).
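
To make the workaround concrete, here is a minimal Python sketch of packing extra fields into a single user-definable RIS line and reading them back. The function names (`pack_extras`, `unpack_extras`) and the choice of U5 are assumptions for illustration only; nothing in Zotero or EndNote reads this convention by itself, and values containing angle brackets are not escaped.

```python
import re
from typing import Dict

# Matches <name>value</name> pairs; the closing tag must repeat the opening name.
_PAIR = re.compile(r"<(\w+)>(.*?)</\1>")


def pack_extras(extras: Dict[str, str], tag: str = "U5") -> str:
    """Build one RIS line carrying several extra fields as <name>value</name> pairs."""
    body = "".join(f"<{name}>{value}</{name}>" for name, value in extras.items())
    return f"{tag}  - {body}"


def unpack_extras(line: str) -> Dict[str, str]:
    """Recover the name/value pairs from a line produced by pack_extras."""
    _, _, body = line.partition("- ")
    return {name: value for name, value in _PAIR.findall(body)}
```

For example, `pack_extras({"rights": "cc-by-sa-3.0"})` yields a line of the same shape as the U5 example above, and `unpack_extras` turns it back into a dictionary.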

EndNote's "hidden" data actually has had many long-term complaints. I wouldn't want it in the Zotero interface & it doesn't matter for data migration, I think.

SimonCropper · November 6, 2008

I agree with your comments. Personally I am just trying to transfer 1000s of references to Zotero in a way that avoids inserting unknown errors. Once done I will give EndNote the flick, and my transition to OpenOffice will be complete.

As for the number of references that can be imported in one go: I have successfully imported 1000 without generating an error, so I will be cutting my EndNote library up into smaller bits for importation. Thanks everyone for your input.
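
If the library has already been exported as one large RIS text file, the cutting-up could also be scripted. The following is only a sketch under stated assumptions: every record ends with an `ER  -` line, the file is UTF-8, and the names `split_records`, `chunk` and `write_chunks` (and the `_partNNN` file naming) are hypothetical helpers, not part of EndNote or Zotero.

```python
import re
from pathlib import Path
from typing import List

# An RIS record is assumed to end with a line such as "ER  - ".
END_OF_RECORD = re.compile(r"^ER\s*-")


def split_records(text: str) -> List[str]:
    """Split RIS text into a list of records, one string per record."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip() and not current:
            continue  # skip blank lines between records
        current.append(line)
        if END_OF_RECORD.match(line):
            records.append("\n".join(current))
            current = []
    if current:  # trailing record that lacks an ER line
        records.append("\n".join(current))
    return records


def chunk(records: List[str], size: int = 1000) -> List[List[str]]:
    """Group records into consecutive blocks of at most `size` records."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def write_chunks(path: str, size: int = 1000) -> List[Path]:
    """Write each block next to the source file as <name>_partNNN<ext>."""
    src = Path(path)
    records = split_records(src.read_text(encoding="utf-8"))
    written = []
    for n, group in enumerate(chunk(records, size), start=1):
        out = src.with_name(f"{src.stem}_part{n:03d}{src.suffix}")
        out.write_text("\n".join(group) + "\n", encoding="utf-8")
        written.append(out)
    return written
```

Calling `write_chunks("library.ris")` (the file name is just an example) would then leave a series of smaller files that can be imported one at a time.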

purpledoodlebug89 · November 11, 2008

I tried exporting from Endnote 9 (Mac) to BibTeX in text file format and then importing into Zotero as it says on the website. I lost all diacritical marks, and anything that comes out in the text file as @incollection (that is, book sections) imports with only an author.

I'm afraid that there will be other data input errors, and as I am beginning my thesis, I really can't afford dumb mistakes that make it look like I don't know what I'm doing. Exactly how reliable is this import at the moment? With thousands of citations, I'm really not looking forward to importing by hand. (i.e. that will have to occur after the dissertation if at all)

Tjowens · November 13, 2008

First you might try using RIS instead of BibTeX. Otherwise you might check to make sure that you have the most up to date version of the BibTeX export format from Endnote.

The next step beyond this would be to open up the exported file in a text editor and make sure it looks OK. If you are still having trouble with the file there is a good chance you can massage it a little bit with a few thoughtful uses of find and replace.

Lots of Zotero users have transferred over from Endnote, many with thousands of references in their collections. Generally if you are willing to fuss with it a little bit you can get a very clean transfer.

purpledoodlebug89 · November 13, 2008

Thanks. I had someone recommend exporting using the Endnote export format instead of BibTex. That did the trick.

SimonCropper · December 3, 2008

This is a summary of how I eventually imported my Endnote Library into Zotero, and some general observations on the use of Zotero after using Endnote for many years.

General observations...
1. The generic 'Refman (RIS) Export' file provided with EndNote 5 does not result in a clean import into Zotero (I suspect different versions of the RIS Format Specifications have existed over time, but these versions have not been labelled as such). I searched the Endnote Style repository on the internet for the latest version of the file and found that this style is no longer available. I proceeded to modify the file I had based on the latest RIS Format Specifications.
2. After doing this and conducting some trial imports, I found that not all the tags specified in the RIS Format Specifications are recognised by Zotero. I then started generalising until I obtained the maximum data transfer into Zotero Fields that made sense to me. Issues included retaining my repository data (e.g. library, reference collection, project file) and accession codes (e.g. W2345, R459, Y236 respectively) and importation of my keyword data and abstracts. I found that the keyword -> tags mapping was unsatisfactory and messy, and pushed this data, along with the abstracts, into Zotero's Notes Section. I used the Zotero Abstract Field to store the RepositoryLocation:AccessionCode combination for lack of an alternative field to which imported RIS data could be sent. As Zotero functionality improves, I am hoping that a field copy function will be developed that will allow me to transfer this data into the repository field available in Zotero (but to which RIS import data cannot be directed). Personally I would like to see a Zotero 'text-based' RIS-like import file format where there is a TAG available for each field and reference type. People could then send their data to this text-based file using any one of a multitude of programs, e.g. Endnote, databases (e.g. FoxPro, Access), Notepad, Word, etc.
3. I found that my Endnote library was best cut up into blocks of about 1000 references. Even with a reasonably fast computer this import took about 20 minutes.
4. With such a large dataset I find the use of the 'search' field very irritating. This field acts more like a dynamic filter; type 'a' and it filters the data to only show those with 'a', type 'ab' it only shows references with 'ab'. Type a name and the system shuts down for 5-10 minutes until the filter has caught up. The 'advanced search', represented by the magnifying glass is much better. This facility appears to search various indexes and produces good results in seconds and allows you to store the results as a type of collection.
5. Compared to Endnote, Zotero is a breeze at getting reference data available on the internet into its database. Coupled with a NewsReader to let me know when the latest issue of the key journals that I read are released - it has become a quick and easy routine of checking my feeds followed by reviewing the reference lists that appear and clicking on the icon in the navigation field for any reference that I wish to save. It is as easy as that. Of the various places I visit I have only encountered problems importing data this way 5% of the time. In most cases the problem centres around importing collections - if I drill down to the individual reference the import works.
6. Endnote Styles appear to provide the user with greater control of the output (combination of the huge variety of scientific styles available plus the availability of an inbuilt style editor). CSL styles are OK but biased towards the humanities and social sciences. Very few scientific styles are available. Of four journals I intend to publish articles in 2008 and 2009, I am having to create CSL style sheets in XML for every journal. This adds an unexpected overhead to paper preparation. CSL/XML also represents a huge learning curve for the uninitiated.
7. For people trying to get their Endnote Data into Zotero I have provided the format of my Endnote Style Sheet below for the key reference types recognised by Zotero Bibliographic Style Sheets.


Generic
`TY  - `GEN|`
AU  - `Author|`
PY  - `Year|`
BT  - `Secondary Title|`
ED  - `Secondary Author|`
CT  - `Title|`
CY  - `Place Published|`
PB  - `Publisher|`
T3  - `Tertiary Title|`
A3  - `Series Editor|`
ET  - `Edition|`
SP  - `Pages|`
Y2  - `Date|`
SN  - `ISBN/ISSN|`
N1  - `Notes|`
N1  - `Abstract|`
N1  - `Keywords|`
N2  - `Accession Number|`
VL  - `Volume|`
UR  - `URL

Journal Article
`TY  - `JOUR|`
AU  - `Author|`
PY  - `Year|`
TI  - `Title|`
SP  - `Pages|`
JF  - `Journal|`
VL  - `Volume|`
IS  - `Issue|`
N1  - `Notes|`
N1  - `Abstract|`
N1  - `Keywords|`
N2  - `Location|: Accession Number|`
UR  - `URL

Book
`TY  - `BOOK|`
AU  - `Author|`
PY  - `Year|`
BT  - `Title|`
CY  - `City|`
PB  - `Publisher|`
SP  - `Number of Pages|`
T3  - `Series Title|`
ED  - `Series Editor|`
ET  - `Edition|`
VL  - `Volume|`
Y2  - `Original Publication|`
SN  - `ISBN|`
N1  - `Notes|`
N1  - `Notes|`
N1  - `Keywords|`
N2  - `Location|: Accession Number|`
VL  - `Volume|`
UR  - `URL

Book Section
`TY  - `CHAP|`
AU  - `Author|`
PY  - `Year|`
BT  - `Book Title|`
ED  - `Editor|`
CT  - `Title|`
CY  - `City|`
PB  - `Publisher|`
ET  - `Edition|`
VL  - `Volume|`
T3  - `Series Title|`
SP  - `Pages|`
N1  - `Notes|`
N1  - `Notes|`
N1  - `Keywords|`
N2  - `Location|: Accession Number|`
VL  - `Volume|`
SN  - `ISBN|`
UR  - `URL

bdarcus · December 3, 2008

6. Endnote Styles appear to provide the user with greater control of the output (combination of the huge variety of scientific styles available plus the availability of an inbuilt style editor).

Yeah, but that's not the styles; it's what Endnote built on top of them.

I'm biased, but I'm not the only one who believes CSL is much better designed than Endnote's style system purely from the styling standpoint.

CSL styles are OK but biased towards the humanities and social sciences. Very few scientific styles are available. Of four journals I intend to publish articles in 2008 and 2009, I am having to create CSL style sheets in XML for every journal. This adds an unexpected overhead to paper preparation. CSL/XML also represents a huge learning curve for the uninitiated.

True. It's something I think many of us hope and expect will be addressed in time. Ideally, we get to a point where a user can just do a few clicks to create a new style.

bdarcus · December 3, 2008

BTW, some problems with Endnote export are a consequence of really old (and AFAIK unfixed) bugs in Endnote.

mark · December 4, 2008

Great overview, Simon.

With such a large dataset I find the use of the 'search' field very irritating. This field acts more like a dynamic filter; type 'a' and it filters the data to only show those with 'a', type 'ab' it only shows references with 'ab'. Type a name and the system shuts down for 5-10 minutes until the filter has caught up

Start your search with " (double quote) to have Zotero wait until you have finished typing and hit Enter. BTW, shameless blog plug: I wrote up 12 Zotero tips and tricks last week; maybe there's more there that you'll find helpful.

Of four journals I intend to publish articles in 2008 and 2009, I am having to create CSL style sheets in XML for every journal.

Do note that with the upcoming 1.5 version you'll be able to use .ens output styles from your legal copy of EndNote to style citations. (The Sync Preview already works like that, but I wouldn't recommend migrating before the official version comes out.)

SimonCropper · December 4, 2008

Bruce,

You stated...

I'm biased, but I'm not the only one who believes CSL is much better designed than Endnote's style system purely from the styling standpoint.

At present, I am unclear about my opinion of CSL. As you may have noticed, I stated "Endnote Styles appear to provide the user with greater control" - emphasis on "appear". I am still fiddling with XML, creating a range of new styles, and will reserve judgement until I have fully explored CSL functionality.

One thing I have noticed is that most styles in Zotero only have 3 output styles - BOOK, CHAPTER and DEFAULT (formatted to present as a JOURNAL entry). This is very limited in my view. I have been told in other parts of the forum that this system captures most situations but I am yet to be convinced. I have a wide variety of webpages, reports, conferences, unpublished manuscripts, maps, CD Software, Computer Programs and Legislation that have quite specific data that needs to be presented in a set way in a bibliography. Having all these items presented like a journal is limiting. The problem though does not appear to be in the ability of CSL/XML to render the references only that people don't wish to program for all these reference types.

Simple Styles, as found in the Zotero Style Repository, already appear complicated, especially without any explanatory comments in the code. How is a style with all types of references accounted for going to look and be maintained? Questions I am sure that would spark a debate in other sections of the forum.

Suffice to say I have not finalised my judgement on this part of Zotero until I have fully explored its functionality. I will post my views in the appropriate spot once I have finished and submit my styles to the repository when they are complete.

noksagt · December 4, 2008

One thing I have noticed is that most styles in Zotero only have 3 output styles - BOOK, CHAPTER and DEFAULT (formatted to present as a JOURNAL entry).

It is perfectly possible to make a completely different style for more reference types in CSL. It is not desirable to do so, though--most bibliographic formats have only slight variations due to reference type, and CSL is able to reflect this simplicity (whereas .ens cannot).

This is very limited in my view. I have been told in other parts of the forum that this system captures most situations but I am yet to be convinced.

Given that you are free to add more types, why are you unconvinced? You can make .ENS-like citation files if you really wanted to.

The problem though does not appear to be in the ability of CSL/XML to render the references only that people don't wish to program for all these reference types.

Or that these other formats are undocumented. Or that the style authors do not consider them to be unique. It is hard to discuss styles in general. I see your initial complaint about the lack of scientific styles and your subsequent complaint about separate ways to cite legislation as inconsistent pragmatically. What scientific journals specify how to cite legislation?!

In any case, you are free to either fix particular styles or cite documentation for a style that indicates that the CSL-file may need more type-specific handling & these issues can be addressed.

How is a style with all types of references accounted for going to look and be maintained?

No worse than the ugly bloatedness of a similar EndNote style...

bdarcus · December 4, 2008

One thing I have noticed is that most styles in Zotero only have 3 output styles - BOOK, CHAPTER and DEFAULT (formatted to present as a JOURNAL entry). This is very limited in my view. I have been told in other parts of the forum that this system captures most situations but I am yet to be convinced.

My argument is that the best, most robust, styles are typically not going to have any type conditional logic in the citation or bibliography elements, but will restrict those to macros.

I have a wide variety of webpages, reports, conferences, unpublished manuscripts, maps, CD Software, Computer Programs and Legislation that have quite specific data that needs to be presented in a set way in a bibliography.

I'd urge you to reassess that assumption. I know there are exceptions where one does need to tweak output for some types, but I don't think it's common. A good macro can go a really long way. And macros are a feature that Endnote's styling system doesn't have, the last I recall.

See some design notes of mine from a while back.

Having all these items presented like a journal is limiting.

It's not that simple. In CSL, all types have one of three fallback types: article, book, and chapter. These base types correspond to the structural characteristics I describe in that link above. So when, say, Zotero sees a record for which it does not see any particular CSL logic, it maps it to the corresponding base type. This is why, in many cases, you don't need definitions for much more than those three types.

Aside: with macros, this feature doesn't really need to be here, but has remained for what I'd call legacy reasons.

The problem though does not appear to be in the ability of CSL/XML to render the references only that people don't wish to program for all these reference types.

The more code, the longer it takes, and the more buggy it potentially is. So it makes sense to code for the common cases, and address problems as they arise.

Simple Styles, as found in the Zotero Style Repository, already appear complicated, especially without any explanatory comments in the code. How is a style with all types of references accounted for going to look and be maintained?

As I hinted above, in order to understand the design of CSL, you need to stop thinking about citation formatting through the lens of reference types. If you have styles that are a collection of smartly designed macros, with simple citation and bibliographic definitions, it becomes quite easy to maintain those styles. Indeed, that's a big part of the idea behind the macro system.

SimonCropper · December 4, 2008

Ok, it appears I have poked a bear here. Both the preceding authors have jumped on the comments I have made concerning CSL style sheets in several discussion threads on this forum. Obviously, they are passionate about this model, which is good to see.

I have, however, spent several days working through the information available on this website, and spent time downloading, reviewing and adjusting existing styles to better understand what you are trying to achieve. Independently, I have prefaced my investigation with the following assumptions:-

All references have 4 distinct data types: CREATOR, DATE CREATED, TITLE, WHERE IT CAN BE OBTAINED. In reality, this is all we (as users) really want to know. Each reference type has variants on this data.

Every journal has a list of nit-picky specifications of how they want the in-text citations and bibliographies formatted.

Following on from your comments, I take it you are suggesting that you use macros to format the CREATOR data, DATE CREATED data, etc. and only use the citation and bibliography sections to control how single objects (BOOK), parts of a single object (CHAPTER) and sections of a part of an object (ARTICLE) are presented.

One would presume that if your CSL file is structured properly you should only ever need to modify the options and formatting in the citation and bibliographic sections.

Am I on the right track?

bdarcus · December 4, 2008

Yes, you are on the right track.

But, I'd go a little further. Take a look at the bibliography section for the APA style. It is only a series of macro and variable calls (though I'm a little confused about why there's both a "container-contributors" and a "secondary-contributors" macro; they ought to be the same thing).

Or, this more complex Chicago style also shows the basic idea. There you have fragments like:

<text macro="contributors"/>
<text macro="title"/>
<text macro="description"/>
<text macro="secondary-contributors"/>

So the code here is all generic, and if there's any type or data-specific logic, it happens in those macros.

Also, on your last question, it depends. In some cases, you'd be more likely to be making tweaks to the macros.

marsh · January 15, 2009

I am trying to import bibliographic records from EndNote X. I export them using Refman (RIS) Export format to a txt file. Then I try importing them. With Zotero 1.0.9 I have no problem importing all 70 records. With Zotero 1.5 Sync Preview about 15% of all records cause an error. ("An error occurred while trying to import the selected file. Please ensure that the file is valid and try again.") Here, for example, is the data from one of the records:

TY - JOUR
AU - Martin, Ron
AU - Sunley, Peter
PY - 1998
TI - Slow convergence? The new endogenous growth theory and regional development
SP - 201-227
N1 - Jul
JF - Economic Geography
VL - 74
IS - 3
SN - 00130095
N1 - Slow convergence? The new endogenous growth theory and regional development
N1 - TY - JOUR
KW - Economics
Technology
Geography
Human capital
Economic growth
N2 - In economics, interest has revived in economic growth, especially in long-term convergence in per capita incomes and output between countries. This mainly empirical debate has promoted the development of endogenous growth theory, which seeks to move beyond conventional neoclassical theory by treating as endogenous those factors particularly technological change and human capital relegated as exogenous by neoclassical growth models.
UR - http://proquest.umi.com/pqdweb?did=34405463&Fmt=7&clientId=16241&RQT=309&VName=PQD
ID - 13875
ER -

In order to isolate this particular record as the one causing the problem, I had to "jackknife" the file, trying first to import all 70 records, then the first 35, then the first 15 or so, etc. until I isolated the problematic record(s). With 70 original records, at least twelve caused this problem with the 1.5 preview.

As if this wasn't enough trouble, I really need to use another set of references from a search yielding 181 items. Again, I used a jackknife technique and had a substantial number of records (maybe 30 or so) that imported OK. Then I was interrupted and when I returned I imported these records a second time. I've been unable to delete them. A "Find Duplicates" tool would be very helpful, but if I can't delete the duplicates it won't do me any good.

Finally, the jackknife approach works for all but the error-causing records. Nonetheless, it's incredibly time consuming. With 181 records and a 15% error rate, there will be close to 30 errors, or an average of one every 15 or so records. Since each problematic record has to be isolated individually, this will take hours.

Any help would be most appreciated. If you want copies of some of the other problematic records, I can supply some. In the first batch of 70 records, whenever an individual record would not import, I saved its text file.

Thanks.

sean · January 15, 2009

I have no problem importing that record in Zotero 1.5. What error do you get in "Report errors..." when you try?

Rintze · January 16, 2009

Finally, the jackknife approach works for all but the error-causing records. Nonetheless, it's incredibly time consuming.

Just wondering: there are quite a few reports here on how Zotero gets stuck on RIS files with non-standard entries. Wouldn't it be handier to have Zotero just skip the RIS-records it cannot digest, and just report that a number of records couldn't be imported? That would offer a faster way to find any problematic entries.

marsh · January 16, 2009

Rintze's idea is a good one. In addition, Zotero should give the record number and a few pieces of identifying information (e.g., first two lines).

Sean, I'm not sure what you mean. All I get is the text I quoted in my earlier post. Is there some place to look for more detailed information?

marsh · January 16, 2009

Just to elaborate on Rintze's idea, it would be great if Zotero tried to digest as much of the RIS file as it can. If there are errors, Zotero should say something like, "Five records could not be imported. Please see the log at xxx" where xxx is a hyperlink to a file listing the record numbers and beginning text of the records where Zotero encountered problems.

Rintze · January 16, 2009

Or perhaps Zotero could spew out a new RIS-file with the records that couldn't be parsed. That would probably be more helpful than a log.
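
Until something like that exists in Zotero, a rough pre-check outside Zotero could sort records into "looks fine" and "looks suspicious" piles before importing. The sketch below is an assumption-laden illustration, not Zotero's actual import logic: it only checks that each record starts with a TY line and ends with an ER line, and the names `split_records`, `looks_wellformed` and `partition` are hypothetical.

```python
import re
from typing import Callable, List, Tuple

# A tag line is assumed to look like "AU  - value": two characters, spaces, a dash.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-")


def split_records(text: str) -> List[List[str]]:
    """Split RIS text into records (lists of lines), closing each at an ER line."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        current.append(line)
        m = TAG_LINE.match(line)
        if m and m.group(1) == "ER":
            records.append(current)
            current = []
    if current:  # leftover lines with no closing ER
        records.append(current)
    return records


def looks_wellformed(record: List[str]) -> bool:
    """Hypothetical check: first line is a TY tag and last line is an ER tag."""
    first = TAG_LINE.match(record[0])
    last = TAG_LINE.match(record[-1])
    return bool(first and last) and first.group(1) == "TY" and last.group(1) == "ER"


def partition(
    text: str, check: Callable[[List[str]], bool] = looks_wellformed
) -> Tuple[List[str], List[str]]:
    """Return (good, bad) records as strings, judged by `check`."""
    good, bad = [], []
    for rec in split_records(text):
        (good if check(rec) else bad).append("\n".join(rec))
    return good, bad
```

The `bad` list can be written to its own RIS file for hand repair, and a stricter `check` function can be passed in as more failure patterns become known.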

SimonCropper · January 18, 2009

Looking at the record provided I can see "TY - JOUR" twice. This is the flag to record the start of a record in RIS and should not be in the N1 field, i.e. "N1 - TY - JOUR". Try removing this and importing again.
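
For a file with many such records, the stray lines could be removed in bulk rather than by hand. This is a small Python sketch that assumes the only problem is a notes line whose value is itself a "TY - ..." tag, as in the record above; `strip_nested_type_lines` is a made-up name, and whether this line is really what trips the importer has not been confirmed.

```python
import re

# Matches a notes line whose value is itself a record-start tag, e.g. "N1 - TY - JOUR".
NESTED_TY = re.compile(r"^N1\s+-\s+TY\s+-(\s|$)")


def strip_nested_type_lines(text: str) -> str:
    """Return the RIS text with every 'N1 - TY - ...' line dropped."""
    kept = [line for line in text.splitlines() if not NESTED_TY.match(line)]
    return "\n".join(kept) + "\n"
```

Ordinary notes such as "N1 - Jul" are left alone, since only lines matching the nested TY pattern are dropped.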

jbonnell · January 23, 2009

I've read through all of the posts in this thread hoping to effect a smooth transfer from Endnote X (for Mac) to Zotero. I've tried exporting my Endnote references using both RefMan (RIS) and BibTeX, but in both cases when I go to import into Zotero I get the error message "no translator could be found for the given file." Wondering if I've missed the resolution to this issue, as I notice aaronwilcher brought it up in October. Suggestions much appreciated.

robinwyatt · February 25, 2009

Hi all. My old Toshiba laptop, on which I ran EndNote X for Windows, recently died. I have since bought a Mac and have connected my old hard drive to the Mac, encased in a shell. I would like to know if it is possible to upload a). my personalised citation style and b). my library of references from the old hard drive to Zotero using my Mac. Of course, I could not open EndNote and simply export. Does anyone know how to do these two things? Thanks so much in advance!

Rintze · February 26, 2009

For a, you should be able to load EndNote styles via the Style Manager found in Zotero 1.5 beta (Zotero 1.0.x doesn't support EndNote styles). More info here: http://www.zotero.org/support/styles

For b, you will have to locate your Zotero library and copy it to the Firefox profile on your other computer. More info here: http://www.zotero.org/support/zotero_data

robinwyatt · February 26, 2009

Thanks, Rintze. But the thing is that I want to get this stuff off the EndNote files in my old hard drive. How would I do that?

Rintze · February 26, 2009

Whoops. Forgot you were using Endnote halfway through writing my post. You will need to convert your EndNote library to a format Zotero can understand (RIS seems to work best). That will require a working copy of Endnote though.