unAPI export from DB - finding bad record?

My SafetyLit database uses unAPI to allow Zotero users to directly download records. However, sometimes there are problems with downloading multiple records using the folder icon. If I open each record on the page individually, I can download every record one-by-one.

Is there a way for me to identify the record that is causing the problem with downloading multiple records? Most of the multiple-download to Zotero attempts are successful. But I want all records available for downloading by clicking the folder icon. I have the problem narrowed to 10 records but I can't find the one that is causing the problem.
  • this may be obvious but have you looked at the debug output during the failed translation?
  • However, sometimes there are problems with downloading multiple records using the folder icon.
    Do you have examples of URIs or search terms that will always lead to the problem?
  • Yes. Other than a statement that there is an illegal XML character, I didn't find the debug output very helpful. The debug ID is: D1154641749. But that got me thinking when I found another batch of ten records with a download error. Again, each individual record imported fine. However, this time I noticed something unusual about one of the records: the DOI contains angle brackets.

    10.1367/1539-4409(2004)004<0024:PRTODV>2.0.CO;2

    I looked through the other group and found another record with a doi containing angle brackets. When I look on the publisher's website the doi indeed contains angle brackets.

    What do I need to do to my metadata so that this can be avoided?

    Thanks
  • To replicate the problem please go to:

    http://www.safetylit.org/citations/index.php?fuseaction=citations.archivesearch

    In the A line, first search field enter "intimate partner violence" (without the quotes) and select textwords+synonyms from the dropdown (instead of "author").

    (keep the Boolean at "AND")

    In the B line search field enter "SES Proxy" and select textwords+synonyms

    Limit the search to 2001-2010.

    Leave the number of records at 10

    Go to page 185.

    Thank you
  • AdamSmith / noksagt

    If I left any confusion, my "yes" (above) was to adamsmith's question about debug. I was still writing my reply to adamsmith while noksagt was writing.
  • I've updated the trunk version of the unAPI translator to ignore entries with invalid XML, and to log the invalid ID to the console. The issue here is this ID, which contains:

    <identifier type="doi">10.1367/1539-4409(2004)004<0024:PRTODV>2.0.CO;2</identifier>

    Those < and > characters inside the <identifier> tag need to be escaped.
  • Thank you. I presume that would also apply to any ampersands, yes?
  • all predefined entities I'd assume, yes
    http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML
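
To illustrate the escaping discussed above, here is a minimal sketch in Python. It is not taken from the SafetyLit or Zotero code; the function names and the record IDs are hypothetical, and it assumes each record's XML is available as a string. It escapes the DOI text before placing it inside the <identifier> element, and it also shows one possible way to find which record in a batch is not well-formed by trying to parse each one.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def identifier_element(doi):
    """Build an <identifier> element with the DOI text escaped.

    escape() replaces &, < and > with &amp;, &lt; and &gt;.
    """
    return '<identifier type="doi">' + escape(doi) + '</identifier>'


def malformed_records(records):
    """Return the IDs of records whose XML does not parse.

    `records` maps a record ID (hypothetical here) to its XML text.
    """
    bad = []
    for record_id, xml_text in records.items():
        try:
            ET.fromstring(xml_text)
        except ET.ParseError:
            bad.append(record_id)
    return bad


# The DOI from this thread, containing literal angle brackets.
doi = "10.1367/1539-4409(2004)004<0024:PRTODV>2.0.CO;2"

# Unescaped, as in the failing record.
raw = '<identifier type="doi">' + doi + '</identifier>'

# Escaped, so the element is well-formed XML.
fixed = identifier_element(doi)

# Only the unescaped version is reported as malformed.
print(malformed_records({"raw": raw, "fixed": fixed}))

# Parsing the escaped version gives back the original DOI text.
print(ET.fromstring(fixed).text == doi)
```

Running a check like malformed_records over the ten records of a failing page should point to the one with the unescaped DOI, assuming the invalid XML is the only cause of the failure.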
