Importing a BibTeX file: filename and file fields do not work with diacritic/diaeresis characters

A similar problem was reported here: https://forums.zotero.org/discussion/72082/author-name-with-diacritics-causes-paragraph-mark-in-document-libreoffice-ubuntu-bookmarks/p1. It was fixed, but it seems the right behavior was not extended to these fields.

*Problem description:*
When the value of the *file* field contains an accented character (diacritic/diaeresis), the software does not load, attach, or copy the file resource to the reference.

Report ID: 772123055

*Scope:*
2,345 files and 2,658 references. I cannot fix all of them by hand.

*System settings:*
version => 6.0.19, platform => Win32, oscpu => Windows NT 10.0; WOW64, locale => en-US, appName => Zotero, appVersion => 6.0.19, extensions => Zotero LibreOffice Integration (6.0.3.SA.6.0.19, extension), Zotero Word for Windows Integration (6.0.2.SA.6.0.19, extension)


*Steps to reproduce:*
1. Save the text below as a .bib file and import it into Zotero:
% -------------------------------------------------------------------------
% This BibTex file was generated by Qiqqa (http://www.qiqqa.com/?ref=EXPBIB)
% Friday, January 6, 2023 4:52:13 PM
% Version 3
% -------------------------------------------------------------------------


@article{kuhne2006matters
, author = {Thomas K\"uhne}
, title = {Matters of (meta-) modeling}
, journal = {Software and Systems Modeling}
, year = {2006}
, volume = {5}
, number = {4}
, pages = {369--385}
, publisher = {Springer}
, filename = {C:\\\\docs\\K\"uhne2006 - Matters of (meta-) modeling.pdf}
, file = {K\"uhne2006 - Matters of (meta-) modeling.pdf:docs/K\"uhne2006 - Matters of (meta-) modeling.pdf:application/pdf}
, tags = {MLM;ontological versus linguistic}
, keywords = {MLM;ontological versus linguistic}
}

@inbook{godel1933present
, author = {G\"odel, Kurt}
, title = {The present situation in the foundations of mathematics}
, year = {1933}
, pages = {45--53}
, volume = {3}
, journal = {Collected works}
, filename = {C:\\\\docs\\G\"odel1933 - The present situation in the foundations of mathematics.pdf}
, file = {Gӧdel1933 - The present situation in the foundations of mathematics.pdf:docs/Gӧdel1933 - The present situation in the foundations of mathematics.pdf:application/pdf}
}


*Error:*
The author, title, tags and keywords fields are working OK, which means Zotero can translate accented (diacritic/diaeresis) characters to their UTF-8 representation, e.g.:
\"o -> ӧ
\"u -> ü

But the file and filename fields are crashing when they have the same characters. It looks like the right behavior was not extended to these fields.
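
To make the expected behavior concrete, here is a minimal Python sketch of the kind of translation the logs show for the author field (`\"u` replaced with ü). The function name `decode_diaeresis` is hypothetical, only the `\"` accent is handled, and this is not Zotero's actual implementation.

```python
import re
import unicodedata

# Matches the LaTeX diaeresis accent written as {\"u}, \"{u} or \"u.
# Only the \" accent is covered; every other accent is left untouched.
_DIAERESIS = re.compile(r'\{\\"([A-Za-z])\}|\\"\{([A-Za-z])\}|\\"([A-Za-z])')


def decode_diaeresis(text: str) -> str:
    """Replace LaTeX diaeresis escapes with the precomposed Unicode letter."""
    def repl(match: re.Match) -> str:
        # Exactly one of the three groups holds the base letter.
        letter = next(g for g in match.groups() if g)
        # Base letter + COMBINING DIAERESIS (U+0308), composed via NFC.
        return unicodedata.normalize("NFC", letter + "\u0308")
    return _DIAERESIS.sub(repl, text)


print(decode_diaeresis(r'Thomas K\"uhne'))  # Thomas Kühne
print(decode_diaeresis(r'docs/K{\"u}hne2006 - Matters of (meta-) modeling.pdf'))
```

Applied to the path part of the file field, the same translation would yield a path that matches the PDF name on disk, which is the behavior this report asks for.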


*Evidence - Zotero logs:*
for the Author column:
(3)(+0000006): Translate: Replace(2) \"u in Thomas K\"uhne with ü
(3)(+0000004): Translate: Replace(2) \"u in Thomas K\"uhne with ü

for the file column:
(3)(+0000003): Translate: Replace(2) \"U in K\"Uhne2006 - Matters of (meta-) modeling.pdf:docs/K\"Uhne2006 - Matters of (meta-) modeling.pdf:application/pdf with ü
(3)(+0000000): Translate: Replace(2) \"U in KÜhne2006 - Matters of (meta-) modeling.pdf:docs/K\"Uhne2006 - Matters of (meta-) modeling.pdf:application/pdf with ü

and the reported error is:
(4)(+0000001): Translate: Adding attachment

(3)(+0000000): Zotero.Attachments.cleanAttachmentURI() is deprecated -- use Zotero.Utilities.cleanURL

(2)(+0000000): cleanURL: Invalid URI: docs/K{"u}hne2006 - Matters of (meta-) modeling.pdf

(3)(+0000000): Translate: Attempting to parse path docs/K{"u}hne2006 - Matters of (meta-) modeling.pdf

(3)(+0000000): Translate: File at C:\Users\zabalad\Downloads\dza_test_qiqqa\docs\K{"u}hne2006 - Matters of (meta-) modeling.pdf does not exist

(3)(+0000000): Translate: File at C:\Users\zabalad\Downloads\dza_test_qiqqa\docs\K{"u}hne2006 - Matters of (meta-) modeling.pdf does not exist

(3)(+0000001): Translate: File at C:\Users\zabalad\Downloads\dza_test_qiqqa\docs\K{"u}hne2006 - Matters of (meta-) modeling.pdf does not exist

(3)(+0000000): Translate: file:///docs/K%7B%22u%7Dhne2006%20-%20Matters%20of%20(meta-)%20modeling.pdf is not a file URI

(1)(+0000000): NS_ERROR_FILE_UNRECOGNIZED_PATH Exception: Component returned failure code: 0x80520001 (NS_ERROR_FILE_UNRECOGNIZED_PATH) [nsIFile.initWithPath] Zotero.File Error: Translate: Could not parse attachment path Zotero.Translate.ItemSaver.prototype._saveAttachmentFile<@chrome://zotero/content/xpcom/translation/translate_item.js:607:17 From previous event: saveItems@chrome://zotero/content/xpcom/translation/translate_item.js:250:30 Zotero_File_Interface</this.showImportWizard@chrome://zotero/content/fileInterface.js:393:5 oncommand@chrome://zotero/content/standalone/standalone.xul:1:1


Previously, the system logs showed the right translation for the character:
(3)(+0000000): Translate: Replace {\"u} in Kühne2006 - Matters of (meta-) modeling.pdf:docs/K{\"u}hne2006 - Matters of (meta-) modeling.pdf:application/pdf with ü
After printing this, it crashes when it tries to attach the file (maybe because it is not using the previously translated filename).

Thanks for your help.
  • "but the file and filename fields are crashing when they have the same characters. it looks like the right behavior was not extended to these fields."
    They are not crashing. file is a verbatim field and has different parsing rules. Qiqqa got it wrong. It's in good big-name company though, Mendeley makes the same mistake.

    With BBT installed, you can rename file or filename to files, which is parsed non-verbatim by BBT to help Mendeley users, and apparently Qiqqa users. I'll add the filename field to do the same. But file is documented, and should be exported/imported in verbatim mode. As I was doing a release tonight anyhow, I've added support for filename.

    edit: Qiqqa apparently also uses tags instead of keywords like they should have. Added tags to 6.7.50, out when the test suite passes.
  • Thanks for the quick reply @emilianoeheyns
    "With BBT installed, you can rename file or filename to files"
    It didn't work. I also edited the hidden BBT preference verbatimFields and removed file from its list of values, and that didn't work either.

    Are there any other hidden preferences that I have to update to get it working? Do I need to restart the PC?

    Cheers,
  • @zabalad it should require no other options, please open an issue on github.
  • I think I see the problem, man this is outrageous, Qiqqa seems to think that \\ is an escaped backslash in LaTeX, but that is actually a paragraph break. They should have used \textbackslash or $\backslash$.

    I don't see an easy way around this other than hand-editing to replace \\ with \textbackslash{}
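
With thousands of entries, the hand edit described above could be scripted. The following is a minimal Python sketch, not a tested fix: the function name `fix_filename_backslashes` is hypothetical, and it assumes each filename field sits on a single line, as in the sample entries. Whether the importer then resolves the resulting absolute path is not confirmed here.

```python
import re

# Assumption: every field occupies one line, as in the Qiqqa export above.
_FILENAME_FIELD = re.compile(r'^\s*,?\s*filename\s*=', re.IGNORECASE)


def fix_filename_backslashes(bibtex: str) -> str:
    """Replace each double backslash in filename fields with the LaTeX macro."""
    fixed = []
    for line in bibtex.splitlines(keepends=True):
        if _FILENAME_FIELD.match(line):
            # Two literal backslashes become \textbackslash{}.
            line = line.replace("\\\\", "\\textbackslash{}")
        fixed.append(line)
    return "".join(fixed)


sample = r', filename = {C:\\\\docs\\K\"uhne2006 - Matters of (meta-) modeling.pdf}'
print(fix_filename_backslashes(sample))
```

Only lines that start a filename field are touched, so the file field and the accent escapes such as `\"u` are left as they are.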
  • Thanks for the quick reply @emilianoeheyns

    So, if I understood correctly, I need to replace this:
    K\"uhne2006 - Matters of (meta-) modeling.pdf

    with this:
    K\textbackslash"{u}hne2006 - Matters of (meta-) modeling.pdf
  • No, K\"uhne2006 - Matters of (meta-) modeling.pdf is right assuming the field is non-verbatim. It's the filename field that I think tries to encode the absolute path but has the par breaks, and that just won't import anything as it is.

    What's in the file field should work if,
    • you have removed file from the verbatimFields, and
    • the bibtex file is in the directory that has the subdirectory docs at time of import
    If that is the case, and it doesn't import the attachment, we can look at that on github.
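
The second condition can be checked before importing. This is a minimal Python sketch under stated assumptions: the helper names `file_field_paths` and `missing_attachments` are hypothetical, each file field is assumed to sit on one line in the `label:path:mimetype` form used by the sample entries, and paths are compared literally, so entries that still contain LaTeX escapes such as `\"u` are reported as missing unless they are decoded first.

```python
import re
from pathlib import Path
from typing import List

# Assumption: one file field per line, with a relative path containing no colon.
_FILE_FIELD = re.compile(r'^\s*,?\s*file\s*=\s*\{(.*)\}\s*,?\s*$', re.IGNORECASE)


def file_field_paths(bibtex: str) -> List[str]:
    """Return the path component of every file field found in the text."""
    paths = []
    for line in bibtex.splitlines():
        match = _FILE_FIELD.match(line)
        if match:
            parts = match.group(1).split(":")
            if len(parts) == 3:  # label, path, mimetype
                paths.append(parts[1])
    return paths


def missing_attachments(bib_file: Path) -> List[Path]:
    """List attachment paths that do not exist relative to the .bib file."""
    text = bib_file.read_text(encoding="utf-8")
    base = bib_file.parent
    return [base / p for p in file_field_paths(text) if not (base / p).is_file()]


if __name__ == "__main__":
    bib = Path("Qiqqa_filtered.bib")  # the export file named in this thread
    if bib.is_file():
        for path in missing_attachments(bib):
            print("missing:", path)
```

An empty result would suggest the directory layout is fine and the problem lies in how the field is parsed.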
  • So it's still not working, although I followed the process you described above:
    "you have removed file from the verbatimFields"
    Done.

    e.g.: "verbatimFields": "url,doi,pdf,ids,eprint,/^verb[a-z]$/,groups,/^citeulike-linkout-[0-9]+$/, /^bdsk-url-[0-9]+$/"
    "the bibtex file is in the directory that has the subdirectory docs at time of import"
    Done.

    e.g: exported folder tree:
    Qiqqa_filtered.bib (I'm using this file)
    Qiqqa.BibTeX.tab
    Qiqqa.tab
    Qiqqa.html
    Qiqqa.png
    docs/
    docs_original/
    autotags/
    tags/
    authors/
    titles/
  • For me it does import the Kühne PDF, you'll have to open an issue on github so we can delve into why it isn't for you.
  • you are not responding there though.