Feature request: Remove accents/diacritics from filenames
I am using Zotero 7 and the customizable file renaming (https://www.zotero.org/support/file_renaming). I would like to have all accents/diacritics removed from filenames. This would be similar to ZotFile's "Remove special characters (diacritics) from filename" option. However, I cannot find an equivalent option in Zotero's built-in file renaming. I imagine this could be added as a parameter. I tried the "localize" parameter, but it didn't remove accents/diacritics. Currently, my only idea for a workaround is to build a regex (e.g., `replaceFrom="č" replaceTo="c" regexOpts="g"`) for each and every special character.
`replaceFrom="\\p{M}" replaceTo="" regexOpts="g"` will work for Zotero 7. A similar solution is available for Zotero 6, but you'd have to write out `"\\p{M}"` (which is possible but tedious).
@maia-sh, can you explain why you want this? All modern filesystems should support Unicode.
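As a rough illustration of what the `\\p{M}` replacement above relies on: `\p{M}` matches Unicode combining marks, so it only strips an accent once a character like `č` has been decomposed into `c` plus a combining caron. The sketch below is plain JavaScript, not Zotero's template engine. `stripDiacritics` is a hypothetical helper name, and whether Zotero normalizes the string before applying `replaceFrom` is an assumption that is not verified here.

```javascript
// Hypothetical helper (not part of Zotero): strip combining marks from a string.
function stripDiacritics(text) {
  // NFD splits precomposed letters into base letter + combining mark(s),
  // e.g. "č" (U+010D) becomes "c" + U+030C.
  const decomposed = text.normalize("NFD");
  // \p{M} matches any combining mark; the "u" flag is required for \p{...} in JavaScript.
  return decomposed.replace(/\p{M}/gu, "");
}

console.log(stripDiacritics("Dvořák")); // Dvorak
console.log(stripDiacritics("Müller - 2020 - Café.pdf")); // Muller - 2020 - Cafe.pdf
```

Letters such as `ø` or `ł` have no canonical decomposition, so this approach leaves them unchanged; they would still need explicit replacements.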
@dstillman, thanks for letting me know that modern systems should support accents. I wanted to replace them because I've had problems with them in the past, so I set up my system without accents, and I would like consistency across my library.
Diacritics are not universally applied in all data sources that a user may add to Zotero. So if author names are included in file names, for example, one will likely end up over time with the same author's name appearing in file names both with and without its diacritic(s). That creates difficulties when searching by file name (outside Zotero, for example). (The same problem occurs when searching for author names within Zotero, although that's not the file naming issue raised here.)
If there were an easy way of correctly inserting all missing diacritics in existing data like that, that would of course be preferable. But that does not exist as far as I know. So, failing that, simply removing diacritics from all file names is the "least-bad" option. That's presumably why it is an option in ZotFile's renaming scheme. Many ZotFile users would have large file collections where diacritic removal from file names has been standard practice. I certainly do.
I have also had batch file code fail on Zotero filenames with diacritics when processing files outside Zotero, where Unicode support was lacking. Where the code was designed to find "orphan" files - files no longer in Zotero's database but still present on disk - that can lead to files being mis-designated as orphans (which could lead to such code wrongly deleting them). That's why I started using ZotFile's option.
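One way an orphan-finding script could reduce this kind of mis-flagging is to compare names in a single Unicode normalization form, since the same accented name can be encoded precomposed (NFC) or decomposed (NFD) and then fail an exact comparison. This is only a sketch in JavaScript, assuming the names on disk and the names known to the database are already available as arrays; `findOrphans` and both inputs are hypothetical and not part of Zotero. It does not help a tool that lacks Unicode support altogether.

```javascript
// Hypothetical helper: return names on disk that the database does not know about.
function findOrphans(namesOnDisk, namesInDatabase) {
  // Normalize both sides to NFC so "é" as one code point equals "e" + combining acute.
  const known = new Set(namesInDatabase.map((name) => name.normalize("NFC")));
  return namesOnDisk.filter((name) => !known.has(name.normalize("NFC")));
}

const inDatabase = ["Dvo\u0159\u00e1k - 2019.pdf"]; // precomposed (NFC)
const onDisk = ["Dvor\u030ca\u0301k - 2019.pdf", "Old notes.pdf"]; // first name is decomposed (NFD)
console.log(findOrphans(onDisk, inDatabase)); // [ 'Old notes.pdf' ]
```

Without the `normalize` calls, the decomposed name would not match its database entry and would be reported as an orphan.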
`Zotero.Utilities.removeDiacritics`. Then, if the function is already defined, why not add it as an option (maybe through Customizing the Filename Format) for use with the renaming function `getFileBaseNameFromItem`? (cc @dstillman, @AbeJellinek, @tnajdek)
Thanks!