Non-alphanumeric characters in URL
This is a two-part question. The first is technical, about Zotero. The second is more about citation conventions, so I want to be clear in advance that I don't necessarily expect an answer, let alone a definitive one.
1. Is there any way to avoid having non-alphanumeric URL characters automatically saved as a ludicrously long string of percent-encoded characters?
Example:
Citation URL:
https://osakana.suisankai.or.jp/wp/wp-content/uploads/2020/12/ 1997%E5%B9%B4%E3%80%80%E5%85%A8%E5%9B%BD%E9%AD%9A%E9%A3%9F%E6%99%AE%E5%8F%8A%E6%8B%85%E5%BD%93%E8%80%85%E8%82%B2%E6%88%90%E6%A4%9C%E8%A8%8E%E4%BC%9A%E3%80%80%E5%A0%B1%E5%91%8A%E6%9B%B8.pdf
(Had to put a space after the ...12/ to avoid the URL rendering back to kanji here. It remains that long string of nonsense when output into Word.)
Actual URL:
https://osakana.suisankai.or.jp/wp/wp-content/uploads/2020/12/1997年 全国魚食普及担当者育成検討会 報告書.pdf
I was able to manually cut and paste the URL into Zotero and get it to output the original, kanji intact.
2. But that's actually not a great solution either, since most English-language publications/publishers won't print the non-alphanumeric characters.
Does anyone know what the conventions are regarding shortened URLs?
It seems like the easiest solution would be to use a URL like:
https://is.gd/3Z7zKc
But it also seems to me that this would be frowned upon. Anyone have any insights?
Thanks as always!
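For reference, the long string in the example is percent-encoding: each kanji is stored as its UTF-8 bytes, written as `%XX` escapes, so both forms point at the same file. Here is a minimal Python sketch of converting between the two forms, using only the standard library. The helper names `decode_url_path` and `encode_url_path` are hypothetical, this is not how Zotero itself handles URLs, and the sketch assumes the path contains no escaped slashes (`%2F`) whose meaning would change when decoded.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def decode_url_path(url: str) -> str:
    """Return the URL with percent-escapes in its path decoded as UTF-8."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=unquote(parts.path)))


def encode_url_path(url: str) -> str:
    """Return the URL with non-ASCII path characters percent-encoded."""
    parts = urlsplit(url)
    # quote() keeps "/" unescaped by default, so the path structure survives.
    return urlunsplit(parts._replace(path=quote(parts.path)))


# Shortened, made-up example on a placeholder domain.
encoded = "https://example.org/uploads/1997%E5%B9%B4.pdf"
readable = decode_url_path(encoded)
print(readable)
print(encode_url_path(readable) == encoded)
```

Pasting the decoded form into the URL field is the same thing as the manual cut-and-paste described above; the encoded form remains the safer choice wherever a tool or typeface cannot handle the kanji.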
Short URLs are generally not a good idea for citation, as they simply create an additional dependency that may disappear before the original resource.
APA has a weird, vacillating blog post on this (which also contains some errors) where they kind of row back on link shorteners and say they're mainly for student papers: https://apastyle.apa.org/blog/shortened-urls
@dstillman said: With respect, the question is not whether Unicode is available. As with all things, it's a question of implementation, which means house rules and conventions (some of which are unchanged from decades before the internet), ownership of fonts with full Unicode support, etc. Things are definitely easier for online publications, but for print there are still a lot of hurdles that I encounter on a semi-regular basis -- and would on a regular basis if I were more productive, I suppose.
Thanks to @adamsmith for the Chicago reference. I had missed that. The impermanence defense is strange to me given that third-party links do, in fact, change (that's why we all know what 404s and web.archive.org are), but I will stick to the manual.
I think the passage is worth quoting in part for posterity: While this is lovely in principle, it does not solve the fundamental issue of having URLs that are more than 3 lines long.
Elsewhere, Chicago makes an effort to shorten nearly everything, including headers, footers, titles, captions, etc. URLs are an interesting exception, technical justifications notwithstanding.
Anyhow, this is not something I would worry about, and if a publisher had a problem with it, I would object.
That's part of the point, though. Leaving aside that a shortened URL is a second dependency (that was my main point), the original URL is an identifier, so it provides value in the citation even if it goes offline, for example by being findable in the Wayback Machine. If a shortened URL goes offline, it's worthless, because the metadata pointing to the original URL is stored on a server that no longer exists. Using the original URL also puts the onus for preservation/redirection on the publisher, which is more appropriate than, say, some company that made a deal with the Grenadian government.