Snapshot fails where images have spaces in filenames

Hi,

I have come across an error where some of the images in my snapshots do not display correctly. Specifically, an image fails to load if its filename contains a space character. I think what is happening is that the file is saved with the space escaped to "%20", but the link in the snapshot would then need to escape the resulting percent sign as "%25" in order to point at that file.

I have tested opening the snapshot in Chrome and Internet Explorer on Windows 7. The snapshots were originally taken using Chrome.

Example:

I took a snapshot of this page, which appeared to be successful:
http://hact.org.uk/blog/2014/05/27/big-data-and-housing-part-3-machine-learning
However, when I returned to view the snapshot, the last image in the main body of the article did not appear. Right-clicking the image in Chrome and selecting "Copy image URL", I find that the URL it points to is:

file:///C:/Users/ [snipped] /RxxxxxxE/Anomaly%20detection1.png

If I paste that URL into my browser it fails to open the image, but when I look in the relevant folder on the machine I can see there is an image by that name. If I visit the URL of the folder in Chrome (i.e. dropping the image name from the end) I get a directory listing. If I then click on the file "Anomaly%20detection1.png" it takes me to the image (which opens correctly) at the following address:

file:///C:/Users/ [snipped] /RxxxxxxE/Anomaly%2520detection1.png

Note: %2520 instead of %20.
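
To illustrate what I think is going on, here is a minimal sketch using Python's standard `urllib.parse`. It is only a model of the encoding steps, not Zotero's actual code, and the variable names are made up for the example; the filename is modelled on the one above.

```python
from urllib.parse import quote, unquote

# Illustrative file name, modelled on the image in the report above.
original_name = "Anomaly detection1.png"

# The name as it ends up on disk: the space is stored as the literal text "%20".
name_on_disk = quote(original_name)

# A link that repeats the on-disk name verbatim is decoded once by the browser,
# so the browser looks for a file with a real space in its name.
requested_by_broken_link = unquote(name_on_disk)

# A working link has to escape the percent sign itself, which yields "%2520".
working_link = quote(name_on_disk)
requested_by_working_link = unquote(working_link)

print(name_on_disk)
print(requested_by_broken_link == name_on_disk)
print(working_link)
print(requested_by_working_link == name_on_disk)
```

In this model the second check only succeeds for the doubly escaped link, which matches what I see in the directory listing.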

Thanks,
Jim
  • Yes, that's exactly the double-encoding problem you say it is.
    Spaces in image filenames are quite rare (most websites avoid them), but this should be fixable.
  • edited March 6, 2018
    I have a similar problem, with image filenames containing diacritics, punctuation, arithmetic symbols, and suchlike.

    I am using Zotero 4.0.29.6 and Firefox 52.6.0.

    To replicate:
    1. Go to https://commons.wikimedia.org/wiki/Category:Whales_breaching
    2. Take snapshot of webpage
    3. Open snapshot in browser
    4. The images with odd characters in their file names will not display (the images with spaces in their names do)
    5. Back in the Zotero window, right-click on the snapshot within the collection item and choose "Show file".
    6. The images are in fact saved to disk, but with double-escaped filenames.

    An ideal fix would not require the snapshots to be manually re-saved, if that could be done elegantly with a reasonable amount of effort.

    I rather need the full Unicode range to deal with the names of the authors of papers, and symbols in the titles, too. Sadly LaTeX does not really do Unicode.
  • Zotero 4.0 is no longer supported. You should upgrade to Zotero 5.0, where this works.
  • Great, and thank you for the quick response. I'd assumed I had the latest version simply because Zotero 5.0 isn't in Debian yet, not even in Sid. I should have checked.