Store "real" URL when using proxy

Current situation

With the Firefox Zotero plugin (v3.0.8), the "Save to Zotero" URL-bar feature (etc.) stores the URL "as is", even if the site has been accessed through a (Zotero-managed) proxy. As a result, the URL quoted in subsequent references looks like this:
Xiao, B. and Benbasat, I. (2007). E-commerce product recommendation agents: use, characteristics, and impact. MIS Quarterly, 31 (1), p.137–209. [Online]. Available at: http://ezproxy.XXX.ac.uk:9999/citation.cfm?id=2017327.2017335&coll=DL&dl=GUIDE&CFID=75852338&CFTOKEN=52010918 [Accessed: 4 April 2012].
where "ezproxy.XXX.ac.uk:9999" is the proxied replacement (in this case) for "dl.acm.org".

Recommended solution

When proxying is active, use the original, unproxied URL when setting the URL field.
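
As a rough illustration of what this would mean for a port-based proxy like the one above, here is a minimal Python sketch. It is not Zotero's code. It assumes a lookup table from (proxy hostname, port) to the original hostname is available; the table `PORT_TO_HOST` and the function `unproxy_port_based` are hypothetical names, and 9999 stands in for the real port as in the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical lookup table: (proxy hostname, port) -> original hostname.
# Keys are lowercase because urlsplit(...).hostname is always lowercased.
PORT_TO_HOST = {
    ("ezproxy.xxx.ac.uk", 9999): "dl.acm.org",
}

def unproxy_port_based(url, port_map=PORT_TO_HOST):
    """Return the unproxied URL, or the URL unchanged if it is not proxied."""
    parts = urlsplit(url)
    original_host = port_map.get((parts.hostname, parts.port))
    if original_host is None:
        return url  # not a known proxy host/port pair
    # Swap only the network location; keep scheme, path, query and fragment.
    return urlunsplit(
        (parts.scheme, original_host, parts.path, parts.query, parts.fragment)
    )

print(unproxy_port_based(
    "http://ezproxy.XXX.ac.uk:9999/citation.cfm?id=2017327.2017335"
))
```

Only the host and port are replaced, so the path and query string are preserved exactly as they were saved.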

Reasoning

Proxying details are subject to change whilst the original URL is more likely to remain stable. Proxying redirection happens when the non-proxy URL is used, anyway, so storing the proxy data is redundant.

Additional notes

Whilst this is easily reproducible for me, Dan Stillman reports that it works for him.
  • I think everyone agrees that that should be the default behavior.
    The fact that it's not for you is likely due to one or both of these factors:
    1. ACM puts the proxied URL in the BibTeX data they export - which they really shouldn't - and
    2. Your proxy is somewhat odd in that, instead of inserting itself into the domain name as most proxies do (e.g. dl.acm.org.ezproxy.oxford.ac.uk), it replaces it.
  • I'm pretty sure the URL isn't in the bibtex data at all.

    My only comment on your second point is that this applies to all proxied URLs via my University, not just ACM (which was just an example). They do all take the same format, however, of fully replacing the hostname.

    Another example:
    http://onlinelibrary.wiley.com/doi/10.1111/j.1083-6101.1996.tb00176.x/full
    becomes
    http://ezproxy.XXX.ac.uk:9999/doi/10.1111/j.1083-6101.1996.tb00176.x/full
    (with a different real number for "9999", of course)
    and it's the latter that is stored.
  • The port-based method is one of the two EZproxy methods that's always been supported by Zotero, so I don't know why it would work any differently, but we can see what Simon says. (I only have access to GMU, which switched from port-based to domain-based a while ago.)

    If someone else has access to a port-based system, it'd be helpful to know if they have the same problem.
  • "I'm pretty sure the URL isn't in the bibtex data at all."
    Click on the bibtex link on that page. I get:
    url = {http://dl.acm.org/citation.cfm?id=2017327.2017335},
    I'm pretty sure you'll see:
    url = {http://ezproxy.XXX.ac.uk:9999/citation.cfm?id=2017327.2017335},

    --> that's the URL Zotero imports.

    Wiley might be a translator issue, though. Could you try JSTOR (any article)?
  • Hi,

    I am running into the same problem as the original poster with the Firefox Zotero plugin (3.0.8). I tried ACM, JSTOR and SpringerLink.

    It imports the Info and the PDF just fine, but the URLs in the Metadata still keep the proxy that my university is using. Of course, it would be better to have an option to not include the proxy in the URL, as people sometimes switch universities, research labs etc.

    I get
    http://www.jstor.org.proxy-remote.galib.uga.edu/stable/4144399
    instead of
    http://www.jstor.org/stable/4144399

    Also, I get
    http://www.springerlink.com.proxy-remote.galib.uga.edu/content/l4208670j1382v42/
    instead of
    http://www.springerlink.com/content/l4208670j1382v42/

    I played around with the Proxy settings and iterated through all possible settings, but that does not fix it; the imported URLs keep the proxy.
  • Just to add one more datapoint, I had been having the same problem.

    http://www.ncbi.nlm.nih.gov.laneproxy.stanford.edu/pubmed/20731387

    should have been stored as

    http://www.ncbi.nlm.nih.gov/pubmed/20731387

    but it wasn't. I have hundreds of such stored URLs, which will be quite annoying in a year's time when I switch universities. These are back from 2010 and before. Newer additions no longer have the same problem.
  • enozkan, may I ask what settings you have in Preferences/Proxies?

    Also, do you have any configured Proxies with Hostname and Scheme there?
  • Yes, I have "Enable proxy redirection" and "Automatically recognize proxied resources" clicked on, but not "Disable proxy redirection when my domain name contains ..."

    For my "Configured Proxies", I have "Hostname" as Multi-Site and a scheme for my university library's proxy (with the relevant %h and %p's).
  • I defined a scheme for my proxy as follows:

    http://%h.proxy-remote.galib.uga.edu/%p

    and put "Hostname" as Multi-Site and entered www.springerlink.com. The redirection itself works.

    However, the stored URL metadata in both the Zotero entry and the SpringerLink Full PDF still include the proxy, sadly no different from what happened before.

    Hopefully it can be fixed in a future update.
  • Interesting. I have pretty much the same preferences settings. And everything appeared to work fine, but when I created an item from a springerlink.com article as you did, I did observe the issue you are describing! (I usually create all my Zotero items through NCBI/PubMed or Google Scholar, and they were fine, I guess.) Hopefully, the developers can help.
  • @Dan, Simon,

    Do you think it would be ok to de-proxify all the URLs (URL field and link attachments) when importing from web translators?
  • I like this idea. If I am within the system that uses a proxy, clicking a de-proxified URL will go through my university automatically. However, if I want to cite an item, I don't want the reference to contain the proxy.

    Am I missing something?
  • Hello,

    This issue still happens with current Zotero 4.0.
    Has anyone found a solution or a workaround?

    Thanks in advance.
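
For URLs that have already been stored with the proxy in them, one possible clean-up approach is to run the proxy scheme in reverse. The Python sketch below is only an illustration under stated assumptions, not Zotero's implementation: it assumes %h stands for the original hostname and %p for the path, as in the scheme quoted earlier in the thread, and the function names `scheme_to_regex` and `unproxy` are hypothetical.

```python
import re
from urllib.parse import urlsplit

def scheme_to_regex(scheme):
    """Turn a proxy scheme such as 'http://%h.proxy.example/%p' into a regex."""
    pieces = []
    for token in re.split(r"(%h|%p)", scheme):
        if token == "%h":
            # Original hostname: shortest run of non-slash characters.
            pieces.append(r"(?P<host>[^/]+?)")
        elif token == "%p":
            # Everything after the proxy host is treated as the path.
            pieces.append(r"(?P<path>.*)")
        else:
            pieces.append(re.escape(token))  # literal part of the scheme
    return re.compile("^" + "".join(pieces) + "$")

def unproxy(url, scheme):
    """Return the unproxied URL, or the URL unchanged if it does not match."""
    match = scheme_to_regex(scheme).match(url)
    if match is None:
        return url
    # Assumption: the original site uses the same protocol as the proxied URL.
    protocol = urlsplit(url).scheme
    return protocol + "://" + match.group("host") + "/" + match.group("path")

SCHEME = "http://%h.proxy-remote.galib.uga.edu/%p"
print(unproxy("http://www.jstor.org.proxy-remote.galib.uga.edu/stable/4144399", SCHEME))
```

A URL that does not match the scheme is returned untouched, so running this over a whole library should leave unproxied entries alone.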