Standalone failing to import PDFs (ProQuest, NYT)

Hello, my Zotero standalone app is no longer able to import PDFs from ProQuest or New York Times archives.

When I import a citation, I briefly see a PDF icon titled "Full Text PDF" under the new citation, but it disappears almost instantly. It's been happening on every attempt for the past few days, but here is one example: http://search.proquest.com.ezproxy.cul.columbia.edu/docview/173221843/13987F5ADAE3C46605A/1

I'm using Zotero standalone 3.0.8 with Chrome 21.0.1180.90 on Mac 10.5.8.

PS I also tried it with the Firefox plugin. It works correctly with FF.
  • could you confirm that other proxied resources - say JSTOR - still import PDFs into your Standalone?
  • JSTOR is working.
  • edited October 17, 2012
    Hi Adam, I'm just wondering if there's any status update on this.

    Thanks,
    Michael
  • no, I have this bookmarked, but as I believe I mentioned elsewhere, troubleshooting connector-only issues is a good deal harder so it will take a while unless someone else takes this up.
  • Ok, that's a little frustrating. Is there anything I can do to help? I have a software background.
  • PDF download/attachment via proxy and Standalone is going to be more fragile until Zotero has full proxy support in the Standalone connectors, which I don't think is anywhere close. If you need this to work reliably for sites like EBSCO and ProQuest, the only recommendation I have is to use Firefox.
    For most other sites with simpler structures connectors work just fine.

    If you want to look at this yourself you can try to compare the debug output for Standalone and for Zotero for Firefox:
    http://www.zotero.org/support/debug_output

    Some notes on debugging in Chrome directly are here:
    https://groups.google.com/forum/?fromgroups=#!searchin/zotero-dev/debug$20connector/zotero-dev/pRw2jv7JIGs/TA9Lpb4oy-EJ
  • Bummer. FF sucks. I try not to use it anymore.

    I'll take a stab at debugging as soon as I get a chance.
  • I took a look at the debugging logs. One obvious difference is these two lines in the Standalone log:

    (2)(+0000002): Downloaded PDF did not have MIME type 'application/pdf' in Attachments.importFromURL()

    (3)(+0000000): Deleting item 5031

    If I'm reading this right, it looks like the file gets downloaded, but the app doesn't recognize it and deletes it. That jibes with the behavior I'm seeing, where the PDF attachment appears in Zotero for an instant and then goes away.

    If I'm on the wrong track here, could you please point me in the right direction?

    I've also submitted the logs to the server...
    Stand-alone log: D865221731
    Firefox log: D327996334
    Source URL: http://search.proquest.com.ezproxy.cul.columbia.edu/docview/173296630/13A0F8314BC595DAA9A/13

    Thanks a lot,
    Michael
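
    For anyone comparing the two logs by hand, here is a rough Python sketch that looks for this pattern: a MIME type complaint followed directly by an item deletion. It is not part of Zotero. The function names are made up for illustration, and the line format is assumed only from the two lines quoted above (I'm guessing the first number is a debug level and the second a time offset).

```python
import re

# Assumed line shape, based only on the two quoted lines:
#   (2)(+0000002): message text
LOG_LINE = re.compile(r"^\((\d+)\)\(\+(\d+)\):\s*(.*)$")


def parse_debug_lines(text):
    """Return (level, offset, message) tuples for lines matching the assumed shape."""
    entries = []
    for raw in text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match:  # blank or unrelated lines are skipped
            level, offset, message = match.groups()
            entries.append((int(level), int(offset), message))
    return entries


def find_rejected_downloads(entries):
    """Pair each MIME type complaint with a deletion logged right after it."""
    pairs = []
    for current, following in zip(entries, entries[1:]):
        if ("did not have MIME type" in current[2]
                and following[2].startswith("Deleting item")):
            pairs.append((current[2], following[2]))
    return pairs


sample = """
(2)(+0000002): Downloaded PDF did not have MIME type 'application/pdf' in Attachments.importFromURL()

(3)(+0000000): Deleting item 5031
"""

for complaint, deletion in find_rejected_downloads(parse_debug_lines(sample)):
    print(complaint)
    print(deletion)
```

    Running the same check over the Firefox log should turn up no such pair if the download there really is a PDF.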
  • You're on the right track, but my suspicion (and the cause of this debug output in every case where I have seen it) is that the problem isn't Zotero failing to recognize the file or MIME type. Rather, it really isn't downloading a PDF but some regular webpage, most likely an "access not allowed" page or similar, because the proxy (and thus access to the full text) gets lost along the way when using Standalone.
    You could adjust the translator by deleting the mimeType from the attachment setting and see what you get; maybe that'll tell you more.

    If I'm right, there is a fair chance that this just won't work with Standalone and proxy because of such authorization issues. As an alternative to using Firefox, you could also look into connecting via VPN:
    http://library.columbia.edu/services/faq/eresources/vpn.html
    instead of ezproxy - there's a very good chance that'd work.
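
    If you do get hold of whatever was downloaded, one quick way to tell a real PDF from an error page is to look at the first bytes: PDF files begin with "%PDF-", while a proxy login or "access not allowed" page will normally begin with HTML markup. A minimal Python sketch follows, assuming you have the saved bytes at hand; classify_download is a made-up helper name and this is not how Zotero itself checks the type.

```python
def classify_download(data: bytes) -> str:
    """Guess whether downloaded bytes are a PDF, an HTML page, or something else."""
    head = data[:1024].lstrip()  # tolerate leading whitespace
    if head.startswith(b"%PDF-"):
        return "pdf"
    lowered = head.lower()
    if lowered.startswith(b"<!doctype html") or b"<html" in lowered:
        return "html"
    return "unknown"


# Illustrative inputs only; real data would come from the saved attachment file.
print(classify_download(b"%PDF-1.4\n1 0 obj"))
print(classify_download(b"<!DOCTYPE html><html><body>Access denied</body></html>"))
```

    An "html" result for something that was supposed to be the full text would point to the lost-proxy explanation rather than a MIME detection bug.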
  • I commented out lines 115 and 411 (mimeType: 'application/pdf') in ProQuest.js and restarted zotero but saw no differences in the behavior or in the log file. Is there something else I should do to delete the mimeType from the attachment setting?

    I also set up a VPN connection to Columbia. Again, there were no differences.

    Thanks.
  • Now Firefox isn't working either. ProQuest pages show the webpage icon in the address bar, not newspaper or magazine icons, and when I save, the PDFs aren't downloaded.

    I assume that I should create a new thread for this?

    PS I love zotero, but the instability is really starting to get to me.
  • I have PDF attachments consistently working for me on Firefox, e.g. for the "Senate No Place for Lazy Man" article: http://search.proquest.com/docview/173296630

    You'll get the generic article logo on ProQuest if you're on the page view page rather than the citation/abstract view, since Zotero uses information on the latter to detect the item type, but the actual import should have the right item type. AFAIK that has always been the case.

    Did you try resetting and updating your translators in case your attempts to debug/fix this for Standalone changed anything?
  • Ah yes, that was it, thank you. I reset the translators, and it's working fine. I apologize for my error and for my outburst.

    Do please let me know if there is anything more that I can do to help debug the stand-alone.

    Regards,
    Michael
  • I have observed that saving embedded metadata to Zotero is not always available, and also that when it is available, it imports the data differently (and more accurately) than if I use the "Create New Item from Current Page" button. I would like to 1) understand this process better and 2) figure out the best way to get data imported as accurately as possible. I understand that I'll always have to edit some, but I'd sure like to keep it to a minimum. Thanks.
  • Please don't post in unrelated threads. We try to keep each thread to one topic; this one is about PDF import from ProQuest.

    The short answer is that it's always better to use the icon in the URL bar than the "Create New Item from Current Page" button, which is also the reason for keeping the two functions physically separate.
    You'll get embedded metadata when there is embedded metadata on the page that Zotero detects. For anything beyond that, please start a new discussion.