Can't import from Physical Review
Hi everyone!
I just wanted to add a paper from Physical Review B and one from Physical Review Letters to my Zotero library when I noticed that, at the moment, the usual Zotero button in the Firefox address bar is missing (not only for these two papers but for all APS journal papers), so I cannot add them properly now :-(
http://prl.aps.org/abstract/PRL/v103/i25/e257202
http://prb.aps.org/abstract/PRB/v80/i23/e235321
The problem occurs not only on my computer but also on others, so it seems as if something changed at APS over Christmas...
The error ID is: 1453710441
Is it still not working for you at the above URLs, or are you trying different URLs?
[Javascript Error: "DOI translator: could not find DOI" {file:"chrome://zotero/content/xpcom/translate.js" line 896}]
After running that and, for good measure, clicking "Update Now" in the General pane of the Zotero preferences, please confirm that you're still getting the error on the above URLs. If you are, you may be experiencing an issue with Zotero 1.0 that doesn't exist in 2.0.
I can import papers from the Physical Review, but the abstract field remains empty.
(Just tested for Physical Review Letters and Physical Review A).
(1) Bug: Article numbers are not stored in the "Pages" field.
(2) Abstracts are not captured (I don't have automatic PDF capture on, so I don't know about PDFs)
Sample link:
http://prl.aps.org/abstract/PRL/v99/i12/e126105
I ran a check of the database (it was corrupted but is fine now) and went through all the suggestions in "Troubleshooting translator". My symptoms were and are:
I click on the zotero image in URL field (any PRB, PRL paper)
I get "Could not save item. An error occurred when saving the item, ..."
No more info
I have Firefox 3.0.17
Zotero 1.0.10 updated
I'd be very grateful for any help
"https?://(?:www\\.)?(prola|prl|prb|rmp|pra|prc|prd|pre|prst-ab|prst-per|).aps.org.*/(toc|searchabstract|abstract)/"
This means that you can also use PROLA on prl.aps.org etc., not just prola.aps.org
I also added the .* at aps.org.* , so the site will register through proxies of the form
prola.aps.org.myproxy.example.com/etc/etc
This may not be a good idea, since for some reason this doesn't work through proxies. I'll paste the debug log in the post. Also, if your proxy has "toc" in the name you'll be in trouble, some fixing in PROLA.js line 4 is needed.
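To make the proxy point concrete, here is a minimal sketch that runs the target pattern above against a few URLs. The proxy host names are made up for illustration, and the plain substring check for "toc" is only an assumption about what a naive test might look like, not a quote from PROLA.js.

```javascript
// Target pattern from above, unescaped from its JSON string form.
var target = new RegExp(
  "https?://(?:www\\.)?(prola|prl|prb|rmp|pra|prc|prd|pre|prst-ab|prst-per|)" +
  ".aps.org.*/(toc|searchabstract|abstract)/"
);

// One direct URL, one through a hypothetical proxy, one unrelated site.
var direct = "http://prl.aps.org/abstract/PRL/v103/i25/e257202";
var proxied = "http://prola.aps.org.myproxy.example.com/abstract/PRI/v7/i4/p193_1";
var unrelated = "http://www.example.com/abstract/123";

function matchesTarget(url) {
  return target.test(url);
}

// Assumed naive check: any "toc" anywhere in the URL counts as a table of contents.
function naiveIsToc(url) {
  return url.indexOf("toc") !== -1;
}

// Stricter check: only a "/toc/" path segment counts.
function strictIsToc(url) {
  return /\/toc\//.test(url);
}

// Hypothetical proxy whose host name happens to contain "toc" ("stockholm").
var trickyProxy =
  "http://prl.aps.org.stockholm.example.com/abstract/PRL/v103/i25/e257202";

console.log(matchesTarget(direct), matchesTarget(proxied), matchesTarget(unrelated));
console.log(naiveIsToc(trickyProxy), strictIsToc(trickyProxy));
```

With a substring test, an abstract page behind such a proxy would be treated as a table of contents; testing for the "/toc/" path segment avoids that particular trap.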
To decide which subdomains to add I used http://prola.aps.org/browse.html, but I'm still getting errors in some cases, even though the URL rewriting used in the translator looks like it should work, judging from a glance at the "export Endnote" link. In the next post I attach a debug log from trying prst-per; it also fails on
http://prola.aps.org/abstract/PRI/v7/i4/p193_1
I've done very little testing. I hope this can be used to improve the translator; at least it could be made to work for prl, prc, pra and several others, even if it doesn't work through proxies. Perhaps the regex should be adjusted to match only the successful cases.
(4)(+0000000): Translate: Parsing code for PROLA
(3)(+0000003): created hidden browser (1)
(3)(+0000000): loading http://prst-per.aps.org/abstract/PRSTPER/v4/i2/e020002
(3)(+0000383): http://prst-per.aps.org/abstract/PRSTPER/v4/i2/e020002 has been loaded
(4)(+0000001): Translate: Phys. Rev. ST Phys. Educ. Res. 4, 020002 (2008): Editorial: Physics - spotlighting exceptional research
(3)(+0000004): deleted hidden browser
(2)(+0000001): Translate: Translation using PROLA failed:
message => newDoc.evaluate("//div[contains(@class, \"aps-abstractbox\")]/p", newDoc, null, XPathResult.ANY_TYPE, null).iterateNext() is null
fileName => chrome://zotero/content/xpcom/translate.js
lineNumber => 816
stack => ([object XPCNativeWrapper])@chrome://zotero/content/xpcom/translate.js:816
name => TypeError
url => http://prst-per.aps.org/abstract/PRSTPER/v4/i2/e020002
downloadAssociatedFiles => true
automaticSnapshots => true
(3)(+0000026): HTTP POST id=2c310a37-a4dd-48d2-82c9-bd29c53c1c76&lastUpdated=2009-01-18%2023%3A15%3A00&diagnostic=version%20%3D%3E%202.0b7.6%2C%20platform%20%3D%3E%20Win32%2C%20oscpu%20%3D%3E%20Windows%20NT%205.1%2C%20locale%20%3D%3E%20%2C%20appName%20%3D%3E%20Firefox%2C%20appVersion%20%3D%3E%203.5.7%2C%20extensions%... (1489 chars) to http://www.zotero.org/repo/report
(5)(+0000001): Translate: running handler 0 for done
(5)(+0004595): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages IS NOT NULL AND indexedPages=totalPages) OR (indexedChars IS NOT NULL AND indexedChars=totalChars)
(5)(+0000000): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages IS NOT NULL AND indexedPages<totalPages) OR (indexedChars IS NOT NULL AND indexedChars<totalChars)
(5)(+0000000): SELECT COUNT(*) FROM itemAttachments WHERE itemID NOT IN (SELECT itemID FROM fulltextItems WHERE indexedPages IS NOT NULL OR indexedChars IS NOT NULL)
(5)(+0000001): SELECT COUNT(*) FROM fulltextWords
(3)(+0000065): DATE: retrieved with algorithms: ({year:2009, month:8, day:28})
(3)(+0000001): DATE: retrieved with algorithms: ({year:2008, month:4, day:13})
(3)(+0000001): DATE: retrieved with algorithms: ({year:2008, month:4, day:13})
(3)(+0000000): DATE: retrieved with algorithms: ({year:2009, month:7, day:7})
(3)(+0000001): DATE: retrieved with algorithms: ({year:2009, month:8, day:13})
(3)(+0000000): DATE: retrieved with algorithms: ({year:2008, month:8, day:16})
(4)(+0000006): Translate: Binding sandbox to http://www.example.com/
(3)(+0000000): Translate: Searching for translators for an undisclosed location
(4)(+0000000): Translate: Parsing code for Zotero RDF
(4)(+0000003): Translate: Setting configure option getCollections to true
(4)(+0000000): Translate: Setting configure option dataMode to rdf
(4)(+0000000): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting display option exportFileData to false
(4)(+0000001): Translate: Parsing code for MODS
(4)(+0000002): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting configure option dataMode to xml/e4x
(4)(+0000001): Translate: Parsing code for Refer/BibIX
(4)(+0000001): Translate: Setting configure option dataMode to line
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(4)(+0000001): Translate: Parsing code for RIS
(4)(+0000002): Translate: Setting configure option dataMode to line
(4)(+0000000): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(4)(+0000000): Translate: Parsing code for Unqualified Dublin Core RDF
(4)(+0000001): Translate: Setting configure option dataMode to rdf
(4)(+0000000): Translate: Parsing code for Wikipedia Citation Templates
(4)(+0000002): Translate: Setting display option exportCharset to UTF-8
(4)(+0000000): Translate: Parsing code for BibTeX
(4)(+0000008): Translate: Setting configure option dataMode to block
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(5)(+0000004): SELECT key AS domainPath, value AS format FROM settings WHERE setting='quickCopySite' ORDER BY domainPath COLLATE NOCASE
The RIS I get on a manual DL looks ok:
TY - JOUR
M1 - Copyright (C) 2010 The American Physical Society
M1 - Please report any problems to prola@aps.org
ID - 10.1103/PhysRevSTPER.4.020002
TI - Editorial: Physics - spotlighting exceptional research
A1 - Sprouse, Gene D.
VL - 4
IS - 2
PB - American Physical Society
SP - 020002
PY - 2008/09/15/
JF - Physical Review Special Topics - Physics Education Research
JA - Phys. Rev. ST Phys. Educ. Res.
J1 - PRSTPER
UR - http://link.aps.org/abstract/PRSTPER/v4/e020002
ER -
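For anyone who wants to check such a download by hand, here is a rough sketch of reading the tags out of RIS text. It is not the RIS translator Zotero uses, just a hypothetical helper (parseRis is my own name) with a shortened copy of the record above; it shows that the article number sits in SP and the DOI in ID.

```javascript
// Parse RIS text into a map of tag -> list of values.
// Accepts one or more spaces before the dash, since pasted RIS often loses spacing.
function parseRis(text) {
  var fields = {};
  text.split(/\r?\n/).forEach(function (line) {
    var m = /^([A-Z][A-Z0-9])\s+-\s*(.*)$/.exec(line);
    if (!m) return; // skip blank or malformed lines
    var tag = m[1];
    var value = m[2].trim();
    if (!fields[tag]) fields[tag] = [];
    fields[tag].push(value);
  });
  return fields;
}

// Shortened copy of the record above.
var sample = [
  "TY  - JOUR",
  "ID  - 10.1103/PhysRevSTPER.4.020002",
  "TI  - Editorial: Physics - spotlighting exceptional research",
  "VL  - 4",
  "IS  - 2",
  "SP  - 020002",
  "ER  - "
].join("\n");

var ris = parseRis(sample);
var articleNumber = ris.SP ? ris.SP[0] : null; // start page doubles as article number
var doi = ris.ID ? ris.ID[0] : null;           // APS puts the DOI in the ID tag

console.log(articleNumber, doi);
```

So the data needed for the "Pages" and DOI fields is present in the export; whether it ends up in the item depends on how the import step maps these tags.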
(4)(+0033029): Translate: Parsing code for PROLA
(3)(+0000004): created hidden browser (1)
(3)(+0000000): loading http://prl.aps.org.myproxy.example.net/abstract/PRL/v104/i2/e026801
(3)(+0000817): http://prl.aps.org.myproxy.example.net/abstract/PRL/v104/i2/e026801 has been loaded
(4)(+0000001): Translate: Phys. Rev. Lett. 104, 026801 (2010): Carbon Nanotubes as Cooper-Pair Beam Splitters
(3)(+0000001): HTTP POST type=ris to http://prl.aps.org.myproxy.example.net/export/PRL/v104/i2/e026801?type=ris
(3)(+0000007): deleted hidden browser
(3)(+0000000): Translate: Translation successful
(5)(+0000000): Translate: running handler 0 for done
(4)(+0000117): Translate: Binding sandbox to http://www.example.com/
[Note: Above line not edited by me -npj]
(4)(+0000001): Translate: Parsing code for RIS
(4)(+0000004): Translate: Setting configure option dataMode to line
(4)(+0000000): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(5)(+0003802): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages IS NOT NULL AND indexedPages=totalPages) OR (indexedChars IS NOT NULL AND indexedChars=totalChars)
(5)(+0000001): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages IS NOT NULL AND indexedPages<totalPages) OR (indexedChars IS NOT NULL AND indexedChars<totalChars)
(5)(+0000000): SELECT COUNT(*) FROM itemAttachments WHERE itemID NOT IN (SELECT itemID FROM fulltextItems WHERE indexedPages IS NOT NULL OR indexedChars IS NOT NULL)
(5)(+0000001): SELECT COUNT(*) FROM fulltextWords
(3)(+0000067): DATE: retrieved with algorithms: ({year:2009, month:8, day:28})
(3)(+0000002): DATE: retrieved with algorithms: ({year:2008, month:4, day:13})
(3)(+0000000): DATE: retrieved with algorithms: ({year:2008, month:4, day:13})
(3)(+0000000): DATE: retrieved with algorithms: ({year:2009, month:7, day:7})
(3)(+0000001): DATE: retrieved with algorithms: ({year:2009, month:8, day:13})
(3)(+0000000): DATE: retrieved with algorithms: ({year:2008, month:8, day:16})
(4)(+0000006): Translate: Binding sandbox to http://www.example.com/
(3)(+0000000): Translate: Searching for translators for an undisclosed location
(4)(+0000000): Translate: Parsing code for Zotero RDF
(4)(+0000003): Translate: Setting configure option getCollections to true
(4)(+0000000): Translate: Setting configure option dataMode to rdf
(4)(+0000000): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting display option exportFileData to false
(4)(+0000001): Translate: Parsing code for MODS
(4)(+0000002): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting configure option dataMode to xml/e4x
(4)(+0000001): Translate: Parsing code for Refer/BibIX
(4)(+0000001): Translate: Setting configure option dataMode to line
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(4)(+0000001): Translate: Parsing code for RIS
(4)(+0000002): Translate: Setting configure option dataMode to line
(4)(+0000000): Translate: Setting display option exportNotes to true
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(4)(+0000000): Translate: Parsing code for Unqualified Dublin Core RDF
(4)(+0000001): Translate: Setting configure option dataMode to rdf
(4)(+0000000): Translate: Parsing code for Wikipedia Citation Templates
(4)(+0000002): Translate: Setting display option exportCharset to UTF-8
(4)(+0000000): Translate: Parsing code for BibTeX
(4)(+0000009): Translate: Setting configure option dataMode to block
(4)(+0000000): Translate: Setting display option exportCharset to UTF-8
(5)(+0000005): SELECT key AS domainPath, value AS format FROM settings WHERE setting='quickCopySite' ORDER BY domainPath COLLATE NOCASE
- through proxies (at least for my case, but see note 1)
- for all parts of aps.org (pra, prb, etc etc)
- even if the article has no online abstract
- on Table of Contents pages for issues of the journals
- on "Citing articles" tabs (only for prola articles, see note 2)
The code is in my next post.
Note 1: I disallow third-party cookies. This blocked the cookies from my proxy, so I had to add an explicit exception for that domain; after that the translator worked.
Note 2: This feature works only for the PROLA articles. Some other articles may show up in the list without title, but selecting them is unlikely to work. If someone sets up a DOI finder or similar that can scrape the whole page, this translator might get in the way. But for now, it's either this, or no multi-import at all, so I vote for keeping it. This approach did not work for the "References" tab.
{
    "translatorID":"2c310a37-a4dd-48d2-82c9-bd29c53c1c76",
    "translatorType":4,
    "label":"PROLA",
    "creator":"Eugeniy Mikhailov and Michael Berkowitz",
    "target":"https?://(?:www\\.)?(prola|prl|prb|rmp|pra|prc|prd|pre|prst-ab|prst-per|).aps.org/(toc|forward|searchabstract|abstract)/",
    "minVersion":"1.0.0b3.r1",
    "maxVersion":null,
    "priority":100,
    "inRepository":true,
    "lastUpdated":"2009-12-26 23:15:00"
}
function detectWeb(doc, url) {
    // toc indicates table of contents, forward is a "Citing articles" page
    if (/\/toc\//.test(url) || (/\/forward\//.test(url))) {
        return "multiple";
    } else {
        return "journalArticle";
    }
}

function doWeb(doc, url) {
    var arts = new Array();
    if (detectWeb(doc, url) == "multiple") {
        var items = Zotero.Utilities.getItemArray(doc, doc, "(abstract|abstractsearch)");
        items = Zotero.selectItems(items);
        for (var i in items) {
            arts.push(i);
        }
    } else {
        arts = [url];
    }
    Zotero.Utilities.processDocuments(arts, function(newDoc) {
        Zotero.debug(newDoc.title);
        // the abstract box may be missing, so only read it if the node exists
        var abs;
        var absNode = newDoc.evaluate('//div[contains(@class, "aps-abstractbox")]/p', newDoc, null, XPathResult.ANY_TYPE, null).iterateNext();
        if (absNode) abs = Zotero.Utilities.trimInternal(absNode.textContent);
        var urlRIS = newDoc.location.href;
        // so far several more or less identical url possible
        // one is with "abstract" other with "searchabstract"
        urlRIS = urlRIS.replace(/(searchabstract|abstract)/, "export");
        var post = "type=ris";
        var snapurl = newDoc.location.href;
        var pdfurl = snapurl.replace(/(searchabstract|abstract)/, "pdf");
        Zotero.Utilities.HTTP.doPost(urlRIS, post, function(text) {
            // load translator for RIS
            var translator = Zotero.loadTranslator("import");
            translator.setTranslator("32d59d2d-b65a-4da4-b0a3-bdd3cfb979e7");
            translator.setString(text);
            translator.setHandler("itemDone", function(obj, item) {
                if (item.itemID) {
                    item.DOI = item.itemID;
                }
                item.attachments = [
                    {url:snapurl, title:"PROLA Snapshot", mimeType:"text/html"},
                    {url:pdfurl, title:"PROLA Full Text PDF", mimeType:"application/pdf"}
                ];
                if (abs) item.abstractNote = abs;
                item.complete();
            });
            translator.translate();
        }, null, 'latin1');
    }, function() {Zotero.done();});
    Zotero.wait();
}
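The URL rewriting in doWeb can be tried outside Zotero. The following is a small standalone sketch of just that step; deriveUrls is a hypothetical helper name, not part of the translator, and it assumes the first "abstract" or "searchabstract" in the URL is the path segment.

```javascript
// Derive the RIS export and PDF URLs from an abstract page URL,
// using the same replacement pattern as the translator.
function deriveUrls(abstractUrl) {
  var pattern = /(searchabstract|abstract)/; // no "g" flag: first occurrence only
  return {
    snapshot: abstractUrl,
    ris: abstractUrl.replace(pattern, "export"),
    pdf: abstractUrl.replace(pattern, "pdf")
  };
}

var urls = deriveUrls("http://prl.aps.org/abstract/PRL/v104/i2/e026801");
var searchUrls = deriveUrls("http://prola.aps.org/searchabstract/PRL/v99/i12/e126105");

console.log(urls.ris);
console.log(urls.pdf);
console.log(searchUrls.ris);
```

Because only the first occurrence is replaced, a proxy host name that itself contains "abstract" would be rewritten instead of the path, so this step shares the weakness of any check that looks at the whole URL.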
Any suggestions on how to fix this nicely?
PS: Just found out about the nice debug log submission tool. Sorry for the unnecessary flooding of the thread :-/
It seems that for all of Physical Review (which is not exactly a little journal on the sidelines), import is broken: no abstract, no page number (recently called article number in Phys.Rev.), no pdf-import.
Similar for Reviews of Modern Physics.
This problem has been known and persisted for a long time now.
It means that Zotero is broken for the biggest journals in Physics.
Completely unacceptable.
PLEASE fix this bug.
Thank you !
A thousand times "Thank You" to npj !
Note - When writing my previous post, I had not recognized npj's contribution as a complete, working new translator for Phys. Rev.
I also thought that PROLA was only for old archived Phys. Rev. articles.
Thanks to noksagt for pointing this out.
Now let's hope that the new translator covers all cases and gets deployed soon.
Thank you !
I'm not so familiar with Zotero. Can anybody give me instructions on how to manually update the PROLA translator to the version posted above?
Thanks.
Tomek
%APPDATA%\Mozilla\Firefox\Profiles\
and I found that zotero.jar lives under this directory in:
extensions\zotero@chnm.gmu.edu\chrome
and that zotero.jar can be decompressed using, for example, 7-Zip (download from http://www.7-zip.org )
I tried putting npj's PROLA.js in various places within the resulting directory tree, rezipping into zotero.jar, and restarting Firefox, but nothing seemed to work.
I couldn't find any documentation on the Zotero website about where translators should be saved, nor could I find any of the translators which I must already have. Perhaps they are stored somewhere other than zotero.jar, but I have no idea where...
../../../zotero/translators/
which I guess on Windows is: ..\..\..\zotero\translators\
They're not zipped or anything; you can just drop the new version into that directory. On Windows, you may need to restart Firefox for the new translator version to take effect, I'm not sure. (EDIT: This applies to Zotero 2.0 only.)
pdfinfo-Win32.exe*
pdfinfo-Win32.exe.version*
pdftotext-Win32.exe*
pdftotext-Win32.exe.version*
storage/
zotero.sqlite*
zotero.sqlite.bak*
This applies to Zotero 2.0 only. Translators are much harder to access in Zotero 1.0.