Unable to bulk import a list of urls
I'm on the Zotero Linux desktop client and I'm pulling my hair out trying to make this work. Basically, all I want to do is import a list of URLs into Zotero and have the metadata pulled from each page and a snapshot saved. If I drag and drop a single bookmark from Firefox, this works perfectly.
The problem is that if I drag multiple items, weird things start happening: either all bookmarks get the same title, or only one is imported and it has no title, or nothing happens and drag and drop is disabled until I exit the application.
I tried saving my URL as an .html file and importing it, but Zotero didn't treat it as a webpage. I built an RIS file by hand and imported that, but no metadata was pulled and no snapshot was created. I looked for the "Add Webpage" option in the menu, but there's no entry.
Is there any way to pull in a list of URLs?
You're really meant to add URLs from the browser, since those use your existing cookie state, including web-based proxy access and other site logins. While Zotero has the ability to save directly from URLs (which would be roughly equivalent to what you'd get when you save on zotero.org, barring IP-based subscription access), the results might not be as good, so it's not something we encourage when you have the Zotero Connector available.
This is the reason you can't, say, paste a list of URLs into Add Item by Identifier the way you can paste in a list of DOIs (which itself has the downside of not using any web-based proxy you have to download PDFs, though it will use an IP-based proxy). We could consider adding that for cases like yours, but I'd be concerned that some people might start to add individual URLs that way instead of loading them in the browser, which wouldn't be ideal.
As for dragging a single bookmark working perfectly: it doesn't, actually. Dragging a URL will create a webpage item, but it won't run translation on the page to grab metadata. I'll fix that, but for the reasons above, it's still not really a recommended workflow.
That aside, dragging multiple URLs just seems to be poorly implemented across various browsers and applications. It's possible we could make that work from some browsers, but I'm not sure we could make it work reliably.
As for the missing "Add Webpage" menu option: that's because you're not meant to add webpages by hand. See the note here:
https://www.zotero.org/support/getting_stuff_into_your_library#manually_adding_items
That said, if you really want to do this, you can save a newline-separated list of URLs to a text file, open Tools → Developer → Run JavaScript, and run this code, after adjusting the path on the first line:
```
var path = '/home/username/Desktop/urls.txt';
// Read the file, trim each line, and drop blank lines
var urls = Zotero.File.getContents(path)
    .split('\n')
    .map(url => url.trim())
    .filter(url => url);
await Zotero.HTTP.processDocuments(
    urls,
    async function (doc) {
        var translate = new Zotero.Translate.Web();
        translate.setDocument(doc);
        var translators = await translate.getTranslators();
        if (translators.length) {
            translate.setTranslator(translators[0]);
            try {
                await translate.translate();
                // Translation did not throw, so skip the fallback below
                return;
            }
            catch (e) {}
        }
        // No translator matched, or translation threw: add the page itself
        await ZoteroPane.addItemFromDocument(doc);
    }
)
```
This won't load JavaScript on the page before trying to save, so some pages might not save properly. (There's a way to do that, but it's more complicated and much slower.)
I'm very comfortable with JavaScript, so I'll try the solution you posted. Could you point me to documentation of how to get it to run JavaScript too? I think I can figure it out if I know where to look.
@bwiernik: I think @camjohnson was asking about the alternative approach I mentioned above where it loads JS on the page.
JavaScript knowledge isn't particularly relevant here — this is all Zotero-specific code. But you'd want to use Zotero.HTTP.loadDocuments() instead of Zotero.HTTP.processDocuments(), and then, for safety, re-parse the document using DOMParser:
```
var path = '/home/username/Desktop/urls.txt';
// Read the file, trim each line, and drop blank lines
var urls = Zotero.File.getContents(path)
    .split('\n')
    .map(url => url.trim())
    .filter(url => url);
await Zotero.HTTP.loadDocuments(
    urls,
    async function (doc) {
        // For safety, re-parse the loaded page's HTML with DOMParser
        var parser = new DOMParser();
        var safeDoc = Zotero.HTTP.wrapDocument(
            parser.parseFromString(doc.documentElement.outerHTML, 'text/html'),
            doc.location.href
        );
        var translate = new Zotero.Translate.Web();
        translate.setDocument(safeDoc);
        var translators = await translate.getTranslators();
        if (translators.length) {
            translate.setTranslator(translators[0]);
            try {
                await translate.translate();
                // Translation did not throw, so skip the fallback below
                return;
            }
            catch (e) {}
        }
        // No translator matched, or translation threw: add the page itself
        await ZoteroPane.addItemFromDocument(safeDoc);
    }
)
```
But I'd strongly recommend using the previous one first and only using this if there are pages that aren't working right. Note that loadDocuments() loads pages in parallel, so it isn't meant for running with many URLs at once and will likely crash Zotero and/or your computer if you do so.
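Since loadDocuments() loads pages in parallel, one way to keep each run small is to split the URL list into batches first and run the snippet on one batch at a time. This is only a minimal sketch of the splitting step: chunk() is a hypothetical helper, not part of Zotero, and the batch size of 2 is an arbitrary illustration. It makes no assumption about how loadDocuments() signals that a batch has finished.

```javascript
// Hypothetical helper (not a Zotero API): split an array into batches of
// at most `size` items, preserving order.
function chunk(items, size) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError('size must be a positive integer');
    }
    var batches = [];
    for (var i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Placeholder URLs for illustration only
var exampleUrls = [
    'https://example.com/1',
    'https://example.com/2',
    'https://example.com/3',
    'https://example.com/4',
    'https://example.com/5'
];
var batches = chunk(exampleUrls, 2);
console.log(batches.length); // 3 batches: two of 2 URLs, one of 1 URL
```

Each element of batches is then a short list that could be used in place of urls in the code above, one run at a time.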
```
var path = '~/Desktop/temp/Exported Items.txt';
var urls = Zotero.File.getContents(path).split('\n').map(url => url);
rv = ""
for (var url of urls) {
    try {
        await Zotero.HTTP.processDocuments(
            url,
            async function(doc) {
                rv += "\n" + url + ": ";
                if (!doc) {
                    rv += ("doc is null")
                    return;
                }
                var translate = new Zotero.Translate.Web();
                translate.setDocument(doc);
                var translators = await translate.getTranslators();
                if (translators.length) {
                    translate.setTranslator(translators[0]);
                    try {
                        await translate.translate();
                        rv += "translated successfully "
                        return;
                    } catch (e) {
                        rv += "translation failed with " + e
                    }
                }
                await ZoteroPane.addItemFromDocument(doc);
                rv += "Succeeded"
            }
        )
    } catch (e) {
        rv += "\n" + url + (" Failed Download");
    }
}
return rv + ("\nAll Documents Processed")
```
For the record, I'm moving from Firefox bookmarks to Zotero for managing this content. I've looked at a bunch of tools, including Polarized, Notion, Mendeley, Papers app, pinboard, Hypothes.is, Diigo, etc., but none of them has all the features I need. Zotero is the closest, though, so if a few of these kinks get ironed out, this use case might be a source of new users.
path needs to be the file path to your file with URLs.
[...] getContents().) As I say, it could also be due to invalid characters in the file, in which case the charset won't matter. That's fairly likely, because nearly all URLs are ASCII anyway, so other charsets generally shouldn't come into play. If you just paste a couple of URLs into a new file, I think you'll find it works fine.
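To check a file for the kind of invalid characters mentioned above, something like the following could be run on its contents before importing. It is only a sketch: findSuspectLines() is a hypothetical helper, not a Zotero function, and it simply flags any line containing a character outside printable ASCII (for example a byte-order mark at the start of the file, or a stray carriage return from Windows line endings).

```javascript
// Hypothetical helper (not a Zotero API): report the lines of a URL list
// that contain anything other than printable ASCII characters.
function findSuspectLines(text) {
    var suspects = [];
    text.split('\n').forEach(function (line, index) {
        // Printable ASCII runs from space (0x20) to tilde (0x7E)
        if (/[^\x20-\x7E]/.test(line)) {
            suspects.push({ lineNumber: index + 1, line: line });
        }
    });
    return suspects;
}

// Sample text for illustration: line 1 starts with a byte-order mark and
// line 2 ends with a carriage return; line 3 is clean.
var sample = '\uFEFFhttps://example.com/a\n'
    + 'https://example.com/b\r\n'
    + 'https://example.com/c';
var suspectLines = findSuspectLines(sample).map(function (s) {
    return s.lineNumber;
});
console.log(suspectLines); // line numbers 1 and 2
```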
The final code looks like this (with actual URLs pasted into var path):
```
var path = "URL1,URL2,URL3";
var urls = path.split(',');
await Zotero.HTTP.processDocuments(
    urls,
    async function (doc) {
        var translate = new Zotero.Translate.Web();
        translate.setDocument(doc);
        var translators = await translate.getTranslators();
        if (translators.length) {
            translate.setTranslator(translators[0]);
            try {
                await translate.translate();
                return;
            }
            catch (e) {}
        }
        await ZoteroPane.addItemFromDocument(doc);
    }
)
```
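One caveat with pasting URLs into a comma-separated string: a URL can itself contain a comma, in which case split(',') would cut it in two. A possible alternative, sketched here under assumptions, is to paste one URL per line and validate each entry before handing the list to processDocuments(). parseUrlList() is a hypothetical helper, not part of Zotero, and it assumes the standard URL constructor is available in the environment where it runs.

```javascript
// Hypothetical helper (not a Zotero API): split pasted text into lines,
// trim them, and separate http(s) URLs from everything else.
function parseUrlList(text) {
    var valid = [];
    var rejected = [];
    text.split('\n').forEach(function (line) {
        var candidate = line.trim();
        if (!candidate) {
            return; // skip blank lines
        }
        try {
            var parsed = new URL(candidate);
            if (parsed.protocol === 'http:' || parsed.protocol === 'https:') {
                valid.push(candidate);
                return;
            }
        }
        catch (e) {
            // new URL() throws a TypeError on strings it cannot parse
        }
        rejected.push(candidate);
    });
    return { valid: valid, rejected: rejected };
}

// Placeholder input for illustration only
var result = parseUrlList(
    'https://example.com/a,b\n'
    + '  https://example.com/c  \n'
    + '\n'
    + 'not a url\n'
    + 'ftp://example.com/file\n'
);
console.log(result.valid);    // the two https URLs, whitespace trimmed
console.log(result.rejected); // 'not a url' and the ftp URL
```

result.valid could then take the place of urls in the snippet above, and result.rejected shows which lines were skipped.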
>> It seems like this would be a pretty common workflow but maybe I'm missing something?
> It's just not, I'm afraid — you're the first person I can recall ever asking for this.
I would expect this request to become more common.
A little over a year ago, Firefox removed a bookmark 'notes' feature, where one could add useful notes/info to bookmarks stored in Firefox.
In searching for a fix to restore this ability, or an alternative, I came across a recommendation for using Zotero as the bookmark library instead. Therefore I am looking at moving my existing browser bookmarks gathered over the years to Zotero.
So while I have no issue using the JavaScript supplied here, I am just noting why you may see more interest in such a feature than previously.
I would also like to import a list of URLs, or to be more precise, I want Zotero to visit each URL and mimic the behavior of pushing the "Save to Zotero" button. My overall workflow is to feed a JS file to Zotero via BBT's debug bridge that then reads the URLs from a file just as discussed above. I've been trying to use both "main" variants shown above but was only successful with processDocuments and not loadDocuments (yet).
My main question though is why all variants above have a return after translate.translate(). This does not make any sense to me because then the later ZoteroPane.addItemFromDocument is never executed (and thus nothing added to the database)?
I am nevertheless almost satisfied with the additional error handling camjohnson's version provides. One small thing I would like to check is that none of the docs is saved as type webpage (as this is most likely to be an error in my use case). Is there a way to do that before inserting the result into the db, i.e. within the processDocuments processor?
Also, is there any way to debug this code better? I have been using the Zotero.debug() and the debug output window so far but this is rather ineffective.
```
var path = '/home/username/Desktop/urls.txt';
var urls = Zotero.File.getContents(path).split('\n').map(url => url);
await Zotero.HTTP.processDocuments(
    urls,
    async function (doc) {
        var translate = new Zotero.Translate.Web();
        translate.setDocument(doc);
        var translators = await translate.getTranslators();
        if (translators.length) {
            translate.setTranslator(translators[0]);
            try {
                await translate.translate();
                return;
            }
            catch (e) {}
        }
        await ZoteroPane.addItemFromDocument(doc);
    }
)
```
I'm getting "Zotero is not defined". I'm trying to import multiple URLs. I tried to import from Chrome history but didn't succeed.
[Exception... "Component returned failure code: 0x80520001 (NS_ERROR_FILE_UNRECOGNIZED_PATH) [nsIFile.initWithPath]" nsresult: "0x80520001 (NS_ERROR_FILE_UNRECOGNIZED_PATH)" location: "JS frame :: chrome://zotero/content/xpcom/file.js :: Zotero.File</this.getContents :: line 159" data: no]
I am also getting a very similar error:
[Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIFileInputStream.init]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: chrome://zotero/content/xpcom/file.js :: Zotero.File</this.getContents :: line 167" data: no]
and I implemented the same code.
Yes I have tried many different ways to fix the file path error and I even included an absolute path to the text file, but I was still unable to fix the problem.
Is there a different way to indicate the filepath with Windows?
This is what is going into path url right now: "C:\Users\my_user_name\Desktop\text_file.txt"
The rest of the code is the same.
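One possible cause, offered as an assumption rather than something confirmed in this thread: inside a JavaScript string literal, a backslash starts an escape sequence, so a Windows path typed with single backslashes does not reach getContents() as written. In the path above, \t becomes a tab character and the other backslashes are silently dropped. The sketch below shows the effect using the path from the post, plus two standard ways to keep the backslashes: doubling them, or using String.raw.

```javascript
// As typed in the post: the backslashes are consumed as escape sequences
var unescaped = "C:\Users\my_user_name\Desktop\text_file.txt";

// Doubling each backslash keeps it in the resulting string
var escaped = "C:\\Users\\my_user_name\\Desktop\\text_file.txt";

// String.raw leaves backslashes in a template literal untouched
var raw = String.raw`C:\Users\my_user_name\Desktop\text_file.txt`;

console.log(unescaped.includes("\\")); // false: no backslashes survive
console.log(unescaped.includes("\t")); // true: "\t" turned into a tab
console.log(escaped === raw);          // true: both keep the real path
```

Whether this is the actual cause of the errors above is untested here, but it is cheap to rule out by trying one of the last two forms for path.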