CiteULike personal PDF download
I have just tried Zotero and I think it is great! I have a large collection of articles on CiteULike that I would like to import and it seems to work quite OK except that my PDFs are not downloaded. I have tens of PDFs and it would be a major pain to have to download each manually and then attach it to the library item imported from CiteULike. Is there some way to get PDFs imported?
To find out, click the gear icon and then the "General" tab.
@phdthesis{Tracz:1997:PHD,
author = {Tracz, William J. },
citeulike-article-id = {3385689},
keywords = {algebraic-specification, architecture},
month = {March},
posted-at = {2008-10-08 10:04:23},
priority = {2},
school = {Stanford University},
title = {Parameterized Programming in LILEANNA},
year = {1997}
}
The import worked fine except that the "school" key was not imported: the Zotero entry shows a blank University field.
I think that CiteULike needs a different translator. It would be best to import the BibTeX, since it seems more complete than CiteULike's URI. CiteULike makes you go through a form to get a BibTeX file, so I guess it is a bit harder than getting the EndNote record. But to make the import complete, the HTML page of the CiteULike article itself needs to be examined, and any extra URLs and the PDF files have to be imported from there. I would implement it in a jiffy if I were fluent in JavaScript and Zotero. As it is, I hope that someone else can have a look.
I did this a long time ago and, as far as I can remember, it mostly worked. I still have plenty of entries from CiteULike with the original PDFs. At this point I can't walk you through the process (I don't remember anything), but you can look up how to modify imports in Zotero and see if my patch still applies.
In fact, I don't see why any of the changes in Zotero in the last several years should prevent that version from working. Install Scaffold 2.0 in your up-to-date Zotero installation and give the modified translator a try. If it doesn't work, it's more likely because of a change in CiteULike behavior than in Zotero behavior.
ajlyon: I see that your suggestion makes a lot of sense. However, I know close to no JavaScript, nor anything about CiteULike's detailed behavior, and learning JS just for this was not an appealing idea. So, in the meantime, I found a solution that is a very ugly kludge, but it worked. In case it helps anybody, I am posting it here.
The basic idea is to use SyncUThink (http://www.andrewberman.org/projects/sync/), which, among other things, downloads all the PDFs one has in CiteULike. So all we need to do is modify the BibTeX file from CiteULike so that it includes the full path to each locally downloaded PDF file. For this modification we need an easy way to match the CiteULike BibTeX records to the PDF file names (without digging into the Java code of SyncUThink or learning about CiteULike's structures). So this is what I did. Again, be warned that this is an ugly kludge.
- Use the standalone version of SyncUThink (download it from the link above).
- If you run it from an Emacs (shell) buffer, the correspondence between CiteULike IDs and PDFs is visible in the output there. Save that buffer and call it "emacs.buffer.txt".
- Keep only the lines with interesting stuff from that buffer (one per record):
    grep Downloading emacs.buffer.txt > tmp1
- The CiteULike ID is the third-from-last field (the separator is "/"). Get just the ID and the PDF file name, so we can see the correspondence between CiteULike's ID and file name in a two-column file:
    awk -F"/" '{print $9, $11}' tmp1 > id.and.pdf
- Now we only have to match the ID in the BibTeX file with the PDF and, for the sake of simplicity, replace the "citeulike-article-id" field with a "pdf" field holding the full path, so we can import the BibTeX file into Zotero. I do it in R.
### What follows is all done from within R
id.pdf <- read.table(file = "id.and.pdf", header = FALSE)
common.dir <- "/home/ramon/CUL3-pdf/" ## where the pdfs live
bib <- readLines("rdiaz.bib") ## the file where SyncUThink stored your bib
pos.to.check <- grep("citeulike-article-id", bib) ## lines that have the IDs

## given one "citeulike-article-id = {...}," line, return the matching
## "pdf = {...}," line, or NULL if no pdf was downloaded for that ID
get.the.pdf.path <- function(z, id.pdf, common.dir) {
    ## the ID is whatever sits between "{" and "}"
    id <- as.numeric(strsplit(strsplit(z, "\\{")[[1]][2], "\\}")[[1]][1])
    pos.id <- match(id, id.pdf[, 1])
    if(!is.na(pos.id))
        return(paste("pdf = {", common.dir, id.pdf[pos.id, 2], "},", sep = ""))
    else
        return(NULL)
}

## find the correspondence ID -> pdf and, if existing, substitute the field
for(i in pos.to.check) {
    tpdf <- get.the.pdf.path(bib[i], id.pdf, common.dir)
    if(!is.null(tpdf))
        bib[i] <- tpdf
}

## the file "bibwithpdfs.bib" can be imported into Zotero
writeLines(bib, "bibwithpdfs.bib")
########## We are done #######
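For anyone without R at hand, here is a rough sketch of the same steps done entirely with grep and awk. It is only an illustration under assumptions: the log lines and the BibTeX records below are made-up sample data (I have not reproduced SyncUThink's real output), and the only thing taken from the steps above is that fields are separated by "/" with the ID third from last and the file name last. Counting from the end with NF instead of using $9 and $11 means the number of leading path components does not matter. Like the R version, it assumes PDF file names contain no spaces.

```shell
#!/bin/sh
# Sketch only: sample data is hypothetical, not real SyncUThink output.
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical saved buffer: "/"-separated, ID third from last, file name last.
cat > emacs.buffer.txt <<'EOF'
Starting sync
Downloading http://example.org/a/b/c/d/e/3385689/x/tracz1997.pdf
Downloading http://example.org/a/b/c/d/e/1234567/y/other.pdf
Done
EOF

# Hypothetical BibTeX export; only the first record has a downloaded PDF.
cat > rdiaz.bib <<'EOF'
@phdthesis{Tracz:1997:PHD,
    author = {Tracz, William J. },
    citeulike-article-id = {3385689},
    title = {Parameterized Programming in LILEANNA},
    year = {1997}
}
@article{NoPdf:2000,
    citeulike-article-id = {42},
    title = {No PDF here},
    year = {2000}
}
EOF

# Two-column file "ID filename", counting fields from the end of each line.
grep Downloading emacs.buffer.txt |
    awk -F"/" '{print $(NF-2), $NF}' > id.and.pdf

common_dir="/home/ramon/CUL3-pdf/"   # where the pdfs live

# Replace each citeulike-article-id line that has a known PDF with a pdf field.
awk -v dir="$common_dir" '
    FILENAME == ARGV[1] { pdf[$1] = $2; next }   # first file: ID -> file name
    /citeulike-article-id/ {
        id = $0
        sub(/^[^{]*[{]/, "", id)                 # drop everything up to "{"
        sub(/[}].*$/, "", id)                    # drop "}" and what follows
        if (id in pdf) { print "pdf = {" dir pdf[id] "},"; next }
    }
    { print }                                    # every other line unchanged
' id.and.pdf rdiaz.bib > bibwithpdfs.bib
```

Records whose ID has no downloaded PDF keep their original "citeulike-article-id" line, as in the R version.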
It would be nice to improve CiteULike support, and perhaps it will happen one day.