Advice before migrating via RIS

My ten-year-old version 1 of Mekentosj's Papers app has been staggering under the weight of my 28,000 PDFs and its own design flaws and missing features. I finally have enough time to migrate the library to Zotero, which I hope to use with Better BibTeX and ZotFile to keep the linked files organized in my Sync folder for sync.com. I've had Papers dump all its metadata into (malformed) RIS, (malformed) EndNote XML, and (malformed) BibTeX. (After all, why would anyone ever want a well-formed RIS file that actually contained the authors' first names and the correct years?) After a bunch of cleaning and merging information between the formats, I finally have 25 megabytes of RIS entries like...

TY - JOUR
ID - 25686
AU - Warren, Richard M.
AU - Obusek, Charles J.
Y1 - 1971
JF - Perception and Psychophysics
JA - Perception & Psychophysics
SN - 0031-5117
T1 - Speech perception and phonemic restorations
M3 - 10.3758_BF03212667
VL - 9
IS - 3B
SP - 358
EP - 362
L1 - file://localhost/Users/kevin/Sync/Papers/1971/Warren/Perception%20&%20Psychophysics%201971%20Warren-1.pdf
L3 - papers://312B67BD-FE2F-416D-BD92-113D7395966B/Paper/p25686
ER -


My questions...

Should I change anything in the L1 line so that Zotero/ZotFile can find the linked file? The "L1" code? Delete "file://localhost"? If I'm going to set Zotero's Linked Attachment Base Directory to "Users/kevin/Sync/Papers", should I delete that part of the path? And am I going to get bitten by the fact that Papers couldn't be bothered to URL-encode the ampersand in the journal name part of the file name?


For years, I've been maintaining a separate (minimal) BibTeX file for a subset of the papers (about 6000). I'm guessing this would be the most convenient time for me to merge the BibTeX keys into Zotero's records for those papers — not all of them are completely predictable from the rules that I'll be telling Better BibTeX to use. Is there an RIS field that I can put the key in so that Better BibTeX will be able to find it and use it?


And a more general/subjective question...

I never liked Papers' directory structure and its file naming scheme of "Journal Name Year First Author". But the thought of letting ZotFile move and rename 28,000 files seems kind of intimidating. (Is it even sane to expect it to succeed?) Should I just stick with the existing structure and filenames, or is it worth sucking up all the extra bandwidth that resyncing the new files will mean?

Thanks.
  • Pretty sure there's no way to handle the BibTeX keys in RIS.

    As for importing the files, don't delete the "Users/kevin/Sync/Papers" part -- Zotero doesn't use the base directory during import, so that'd likely break things.

    I think removing file://localhost would work best, but I'd recommend just testing this with a minimal RIS file before you run through the whole thing. Honestly don't know about the & - again, easiest to try it out.

    Also, when importing, I'd cut the RIS file into chunks. Importing 28k items at once might be rough.

    I wouldn't see why moving 28k files with ZotFile shouldn't work (though I'd do it in batches). Whether it's worth it is a question for you. Personally, I don't care where my files are outside Zotero. I only ever access them from within.

  • BBT honors keys that live in the extra field if they're preceded by Citation Key:, so if there's a way to import to the extra field, it's automatable.
  • From the looks of it, M2 imports to extra, so M2 - Citation Key: somethingfancy should do the job.
  • I like your sense of minimal BTW ☺️
  • Another option is to import both and use the Zotero duplicate merging feature, but if it's feasible to merge beforehand, that's going to be more convenient than clicking through 6000 items (says the TeX-head who sees all problems as automation problems)
  • Thanks, guys. I'll try some small tests, then cut the RIS into chunks, and try "M2 - Citation Key:".

    @emilianoeheyns: By "minimal", I meant only the fields absolutely necessary for a printed bibliography — so no abstract, keywords, journal publisher's address, local file refs, etc. But, yeah, 6000 is pretty ridiculous. (In my defense, I *have* cited over 3500 of them, at least in my personal notes.)
  • My collection for my master's thesis has 3000 items -- not judging, just always funny to see academic packrats.
  • Thanks again.

    For anyone who ends up here after googling a similar question, here's what's worked in my small tests:

    Zotero can find my files if the RIS entry has a line like:

    L1 - file:///D:/Sync/Papers/2003/Fischer/Perception%20&%20Psychophysics%202003%20Fischer.pdf

    (I'm doing the conversion on the D-drive of a Windows machine. I'm guessing that "file:///Users/kevin/blahblahblah" would work on a Mac.) The raw ampersand didn't mess things up.
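For anyone who wants to automate the L1 change across a large file, here is a minimal Python sketch of the prefix swap described above. The two prefixes are taken from the examples in this thread and are only assumptions about your own setup; the function names are made up for illustration. The regex accepts one or more spaces around the hyphen, since files may differ from how the lines display here.

```python
import re

# Assumed prefixes, copied from the examples in this thread; adjust both
# to match your own export and the machine you are importing on.
OLD_PREFIX = "file://localhost/Users/kevin/Sync/Papers/"
NEW_PREFIX = "file:///D:/Sync/Papers/"

# Matches an attachment line; group 1 is the tag plus separator,
# group 2 is the URL that follows it.
L1_LINE = re.compile(r"^(L1\s+-\s+)(.*)$")


def rewrite_l1(line: str) -> str:
    """Swap the old URL prefix for the new one on L1 lines only."""
    match = L1_LINE.match(line)
    if match and match.group(2).startswith(OLD_PREFIX):
        rest = match.group(2)[len(OLD_PREFIX):]
        return match.group(1) + NEW_PREFIX + rest
    return line  # every other line passes through untouched


def rewrite_ris(text: str) -> str:
    """Apply rewrite_l1 to every line of an RIS file's contents."""
    return "\n".join(rewrite_l1(line) for line in text.split("\n"))
```

The rest of the path, including the raw ampersand, is left exactly as Papers wrote it. As suggested earlier in the thread, it is worth trying the result on a minimal RIS file before converting everything.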

    Better BibTeX *will* find and use a key in a field like:

    M2 - Citation Key: SmithJonesZhang2018b
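Merging existing keys into the RIS before import could be sketched roughly as follows. This assumes you have already built a lookup from each record's ID value to its BibTeX key (how to match the two files is not covered here), and that the M2 separator should look the way it is displayed in this thread. The function name and the key in the usage example are hypothetical.

```python
import re

ID_LINE = re.compile(r"^ID\s+-\s+(.*\S)\s*$")  # captures the record's ID value
ER_LINE = re.compile(r"^ER\s+-")               # end-of-record marker


def add_citation_keys(ris_text: str, keys_by_id: dict[str, str]) -> str:
    """Insert an 'M2 - Citation Key: ...' line before ER for known IDs."""
    out = []
    current_id = None
    for line in ris_text.split("\n"):
        id_match = ID_LINE.match(line)
        if id_match:
            current_id = id_match.group(1)
        if ER_LINE.match(line):
            key = keys_by_id.get(current_id)
            if key is not None:
                # Separator copied as shown in this thread; match your file.
                out.append(f"M2 - Citation Key: {key}")
            current_id = None  # reset so a key never leaks into the next record
        out.append(line)
    return "\n".join(out)
```

For example, `add_citation_keys(text, {"25686": "WarrenObusek1971"})` would add one M2 line to the record whose ID is 25686 and leave every other record unchanged. Records without an entry in the lookup get no M2 line, so Better BibTeX would presumably generate their keys from its own rules.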

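The earlier advice to cut the RIS file into chunks can be sketched like this. It is only an illustration: it assumes every record ends with an ER line, the function names are invented, and nobody in this thread suggested a particular chunk size, so that is left as a parameter to settle by experiment.

```python
import re

ER_LINE = re.compile(r"^ER\s+-")  # end-of-record marker


def split_records(ris_text: str) -> list[str]:
    """Split RIS text into one string per record, each ending in a newline."""
    records, current = [], []
    for line in ris_text.splitlines():
        if not current and not line.strip():
            continue  # skip blank lines between records
        current.append(line)
        if ER_LINE.match(line):
            records.append("\n".join(current) + "\n")
            current = []
    if current:  # keep a trailing record even if its ER line is missing
        records.append("\n".join(current) + "\n")
    return records


def chunk_records(records: list[str], chunk_size: int) -> list[str]:
    """Group records into chunks, separated by one blank line."""
    return [
        "\n".join(records[i:i + chunk_size])
        for i in range(0, len(records), chunk_size)
    ]
```

Each string returned by `chunk_records` can be written to its own file and imported separately, so a problem with one batch does not mean redoing the whole import.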
