Abbreviations for Zotero
In parallel with work on the citation processor (citeproc-js), I've put together an Abbreviations Plugin that enables external maintenance of abbreviation lists. The plugin seems to work with official Zotero as well as the experimental multilingual client (MLZ) for which it was originally designed. The UI is not ideal, but I've recently eliminated some of its more annoying quirks, and it should be serviceable. The link above leads to an install page, with an onward link to some rough documentation.
The plugin stores abbreviation lists on a per-style basis in an SQLite database. Only the abbreviations relevant to an open document are pulled into memory, so it should be possible to carry large abbreviation lists for a large number of styles without a serious impact on performance.
Please post feedback on the plugin to this discussion thread.
I have a couple of questions and a question/suggestion.
1. Could you elaborate on the importing process for those who are less familiar with JSON or CSL files? For testing, I downloaded one of your MLZ styles, but it's a CSL file, not a JSON file that the plugin looks for when importing.
2. I've downloaded the PubMed abbreviation list. How do I convert this into a format that the plugin can parse out? I'm assuming it will require some find & replace operations to the file.
3. I think a repository of abbreviation lists for this plugin on the web where anyone can add to the lists in a wiki-like collaborative manner would be wonderful. Would there be a legal hurdle to such implementation?
Thanks!
Here is a simple example of the JSON structure. Building a list of journal abbreviations manually would just be a matter of filling in the blanks, as it were. To convert data from a public list to this format would require some modest scripting, but the abbrevs are just one-to-one mappings, so apart from data cleanup etc it shouldn't be a huge task. Journal abbreviations go in the container-title segment; the others can be left empty:
{
    "default": {
        "container-title": {
            "European Human Rights Reports": "EHRR",
            "All England Reports": "All E.R."
        },
        "collection-title": {},
        "institution-entire": {},
        "institution-part": {},
        "nickname": {},
        "number": {},
        "title": {},
        "place": {},
        "hereinafter": {},
        "classic": {},
        "container-phrase": {},
        "title-phrase": {}
    }
}
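For anyone who wants to script the conversion rather than fill in the blanks by hand, here is a minimal sketch in Python. It assumes the source list has already been reduced to two-column CSV (full name, abbreviation); the function names and the sample data are made up for illustration, and a real public list will need its own cleanup first.

```python
import csv
import io
import json

# Segment names copied from the example structure above.
SEGMENTS = [
    "container-title", "collection-title", "institution-entire",
    "institution-part", "nickname", "number", "title", "place",
    "hereinafter", "classic", "container-phrase", "title-phrase",
]

def read_pairs(csv_text):
    """Yield (full name, abbreviation) pairs from two-column CSV text."""
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:
            yield row[0], row[1]

def build_abbreviation_list(pairs):
    """Put journal abbreviations in container-title; leave other segments empty."""
    data = {"default": {segment: {} for segment in SEGMENTS}}
    journals = data["default"]["container-title"]
    for full_name, abbreviation in pairs:
        full_name = full_name.strip()
        abbreviation = abbreviation.strip()
        if full_name and abbreviation:  # skip incomplete rows
            journals[full_name] = abbreviation
    return data

# Hypothetical input; in practice this would be read from a file.
sample = "European Human Rights Reports,EHRR\nAll England Reports,All E.R.\n"
result = build_abbreviation_list(read_pairs(sample))
print(json.dumps(result, indent=4, ensure_ascii=False))
```

Writing the output with json.dumps (rather than assembling the text by hand) means the quoting and escaping come out valid automatically.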
One point I would mention here is that for some journals I have different full names in my library (I think you are aware of this possibility). Cleaning up my library would probably be the best solution, but it is time-consuming. For the abbreviations, I solved this by adding all my different name versions to the JSON file.
What I would suggest, or rather request, is something like a merge function that collapses different journal names into one. It could work like tag renaming: if you rename a tag once in the tag selector, it changes in all the references that use it (and I think this is also relevant for the creator name fields). I'm sure somebody has requested this before, but I think it fits well with this approach.
I wrote a script and made a JSON abbreviation file from the PubMed journal database. However, the import function does not seem to work correctly. When I import this JSON file, which contains a huge number of abbreviation entries, nothing changes. If I then export, I get only the entries that were there before the import (meaning nothing was imported). Only when I type an abbreviation in manually does the entry show up modified in the exported file.
Could you have a look at my JSON file? Maybe I missed something.
You can download it here
https://docs.google.com/open?id=0B7d1ivQI3OkpS3AwdE9Ia3lfeEk
cheers,
https://docs.google.com/open?id=0B4OI8S-ZuErIR2lRYThVOEN5X1k
However, it seems to me that with ~25,000 entries, Zotero and Firefox just stop responding.
Another inconvenience is that the plugin only matches a journal title when the case matches exactly (i.e. "The Journal of cell biology" will not match "The Journal of Cell Biology"). Since capitalization varies from one citation style to another, this becomes quite problematic. Is it possible to make the plugin ignore capitalization?
The speed of import itself can be improved. It's running in a single transaction (which is faster than not), but we can speed things up a bit further by using precompiled storage operations. I'll look into setting that up as time permits.
As for ignoring capitalization: this could be done, but it would mean data loss, and so increase ambiguity. I'm not sure it would be a good idea.
The PubMed list itself is pretty messy, with some journals registered in abbreviated form. Forcing everything to lowercase would increase the possibility that the abbreviation of one journal (registered as its "full" name) overlaps with the proper name of another. If matches are case sensitive there will be more misses, but the user can still register a missed journal name form when it is encountered, which seems adequate.
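To make the trade-off concrete, here is a rough sketch in Python of one possible middle ground: try the exact, case-sensitive match first, and fall back to a case-insensitive match only when it is unambiguous. This is not how the plugin works; the function names and sample entries are invented for illustration.

```python
from collections import defaultdict

def build_case_index(journals):
    """Group registered names by their lowercased form."""
    index = defaultdict(list)
    for full_name in journals:
        index[full_name.lower()].append(full_name)
    return index

def lookup(journals, index, name):
    """Exact match first; case-insensitive only if exactly one candidate."""
    if name in journals:
        return journals[name]
    candidates = index.get(name.lower(), [])
    if len(candidates) == 1:
        return journals[candidates[0]]
    return None  # no match, or ambiguous once case is ignored

# Hypothetical entries, including two keys that collide when lowercased.
journals = {
    "The Journal of Cell Biology": "J. Cell Biol.",
    "EHRR": "EHRR",
    "Ehrr": "Ehrr (some other title)",
}
index = build_case_index(journals)

print(lookup(journals, index, "The Journal of cell biology"))  # J. Cell Biol.
print(lookup(journals, index, "ehrr"))                         # None
```

The same index also gives a quick way to count how many names in a list would collide if everything were forced to lowercase: any group with more than one member is an overlap of the kind described above.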
Give it a try when you have a chance. I think the changes will work across all platforms, but if you have difficulties, let me know and I'll sort things out.
The speed is improved; however, the browser kept popping up this warning:
"Warning: Unresponsive script.........Script:chrome://abbreviations-for-zotero/content/xpcom/import.js:80", even when the importing was finished. For example, when I chose to import all 26,000 entries to replace the entire local list, the browser was busy for ~10 sec and then the warning popped up. If I clicked cancel at this first warning and then exported the list, only about 8,000 entries had been imported.
As a result, I have to click continue on the unresponsive script warning three times, then click cancel the fourth time it pops up, to get the entire list imported. If I never click cancel, the warning just keeps popping up forever (more than 10 times at least).
It looks like the script just didn't tell the browser it has finished the importing task.
On the other hand, it would be really helpful if you can add an option for the plugin to ignore the case, because almost 90% of my journals are not matched because of this capitalization issue.
Thanks a lot for your help,
You are probably not seeing the speed improvement yet. The main speed boost came from adding a few database indexes that were missing. These don't (yet) get installed in an existing database if they are missing. I'll set that up, and tweak another thing that is probably slowing things down (the default import method will still be very slow, I think, due again to a missing index or two).
More news in a few days.
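For readers curious about why these changes matter, the three ideas discussed above (a single transaction, one precompiled statement reused for every row, and an index on the lookup columns that is created if missing) can be sketched with Python's sqlite3 module. The table and column names here are assumptions for illustration only; they are not the plugin's actual schema, and the plugin itself is written in JavaScript.

```python
import sqlite3

def ensure_schema(conn):
    """Create the table and lookup index if they are missing (safe to re-run)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS abbreviations ("
        "style TEXT NOT NULL, category TEXT NOT NULL, "
        "full_name TEXT NOT NULL, abbreviation TEXT NOT NULL)"
    )
    # IF NOT EXISTS lets an older database pick up the index on upgrade.
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS abbreviations_lookup "
        "ON abbreviations (style, category, full_name)"
    )

def import_pairs(conn, style, category, pairs):
    """Insert every pair inside one transaction with one parameterized statement."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO abbreviations "
            "(style, category, full_name, abbreviation) VALUES (?, ?, ?, ?)",
            ((style, category, full, abbrev) for full, abbrev in pairs),
        )

conn = sqlite3.connect(":memory:")
ensure_schema(conn)
ensure_schema(conn)  # second call is a no-op

# Synthetic data, roughly the size of the list mentioned in this thread.
pairs = [("Journal %d" % i, "J. %d" % i) for i in range(26000)]
import_pairs(conn, "default", "container-title", pairs)
import_pairs(conn, "default", "container-title", pairs)  # re-import replaces rows

count = conn.execute("SELECT COUNT(*) FROM abbreviations").fetchone()[0]
print(count)
```

Without the index, each INSERT OR REPLACE would have no uniqueness constraint to act on, and each lookup would have to scan the table, which gets slow quickly at this size.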
Screenshot:
https://dl.dropbox.com/u/5277753/AddEdit%20Citation_2012-06-04_15-56-02.png
Some feedback about the script: we need an option to create versions with and without full stops, depending on citation format. Currently the Index Medicus list compiled by qztseng has some abbreviations with full stops and others without.
The other issue is slightly different versions of journal names in index medicus and Zotero. But this has already been mentioned, I won't elaborate on that.
I think the best solution rather than a list would be kind of a dictionary - there are quite a few words that tend to repeat that could be easily replaced by a script.
Thanks,
Rob
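Both suggestions above (a word-level dictionary and a with/without full stops option) could be prototyped along these lines. This is only a sketch: the word list and the set of dropped words are tiny illustrative samples, not an authoritative abbreviation source, and the function name is hypothetical.

```python
# Hypothetical word-level dictionary; entries are for illustration only.
WORD_ABBREVIATIONS = {
    "journal": "J.",
    "biology": "Biol.",
    "international": "Int.",
}

# Words dropped from the abbreviated form (an assumption for this sketch).
SKIP_WORDS = {"of", "the", "and", "for", "in"}

def abbreviate(title, full_stops=True):
    """Abbreviate a title word by word, optionally stripping full stops."""
    words = []
    for word in title.split():
        key = word.lower()
        if key in SKIP_WORDS:
            continue
        abbreviated = WORD_ABBREVIATIONS.get(key, word)  # unknown words pass through
        if not full_stops:
            abbreviated = abbreviated.replace(".", "")
        words.append(abbreviated)
    return " ".join(words)

print(abbreviate("The Journal of Cell Biology"))                    # J. Cell Biol.
print(abbreviate("The Journal of Cell Biology", full_stops=False))  # J Cell Biol
```

A dictionary like this would cover the recurring words, but titles with irregular abbreviations would still need explicit entries in the list.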
An unresponsive script warning isn't an error; it just means that the import hasn't finished yet. If simple small imports are confirmed to work and you have a valid json file, you would want to click "Continue".
No luck with the PubMed list. But I cleaned up the JSON in my own list with the help of an online validator (http://jsonlint.com/), as well as by saving as ANSI. The largest hurdle was the different characters for " (right, left, simple).
Works great. Is there an abbreviations repository I can upload it to? The subjects are biology, ecology and environmental science.
Thanks for supporting Zotero
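The quote cleanup described above can also be done offline. Below is a minimal Python sketch that replaces left and right typographic double quotes with the plain one and then checks whether the result parses as JSON. The function name is made up, and the blanket replacement is an assumption that only suits files where curly quotes were used as delimiters; a curly quote inside a journal title would be converted too and could break the file.

```python
import json

# Left and right typographic double quotes mapped to the plain ASCII quote.
QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"'})

def clean_and_validate(text):
    """Return (parsed data, None) on success or (None, error message) on failure."""
    cleaned = text.translate(QUOTE_MAP)
    try:
        return json.loads(cleaned), None
    except json.JSONDecodeError as err:
        return None, "line %d, column %d: %s" % (err.lineno, err.colno, err.msg)

# Hypothetical input mixing curly and plain quotes.
raw = (
    "{\u201cdefault\u201d: {\u201ccontainer-title\u201d: "
    '{"All England Reports": "All E.R."}}}'
)
data, error = clean_and_validate(raw)
print(data)
print(error)
```

The error message includes the line and column, which helps locate the offending character in a long list.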
There is no community repository yet, but we can open one. Shall I open a space for lists on GitHub, and reflect it on CitationStylist as a solution for the time being?
Now I'm looking for a good list for statistics and mathematics...
Alas, I couldn't quite get the JSON format that fbennett suggested. My JSON file does validate. However, when it is imported into the plugin, only the journal title seems to be there. I hope that someone can convert the list so that it is useful.
The two files may be found at:
http://www.safetylit.org/old-stuff/journalslist-120711.csv
http://www.safetylit.org/old-stuff/journalslist-120711.json
I tried to learn to do the necessary scripting to get this to work but I'm frustrated at my lack of skills and my lack of time needed to spend on learning how to do what needs to be done.