Pull Website Title in automatically for Create New Item From Current Page

Seems like most of the websites and blogs I visit aren't COinS-enabled, so I wind up using the Create New Item From Current Page button a lot, and that defaults to the Web Page item type. Zotero clearly pulls whatever's in the title element into its Title field, which makes sense. But most CMSs also put the website's title, and sometimes the author's name, into that title element, so I then have to clean out the Title field and move the author's name and the website or blog's title into the proper fields. I'd love it if Zotero would try to do that for me -- most of the data is semi-structured, using characters such as the pipe | to separate the page's title from the website's title. I'm sure Zotero would get it wrong sometimes if it did this, but as it is, it's wrong (well, "wrong") ALL the time.

Some conventions have developed for this -- here are some examples of what's between web page title tags. You can see that it wouldn't be too hard to at least guess which part is the web page title and which is the website title:

<title>Building a Large-Scale Print-Journal Repository - Wired Campus - The Chronicle of Higher Education</title>

<title>National Digital Public Library Gets One Step Closer to Reality | Fast Company</title>

<title>JISC Digitisation Programme &raquo; The UK&#8217;s National Digital Library &#8211; A Digital Public Space</title>

<title>Google & the Future of Books by Robert Darnton | The New York Review of Books</title>

Even just telling Zotero to put anything after a pipe symbol into the Website Title field would save me a good chunk of time. Thanks!
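
To make the idea concrete, here is a rough sketch of the kind of guess described above. It is not Zotero code: the function name splitPageTitle and the returned field names are made up for illustration, and it assumes the title text has already had its HTML entities decoded (as document.title would give you). It splits at the last separator (pipe, spaced hyphen, dash, or ») and treats what follows as the website title, then looks for a trailing "by Some Name" as the author. It will guess wrong on titles like the JISC one, where the site name comes first.

```javascript
// Hypothetical helper, not part of Zotero: guess page title, website title,
// and author from the text of a <title> element (entities already decoded).

// A separator is a pipe, hyphen, en dash, em dash, or » with whitespace on
// both sides, so hyphenated words like "Large-Scale" are left alone.
const SEPARATOR = /\s+[|\-\u2013\u2014\u00BB]\s+/g;

// Trailing byline such as "... by Robert Darnton": 2 to 4 capitalized words.
const BYLINE = /^(.+)\s+by\s+([A-Z][\w.'-]*(?:\s+[A-Z][\w.'-]*){1,3})$/;

function splitPageTitle(rawTitle) {
  const text = rawTitle.trim();
  const result = { title: text, websiteTitle: "", author: "" };

  // Find the last separator; everything after it is assumed to be the site.
  let last = null;
  for (const match of text.matchAll(SEPARATOR)) {
    last = match;
  }
  if (last !== null) {
    result.title = text.slice(0, last.index).trim();
    result.websiteTitle = text.slice(last.index + last[0].length).trim();
  }

  // Pull a trailing byline out of the page title, if there is one.
  const byline = BYLINE.exec(result.title);
  if (byline !== null) {
    result.title = byline[1].trim();
    result.author = byline[2].trim();
  }
  return result;
}

console.log(
  splitPageTitle(
    "Google & the Future of Books by Robert Darnton | The New York Review of Books"
  )
);
```

For the Darnton example this should give "Google & the Future of Books" as the title, "Robert Darnton" as the author, and "The New York Review of Books" as the website title. The byline rule is the shakiest part, since any title that happens to end in "by" plus capitalized words would be misread, so a guess like this would only be useful if the result stayed easy to correct by hand.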
  • I think that even small tricks in the basic save page function are not a good idea. I think that we can extend our current site translators to save from more sites -- I'm working on that now.

    I'm also hoping that we'll get some simplifications for site translators in the next release of Zotero, so that people can more easily and quickly implement them for new sites.
  • Fair enough -- I'm sure you know more about the potential problems than I do, and it's not your fault that I'm getting information from the millions of blogs that'll never have individual site translators. The Chronicle of Higher Education has a translator, for instance, but not its "Wired Campus" blog on the same site. Though I wonder whether it'd be possible to do a kind of CMS translator as well as site translators -- without having researched it too carefully, seems like WordPress and Drupal have certain conventions like the kind I describe above. But oh well.
  • We do have a general Blogger translator, and a general WordPress translator (LiveJournal, etc.) could be done pretty easily.

    I wrote the Chronicle translator, and I'll look into fixing the blog support -- it did work at one point.
  • I think that'd be the way to go -- add relatively generic translators that pick up standards. All Blogspot blogs are already recognized that way, and I don't see why that shouldn't be the case for WordPress, Drupal, and similar sites as well.
  • I just wrote an LJ translator, since LJ (ЖЖ) is the lifeblood of the Runet.

    Please go to http://github.com/ajlyon/zotero-bits/raw/master/LiveJournal.js and save the file to the translators directory of your Zotero data directory (http://www.zotero.org/support/zotero_data).

    It should start working. A translator like this for WordPress could be written rather easily.