automatic tagging.

o.hymas · December 9, 2007

It would be great to see PDF tagging go the way of MP3 tagging. PDFexplorer has started this off by making it easier to create metadata relevant to a PDF, but the input is still manual.

For a long time, music databases such as Gracenote have been able to automatically identify a piece of music and create the metadata straight from the music itself. Similar techniques could be used by Zotero to create the metadata and the tags for PDFs.

It should be easy enough to create a unique ID for each PDF (e.g. by reading the first 300 words), which would be stored in a central database along with the user-added tag information for that PDF.
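
As a rough illustration of the ID idea, here is a minimal Python sketch. It assumes the text has already been extracted from the PDF by some other tool, and it picks SHA-256 and a simple lowercase word split purely for illustration. The function name `fingerprint` is hypothetical and not part of Zotero or PDFexplorer.

```python
import hashlib
import re


def fingerprint(text: str, n_words: int = 300) -> str:
    """Return a hex ID derived from the first n_words words of text.

    Assumption: lowercasing and keeping only word characters is enough
    to make two extractions of the same document agree.
    """
    # Normalise: lowercase, split on anything that is not a word character.
    words = re.findall(r"\w+", text.lower())[:n_words]
    # Hash the normalised prefix so the ID has a fixed length.
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()
```

Two copies of the same paper would get the same ID as long as their first 300 words match after normalisation. Different editions, or scans with OCR errors, would not, so a real system would probably need something more tolerant.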

A Zotero user with a new PDF could then first check whether the tag info is already in the central database and, if not, create the tag information, which would then be sent to the central database and shared with all users.
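
The check-then-contribute flow could be sketched roughly as follows. This is only an illustration: `TagRegistry` is a hypothetical in-memory stand-in for the central database, `tags_for` and `ask_user` are made-up names, and `doc_id` is assumed to be whatever unique ID has been computed for the PDF.

```python
from typing import Callable, Dict, List, Optional


class TagRegistry:
    """Hypothetical in-memory stand-in for the shared central database."""

    def __init__(self) -> None:
        self._tags: Dict[str, List[str]] = {}

    def lookup(self, doc_id: str) -> Optional[List[str]]:
        # Return a copy of the stored tags, or None if the ID is unknown.
        known = self._tags.get(doc_id)
        return list(known) if known is not None else None

    def submit(self, doc_id: str, tags: List[str]) -> None:
        # Store the tags so that later lookups by any user can find them.
        self._tags[doc_id] = list(tags)


def tags_for(registry: TagRegistry, doc_id: str,
             ask_user: Callable[[], List[str]]) -> List[str]:
    """Use shared tags if they exist; otherwise ask the user and share them."""
    known = registry.lookup(doc_id)
    if known is not None:
        return known
    tags = ask_user()
    registry.submit(doc_id, tags)
    return tags
```

The first user to add a given PDF does the manual tagging once, and everyone after that gets the tags without being prompted.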

The user could also be given the choice of saving the tag information in the metadata of the PDF itself.

bdarcus · December 9, 2007

There's nothing peculiar about PDF as an output format though, and it would be a mistake to design a solution that presumed there was. We're dealing with documents, which can be serialized in different ways. So the solution is to start from the general: a semantic web perspective.

In any case, I think what you're generally looking for is on tap for Zotero 2.0.