Automatically identifying the item type from metadata when adding items to Zotero?

I've noticed that no matter whether I'm on a conference proceedings website or a journal website, when I add an item to Zotero it is, in the majority of cases, inserted as a journal article by default, even when it is a conference paper. Is there a way for Zotero to automatically recognise the correct item type from the metadata? In a library of hundreds or thousands of items, I'd have to go through each one manually, check the type against the metadata fields that were already imported, and update the item type field accordingly.
  • Zotero tries to do that; it just defaults to journal article if it's not able to. If you have a URL for a conference paper that imports as a journal article, we can take a look, but chances are the metadata just isn't rich enough to tell.
  • Thanks, here's an example link: http://www.icml-2011.org/papers/398_icmlpaper.pdf. I have no idea how Zotero extracts metadata, but two approaches come to mind. The first is using something like pdfinfo to extract metadata from a PDF; the other is something like tesseract, which is more robust than pdfinfo since it can work even on scanned PDFs (i.e. images). In the PDF I linked, there's a footnote on the first page indicating the venue, date, and volume of the conference. I don't know whether that counts as metadata or whether Zotero is able to parse it, but adding this item puts it in my collection as a journal article instead of a conference paper.
  • For PDFs, Zotero uses pdftotext and tries to guess metadata based on the content of the PDF. So it doesn't do OCR via tesseract, but that's also not needed for most contemporary PDFs, which contain a text layer. Without such a text layer, Retrieve Metadata simply fails with a message noting the lack of OCRed text.

    Metadata extraction works best when there's a DOI on the first page, otherwise there's a lot more guesswork involved. Trying to infer the item type from a regular footnote is not going to be feasible any time soon (if ever), so for cases such as the one you link to the answer is that it's not going to be possible (and footnotes are not metadata -- metadata is by definition structured information).
  • I understand what you're saying. Just a follow-up: the same issue happens with eprint services like arXiv 99% of the time. For instance, here's another example: https://arxiv.org/abs/1611.09630. The arXiv comments for this item state the venue it was submitted to, but Zotero doesn't seem to take that information into account, since this item, like the majority of similar ones from arXiv, is added to Zotero as a journal article.
  • Right, but there, too, the workshop submission is just a comment. In the arXiv metadata, this is just stored as an arXiv preprint (check the BibTeX on the right), which we import as a journal article for maximum compatibility with citation norms for arXiv preprints (but will eventually import as a preprint once we have that available as an item type, I think).
  • Cool thanks!
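
To illustrate the point above about a DOI on the first page, here is a minimal Python sketch of that kind of check. It is not Zotero's actual code: the function names, the regular expression, and the strategy labels are all assumptions made for illustration, and the input is assumed to be first-page text already produced by a tool such as pdftotext.

```python
import re

# Assumed pattern for the common DOI shape: "10.", a 4-9 digit registrant
# code, "/", then a suffix. This is an illustrative heuristic only.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(first_page_text):
    """Return the first DOI-like string in the text, or None (hypothetical helper)."""
    match = DOI_PATTERN.search(first_page_text)
    if match is None:
        return None
    # Trim punctuation that often trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")


def extraction_strategy(first_page_text):
    """Pick a (hypothetical) strategy label based on what the text contains."""
    if not first_page_text.strip():
        # No text layer at all, e.g. a scanned PDF without OCR.
        return "no-text-layer"
    if find_doi(first_page_text) is not None:
        # A DOI can be looked up to obtain structured metadata.
        return "doi-lookup"
    # Otherwise only free text is available, so the item type is a guess.
    return "guess-from-text"


if __name__ == "__main__":
    # Both sample strings are made up for this example.
    print(extraction_strategy("Some Title. doi:10.1000/xyz123."))
    print(extraction_strategy("Appearing in the proceedings of a conference, 2011."))
```

A footnote that only names a venue in free text falls into the last branch, which is why an item like the ICML paper above ends up with the default type.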