Tag frequency of use

It would be great to be able to see which tags are used and how many times each one is used.

That would give me a big picture of what kinds of items are in my database. It would tell me whether I need more citations for a specific tag that isn't used much, and vice versa.

I found a way to do that using "DB Browser for SQLite" (GitHub) and some Excel charts, but that's quite a lot of work for a simple feature that could be implemented in the core of Zotero.

Thank you Zotero!
  • edited April 11, 2015
    For those who are looking for a way to know how frequently their tags are used, here is how I do it (any advice to improve this method is welcome!)

    Software needed: DB Browser for SQLite (open source) and Excel

    To get the tags' frequency of use

    1. Open a new Excel worksheet

    2. Open DB Browser, click on Open Database and navigate to your zotero.sqlite (make a backup first!)

    3. Click on the Browse Data tab and then choose itemTags in the table drop-down.

    4. Click twice on tagID to sort the column from smallest to largest. Then click again and hold the click while you press CTRL+C.

    5. In Excel, select column A and press CTRL+V. Right-click, add a row, and enter Tag_freq in cell A1.

    6. Select column A, go to the Insert tab, click on PivotTable, then OK.

    7. In the right panel, drag Tag_freq into both ROWS and VALUES.

    8. Copy column B (Count of Tag_freq) into column F.

    9. In DB Browser, go to the table tags. Sort tagID from smallest to largest. Then click on name, hold the click, and press CTRL+C.

    10. In Excel, select column E and press CTRL+V.

    11. Drag the selection so that the first tag is in E4.

    12. Select columns E and F, go to the Home tab, Sort & Filter, Filter, then on the header of column F choose Largest to Smallest.

    (*) It is not possible to copy only one column in DB Browser; this method is quicker than using SQL queries.
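
    For anyone who would rather skip the copy and paste, the same count can be sketched as a single query. This is only a minimal sketch: it assumes the tags table has tagID and name columns and the itemTags table has itemID and tagID columns, as described in this post, and the real schema may differ between Zotero versions. The function names and the demo data are made up for illustration.

```python
import sqlite3


def tag_frequencies(conn):
    """Return (tag name, count) pairs, most used tag first."""
    query = """
        SELECT tags.name, COUNT(itemTags.itemID) AS freq
        FROM tags
        JOIN itemTags ON itemTags.tagID = tags.tagID
        GROUP BY tags.tagID, tags.name
        ORDER BY freq DESC, tags.name ASC
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Build a tiny in-memory stand-in for zotero.sqlite (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags (tagID INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE itemTags (itemID INTEGER, tagID INTEGER);
        INSERT INTO tags VALUES (1, 'ecology'), (2, 'methods'), (3, 'review');
        INSERT INTO itemTags VALUES
            (10, 1), (10, 2), (11, 1), (12, 1), (12, 2), (12, 3);
    """)
    return conn


demo = build_demo_db()
result = tag_frequencies(demo)
for name, freq in result:
    print(name, freq)
```

    To try it on a real library, pass a connection opened on a backup copy of zotero.sqlite instead of the demo database. Because of the inner join, tags that are not attached to any item do not appear in the result.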

    To get a chart

    Select columns E and F, go to the Insert tab, click on Recommended Charts, choose the second chart (a column chart), and click OK. Double-click on the Y axis labels and, in the right panel under Axis Options, Bounds, Maximum, enter the frequency of your most used tag. (Print screen here)
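
    If Excel is not at hand, a rough text version of the same chart can be sketched in a few lines. This is not the Excel chart itself, just an illustration: the longest bar is scaled to the most used tag, which mirrors setting the axis maximum to that frequency. The function name, the bar width, and the sample counts are all assumptions.

```python
def text_bar_chart(freqs, width=20):
    """Render (name, count) pairs as horizontal text bars, largest first."""
    ordered = sorted(freqs, key=lambda pair: pair[1], reverse=True)
    if not ordered:
        return []
    top = ordered[0][1]  # frequency of the most used tag
    label_width = max(len(name) for name, _ in ordered)
    lines = []
    for name, count in ordered:
        # Scale each bar so the most used tag fills the full width.
        bar = "#" * round(width * count / top) if top else ""
        lines.append(f"{name.ljust(label_width)} {bar} {count}")
    return lines


# Hypothetical counts, not taken from a real library.
sample = [("methods", 2), ("ecology", 4), ("review", 1)]
chart = text_bar_chart(sample)
print("\n".join(chart))
```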

    Quick explanation of how the zotero.sqlite DB is organized:

    • table tags gives the ID of a tag

    • table itemTags gives the ID of the citation in itemID and the ID of the tag in tagID (so one itemID appears as many times as there are tags on that item)
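
    To make the relation between the two tables concrete, here is a small sketch with a made-up in-memory database. It assumes only the columns mentioned in this post (tagID and name in tags, itemID and tagID in itemTags); the IDs and tag names are invented.

```python
import sqlite3
from collections import Counter

# Miniature stand-in for the two tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (tagID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE itemTags (itemID INTEGER, tagID INTEGER);
    INSERT INTO tags VALUES (1, 'ecology'), (2, 'methods');
    INSERT INTO itemTags VALUES (10, 1), (10, 2), (11, 1);
""")

# One row per (item, tag) pair: item 10 has two tags, so it shows up twice.
rows = conn.execute("""
    SELECT itemTags.itemID, tags.name
    FROM itemTags
    JOIN tags ON tags.tagID = itemTags.tagID
    ORDER BY itemTags.itemID, tags.name
""").fetchall()

tags_per_item = Counter(item_id for item_id, _ in rows)
print(rows)
print(dict(tags_per_item))
```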

  • Maybe the Zotero plugin Paper Machines would be interesting for you.
  • Thanks a lot for that, Zuphilip. I had no idea such a powerful plugin existed for Zotero. It's great to see that!

    I had a quick look at several of the different features. I am not very familiar with this kind of data visualisation, but it will for sure save me a lot of time compared with trying to do all that work by myself. What a precious link you gave me!

    I have seen the word cloud feature, but unfortunately we can't restrict the words to the tags only. The topic modeling by tag isn't working well: the window doesn't have the right scale. I will have a longer look in the next few days; there is so much to do with this plugin. Once again, thanks for that, Zuphilip!