Autocomplete/DOI minor annoyance

bkaplowitz · March 19, 2023

Hi,

I came back to regular Zotero use after a few years as a Mendeley user. (The first time, in the early Zotero days, it was magic, until the early Word integration repeatedly deleted half an essay.) Zotero 6 is considerably more refined and polished, and I like a lot of the changes. It is also great as a database management tool.

There are four minor frustrations I seem to have after switching back:
1. The big one is incomplete metadata on import. Oftentimes, a very incomplete set of data is fetched using the web connector. Rather than completing it automatically, Zotero seems to expect you to search for the citation manually on Crossref or Google Scholar and then reimport it. Sometimes even the DOI or arXiv ID is captured and yet most fields remain unfilled. You then have to paste the identifier into the wand, generate a new item, and delete the old one. Is there any chance an auto-fill is coming, or even just a simple JavaScript, AppleScript, or bash script to automate this process when a DOI is found? I think I remember there used to be a plugin for Google Scholar that would do this, but it seems defunct. This is especially noticeable coming from Mendeley, where often the citation seemed to auto-fill more comprehensively. Similarly, with the OCR plugin, I'm surprised that data such as the title, author, etc. isn't automatically fetched from the PDF after being recognized as text.

2. Sometimes the Zotero connector seems to get stuck thinking something is a webpage despite embedded metadata and it takes a few seconds to reconfigure. If you import too early, it saves a snapshot of the website instead of the article. I'm not sure if this is a function of using it in Chrome, some other add-on interfering, or the connector just sometimes being on the slower side. Any attempts to improve this would be greatly appreciated.

3. Better/more customizable duplicate detection. Supposedly the matches are fuzzy, but if so, the sensitivity of the match is pretty low: a lot of duplicates are not found despite very similar titles that differ by just one or two words, or a slightly different set of authors. I guess one really easy thing to do would be to just expose the sensitivity of the match, or the ruleset via, say, regex, to the user. Additionally, merging can only happen on the same file type, which can be an annoyance.

4. Better integration with other PDF tools like LiquidText in terms of tracking tags etc. in Zotero 6, and easier importing/generation of notes in research knowledge database tools like Roam, Logseq, Obsidian, or even something like DevonThink. I realize this is also something that has to happen on the other side as well, but right now the integrations (mainly tested for Obsidian) are quite hacky and don't work well for generating research notes, even using plugins intended for exporting/importing.

Any thoughts would be greatly appreciated.

I am very excited to see the continued progress on Zotero! It is a great tool overall.

dstillman · March 19, 2023

(In general it's always better to start new threads for separate issues, and we always need Steps to Reproduce.)

1. The big one is incomplete metadata on import. Oftentimes, a very incomplete set of data is fetched using the web connector.

If you're not getting proper data from somewhere, we'd want that reported with an example URL in a new thread per issue.

This is especially noticeable coming from Mendeley, where often the citation seemed to auto-fill more comprehensively.

Metadata updating is coming, and it should certainly be able to fill in items from a DOI, but I'd just add that Mendeley's metadata has always generally been quite bad — often filling items with seemingly user- or machine-learning-generated garbage, and even completing items with data from a totally different reference due to misapplied identifiers. That's not the approach to metadata we take.
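
For anyone wanting a stopgap for the DOI case raised in item 1, the lookup itself can be sketched in a few lines of Python. This is only an illustration under assumptions, not how Zotero retrieves metadata: it queries the public Crossref REST API (`https://api.crossref.org/works/<DOI>`), and the helper names `crossref_url`, `summarize`, and `fetch_metadata` are made up for this example. It returns a plain dictionary and does not touch a Zotero library.

```python
import json
import urllib.parse
import urllib.request

# Public Crossref REST API endpoint for a single work.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the Crossref lookup URL for a DOI (hypothetical helper)."""
    # quote() keeps "/" by default and escapes other unsafe characters.
    return CROSSREF_WORKS + urllib.parse.quote(doi.strip())


def summarize(message: dict) -> dict:
    """Pick a few bibliographic fields out of a Crossref 'message' object."""
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    date_parts = message.get("issued", {}).get("date-parts", [[None]])
    return {
        "title": (message.get("title") or [""])[0],
        "authors": authors,
        "journal": (message.get("container-title") or [""])[0],
        "year": date_parts[0][0] if date_parts and date_parts[0] else None,
        "doi": message.get("DOI", ""),
    }


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Fetch and summarize metadata for a DOI (needs network access)."""
    with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as resp:
        return summarize(json.load(resp)["message"])
```

Calling `fetch_metadata` with a DOI copied from an incomplete item gives you the fields to compare against what was imported; getting the result into Zotero would still go through the wand or a plugin.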

2. Sometimes the Zotero connector seems to get stuck thinking something is a webpage despite embedded metadata and it takes a few seconds to reconfigure.

Detection finishes when the page loads. If some resource on the page is stuck loading, it may stay detected as a webpage. (An ad blocker tends to help here.) You can sometimes just click the stop button or press Esc to stop page loading. The main option for us would be to wait to run translation until the page stopped loading, but it would still just be the same problem where you'd possibly have to force-stop the page.

If you can reproduce this reliably somewhere, report it in a separate thread and we can take a look.

a lot of duplicates are not found despite very similar titles that differ by just one or two words, or a slightly different set of authors

We would want examples in a separate thread.
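
To make the idea of an exposed match sensitivity concrete, here is a rough Python sketch of threshold-based fuzzy title matching. It is not Zotero's duplicate detection; it uses the standard library's `difflib.SequenceMatcher`, and `normalize`, `title_similarity`, `find_duplicates`, and the default threshold of 0.85 are all assumptions for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", title.lower()).split())


def title_similarity(a: str, b: str) -> float:
    """Similarity of two titles in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicates(titles, threshold=0.85):
    """Return index pairs (i, j), i < j, whose similarity is >= threshold."""
    pairs = []
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if title_similarity(titles[i], titles[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

Lowering `threshold` catches titles that differ by a word or two at the cost of more false positives, which is the trade-off a user-facing sensitivity setting would expose. The pairwise loop is quadratic, so it only suits small libraries as written.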

Similarly, with the OCR plugin, I'm surprised that data such as the title, author, etc. isn't automatically fetched from the PDF after being recognized as text.

Not sure what you mean by this. We don't have anything to do with the OCR plugin. If you run metadata retrieval on a PDF and Zotero finds a title, authors, etc., it will use those to try to retrieve official metadata. If you're not seeing that for a file you'd expect it to work on, report that in a separate thread and link to a public PDF or email a custom PDF to support@zotero.org with a link to the thread.

4. Better integration with other PDF tools like LiquidText in terms of tracking tags etc. in Zotero 6, and easier importing/generation of notes in research knowledge database tools like Roam, Logseq, Obsidian, or even something like DevonThink. I realize this is also something that has to happen on the other side as well

Not "as well" — we don't have anything to do with the existing integrations. But we're aware there's demand for better integration with some of these tools.