PDF attachment content and handy regex usage

cadudesun · April 17, 2014

Hi,

I need to explore PDF attachment content in advanced search. So far I have barely used the regex capabilities, but I expect they are quite powerful.

a) I would appreciate any tips or suggestions on handy regex operators, syntax, or usage for mining data during a literature review in Zotero.

b) To search for an exact expression using regex, are quotation marks necessary?

Many thanks,
Cadu

aurimas · April 17, 2014

Handy tip (completely unrelated): To query two words within N words of each other, try
word1(?:\s+\w+){0,N}\s+word2
E.g., to query for "secondary data" or "secondary analysis of data", use secondary(?:\s+\w+){0,2}\s+data
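
For anyone who wants to try the pattern outside Zotero first, here is a minimal Python sketch. The helper name within_n_words and the sample sentences are made up for illustration, and Python's re module is only a stand-in: the thread does not say which regex engine Zotero uses, so behaviour there may differ in details.

```python
import re

def within_n_words(word1, word2, n):
    # Hypothetical helper: builds word1(?:\s+\w+){0,n}\s+word2, i.e. word1,
    # then at most n intervening words, then word2.
    # re.escape keeps any metacharacters in the two words literal.
    return re.compile(
        rf"{re.escape(word1)}(?:\s+\w+){{0,{n}}}\s+{re.escape(word2)}"
    )

pattern = within_n_words("secondary", "data", 2)

samples = [
    "secondary data",                    # no words in between
    "secondary analysis of data",        # two words in between
    "secondary analysis of survey data", # three words in between
]

# Map each sample sentence to whether the pattern is found in it.
results = {text: bool(pattern.search(text)) for text in samples}

for text, hit in results.items():
    print(hit, "-", text)
```

With N set to 2, the first two samples should match and the third should not, because three words separate "secondary" and "data".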

;-)

aurimas · April 17, 2014

Oh, and for (b) the answer is no. Adding quotation marks will just make the search look for the quotation marks as well.

adamsmith · April 17, 2014

Also, for the basics, this cheat sheet is good: http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/

b) In addition to what aurimas says: ordinary characters in a regex are matched literally by default, i.e.
/find this sentence/ (you don't need the slashes in Zotero) will only return true when the document contains exactly "find this sentence".
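
A small Python sketch of both answers to (b), again using Python's re module as a stand-in for whatever engine Zotero uses. The sample strings are invented. It also shows one caveat to "literal by default": characters such as "." are metacharacters and only match literally when escaped.

```python
import re

document = "You can find this sentence in the attachment."

# A plain phrase matches itself; no quotation marks are needed.
plain_hit = bool(re.search(r"find this sentence", document))

# With quotation marks in the pattern, the quotes must be in the text too.
quoted_hit = bool(re.search(r'"find this sentence"', document))

# Caveat: "." matches any character unless it is escaped.
phrase = "p < 0.05"
other_text = "p < 0x05"
unescaped_hit = bool(re.search(phrase, other_text))
escaped_hit = bool(re.search(re.escape(phrase), other_text))

print(plain_hit, quoted_hit, unescaped_hit, escaped_hit)
```

Here the plain phrase is found, the quoted one is not, and the unescaped "0.05" also matches "0x05" while the escaped version does not.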

aurimas · April 17, 2014

This is where I learned regexp: http://www.regular-expressions.info/

sdspieg · July 31, 2016

I remember seeing this at the time, but not paying much attention to it. But can somebody please clarify WHERE you do these regex searches? Can we use those in the Zotero Advanced search dialog box?

vanbang9710 · April 22, 2023

It's in the advanced search box https://i.imgur.com/KA0wQbU.png

nakutcher · January 12, 2024

This page has highly useful information; thanks! Does anyone happen to know if Regex works differently when CJK (Chinese) characters are in the database? I tried some searches (including aurimas' very handy shortcut above) and they don't seem to find instances in my database. I'm wondering if my Chinese language content is the reason.
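
One possible factor, offered as an assumption rather than a confirmed diagnosis: Chinese text is normally written without spaces between words, so the \s+ parts of the proximity pattern above have nothing to match. The Python sketch below illustrates only that point. The sample sentences and the character-gap variant are made up for illustration, and whether Zotero's engine or its indexed full text treats CJK characters the same way is not established in this thread.

```python
import re

# Invented sample sentences; neither contains any whitespace.
adjacent = "本研究使用二次数据进行分析"
separated = "本研究对二次分析的数据进行讨论"

# The word-gap pattern from earlier in the thread needs whitespace
# before the second term.
word_gap = re.compile(r"二次(?:\s+\w+){0,2}\s+数据")

# Hypothetical alternative: allow up to 4 arbitrary characters between terms.
char_gap = re.compile(r"二次.{0,4}数据")

word_gap_hits = [bool(word_gap.search(t)) for t in (adjacent, separated)]
char_gap_hits = [bool(char_gap.search(t)) for t in (adjacent, separated)]

print(word_gap_hits, char_gap_hits)
```

In Python, the word-gap pattern finds neither sentence, while the character-gap variant finds both. If the same holds in Zotero, counting characters instead of words may be a workable substitute for CJK content.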