Proximity search - PDF attachment content - advanced search

cadudesun · April 15, 2014

Hi,

Is it possible to run a proximity search in Advanced Search?

a) For instance, in NVivo I use the syntax "data secondary"~2 to find words that occur close together. With this query I would find:
- secondary data
- secondary analysis of data
- data used for secondary

b) Is there any kind of proximity search, even a linear (order-sensitive) one? For instance, with "data secondary"~2 I would find:
- data used for secondary
=> but not:
- secondary data
- secondary analysis of data
=> since the query term is written with "data" first followed by "secondary".

Thanks,
Cadu
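
The two behaviours asked about in a) and b) can be sketched in a few lines of Python. This is only an illustration of the idea, not how NVivo or Zotero work internally. It assumes that ~2 means "at most two words in between", which is consistent with the examples in the post but may not be the exact rule NVivo applies. The names `tokenize` and `near` are made up for this sketch.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def near(text, first, second, max_gap, ordered=False):
    """Return True if `first` and `second` occur with at most `max_gap`
    words between them. With ordered=True, `first` must come before `second`.
    Assumption: distance is counted in intervening words."""
    tokens = tokenize(text)
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    for i in first_pos:
        for j in second_pos:
            if i == j:
                continue  # same token; only possible when both terms are equal
            gap = abs(j - i) - 1  # number of words between the two terms
            if gap <= max_gap and (not ordered or i < j):
                return True
    return False

phrases = [
    "secondary data",
    "secondary analysis of data",
    "data used for secondary",
]

# Case a): word order does not matter, so all three phrases match.
unordered_hits = [p for p in phrases if near(p, "data", "secondary", 2)]

# Case b): "data" must come first, so only the last phrase matches.
ordered_hits = [p for p in phrases if near(p, "data", "secondary", 2, ordered=True)]

print(unordered_hits)
print(ordered_hits)
```

On real PDF content the text would first have to be extracted from the files, which this sketch leaves out.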

adamsmith · April 15, 2014

No. I don't think that's planned, either. I see why it would be nice, but at some point some division of labor makes sense: let qualitative data analysis tools handle sophisticated search routines, and keep Zotero focused on reference management and organization.

cadudesun · April 17, 2014

Hi,

For instance, querying NVivo with advanced search criteria is time-consuming when one has a massive number of sources/documents, and accessing, opening, and handling the documents themselves is very, very slow. Some points:

a) Considering data analysis tools that allow advanced search criteria (e.g. proximity search, boolean operators) on a large number of sources at once (e.g. 4,000 PDFs, 9 GB), does anyone know of a tool that can do this besides NVivo? (Atlas.ti and MaxQDA can't either.)

b) Regarding Zotero, I find advanced search quite efficient most of the time. A "proximity search" capability in Zotero, at least for querying PDF content, would be handy. If one just wants to query PDF content (not to code it and so on), Zotero would be quite enough. Just to emphasize: the focus here is search, not coding, which is the core of qualitative data analysis tools.

Thanks,
Cadu

aurimas · April 17, 2014

FWIW, I think this could be implemented as a Zotero extension

aurimas · April 17, 2014

A "linear" solution using RegExp searches is provided here (in case someone else comes looking for this).

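The linked RegExp solution is not reproduced in this thread. As a rough idea of what a "linear" (ordered) proximity pattern can look like, here is a small Python sketch. The pattern and the helper name `ordered_proximity_pattern` are assumptions made for illustration, not necessarily what the linked post uses, and the thread does not confirm where such a pattern can be entered in Zotero.

```python
import re

def ordered_proximity_pattern(first, second, max_gap):
    """Build a regex that matches `first` followed by `second`
    with at most `max_gap` whole words in between (hypothetical helper)."""
    return re.compile(
        rf"\b{re.escape(first)}\W+(?:\w+\W+){{0,{max_gap}}}{re.escape(second)}\b",
        re.IGNORECASE,
    )

pattern = ordered_proximity_pattern("data", "secondary", 2)

phrases = [
    "secondary data",
    "secondary analysis of data",
    "data used for secondary",
]

# Only phrases where "data" precedes "secondary" within two words match.
hits = [p for p in phrases if pattern.search(p)]
print(hits)
```

Because the pattern reads left to right, it only finds "data" before "secondary". Covering the reverse order as well would need a second pattern with the terms swapped.
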
sdspieg · May 1, 2014

Interesting discussion.

The point is that many of us love what Zotero does in terms of bibliographical management, but many of us also want to do all sorts of other new 'cool' stuff with it (taking advantage of the structured nature of the SQLite database). That is why Papermachines, for instance, is such a fantastic tool IMO. If you have a large corpus, it allows you to use the date field to 'see' the waxing and waning of some main topics in that corpus over time.

The key, of course, is the 'mashability' of such things. Just as Papermachines mashes Zotero with Mallet, it'd be fantastic if somebody could mash something like Zotero with Dedoose.

dokan · September 25, 2014

I don't get why proximity search is not implemented in most search engines, especially academic ones. Google once had the AROUND operator, and Web of Science has the NEAR operator (unfortunately it indexes only abstracts).

As the search behaviour of most people, even academics, is very basic (nearly zero use of search operators, or even knowledge of their existence), there is no huge demand.

The best solution I have found so far is Copernic Desktop Search. It also has the NEAR operator and can index complete PDFs.

Naomi_Cunningham · June 1, 2021

Have things moved on at all since 2014?

t-g · April 29, 2024

Upvote on that ^
This has come up many times for me. Would be a great feature.

bolejj · November 7, 2024

I use X1 search, but have found it clunky at times with PDFs. You can do proximity searches, but for some unknown reason it will not take you to the place IN the PDF. It finds the files when you search with the NEAR keyword, but then you have to search inside each PDF manually, using just one of the two words. Just flaky. But it does work at times.

To get around this I keep all of my Zotero PDFs in one folder. You can use Adobe Acrobat to search your whole collection, but it is far from intuitive.

bolejj · November 7, 2024

Another option is to use FileLocator Pro.
You can do complex searches with that tool:

(evolution NEAR:25 wellbeing) OR (evolution NEAR:25 well-being)

This app is fast and does not require indexing.

bolejj · November 8, 2024

Here are the instructions for using Acrobat's proximity search across multiple PDFs:
> In Acrobat, open Menu, then Preferences, then choose Search.
> Set "Range of words for Proximity" to 25 (or whatever you like), then close.
> Open a PDF in Acrobat. I don't think this will work in Reader.
> Press Ctrl-F to open the search box, then use the drop-down to choose advanced search.
> Put your two words in the search box.
> In the "Look In" box you MUST select a folder that has many PDFs in it, so choose that folder. Use the Browse option at the bottom.
> In the "Return results containing" box you MUST choose "Match All of the words". If you don't, the Proximity option will be grayed out.
> You can now check the Proximity option, then run your search.

You can build an index to speed up the searches.