Search multiple PDFs
I have a library of 500+ research articles in PDF (about 700-800 MB on my hard drive). They are all searchable PDFs (I have OCRed the ones which were scanned pages with no text). The articles are of different sizes, but about 50 pages each.
I've put these PDFs all together in one folder and now I am looking for a search engine (for Windows 7) which is able to perform full-text searches in this whole library of PDF files. I've tried several pieces of software, but none has given me a satisfactory experience. Let me tell you which ones I've used:
Windows Search: fast and indexes files, but not very straightforward to limit the searches to a specific folder.
Google Desktop: fast and indexes files, but I have not found a way to limit the searches to PDFs inside a specific folder (I don't want it to search the thousands of PDFs stored on my HD). Plus, it has been discontinued by Google.
Copernic: fast, indexes files and I can limit the searches to a specific folder. However, it does not properly render the text inside the PDFs.
Mendeley: it creates a database of PDFs, indexes and searches those PDFs included in the database. However, it has crashed due to the large number of PDF files I've added. In addition, it cannot display all the instances of a specific word I search for.
Zotero: I couldn't even try it, it crashed as I tried to add my PDFs to its database.
Adobe Reader: it searches all PDF files inside a specific folder. However, the search is very slow (it does not index files). It is able to show all the instances where a word is found in each PDF file, renders PDFs very well and it is possible to read and annotate the PDFs right after the search. But it is sooooo slow.
PDF X-Change Viewer: pretty much the same as Adobe Reader.
Foxit Reader: the best so far. Just like Adobe Reader and PDF X-Change Viewer, but the searches are a bit faster. In addition, I liked the interface better.
The ideal solution for me would be if Foxit Reader could index all PDFs inside a specific folder, so searches would be much faster. Is it possible? Is there a solution which I have not yet tried?
If crashing is a problem for you, you should import the PDFs in smaller batches. Maybe 50 at a time.
If you are willing to spend some time and at the same time help Zotero get better, you can troubleshoot the crash here at the forums with the developers. Just try to explain the situation that causes the crash in as much detail as possible and you will likely get help here.
That being said, for finding a specific PDF on my computer, I use Spotlight on Mac. If I want to restrict to a particular folder, I use the Zotero full text search. (My long term wish would be to combine both into a single function.) Both are fast and I have not experienced crashes either, with around 7000 PDFs stored in my database.
As I'm using a laptop equipped with a Core i7-2720M with 8 GB of RAM, I don't think it will get much faster than that.
The only software which I have found so far that has satisfactorily done this kind of search is dtSearch, but it is a buggy piece of software and it costs US$ 199...
If PDF handling is that important to you, Windows is likely the wrong OS for you: the PDF handling on Macs is about 5 years ahead of anything on Windows (or Linux).
I thought Spotlight also did quite a good job in combination with Preview.
I actually really don't like Macs, so my knowledge about this consists mainly of looking at other people's screens and being jealous.
I have not tested Papers much.
I've found DevonThink for Mac to be a better choice in this regard. It is fast to search, it shows the results by relevance and it is also possible to navigate through the highlighted results (I've done it with Skim).
Still, I would rather have something workable for Windows.
If you open a folder in Explorer, any search you do in the upper-right-hand box has results limited to that folder. You can see in the search box there is a greyed-out "Search <foldername>".
http://blog.techhit.com/55696-indexing-and-searching-pdf-content-using-windows-search
Use the latest iFilter versions from Adobe's website.
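For anyone wondering why the indexing tools mentioned in this thread answer so much faster than a viewer that scans every file on each search, here is a minimal sketch of the idea in Python. It is not how Windows Search or any of the named programs is actually implemented; it assumes the text has already been extracted from each PDF (the extraction step is left out), and the file names, sample text and function names are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: word -> {filename: [word positions]}.

    `documents` maps a filename to text that was already extracted
    from that file (assumption: extraction happens elsewhere).
    """
    index = defaultdict(dict)
    for name, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(name, []).append(position)
    return index


def search(index, word):
    """Return {filename: [positions]} for every occurrence of `word`."""
    return index.get(word.lower(), {})


# Hypothetical sample data standing in for text extracted from PDFs.
docs = {
    "smith2010.pdf": "Indexing makes search fast. Search again.",
    "jones2011.pdf": "Scanning every file is slow.",
}
index = build_index(docs)
hits = search(index, "search")
```

The index is built once, which is the slow part. After that, each query is a single dictionary lookup that returns every file and position for the word, instead of re-reading all 500+ files.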