Advanced Search has several issues in 4.0.4

I'm trying to use Advanced Search in version 4.0.4. There are several issues. Bugs:

* I look for "All of the following" (well, my UI is in Swedish), criteria "Title" "Does not contain" "Colliculus"; then "Attachment" "contains" "Cortex". The very _first_ entry in the list is "A Circuit Model for Saccadic Suppression in the Superior Colliculus".

It seems clear that excluding search terms is not working at all.

* Randomly, but fairly often, the advanced search will start failing altogether, and when it is closed the main Zotero window will show, effectively, "Zotero has encountered an internal error. Please restart Firefox now."

* Regular expression search does not seem to work when searching PDF files; this, I believe, is reported already though.

Other issues:

* There is no way to search all fields in advanced search, and no way to "continue" a quick search onto the advanced interface, so you'd do an advanced search only on the items you already found with a Quick Search.

* There is no way to prune the very long list of search types in advanced search, or to set the default type. 99% of the time I am searching the contents of the attached PDF files, and yet I have to locate and choose that single (badly named) option of "attachment contents".

I would very much like to select only a small subset of search types in my default dialogue; to set "Search PDF Attachment" as the default selection; and, indeed, to have a specific "Search the PDF attachment" option.

Hope this can help. If I can do something to help fix these issues please let me know.
  • * There is no way to prune the very long list of search types in advanced search, or to set the default type. 99% of the time I am searching the contents of the attached PDF files, and yet I have to locate and choose that single (badly named) option of "attachment contents".

    I would very much like to select only a small subset of search types in my default dialogue; to set "Search PDF Attachment" as the default selection; and, indeed, to have a specific "Search the PDF attachment" option.
    couldn't you set that up as a saved search?
    * I look for "All of the following" (well, my UI is in Swedish), criteria "Title" "Does not contain" "Colliculus"; then "Attachment" "contains" "Cortex". The very _first_ entry in the list is "A Circuit Model for Saccadic Suppression in the Superior Colliculus".
    but the title is greyed out - correct? Meaning Zotero finds the attachment, but not the top level item. For non-attachment searches excluding title words definitely works. I'm not sure if that's working as intended and it does seem odd, but that's what's going on.
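
    What seems to be going on can be sketched as a toy model. This is not Zotero's code: the `Item` class, the `matches` and `result_rows` functions, and the attachment title "Full Text PDF" are all made up for illustration. It only assumes that both conditions are tested against each item separately, and that a non-matching parent is still listed (greyed out) as context for a matching child.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    content: str = ""  # full text; in this toy model only attachments have any
    children: list = field(default_factory=list)


def matches(item):
    # Both conditions are checked against ONE item at a time.
    return ("colliculus" not in item.title.lower()
            and "cortex" in item.content.lower())


def result_rows(library):
    rows = []
    for parent in library:
        hits = [child for child in parent.children if matches(child)]
        if matches(parent):
            rows.append((parent.title, "match"))
        elif hits:
            # The parent itself failed, but it is listed as grey context.
            rows.append((parent.title, "context"))
        else:
            continue
        rows.extend((child.title, "match") for child in hits)
    return rows


pdf = Item("Full Text PDF", content="... projections from visual cortex ...")
paper = Item(
    "A Circuit Model for Saccadic Suppression in the Superior Colliculus",
    children=[pdf],
)
rows = result_rows([paper])
for title, kind in rows:
    print(kind, "-", title)
```

    In this model the attachment passes both conditions on its own (its own title does not contain "Colliculus", and its text contains "cortex"), so the parent still appears in the list, only as a context row.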
  • Thanks for the quick answer.

    "couldn't you set that up as a saved search?"

    Hm, that kind of works, for the specific case when I'm looking for a single keyword. But that's not usually the case, and then I'm back to having to pick out that one criterion from a huge list again. I guess I could set up half a dozen saved searches, each one with one more line, but that feels more than a little like a kludge.


    "but the title is greyed out - correct? Meaning Zotero finds the attachment, but not the top level item. For non-attachment searches excluding title words definitely works. I'm not sure if that's working as intended and it does seem odd, but that's what's going on. "

    That's not just odd to me; that's broken. I mean, I specifically tell Zotero _not_ to show me posts with this criterion.

    In this particular case I have a list with perhaps half a dozen actual hits hidden in hundreds of false ones. Is there a way to unbreak this so they really don't show up in the list and hide the true hits?


    Oh, and there is no "search all metadata" criterion anywhere, is there? That and a "Search PDF attachment" would cover just about 100% of what I ever want to search.
  • Hm, that kind of works, for the specific case when I'm looking for a single keyword. But that's not usually the case, and then I'm back to having to pick out that one criterion from a huge list again.
    that makes sense, yes. Saved search won't help much there. I don't have a great GUI idea to solve this, though. I think it's important to keep the advanced search comprehensive. I guess adaptive menus could be an option?
    As an immediate solution, you're aware that the list reacts to keystrokes? I don't know what the condition is called in Swedish, in English pressing "att" (for "Attachment Content") gets me right there. (If something is poorly translated feel free to post about this or join the translation team on Transifex https://www.transifex.com/projects/p/zotero/ ).
    Oh, and there is no "search all metadata" criterion anywhere, is there? That and a "Search PDF attachment" would cover just about 100% of what I ever want to search.
    outside of the quick search, no. You're not the first person to ask for that, though.
    In this particular case I have a list with perhaps half a dozen actual hits hidden in hundreds of false ones. Is there a way to unbreak this so they really don't show up in the list and hide the true hits?
    I'm actually trying to fully understand how this behaves - could you try doing the same search with the "include parent and child items..." option checked - in my cursory tests that does produce the correct result.
  • In this particular case I have a list with perhaps half a dozen actual hits hidden in hundreds of false ones. Is there a way to unbreak this so they really don't show up in the list and hide the true hits?
    If the parent is gray and the child attachment is black, it's not broken. That's by design, and it's how search results (advanced and quick search) in Zotero have always worked, because showing child items without the context would be confusing much of the time.

    I could see an argument that it's less appropriate for searches to show sibling context items, but given the hierarchical nature and the varying and vague ways child items can be named, I think showing the parent context row makes sense. "JSTOR Full Text PDF" without the parent title isn't particularly helpful.

    (You certainly may be running into actual bugs, though. Search functionality will be undergoing a badly needed overhaul later this year.)
    Regular expression search does not seem to work when searching PDF files; this, I believe, is reported already though.
    Should be fixed in 4.0.5, available now.
  • that makes sense, yes. Saved search won't help much there. I don't have a great GUI idea to solve this, though. I think it's important to keep the advanced search comprehensive. I guess adaptive menus could be an option?
    Adaptive menus would be painful to implement cleanly for the developers, I suspect. Perhaps have a couple of "top-level" fairly general options at the top of the drop-down list, separated with the rest by a divider? Something like:

    Metadata
    Attached documents
    My Notes
    -------------------
    # pages
    # volumes
    Anmälningsnummer
    ...

    I'm actually trying to fully understand how this behaves - could you try doing the same search with the "include parent and child items..." option checked - in my cursory tests that does produce the correct result.
    I assume it's the third one? That works!

    But it seems completely counterintuitive to me. The option tells me it will _add_ the upper- and lower-level posts to any matching post in the results; that it will increase the number of displayed posts, never reduce them. I probably don't really understand what it's meant to do properly.
  • edited April 16, 2013
    I could see an argument that it's less appropriate for searches to show sibling context items, but given the hierarchical nature and the varying and vague ways child items can be named, I think showing the parent context row makes sense. "JSTOR Full Text PDF" without the parent title isn't particularly helpful.
    The basic issue, I think, is that to me as a user, the PDF paper, the metadata and my notes are all one document, not a parent document and a couple of child documents.

    So when I search for something I expect to find the document — all of it or none at all. I never want to see one or the other but not both. If the metadata is a non-match, it should not show the PDF either even if it's a hit, as they're two connected parts of one and the same single unified post.


    Your post also explains another annoyance: when I search PDFs, it shows me the list with each and every post opened so the PDF file is visible. That's a bit frustrating as it halves the number of items I can see at once. But as you think of the PDF and the metadata as separate documents that probably makes sense to you as well.
  • @Janne - glad it's working. I agree this is very confusing - I'd argue basically unintelligible. I'll sleep on it and write more tomorrow.
  • Oh, OK, I guess I misunderstood what you were saying, though I guess my answer was still sort of relevant.

    I agree that the situation with regard to search conditions and parent/child items is less than ideal. There are search conditions that can apply to either ("Title"), search conditions that only apply to attachments ("Attachment Content"), and search conditions that apply to the parent but search the child ("Child Note"). The fact that child items aren't technically counted as part of collections adds a whole other wrinkle when you're doing collection-based searches. And as for "Include parent and child items of matching items", well, I'm not going to say that's the UX innovation I'm most proud of (though it does serve a purpose, in its own largely incomprehensible way).

    I can't get into this too much at the moment, but there are definitely ways we can improve this. I don't think the ability to match on parent and child items separately is the problem, though. If someone has 50 child notes attached to an item, it's pretty important to be able to show the one that matches instead of just the parent. (That might be a case for not showing sibling context items, but that's a separate issue.) And since we override Select All to select just the actual matches in the middle pane, you can use searches in various creative ways to do otherwise tedious things, like dragging all PDFs to a folder in the filesystem or exporting all notes that match a particular string.
  • OK, no need to sleep on this then - I agree that this is exactly the issue:
    I agree that the situation with regard to search conditions and parent/child items is less than ideal. There are search conditions that can apply to either ("Title"), search conditions that only apply to attachments ("Attachment Content"), and search conditions that apply to the parent but search the child ("Child Note").
    and I also agree with Dan that there are definitely situations where you do want matches in specific attachments.

    One thought, though, Dan - do we really need to include attachment titles in the title search? If they weren't included that would fix this, no?
  • One thought, though, Dan - do we really need to include attachment titles in the title search? If they weren't included that would fix this, no?
    I don't think it would fix this—or at least my understanding of this—because "Attachment Content" matches just attachments and excludes all regular items, so you still need "Include parent and child items of matching items" to match on the parent title.

    The simplest solution would probably be to add more conditions that function like "Child Note", matching parents that have children that match the given criteria—that's essentially what Janne is looking for, I believe. (And the fact that "Attachment Content" doesn't function that way is arguably inconsistent and confusing.)

    You can actually achieve some of this now by combining "Include parent and child items of matching items" (to include parents even for conditions that match children) and then "Show only top-level items" (to restrict the result to the parents). So, for example, you can have [Attachment Content] [is] [PDF] and [Title] [contains] [Foo] and select those two checkboxes and you have just the parent items with titles matching "Foo" that have PDFs. But there should be a simpler way of getting that result.
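
    As a rough sketch of what a "Child Note"-style condition for attachment content might look like: the following is a toy model, not Zotero's implementation. The names `title_lacks`, `child_content_contains` and `search`, and the two "Hypothetical Paper" entries, are invented for illustration. The point is only that every condition is evaluated on the parent, so a title exclusion and an attachment-content match can apply to the same item.

```python
def title_lacks(word):
    # Condition on the parent's own title.
    return lambda item: word.lower() not in item["title"].lower()


def child_content_contains(word):
    # Evaluated on the PARENT: true if any child attachment's text
    # contains the word (analogous to how "Child Note" is described).
    def condition(item):
        return any(
            word.lower() in child.get("content", "").lower()
            for child in item.get("children", [])
        )
    return condition


def search(library, conditions):
    # An item is a hit only if it satisfies every condition itself.
    return [item["title"] for item in library
            if all(cond(item) for cond in conditions)]


library = [
    {"title": "A Circuit Model for Saccadic Suppression in the Superior Colliculus",
     "children": [{"title": "Full Text PDF", "content": "projections from visual cortex"}]},
    {"title": "Hypothetical Paper on Cortical Maps",
     "children": [{"title": "Full Text PDF", "content": "primary visual cortex"}]},
    {"title": "Hypothetical Paper Without a Matching PDF",
     "children": [{"title": "Full Text PDF", "content": "nothing relevant"}]},
]

hits = search(library, [title_lacks("Colliculus"), child_content_contains("Cortex")])
print(hits)
```

    With conditions shaped like this, the "Colliculus" paper drops out because the exclusion is tested on the parent title, not on the attachment's title.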
  • In my case I was explicitly trying to _exclude_ words in the title. Is there any case where it would make sense to ask to exclude it and still show the related content?

    What I mean is, say the use-case of lots of notes attached to each bibliography item. You search for "Globular" in your notes, and two notes match, one each in two separate bibliography items.

    But then you also add that the title must not contain "puddings". One of those bibliography items ("Chocolate puddings as bludgeoning weapons of opportunity; A survey of church bazaar police reports in West Anglia 1966-1969") matches this. Is there really a reason to show that bibliography item and the associated note, and not just the other one? If both are shown in either case, the exclusion criterion becomes rather meaningless.


    That is, to me, orthogonal to the question of highlighting matches. There it absolutely makes sense to mark the specific item — the note, in this case — in the post that fit, and grey out the rest.

    Or, better, keep all parts black (and thus easy to read for those with bad eyesight) and add a mark, bold text or similar to the specific attachments that hit.
  • Well, again, this is really just about what items particular search conditions apply to. You can already do exactly what you describe with the "Child Note" condition. Then it's matching parents, so "Title" will work. With the "Note" condition, "Title" isn't doing anything, because "Note" matches only notes, and "Title" doesn't apply to notes. If you use "Note" but add "Include parent and child items of matching items", you'll get the same as with "Child Note", except the child note will also be black and you'll also get any top-level notes that match.
    Or, better, keep all parts black (and thus easy to read for those with bad eyesight) and add a mark, bold text or similar to the specific attachments that hit.
    But the whole concept of what "matches" depends on the search conditions applying to specific items. "Note" is one search condition among many, and you can select any arbitrary combination for a given search. If "Note" meant "Child Note", as you want it to, it couldn't highlight just the note like you're suggesting, because [Title] [does not contain] [puddings] also matches the parent. So then which is the match to highlight? If both, then how would you Select All of the matching notes? If just the notes, because there was a "Note" condition, then what happens when you remove the "Note" condition and leave the "Title" condition? The matching parents would clearly need to be highlighted, but then why weren't they highlighted when the "[Child] Note" condition was there? What if you were also matching on the parent item's tags and creator?

    A couple possible approaches here:

    1) Have more conditions that explicitly reference parent or child. So you could have "Note" as it is now but have a "Parent Title" to restrict based on the parent, while allowing just the note to match. This would be the inverse of the current "Title" and "Child Note".

    2) Have conditions that treat parent and child items more like cohesive units, like you're suggesting, but then add some concept of filters that pare down the actual matches at the end so that Select All can continue to work. It's technically possible to do this sort of thing now by chaining saved searches, though it might be a bit buggy. You can, for example, do a search on "Title", add "Include parent and child items of matching items", save that search, and then use it as the source for a second search with [Item Type] [is] [Note] to get just the notes of items with that title. But there could potentially be a way to add conditions as filters to the end without creating two separate searches. (Of course, this is just equivalent to a "Parent Title" condition.)
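
    The chained-search idea in 2) and its equivalence to a "Parent Title" condition can be sketched with a toy model. Everything here is hypothetical: the item dictionaries, the helper names `title_contains` and `with_family`, and the sample titles are assumptions for illustration, not Zotero's data model or API.

```python
items = [
    {"id": 1, "type": "book", "title": "Foo and Its Uses", "parent": None},
    {"id": 2, "type": "note", "title": "", "parent": 1},
    {"id": 3, "type": "attachment", "title": "Full Text PDF", "parent": 1},
    {"id": 4, "type": "book", "title": "Something Else", "parent": None},
    {"id": 5, "type": "note", "title": "", "parent": 4},
]


def title_contains(word):
    return lambda item: word.lower() in item.get("title", "").lower()


def with_family(matched, all_items):
    # Toy version of "Include parent and child items of matching items".
    ids = {item["id"] for item in matched}
    parent_ids = {item["parent"] for item in matched if item["parent"] is not None}
    return [item for item in all_items
            if item["id"] in (ids | parent_ids) or item["parent"] in ids]


# Search 1: match on title, then pull in parents and children.
first = with_family([i for i in items if title_contains("Foo")(i)], items)

# Search 2: use search 1 as the source and keep only notes.
chained = [i["id"] for i in first if i["type"] == "note"]

# The same result expressed directly as a "Parent Title" condition.
by_id = {i["id"]: i for i in items}
direct = [i["id"] for i in items
          if i["type"] == "note"
          and i["parent"] is not None
          and title_contains("Foo")(by_id[i["parent"]])]

print(chained, direct)
```

    In this model both routes return only the note attached to the item whose title contains "Foo", so Select All would pick up just that note.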
  • I'm sorry; since Mystery Checkbox fixes my problem I'll let this go.

    But I'm still curious about my example above: There really is no point to an exclusion criterion, is there, unless (I guess) you also search for a positive criterion in the same subset of the post?
  • I think you're still just misunderstanding the items that the different conditions apply to—you'd have to read what I wrote above for the details. Again, excluding based on title works just fine. You just can't exclude a title from a search that isn't going to match any parent items because it's already limited to notes by the "Note" condition. If you want to exclude notes based on the parent item's title, use "Title" and "Child Note".