Zotero search duds + Gibberish characters
Hi, I have had two issues with Zotero for Apple. Issue (1) prevents me from using Zotero as a linked database.
(1) The first is searching my library in the downloaded version of the program. I search, even for a single word, using the search box at the top middle of the Zotero window. Zotero returns hits in some of the "Notes" which are associated with a single bibliographic entry. That's OK, because I would also expect the hits to be in the "Notes". BUT: there are no actual hits there, nor within the keywords. It will find "real hits", but those are in the minority.
(2) The second issue regards gibberish character streams, such as †or “ or ’ or ’ or – or … etc. I had exported the results from a stored search in RDF format, and then reloaded it into a new Group Library. The re-uploaded file contains those characters randomly embedded in the text of notes and titles (yes, the Euro symbol is a favorite).
Of possible relevance to (1) is my observation that the native Apple search engine also finds bogus words throughout e.g. pdf files.
What's going on here?
Thanks!
If you think you're getting matches you shouldn't, could you share the URL for one of the items you think shouldn't match when you view it in your web library, along with the search term and search mode (e.g., "Everything") you're using?
Well, are you talking about notes or are you talking about PDF files? Those would be totally different issues. If the OS search finds words within the PDF files, that's likely just due to bad OCR for the file, particularly if these are older papers that were scanned. Zotero will just search the hidden text layer in the document.
First, to copy items to a group library, you can just drag; there's no need to export and import. But exporting and importing shouldn't affect those characters. Have you checked to confirm that you don't have the same thing in the original library? Can you export a single item this is happening for, upload it somewhere, and provide a link here?
Thanks for your quick response. Here are my inline answers:
BUT: there are no actual hits there, also within the key words. It will find "real hits", but those are in the minority.
-I'm not sure what you mean by that. You're saying it finds matches that shouldn't be matches? To be clear, you understand the difference between the black and gray results? That the child items will always appear, regardless of whether they match the search, but will appear in gray?
--Yes, it’s finding matches in the child items which shouldn’t be matches. Specifically, the notes for a bibliographic entry are incorrectly colored black.
-If you think you're getting matches you shouldn't, could you share the URL for one of the items you think shouldn't match when you view it in your web library, along with the search term and search mode (e.g., "Everything") you're using?
--I searched for a term under “Everything”. I then opened each child entry which appeared black and searched it with the search function that works inside child entries. That way I determined whether the hit was correct.
For example, I searched for the term “qua” in the library named “Temp Video”. Here are the first six bogus hits from that search, together with my comments:
1. Entry: Energy Efficiency 2019; child: “p 37-1 Figure 2.14. …” fake hit
2. Entry: “The carbon footprint of streaming video: fact-checking…”; child “(5) Although the carbon footprint…” fake hit (has pasted graphic file)
3. Entry: “Semiconductor device, display system and electronic device” child: “Not N-L-O relevant”
4. Entry: “Chip-to-chip interconnect with embedded electro-optical bridge structures”; child: “FIG. 3 is an exploded view….” fake hit (has pasted graphic file)
5. Entry: “Energy 101: Energy Efficient Data Centers”; child: “Apparently posted on 5/25/2015”. This child has an embedded URL. That URL does indeed contain the character string which I was using to test.
6. Title: “What is Fiber Optic Transceiver | Optcore.net”; child “Definition of Fiber Optic Transceiver”….fake hit (has pasted graphic file)
Of possible relevance to (1) is my observation that the native Apple search engine also finds bogus words throughout e.g. pdf files.
-Well, are you talking about notes or are you talking about PDF files? Those would be totally different issues. If the OS search finds words within the PDF files, that's likely just due to bad OCR for the file, particularly if these are older papers that were scanned. Zotero will just search the hidden text layer in the document.
--I don’t save separate copies of indexed pdf files in the Zotero registry. That would become far, far too large. Instead, I just paste the name of the relevant pdf file into the individual Zotero entry. So it isn’t possible to search in the manner which you have just described.
I had exported the results from a stored search as rdf-format, and then reloaded it into a new Group Library.
-First, to copy items to a group library, you can just drag — there's no need to export and import. But exporting and importing shouldn't affect those characters. Have you checked to confirm that you don't have the same thing in the original library? Can you export a single item this is happening for, upload it somewhere, and provide a link here?
--Initially I had just tried to “drag-and-drop”. But for whatever reason, that didn’t work as intended. But now, yes it did. So this problem appears to have defaulted.
Thanks so far.....
Note, though, that you shouldn't be pasting graphics into notes, and if you do so 1) you could easily get false matches, because image data will contain lots of random characters and 2) you likely won't be able to sync those notes. We'll support embedding images in notes in the future, but for now it's not supported and shouldn't be done.
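To illustrate why image data can produce false matches, here is a rough sketch. It assumes the pasted graphic ends up in the note as base64 text and that the search is case-insensitive; neither is confirmed in this thread, and the helper names `count_hits` and `expected_hits` are made up for the example. Real image data is not uniformly random, so treat the numbers as an order-of-magnitude estimate only.

```python
import base64
import os


def count_hits(haystack: str, term: str) -> int:
    """Count case-insensitive (possibly overlapping) occurrences of term."""
    h, t = haystack.lower(), term.lower()
    return sum(1 for i in range(len(h) - len(t) + 1) if h.startswith(t, i))


def expected_hits(n_chars: int, term: str) -> float:
    """Expected chance hits for a letters-only term in n_chars of
    uniformly random base64 text (assumption: each of the 64 symbols is
    equally likely, and each letter matches in upper or lower case)."""
    positions = max(n_chars - len(term) + 1, 0)
    return positions * (2 / 64) ** len(term)


if __name__ == "__main__":
    # Stand-in for a pasted image: 200 kB of random bytes, base64-encoded.
    fake_image = base64.b64encode(os.urandom(200_000)).decode("ascii")
    print("actual hits for 'qua':", count_hits(fake_image, "qua"))
    print("expected hits:", round(expected_hits(len(fake_image), "qua"), 1))
```

Under these assumptions a three-letter term like "qua" turns up by chance roughly once per 32,768 characters, so a note holding even a modest pasted image would be expected to match.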
I still don't really know what you mean re: PDFs. The fact that macOS finds words in PDF files is only relevant if you've added those PDF files to your Zotero library. I don't know what you mean by "defaulted" here. But exporting to RDF still shouldn't result in corrupted characters that weren't in the original library.
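For reference, character runs like the ones reported in (2) are what typically appears when UTF-8 text is read back as Windows-1252 somewhere along the way. Whether that is what happened in this export/import round trip is not established here; the sketch below only reproduces the symptom, and the function names `garble` and `repair` are hypothetical.

```python
def garble(text: str) -> str:
    """Simulate the damage: encode as UTF-8, then misread the bytes as
    Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo the damage, assuming the text was garbled exactly once in
    the way garble() does it."""
    return text.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    # Right single quote, left double quote, en dash, ellipsis.
    for ch in ["\u2019", "\u201c", "\u2013", "\u2026"]:
        print(repr(ch), "->", garble(ch))
    print(repair(garble("It\u2019s a test\u2026")))
```

Each of these punctuation marks is three bytes in UTF-8, and the first two bytes decode to "â" and "€" in Windows-1252, which would explain why the Euro symbol shows up so often. Note that a few characters (the right double quote, for example) contain a byte that Python's `cp1252` codec leaves undefined, so `garble` raises an error for them rather than returning text.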
The library which is named "Temp Video" has 38 entries.
For the online version of “Temp Video”:
Searching for “qua” gives hits in five different entries.
There is no way to distinguish "dark" and "light" colored child notes for a hit (or hits) in a given entry. So I don't see how I can send you a URL.
For the local version of “Temp Video”:
When I carry out exactly the same search on the local version (on my Mac) of that same library, I get back 33 hits (not five).
Now I can see which children are colored dark or light. Hence, now I can see the fake hits.
After this message, I can try to send you the URL for one of them, but from the online Zotero, not the local one, even though you won’t see any dark color on the child notes there at all.
Yes, I had also considered that the graphics which I had pasted into the child-notes were somehow responsible for the 33 dud hits. Here is what I observed for the first six of the 33 hits:
#2, #4 and #5 had pasted graphics
#5 had an embedded URL which contained a string with "qua"
#3 had no distinguishing characteristics
Please recall that the entries in “Temp Video” were copied out of a saved search. That initial saved search is called “z_Any video, Parent and Child”
If I search “z_Any video, Parent and Child” for “qua”, then TWO hits are returned. Neither of them has any dark children or main entries.
Thanks
Fenton
PS: I used the term “defaulted” to mean that the issue of "drag-and-drop" had been solved.
Another parent has a title starting with “Energy 101: Energy Efficient Data Centers…”, and its child note starting with “Apparently posted on..” contains “qua”. In this case, “qua” is in the lengthy URL which is embedded in the text. The online version of this entry itself has the URL:
https://www.zotero.org/groups/2549006/temp_video/search/Energy 101/titleCreatorYear/items/KEWLFR54/item-list
The search on the Internet version had five hits, even though it wasn't possible to identify the children which produced those hits. Those same hits are also found on the local version.
Summary: my observations from today appear to be the _opposite_ of those from yesterday. Yesterday I was finding bogus hits on the local version. Now the local search appears to give only authentic ones. However, the online version misses most of those.
This all appears to be working properly. You were just seeing matches from embedded image data and links, and hadn't yet fully synced your items with the web library.
Fortunately, the Internet version gave no bogus hits.
Does the native link function allow linking to a local, non-url file, especially just a graphic one?
Thanks!