PDF articles in the Data Directory are not appearing in Library

Let me preface all this with I am not using Zotero's Sync Service.

After an earlier failed attempt at building my library on a NAS I own, I manually moved the files onto my computer, which appears to have caused a large number of my PDF articles to disappear from the My Library search. I can 100% verify the PDF files are in the Data Directory, and Zotero is seeing other items in that folder and adding new items to it.

I was racking my brains to cipher how to search for the Data Directory folder name when I stumbled across https://forums.zotero.org/discussion/120075/zotero-randomly-removes-pdfs-from-parent-item-but-keeps-it-in-storage-missing-annotations and searched for a few folders with the eight character name to verify they are in fact missing from My Library. They are.

I want to ask if there is a way to re-scan that folder and locate the PDF files that are in the Data Directory but missing from My Library. I am betting there is, but I am not seeing it when looking through the help or the menus for such a search. I have tried Rebuild Index twice: once when I moved the library, and again just recently upon learning the PDF files are in the Data Directory but missing from My Library.

Worst case option I think is to simply write a script to pull out all the PDF files into a single folder and start from scratch to rebuild, but would like to try and avoid this if possible.
  • edited 14 days ago
    I'm not really following this, but I think you're misunderstanding a lot of parts here.

    Can you just explain the exact problem you're facing within Zotero itself? The data directory is managed by Zotero, and none of it really works the way you're describing. Moving PDFs in the data directory wouldn't change what you see in your Zotero library — those are just attachment files.
  • I think you might be reading my description incorrectly actually.

    TLDR version - Zotero points to a Data Directory that has 100 individually unique items in PDF format but Zotero's My Library only shows 90 of those items. I want to get all 100 to be shown. How?

    Longer version.
    1. I made an attempt at storing my Original Data Directory on a Synology NAS I have, meaning that Edit -> Settings -> Advanced -> Data Directory Location was pointing to the Synology NAS. This didn't work as desired.
    2. I cloned the Original Data Directory using FreeFileSync from the location on the Synology NAS to my Laptop and edited Zotero so that Edit -> Settings -> Advanced -> Data Directory Location is now pointing at C:\Users\%User%\Documents\Zotero_Library.

    Now, as explained in my first post, I am seeing some but not all of the PDF files (what you are calling attachments). For example, when I randomly select a Title from My Library, right click on it, and select Show File, the Folder within the newly Cloned Data Directory (as I stated in point 2) pops up and I see a .zotero-ft-cache, .zotero-ft-info, .zotero-pdf-state, and the actual PDF file (or what you call attachment). This is as expected. Now, working backwards, this folder with the PDF file (or attachment as you call it) has an eight (8) character name - Q3L4DXPT. When I enter that name into the Search Bar in the main body of the interface with All Fields & Tags selected, the same Title pops up in My Library.

    Now, for PDF files (attachments, whatever you want to call them) that are missing, like the one in a folder called UQCAS3ZP, which is 100% known to be located in the exact same newly Cloned Data Directory as Q3L4DXPT, the same search does not find the attachment.

    So, what I want to know is, is it possible for me to have Zotero go back through and perform a reindexing/index repair/catalog synchronization to find and rebuild My Library with the missing attachments?
  • edited 14 days ago
    I promise you I didn't misread your post. What I'm telling you is that you're misunderstanding how this works and what would've led to it.

    Zotero didn't somehow forget about some of your attachments. There's nothing to "reindex" here (and "Rebuild Index" isn't remotely related to this).

    Zotero is a database, stored solely in zotero.sqlite, and entries in the database remain there unless you delete them from within Zotero. If there are 'storage' folders that you don't see attachment items for in Zotero when you search for the 8-character folder name in All Fields & Tags mode from the library root or the trash (including in any group libraries, if you were using those), you're not using the same zotero.sqlite file that was used to create that 'storage' folder. You might have failed to copy zotero.sqlite when copying 'storage' or vice versa, or you might have corrupted your data directory by putting it on the NAS (e.g., a Zotero instance on one computer that didn't have those attachments overwrote the database — much like what would happen in a cloud storage folder). But the data you see comes entirely from zotero.sqlite.

    You can run these commands from Tools → Developer → Run JavaScript to see if the item is or was ever in this database:

    return await Zotero.DB.valueQueryAsync("SELECT COUNT(*) FROM items WHERE key='UQCAS3ZP'")

    return await Zotero.DB.valueQueryAsync("SELECT dateDeleted FROM syncDeleteLog WHERE key='UQCAS3ZP'")

    If those return 0 and false, respectively, you're using a version of the database where that attachment never existed, and you should try to find your original database, not try to recreate library entries from the 'storage' folder. If you can't, sure, you can run a script to find folders that don't exist in the database and move them out, do a search for all PDFs within those folders, and then drag those PDFs into Zotero again in the hopes that it can retrieve metadata for them, but there's no functionality to find unreferenced files in the 'storage' directory, because that's not something that would ever happen in normal Zotero usage.
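As a rough illustration of the fallback described above (finding 'storage' folders that have no entry in the database), a minimal Python sketch might look like the following. It assumes the data directory contains zotero.sqlite and a 'storage' folder, and it relies only on the items.key column used in the queries above. The function name find_orphaned_storage_folders is made up for this example and is not part of Zotero. It is probably safest to run it with Zotero closed, or against a copy of the data directory.

```python
import sqlite3
from pathlib import Path


def find_orphaned_storage_folders(data_dir):
    """Map each 'storage' subfolder that has no matching items.key to its PDFs.

    Assumption: data_dir holds zotero.sqlite and a 'storage' folder whose
    subfolders are named after 8-character item keys.
    """
    data_dir = Path(data_dir).resolve()

    # Open the database read-only so this script cannot change it.
    uri = (data_dir / "zotero.sqlite").as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        known_keys = {row[0] for row in conn.execute("SELECT key FROM items")}
    finally:
        conn.close()

    orphans = {}
    for folder in sorted((data_dir / "storage").iterdir()):
        if folder.is_dir() and folder.name not in known_keys:
            # Only report the PDFs; cache and state files are ignored.
            orphans[folder.name] = sorted(p.name for p in folder.glob("*.pdf"))
    return orphans
```

Calling find_orphaned_storage_folders on the data directory path returns a dictionary of folder names and the PDF files inside them. Nothing is moved or deleted, so the result can be reviewed before deciding what to do with those folders.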
  • @dstillman I appreciate the help. I ran the queries given and they show the item was never in the database. For the record, I said they were not in the database so no surprise. I don't know how I can have another copy of the database that is not in use, because the Data Directory was simply cloned to my computer and I went into Zotero and changed the pointer.

    On a side note, please refrain from telling any person on the boards they don't know how it works or they are not asking the right question. It is condescending and in no way helpful. If you are not sure what someone is asking and need some clarity simply ask for clarity.

    I have made several posts over the years on this forum about things and in return I hear that's not how it works as opposed to either explaining the process or pointing to a reference for explanation. If someone is asking for a new feature that you and the team don't like, don't get an attitude, move on.

    I fully understand how Zotero works and have looked into the source code on GitHub. I notice in replies you make on the board, some others have done it here and there, that the default is to slam the questioner and belittle them. I am here because I have an issue and therefore do not know something. I can tell you with 100% clarity that databases get corrupted, sometimes for no reason besides age. I can tell you with 100% clarity that if a process is not followed, the end result can be an issue such as missing data.

    I explained how I tried something on a NAS that didn't work and moved the files/PDFs/attachments (whatever the language you want to use) to my computer. I explained how there was an article I wanted to have in Zotero, knew it to exist while being on the NAS, and is now missing from Zotero on my computer. I asked how to find the file. Simply say there is no method to find unreferenced files instead of making patronizing comments like "just explain the exact problem you're facing within Zotero itself?"

    At the end of all this, I moved the files/PDFs/attachments (use the desired lingo of your choice here) in some incorrect manner and the SQL database was corrupted. It is not a mark against Zotero; it's a user who did something incorrectly for which Zotero was not designed. In my world that is an edge case, and you try to cover them but can't always.

    Just stop with treating others not in distillman-land as inferior or less intelligent please.
  • edited 13 days ago
    Telling you that you're misunderstanding something isn't an insult or an attempt to "belittle" you — it's trying to explain that the way you're framing the question isn't the right way to think about it. Properly understanding the problem is the only way to actually solve it properly and prevent it from happening again. I wrote a long, detailed post explaining how this works on a technical level and giving you specific commands to run, so I hope it's clear that my goal is to help you.

    It's just much harder for us to help people when they jump to their idea of the solution (relinking folders in 'storage', writing shell scripts) and start running totally irrelevant commands ("Rebuild Index") rather than following standard reporting guidelines and letting us walk them through proper troubleshooting steps. We really do always want people to start by just explaining the problem in Zotero itself, no matter how technically competent they might be.
    "I ran the queries given and they show the item was never in the database. For the record, I said they were not in the database so no surprise."
    But that's not the same thing at all. Saying they're not in the database is very different from their never having been in the database. That's the point here. Your database almost certainly did not just become corrupted (and you can confirm that from the Advanced → Files and Folders section of the Zotero settings). What I'm trying to explain is that you have a database from a different copy of your data directory than the one your 'storage' folder is from, and so that's where you should begin trying to solve this, by looking for other copies of zotero.sqlite — backups in the data directory, files on the NAS, on other computers, in your own backups, etc.
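One way to approach the search for other copies of zotero.sqlite is to check each candidate file for a known missing key, using the same items.key lookup as the queries above. The Python sketch below is only an illustration under assumptions: the helper name check_key_in_databases is invented, and the pattern zotero.sqlite* is meant to also catch backup copies with names that start the same way. Files that are not readable SQLite databases, or that lack an items table, are reported as "unreadable" rather than raising an error.

```python
import sqlite3
from pathlib import Path


def check_key_in_databases(search_root, item_key):
    """Report, for every zotero.sqlite* file under search_root, whether
    item_key appears in its items table.

    Returns a dict mapping file path -> "present", "absent" or "unreadable".
    """
    results = {}
    for candidate in sorted(Path(search_root).rglob("zotero.sqlite*")):
        if not candidate.is_file():
            continue
        # Read-only URI so candidate databases are never modified.
        uri = candidate.resolve().as_uri() + "?mode=ro"
        try:
            conn = sqlite3.connect(uri, uri=True)
            try:
                row = conn.execute(
                    "SELECT COUNT(*) FROM items WHERE key = ?", (item_key,)
                ).fetchone()
            finally:
                conn.close()
            results[str(candidate)] = "present" if row[0] else "absent"
        except sqlite3.Error:
            # Not a SQLite database, no items table, or locked by another process.
            results[str(candidate)] = "unreadable"
    return results
```

Pointing this at a backup drive or a mounted NAS share with a key such as UQCAS3ZP would show which copy, if any, still knows about that attachment. A copy reported as "present" is a candidate for the original database.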
  • I'm a bit confused by the hostility of this response -- dstillman's reply contained a fair amount of detail on what could be happening (including theories about how exactly you could end up with the 'wrong' version of the database), description about how Zotero's attachment model works, etc.

    I'll add to this that if you have any groups with attachments, you'd have to search for storage strings in every group -- the attachments are all in the same directory, and searching in MyLibrary wouldn't find items in any group.

    Beyond that, if you want a script that just identifies unlinked attachments in the Zotero storage folder, here's e.g. a perl script that you can adapt (it's ancient, but I suspect it'll still work as is) or translate to a preferred scripting language: https://raw.githubusercontent.com/mronkko/ZoteroCleanOrphans/master/ZoteroCleanOrphanedFiles.pl
  • I am not being hostile, I said I appreciate the help, and most importantly I asked to please refrain. But for the record, telling someone "the way you're framing the question isn't the right way to think about it" is the very definition of condescension. If @dstillman or @adamsmith believe that a question is not clear, do not tell the asker they "don't know how it works".

    I understand Zotero uses a database, I understand that I may not use the exact terminology as you, and I am fine when someone tells me that what I call PDF is what Zotero calls Attachment. What I am not fine with, and what the above is clearly trying to convey, is that this was one in a long line of Zotero telling users How Things Are as opposed to either listening and offering an alternative option or having enough understanding to know that all, and I mean all, software has flaws and Zotero is no exception.

    Examples include simply saying Zotero has no method for reviewing a storage location and reindexing it to ensure all the attachments are found in the Zotero library. Telling me I "don't know how it works" is wasteful, arrogant, and gets all parties nowhere.

    Another example is that many have asked for handling complete Journals in the past. This forum is littered with postings by you both telling people to use a Book type or something, and my favorite was when aborel told me a link was a Book and not a Journal as opposed to worrying about helping with the actual question. I can go back through and find more instances of where a user mentions PDF and more than a sentence is wasted explaining the PDF is an attachment or some such. That is terminology, and if you, the reader, are not clear on the meaning, simply move on with a stated assumption that you believe PDF to be an attachment and try to help.

    Book and Journal for many on this forum are the same thing, an electronically bound collection of individual papers. Don't be condescending and tell me that a book and a journal are two separate things and until I learn that fact Zotero will not help - wasted time for both parties and ultimately getting us nowhere.

    Lastly, it is not correct to say that dstillman added useful info or that he even understood my original question.

    "Moving PDFs in the data directory wouldn't change what you see in your Zotero library — those are just attachment files."
    -or-
    "Can you just explain the exact problem you're facing within Zotero itself?"

    Regarding the above sentences: I never said I moved PDFs in the Data Directory. I said I moved the PDFs from a NAS to my computer, which is not the same. Then there's telling me to "just explain" the issue as if the idea hadn't occurred to me. Simply asking for clarity, as opposed to telling me I don't understand, would have been a superior choice and would not have come across as belittling.

    Zotero is a great tool, and I and others appreciate it and the community. I also know that maybe you both do mean well and believe that the word choices you use in posts are harmless, but many times they are not. Blatantly telling a person they don't understand is condescension and shows arrogance. Same with telling someone they don't know how it works, as you have no idea what I know. In the past I have taken the view that neither of you represents the software as a whole and that you are simply doing your part to contribute in your way. But that gets harder over time as I read answers to me and others.
  • OK, this is getting ridiculous. Two people here have spent a significant amount of time trying to help you, and you're just being defensive, belligerent, and rude.
    "telling someone 'the way you're framing the question isn't the right way to think about it' is the very definition of condescension"
    No, it's just not. We're developers of Zotero. You are not. You're not expected to understand all the technical details, but you seem to have some bizarre notion that you are, and that it's an insult if we tell you you're misunderstanding something. If you can't accept that our explaining that you're thinking about something the wrong way is how we help you solve the actual problem you're facing, instead of some made-up other problem, you shouldn't bother posting here. It's a waste of everyone's time.
    "I can go back through and find more instances of where a user mentions PDF and more than a sentence is wasted explaining the PDF is an attachment or some such."
    I mean, you just literally don't understand this? The distinction we make is between a PDF file on disk and an attachment item in Zotero. That's in fact incredibly relevant for your particular issue, and given that you're still, many posts in, talking about moving PDF files on disk as if that's what caused this, it's apparently a distinction that's still lost on you. We'd be happy to continue trying to explain it, but you don't seem interested, so I'm not going to bother.

    I explained above how you would go about solving this properly — by finding the zotero.sqlite file that you were using when you added those PDFs. You can do that, or you can move the orphaned folders out and start from scratch.

    But responding like this to the people taking the time to try to help you understand something you clearly misunderstand so that you can get your data back is just obnoxious, so I'm closing this thread.

    (For the record, the other example you cite involved you hijacking an unrelated thread because you misunderstood something basic. A little more humility on your part would, in fact, be helpful.)
This discussion has been closed.