Search Criteria for Misplaced (not without) Attachments?

edited July 8, 2023
Unfortunately, there has been a linkage problem with the files in my Zotero library, and I'm still not sure about the source of the issue. For background, I come from here: https://forums.zotero.org/discussion/105835/z7-bug-webdav-file-not-found#latest

Nonetheless, I've manually relinked 2000+ entries (so daunting!), which I identify by their semi-transparent PDF icon. I wonder whether the search function can filter for items with such abnormal attachments, to make this more efficient?

Just to clarify, I do not mean entries that have no attachment at all (i.e. those without an icon in the attachment column).

Also, clicking the top of the column only semi-works: it doesn't sort all the entries (probably because of the size of my library, which is over 10k items?), so it has been a bit hit-and-miss so far.
  • Some strategies within Zotero have been suggested for identifying and cleaning up broken links to PDFs, although they are still tedious for large numbers of files, e.g.
    https://forums.zotero.org/discussion/98890/delete-a-large-number-of-dead-links-to-pdf-files-in-zotero-parent-items

    There is also JavaScript code that can be run under Tools → Developer → Run JavaScript to get a list of the PDF files that Zotero thinks are in either the local Zotero\storage folder or at linked file paths, but I am not sure whether similar code would work for WebDAV storage paths. Such lists can be the starting point for finding paths where no PDF actually exists (for one reason or another), using a further batch process or code in another programming language (i.e. outside Zotero).
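As a rough illustration of the "outside Zotero" step mentioned above, here is a minimal Python sketch that walks a local storage folder and reports the attachment subfolders that contain no PDF. It assumes the usual layout of one subfolder per attachment directly under the storage folder; the function name `folders_without_pdf` is hypothetical and not part of Zotero. Folders that legitimately hold other attachment types (for example web snapshots) would also be reported, so the output is only a list of candidates to check by hand.

```python
from pathlib import Path


def folders_without_pdf(storage_dir):
    """Return the names of subfolders of storage_dir that contain no PDF file.

    Assumes one subfolder per attachment directly under storage_dir.
    Hidden files and non-PDF files do not count as a PDF.
    """
    missing = []
    for sub in sorted(Path(storage_dir).iterdir()):
        if not sub.is_dir():
            continue  # ignore stray files at the top level
        has_pdf = any(
            p.is_file() and p.suffix.lower() == ".pdf" for p in sub.iterdir()
        )
        if not has_pdf:
            missing.append(sub.name)
    return missing
```

Calling `folders_without_pdf("/path/to/Zotero/storage")` (a placeholder path) returns the folder names, which correspond to attachment item keys and can then be looked up in the backup.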
  • @tim820 thanks very much for bringing that to my attention, though my issue is rather different from the one in the post you referred to. I suspect that some of my PDFs have annotations embedded, and I'd much rather line them up manually than risk losing a couple along the way. (That said, doing 2k manually is a bit much, I confess, but I'm more than 75% through, I hope/think.)

    What you said about the JavaScript code sounds great, i.e. the idea of finding orphaned files on the Zotero side, the storage side, or both. I'll look into it!

    Note for @dstillman: I speculate that the logic for determining whether a file has changed looks at the modification time of *the folder* of the attachment. I think this should be changed to look at the attachment files themselves. (I *vaguely* remember seeing this in the diagnostic logs, but I am not too sure.) Sometimes I see that Zotero reports a successful sync while there are empty folders in /storage.
  • I speculate that the logic for determining whether a file has changed is by looking at the modification time of *the folder* of the attachment.
    I'm not sure why you think that. It uses the modification time and contents of the file itself.

    I'm not really understanding the situation here, but Zotero will download files that exist on the server. Files are identified by item keys, which never change. If the files aren't on the server, they were never uploaded on the computer where you added them, as explained on Files Not Syncing.

    There's no reason you would ever have to manually add back stored files unless you no longer have access to the Zotero data directory where the files were originally added.
  • edited July 17, 2023
    @dstillman Perhaps I'm not explaining right...allow me to try again.

    I'm still manually linking back the attachments (for whatever fault/reason), and it's still the case that unless I reset the sync history, these files are not synced, even though the sync panel says the sync completed. (I can confirm this by looking in the WebDAV directory.)

    So, we have:
    1) the local directory under the same name/identifier,
    2) the attachment within the directory (now located and saved into the local directory),
    3) Zotero's record of the directory (I actually use the directory name from Zotero to locate the file in my backup).

    What happens is: once I relocate the attachment file, Zotero locally recreates the directory named in its record and copies the attachment into it, but neither the file nor the folder is synced.


    Clarification
    1) The backup I have is of the /storage folder, exactly as it was locally.
    2) Zotero doesn't create this folder on the WebDAV server unless I reset sync history.

    Sometimes, however, folders in my backup are empty (without any files in them). This could be due to my syncing service, of course. The worry here is that there's some structural issue with syncing.

    What have I missed that prevents the sync process from uploading the now-located files to my WebDAV server unless I reset sync history?

    Now, if what you said is right (that Zotero compares the modification time of the attachment file with its record), what does it do when the server doesn't have the file but only the record of once having had it?

    One reason I think it's the folder's modification time rather than the file's is that when the file changes in Zotero, its parent item doesn't change. Of course, I could be miles off from the truth.
  • So you're saying there are files that are missing from both the Zotero storage folder and WebDAV, you're using the Locate button to add the files from a backup back to the storage folder, and the files aren't being automatically uploaded to WebDAV unless you reset file sync history? And just resetting file sync history once isn't enough? You have to do this after every Locate to get it to upload the latest file?

    Can you reset file sync history and sync once without logging running, and then provide a Debug ID for 1) trying to open a file that's missing, 2) using Locate to relink it, 3) syncing and having it not be uploaded, and then another Debug ID after clearing output for resetting file sync history and syncing again such that it's uploaded?

    To be clear, Zotero can't delete attachment files locally on disk or remotely on your WebDAV server without the attachment item also being deleted and removed from the trash, so whatever happened, it wasn't Zotero that did it. If you made the backup from another computer, that's the easiest explanation — it just means the files were never uploaded from the other computer, so they couldn't be downloaded to this one.

    If you have a backup of 'storage' and it would have the same Zotero libraries (i.e., it wouldn't have files from group libraries that you no longer belong to), the easiest thing to do would be to just merge all folders from that backup into your storage folder. Different tools handle that differently, but you'd want to use something that did the merging on a file level and wouldn't replace an existing subfolder that contained an attachment with an empty folder from the backup.
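For anyone without a merge tool that behaves this way, the file-level merge described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `merge_backup` and both paths are hypothetical, existing files in the storage folder are never overwritten, and empty folders in the backup are simply skipped. It would be prudent to close Zotero and try it on a copy of the storage folder first.

```python
import shutil
from pathlib import Path


def merge_backup(backup_dir, storage_dir):
    """Copy files from backup_dir into storage_dir, file by file.

    Files that already exist in storage_dir are left untouched, and
    empty folders in the backup are never created or copied, so an
    existing attachment subfolder cannot be replaced by an empty one.
    Returns the relative paths (POSIX style) of the files copied.
    """
    backup = Path(backup_dir)
    storage = Path(storage_dir)
    copied = []
    for src in sorted(backup.rglob("*")):
        if not src.is_file():
            continue  # folders are created on demand, only when a file needs them
        rel = src.relative_to(backup)
        dest = storage / rel
        if dest.exists():
            continue  # never overwrite what is already in storage
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves modification times
        copied.append(rel.as_posix())
    return copied
```

The returned list shows which files were restored, which makes it easy to spot-check a few of them in Zotero afterwards.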
  • edited July 18, 2023
    Thank you @dstillman, we're getting somewhere.

    I can reset sync history for files in batches (i.e. after locating several files, one reset of the sync history will force them all to sync), so resetting sync history for each individual file is not necessary. No problem on the reset front, then.

    Yes, I'll provide the Debug IDs for the procedures. And yes, you've understood correctly: even after I locate the files and relink them in Zotero, they will not sync unless I reset sync history. No error is produced in the process.

    You mentioned that Zotero doesn't delete attachment files. Granting this, my concern is with why the files weren't synced in the first place (all while the sync panel reported no error). If a user chooses *not* to sync attachments, wouldn't the entry in Zotero just *not* have an attachment (i.e. no PDF icon at all)? In my case, I set Zotero to sync both the attachments and the entries, and the sync reported no error, but on what basis is this "no error" reported if the files were never synced to the WebDAV server in the first place? One easy(?) explanation is that my WebDAV server has been acting up and somehow picks and chooses which files it likes to sync or delete... but if that is the case, then Zotero hasn't been checking the files against the server?

    I have never really used group libraries, so I don't know whether that could be a problem.

    The empty folder case I really struggle to understand...

    Off to generating the debug IDs!
  • edited July 18, 2023
    my concern is with why the files aren't synced in the first place (all while the sync panel reported no error)
    The issue would've been on whatever computer you actually added the files. You might have synced with a different WebDAV server or with Zotero Storage or you might have been getting some error that you didn't notice or there might've been some other problem with the WebDAV server. You wouldn't know any of that from what you see on this computer now.
    If a user chooses to *not* sync attachment, wouldn't the entry in zotero just *not* have an attachment?
    No, whether file syncing is enabled or what service you're using has no bearing on whether there's an attachment item to begin with.
    One easy(?) explanation is that my webdav server had been acting funnily and somehow pick-and-choose what files it likes to sync or delete...but if this is the case then Zotero hasn't been checking the file with the server, then?
    Once Zotero uploads a file, it doesn't continue to check it, no — that would be incredibly inefficient. Zotero has to trust that files aren't going to be randomly deleted.
    The empty folder case I really struggle to understand...
    There's nothing inherently wrong with empty folders, and they may not actually be empty. Zotero creates attachment folders with hidden files for various purposes. Those aren't necessarily any different from other files that were never uploaded from another device and therefore never downloaded onto the computer you made the backup from.

    Anyway, let's stop speculating here — it's not a good use of anyone's time. If you have a Debug ID, we can look at the Locate/sync issue.
  • edited July 18, 2023

    D547267243 (before)
    D2093536298 or D1296533809 (after)