more than one data subdirectory

I need to clean up some disk space, and a duplicate file finder has found more than one copy of some PDFs in the Zotero data directory. Each copy is in its own subfolder. They are definitely dupes. Zotero is only using one of these subfolders, though, and the other copies don't appear in Zotero. Zoplicate doesn't see them.
Is there a way that I can re-scan my data directory so that I can find these copies? I can find them one by one, but there are a good number, and sometimes there are three or four. No idea how this happened, but I'd like to clean it up.
  • Finding "orphaned" attachments is a difficult problem, because by definition Zotero has no record of them. And by the way, AFAIK there is no known Zotero bug that would cause them to occur. More often they seem to arise when people try to do odd things with their data folder from their OS.

    You can get a list of all PDF attachments under Zotero\storage that Zotero *does* know about by running the following code under Tools > Developer > Run JavaScript (the code is for Windows*):

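    // List the full paths of all stored PDF attachments that Zotero knows about.
    // libraryID=1 is normally My Library, so group libraries are not included.
    var sql = "SELECT (? || '\\storage\\' || key || REPLACE(path,'storage:','\\')) AS filepathnames "
        + "FROM itemAttachments JOIN items USING (itemID) "
        + "WHERE libraryID=1 AND path IS NOT NULL AND path LIKE ? "
        + "ORDER BY filepathnames";
    var filepathnames = await Zotero.DB.columnQueryAsync(
        sql,
        [
            Zotero.DataDirectory.dir,  // data directory, prepended to each path
            'storage:%.pdf'            // stored (not linked) attachments ending in .pdf
        ]
    );
    return filepathnames.join('\n');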

    *On Mac or Linux you would probably need to change every \\ to /.

    You would then need to compare that list to a list of all the PDFs your OS finds under Zotero\storage. To get that list, run a search for *.pdf in your OS at that level. In Windows 11, you can copy all the found file paths to the clipboard by selecting all the results, right-clicking, and choosing Copy as path.

    Some code would be the least tedious way to then compare the two lists. Simplest would perhaps be two columns in Excel, one for each list, with an additional column of LOOKUP functions that tells you YES or NO for each PDF in your OS list ... whether or not that PDF is also in the Zotero list.

    If you do use a search/match code approach like that, be careful that your search method handles diacritics in file names (e.g. author names). Otherwise you may get a 'false negative' on those files (i.e. your code says Zotero has no record of the file when it is actually in the list).
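    As an alternative to Excel, the comparison could be sketched in plain JavaScript along the following lines. This is only a rough illustration, not a tested tool: the function names normalizePath and findOrphans, and the sample paths and folder keys, are made up for the example. It assumes each list is one path per line, strips the double quotes that Copy as path adds, and normalizes Unicode so that composed and decomposed diacritics compare equal.

```javascript
// Normalize a path so that equivalent spellings compare equal.
function normalizePath(p) {
  return p
    .trim()
    .replace(/^"|"$/g, '')  // Copy as path wraps each path in double quotes
    .replace(/\\/g, '/')    // treat \ and / as the same separator
    .normalize('NFC')       // composed vs decomposed diacritics
    .toLowerCase();         // assumes a case-insensitive file system (Windows)
}

// Return the lines of osList that have no match in zoteroList.
// Both arguments are strings with one path per line.
function findOrphans(osList, zoteroList) {
  const known = new Set(
    zoteroList.split(/\r?\n/).filter(l => l.trim()).map(normalizePath)
  );
  return osList
    .split(/\r?\n/)
    .filter(l => l.trim())
    .filter(l => !known.has(normalizePath(l)));
}

// Hypothetical sample data: the output of the Zotero query...
const zoteroList = [
  'C:\\Users\\me\\Zotero\\storage\\ABCD1234\\M\u00FCller 2020.pdf',
].join('\n');

// ...and the paths copied from the OS search. The first entry spells the
// u-umlaut in decomposed form (u + combining diaeresis), but is the same file.
const osList = [
  '"C:\\Users\\me\\Zotero\\storage\\ABCD1234\\Mu\u0308ller 2020.pdf"',
  '"C:\\Users\\me\\Zotero\\storage\\WXYZ5678\\M\u00FCller 2020.pdf"',
].join('\n');

const orphans = findOrphans(osList, zoteroList);
console.log(orphans.join('\n'));
```

    With the sample data, only the copy under the WXYZ5678 folder is reported, because the ABCD1234 entry matches once the diacritics are normalized. Anything the script reports is still worth checking by hand before deleting it.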

    If you want to quickly check whether an *individual* PDF you find under Zotero\storage is known to Zotero, you can paste its 8-char folder name into quick search for All Fields & Tags.
  • (This is also usually just a misunderstanding and someone has the same file in multiple libraries.)
  • (And if you're just trying to clear up disk space, and you're sure that all your files are online, you can just delete all PDFs from within the 'storage' folder and switch Zotero to download files on demand. A future version will make it possible to automatically remove local copies of files that have been uploaded to save on local disk space.)
  • thank you!