more than one data subdirectory
I need to clean up some disk space, and a duplicate file finder has found more than one copy of some .pdfs in the Zotero data directory. Each is in its own subdirectory folder. They are definitely dupes. Zotero is only using one of these subdirectories, though, and the other copies don't appear in Zotero. Zoplicate doesn't see them.
Is there a way that I can re-scan my data directory so that I can find these copies? I can find them one by one, but there are a good number, and sometimes there are three or four. No idea how this happened, but I'd like to clean it up.
You can get a list of all PDF attachments under Zotero\storage that Zotero *does* know about by running the following code under Tools > Developer > Run JavaScript (the code is for Windows*):
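// Build the full path of every stored PDF attachment in library 1 (normally My Library).
// Stored paths look like 'storage:filename.pdf'; the item key is the 8-character folder name.
var sql = "SELECT (? || '\\storage\\' || key || REPLACE(path,'storage:','\\')) AS filepathnames FROM itemAttachments JOIN items USING (itemID) WHERE libraryID=1 AND path IS NOT NULL AND path LIKE ? ORDER BY filepathnames";
var filepathnames = await Zotero.DB.columnQueryAsync(
    sql,
    [
        Zotero.DataDirectory.dir,   // first ?: the Zotero data directory
        'storage:%.pdf'             // second ?: only stored files ending in .pdf
    ]
);
// Return one path per line
return filepathnames.join('\n');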
*Mac or Linux probably requires changing all \\ to /.
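If editing the SQL string by hand is inconvenient, the separators could also be swapped after the query has run. This is only a sketch in plain JavaScript: `toForwardSlashes` is a made-up helper name (not part of Zotero), the sample path is invented, and it assumes no file or folder name contains a literal backslash.

```javascript
// Hypothetical helper (not a Zotero API): swap Windows separators for forward slashes.
// Assumes no file or folder name contains a literal backslash.
function toForwardSlashes(paths) {
    return paths.map(p => p.replace(/\\/g, '/'));
}

// Invented example: a path as the unmodified query would assemble it on a Mac
const sample = ['/Users/me/Zotero\\storage\\ABCD2345\\Smith 2020.pdf'];
console.log(toForwardSlashes(sample).join('\n'));
```

With the helper pasted above the query, the last line of the snippet would become `return toForwardSlashes(filepathnames).join('\n');`.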
You would then need to compare that list to a list of all PDFs you find under Zotero\storage in your OS. To get a list of those, you would run a search for *.PDF in your OS at that level. In Windows 11, you can save all the found file paths to the clipboard by selecting all the results, right-clicking, and choosing Copy as path.
Some code would be the least tedious way to then compare the two lists. Simplest would perhaps be two columns in Excel, one for each list, with an additional column of LOOKUP functions that tells you YES or NO for each PDF in your OS list ... whether or not that PDF is also in the Zotero list.
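If you would rather skip Excel, the comparison could be sketched in a few lines of plain JavaScript (Node or a browser console would do). Everything named here is an assumption for illustration: `findUnknownPdfs` is a made-up helper, and the sample paths are invented. It takes both lists as pasted text, strips the quotes that Copy as path puts around each path, and ignores letter case because Windows paths are case-insensitive.

```javascript
// Hypothetical helper: list the PDFs found on disk that are missing from Zotero's list.
// Both arguments are multi-line strings with one path per line.
function findUnknownPdfs(osListText, zoteroListText) {
    // Split into lines, drop blanks, and strip surrounding double quotes.
    const clean = text => text
        .split(/\r?\n/)
        .map(line => line.trim().replace(/^"|"$/g, ''))
        .filter(line => line !== '');
    // Windows paths are case-insensitive, so compare lowercased copies.
    const known = new Set(clean(zoteroListText).map(p => p.toLowerCase()));
    return clean(osListText).filter(p => !known.has(p.toLowerCase()));
}

// Invented example data
const zoteroList = 'C:\\Zotero\\storage\\ABCD2345\\Smith 2020.pdf';
const osList = [
    '"C:\\Zotero\\storage\\ABCD2345\\Smith 2020.pdf"',
    '"C:\\Zotero\\storage\\WXYZ6789\\Smith 2020.pdf"'
].join('\n');
console.log(findUnknownPdfs(osList, zoteroList));
```

In this example only the copy in the WXYZ6789 folder is reported, since it is the one Zotero's list does not contain.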
If you do use a search/match code approach like that, be careful that your search method handles diacritics in file names (e.g. author names), otherwise you may get a 'false negative' on those (i.e. your code says Zotero has no record of the file, when it is actually in the list).
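One possible cause of such a mismatch (an assumption on my part, there may be others) is that the same accented letter can be stored either as a single character or as a plain letter followed by a combining mark. The two look identical on screen but do not compare as equal. A minimal sketch of one way around it, using JavaScript's built-in `String.prototype.normalize`; `pathKey` is a made-up helper name and the file names are invented:

```javascript
// Hypothetical helper: build a comparison key that treats composed and
// decomposed accented characters (and upper/lower case) as the same.
function pathKey(path) {
    // NFC folds "u" + combining diaeresis into the single character "ü".
    return path.normalize('NFC').toLowerCase();
}

const composed = 'M\u00FCller 2019.pdf';     // "ü" as one code point
const decomposed = 'Mu\u0308ller 2019.pdf';  // "u" followed by a combining diaeresis
console.log(composed === decomposed);                     // false
console.log(pathKey(composed) === pathKey(decomposed));   // true
```

Comparing the keys rather than the raw paths keeps the accented names from being flagged as unknown for this particular reason.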
If you want to quickly check whether an *individual* PDF you find under Zotero\storage is known to Zotero, you can paste its 8-char folder name into quick search for All Fields & Tags.
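If you have a long list of paths and want the folder names without picking them out by eye, something like the following could extract them. It is only a sketch: `storageKey` is a made-up helper, the sample path is invented, and it simply assumes the folder directly under "storage" is 8 letters or digits.

```javascript
// Hypothetical helper: extract the 8-character storage folder name from a file path.
// Returns null when the path has no such folder directly under "storage".
function storageKey(path) {
    const match = /[\\/]storage[\\/]([A-Za-z0-9]{8})[\\/]/.exec(path);
    return match ? match[1] : null;
}

// Invented example path
console.log(storageKey('C:\\Zotero\\storage\\ABCD2345\\Smith 2020.pdf'));
```

The value it returns is the string you would paste into quick search.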