Importing a large number of PDF files
I've used various programs over the years, generating many duplicates (i.e. a mess), and now that I've settled on zotero I'm facing the following issues:
1- zotero was unable to extract metadata for some pdfs. Is there a way I could export those items without parent into a specific folder, try another tool such as Mendeley or Papers, and reimport them? Right now they're listed as unhelpful filenames in zotero, each in its own subdirectory buried somewhere, and it would take forever to open them one by one just to decide whether they're even worth keeping.
2- The items for which zotero was able to identify the metadata and make a parent item had many duplicates. I successfully merged them, but I now notice that the child attachments are duplicated. This is a waste of storage: some items have the same pdf file 4 times. Is there anything I can do to clean this up, or at least to identify the items with more than one pdf attachment and select the one I want to keep?
3- My 2 GB online quota is filling up quickly (partly because of this duplication). I would like to find out whether some pdf files are particularly heavy and remove them from online sync (I just found one scanned pdf by chance that was 2.1 GB...). Is there a way I could sort attachments by file size?
I'm using the latest standalone version on a Mac.
Thank you.
(2) Also in the main library view, press the "+" key. This expands all of the items so that you can see at a glance which ones have duplicate attachments.
(3) You can't sort by attachment size within Zotero. However, you could use a file system analysis tool to find large items in Zotero's storage folder (https://www.zotero.org/support/zotero_data). Once you have identified the large files, search in Zotero for the 8-character name of the folder the file is in (e.g., AEOQ2P53). This will show you the attachment item in Zotero so you can delete it (or move the file to a different location on your computer and re-attach it as a link).
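If you are comfortable running a script, the file system analysis in (3) can be approximated with a few lines of Python. This is only a sketch, not a Zotero feature: the function name `largest_files` is made up for illustration, and the path `~/Zotero/storage` is an assumption about where your data directory lives (check the location shown in Zotero's preferences). It prints the biggest files together with the 8-character folder name to search for in Zotero.

```python
from pathlib import Path


def largest_files(storage_dir, top_n=20, pattern="*"):
    """Return (size_in_bytes, folder_name, file_name) for the biggest files.

    folder_name is the first-level folder under storage_dir, which for
    Zotero's storage folder is the 8-character name you can search for.
    """
    root = Path(storage_dir)
    rows = []
    for path in root.rglob(pattern):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Files sitting directly in the storage root have no item folder.
        folder = rel.parts[0] if len(rel.parts) > 1 else ""
        rows.append((path.stat().st_size, folder, path.name))
    rows.sort(reverse=True)  # largest first
    return rows[:top_n]


if __name__ == "__main__":
    # ASSUMPTION: default data directory; adjust to match your setup.
    storage = Path.home() / "Zotero" / "storage"
    if storage.is_dir():
        for size, folder, name in largest_files(storage, pattern="*.pdf"):
            print(f"{size / 1e6:10.1f} MB  {folder}  {name}")
    else:
        print(f"Storage folder not found at {storage}")
```

Pass `pattern="*"` instead of `"*.pdf"` to include every file type. The script only reads file sizes and changes nothing, so deleting or re-linking is still done from within Zotero.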
1- I see; hadn't realised I could export those pdfs to a folder by just dragging them outside zotero
2- the + key is very handy, thanks!
3- I guess I was hoping there was a built-in way to do it within zotero, but the search for folder name will do
First, I could not figure out what you mean by pressing the "+" key in the main library. Could you please elaborate?
Also, I located my Zotero storage folder but how do I run a file system analysis?
(3) If all you want to do is find the largest PDFs, browse to the storage folder using Windows File Explorer, type "*.pdf" in the File Explorer search bar, and hit Enter. Windows will show you a list of all PDFs in the storage folder. Change the view to Details and make sure the Size column is shown. Sort by size to see the largest PDFs at the top. You could also type "*.*" to see all files in storage and not just PDFs.
Example: https://1drv.ms/u/s!Atcr8aCyjBrulaV6jIoEXA6O1ch3WQ?e=V7p2Iw
Or, you could install a tool. I haven't used these, but here's a list:
https://www.itechtics.com/15-tools-visualize-file-system-usage-windows/
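For the duplicated attachments in question 2, a script can also flag PDFs in the storage folder that are byte-for-byte identical. The following is a rough sketch under a few assumptions: `find_identical_pdfs` and `file_digest` are hypothetical helper names, the storage path has to be supplied by you, and identical content does not by itself prove that two files hang off the same parent item, so each hit still needs checking in Zotero by folder name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large scans fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical_pdfs(storage_dir):
    """Return groups (lists of paths) of PDFs with identical content."""
    # Group by size first: files of different size cannot be identical,
    # so only same-size candidates need to be hashed.
    by_size = defaultdict(list)
    for path in Path(storage_dir).rglob("*.pdf"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:
            for p in paths:
                by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

A typical use would be to loop over `find_identical_pdfs(your_storage_path)` and print `p.parent.name` for each path in a group, then search for those folder names in Zotero and delete the surplus attachments there rather than removing files from the storage folder by hand.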