Importing a large number of PDF files

I've used various programs over the years, generating many duplicates (i.e. a mess), and now that I've settled on zotero I'm facing the following issues:

1- zotero was unable to extract metadata for some pdfs. Is there a way I could export those items without parent into a specific folder, try another tool such as Mendeley or Papers, and reimport them? Right now they're listed as unhelpful filenames in zotero, each in its own subdirectory buried somewhere, and it would take forever to open them one by one just to decide whether they're even worth keeping.

2- The items for which zotero was able to identify the metadata and make a parent item had many duplicates. I successfully merged them, but I now notice that the child attachments are duplicated. This is a waste of storage; some items have four copies of the same pdf file. Is there anything I can do to clean this up, or at least identify the items with more than one pdf attachment and select the one I want to keep?

3- My 2 GB online quota is filling up quickly (partly because of this duplication). I would like to find out if some pdf files are particularly heavy and remove them from online sync (I just found one scanned pdf by chance that was 2.1 GB...). Is there a way I could sort attachments by file size?

I'm using the latest standalone version on a Mac.

Thank you.
  • edited October 10, 2016
    (1) In the main library view (i.e., not in a collection) in the upper right corner of the center pane, click the More Columns icon and choose Item Type. This will show the type of item as a column that you can sort on. Files without a parent will have a type of "Attachment"

    (2) Also in the main library view, type the "+" key. This will expand all of the items so that you can visually see which ones have duplicate attachments.

    (3) You can't sort by attachment size within Zotero. However, you could use a file system analysis tool to find large items in Zotero's storage folder (https://www.zotero.org/support/zotero_data). Once you have identified the large items, search in Zotero for the 8-character folder name the file is in (e.g., AEOQ2P53). This will show you the attachment item in Zotero so you can delete it (or move the file to a different location on your computer and re-attach the file as a link).
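
    For (2), expanding every item works but gets tedious in a big library. Here is a rough sketch of a script that looks for byte-identical PDFs in the storage folder instead. It is not a Zotero feature: the function names `file_digest` and `find_duplicate_pdfs` are made up for this example, and it assumes each attachment sits in its own 8-character subfolder of the storage folder, as described in (3). It only reads files and never deletes anything.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_pdfs(storage_dir):
    """Return lists of PDF paths under storage_dir that have identical content."""
    # First pass: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for path in Path(storage_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() == ".pdf":
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]


# Hypothetical usage: python find_dupes.py /path/to/Zotero/storage
if __name__ == "__main__" and len(sys.argv) > 1:
    for group in find_duplicate_pdfs(sys.argv[1]):
        print(f"{len(group)} identical copies:")
        for path in group:
            # The parent folder name is the key you can search for in Zotero.
            print(f"  {path.parent.name}  {path.name}")
```

    Each printed folder name can then be searched for in Zotero, as in (3), so the extra copies are removed from within Zotero rather than from the disk directly.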
  • Thanks for the tips!

    1- I see; hadn't realised I could export those pdfs to a folder by just dragging them outside zotero

    2- the + key is very handy, thanks!

    3- I guess I was hoping there was a built-in way to do it within zotero, but the search for folder name will do
  • I have the same question as (3) above: how do I find large PDFs to remove, because I'm out of storage?

    First, I could not figure out what you mean by pressing the "+" key in the main library. Could you please elaborate?

    Also, I located my Zotero storage folder but how do I run a file system analysis?
  • edited May 21, 2020
    (2) Typing "+" means you hit the key combination required to type the "+" symbol. On most keyboards that would mean SHIFT and the key that has the "+" symbol on it. Select the items in Zotero and type the "+" key to expand the items. You can also hit the minus key to collapse the items.

    (3) If all you want to do is find the largest PDFs, browse to the storage folder using Windows file explorer and then in the search bar of the file explorer type "*.pdf" and hit enter. Windows will show you the list of all PDFs in the storage folder. Change the view to show details, and make sure the Size column is activated. Sort by size to see the largest PDFs at the top. You could also type "*.*" to see all files in storage and not just PDF.
    Example: https://1drv.ms/u/s!Atcr8aCyjBrulaV6jIoEXA6O1ch3WQ?e=V7p2Iw

    Or, you could install a tool. I haven't used these, but here's a list:
    https://www.itechtics.com/15-tools-visualize-file-system-usage-windows/
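
    The original poster is on a Mac, so here is a rough cross-platform alternative to the Windows search above: a short script that lists the largest files in the storage folder. It is only a sketch. The function name `largest_files` and its parameters are invented for this example, and the storage path has to be supplied by you. It prints the name of the folder each file is in, which is the 8-character name you can search for in Zotero.

```python
import sys
from pathlib import Path


def largest_files(storage_dir, suffix=".pdf", limit=20):
    """Return up to `limit` (size_in_bytes, folder_name, file_name) tuples,
    largest first. Pass suffix=None to include every file type."""
    rows = []
    for path in Path(storage_dir).rglob("*"):
        if not path.is_file():
            continue
        if suffix is not None and path.suffix.lower() != suffix.lower():
            continue
        rows.append((path.stat().st_size, path.parent.name, path.name))
    # Sort by size only, largest first.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


# Hypothetical usage: python largest.py /path/to/Zotero/storage
if __name__ == "__main__" and len(sys.argv) > 1:
    for size, folder, name in largest_files(sys.argv[1]):
        print(f"{size / 1_000_000:10.1f} MB  {folder}  {name}")
```

    Calling `largest_files(path, suffix=None)` corresponds to the "*.*" search and lists every file rather than only PDFs.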