Renaming parent item with full file name

edited April 2, 2021
For a given set of items in a collection, is there any way to copy the full filename as the parent title instead of having it truncate parts it thinks are irrelevant? I want to go through the automated ZotFile rename-and-move process, but I don't want to lose potentially valuable data that I could parse manually later.

To be clear, I'm just looking to do this for a collection of items rather than make this a default behavior.
  • I thought Zotero also just transferred the full filename to the title. Do you have an example where that's not the case?
  • edited April 2, 2021
    Well, when you add PDFs and use the get metadata function, it tries to extract relevant metadata and figure out the title, author, year, publisher and everything, right? I assume that while it tries to read from the contents of the PDF itself, some of this is parsed from the filename (or maybe I'm mistaken). One of the things it does is remove what it thinks are the subtitles and copy just the title to the parent item. This doesn't always work, and I frequently have the subtitle copied over instead of the title, which seems to be discarded. There are also multiple cases where the year and journal are in the filename but aren't transferred to the parent metadata. Here's a specific instance, though there are many others:

    A file named "The Road to Eleusis Unveiling the Secret of the Mysteries - 1979 - Journal for the Scientific Study of Religion" gives the parent metadata title "Unveiling the Secret of the Mysteries" and discards both the year and the journal. Yet "The Road to Eleusis" is the most important part of that title, not what comes after it. I can only assume Zotero thinks that's the subtitle, but it's clearly a little confused. It does, however, copy over the authors, which are not in the filename, which makes me think it doesn't even look at the filename.

    There are many cases where it removes the subtitle when it would really be great to keep it instead. These often seem to be things that come after " - ", ": " and "_", but not always (see the example above). But why discard valuable metadata in the first place? In the interest of keeping a short title? I'm personally not sure why Zotero is set to try to remove the subtitle, which seems to be the main issue causing all this fuss in the first place. What's the advantage in that? That is, unless, as I'm suspecting, it's not even looking at the filename.
  • Zotero’s Retrieve Metadata function doesn’t use any embedded information from the PDF or filename at all. It looks in the PDF content for an identifier like an ISBN or DOI, or if that fails, uses a few pages of full text to identify the item in various databases. This usually yields good results, but can sometimes find lower quality data for older items, especially older books.

    In such a case, I would suggest either:
    1. Add the item as you currently are, then manually clean up metadata.
    2. Temporarily disable the PDF Metadata Retrieval option in the General pane of Zotero preferences, add the file to Zotero, then right-click on it, choose Create Parent Item, and clean up the metadata manually.

    As both options require manual clean up of the data, I’d say option 1 is arguably faster.
  • Well, there goes any hope I had of not pulling all my hair out. Thanks for letting me know. I think I'll just give up on Zotero then.
  • Can we perhaps take a step back? What exactly are you trying to accomplish?

    Number 2 above is exactly what you were asking for.
  • edited April 2, 2021
    Yes, in hindsight, it does seem like that's what I was asking for. But I'm realizing this won't work for me because I have a library of roughly ten thousand PDFs I'm trying to organize. Going through them one by one to identify which ones have more data in the filename than in the automated retrieval results is just not something I can afford right now.
  • Your framing here is a bit odd. Again, what are you trying to accomplish? If you just want a library with the titles from the filename, you can do that by turning off metadata retrieval. Or you can just turn off file renaming so that the original filename is preserved regardless of the recognized metadata, letting you transfer over any additional metadata from the filename later at your leisure.

    But metadata retrieval is metadata retrieval — it tries to actually retrieve high-quality metadata for the PDF, which includes far more than the title/author/year that might be in the filename. The filename just isn't really relevant in that case, and it doesn't really make sense to expect Zotero to understand or care what's in your custom filenames.

    If all of your files use the exact same filename format, it would be possible to write a script that populated the parent metadata with just those components once they were in Zotero, as long as you turned off file renaming now.
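    As a rough illustration of the parsing half of such a script, here is a minimal Python sketch. It assumes every filename follows the "Title - Year - Journal" layout of the example quoted earlier in the thread; the function name `parse_filename`, the regular expression, and the list of extensions are all hypothetical choices, not anything Zotero provides. Writing the parsed values back into Zotero items is a separate step that is not shown here.

```python
import re
from pathlib import PurePath
from typing import Optional

# Assumed layout: "<title> - <4-digit year> - <journal>", optionally followed
# by a file extension. Files that do not match are left for manual review.
FILENAME_PATTERN = re.compile(
    r"^(?P<title>.+?) - (?P<year>\d{4}) - (?P<journal>.+)$"
)

# Only these suffixes are treated as extensions, so a dot inside a journal
# name (for example "J. Relig.") is not mistaken for one.
KNOWN_EXTENSIONS = {".pdf", ".epub", ".djvu"}


def parse_filename(filename: str) -> Optional[dict]:
    """Split a filename into title, year and journal, or return None."""
    path = PurePath(filename)
    stem = path.stem if path.suffix.lower() in KNOWN_EXTENSIONS else path.name
    match = FILENAME_PATTERN.match(stem)
    if match is None:
        return None
    return {field: value.strip() for field, value in match.groupdict().items()}
```

    Calling `parse_filename` on the example file above would keep the full title, including "The Road to Eleusis", alongside the year and journal. Returning `None` for non-matching names makes it easy to list the files that still need manual attention.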
  • edited May 8, 2021
    I have a related problem: I actually want Zotero to retrieve the metadata and change the parent item name for 15,000+ PDFs. It succeeded with many, but a large number of items ended up with nonsensical or just plain wrong metadata or names. For those, I'd like to revert the parent name to the filename. Is this possible in some way? Perhaps with one of the add-ons like ZotFile or Zutilo, or some sort of script? For various reasons, it's not really feasible for me to remove the files to be renamed and add them again.
  • If you haven't yet restarted Zotero, you can right-click and choose Undo Retrieve Metadata.
  • edited May 8, 2021
    I didn't know that was possible. Anyway, it's not helpful here - I restarted Zotero weeks ago.

    I'm simply wondering if there's some sort of batch process that can set the parent item name/title to the filename while keeping all the other metadata. Essentially the reverse of "Rename file from parent metadata."

    (p.s. I think my comments belong in the original discussion, not this split-off one. Sorry if my description was confusing)
  • Sorry, "revert" threw me off. This could be done with scripting, but there's no built-in functionality to do that. And it would only be possible if you disabled renaming before retrieving metadata.
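    To make the scripting idea a little more concrete, here is a minimal Python sketch of the planning step only. It assumes the library has been exported to a list of records with the hypothetical keys `key`, `title` and `filename`; that layout is an illustration, not Zotero's data model. It also only makes sense when the attachment still has its original filename, as noted above. Applying the planned changes inside Zotero is not shown.

```python
from pathlib import PurePath


def title_from_filename(filename: str) -> str:
    """Use the filename as the title, dropping only a trailing .pdf."""
    path = PurePath(filename)
    return path.stem if path.suffix.lower() == ".pdf" else path.name


def plan_title_updates(items):
    """Return the title changes to make; all other metadata is untouched.

    `items` is an iterable of dicts with the hypothetical keys
    'key', 'title' and 'filename'.
    """
    updates = []
    for item in items:
        new_title = title_from_filename(item["filename"])
        # Skip items whose title already matches the filename.
        if new_title and new_title != item["title"]:
            updates.append(
                {
                    "key": item["key"],
                    "old_title": item["title"],
                    "new_title": new_title,
                }
            )
    return updates
```

    Keeping the old title in each planned update gives a simple record to review before anything is changed, and to fall back on afterwards.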
  • Thanks for the quick reply, as always. I have found a bit of a workaround workflow that will make it possible, but a bit tedious, for me to achieve this.