Exclude some metadata from import via connector
When importing an item from a browser via the Zotero Connector I often get lots of metadata items that I don't need. Is there a way to exclude specific items?
For example, for a journal article, it frequently puts "Publisher" in the Extra field. I don't want that there.
Or, for an item that already has a DOI in the DOI field, I still get an additional "_eprint: https://doi.org/10.1080/..." entry in Extra.
Also, I will never need to know the "Library Catalog" so I'd love to prevent Zotero from filling it automatically.
Is there a place where I could list the metadata fields to be excluded from (browser) import?
The example comes from www.berghahnjournals.com
Try for instance
https://www.berghahnjournals.com/view/journals/ajec/34/1/ajec340102.xml
It puts "www.berghahnjournals.com" into Library Catalog and adds Publisher and Section to the Extra field. (Section is just a duplicate of the journal name from the Publication field.)
I've followed your suggestion and asked ChatGPT to write me a script to clean up upon import. I then added it as an action in the Actions & Tags plugin (Event = Create Item; Operation = Script). But nothing happens. Can you spot what's wrong? (Sorry, I don't have any scripting experience.)
// This script is intended to be run in the Zotero JavaScript API environment
// Function to modify an item after it is added
function modifyItem(item) {
    // Check if the item has the "Library Catalog" field and delete it
    if (item.getField('libraryCatalog')) {
        item.setField('libraryCatalog', null);
    }
    // Get the current "Extra" field content
    let extraField = item.getField('extra') || '';
    // Check if "Publisher" or "Section" fields are present
    const publisher = item.getField('publisher');
    const section = item.getField('section');
    // If "Publisher" is present, remove it from the "Extra" field
    if (publisher) {
        extraField = extraField.replace(/Publisher:.*?\n/g, '');
    }
    // If "Section" is present, remove it from the "Extra" field
    if (section) {
        extraField = extraField.replace(/Section:.*?\n/g, '');
    }
    // Update the "Extra" field
    item.setField('extra', extraField.trim());
    // Save the modified item
    item.save();
}
// Event listener for when an item is added
Zotero.Events.on('itemAdded', (event) => {
    const item = event.item; // Get the newly added item
    modifyItem(item); // Modify the item
});
// Notify the user that the script is active
Zotero.notify('Item modification script is active. Modifications will occur on item addition.');
If my URL is
https://www.jstor.org/stable/...
then in Library Catalog I simply get "JSTOR".
So it doesn't give me any information I don't already have. And for troubleshooting, the URL should be much more valuable.
To keep the item free from clutter I'd love to eliminate that field, perhaps reserving it only for those cases where an item comes from a physical location/archive/repository that doesn't have a URL, but I think that's extremely rare.
In any case, I would still love to be able to learn how to control other fields, i.e., how to filter out data in general.
What's your concern about clutter? Especially now that you can collapse the Info section and see a citation preview in the item pane header (right-click -> View As -> Bibliography Entry), it doesn't seem like a little extra info way down at the bottom of Info is that big of a deal.
I prefer the Info section to be always expanded in addition to the citation preview in the header, so I can retain a full overview of all existing metadata. That's one of the best design considerations that went into Zotero in my view, that it's always right there (compared to BibDesk, for instance, or some other tools). For my eyes, it's much easier to detect missing items or errors right away.
There are also things the citation preview just doesn't show. Capitalization, for instance: if my preview is in Chicago style, it's automatically in Title Case, so I'll never know whether words that should always be capitalized are stored in lowercase, which would be a problem for other styles. The citation preview also doesn't tell me whether the Language field is filled in, which is crucial for BibTeX export later. And so on. So the only section in the right pane I always keep expanded is the Info section. Therefore, it would be nice to find a way to automate some of the cleaning that I would otherwise have to do manually.
Anyway -- this doesn't exist, and the way import is set up it wouldn't be simple to make it exist, but you can relatively easily write scripts that remove info from fields:
https://www.zotero.org/support/dev/client_coding/javascript_api#batch_editing
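To give an idea of what such a script could look like, here is a minimal sketch of the cleanup part only. The helper name cleanExtra and the list of unwanted keys are made up for illustration, based on the examples earlier in this thread. The commented lines at the end assume you already have a regular item in a variable called item (as in the batch-editing examples linked above); I haven't tested this against the Connector or a particular plugin, so treat it as a starting point rather than a finished solution.

```javascript
// Hypothetical helper (not part of Zotero): drop unwanted "Key: value"
// lines from an Extra field string and return the cleaned string.
// The default key list is an assumption taken from this thread.
const UNWANTED_EXTRA_KEYS = ['Publisher', 'Section', '_eprint'];

function cleanExtra(extra, unwantedKeys = UNWANTED_EXTRA_KEYS) {
  // Compare keys case-insensitively
  const unwanted = new Set(unwantedKeys.map((k) => k.toLowerCase()));
  return (extra || '')
    .split('\n')
    .filter((line) => {
      // Capture the text before the first colon as the key
      const match = line.match(/^\s*([^:]+?)\s*:/);
      return !(match && unwanted.has(match[1].toLowerCase()));
    })
    .join('\n')
    .trim();
}

// Sketch of use inside Zotero (will not run outside Zotero);
// `item` is assumed to be a regular item you have already fetched:
//   item.setField('extra', cleanExtra(item.getField('extra')));
//   item.setField('libraryCatalog', '');
//   await item.saveTx();
```

Working line by line, instead of using a regex that expects a trailing newline, means the last line of Extra is cleaned too, and any other lines (such as a citation key) are left alone.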