No URL when copying citation after migration from Mendeley.

KelSolaar · April 28, 2020

Hi,

I have migrated from Mendeley to Zotero, not without some pain (i.e. debugger fun to bypass the database encryption), and I have had some issues.

I think the big one currently is that my references that were converted to the Document type lost their URL values: the URLs have been moved to the PDF attachment nested under each reference. The problem is that now, if I copy the reference for Bibliography, I have no URL at all in the Bibliography text. If I copy the PDF URL into its parent Document URL, it is all good.

Two images showing a migrated Document: https://imgur.com/a/GJCbjMf

And the resulting Bibliography: "FiLMiC Inc. (2017). FiLMiC Pro - User Manual v6 - Revision 1 (pp. 1–46)." where I would want "FiLMiC Inc. (2017). FiLMiC Pro - User Manual v6 - Revision 1 (pp. 1–46). Retrieved from http://www.filmicpro.com/FilmicProUserManualv6.pdf"

I don't want to do that for my hundreds of documents though.

Cheers,

Thomas

dstillman · April 28, 2020

This is by design. We tend to handle metadata much more deliberately and precisely than Mendeley, and in this case, their laxer approach manifests itself in a database that doesn't clearly distinguish between item URLs and PDF URLs — items just have a single list of one or more URLs. Generally, the URL of the PDF doesn't belong in the Zotero URL field, because it's not what should be cited, so we don't store PDF URLs there when importing from Mendeley. In the example you give, it might be reasonable to cite the PDF, if that's the canonical source for the manual online, but for most items people are saving to Zotero there would be an article page that would be cited instead.

While we could consider always saving the first URL to the URL field, there's a good chance that would result in URLs being stored there that don't belong, which would set a bad example for people's first exposure to Zotero metadata.

I wouldn't expect this to be an issue for hundreds of documents unless you're dealing almost entirely with examples like the one above, where you're really just citing a PDF file rather than a more abstract entity (e.g., a journal article). If you want, we could provide a script that added the URL from the primary attachment for each item to its URL field if not already populated, but you should confirm that that's really what you want to do.

KelSolaar · April 28, 2020

Hi,

Thanks for coming back to me, appreciated! In the collection I'm currently dealing with, I have a bit under a hundred of them, and I have many more in other collections. I use those documents as references in our code; here is an example: https://github.com/colour-science/colour/blob/develop/colour/models/rgb/transfer_functions/fujifilm_flog.py#L13 or the aforementioned FiLMiC Pro one: https://github.com/colour-science/colour/blob/develop/colour/models/rgb/transfer_functions/filmic_pro.py#L13

Yes, that is what I would like to do :) I have started to look at the JS API, but a script would be super useful I reckon.

Bests,

Thomas

KelSolaar · April 29, 2020

Hi,

I tried to get attachments from an item using the following code:

```javascript
var item = ZoteroPane.getSelectedItems()[0];
var fulltext = [];
if (item.isRegularItem()) { // not an attachment already
	let attachmentIDs = item.getAttachments();
	for (let id of attachmentIDs) {
		let attachment = Zotero.Items.get(id);
		if (attachment.attachmentContentType == 'application/pdf'
				|| attachment.attachmentContentType == 'text/html') {
			fulltext.push(await attachment.attachmentText);
		}
	}
}
return fulltext;
```

Unfortunately, it does not work: `alert(attachment.attachmentContentType);` returns `[ Javascript Application ]`, whereas `alert(attachment.getField('url'));` is fine. I am just not sure how to check whether the content type is PDF.

Cheers,

Thomas

dstillman · April 29, 2020

Make a backup first, but this should work:

```javascript
var items = await Zotero.Items.getAll(Zotero.Libraries.userLibraryID, true);
for (let item of items) {
	if (!item.isRegularItem() || item.getField('url')) continue;
	let attachment = await item.getBestAttachment();
	if (attachment) {
		let url = attachment.getField('url');
		if (!url) continue;
		item.setField('url', url);
		await item.saveTx({
			skipDateModifiedUpdate: true
		});
	}
}
```
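
Since the script above writes to every matching item, a dry run may be worth doing before committing to it. The following is only a minimal sketch under stated assumptions: `previewUrlBackfill` is a hypothetical helper name, not a Zotero function, and it expects an array of item objects exposing the same methods the script above already uses (`isRegularItem()`, `getField()`, `getBestAttachment()`). It applies the same filters but only reports what would change, without saving anything.

```javascript
// Dry run (hypothetical helper): list the URL copies the script above would
// make, without calling setField() or saveTx().
// Assumption: `items` is an array of Zotero-style item objects, e.g. the
// result of Zotero.Items.getAll() in the Run JavaScript window.
async function previewUrlBackfill(items) {
	let changes = [];
	for (let item of items) {
		// Same filter as the script above: skip notes and attachments,
		// and skip items whose URL field is already populated
		if (!item.isRegularItem() || item.getField('url')) continue;
		let attachment = await item.getBestAttachment();
		if (!attachment) continue;
		let url = attachment.getField('url');
		if (!url) continue;
		// Record the would-be change instead of saving it
		changes.push({ title: item.getField('title'), url: url });
	}
	return changes;
}
```

In the Run JavaScript window, with the function pasted in first, something like `return previewUrlBackfill(await Zotero.Items.getAll(Zotero.Libraries.userLibraryID, true));` should list the affected titles and the URLs they would receive.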

KelSolaar · April 29, 2020

Awesome and thanks, this worked great! I just added an extra check to the `continue` condition so that it only applies to the Document item type: `|| item.itemTypeID != 12`
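
For anyone finding this thread later, the combined version might look roughly like the sketch below. It is an illustration rather than the exact code that was run: `backfillDocumentUrls` is a hypothetical wrapper name, and the item type ID is passed in as a parameter instead of being hardcoded as `12`, on the assumption that looking it up by name with `Zotero.ItemTypes.getID('document')` gives the same value.

```javascript
// Sketch (hypothetical wrapper): copy the best attachment's URL into the
// parent item's empty URL field, restricted to a single item type.
// Assumption: `items` are Zotero item objects and `documentTypeID` is the
// numeric ID of the Document item type (12 in the post above).
async function backfillDocumentUrls(items, documentTypeID) {
	let updated = 0;
	for (let item of items) {
		// Skip notes/attachments, items that already have a URL,
		// and any item that is not of the requested type
		if (!item.isRegularItem()
				|| item.getField('url')
				|| item.itemTypeID != documentTypeID) continue;
		let attachment = await item.getBestAttachment();
		if (!attachment) continue;
		let url = attachment.getField('url');
		if (!url) continue;
		item.setField('url', url);
		// Save without bumping the item's Date Modified, as in the original script
		await item.saveTx({ skipDateModifiedUpdate: true });
		updated++;
	}
	// Number of items that were changed
	return updated;
}
```

As with the original script, make a backup first. In the Run JavaScript window it could then be called as `return backfillDocumentUrls(await Zotero.Items.getAll(Zotero.Libraries.userLibraryID, true), Zotero.ItemTypes.getID('document'));`.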