Why is Abstract embedded in Word field metadata?
I just noticed that Zotero inserts the full text of the "Abstract" field into the hidden JSON data in a cite field in Word.
My understanding is that Abstract is never used in cites. (I'm not even sure it's available in CSL?)
I understand the whole Zotero entry is generally copied into the field, especially to allow continued usage for already-cited items if not found in the library (for example, while sharing a document), but if Abstract isn't ever used, then it doesn't need to be there.
This isn't particularly harmful, but my guess is that it might contribute to why Word gets slow after many Zotero cites are inserted. At the very least, it will increase the file size unnecessarily. Note that for many dissertations, for example, the abstract may be several paragraphs, perhaps 500 words or more, and that would be duplicated every time that entry is cited.
Is there any advantage to this feature? Should it be removed?
This isn't an urgent bug or anything, but maybe worth thinking about.
Personally I'm not sure I understand the idea of "Abstract" in general in Zotero. I occasionally find it useful when searching items in my library, or to quickly check what an article is about without opening the article itself. But I rarely use it, and I have thought about deleting all abstracts from my library to save space. That was previously a minor reason, but if abstracts are always embedded in Word cites, it's not quite so minor.
--
As for size, it's not huge. For example, an average abstract in my library is around 2000 characters long, or about 2 KB. If there are 100 cites like that in a document (whether repeating the same reference or citing many different ones), that would add about 200 KB to the document. Still not a huge problem, but not insignificant. More relevantly, I would worry that Word would struggle some with that much embedded metadata, although I'm not sure how to test it. Word is clunky and sometimes buggy with fields anyway, so that can't help.
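To make that back-of-the-envelope estimate easy to redo with other numbers, here is a minimal sketch. The function name and the one-byte-per-character default are my own assumptions (roughly right for mostly-ASCII abstracts), and since .docx files are compressed, the effect on the saved file would likely be smaller than this raw figure.

```python
def estimated_overhead_bytes(abstract_chars: int, citations: int,
                             bytes_per_char: int = 1) -> int:
    """Rough uncompressed size of abstract text repeated once per citation.

    Hypothetical helper for illustration only; assumes every citation
    embeds one abstract of the same length.
    """
    return abstract_chars * citations * bytes_per_char


# The figures from the post: ~2000-character abstracts, 100 cites.
overhead = estimated_overhead_bytes(2000, 100)
print(f"{overhead / 1000:.0f} KB")  # prints "200 KB"
```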
But yes, if it's used then I understand why it's there, and more importantly how complex it would be to attempt to 'fix' this, so no problem :)
That said, we used to offer the option of not embedding metadata. We now always do, and it might be worth revisiting that to confirm that including metadata doesn't have a significant performance effect in large documents.
@dstillman, yes, those are good reasons I wasn't thinking about, and the planned collections option sounds useful.
As for not embedding the data, I'd be curious just to see if there's any way to speed up Word. Not a Zotero issue at all, really, but still a usability challenge.
You've both answered my questions here, thanks. I was just wondering when I saw all that text hidden there.
I'd still be curious to see some performance figures with and without abstracts included in a long document. If it turned out that abstracts were dramatically slowing down Word or the plugin, that might change the cost/benefit analysis, particularly if Zotero gets better at updating item metadata and pulling down abstracts (which never exist in data from Crossref anyway).
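As a starting point for figures like that, here is a rough sketch that measures how much of one cite field's JSON is taken up by abstracts. It assumes the embedded data looks like CSL citation JSON, with a `citationItems` list whose entries carry an `itemData` object that may include an `abstract` key; the function name and the sample payload are made up for illustration, and it says nothing about Word's actual speed, only about size.

```python
import json


def abstract_share(field_json: str) -> tuple[int, int]:
    """Return (abstract_bytes, total_bytes) for one cite field's JSON.

    Assumes a layout like {"citationItems": [{"itemData": {...}}]};
    items without itemData or without an abstract count as zero.
    """
    data = json.loads(field_json)
    abstract_bytes = sum(
        len(item.get("itemData", {}).get("abstract", "").encode("utf-8"))
        for item in data.get("citationItems", [])
    )
    return abstract_bytes, len(field_json.encode("utf-8"))


# Made-up sample payload: one cited item with a 2000-character abstract.
sample = json.dumps({
    "citationItems": [
        {"itemData": {"title": "Example thesis", "abstract": "x" * 2000}}
    ]
})
abstract_bytes, total_bytes = abstract_share(sample)
print(f"{abstract_bytes} of {total_bytes} bytes are abstract text")
```

Summing this over every cite field in a long document would give the total abstract overhead to compare against any timing measurements.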
I agree that abstracts are used rarely enough in citations and bibliographies that there's a good case for excluding them from embedded metadata if they cause significant issues (performance or otherwise).