Semantic Relations
At Bruce's request, I've created a wiki page for building up a list of potential semantic descriptors that could be used to extend the existing item relation ("Related") support. While any predefined list will be inherently limiting, having a base set of relations allows for all sorts of analysis and visualization of the data.
Feel free to edit the page. If you don't have a Trac account, you can create one or just suggest edits on this thread.
A few comments:
1) Like creator types, the relations probably need to be item type–specific. There probably even needs to be more complicated logic that takes into account the source item type and the target item type.
2) We need to decide how this is related (if at all) to the upcoming hierarchical data model. As in, if an artwork item is created and defined via the data model as being within a book item, is that also reflected wherever these semantic relations are displayed? What about a translation of another work? Are these two separate classes of relations—data model relations and semantic relations—or are they fundamentally the same and just represented in the UI differently, with some relations bestowing the ability to interact with the related item in special ways (say, to edit "parent" fields from within the metadata pane)?
3) As extensive a list as this becomes, it probably will need to be extended with the same mechanism we end up using for custom types and fields, since there's no way this could satisfy all use cases, especially once custom types and fields are possible.
- cites (or references)
- refutes
- supports
I have requested a Trac Account.
A refutes B = B refuted by A
Right?
I'm not sure whether "inspired" is the best name for it, but it might be useful for early works, from a time when people did not usually cite their sources or publish bibliographies, but where textual analysis shows that whatever B wrote was derived from A.
A "develops ideas of" B (that is, A considers himself to have taken up the same conversation or line of enquiry). This could perhaps be put more succinctly.
If you have item A (a note of your own invention, another item, whatever) and it is refuted by B, then you would simply need a relationship of "B refutes A"
The list of relations for A and for B would then both include "B refutes A"
In your case, from what I gather, you would get something like this: "'Title of Some (Mistaken) Article' refutes My Idea Stored in a Note". It would show up when viewing the relations of A and the relations of B.
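A minimal sketch of that idea, where one stored relation shows up under both items. The names `Relation` and `relations_for` are hypothetical and only for illustration; nothing here reflects how Zotero actually stores related items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    subject: str    # the item making the claim, e.g. the refuting article
    predicate: str  # e.g. "refutes"
    obj: str        # the item the claim is about


def relations_for(item, relations):
    """Return every relation that mentions `item`, as subject or as object."""
    return [f"{r.subject} {r.predicate} {r.obj}"
            for r in relations if item in (r.subject, r.obj)]


# One stored row, listed identically when viewing A and when viewing B.
relations = [Relation("B", "refutes", "A")]
a_view = relations_for("A", relations)
b_view = relations_for("B", relations)
```

Storing the relation once and reading it from either end avoids having to keep a separate "refuted by" row in sync.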
If so, I suggest the following (with the idea expressed in the wording being more important than the wording itself):
-"Published version of", to relate a published primary source to an archival holding and a scrubbed-up dissertation to the original dissertation
-"Reprints"
-"Image of", which would be useful for linking a photo/sketch of a material culture object with that object's record
-"Copy of" and "Reproduction of" for cartobibliography
-"Transcription of", to relate text to an audio/video recording
1) cites article X in bibliography
2) is cited by article Y in Y's bibliography
3) Some evidence exists of another relationship between X and Y that is not strictly bibliographic - we might want to limit the potential sorts of relationships
I would discourage normative semantic operators (refutes, supports, etc) in favour of strictly empirical assessments, particularly as these operators may one day move from my highly opinionated view of the world into something more "wiki"-like, where my "supports" is someone else's "refutes".
Ideally Zotero would be neutral about any associated graphing tools, instead generating a simple array that could then be post-processed by an as-yet-unwritten tool to generate inputs for graphviz or other tools.
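As a rough illustration of that post-processing step, here is a sketch that turns a plain list of (source, relation, target) triples into Graphviz DOT text. The triple layout and the function name `to_dot` are assumptions made for the example, not an existing Zotero export format.

```python
def quote(text):
    """Wrap text in double quotes, escaping characters DOT treats specially."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(triples):
    """Render (source, relation, target) triples as a Graphviz DOT digraph."""
    lines = ["digraph relations {"]
    for source, relation, target in triples:
        lines.append(f"  {quote(source)} -> {quote(target)} [label={quote(relation)}];")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical exported array: X cites Y, and Z cites X.
triples = [("X", "cites", "Y"), ("Z", "cites", "X")]
dot_text = to_dot(triples)
```

The resulting text could be saved to a file and passed to the `dot` command, or swapped for another output format without touching the exported array.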
I find myself losing track of this discussion, since it was a while ago, and there's not much context above. But IIRC, there are basically two kinds of relations of relevance: what we might call "content" relations (part of, review of, translation of, etc.), and what I'm going to call "annotation" or maybe "commentary" relations (the refutes, etc. stuff).
I also get a little worried about the second class, which I believe we talked about in the context of notes. But one area where they matter a lot is law. But these probably need specific relations (affirms, overturns, etc.).
1) "content" relations (part of, review of, translation of, and I would add: reprint of, adaptation of, remake of, manuscript of, etc.) are "objective" relations between items/documents, independent of the researcher, just as the relation between quoted and quoting items is independent of the researcher's opinion or analysis: it's a fact. I note that these relations are (always?) "one-way", non-symmetric relations:
A is part of B / B can't be a part of A
A quotes B / B can't quote A (it's logically impossible!)
A is a translation of B / B can't be a translation of A
etc.
2) Bdarcus: "annotation" or maybe "commentary" relations (the refutes, etc. stuff).
Here, the relations are based not on "facts" but on "meaning"; they are more intellectual. To have a relation like that, two items must have a subject keyword in common: they are both related to this subject, and they disagree about, refute, confirm, comment on, etc., this subject.
For example, if you want to compare the positions of authors on "War in Iraq": first find the items and add a tag "War in Iraq" to all of them; then add another tag to each of them: "Support war" or "Against war"... (there are other ways to do it too). These relations are of a very different kind from the first.
That's why Zotero should separate the two kinds of relations and not try to put all of them under the current "related" feature (even if we can name and tag the relation), which should be limited to the first kind: factual and objective relations between items.
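The one-way property noted for the first kind of relation could be enforced with a simple check. This is only a sketch: the set `ONE_WAY`, the tuple layout and the function `can_add` are invented for the example.

```python
# Relation names treated as one-way; this list is an assumption for the example.
ONE_WAY = {"part of", "quotes", "translation of"}


def can_add(relation, existing):
    """Reject a one-way relation whose reverse is already recorded."""
    source, predicate, target = relation
    if predicate in ONE_WAY and (target, predicate, source) in existing:
        return False
    return True


existing = {("A", "part of", "B")}
reverse_allowed = can_add(("B", "part of", "A"), existing)      # reverse of a recorded one-way relation
unrelated_allowed = can_add(("C", "part of", "B"), existing)    # no conflict
symmetric_allowed = can_add(("B", "related to", "A"), existing)  # not a one-way relation
```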
----
I don't know if it helps (and it's not new either), but I found in the "OWL Web Ontology Language" documentation that OWL distinguishes between two main categories of properties that an ontology builder may want to define:
1) Object properties link individuals to individuals.
2) Datatype properties link individuals to data values.
I'm not used to this language, but the first kind of relation, between items, such as "part of", seems to be "Object properties (which) link individuals to individuals".
The second kind of relation, between "subjects", seems to be "Datatype properties which link individuals to data values".
Does that make sense? Can it help to create semantic relations in Zotero?
Luc
In plain language, the first is represented by the statement "x is related to y" while the second by "x's title is 'Some Title'". That is, in the RDF subject-predicate-object model, it just says the content of the object is different (a URI vs. a string).
In this discussion, by contrast, we're in essence always assuming the object is a URI (or in a relational database, another row). We're just talking about different kinds of predicates/relations.
To facilitate data exchange and automatic analysis, this could come with a set of predefined tags for an arbitrary number of standard relationships, such as those mentioned on the wiki page, while still allowing users to create their own (and use available visualization and analysis tools on those as well as the standard ones).
Cited by/cites type relationships should then be automatically updateable by extracting, e.g., Scopus or ISI information.
Best,
Sven
As promised, I looked at this (Semantic Relations) forum and have a "sketch" of a suggested approach. It is not a "solution"; I don't know enough about Zotero or the full requirements for Semantic Relations. But, it might be a start for discussion and correction.
There are four sections to follow, separated to better the chance that some will get through:
1) Semantic Relation database "Rules".
2) Relation Issues
3) Relation Examples, per Feature Requests for Semantic Relations
4) Diagram of Semantic Relation "Rules"
I hope this helps. Thank you for your help.
Bill
Collection: A Collection may represent a research project or a specific area of interest. Collections may have subordinate Subcollections. Titles (Items) are organized under Collections and their Subcollections. A single Title may be in multiple Collections at the same time.
Each "Collection" may contain one or more "Collection"s (e.g., Subcollections).
Each "Collection" may define one or more "DefinedTag"s.
CollectionTitle: Implements the feature that a single Title may be in multiple Collections at the same time.
Each "Collection" may reference one or more "CollectionTitle"s.
Each "CollectionTitle" always belongs to one "Collection".
Each "Title" may belong to one or more "CollectionTitle"s.
Each "CollectionTitle" always references one "Title".
A “Title” may be transferred to a different “CollectionTitle” (effectively, another “Collection”).
Title: Carries selected identifying attributes about the item.
Each "Title" may contain one or more "Statement"s.
Each "Title" may have one "SourceSnapshot". (Apparently an existing limit.)
Each "SourceSnapshot" always is of one "Title". (Apparently an existing limit.)
Statement: A Statement is a textual description of one or more assertions or comments to be found in or about the parent Title. A Title may have one or more Statements (Assertions), each of which may have a Relation to one or more other Statements of the same or other Titles. By themselves, sequenced statements could form an outline of points to be taken about the Title. (Also, the Statement capability enables one assertion or comment to reference another Statement in the same Title document, such as an internal contradiction.)
- A Statement is identified by its Title ID and Statement ID.
- A Statement has a Statement Heading attribute.
- A Statement has a Description attribute.
- A Statement may have a Sequence attribute for display purposes.
- A Statement may have an InTitleAs attribute referencing a location within the document
- A Statement has an IsDefault attribute indicating whether or not it is a placeholder for later edit.
Each "Statement" always belongs to one "Title".
Each "Statement" may originate one or more "Relation"s.
Each "Statement" may be referred to by one or more "Relation"s.
Relation: A Relation is an abstracted "concept" object (thing of interest to the user). It represents the existence of a relation between a statement in one document and another statement in a document. An instance (row) is independent of direction.
It would display differently depending on the "Statement" from which the "Relation" is originated. Default Verb Phrases are provided from the “Type” table, but may be over-ridden.
Each "Relation" is always classified by one "Type".
Each "Relation" is always originated by exactly one "Statement".
Each "Relation" always refers to one (referenced) "Statement".
No "Relation" may refer to the "Statement" from which it originated.
- A Relation has a Verb Phrase Originating attribute used when the originating statement is compared to the referenced statement.
- A Relation has a Verb Phrase Referenced attribute used when the referenced statement is compared to the originating statement.
- A Relation allows a Relation Description attribute explaining “why” the originating statement is related to the referenced statement.
- A Relation has an IsDefault attribute indicating whether or not it is a placeholder for later edit.
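To make the Relation rules above a little more concrete, here is a small sketch of a relation that reads differently depending on which statement it is viewed from. It is only one possible reading of the rules; the class names, field names and the "A-1" style statement identifiers are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelationType:
    name: str
    verb_originating: str  # default phrase when read from the originating statement
    verb_referenced: str   # default phrase when read from the referenced statement


@dataclass(frozen=True)
class Relation:
    rel_type: RelationType
    originating: str  # statement identifier, e.g. "A-1" (Title A, Statement 1)
    referenced: str
    verb_originating: Optional[str] = None  # optional per-relation override
    verb_referenced: Optional[str] = None

    def __post_init__(self):
        # Rule: no Relation may refer to the Statement from which it originated.
        if self.originating == self.referenced:
            raise ValueError("a relation cannot refer to its originating statement")

    def display(self, viewpoint: str) -> str:
        """Phrase the relation as seen from one of its two statements."""
        if viewpoint == self.originating:
            verb = self.verb_originating or self.rel_type.verb_originating
            return f"{self.originating} {verb} {self.referenced}"
        if viewpoint == self.referenced:
            verb = self.verb_referenced or self.rel_type.verb_referenced
            return f"{self.referenced} {verb} {self.originating}"
        raise ValueError(f"{viewpoint} is not part of this relation")


support = RelationType("Support", "is supported by", "supports")
rel = Relation(support, originating="A-1", referenced="B-2")
```

A single stored row is enough here; the direction only matters at display time, when the verb phrase is chosen.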
Bill
Relation Issues:
Cycling: This structure resembles that of a Bill Of Materials; it would raise the problem of preventing cycling ("A" consists of "B" & "C"; "C" consists of "A" & "D"; and so on forever). Workaround: Provide no capability for Bill Of Materials processing. Document the fact that “cycling” detection and prevention are not provided.
Implementation: may be tricky. If a user just wants to record that a relationship exists between Title A and Title B, the application should be capable of creating a defaulted A-Statement N, B-Statement M, and a Relation R between A-N and B-M. The IsDefault flag facilitates follow-up.
Simplicity: I’m not sure that the "rules" for handling this situation as a database belong in Zotero; they may be too complex for good performance. Perhaps they belong in a personal database that links to Zotero, if I can figure out how to link it and keep it synchronized.
It might be simpler to eliminate separate "Statement"s, but then “Relation” would have to provide for Statement Heading & Description, Originating & Destination Verb Phrases, and the Relation Description. Also, it might be difficult to identify contradictory statements within a single Title.
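For reference, if cycling detection were wanted after all, one common approach is a reachability check before a new "consists of" link is stored. The sketch below assumes links are kept as (parent, child) pairs; the function name is made up, and this is not part of the proposal above.

```python
def would_create_cycle(edges, new_edge):
    """Return True if adding new_edge = (parent, child) would close a cycle.

    A cycle appears when the parent is already reachable from the child.
    """
    parent, child = new_edge
    children = {}
    for p, c in edges:
        children.setdefault(p, []).append(c)

    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return False


# "A" consists of "B" and "C".
edges = [("A", "B"), ("A", "C")]
closes_loop = would_create_cycle(edges, ("C", "A"))  # "C" consists of "A"
harmless = would_create_cycle(edges, ("C", "D"))     # "C" consists of "D"
```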
Bill
Type Name: Originating Verb Phrase (Statement A originating, Statement B referenced)
----------> Referenced Verb Phrase (Statement B referenced, Statement A originating)
Support: Statement A Is Supported By Statement B
----------> Statement B Supports Statement A
Refute: Statement A Is Refuted By Statement B
----------> Statement B Refutes Statement A
Publish: Statement A Is Published As Statement B
----------> Statement B Is Publication Of Statement A
Image: Statement A Has Image In Statement B
----------> Statement B Is Image Of Statement A
Copy: Statement A Copied As Statement B
----------> Statement B Is Copy Of Statement A
Reproduce: Statement A Is Reproduced As Statement B
----------> Statement B Is Reproduction Of Statement A
Transcribe: Statement A Transcribed As Statement B
----------> Statement B Is Transcription Of Statement A
Review: Statement A Reviewed In Statement B
----------> Statement B Is Review Of Statement A
Adaptation: Statement A Adapted In Statement B
----------> Statement B Is Adaptation Of Statement A
Annotation: Statement A Annotated In Statement B
----------> Statement B Is Annotation Of Statement A
Commentary: Statement A Has Commentary In Statement B
----------> Statement B Is Commentary On Statement A
Remake: Statement A Remade As Statement B
----------> Statement B Is Remake Of Statement A
Contradiction: Statement A Contradicts Statement B
----------> Statement B Contradicts Statement A
This did not preserve the formatting. It might still be readable. Sorry.
Bill
My attempt to copy a gif of the diagram here did not succeed.
I will try Trac.
Bill
The Diagram of Rules has been attached as a file on TRAC.
See ERD.gif, attached on https://www.zotero.org/trac
I hope this helps.
Bill
More broadly, I'm going to ask the obvious: why reinvent something vaguely like RDF in SQL, rather than just use RDF more directly, particularly when the primary import/export format will in fact be RDF? Or, to put this more generically, what would be the relation (no pun intended) between this model and an RDF representation?
For example, in RDF you've got the standard triples: subject, predicate, object. The subjects and objects WRT Zotero and this particular use case are the Zotero items. Each has an internal ID, and (when 2.0 comes) a global URI. So the primary issue we're dealing with is the predicates, also identified by URI.
Of course, triples alone aren't useful here because we need to track who's making whatever statements. For this, most triple stores (including those built on top of MySQL, etc.) add a fourth element: context.
So context comes in as in essence "user-generated statements." Every user has a graph of their statements. Those statements can then also be trivially aggregated.
Am not offering any specific implementation suggestions here; just wondering how RDF might help think about this use case.
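Purely to illustrate the subject/predicate/object/context idea, and not as an implementation proposal, here is a small sketch in plain Python with no RDF library. All identifiers such as "item:1", "rel:refutes" and "user:alice" are made-up placeholders, not real URIs.

```python
from collections import Counter, namedtuple

# A triple plus a fourth element recording who made the statement.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "context"])

quads = [
    Quad("item:1", "rel:refutes", "item:2", "user:alice"),
    Quad("item:1", "rel:supports", "item:2", "user:bob"),
    Quad("item:1", "rel:refutes", "item:2", "user:carol"),
]


def graph_of(user, quads):
    """One user's graph: the triples recorded in that user's context."""
    return [(q.subject, q.predicate, q.obj) for q in quads if q.context == user]


def aggregate(quads):
    """Count how many recorded statements assert each triple, across all users."""
    return Counter((q.subject, q.predicate, q.obj) for q in quads)


alice_graph = graph_of("user:alice", quads)
counts = aggregate(quads)
```

Keeping the context separate means one user's "supports" and another's "refutes" can coexist, and the aggregate view simply reports how often each statement was made.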
I did not mean to suggest any particular technical implementation. Also, I agree that a 15KB gif results in a miserable illustration.
About the illustration (diagram): I had not tried to transfer a graphic to you folks before, so I chose the smallest possible file size, intending to copy it into an HTML post. That did not work, so I sent the .gif thru TRAC.
I am limited to screen capture techniques and the diagramming application. I could send you .jpg, .wmf or .pspimage. All screen captures would be a bit fuzzy. I could also send you the original diagram (Visio 2000 or 2003). Your choice. One more thing: If I did it right, the "rules" directly correspond to the essential content and relationships in the diagram.
Let me try to answer the broader question. I led myself astray by looking at what files the Zotero installation had put on my PC: .js and .sqlite. I assumed an SQL implementation. After your note, I looked up RDF, URI and WRT (Assumed URI means Universal Resource Identifier, found RDF but not WRT), together with several links claiming RDF could be implemented in a relational database. (I can't comment on those claims.)
Many entries in the Zotero Forum on Semantic Relations appeared to request that the "Related" element make sense forward and backward. E.g., 1) subject, active predicate (as verb), object and 2) object, passive predicate (as verb), subject. I took that as a requirement and generalized (abstracted) the "Related" element so that it could be bi-directional. Also, I added the "Statement" element to support "what" was being referenced in each Title. The "Type" element was added as an attempt to make it easier to enter a "Relation".
I think that the basic problem is the bi-directional requirement. I do not know how this is implemented in a Resource Description Framework; I am out of my depth there.
Anyhow, take the diagram for what it's worth, if anything.
With best regards,
Bill
So it might be useful to add a short string to a semantic relation such as the page number. It might, however, get too complicated...
Also, in my admittedly naïve opinion, I think that a semantic-based relationship (such as a "refuted by" relationship) would be less useful than a cited-by/cited-in relationship, although I suppose that a user can just create arbitrary labels for relationships, so that they can create new semantically-based relationships like, "kind of refuted by" or "totally misinterprets".
A way to export these as some kind of visual graph would be fantastic as well.
I searched through the forum posts and this is the discussion thread that seemed most appropriate to say this.
Thank you!
ZotFile already allows extraction of annotations from PDFs, so I think it is possible to read text from a PDF file for further analysis:
Zotero should automatically search for citations in attached PDF-files (when indexing? in the context menu?) to determine whether one of the cited files is already in the library.
It would be great to have a "cited" relationship (or, until such a feature is implemented, just a simple relationship) between the articles that cite "each other". Of course only the newer article can cite the older one, at least for static publications: this could serve as an optional filter for better performance, since an article from 1899 cannot possibly cite any newer article in the library, and thus Zotero would only have to check a few (or even no) older articles.
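The publication-year filter could look roughly like this. The dictionary layout with "title" and "year" keys and the function name are assumptions for the sketch; items from the same year are kept as candidates, since they could plausibly cite each other.

```python
def citation_candidates(item, library):
    """Library items that `item` could possibly cite: those not published after it."""
    return [other for other in library
            if other is not item and other["year"] <= item["year"]]


# Hypothetical library entries.
old = {"title": "Old article", "year": 1899}
mid = {"title": "Mid-century article", "year": 1950}
new = {"title": "Recent article", "year": 2001}
library = [old, mid, new]

old_candidates = citation_candidates(old, library)
new_candidates = citation_candidates(new, library)
```

Only the candidates would then need to be searched for in the PDF text, so the oldest items in a library cost almost nothing to check.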
The Citation Typing Ontology defines a great list of relation types: https://sparontologies.github.io/cito/current/cito.html.
@bjohas and I are currently experimenting with typed relations in Kerko (a Zotero client web app). To work around Zotero's lack of typed relations, we are currently using child notes that contain the Zotero URIs of related items, which Kerko then parses. Obviously, a better integrated solution in Zotero would be much more user-friendly, and more resilient to changes to the library (such as item merges). At the moment we display "Cites" and "Cited by" relations, which will be pretty useful to our users (have a look at an example here: https://docs.opendeved.net/lib/9IYKEUKJ). I thought I'd share this to give an example of a real use case, but my impression is that even for private libraries, semantic relations would be extremely useful to researchers as a kind of annotation (at least I know I would use them that way in my own research!).
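For anyone curious what such a workaround involves, here is a rough sketch of pulling item keys out of a note's text. It is not Kerko's actual code: it assumes URIs of the form http://zotero.org/groups/<id>/items/<key> (or users/<id>) with 8-character keys, and the note content and group ID below are invented.

```python
import re

# Assumed URI shape; other forms of Zotero URI are not handled by this sketch.
ITEM_URI = re.compile(
    r"https?://(?:www\.)?zotero\.org/(?:users|groups)/\d+/items/([A-Z0-9]{8})"
)


def related_item_keys(note_text):
    """Return the item keys of all Zotero item URIs found in a note, in order."""
    return ITEM_URI.findall(note_text)


# Hypothetical child note listing the items this one cites.
note = (
    "Cites: http://zotero.org/groups/12345/items/ABCD2345 "
    "and http://zotero.org/groups/12345/items/WXYZ6789"
)
keys = related_item_keys(note)
```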
Is the general idea of semantic relations still on the Zotero team's radar?
Also, I cannot find the "related items" in the exported .csv, do you know how to do that?
In addition there is https://scite.ai which also offers a Zotero plugin: https://github.com/scitedotai/scite-zotero-plugin/