Semantic Relations

dstillman · September 21, 2007

At Bruce's request, I've created a wiki page for building up a list of potential semantic descriptors that could be used to extend the existing item relation ("Related") support. While any predefined list will be inherently limiting, having a base set of relations allows for all sorts of analysis and visualization of the data.

Feel free to edit the page. If you don't have a Trac account, you can create one or just suggest edits on this thread.

A few comments:

1) Like creator types, the relations probably need to be item type–specific. There probably even needs to be more complicated logic that takes into account the source item type and the target item type.

2) We need to decide how this is related (if at all) to the upcoming hierarchical data model. As in, if an artwork item is created and defined via the data model as being within a book item, is that also reflected wherever these semantic relations are displayed? What about a translation of another work? Are these two separate classes of relations—data model relations and semantic relations—or are they fundamentally the same and just represented in the UI differently, with some relations bestowing the ability to interact with the related item in special ways (say, to edit "parent" fields from within the metadata pane)?

3) As extensive a list as this becomes, it probably will need to be extended with the same mechanism we end up using for custom types and fields, since there's no way this could satisfy all use cases, especially once custom types and fields are possible.

dbot · September 22, 2007

I had wanted to add the suggestions below by bdarcus to the wikipage but wasn't quite sure how to approach it. The suggestions were made under http://forums.zotero.org/discussion/1311/

cites (or references) -
- refutes
- supports

dbot · September 22, 2007

For me it makes more sense to use 'refuted by' and 'supported by' rather than refutes and supports. I find it gives a more intuitive idea of what is intended and would also suggest other relationships.
I have requested a Trac Account.

bdarcus · September 22, 2007

The "by" relations are inverses of the ones I listed.


A refutes B = B refuted by A

Right?
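
The inverse reading can be sketched in a few lines of Python. This is only an illustration: the label table and the `invert` helper are hypothetical names, not anything that exists in Zotero.

```python
# Hypothetical table of relation labels and their inverses (illustrative only).
INVERSE = {
    "refutes": "refuted by",
    "supports": "supported by",
    "cites": "cited by",
}
# Make the table symmetric so lookups work in both directions.
INVERSE.update({passive: active for active, passive in list(INVERSE.items())})


def invert(subject, predicate, obj):
    """Return the same statement read from the other item's side."""
    return (obj, INVERSE[predicate], subject)


print(invert("A", "refutes", "B"))  # ('B', 'refuted by', 'A')
```

Inverting twice gives back the original statement, which is why only one of the two forms would need to be stored.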

dbot · September 22, 2007

If A is your own observation (a note), what if you want to record that A is refuted by B?

raf · September 22, 2007

Similar to "cites" another relationship could be called "inspired" - A inspired B.

I'm not sure whether "inspired" would be the best way to call it, but it might be useful for early works, when people did not use to cite their sources or publish bibliographies, but where textual analysis shows that whatever B wrote was derived from A.

scot · September 23, 2007

Another suggestion for further discussion:

A "develops ideas of" B (that is, the author of A considers himself to have taken up the same conversation or line of enquiry). This could perhaps be put more succinctly.

jverber · September 24, 2007

@dbot

If you have item A (a note of your own invention, another item, whatever) and it is refuted by B, then you would simply need a relationship of "B refutes A"

The list of relations for A and for B would then both include "B refutes A"

In your case, from what I gather, you would get something like this: "'Title of Some (Mistaken) Article' refutes My Idea Stored in a Note". It would show up when viewing the relations of A and the relations of B.
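
A rough Python sketch of that idea, storing the relation once and listing it for both items. The `Relation` tuple and the `relations_for` helper are made up for illustration and say nothing about how Zotero stores relations.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str    # item the statement is made about
    predicate: str  # relation label, e.g. "refutes"
    obj: str        # item the statement points at


# The relation is stored exactly once, in its active form.
relations = [
    Relation("Title of Some (Mistaken) Article", "refutes", "My Idea Stored in a Note"),
]


def relations_for(item, store):
    """Every stored relation that mentions the item, in either role."""
    return [r for r in store if item in (r.subject, r.obj)]


for r in relations_for("My Idea Stored in a Note", relations):
    print(f"'{r.subject}' {r.predicate} {r.obj}")
```

Looking up either item returns the same single row, so no separate "refuted by" entry is needed.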

MTBradley · September 27, 2007

Perhaps this is a way to deal with some of the issues I brought up at http://forums.zotero.org/discussion/1274/ ?

If so, I suggest the following (with the idea expressed in the wording being more important than the wording itself):
-"Published version of", to relate a published primary source to an archival holding and a scrubbed-up dissertation to the original dissertation
-"Reprints"
-"Image of", which would be useful for linking a photo/sketch of a material culture object with that object's record
-"Copy of" and "Reproduction of" for cartobibliography
-"Transcription of", to relate text to an audio/video recording

SalishSea · January 16, 2008

Can I offer an incremental perspective? It seems to me that a very small set of semantic operators would provide a useful test-bed, encourage early adoption, etc. Thus, I suggest:

1) cites article X in bibliography
2) is cited by article Y in Y's bibliography
3) Some evidence exists of another relationship between X and Y that is not strictly bibliographic - we might want to limit the potential sorts of relationships

I would discourage normative semantic operators (refutes, supports, etc) in favour of strictly empirical assessments, particularly as these operators may one day move from my highly opinionated view of the world into something more "wiki" like, where my "supports" is someone else's "refutes".

SalishSea · January 16, 2008

I would also note that Graphviz (graphviz.org) provides remarkable open-source graph drawing tools that build graphical representations of data -- change the data, and that changes the graph... a feat not obtainable when graphs are drawn out by users. Graphviz supports a large array of graph types, and the flexibility of its graph language suggests it may be a useful way of depicting the networks that Zotero might document.

Ideally Zotero would be neutral about any associated graphing tools, instead generating a simple array that could then be post-processed by an as-yet-unwritten tool to generate inputs for graphviz or other tools.
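
As a minimal sketch of such a post-processing step, the Python below turns a simple array of (source, label, target) entries into Graphviz DOT text. The array layout and the `to_dot` function are assumptions for illustration; Zotero does not export anything in this shape as far as this thread says.

```python
def to_dot(edges):
    """Render (source, label, target) tuples as a Graphviz DOT digraph.

    Note: names containing double quotes would need escaping first.
    """
    lines = ["digraph relations {"]
    for source, label, target in edges:
        lines.append(f'    "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical export: two articles that both cite Y.
edges = [("X", "cites", "Y"), ("Z", "cites", "Y")]
print(to_dot(edges))
```

Saving the output as relations.dot and running `dot -Tpng relations.dot -o relations.png` should render it, assuming Graphviz is installed.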

bdarcus · January 16, 2008

I would discourage normative semantic operators (refutes, supports, etc) in favour of strictly empirical assessments, particularly as these operators may one day move from my highly opinionated view of the world into something more "wiki" like, where my "supports" is someone else's "refutes".

That's a reasonable perspective.

I find myself losing track of this discussion, since it was a while ago, and there's not much context above. But IIRC, there are basically two kinds of relations of relevance: what we might call "content" relations (part of, review of, translation of, etc.), and what I'm going to call "annotation" or maybe "commentary" relations (the refutes, etc. stuff).

I also get a little worried about the second class, which I believe we talked about in the context of notes. But one area where they matter a lot is law. But these probably need specific relations (affirms, overturns, etc.).

LGauvreau · January 16, 2008

I agree with Bdarcus that "there are basically two kinds of relations":

1) "Content" relations (part of, review of, translation of, and I add: reprint of, adaptation of, remake of, manuscript of, etc.) are a kind of "objective" relation between items/documents, independent of the researcher, just as the relation between quoted and quoting items is independent of the researcher's opinion or analysis: it's a fact. I note that these relations are (always?) "one-way" relations, non-symmetric:
A is part of B / B can't be a part of A
A quotes B / B can't quote A (it's logically impossible!)
A is a translation of B / B can't be a translation of A
etc.

2) Bdarcus: "annotation" or maybe "commentary" relations (the refutes, etc. stuff).
Here, the relations are based not on "facts" but on "meaning"; the relations are more intellectual. To have a relation like that, two items must have a subject keyword in common: they are both related to this subject and they disagree, refute, confirm, comment, etc., about this subject.
Example: if you would like to compare the positions of authors on "War in Iraq": 1) find the items and add a tag "War of Iraq" to all selected items; then add another tag to each of them: "Support war" or "Against War"... (there are other ways to do it too). These relations are of a very different kind than the first kind of relations.

That is why Zotero should separate the two kinds of relations and not try to put all of them under the current "Related" feature (even if we can name and tag the relation), which should be limited to the first kind of relation: factual and objective relations between items.
----
I don't know if it could help (and it's not new either), but I found in "OWL Web Ontology Language" that OWL distinguishes between two main categories of properties that an ontology builder may want to define:

1) Object properties link individuals to individuals.
2) Datatype properties link individuals to data values.

I'm not used to this language, but the first kind of relation, between items, such as "part of", seems to be "Object properties (which) link individuals to individuals".

The second kind of relation, between "subjects", seems to be "Datatype properties which link individuals to data values".

Does it make sense? Can it help to create semantic relations in Zotero?

Luc

bdarcus · January 17, 2008

Luc, I agree with what you say, but the OWL distinction between object and datatype properties actually confuses the issue.

In plain language, the first is represented by the statement "x is related to y" while the second by "x title is 'Some Title'". E.g. in the RDF subject-predicate-object model, it just says the content of the object is different (a URI vs. a string).

In this discussion, by contrast, we're in essence always assuming the object is a URI (or in a relational database, another row). We're just talking about different kinds of predicates/relations.
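
To make the distinction concrete, here is a small Python sketch of a triple whose object is either a URI or a string. The class names and the example.org URIs are invented for illustration and are not taken from any RDF library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URIRef:
    value: str  # identifies another resource


@dataclass(frozen=True)
class Literal:
    value: str  # a plain data value such as a title string


@dataclass(frozen=True)
class Triple:
    subject: URIRef
    predicate: URIRef
    obj: Union[URIRef, Literal]

    def links_items(self) -> bool:
        """True when the object is another resource rather than a data value."""
        return isinstance(self.obj, URIRef)


x = URIRef("http://example.org/items/x")  # hypothetical item URIs
y = URIRef("http://example.org/items/y")

related = Triple(x, URIRef("http://example.org/rel/relatedTo"), y)
titled = Triple(x, URIRef("http://example.org/rel/title"), Literal("Some Title"))

print(related.links_items(), titled.links_items())  # True False
```

Every relation discussed in this thread would be of the first shape; only the predicate changes.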

mark · January 22, 2008

I'm missing 'part of'/'chapter of' on the wiki page. This seems to me to be an uncontroversial type 1 relation in bdarcus' parlance.

yeti · March 20, 2008

Having browsed through a number of forum threads, I am wondering whether there is actually a need for a specific mechanism to represent semantic and/or hierarchical relationships. What would seem cleanest (and easiest to implement) to me would be a mechanism to support an arbitrary number of DIRECTED, TAGGED relationship entries for each citation. This would support any structure that can be represented by a directed graph and could easily be utilized to automatically create interactive graphical representations.
To facilitate data exchange and automatic analysis, this could come with a set of predefined tags for an arbitrary number of standard relationships, such as those mentioned on the wiki page, while still allowing users to create their own (and use available visualization and analysis tools on those as well as the standard ones).
Cited by/cites type relationships should then be automatically updateable by extracting, e.g., Scopus or ISI information.
Best,
Sven
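
A minimal Python sketch of directed, tagged relationship entries with a predefined tag set that users can extend. The `RelationGraph` class and the three standard tags are illustrative assumptions, not an existing Zotero structure.

```python
from collections import defaultdict

# Illustrative subset of predefined tags; the real list would come from the wiki page.
STANDARD_TAGS = {"cites", "part of", "translation of"}


class RelationGraph:
    def __init__(self):
        self.custom_tags = set()
        self.edges = defaultdict(list)  # source -> [(tag, target), ...]

    def add(self, source, tag, target):
        """Record a directed, tagged edge; unknown tags become user-defined tags."""
        if tag not in STANDARD_TAGS:
            self.custom_tags.add(tag)
        self.edges[source].append((tag, target))

    def outgoing(self, source, tag=None):
        """Targets reachable from source in one step, optionally filtered by tag."""
        return [t for g, t in self.edges.get(source, []) if tag is None or g == tag]


graph = RelationGraph()
graph.add("A", "cites", "B")
graph.add("A", "totally misinterprets", "C")  # a user-created tag
print(graph.outgoing("A"), graph.outgoing("A", "cites"), graph.custom_tags)
```

Because standard and custom tags share one structure, any analysis or visualization tool written against it would work on both.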

WA · June 9, 2008

Dan,

As promised, I looked at this (Semantic Relations) forum and have a "sketch" of a suggested approach. It is not a "solution"; I don't know enough about Zotero or the full requirements for Semantic Relations. But, it might be a start for discussion and correction.

There are four sections to follow, separated to better the chance that some will get through:
1) Semantic Relation database "Rules".
2) Relation Issues
3) Relation Examples, per Feature Requests for Semantic Relations
4) Diagram of Semantic Relation "Rules"

I hope this helps. Thank you for your help.

Bill

WA · June 9, 2008

(1 of 4) Semantic Relation "Rules":
Collection: A Collection may represent a research project or a specific area of interest. Collections may have subordinate Subcollections. Titles (Items) are organized under Collections and their Subcollections. A single Title may be in multiple Collections at the same time.

Each "Collection" may contain one or more "Collection"s (e.g., Subcollections).
Each "Collection" may define one or more "DefinedTag"s.

CollectionTitle: Implements the feature that a single Title may be in multiple Collections at the same time.
Each "Collection" may reference one or more "CollectionTitle"s.
Each "CollectionTitle" always belongs to one "Collection".
Each "Title" may belong to one or more "CollectionTitle"s.
Each "CollectionTitle" always references one "Title".
A “Title” may be transferred to a different “CollectionTitle” (effectively, another “Collection”).

Title: Carries selected identifying attributes about the item.
Each "Title" may contain one or more "Statement"s.

Each "Title" may have one "SourceSnapshot". (Apparently an existing limit.)
Each "SourceSnapshot" always is of one "Title". (Apparently an existing limit.)

Statement: A Statement is a textual description of one or more assertions or comments to be found in or about the parent Title. A Title may have one or more Statements (Assertions), each of which may have a Relation to one or more other Statements of Title(s). By themselves, sequenced statements could form an outline of points to be taken about the Title. (Also, the Statement capability enables one assertion or comment to reference another Statement in the same Title document, such as an internal contradiction.)
- A Statement is identified by its Title ID and Statement ID.
- A Statement has a Statement Heading attribute.
- A Statement has a Description attribute.
- A Statement may have a Sequence attribute for display purposes.
- A Statement may have an InTitleAs attribute referencing a location within the document
- A Statement has an IsDefault attribute indicating whether or not it is a placeholder for later edit.
Each "Statement" always belongs to one "Title".
Each "Statement" may originate one or more "Relation"s.
Each "Statement" may be referred to by one or more "Relation"s.

Relation: A Relation is an abstracted "concept" object (thing of interest to the user). It represents the existence of a relation between a statement in one document and another statement in a document. An instance (row) is independent of direction.
It would display differently depending on the "Statement" from which the "Relation" is originated. Default Verb Phrases are provided from the “Type” table, but may be over-ridden.

Each "Relation" is always classified by one "Type".
Each "Relation" is always originated by exactly one "Statement".
Each "Relation" always refers to one (referenced) "Statement".
No "Relation" may refer to the "Statement" from which it originated.
- A Relation has a Verb Phrase Originating attribute used when the originating statement is compared to the referenced statement.
- A Relation has a Verb Phrase Referenced attribute used when the referenced statement is compared to the originating statement.
- A Relation allows a Relation Description attribute explaining “why” the originating statement is related to the referenced statement.
- A Relation has an IsDefault attribute indicating whether or not it is a placeholder for later edit.

Bill
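
The Statement, Type and Relation rules above could be sketched in Python roughly as follows. This is one reading of the rules, not a schema anyone has agreed on: the class and field names are guesses, and only a few of the listed attributes are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    title_id: int      # the parent Title
    statement_id: int  # unique within that Title
    heading: str


@dataclass(frozen=True)
class RelationType:
    name: str
    verb_originating: str  # default phrase, read from the originating side
    verb_referenced: str   # default phrase, read from the referenced side


@dataclass(frozen=True)
class Relation:
    type: RelationType
    originating: Statement
    referenced: Statement
    verb_originating: Optional[str] = None  # per-relation override of the default
    verb_referenced: Optional[str] = None

    def __post_init__(self):
        # Rule: no Relation may refer to the Statement from which it originated.
        if self.originating == self.referenced:
            raise ValueError("a Relation may not refer to its own originating Statement")

    def display_from(self, viewpoint: Statement) -> str:
        """Phrase the same row differently depending on which Statement is viewed."""
        if viewpoint == self.originating:
            verb = self.verb_originating or self.type.verb_originating
            return f"{self.originating.heading} {verb} {self.referenced.heading}"
        if viewpoint == self.referenced:
            verb = self.verb_referenced or self.type.verb_referenced
            return f"{self.referenced.heading} {verb} {self.originating.heading}"
        raise ValueError("viewpoint is not part of this Relation")


a = Statement(1, 1, "Statement A")
b = Statement(2, 1, "Statement B")
support = RelationType("Support", "Is Supported By", "Supports")
row = Relation(support, a, b)
print(row.display_from(a))
print(row.display_from(b))
```

A single row is stored, and the direction only matters at display time, which is how the sketch interprets "an instance (row) is independent of direction".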

WA · June 9, 2008

(2/4) Relation Issues

Relation Issues:

Cycling: This structure resembles that of a Bill Of Materials; it would raise the problem of preventing cycling ("A" consists of "B" & "C"; "C" consists of "A"& "D"; and so on forever). Workaround: Provide no capability for Bill Of Materials processing. Document the fact that “cycling” detection and prevention are not provided.

Implementation: may be tricky. If a user just wants to record that a relationship exists between Title A and Title B, the application should be capable of creating a defaulted A-Statement N, B-Statement M, and a Relation R between A-N and B-M. The IsDefault flag facilitates follow-up.

Simplicity: I’m not sure that the "rules" for handling this situation as a database belong in Zotero; they may be too complex for good performance. Perhaps in a personal database that links to Zotero if I can figure out how to link it and keep it synchronized.

It might be simpler to eliminate separate "Statement"s, but then “Relation” would have to provide for Statement Heading & Description, Originating & Destination Verb Phrases, and the Relation Description. Also, it might be difficult to identify contradictory statements within a single Title.

Bill
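
For reference, detecting the cycling described above is a standard depth-first search. The Python sketch below assumes the relations are available as a plain dictionary of successors; that layout and the `has_cycle` name are illustrative, and nothing here suggests Zotero should or does perform this check.

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph given as {node: [successors]}."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # reached a node already on the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# "A" consists of "B" & "C"; "C" consists of "A" & "D".
print(has_cycle({"A": ["B", "C"], "C": ["A", "D"]}))  # True
print(has_cycle({"A": ["B", "C"], "C": ["D"]}))       # False
```

The recursive form is fine for small libraries; a very deep chain of relations would need an iterative version to stay within Python's recursion limit.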

WA · June 9, 2008

(3/4) Relation Examples, per Feature Requests for Semantic Relations

Each Type has two verb phrases: one read from the originating statement (A) and one read from the referenced statement (B).

Type           Originating verb phrase (A to B)           Referenced verb phrase (B to A)
Support        Statement A Is Supported By Statement B    Statement B Supports Statement A
Refute         Statement A Is Refuted By Statement B      Statement B Refutes Statement A
Publish        Statement A Is Published As Statement B    Statement B Is Publication Of Statement A
Image          Statement A Has Image In Statement B       Statement B Is Image Of Statement A
Copy           Statement A Copied As Statement B          Statement B Is Copy Of Statement A
Reproduce      Statement A Is Reproduced As Statement B   Statement B Is Reproduction Of Statement A
Transcribe     Statement A Transcribed As Statement B     Statement B Is Transcription Of Statement A
Review         Statement A Reviewed In Statement B        Statement B Is Review Of Statement A
Adaptation     Statement A Adapted In Statement B         Statement B Adaptation Of Statement A
Annotation     Statement A Annotated In Statement B       Statement B Annotation Of Statement A
Commentary     Statement A Has Commentary In Statement B  Statement B Is Commentary On Statement A
Remake         Statement A Remade As Statement B          Statement B Is Remake Of Statement A
Contradiction  Statement A Contradicts Statement B        Statement B Contradicts Statement A

Bill

WA · June 9, 2008

(4/4) Diagram of Rules

My attempt to copy a gif of the diagram here did not succeed.

I will try Trac.

Bill

WA · June 9, 2008

(4/4) Diagram of Rules

The Diagram of Rules has been attached as a file on TRAC.
See:
ERD.gif
as an attachment on
https://www.zotero.org/trac

I hope this helps.

Bill

bdarcus · June 9, 2008

Bill: a 14kb gif of an ERD is pretty much unreadable ;-)

More broadly, I'm going to ask the obvious: why reinvent something vaguely like RDF in SQL, rather than just use RDF more directly, particularly when the primary import/export format will in fact be RDF? Or to put this more generically what would be the relation (no pun intended) between this model and an RDF representation?

For example, in RDF you've got the standard triples: subject, predicate, object. The subject and objects WRT to Zotero and this particular use case are the Zotero items. Each has an internal ID, and (when 2.0 comes) a global URI. So the primary issue we're dealing with are the predicates, also identified by URI.

Of course, triples alone aren't useful here b/c we need to track who's making whatever statements. For this, most triple stores (including those built on top of MySQL, etc.) add a fourth element: context.

So context comes in as in essence "user-generated statements." Every user has a graph of their statements. Those statements can then also be trivially aggregated.

Am not offering any specific implementation suggestions here; just wondering how RDF might help think about this use case.
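
Purely as an aid to thinking about it, here is a Python sketch of triples extended with a context element, one graph per user, and a trivial aggregation. The identifiers such as "item:A" and "user:alice" are placeholders, and the helper names are invented; this is not how any particular triple store is implemented.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str  # who made the statement


def graph_for(quads, context):
    """The set of triples asserted within one context (one user's graph)."""
    return {(q.subject, q.predicate, q.obj) for q in quads if q.context == context}


def aggregate(quads):
    """Map each triple to the set of contexts that assert it."""
    merged = defaultdict(set)
    for q in quads:
        merged[(q.subject, q.predicate, q.obj)].add(q.context)
    return dict(merged)


quads = [
    Quad("item:A", "rel:supports", "item:B", "user:alice"),
    Quad("item:A", "rel:refutes", "item:B", "user:bob"),
    Quad("item:A", "rel:supports", "item:B", "user:carol"),
]
print(graph_for(quads, "user:bob"))
print(aggregate(quads))
```

Keeping the context alongside each statement also addresses the earlier worry that one user's "supports" may be another's "refutes": both can coexist without conflict.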

WA · June 9, 2008

Sorry! :-[

I did not mean to suggest any particular technical implementation. Also, I agree that a 15KB gif results in a miserable illustration.

About the illustration (diagram): I had not tried to transfer a graphic to you folks before, so I chose the smallest possible file size, intending to copy it into an HTML post. That did not work, so I sent the .gif thru TRAC.

I am limited to screen capture techniques and the diagramming application. I could send you .jpg, .wmf or .pspimage. All screen captures would be a bit fuzzy. I could also send you the original diagram (Visio 2000 or 2003). Your choice. One more thing: If I did it right, the "rules" directly correspond to the essential content and relationships in the diagram.

Let me try to answer the broader question. I led myself astray by looking at what files the Zotero installation had put on my PC: .js and .sqlite. I assumed an SQL implementation. After your note, I looked up RDF, URI and WRT (Assumed URI means Universal Resource Identifier, found RDF but not WRT), together with several links claiming RDF could be implemented in a relational database. (I can't comment on those claims.)

Many entries in the Zotero Forum on Semantic Relations appeared to request that the "Related" element make sense forward and backward. E.g., 1) subject, active predicate (as verb), object and 2) object, passive predicate (as verb), subject. I took that as a requirement and generalized (abstracted) the "Related" element so that it could be bi-directional. Also, I added the "Statement" element to support "what" was being referenced in each Title. The "Type" element was added as an attempt to make it easier to enter a "Relation".

I think that the basic problem is the bi-directional requirement. I do not know how this is implemented in a Resource Description Framework; I am out of my depth there.

Anyhow, take the diagram for what it's worth, if anything.

With best regards,

Bill

Julia Thornton · June 10, 2008

This is a "babe in the woods" comment, but it would be just great to visualise semantic relations dynamically like Visual thesaurus, http://www.visualthesaurus.com/ only with the links as well as the nodes descriptively labeled (nodes = items in library, links = relations) but firstly this is proprietary software and secondly, I guess you would have to do the semantic work as above anyway only with an added layer of design complexity.

jodler · June 12, 2008

I am sure that a great implementation of these ideas would have high potential. I was thinking about this: some of these relations might be page specific: Author A refutes B's concepts not in the entire book or article but on page 202ff.
So it might be useful to add a short string to a semantic relation such as the page number. It might, however, get too complicated...

diegomaranan · March 31, 2010

Hi all, I'm a non-Zotero-developer and a huge fan of Zotero. I just wanted to add weight to SalishSea's Jan 16th, 2008, comment. A "cited-by/cited-in" relationship would be INCREDIBLY useful. :)

Also, in my admittedly naïve opinion, I think that a semantic-based relationship (such as a "refuted by" relationship) would be less useful than a cited-by/cited-in relationship, although I suppose that a user can just create arbitrary labels for relationships, so that they can create new semantically-based relationships like, "kind of refuted by" or "totally misinterprets".

A way to export these as some kind of visual graph would be fantastic as well.

I searched through the forum posts and this is the discussion thread that seemed most appropriate to say this.

Thank you!

antikorpo · February 21, 2013

May I hook in here and suggest a feature for Zotero (or Zotfile?).

ZotFile already allows extraction of Annotations from PDFs, so I think it is possible to read text from a pdf file for further analysis:

Zotero should automatically search for citations in attached PDF-files (when indexing? in the context menu?) to determine whether one of the cited files is already in the library.

It would be great to have a "cited" relationship (or, until such a feature is implemented, just a simple relationship) between the articles that cite "each other". Of course only the newer article can cite the older one, at least for static publications: this could serve as an optional filter for better performance, since an article from 1899 cannot possibly cite any newer article in the library, and thus Zotero would only have to check a few (or even no) older articles.
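
The date filter could look something like the Python sketch below. It assumes each library item has a known publication year and uses a hypothetical `candidate_targets` helper; items from the same year are kept as candidates, since they could plausibly cite each other.

```python
def candidate_targets(citing_key, library):
    """Items that the citing work could possibly cite: anything not newer than it.

    `library` maps an item key to its publication year (a made-up layout).
    """
    citing_year = library[citing_key]
    return [
        key
        for key, year in library.items()
        if key != citing_key and year <= citing_year
    ]


library = {"old1899": 1899, "mid1950": 1950, "new2012": 2012}
print(candidate_targets("old1899", library))  # []
print(candidate_targets("new2012", library))  # ['old1899', 'mid1950']
```

Only the remaining candidates would then need to be matched against the citations extracted from the PDF.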

dlesieur · August 12, 2020

This thread is somewhat inactive, but it appears to be the most detailed one on the subject of semantic/typed relations. I too find that the "Related" field is not very expressive and would like to be able to use typed and directed relations.

The Citation Typing Ontology defines a great list of relation types: https://sparontologies.github.io/cito/current/cito.html.

@bjohas and I are currently experimenting with typed relations in Kerko (a Zotero client web app). To work around Zotero's lack of typed relations, we are currently using child notes that contain the Zotero URIs of related items, which Kerko then parses. Obviously, a better integrated solution in Zotero would be much more user-friendly, and more resilient to changes to the library (such as item merges). At the moment we display "Cites" and "Cited by" relations, which will be pretty useful to our users (have a look at an example here: https://docs.opendeved.net/lib/9IYKEUKJ). I thought I'd share this to give an example of a real use case, but my impression is that even for private libraries, semantic relations would be extremely useful to researchers as a kind of annotation (at least I know I would use them that way in my own research!).

Is the general idea of semantic relations still on the Zotero team's radar?
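
For readers curious what the child-note workaround involves, here is a rough Python sketch of pulling item keys out of a note's text. It is not Kerko's code: the note layout, the URI shape in the regular expression, and the group ID are all assumptions made for illustration.

```python
import re

# Assumed URI shape (users or groups library, 8-character item key);
# the thread does not specify the exact format of the notes.
ZOTERO_ITEM_URI = re.compile(
    r"https?://zotero\.org/(?:users|groups)/\d+/items/([A-Z0-9]{8})"
)


def related_keys(note_text):
    """Return the item keys of all Zotero item URIs found in a note, in order."""
    return ZOTERO_ITEM_URI.findall(note_text)


# Hypothetical note content with two related items.
note = (
    "Cites: http://zotero.org/groups/12345/items/ABCD2345 "
    "and http://zotero.org/groups/12345/items/WXYZ6789"
)
print(related_keys(note))  # ['ABCD2345', 'WXYZ6789']
```

A scheme like this breaks as soon as an item is merged or its key changes, which is the resilience problem mentioned above.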

polisank · September 17, 2020

@dlesieur I totally agree: "I too find that the "Related" field is not very expressive and would like to be able to use typed and directed relations."

Also, I cannot find the "related items" in the exported .csv, do you know how to do that?

Alb · June 18, 2021

Maybe you do not know, so I'll share: to overcome this "lack of correlation" I use https://www.citationgecko.com/ and https://www.connectedpapers.com, which do the job I need quite well. Having both integrated somehow directly in Zotero would be best, but it is still reasonable to go directly to their websites.

In addition there is https://scite.ai which also offers the zotero plugin https://github.com/scitedotai/scite-zotero-plugin/