How are roles, also called association ends, used in conceptual data models? It is a question that I pondered some five years ago, with the aim of improving their use for more precise modelling and adding more ontological principles to it. VerbNet seemed to fit right in there to contribute to, if not be the, solution: a knowledge base about verbs and verb classes, the kind of things that participate in the action represented by the verb and the roles they play in it. But I got stuck. It didn’t add up as I thought it would, and then sabbatical time was up, and other work took over. I dusted it off over half a year ago to try to ‘un-stuck’ it, since the topic resurfaced as part of the ‘abstract representation’ for the Abstract Wikipedia project. After additional analysis, I got to a better understanding of the problem [Keet23], but no concrete usable solution so far. Yet, the new insights that the analysis resulted in were already enough for the work to be accepted at the 13th International Conference on Formal Ontology in Information Systems (FOIS2023) that will be held in July in Sherbrooke, Canada and in September online.
Before I can even start to clarify it informally in this post, we first need to disambiguate which roles I’m referring to. The term ‘role’ is used for many things, including the social roles people play (e.g., student), roles (/positions/argument places) as part of a relation in an ontology of relations, the roles as components of fact types in Object-Role Modeling, roles that are mostly binary relationships in Description Logics, or roles in linguistics, like the agent and undergoer roles in verbs and subject and object roles in a sentence. The element under investigation in the paper is the role as part of a relation(ship), committing to positionalism as a (to be refined) theory for the ontology of relations [Fine00,Leo08,Orilia11]. In conceptual data modelling, they are called roles, association ends, or relationship components; in linguistics, from what it seemed initially at least, semantic or thematic roles. Both fields deem roles particularly useful for a wide range of reasons (regardless of whether philosophers like that extra piece of fundamental furniture of the universe), such as using them in constraint declarations, adding annotations, or improving parsing of text.
Here’s a plausible-looking small ORM diagram about animals living in geographic regions on the left-hand side of the image:
It’s neat and tidy and assists with disentangling key elements like the reading labels, roles with role names, and a name for the relation that shows up in the behind-the-scenes formalisation. The role name [inhabitant] sounds like a social role and [location] sounds like a thematic role; conversely, [location] is definitely not a social role and [inhabitant] is surely not a thematic role. Still, the names are perfectly fine for an ORM diagram. Ontologically, and with ontology-driven conceptual data modelling in mind, it’s a bit murky.
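To make the positionalist reading of such a diagram a bit more tangible, here is a minimal Python sketch of a relation as a set of named roles, each with a declared type of player. Only the role names [inhabitant] and [location] come from the diagram; the class and function names (`Role`, `Relation`, `make_fact`), the relation name `habitation`, the player types, and the sample players are assumptions made up for illustration, and this is not the ORM formalisation itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A named argument place of a relation, with the type allowed to play it."""
    name: str
    player_type: str


@dataclass(frozen=True)
class Relation:
    """A relation seen positionally: a name plus an unordered set of roles."""
    name: str
    roles: frozenset

    def role_names(self):
        return {r.name for r in self.roles}


def make_fact(relation, **players):
    """Bind exactly one player to each role; argument order is irrelevant."""
    if set(players) != relation.role_names():
        raise ValueError(f"expected roles {sorted(relation.role_names())}")
    return (relation.name, frozenset(players.items()))


# Hypothetical relation name, player types, and players (illustration only).
habitation = Relation(
    name="habitation",
    roles=frozenset({
        Role("inhabitant", "Animal"),
        Role("location", "GeographicRegion"),
    }),
)

fact_a = make_fact(habitation, inhabitant="lion", location="savannah")
fact_b = make_fact(habitation, location="savannah", inhabitant="lion")
print(fact_a == fact_b)  # the same fact, whichever role is listed first
```

The point of the sketch is that the players are identified by the role they play rather than by their position in an argument list, which is why the two facts compare as equal.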
Naming the roles may not happen very often in ORM, UML class diagrams, or EER, but maybe something can be learned from it nonetheless and be used for Abstract Wikipedia’s abstract representation. Likewise, we may be able to pick up something from VerbNet’s thematic roles [Palmer17] and conjure up a fine cocktail for a solution that is ontology-informed, or else contribute to the further development of an ontology of relations. This led to two key questions to start tackling the issue:
- How are roles used in conceptual data models? Are they named and if so, how, and do they map usefully into semantic or thematic roles as specified in linguistic resources or ontologies? Can this modus operandi be copied over to Abstract Wikipedia’s abstract representation?
- To what extent do those verb classes with their roles and fillers in the authoritative linguistic resource VerbNet adhere to ontological principles? Can that be improved upon further, using basic modelling guidance from ontology development and without the need for major theoretical ‘overhead’ for an end user writing Abstract Wikipedia’s constructors?
Instead of a purely theoretical analysis, I updated that five-year-old, hands-in-the-mud, data-based VerbNet analysis to account for the changes VerbNet had made in the meantime (neither their set of roles nor their verb class characterisations remained static over the years), and I revisited the manual analysis of roles in a corpus of 101 conceptual data models from 2018 that extended an earlier data analysis reported in [KeetF15]. In those conceptual data models, only about half of the roles are named in the UML class diagrams, where it is actually mandatory to do so; that is still much better than in the EER and ORM diagrams, where less than 10% of the roles in the models in the corpus are named. When roles are named, they mostly are of the type of so-called ‘deep’, or subject domain-specific, roles, which also may be called ontological roles (even if a particular role name may not be the most suitable ontologically). They are roles with names in the examined UML, EER, and ORM models such as [participant], [member], [work (for)], [manage], [parent], [client], [physician], and [upperValue]. Ideally, there would be some modelling guidance and quality control for them eventually.
For VerbNet, and with an eye on the possibility that relationships might be defined by their roles and participants, I coded up their XML-based specification of 5 selected verb classes and their subclasses in a test ontology and ran the reasoner over it. A few equivalences were deduced, such as between Deprive-10.6.2 and Cheat-10.6.1; or, put differently, the information in VerbNet is not enough to distinguish those verb classes that way. But the deductions may help evaluate those modelling decisions of VerbNet to seek areas for refinement of the representation. Also, 36 subsumptions were deduced, even across major verb classes, such as Fire-10.10 being deduced to be a subclass of Hire-13.5.3, which also points to possible areas for improvement.
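To give a feel for why such deductions pop up, here is a deliberately simplified Python sketch; it is not the test ontology nor the reasoner used for the paper. It assumes each verb class is defined as nothing more than a conjunction of (role, filler) restrictions, so that two classes with identical restrictions are equivalent and a class with strictly more restrictions is a subclass of one with fewer. The class names, roles, and fillers are placeholders made up for illustration, not actual VerbNet entries, and a real Description Logic reasoner would also take role and filler hierarchies into account.

```python
from itertools import combinations

# Toy data: each class is a set of (role, filler) restrictions.
# Placeholders only, NOT the actual VerbNet specification.
verb_classes = {
    "ClassA": {("Agent", "Animate"), ("Theme", "Concrete")},
    "ClassB": {("Agent", "Animate"), ("Theme", "Concrete")},
    "ClassC": {("Agent", "Animate"), ("Theme", "Concrete"),
               ("Source", "Location")},
    "ClassD": {("Agent", "Organization")},
}


def deduce(classes):
    """Return (equivalences, subsumptions) under the conjunctive reading.

    Equal restriction sets mean equivalence; a class whose restrictions
    are a proper superset of another's is a subclass of that other class.
    """
    equivalences, subsumptions = [], []
    for (c, rc), (d, rd) in combinations(sorted(classes.items()), 2):
        if rc == rd:
            equivalences.append((c, d))
        elif rd < rc:
            subsumptions.append((c, d))  # c is a subclass of d
        elif rc < rd:
            subsumptions.append((d, c))  # d is a subclass of c
    return equivalences, subsumptions


equivalences, subsumptions = deduce(verb_classes)
print("equivalent:", equivalences)
print("subclass of:", subsumptions)
```

With this toy data, ClassA and ClassB cannot be told apart by their roles and fillers alone, and ClassC ends up as a subclass of both, which mirrors, in miniature, the kind of result that flags where a specification may need more distinguishing information.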
Digging deeper, it was clear that there are a few infelicities in both VerbNet’s thematic role hierarchy and in the specification of the role players; the paper motivates improvements on both. For the hierarchy, I remodelled the multiple inheritances to a single inheritance hierarchy, but I kept the ontologically awkward terms to keep backward compatibility (there’s room for improvement there). Redesigned, it looks like this:
For the role players, I separated ontological categories from the grammatical features of the words we use to describe them and used DOLCE categories to indicate the category of the entity that can participate in the relations referred to by that verb class (also this can be refined further).
Consider, for instance, the verb class of knead. VerbNet has it that the agent role in knead can be played by ‘Animate or Machine’, with, as an example, that a human can knead the dough or the bread machine can do that (and cats knead, too). Animacy is a semantic feature (in linguistics) and therewith a grammatical feature, whereas a machine is a physical object in the real world. But it ought to be a union of things of the same kind and at the same level of analysis, not one that mixes ontology and linguistics. So, then either we’d have, e.g., ‘Physical object’ in the sense of a foundational ontology such as DOLCE, comprising both the animals who knead and the kneading machines, or ‘Animate or Inanimate’ as linguistic constraints on the role players of agent in knead. They each deserve their own framework to deal with it: the relations with their ontological roles and participants on the one hand, and the verbs with their linguistic roles and features of words on the other. For conceptual modelling and an ontology of relations, one would be more interested in the former; for natural language generation, the latter will also be useful.
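As a rough illustration of keeping the two levels apart, the following Python sketch declares the constraint on the agent of knead twice: once over ontological categories of entities and once over linguistic features of words. It is only a sketch under assumptions: the class names are invented here, the category is a plain string standing in for a proper DOLCE taxonomy, and animacy is reduced to two labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Something in the world, typed by an ontological category."""
    name: str
    category: str  # plain string standing in for a DOLCE category


@dataclass(frozen=True)
class Word:
    """A lexical item, typed by a linguistic feature."""
    lemma: str
    animacy: str  # "animate" or "inanimate"


@dataclass(frozen=True)
class OntologicalRoleConstraint:
    """Which categories of entity may play a role in a relation."""
    role: str
    allowed_categories: frozenset

    def admits(self, entity: Entity) -> bool:
        return entity.category in self.allowed_categories


@dataclass(frozen=True)
class LinguisticRoleConstraint:
    """Which word features are allowed for a role of a verb."""
    role: str
    allowed_animacy: frozenset

    def admits(self, word: Word) -> bool:
        return word.animacy in self.allowed_animacy


# Two separate declarations for the agent of knead, one per level of analysis.
knead_agent_onto = OntologicalRoleConstraint(
    "agent", frozenset({"Physical object"}))
knead_agent_ling = LinguisticRoleConstraint(
    "agent", frozenset({"animate", "inanimate"}))

baker = Entity("baker", "Physical object")
bread_machine = Entity("bread machine", "Physical object")

print(knead_agent_onto.admits(baker), knead_agent_onto.admits(bread_machine))
print(knead_agent_ling.admits(Word("machine", "inanimate")))
```

Neither constraint mentions the other level's vocabulary, so a conceptual modeller can use the first and a natural language generation component can consult the second without the ‘Animate or Machine’ mix.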
To squeeze the related work and all the analysis into a mere 15 pages was not easy and some details have been left out for readability; there’s more in the supplementary material as well. Does that contain any concrete new frameworks rolling out of all this? Not yet, but, with the conceptual muddles cleared up, this should be doable to specify as a next step. Or: TBC…
I’ll present the paper as part of the FOIS2023 online sessions in September, but I will still attend FOIS2023 thanks to joining ISAO 2023 as a facilitator (and to present another paper at FOIS2023), so if you have any questions or comments, please feel free to email or, even better: let’s meet up while I’m there next month!
References
[Fine00] Fine K. Neutral Relations. The Philosophical Review. 2000;109(1):1-33.
[Keet23] Keet, C.M. An analysis of positionalism’s roles in use. 13th International Conference on Formal Ontology in Information Systems 2023 (FOIS’23). IOS Press, FAIA vol. xxx, xx-xx. 18-20 July Sherbrooke, Canada / Sept online. (in print)
[KeetF15] Keet, C.M., Fillottrani, P.R. An analysis and characterisation of publicly available conceptual models. 34th International Conference on Conceptual Modeling (ER’15). Johannesson, P., Lee, M.L., Liddle, S.W., Opdahl, A.L., Pastor Lopez, O. (Eds.). Springer LNCS vol 9381, 585-593. 19-22 Oct, Stockholm, Sweden.
[Leo08] Leo J. Modeling relations. Journal of Philosophical Logic. 2008;37:353-85.
[Orilia11] Orilia F. Relational Order and Onto-Thematic Roles. Metaphysica. 2011;12:1-18.
[Palmer17] Palmer M, Bonial C, Hwang JD. VerbNet: Capturing English verb behavior, meaning and usage. In: Chipman SEF, ed. The Oxford Handbook of Cognitive Science. OUP. 2017. pp315-336.