Part-whole relations and foundational ontologies

Part-whole relations seem like a never-ending story—and it still doesn’t bore me. In this case, the ingredients were the taxonomy of part-whole relations [1] and a couple of foundational ontologies and the aim was to link the former to the latter. But what started off with the intention to write just a short workshop note, for seemingly clear and just in need of actually doing it, turned out to be not so straightforward after all. The selected foundational ontologies were not as compatible as assumed, and creating the corresponding orchestration of OWL files was a ‘non-trivial exercise’.

What were (some of) the issues? On the one hand, there are multiple part-whole relations, which are typically named differently when they have a specific domain or range. For instance, to relate a process to a sub-process (e.g., eating involves chewing), to relate a region to a region it contains, relating portions of stuff, and so on. Those relations are fairly well established in the literature. What they do demand for, however, is clarity as to what those categories really are. For instance, with the process example, is that to be understood as Process as meant in the DOLCE ontology, or, say, Process in BFO? What if a foundational ontology does not have a category needed for a commonly used part-whole relation?

The first step to answer such questions was to assess several foundational ontologies on 1) which of the part-whole relations they have now, and which categories are present that are needed for the domain and range declarations for those common part-whole relations. I assessed that for DOLCE, BFO, GFO, SUMO, GIST, and YAMATO. This foundational ontology comparison is summarised in tables 1 and 2 in the paper that emanated from the assessment [2], entitled “A note on the compatibility of part-whole relations with foundational ontologies” that I recently presented at FOUST-II: 2nd Workshop on Foundational Ontology, Joint Ontology Workshops 2017 in Bolzano, Italy. In short: none fits perfectly for various reasons, but there are more and less suitable ontologies for a possible alignment. DOLCE and SUMO were evaluated to have the best approximations. It appeared at the workshops presentation’s Q&A session, where two of the DOLCE developers were present, that the missing Collective was an oversight, or: the ontology is incomplete and it was not an explicit design choice to exclude it. This, then, would make DOLCE the best/easiest fit.

I’ll save you the trials and tribulations creating the orchestrated OWL files. The part-whole relations, their inverses, and their proper parthood versions were manually linked to modules of DOLCE and SUMO, and automatically linked to BFO and GFO. That was an addition of 49 relations (OWL object properties) and 121 logical axioms, which were then extended further with another 11 mereotopological relations and its 16 logical axioms. These files are accessible online directly here and also listed with brief descriptions.

While there is something usable now and, by design at least, these files are reusable as well, what it also highlighted is that there are still some outstanding questions, as there already were for the top-level categories of previously aligned foundational ontologies [3]. For instance, some categories seem the same, but they’re in ‘incompatible’ parts of the taxonomy (located in disjoint branches), so then either not the same after all, or this happened unintentionally. Only GIST has been updated recently, and it may be useful if the others foundational ontologies were to be as well, so as to obtain clarity on these issues. The full interaction of part-whole relations with classical mereology is not quite clear either: there are various extensions and deviations, such as specifically for portions [4,5], but one for processes may be interesting as well. Not that such prospective theories would be usable as-is in OWL ontology development, but there are more expressive languages that start having tooling support where it could be an interesting avenue for future work. I’ll write more about the latter in an upcoming post (covering the K-CAP 2017 paper that was recently accepted).

On a last note: the Joint Ontology Workshops (JOWO 2017) was a great event. Some 100 ontologists from all over the world attended. There were good presentations, lively conversations, and it was great to meet up again with researchers I had not seen for years, finally meet people I knew only via email, and make new connections. It will not be an easy task to surpass this event next year at FOIS 2018 in Cape Town.

 

References

 

[1] Keet, C.M., Artale, A. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, 2008, 3(1-2):91-110.

[2] Keet, C.M. A note on the compatibility of part-whole relations with foundational ontologies. FOUST-II: 2nd Workshop on Foundational Ontology, Joint Ontology Workshops 2017, 21-23 September 2017, Bolzano, Italy. CEUR-WS Vol. (in print)

[3] Khan, Z.C., Keet, C.M. Foundational ontology mediation in ROMULUS. Knowledge Discovery, Knowledge Engineering and Knowledge Management: IC3K 2013 Selected Papers. A. Fred et al. (Eds.). Springer CCIS vol. 454, pp. 132-152, 2015. preprint

[4] Donnelly, M., Bittner, T. Summation relations and portions of stuff. Philosophical Studies, 2009, 143, 167-185.

[5] Keet, C.M. Relating some stuff to other stuff. 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW’16). Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (Eds.). Springer LNAI vol. 10024, 368-383. 19-23 November 2016, Bologna, Italy.

Advertisements

Aligning different relations: the case of part-whole relations—LDK2017

Despite the best intentions, I did not get around to writing a post on the paper that I presented last week at the First International Conference on Language, Data and Knowledge 2017, 19-20 June, Galway, Ireland, and now Paul Groth also ‘beat’ me to it writing a nice conference report of it. On the bright side, it is an opportunity to say upfront I really enjoyed the conference and look forward to the next edition in 2019. The ESWC’17 organisers might be slightly disappointed that there was no special track on the multilingual semantic web after all, but I did get the distinct impression that the LDK17 authors might just all have gambled on LDK17—an opportunity to binge two days on all things natural language & Semantic Web—rather than on one track at an overpriced conference (despite the allure of it being A-rated).

So, what was my paper about that could have been submitted to either? I ended up struggling—and solving—an issue with aligning OWL object properties that were not simple 1:1 mappings, in a similar scope as our ESWC17 paper (introduced here) [4], but then with too many complications. Complications were due to the different conceptualisations of part-whole relations and that one of the requirements was to solve what to do with an object property (relation, relationship) that does not have a neat, single, label, and therewith neither fitting with the common OWL modelling paradigm nor with the recently agreed-upon ontolex-lemon model for linguistic annotations.

The start of all this sounded nice and doable: we need to generate natural language for healthcare, using, e.g., SNOMED CT, in local languages in South Africa, focussing on the largest one, being isiZulu. Medical terminologies are riddled with part-whole relations, so we sought to address that one (simple existentials already having been solved), availing of a standard list of part-whole relations (e.g. [1]). That turned out to be a non-trivial exercise, but doable eventually [2]. What wasn’t addressed in [2] was that some ‘common’ part-whole relations, such as membership and containment, weren’t like that in isiZulu, at all. Moreover, it wasn’t just a language issue, but ontological as well. The LDK17 paper “Representing and aligning similar relations: parts and wholes in isiZulu vs English” [3] describes this in some detail.

Here’s a (simplified) list of (assumed to be) common part-whole relations, which takes into account both transitivity differences and domain and range:

Now here’s the one based on the isiZulu language and some ontological analysis of that:

That is: there are both generalisations—some distinctions are not being made—and specialisations—some distinctions are made here but not elsewhere. For instance, ‘musician is part of some orchestra’ and ‘heart is part of some human’ (or vv.) is all done and described in the same way (ingxenye ‘part of’ and SC+CONJ for ‘has part’ [more about that below]). Yet, there is a difference between an individual (e.g., a voter) participating in some process and a collective (e.g., the electorate) participating in a process, or vv. The paper describes this more precisely, going into some detail regarding the differences in categories of domain and range and into the consequences on transitivity of mereological parthood.

The other ‘odd thing’—cf. current multilingual Semantic Web assumptions and technologies, that is—is that while the conceptualisation of ‘has part’ exists, it does not have a single label as in English (or in several other languages, such as heeft as deel), but it is dependent on the noun class of the noun of the class that play the part and play the whole in the relation. It combines the subject concord (~conjugation) of the noun class of the noun that plays the whole with a conjunction that is phonologically conditioned based on the first letter of the noun that plays the part; with verbalisation in the plural and three phonological cases, there are 18 possible strings all denoting ‘has part’. This still could be sorted with a language with inverses, provided the part-of direction has a name, like the ingxenye. This is not the case for containment, however. Instead of the relation (object property) having a name—be this a verb like ‘contained in’ or some noun phrase—it is the noun that plays the whole (the container, if you will) that gets modified. For instance, imvilophu ‘envelope’ and emvilophini denoting ‘contained in the envelope’, or, for individuals and locations, the city iTheku ‘Durban’ and eThekwini meaning ‘located in Durban’ (no typo—there’s some phonological conditioning I’m brushing over). While I have gotten used to such constructions, it generated some surprise among several attendees that one can have notions, concepts, views on or interpretations or descriptions of reality, that exist but do not have even one single string of text throughout to refer to regardless the context it is used.

The naming issue was solved by adding some arbitrary string as ‘name’ of the object property, and relating that to the function that verbalises that specific part-whole relation. The former issue, i.e., not all the same part-whole relations, required a bit more work, using ontology pattern alignments, by extending one correspondence pattern from the ODP catalogue and introducing a new one (see paper for the formal details), using the same broad framework of formalisation as proposed in [4].

All this was then implemented and aligned, and verified to not result in some unsatisfiable classes, object properties, or inconsistency (files). This also works in the isiZulu verbalisation tool we demo-ed at ESWC17 (described in the previous post) [5], all as part of the NRF-funded GeNI project.

Now, ideally, I already would have had the time to read the papers I flagged in my LDK17 conference notes with “check paper”. I haven’t yet due to end-of-semester tasks. So, on the basis of just a positive-seeming presentation, here are a few that are on the top of my list to check out first, for quite different reasons:

  • Interaction between natural language reading capabilities and math education, focusing on language production (i.e., ‘can you talk about it?’) [6], mainly because math education in South Africa faces a lot of problems. It also generated a lively discussion in the Q&A session.
  • The OnLiT ontology for linguistic [7] and LLODifying linguistic glosses [8] terminology (also: one of the two also won the best paper award).
  • Deep text generation, for it was looking at trying to address skewed or limited data to learn from [9], which is an issue we face when trying to do some NLP with most South African languages.

 

References

[1] Keet, C.M., Artale, A. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, 2008, 3(1-2):91-110.

[2] Keet, C.M., Khumalo, L. On the verbalization patterns of part-whole relations in isiZulu. 9th International Natural Language Generation conference (INLG’16), September 5-8, 2016, Edinburgh, UK. ACL.

[3] Keet, C.M. Representing and aligning similar relations: parts and wholes in isiZulu vs English. In: Gracia J., Bond F., McCrae J., Buitelaar P., Chiarcos C., Hellmann S. (eds) Language, Data, and Knowledge LDK 2017. Springer LNAI vol 10318, 58-73.

[4] Fillottrani, P.R., Keet, C.M. Patterns for Heterogeneous TBox Mappings to Bridge Different Modelling Decisions. 14th Extended Semantic Web Conference (ESWC’17). Springer LNCS. Portoroz, Slovenia, May 28 – June 2, 2017.

[5] Keet, C.M. Xakaza, M., Khumalo, L. Verbalising OWL ontologies in isiZulu with Python. 14th Extended Semantic Web Conference (ESWC’17). Springer LNCS. Portoroz, Slovenia, May 28 – June 2, 2017. (demo paper)

[6] Crossley, S., Kostyuk, V. Letting the genie out of the lamp: using natural language processing tools to predict math performance. In: Gracia J., Bond F., McCrae J., Buitelaar P., Chiarcos C., Hellmann S. (eds) Language, Data, and Knowledge LDK 2017. Springer LNAI vol 10318, 330-342.

[7] Klimek, B., McCrae, J.P., Lehmann, C., Chiarcos, C., Hellmann, S. OnLiT: and ontology for linguistic terminology. In: Gracia J., Bond F., McCrae J., Buitelaar P., Chiarcos C., Hellmann S. (eds) Language, Data, and Knowledge LDK 2017. Springer LNAI vol 10318, 42-57.

[8] Chiarcos, C., Ionov, M. Rind-Pawlowski, M., Fäth, C., Wichers Schreur, J., Nevskaya. I. LLODifying linguistic glosses. In: Gracia J., Bond F., McCrae J., Buitelaar P., Chiarcos C., Hellmann S. (eds) Language, Data, and Knowledge LDK 2017. Springer LNAI vol 10318, 89-103.

[9] Dethlefs N., Turner A. Deep Text Generation — Using Hierarchical Decomposition to Mitigate the Effect of Rare Data Points. In: Gracia J., Bond F., McCrae J., Buitelaar P., Chiarcos C., Hellmann S. (eds) Language, Data, and Knowledge LDK 2017. Springer LNAI vol 10318, 290-298.

On heterogeneous mappings between ontologies

Representing information and knowledge often can be done in different ways even when the same representation language is used. In some cases, one way of representing it is always better than another—or: the other option is sub-optimal or plain wrong—but in other cases the distinction is not all that clear-cut. For instance, whether to represent ‘Employee’ as a subclass of ‘Person’ or that it inheres in ‘Person’. Now, if two ontologies (or conceptual models) represent it differently but they have to be aligned, then how to find such different modelling patterns and how to align them? And, taking a step back: which alternate modelling patterns are there, and why those? We sought to answer these questions, whose outcome will be presented (and appear in the proceedings of [1]) the 14th Extended Semantic Web Conference (ESWC’17) that will take place later this month in Portoroz, Slovenia.

Setting aside the formal stuff in this blog post, let’s first have a look at some of those different modelling patterns. At it’s core, there are 1) modelling practices in ontologies vs conceptual models and 2) foundational [or: top-level, or upper] ontology guidance vs being ‘compacter’ in representing the knowledge. The generalisations of the following handwaivy examples are described in more detail in the paper, but for this blog post, it hopefully will do as a teaser of the six formalised patterns. Take, e.g., the following examples that are all variations on the same theme: to-reify-or-not-to-reify, where the example in B is further dressed up with content from a foundational ontology:

Indeed, in the examples, what is shown on the left-hand side does not have the exact same information content as what is shown on the right-hand side, but the underlying conceptualization is pretty much the same. The models on the right-hand side are more precise, for one has the opportunity to specify those, like stating that a particular marriage is between two persons (so, no group marriages allowed). Whether one always needs such more precise constraints is a separate matter.

Then there’s the Employee example mentioned in this post’s introduction with two alternate ways of representing it:

That is, a modeller chooses between representing the role an object performs/has as a subclass of that object or in a separate hierarchy of roles. Foundational ontologies take the latter option, domain ontologies the former.

These examples are instantiations of small modelling patterns (of which there may be more than the six formalised in the paper). To devise mappings between them, one ends up with alignments in such a way that they are between two patterns, rather than 1:1 mappings. To get there, we had to take some preliminary steps on how to represent it all formally, such as specifying the language for a pattern and a defining an ontology pattern alignment. This allowed us to formalise the patterns and devise that formal specification of the heterogeneous alignments.

That outcome, in turn, feeds into the alignment pattern search and checking algorithms. The algorithms show that it is feasible to find those patterns automatically, which then can propose possible alignments to the modeller, and that, upon aligning, one can check whether that’s done correctly. For instance, take the following two ontologies graphically represented in an (extended, enhanced) ICOM tool:

Two inter-ontology assertions have been made, pointed out with the two yellow arrows; i.e., ‘Tennis’ is a subclass of ‘Tournament’ and ‘TennisPlayer’ is a subclass of ‘Athlete’. The pattern search algorithm then will try to find instantiations for the small modelling patterns for alignment. Once something is found—in this case, pattern A fits—it will check whether all conditions for the alignment can be satisfied, and if so, it will propose a possible alignment, which is shown in the following illustrative figure:

Of interest here is, perhaps, the ‘new’ object property being proposed, indicated with the yellow arrow, that amounts to an equivalence to the partOf+Match+played. (That threesome can’t be mapped as equivalent to ‘participated’ due to differences in domain and range axioms, and drawing three subsumption lines from ‘participated’ to ‘part of’, ‘Match’, and ‘played’ is awkward.). The algorithms’ output then thus reduces the alignment into a final question to the modeller along the line of “are you ok with the alignment between the purple elements in the two diagrams?”, and accept or reject it. Please refer to the paper for further details.

The principles presented could possibly be used also for refactoring of an ontology, like in TDD [2] or when ‘preparing’ an ontology to align to a foundational ontology. More results on this topic are in the pipeline, and if you want to know now already, we can have a chat at ESWC.

References

[1] Fillottrani, P.R., Keet, C.M. Patterns for Heterogeneous TBox Mappings to Bridge Different Modelling Decisions. 14th Extended Semantic Web Conference (ESWC’17). Springer LNCS. Portoroz, Slovenia, May 28 – June 2, 2017. (in print)

[2] Keet, C.M., Lawrynowicz, A. Test-Driven Development of Ontologies. In: Proceedings of the 13th Extended Semantic Web Conference (ESWC’16). Springer LNCS 9678, 642-657. 29 May – 2 June, 2016, Crete, Greece.