
A few notes on a successful ESWC’12 and OWLED’12

Slightly later than near-realtime due to flight delays, here are a few notes on the 9th Extended Semantic Web Conference ESWC’12 and OWL: Experiences and Directions OWLED’12, which I attended about two weeks ago in Crete, Greece.

ESWC’12

ESWC’12 was as selective as previous years, with, on average, a 25% acceptance rate. The proceedings are published by Springer; where applicable, I’ve linked the freely available versions in the references below. There’s also metadata and a list of award winners.

Main background picture of the ESWC’12 conference, with Cretan hills

Keynotes

The keynotes have been put on the video lectures website; below follows a brief impression.

Alon Halevy, head of structured data at Google, gave his keynote the morning after the social dinner (but the conference hall was full nevertheless). He entertains the perspective of Knowledge Representation and the Semantic Web as being “databases on steroids”. The talk’s topics were Google Fusion Tables, with lightweight semantics and intended as “data management for the 99%”, and WebTables, which is about searching for data tables on the Web, with the goal of having an easy-to-use database system that is integrated with the Web. The work on WebTables was akin to a very large-scale attempt at bottom-up development of lightweight conceptual data models and ontologies. They crawled the Web for raw tables (14 billion), of which an estimated 154 million can pass for real relations (relations from the database viewpoint, with structured data, not an HTML table used for the layout of a page), which then ended up as 2.5 million schemas as recovered table/relation semantics. And then there’s Halevy’s enthusiasm about coffee.

Aleksander Kolcz from Twitter went over a few problems they are trying to solve at Twitter, such as tweet relevance, who to follow, content recommendation, language, anti-spam, and user interest modeling. As a small tidbit of data: there are 140 million users, 340 million tweets/day, and 2.3 billion search queries/day (i.e., about 26K/sec.). Apparently, when one has very large amounts of data, simple models work “remarkably well” and ensembles of classifiers perform better in accuracy.

Abraham Bernstein’s keynote was about getting our act together in the semantic web research area and promoting the “garbage can theory” that was introduced by Cohen, March and Olsen in 1972: some ideas, theories, and tools are ‘thrown away’ into the garbage, where they can meet others and combine, so that something beautiful can come of it after all (this is my simplistic, shorthand version of it).

Unfortunately, I missed the pre-conference keynote by Julius van der Laar because OWLED was still ongoing. I’ve heard it was a good and interesting one about the (sneaky) social media strategies the Obama campaign used in the 2008 presidential election.

Papers

There were several tracks that ran in parallel, hence attendance was necessarily limited due to those logistic constraints. I’ve attended the ontologies, reasoning, semantic data management, digital libraries and cultural heritage, and in use sessions. The following pointers are based on my attendance of the presentations and partial reading of the papers.

Ontologies track. Yves Raimond from the BBC presented a query-driven evaluation framework for ontologies, defining their way of ‘good’ with respect to the task and data, and applied it to the music ontology (online slides), noting some room for improvements. The paper also has a neat brief overview of techniques for ontology evaluation [1]. I presented the paper co-authored with Francis Fernandez and Annette Morales on mereotopology and the OntoPartS tool that helps modellers to represent part-whole relations [2], which I introduced in an earlier post. OntoPartS was also presented at the demo session [3], which generated quite some interest among logicians and practitioners alike. Besides my ‘toy ontology’ examples to demonstrate the tool’s functionality, Martin Hepp had brought his GoodRelations ontology for e-commerce, which I thus used instead to illustrate adding part-whole relations to a real ontology. The demo session ended officially at 9pm, but it was after 10pm before I packed up my tablet.

Semantic data management track. Craig Knoblock and co-authors developed a system to link data to ontologies and preserve the linking in a so-called (logic-based) “source model” that is computed semi-automatically by taking as input the data, an ontology, some learned semantic types, and a refinement step by the user in a nice GUI [4]. This was evaluated with a set of bio-informatics resources, such as UniProt. The presentation by Lorena Etcheverry was a bit long on the intro, but the idea is nice: enhancing OLAP analysis with ‘good enough’ temporary cubes generated from web sources, the introduction of a new vocabulary, Open Cubes, for the specification and publication of multidimensional cubes on the Semantic Web (which, unfortunately, the authors still have not shared online), and an algorithm for creating the SPARQL 1.1 query for rollup [5].

In use track. Michel Dumontier demonstrated an extension to the HyQue hypothesis formulator and evaluator, using rule sets in the SPARQL Inferencing Notation (SPIN) so that users can trace their hypothesis evaluation [6]. Stefan Scheglmann presented a paper on their efforts to provide “programming access” to ontologies, with an accompanying tool, OntoMDE, a model-driven engineering toolkit (which, however, does not seem to be available online, although a link was shown in the presentation, and I jotted down something on Eclipse plugins) [7]. StorySpace was put in the Digital Libraries and cultural heritage track, but could just as well have been in in-use: it is an environment for constructing and navigating stories, plots, and narratives, guided by the newly introduced curate ontology [8]. We’ll have to look at all that in more detail in the context of our IKMS development [9].

OWLED’12

The proceedings of OWLED’12 are available on CEUR-WS. Over 30 papers were submitted, so the workshop ended up being somewhat selective compared to previous years. There were 18 paper presentations, a keynote, and two tutorials. The following is, again, a selection (mainly due to my time constraints in reading the papers and typing up something).

Mariano Rodriguez presented the ontopQuest system [10] for Ontology-Based Data Access, providing SPARQL query answering with OWL 2 QL/RDFS entailments.  It works with the so-called “classic ABox mode” with an internal relational database and in “virtual ABox mode”, and, unlike, say, QuOnto, it embeds most of the TBox semantics into the database by availing of a (also recently developed) semantic indexing technique. (Hopefully that’ll help my ontologies & knowledge bases students to answer the OBDA questions better next time, who ought to have read at least David Toman’s slides on the principal approaches to realize OBDA before the test.) Staying with reasoning, Dmitry Tsarkov presented the idea of using metareasoning that takes into account both the features of current reasoners and modularisation to come up with the ‘best’ reasoning strategy to answer a query over only that part of the ontology that is relevant for the query [11].

An extension to the OWLGrEd tool for modeling OWL ontologies through a UML-like interface was presented: the developers have added a ‘splitter’ to enable a user to decide which axioms to close (using the OWL + Integrity Constraints), then to send the serialization to the reasoner and display the inferences [12]. Pity that it works only with the commercial RDF database Stardog by Clark & Parsia. Bijan Parsia  presented—among other things—a paper on automatically generating analogy questions, which are widely used in multiple choice questions, and determining somehow their difficulty. The automated generation was facilitated by an ontology, and the initial results are promising [13]. I presented the paper on OWL requirements for indigenous knowledge management systems [9], about which I blogged earlier, as one of my co-authors, Ronell Alberts, was already presenting a paper based on her recently completed MSc thesis [14].

One of the tutorials was about modularity, which was presented by Chiara del Vescovo and Dmitry Tsarkov from Manchester University (see their modularity website for more info). The tutorial presented an overview of where modularity is useful, and how. Some of the reasons to modularise are to facilitate the explanation services, to perform incremental reasoning, semantic diff, and hotspot detection (= splitting an ontology into the simple and the complex part). That is, it presented a viewpoint on modularity as a possible solution for the issues of (and the need for) scalability and performance of automated reasoning. Modularity and modularization during modeling and to reduce the so-called cognitive overload (i.e., involving some, or even driven by, subject domain semantics) was here (and is in most other DL-oriented outlets) apparently entirely outside the scope, which is a missed opportunity (more about that another time).

Typical tourist picture of the conference hotel (the view from my room wasn’t that great, but with the busy schedule, that didn’t matter anyway)

Aside from the stimulating papers and keynotes, and ensuing conversations with fellow researchers, it was great to meet people again and meet new people, and we had a lot of fun socialising. Now back to work so as to have a shot at next year’s installment of ESWC in Montpellier, France (which is close to a village where I used to go on holiday for some 8 years, many years ago).

References

[1] Raimond, Y., Sandler, M. Evaluation of the music ontology framework. ESWC’12, Springer LNCS vol 7295, 255-269.

[2] Keet, C.M., Fernandez-Reyes, F.C., Morales-Gonzalez, A. Representing mereotopological relations in OWL ontologies with OntoPartS. In: Proceedings of the 9th Extended Semantic Web Conference (ESWC’12), 29-31 May 2012, Heraklion, Crete, Greece. Springer, LNCS 7295, 240-254.

[3] Morales-Gonzalez, A., Fernandez-Reyes, F.C., Keet, C.M. OntoPartS: a tool to select part-whole relations in OWL ontologies. 9th Extended Semantic Web Conference (ESWC’12), 29-31 May 2012, Heraklion, Crete, Greece. Demo with paper.

[4] Knoblock et al. Semi-automatically mapping structured sources into the semantic web. ESWC’12, Springer LNCS vol 7295, 375-390

[5] Etcheverry, L., Vaisman, A. A. Enhancing OLAP analysis with web cubes. ESWC’12, Springer LNCS vol 7295, 467-483.

[6] Callahan, A., Dumontier, M. Evaluating scientific hypotheses using the SPARQL Inferencing Notation. ESWC’12, Springer LNCS vol 7295, 647-658.

[7] Scheglmann, S. Scherp, A, Staab, S. Declarative Representation of Programming Access to Ontologies. ESWC’12, Springer LNCS vol 7295, 659-673.

[8] Mulholland, P., Wolff, A., Collins, T. Curate and StorySpace: An ontology and Web-based environment for describing curatorial narrative. ESWC’12, Springer LNCS vol 7295, 748-762.

[9] Alberts, R., Fogwill, T., Keet, C.M. Several Required OWL Features for Indigenous Knowledge Management Systems. 7th Workshop on OWL: Experiences and Directions (OWLED 2012).  Klinov, P. and Horridge, M. (Eds.). 27-28 May, Heraklion, Crete, Greece. CEUR-WS Vol. 849.

[10] Rodriguez-Muro, M., Calvanese, D. Quest, an OWL 2 QL reasoner for ontology-based data access.  OWLED’12. CEUR-WS Vol. 849.

[11] Dmitry Tsarkov and Ignazio Palmisano, Divide et Impera: Metareasoning for Large Ontologies. OWLED’12. CEUR-WS Vol. 849.

[12] Kārlis Čerāns, Guntis Barzdins, Renārs Liepiņš, Jūlija Ovčiņnikova, Sergejs Rikačovs and Arturs Sprogis, Graphical Schema Editing for Stardog OWL/RDF Databases using OWLGrEd/S. OWLED’12. CEUR-WS Vol. 849.

[13] Tahani Alsubait, Bijan Parsia and Uli Sattler, Mining Ontologies for Analogy Questions: A Similarity-based Approach. OWLED’12. CEUR-WS Vol. 849.

[14] Ronell Alberts and Enrico Franconi, An integrated method using conceptual modelling to generate an ontology-based query mechanism. OWLED’12. CEUR-WS Vol. 849.

Part-whole relations, mereotopology and the OntoPartS tool

Part-whole relations are considered essential in knowledge representation and reasoning and, more practically, in ontology development and conceptual data modelling, especially in the subject domains of biology, medicine, geographic information systems, and manufacturing. In contrast to Ontology that sticks to one type of part-of, the modellers and subject domain experts have come up with a plethora of part-whole relations, some of which are considered real parthood relations and others only meronymic (or: due to imprecise natural language use). For instance, the Foundational Model of Anatomy has 8 basic locative part-whole relations [1], GALEN has come up with 26 part-whole relations [2], and in cognitive science and conceptual data modelling, it hovers around about 6 types [3,4]. They have been structured in a taxonomy of part-whole relations that makes a distinction between mereology and meronomy, transitivity and in- or non-transitivity, and the domain and range of the relationship [5], and some initial usage guidelines were proposed in [6].

But that’s not enough for the complex subject domains and demands on the representation and reasoning over the ontologies. This holds in particular when one has to represent that some things are contained in or located in something else. For instance, the way Paris and France relate is somehow different from how the euro coin and your wallet relate to each other, the latter being an example of (spatial) containment but not of structural parthood, whereas in other cases, the spatial containment of regions of space and the structural parthood of the objects occupying those regions do coincide, e.g., your heart in your body. Or consider representing that Alto Adige/Südtirol is a border province of Italy (bordering Austria), where we have to handle both the notion of administrative entities and connecting geographical regions. That is, handling regions and ‘things’ that occupy those regions (mereotopology).

Being more precise about how the things relate provides nice inferences. Take, e.g., NTPLI as ‘non-tangential proper located in’ (a part is located in the whole but not at the boundary of it) and $EnclosedCountry \equiv Country \sqcap \exists NTPLI.Country$. With the instances $NTPLI(Lesotho, South\ Africa)$, $Country(Lesotho)$, and $Country(South\ Africa)$ in our knowledge base, a reasoner correctly deduces $EnclosedCountry(Lesotho)$, whereas with a mere ‘part-of’, we would not have been able to obtain this result.
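The Lesotho inference can be mimicked with a few lines of Python. This is only a sketch that evaluates the class definition over the asserted facts; it is not a DL reasoner (no open-world semantics), and the names `countries`, `ntpli`, and `enclosed_countries` are made up for the illustration.

```python
# Toy knowledge base: Country(...) class assertions and NTPLI(...) role assertions.
countries = {"Lesotho", "South Africa"}
ntpli = {("Lesotho", "South Africa")}  # NTPLI(Lesotho, South Africa)


def enclosed_countries(countries, ntpli):
    """Individuals that satisfy Country AND (exists NTPLI.Country)."""
    return {x for (x, y) in ntpli if x in countries and y in countries}


print(enclosed_countries(countries, ntpli))
```

With a generic part-of relation in place of NTPLI, the membership test would look the same, but the definition would no longer say anything about the boundary, which is exactly the information that makes a country enclosed.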

Besides these examples, there are actual system requirements for, among others, annotating and querying multimedia documents and cartographic maps. Consider annotating a photo of a beach where the area of the photo that depicts the sand touches the area that depicts the seawater: together with the knowledge that Varadero is a tangential proper part of Cuba, the semantically enhanced system can infer possible locations where the photo has been taken or, vice versa, it can propose that the photo may depict a beach scene.

But how to cater for such things?

Let me summarise the three main basic problems that have to be resolved first:

  1. There is a lack of overview of the plethora of part-whole relations, which include real parthood (mereology), parts with their locations (mereotopology), and other part-whole relations (from meronymy);
  2. It is a challenge to figure out which one to use when;
  3. The representation and reasoning consequences are underspecified when one has to put up with less expressive languages for which technological infrastructure exists.

We propose to solve that in the following way, which is described in detail in [7] that recently got accepted at the 9th Extended Semantic Web Conference (ESWC’12).

The short answer for the reader who is not interested in all the theory, design, and evaluation, but just wants to model quickly: the OntoPartS tool guides you to choose the most appropriate relation and saves the selection into your OWL file.

Now for a slightly longer answer. First, we extend the taxonomy of part-whole relations of [5] with the novel addition of a taxonomy of formally defined mereotopological relations, which is driven by the KGEMT mereotopological theory of Varzi [8], resulting in a taxonomy of 23 part-whole relations (mereological, mereotopological, and meronymic ones), therewith ensuring a solid ontological and logic-based foundation.

Second, some things have to be simplified from the KGEMT theory to make it implementable in OWL, and we describe the design rationale and trade-offs so that OntoPartS can load OWL/OWL2-formalised ontologies, and, if desired, modify the OWL file with the chosen relation. Which OWL species is best suited obviously depends on your individual requirements, but from a representation & reasoning and mereotopology viewpoint, OWL 2 DL and OWL 2 RL seem to fit better than the other ones. (Note: there are papers on DL and representing spatial relations and on DL and parthood, and alternative representation choices are discussed in the paper, yet, as far as we are aware, none deals with mereotopological relations in OWL or, more generally, in DL.)

Third, there is the ‘how to select’ from the 23 relations. To enable a quick selection of the appropriate relation, we avail of a simplified OWL-ized DOLCE ontology—well, just the taxonomy of categories—for the domain and range restrictions imposed on the part-whole relations and with that, we can let the user take shortcuts compared to a lengthy decision procedure. In this way, we reduced the selection procedure to 0-4 options based on just 2-3 inputs. All of this has been structured neatly in implementation-independent activity diagrams, and subsequently has been implemented; see also the demos, the tool, and the OWL version of the taxonomy of the 23 relations.
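To give a feel for how domain and range restrictions can prune the choice, here is a minimal sketch. Everything in it is an assumption made for illustration: the tiny category taxonomy, the four relations, and their restrictions are simplified stand-ins, not the data or the algorithm used in OntoPartS.

```python
# Hypothetical, heavily simplified category taxonomy: child -> parent.
PARENT = {
    "Endurant": "Particular",
    "Perdurant": "Particular",
    "PhysicalObject": "Endurant",
    "AmountOfMatter": "Endurant",
    "Process": "Perdurant",
}

# Illustrative (domain, range) restrictions; not the tool's actual data.
RELATIONS = {
    "structural-part-of": ("Endurant", "Endurant"),
    "involved-in": ("Perdurant", "Perdurant"),
    "participates-in": ("Endurant", "Perdurant"),
    "sub-quantity-of": ("AmountOfMatter", "AmountOfMatter"),
}


def is_a(category, ancestor):
    """True if category equals ancestor or lies below it in PARENT."""
    while category is not None:
        if category == ancestor:
            return True
        category = PARENT.get(category)
    return False


def candidate_relations(part_category, whole_category):
    """Relations whose domain and range admit the two given categories."""
    return sorted(
        name
        for name, (domain, rng) in RELATIONS.items()
        if is_a(part_category, domain) and is_a(whole_category, rng)
    )


print(candidate_relations("PhysicalObject", "Process"))
```

The user only supplies the categories of the part and the whole; the restrictions do the filtering, which is the kind of shortcut that keeps the number of remaining options small.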

Last, we have tested OntoPartS with modellers in controlled experiments and it was shown to improve efficiency and accuracy in modeling of part-whole relations.

As mentioned, further details can be found in [7], Representing mereotopological relations in OWL ontologies with OntoPartS, which I co-authored with Francis Fernández-Reyes, with the Instituto Superior Politécnico “José Antonio Echeverría” (CUJAE), and Annette Morales-González, with the Advanced Technologies Application Center (CENATAV), both located in Cuba (the example on semantic annotation of multimedia with spatial relations comes straight from the image processing research being done at CENATAV). A tidbit of non-scientific information: the first version of the OntoPartS tool was developed as part of the mini-project that Francis, Annette (and Alexis, who is into fish fulltime now) had chosen to carry out for the ontology engineering course I taught at the University of Havana in 2010 (mentioned earlier here and here). For the paper, we added some more theory, minor refinements to the tool, and a user evaluation with several CUJAE and UKZN students and a few FUB colleagues (thanks again for their cooperation and interest). We’ve started work on additional features, so if you have any particular request, drop me a line.

References

  1. Mejino, J.L.V., Agoncillo, A.V., Rickard, K.L., Rosse, C.: Representing complexity in part-whole relationships within the foundational model of anatomy. In: Proc. of the AMIA Fall Symposium. pp. 450–454 (2003)
  2. http://www.opengalen.org/tutorials/crm/tutorial9.html up to http://www.opengalen.org/tutorials/crm/tutorial16.html/.
  3. Winston, M., Chaffin, R., Herrmann, D.: A taxonomy of part-whole relations. Cognitive Science 11(4), 417–444 (1987)
  4. Odell, J.: Advanced Object-Oriented Analysis & Design using UML. Cambridge: Cambridge University Press (1998)
  5. Keet, C.M., Artale, A.: Representing and reasoning over a taxonomy of part-whole relations. Applied Ontology 3(1-2), 91–110 (2008)
  6. Keet, C.M.: Part-whole relations in object-role models. In: Proc. of ORM’06, OTM Workshops 2006. LNCS, vol. 4278, pp. 1116–1127. Springer (2006)
  7. Keet, C.M., Fernández Reyes, F.C., Morales-González, A.: Representing mereotopological relations in OWL ontologies with OntoPartS. In Simperl, et al., eds.: Proc. of ESWC’12. LNCS, Springer (2012) 27-31 May 2012, Heraklion, Greece.
  8. Varzi, A.: Handbook of Spatial Logics, chap. Spatial reasoning and ontology: parts, wholes, and locations, pp. 945–1038. Berlin Heidelberg: Springer Verlag (2007)

Lecture notes for the ontologies and knowledge bases course

The regular reader may recollect earlier posts about the ontology engineering courses I have taught at FUB, UH, UCI, Meraka, and UKZN. Each one had some sort of syllabus or series of blog posts with some introductory notes. I’ve put them together and extended them significantly now for the current installment of the Ontologies and Knowledge Bases Honours module (COMP718) at UKZN, and they are bound and printed into lecture notes for the enrolled students. These lecture notes are now online and I will add accompanying slides on the module’s webpage as we go along in the semester.

Given that the target audience is computer science students in their 4th year (honours), the notes are of an introductory nature. There are essentially three blocks: logic foundations, ontology engineering, and advanced topics. The logic foundations contain a recap of FOL, basics of Description Logics with ALC, all the DL-based OWL species, and some automated reasoning. The ontology engineering block covers top-down and bottom-up ontology development, and methods and methodologies, with top-down ontology development including mainly foundational ontologies and part-whole relations, and bottom-up the various approaches to extract knowledge from ‘legacy’ representations, such as from databases and thesauri. The advanced topics are balanced in two directions: one is toward ontology-based data access applications (i.e., an ontology-driven information system) and the other one has more theory with temporal ontologies.

Each chapter has a section with recommended/required reading and a set of exercises.

Unsurprisingly, the lecture notes have been written under time constraints and therefore the level of relative completeness of sections varies slightly. Suggestions and corrections are welcome!

Nontransitive vs. intransitive direct part-whole relations in OWL

Confusing is-a with part-of is known to be a common mistake by novice ontology developers. Each time I taught the ontology engineering course, I had included a session of 1-2 hours to explain some basic aspects of part-whole relations and, lo and behold, none of the participants made that mistake in the labs or mini-projects! One awkward thing did pop up there and at other occasions, though, which had to do with modelling direct parthood, which does not go well at the moment, to say the least, for a plethora of reasons. Inclusion of direct parthood is not without philosophical quarrels, and the more I think of it, the more I dislike the relation, but somehow the issue appears often in the context of part-whole relations in ontologies. The observed underlying modelling issue, representing intransitivity versus nontransitivity, holds for any OWL object property anyway, so I will proceed with the general case with an example about giraffes.

Preliminaries

First of all, to clarify terms in the post’s title: INtransitive means that for all x, y, z, if Rxy and Ryz then Rxz does not hold; formally, $\forall x, y, z\, (R(x,y) \land R(y,z) \rightarrow \neg R(x,z))$, and an option to state this in a Description Logic is to use role chaining: $R \circ R \sqsubseteq \neg R$. NONtransitive means that we cannot say either way if the property is transitive or intransitive, i.e., in some cases it may be transitive but not in others. Direct parthood is to be understood as follows: if some part x is a direct part of a y, then there is no other object z such that x is a part of z and z is a part of y; formally, $\forall x,y\, (dpo(x, y) \equiv \neg \exists z\, (partof(x,z) \land partof(z,y)))$. Whether direct parthood is in- or non-transitive is beside the point at this stage, so let us look now at what happens with it in an OWL ontology when one tries to model it one way or another.
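For a finite set of asserted pairs, the three notions can be told apart mechanically. The following is a small sketch under that assumption (a closed, finite relation, which is not how an OWL reasoner sees an object property); the function name `classify` and the example relations are invented for the illustration.

```python
from itertools import product


def classify(relation):
    """Classify a finite binary relation given as a set of (x, y) pairs.

    If the relation contains no chain R(x, y), R(y, z) at all, both
    definitions hold vacuously and "transitive" is returned.
    """
    # Every (x, z) reachable through a two-step chain x -> y -> z.
    chains = [
        (x, z)
        for (x, y1), (y2, z) in product(relation, repeat=2)
        if y1 == y2
    ]
    if all(pair in relation for pair in chains):
        return "transitive"
    if all(pair not in relation for pair in chains):
        return "intransitive"
    return "nontransitive"


ancestor_of = {("a", "b"), ("b", "c"), ("a", "c")}
parent_of = {("a", "b"), ("b", "c")}
mixed = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}

print(classify(ancestor_of), classify(parent_of), classify(mixed))
```

The third relation is the interesting one: one of its chains is closed and the others are not, so neither the transitive nor the intransitive definition applies.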

The OWL ontology and the reasoner

Given that I used the African Wildlife Ontology as a tutorial ontology earlier and the theme appeals to people, I will use it again here. Depending on what we do with the direct parthood relation in the ontology, Giraffe is, or is not, classified automatically as a subclass of Herbivore. Herbivore is a defined class, equivalent to, in Protégé 4.1 notation, (eats only plant) or (eats only (is-part-of some plant)), and Giraffe is a subclass of both Animal and eats only (leaf or Twig). Leaves are part of a twig, twigs of a branch, and branches of a tree that in turn is a subclass of plant. The is-part-of is, correctly according to mereology, included in the ontology as being transitive. Instead of all the is-part-of and is-proper-part-of between plant parts and plants in the AfricanWildlifeOntology1.owl, we model them using direct-part. AfricanWildlifeOntology4a.owl has direct-part as sister object property to is-part-of, AfricanWildlifeOntology4b.owl has it as sub-object property of is-part-of, and neither ontology has any “characteristics” (relational properties) checked for direct-part. Before running the reasoner to classify the taxonomy, what do you think will happen with our Giraffe in both cases?

In AfricanWildlifeOntology4a.owl, Giraffe is still a mere direct subclass of Animal, whereas with AfricanWildlifeOntology4b.owl, we do obtain the (desired) deduction that Giraffe is a Herbivore. That is, we obtain different results depending on where we put the uncharacterized direct-part object property in the RBox. Why is this so?

By not clicking the checkbox “transitive”, an object property is non-transitive, but not in-transitive. In fact, we cannot represent explicitly that an object property is intransitive in OWL (see OWL guide and related documents). If we put the object property at the top level (or, as in Protégé 4.1, as immediate subproperty of topObjectProperty), then we obtain the behaviour as if the property were intransitive (and therefore Giraffe is not classified as a subclass of Herbivore). However, the direct-part property is really nontransitive in the ontology. When direct-part is put as subproperty of is-part-of, then it inherits the transitivity characteristic from is-part-of and therefore Giraffe is classified as a Herbivore (because now leaf and Twig are part of plant thanks to the transitivity).
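The difference between the two ontologies can be replayed with a toy forward-chaining sketch. It is a deliberate simplification, not what a DL reasoner does: classes are treated as nodes in a graph, only the "eats only (is-part-of some plant)" half of the Herbivore definition is checked, and all names are made up for the illustration. In the 4b variant every direct-part link also counts as an is-part-of link, over which transitivity then applies; in the 4a variant the two properties are unrelated.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    while True:
        new = {(x, z) for (x, y1) in closure for (y2, z) in closure if y1 == y2}
        if new <= closure:
            return closure
        closure |= new


DIRECT_PART = {("Leaf", "Twig"), ("Twig", "Branch"), ("Branch", "Tree")}
PLANTS = {"Tree", "Plant"}  # Tree is a subclass of Plant
GIRAFFE_EATS_ONLY = {"Leaf", "Twig"}


def giraffe_is_herbivore(direct_part_under_is_part_of):
    """Check whether everything the giraffe eats is part of some plant."""
    # 4b: direct-part links are also is-part-of links (subproperty).
    # 4a: sister properties, so no is-part-of links follow from them.
    links = DIRECT_PART if direct_part_under_is_part_of else set()
    is_part_of = transitive_closure(links)
    part_of_some_plant = {x for (x, y) in is_part_of if y in PLANTS}
    return GIRAFFE_EATS_ONLY <= part_of_some_plant


print(giraffe_is_herbivore(False), giraffe_is_herbivore(True))
```

Only with the subproperty link do Leaf and Twig end up as part of a Tree, and hence of a plant, which is what the Herbivore definition needs.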

Obviously, it holds for any OWL/OWL2 object property that one cannot assert intransitivity explicitly, that an object property’s characteristics are inherited by its subproperties, and that this kind of behaviour of a nontransitive object property depends on where you place it in the RBox, whether you like it or not.

How to go forward?

Direct parthood is called isComponentOf in the componency ontology design pattern and is a subproperty of isPartOf. Its inverse is called haspart_directly in the W3C best practices document on Simple Part-Whole relations [1], and is a subproperty of the transitive haspart. The componency.owl notes that isComponentOf is “hasPart relation without transitivity”, the ODP page’s “intent” of the pattern is to “represent (non-transitively) that objects either are proper parts of other objects, or have proper parts”, and the W3C best practices document notes that, unlike mereological parthood, it is “not transitive”. Hence, if you include either one in your OWL ontology, you will not obtain the intended behaviour. Therefore, I do not recommend using either suggestion.

Setting aside the W3C’s best practices motivation for inclusion of haspart_directly—easier querying for immediate parts, but for the ontology purist this ought not to be the motivation for its inclusion—it is worth digging a little deeper into the semantics of the direct parthood. Maybe a modeller actually wants to represent collections with their members, like each Fleet has as direct parts more than one Ship, or constitution of objects, like clay is directly part of some vase? In both cases, however, we deal with meronymic part-whole relations, not mereological ones (see [2] and references therein); hence, they should not be subsumed by the mereological part-of relation anyway. They can be modelled as sister properties of the part-of relation and have the intended nontransitive behaviour as in, e.g., the pwrelations.owl ontology with a taxonomy of part-whole relations (that can be imported into the wildlife ontology).

Alternatively, there is always the option to choose a sufficiently expressive non-OWL language to represent the direct parthood and the rest of the subject domain and use one of the many first/second order theorem provers.

References

[1] Alan Rector and Chris Welty. Simple Part-Whole relations in OWL ontologies. W3C Editor’s draft, 11 August 2005.

[2] C. Maria Keet and Alessandro Artale. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, 2008, 3(1-2): 91-110.

72010 SemWebTech lecture 6: Parts and temporal aspects

The previous three lectures covered the core topics in ontology engineering. There are many ontology engineering topics that zoom in on one specific aspect of the whole endeavour, such as modularization, the semantic desktop, ontology integration, combining data mining and clustering with ontologies, and controlled natural language interfaces to OWL. In the next two lectures on Dec 1 and Dec 14, we will look at three such advanced topics in modelling and language and tool development, being the (ever recurring) issues with part-whole relations, temporalization and its workarounds, and languages and tools for dealing with vagueness and uncertainty.

Part-whole relations

On the one hand, there is a SemWeb best practices document about part-whole relations and further confusion by OWL developers [1, 2] that was mentioned in a previous lecture. On the other hand, part-whole relations are deemed essential by the most active adopters of ontologies (i.e., bio- and medical scientists), while their full potential is yet to be discovered by, among others, manufacturing. A few obvious examples are how to represent plant or animal anatomy, geographic information data, and components of devices. And then there is the need to reason over it. When we can deduce which part of the device is broken, then only that part has to be replaced instead of the whole it is part of (saving a company money). One may want to deduce that when I have an injury in my ankle, I have an injury in my limb, but not deduce that if you have an amputation of your toe, you also have an amputation of your foot that the toe is (well, was) part of. If a toddler swallowed a Lego brick, it is spatially contained in his stomach, but one does not deduce it is structurally part of his stomach (normally it will leave the body unchanged through the usual channel). This toddler-with-lego-brick gives a clue why, from an ontological perspective, equation 23 in [2] is incorrect.

To shed light on part-whole relations and sort out such modelling problems, we will look first at mereology (the Ontology take on part-whole relations), and to a lesser extent meronymy (from linguistics), and subsequently structure the different terms that are perceived to have something to do with part-whole relations into a taxonomy of part-whole relations [3]. This, in turn, is to be put to use, be it with manual or software-supported guidelines to choose the most appropriate part-whole relation for the problem, and subsequently to make sure that is indeed represented correctly in an ontology. The latter can be done by availing of the so-called RBox Reasoning Service [3]. All this will not solve each modelling problem of part-whole relations, but at least provide you with a sound basis.

Temporal knowledge representation and reasoning

Compared to part-whole relations, there are fewer loud and vocal requests for including a temporal dimension in OWL, even though it is needed. For instance, you can check the annotations in the OWL files of BFO and DOLCE (or, more conveniently, search for “time” in the pdf) where they mention temporality that cannot be represented in OWL. Other examples are SNOMED CT’s concepts like “Biopsy, planned” and “Concussion with loss of consciousness for less than one hour” (where the loss of consciousness can still be before or after the concussion), a business rule like ‘RentalCar must be returned before Deposit is reimbursed’, the symptom HairLoss during the treatment Chemotherapy, and Butterfly being a transformation of Caterpillar.

Unfortunately, there is no single (computational) solution to address all these examples at once. Thus far, it is a bit of a patchwork, with, among many aspects, Allen’s interval algebra (qualitative temporal relations, such as before, during, etc.), Linear Temporal Logic (LTL), Computation Tree Logic (CTL, with branching time), and a W3C Working Draft of a time ontology.
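To make the qualitative relations of Allen’s interval algebra a bit more tangible, here is a minimal Python sketch that computes which of the thirteen relations holds between two intervals. The function name `allen_relation`, the numeric encoding of intervals as (start, end) pairs, and the example numbers are assumptions made purely for illustration; none of this is tied to a particular OWL or DL tool.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end; an assumed encoding


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have start < end")
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only the two proper-overlap cases remain at this point.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Illustrative day numbers: hair loss occurs during chemotherapy,
# and the rental car is returned before the deposit is reimbursed.
print(allen_relation((20, 90), (10, 100)))
print(allen_relation((0, 5), (7, 8)))
```

Note that this only classifies intervals whose end points are already known; reasoning over incomplete qualitative knowledge (e.g., composing “before” with “during”) would need the algebra’s composition table, which is not shown here.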

If one assumes that recent advances in temporal Description Logics have the highest chance of making it into a temporal OWL (tOWL), although there are no proof-of-concept temporal DL modelling tools or reasoners yet, then the following is ‘on offer’. A very expressive (undecidable) DL language is DLRus (with the until and since operators), which has already been used for temporal conceptual data modelling [4] and for representing essential and immutable parts and wholes [5]. A much simpler language is TDL-Lite [6], which is a member of the DL-Lite family of DL languages, one of which is the basis for OWL 2 QL; but these first results are theoretical, hence no “lite tOWL” yet. It is already known that EL++ (the basis for OWL 2 EL) does not keep its nice computational properties when extended with LTL, and results for EL++ with CTL are not out yet. If you are really interested in the topic, you may want to have a look at a recent survey [7] or take a broader scope with any of the four chapters in [8] (which cover temporal KR&R, situation calculus, event calculus, and temporal action logics); several people at the KRDB Research Centre also work on temporal knowledge representation & reasoning. Depending on the remaining time during the lecture, more or less about time and temporal ontologies will be covered.

References

[1] I. Horrocks, O. Kutz, and U. Sattler. The Even More Irresistible SROIQ. In Proc. of the 10th International Conference of Knowledge Representation and Reasoning (KR-2006), Lake District UK, 2006.

[2] B. Cuenca Grau, I. Horrocks, B. Motik, B. Parsia, P. Patel-Schneider, and U. Sattler. OWL 2: The next step for OWL. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 6(4):309-322, 2008.

[3] Keet, C.M. and Artale, A. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, IOS Press, 2008, 3(1-2): 91-110.

[4] Alessandro Artale, Christine Parent, and Stefano Spaccapietra. Evolving objects in temporal information systems. Annals of Mathematics and Artificial Intelligence (AMAI), 50:5-38, 2007, Springer.

[5] Artale, A., Guarino, N., and Keet, C.M. Formalising temporal constraints on part-whole relations. 11th International Conference on Principles of Knowledge Representation and Reasoning (KR’08). Gerhard Brewka, Jerome Lang (Eds.) AAAI Press, pp 673-683. Sydney, Australia, September 16-19, 2008

[6] Alessandro Artale, Roman Kontchakov, Carsten Lutz, Frank Wolter and Michael Zakharyaschev. Temporalising Tractable Description Logics. Proc. of the 14th International Symposium on Temporal Representation and Reasoning (TIME-07), Alicante, June 2007.

[7] Carsten Lutz, Frank Wolter, and Michael Zakharyaschev. Temporal Description Logics: A Survey. In Proceedings of the Fifteenth International Symposium on Temporal Representation and Reasoning. IEEE Computer Society Press, 2008.

[8] Frank van Harmelen, Vladimir Lifschitz and Bruce Porter (Eds.). Handbook of Knowledge Representation. Elsevier, 2008, 1034p. (also available from the uni library)

Note: reference 3 is mandatory reading, 4 optional reading, 2 was mandatory and 1 recommended for an earlier lecture, and 5-8 are optional.

Lecture notes: lecture 6 – Parts and temporal issues

Course webpage

Visiting DERI in sunny Galway

Believe it or not, the weather has been dry and sunny in Galway for 5 days in a row already; although I did not come to Ireland for the good weather, it is a nice bonus. One of the reasons I am in Ireland is to visit the Digital Enterprise Research Institute (DERI) in Galway, which is the largest Semantic Web-oriented research group in Ireland with about 120 employees.

I am hosted by Paul Buitelaar’s NLP unit, looking into options to improve NLP implementations with ontologies, as DERI puts somewhat more emphasis on validation with applications than Bolzano does. In this context I gave a seminar about representing and reasoning over a taxonomy of part-whole relations, which is based on the Applied Ontology paper with the same title [1], but where the slides focus on the motivations from a linguistics perspective. It led some attendees to believe linguistics was the core motivation for disambiguating different types of part-whole relations. However, correct modelling of part-whole relations probably gets more attention in conceptual data modelling (primarily, UML), and it receives lots of attention in attempts to address the demands put forward by bio(medical) ontologists to (i) have a language with which one can represent all properties of parthood relations in ontology languages, which we still cannot do in OWL, and (ii) distinguish between part-whole notions such as (spatial) containment, structural parthood, and membership of a collective.

An orthogonal dimension to the types of part-whole relations is formed by the notions of essential and immutable parts and wholes, which can be addressed by resorting to a temporalisation of relationships [2]. If one wanted to ‘translate’ that to any usage in NLP, then one could deal adequately, with a formal and ontological foundation, with linguistic expressions that have a wider range of verb tenses, like “researcherAbc will become a member of researchGroup123” (a scheduled meronymic part-whole relation) and “heart#123 has been transplanted from patientAbc” (a so-called disabled relation where the heart used to be a structural part of that patient). But this is still future work.

Other topics have come up, and are still coming up, as well, such as a possible use case with roles and rules with Axel Polleres, making it a stimulating and enjoyable visit.

References

[1] Keet, C.M. and Artale, A. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, 2008, 3(1-2): 91-110.

[2] Artale, A., Guarino, N., and Keet, C.M. Formalising temporal constraints on part-whole relations. 11th International Conference on Principles of Knowledge Representation and Reasoning (KR’08). Gerhard Brewka, Jerome Lang (Eds.) AAAI Press, pp 673-683. Sydney, Australia, September 16-19, 2008.

New book on innovations in information systems modeling

To give my bias upfront: my first book chapter is released today, in Innovations in Information Systems Modeling: Methods and Best Practices (part of the Advances in Database Research Book Series), which is edited by Terry Halpin, John Krogstie, and Erik Proper. To lazily copy the short description, the book has as scope (see title information sheet):

Modeling is used across a number of tasks in connection to information systems, but it is rare to see and easily compare all the uses of diagrammatical models as knowledge representation in one place, highlighting both commonalities and differences between different kinds of modeling.

Innovations in Information Systems Modeling: Methods and Best Practices provides up-to-date coverage of central topics in information systems modeling and architectures by leading researchers in the field. With chapters presented by top researchers from countries around the globe, this book provides a truly international perspective on the latest developments in information systems modeling, methods, and best practices.

The book has 15 chapters divided into four sections, being (I) language issues and improvements, (II) modelling approaches, (III) frameworks, architectures, and applications, and (IV) selected readings. The book chapters, whose abstracts are online here, range from refinements on subtyping, representing part-whole relations, and adapting ORM for representing application ontologies, to methodologies for enterprise and active knowledge modelling, to an ontological framework for method engineering and designing web information systems. The selected readings section deals with, among others, a formal agent-based approach for the modeling and verification of intelligent information systems and metamodelling in relation to software quality.

The chapter that I co-authored with Alessandro Artale is called “Essential, Mandatory, and Shared Parts in Conceptual Data Models” [1], which zooms in on formally representing the life cycle semantics of part-whole relations in conceptual data models such as those represented with ER, ORM and UML. We do this by using the temporal modality and some new fancy extensions to ERvt—a temporal EER based on the description logic language DLRus—to cover things such as essential parts, temporally suspended relations, and shareability options such as sequentially versus concurrently being part of some whole. To aid the modeler in applying it during the conceptual analysis stage, we also provide a set of closed questions and decision diagrams to find the appropriate life cycle.

A disadvantage of publishing with IGI is that they don’t accept latex files, but the poor lad from the typesetting office was patient and did his best to make something presentable out of it in MS Word (ok, I wasted quite some time on it, too). I don’t have a soft copy of the final layout version, but if you would like to have a latex-ed preprint, feel free to drop me an email. Alternatively, to gain access to all the chapters: the early-bird price (until Feb. 1, 2009) knocks $15 off the full price of the hardcover.

[1] Alessandro Artale, and C. Maria Keet. Essential, Mandatory, and Shared Parts in Conceptual Data Models (chapter 2). In: Innovations in Information Systems Modeling: Methods and Best Practices, Terry Halpin, John Krogstie, and Erik Proper (Eds.). IGI Global, 2008, pp 17-52. ISBN: 978-1-60566-278-7

BFO’s specific and generic dependence and generalising progress in essential and mandatory parts

The Basic Formal Ontology (BFO) version 1.1 has, compared to v1.0, the additions SpecificallyDependentContinuant (SDC) and GenericallyDependentContinuant (GDC); at least on 30 June when I downloaded it. They are defined as follows (emphasis added):
SDC = “A continuant [snap:Continuant] that inheres in or is borne by other entities. Every instance of A requires some specific instance of B which must always be the same”. To note: it subsumes Quality and RealizableEntity.
GDC = “A continuant [snap:Continuant] that is dependent on one or other independent continuant [snap:IndependentContinuant] bearers. For every instance of A requires some instance of (an independent continuant [snap:IndependentContinuant] type) B but which instance of B serves can change from time to time”.
Setting aside the lack of similarity in the formulation of the definitions, difference in constraints on the participating entities, awkward English, and absence of full formal definitions (neither in the text nor in the OWL file), the interesting bit I will zoom in on now is the mandatoryness with “same” [at all times] and “some…from time to time”. This sounds just like our work on essential versus plain mandatory parts and wholes, but then holding for an arbitrary relation that relates [in]dependent continuants as opposed to limiting it to part-whole relations. For more details, see also the previous post on essential and mandatory parts or the DL’08 paper with technical details, and the extension that specifically deals with specific and generic dependence [1], constrained to part-whole relations due to the scope of the paper.
Put differently, one needs to represent the life cycle semantics of the participating entities to be able to distinguish between “some instance, be it x1, or …, or xn, of type X must participate in a relational instance of a relation of type R” and “the same instance y of type Y must participate in a relational instance of a relation of type S”. More practically, and pattern-wise, given some object z that is an instance of Z (a continuant) and points in time t0, t1, …, tn, then we have for GDC
r1 : <x1, z> at t0 (at the start of active z)
r2 : <x2, z> at t1, where t1 > t0, and (x1 = x2 or <x1, z> or not <x1, z>)
r3 : <x3, z> at t2, where t2 > t1, etc., until z ceases to exist as instance of Z at tn
whereas for SDC
s1 : <y1, z1> at t0 (at the start of active z)
s1 : <y1, z1> at t1, where t1 > t0
s1 : <y1, z1> at t2, where t2 > t1, until z ceases to exist as instance of Z at tn.
s2 : <y2, z2> at t0 (at the start of active z)
s2 : <y2, z2> at t1, where t1 > t0
s2 : <y2, z2> at t2, where t2 > t1, until z ceases to exist as instance of Z at tn, and where it may be that z1 = z2 or y1 = y2 or both.
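The two patterns can be told apart mechanically on a recorded life cycle. The following is a minimal sketch, assuming that for each time point at which z is active we know the single bearer it is related to at that time; the function name `dependence_kind` and this one-bearer-per-time-point encoding are my simplifications for illustration, not something taken from BFO or from [1].

```python
from typing import Hashable, Mapping, Optional


def dependence_kind(bearers: Mapping[int, Optional[Hashable]]) -> str:
    """Classify a dependent entity z by the bearers it has over its lifetime.

    `bearers` maps each time point at which z is active to the bearer
    instance z is related to at that time (None if it has no bearer).
    """
    if not bearers:
        raise ValueError("z must be active at one or more time points")
    observed = [bearers[t] for t in sorted(bearers)]
    if any(b is None for b in observed):
        return "not dependent"  # the mandatory participation is violated
    if len(set(observed)) == 1:
        return "specific"       # always the same bearer (the SDC pattern)
    return "generic"            # always some bearer, but it changes (the GDC pattern)


# A file's contents moving between hard drives versus a quality of one bearer.
print(dependence_kind({0: "hd1", 1: "hd2", 2: "hd2"}))
print(dependence_kind({0: "y1", 1: "y1", 2: "y1"}))
```

A caveat on the design: a finite trace only shows what did happen, whereas the definitions say what must happen, so an entity that is generically dependent but never happened to change bearer would be reported as “specific” here.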

Taking the GDC example from the OWL file “a certain PDF file that exists in different and in several hard drives”, then we have an example where x1, …, xi are the distinct hard drives and z the PDF file (well, the elusive ‘contents’ of the file; clearly, there are different bits involved in the different hard drives).

The provided SDC examples, however, are somewhat more complicated: “the mass of a cloud, the smell of mozzarella, the liquidity of blood, the color of a tomato, the disposition of fish to decay, the role of being a doctor, the function of the heart in the body: to pump blood, to receive de-oxygenated and oxygenated blood”. Obviously, each cloud must have a mass, but generally not the same mass, and some mass, say, 10kg, is not necessarily always related to the same cloud, as clouds can grow and decrease in volume and, thus, in amount of mass. Given the example, we can only have the specific dependence if a ‘grown’ cloud (>10kg) counts as a different cloud (which is counterintuitive). Likewise, a certain liquidity of blood can change in value (due to drinking alcoholic beverages, for instance), although blood must have some value for liquidity (which may or may not be measured and which reaches 0 if it is dried-up blood in a healing wound). Vice versa, a certain liquidity value does not have to be related to the same blood for the duration of its existence. If, however, we consider instead, say, that the doctor takes a blood sample and measures the liquidity, and the result of that measurement is stored in a database or written on a paper-based health record, then that measured value ‘123’ is, for the duration of its existence, permanently related to ‘blood sample from patient p1 taken at time hh:mm at date dd-mm-yyyy’. But the latter reading is certainly a different case from just blood & liquidity. So, overall, this seems to contradict the SDC definition, or the examples don’t quite fit the definition.

In addition, we can have variations in the life cycles of the SDC and its bearer, which I don’t think are covered. Take the following figure, where there are two principal options: we fix the lifetime of the [independent]continuant and vary the SDC’s lifetime in (A), or fix the lifetime of the SDC and vary the lifetime of the [independent]continuant in (B).

In the case of the SDC definition, one would have to focus on (B): the [independent]continuant bearer might have one or more SDCs, but given an SDC, it must always be related to the Cx, which has the same or a longer lifetime than the SDC but never a shorter lifetime. In the case of our blood sample as bearer, then we have either C2 (if the sample continues after the record of the measurement is destroyed) or C4 (if the sample is destroyed together with the recording of the taken measurement).

So, it is either me who doesn’t get it, or there is room for improvement in the SDC/GDC definitions and/or examples. Does anyone have some clarifying thoughts on this?

—–

[1] Artale, A., Guarino, N., and Keet, C.M. Formalising temporal constraints on part-whole relations. 11th International Conference on Principles of Knowledge Representation and Reasoning (KR’08). Gerhard Brewka, Jerome Lang (Eds.) AAAI Press. Sydney, Australia, September 16-19, 2008.

Representing the difference between mandatory and essential parts and wholes

As mentioned earlier, there is more in the pipeline about part-whole relations than only the taxonomy of types of part-whole relations and the RBox Compatibility service [1]. There are a lot of issues in representing parts, wholes, and part-whole relations—in particular in bio(medical) ontologies and conceptual data models. One of them is the distinction between the plain mandatory constraint on the participation of the part (whole) in the part-whole relation and the stronger notion of essential part (whole). Informally, they deal with representing that “the part must be part of some whole” versus “the part must be part of the same whole”. A classical example is the difference between how your heart is part of your body versus how your brain is part of your body: your heart is replaceable and as long as you have some heart in your body you’ll be fine (well, continue to exist), whereas this is different for your brain[1]. This, again, is different from parts that a whole normally has (or is supposed to have), such as two eyes and two kidneys in case of a human: without the eyes, you still can live healthily without medical intervention, whereas without the kidneys, you will die if there’s no possibility for regular dialysis—hence, there is somehow a difference in modality on the participation of the parts and wholes in the part-whole relation.

To represent this sort of difference, one can resort to adding existence and necessity [2], but also assess it along the temporal dimension. To say that a part is essential to a whole means that, throughout its entire lifetime, the whole has exactly that part related through only that part-whole relation. This does not say anything about the part, though: that part might well have existed before the whole or continue to exist after the whole ceased to exist as a whole. Vice versa, if a whole is essential to the part, then that part cannot survive as is without that whole it is part of. Of course, this can be combined so that the part and the whole are mutually essential.

To represent this talk about “before”, “after”, and “during” in the setting of essential parts and wholes, one can add time t to the predicates, add an ordering over time points (chronons) t1, …, tn, and construct long formalizations to represent precisely the temporal constraints over the objects participating in the part-whole relation as well as over the part-whole relation itself. With an eye on potential for implementation, however, we chose to take the well-studied Description Logic language DLRus and its corresponding ERvt temporal conceptual data modeling language (see [3] for the latest comprehensive treatment of both) so as to capture succinctly the set of constraints for mandatory and essential parts and wholes. A rather dense, DL-readership-oriented paper that presents this solution has just been accepted for DL’08 [4]; I’ll try to render it in a brief digest format in the following paragraphs and give a few realistic examples afterward.

DLRus is an expressive temporal description logic with the Until and Since operators and can capture most constructs of the common conceptual data modeling languages, such as n-ary relations, cardinality restrictions, sub-relations, disjointness, covering, etc. ERvt is, roughly, EER with extra constructs for the time aspects, and for each ERvt conceptual model, there is an equi-satisfiable DLRus knowledge base.

In [3] you will find an explanation of the notion of status classes (well-known in temporal information systems), where some instance o can be a member of Scheduled-C, Active-C, Suspended-C, or Disabled-C, with Active-C denoting the usual class C in a conceptual model (or call it concept C in DL terminology, universal C in an OBO Foundry ontology, whichever). There is a range of implications to ensure correct behaviour of the status classes, such as that if an object is a member of Suspended-C then it first must have been a member of C. If we entertain ourselves with a particular instance o1 of the Papilionoidea, then when o1 is a member of Caterpillar, we might as well make o1 also a member of the Scheduled-Butterfly class and of the Disabled-Egg class (whether it is interesting to do so is another topic). We can do the same for relations; i.e., in [4] we extend ERvt by introducing the notion of status relations (from §3 onwards, including an informal description). Applying that to the partof relation, we get Scheduled-partof, Active-partof, Suspended-partof, and Disabled-partof. For the axioms that deal with essential participation, we first have that the partof relation cannot be suspended. We subsequently add axioms to say whether the lifetime of the part (or whole) starts before that of the whole (or part) or at the same time, and whether the part (whole) finishes at the same time as the whole (part) or can outlive it. Thus, there are eight combinations of the possible constraints, which are drawn in an illustrative figure as well (Fig.3): four for essential parts and four for essential wholes (theorems 1 and 2). That’s it.
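To give a feel for how the status classes (and, likewise, the status relations) constrain a life cycle, here is a small Python sketch that checks a status history. Only two rules come from the text above: a Suspended member must have been Active first, and an essential partof may never be Suspended. The rest of the transition table, the rule that a history cannot start as Disabled, and all names (`Status`, `ALLOWED`, `valid_history`) are my assumptions for illustration, not the axioms of [3] or [4].

```python
from enum import Enum
from typing import Sequence


class Status(Enum):
    SCHEDULED = "Scheduled"
    ACTIVE = "Active"
    SUSPENDED = "Suspended"
    DISABLED = "Disabled"


# Assumed transition table: which status may follow which at the next time point.
ALLOWED = {
    Status.SCHEDULED: {Status.SCHEDULED, Status.ACTIVE},
    Status.ACTIVE: {Status.ACTIVE, Status.SUSPENDED, Status.DISABLED},
    Status.SUSPENDED: {Status.SUSPENDED, Status.ACTIVE, Status.DISABLED},
    Status.DISABLED: {Status.DISABLED},
}


def valid_history(history: Sequence[Status], essential: bool = False) -> bool:
    """Check the status history of one object or one relational instance.

    With essential=True the history may never pass through SUSPENDED,
    mirroring the rule that an essential partof cannot be suspended.
    """
    if not history:
        return False
    # Suspended (from the text) and Disabled (assumed) need a prior Active phase.
    if history[0] in (Status.SUSPENDED, Status.DISABLED):
        return False
    if essential and Status.SUSPENDED in history:
        return False
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(history, history[1:]))


# A plain partof that is temporarily suspended is fine; an essential one is not.
trace = [Status.SCHEDULED, Status.ACTIVE, Status.SUSPENDED, Status.ACTIVE]
print(valid_history(trace), valid_history(trace, essential=True))
```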

With this addition of status relations, we can represent a lot more than only the distinction between mandatory and essential parts and wholes, and for quite realistic information, actually. For instance, we would like to say in a medical ontology or conceptual data model intended for development of a transplant database that all transplanted hearts must have been part of some other human. Put differently, and at the instance-level for illustrative purposes, such a constraint would enforce that if a heart h1 as member of Heart is partof p2 that is a member of class Human and this partof is a member of Active-partof, then there must be a human p1 that is a member of Disabled-Human (i.e., p1 has died, assuming that a person cannot live without having a heart) and there must be a relational instance (tuple) of partof that relates h1 and p1 and that is a member of Disabled-partof. For kidney transplants, we can amend this to say that p1 is a member of either Human or Disabled-Human (one could have donated just one kidney). For planning purposes, we can have donors in the transplant database whose organs are scheduled to become part of another human, i.e., the parts and wholes are both in their respective active classes, but a partof relation that relates the organ to a prospective recipient is a member of Scheduled-partof. Further, if we relax the standard essential part (whole) to less restrictive cases so that the objects and relations may become suspended some time during their lifespan, we can keep track of, say, some car engine e1 at the car mechanic who has removed it from the car c1 for maintenance purposes, but this e1 surely is supposed to be reinstalled in that car c1. And so forth.
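The transplanted-heart constraint can be read as an integrity check over a set of partof tuples. The sketch below is a hypothetical, instance-level rendering in Python: the `PartOf` record, the plain-string statuses, and the function `satisfies_heart_constraint` are all made up for illustration and are not the DLRus axioms from [4]. It is meant to be called only for hearts that are known to be transplanted.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class PartOf:
    part: str    # e.g. the heart "h1"
    whole: str   # e.g. the human "p2"
    status: str  # "Scheduled", "Active", "Suspended", or "Disabled"


def satisfies_heart_constraint(
    heart: str, tuples: Iterable[PartOf], human_status: Dict[str, str]
) -> bool:
    """Check that a transplanted heart has a Disabled-partof to a Disabled-Human."""
    tuples = list(tuples)
    recipients = {
        t.whole for t in tuples if t.part == heart and t.status == "Active"
    }
    if not recipients:
        return True  # the constraint only bites once the heart is in a recipient
    return any(
        t.part == heart
        and t.status == "Disabled"
        and t.whole not in recipients
        and human_status.get(t.whole) == "Disabled"
        for t in tuples
    )


db = [PartOf("h1", "p1", "Disabled"), PartOf("h1", "p2", "Active")]
print(satisfies_heart_constraint("h1", db, {"p1": "Disabled", "p2": "Active"}))
```

For the kidney variant mentioned above, one would relax the last condition so that the donor may be either Active or Disabled.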

Now, before running off to play with, e.g., the temporalised relations in the RO [5], note that some of those (like derivation), as well as other options, have already been addressed in [3] under the heading of so-called “evolution constraints”. And a caveat is that the full DLRus is undecidable[2], but there’s ongoing work on temporalising the well-behaved, computationally nice DL-Lite, and some subsets of DLRus are in ExpTime (see the last section of [3] for a summary).

[1] Keet, C.M., Artale, A. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, in print.
[2] Guizzardi, G. Ontological foundations for structural conceptual models. PhD Thesis, Telematica Institute, Twente University, Enschede, the Netherlands. 2005.
[3] Artale, A., Parent, C., Spaccapietra, S. Evolving objects in temporal information systems. Annals of Mathematics and Artificial Intelligence (AMAI), 2007, 50(1-2), 5-38.
[4] Artale, A., Keet, C.M. Essential and mandatory part-whole relations in conceptual data models. 21st International Workshop on Description Logics (DL’08 ), 13-16 May 2008, Dresden, Germany.
[5] Smith, B., Ceusters, W., Klagges, B., Koehler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A.L., Rosse, C. Relations in biomedical ontologies. Genome Biology, 2005, 6:R46.


[1] Other subtopics, such as optional parts, amount of parts, or parts that a whole should have are not further considered in [4].

[2] Who cares? At least now we know what we need to represent the distinction between mandatory and essential parts and wholes… as well as several other cases with part-wholes relations.
