Tutorial: OntoClean in OWL and with an OWL reasoner

The novelty surrounding all things OntoClean described here, is that we made a tutorial out of a scientific paper and used an example that is different from the (in?)famous manual example to clean up a ‘dirty’ taxonomy.

I’m assuming you have at least heard of OntoClean, which is an ontology-inspired method to examine the taxonomy of an ontology, which may be useful especially when the classes (/universals/concepts/..) have no or only a few properties or attributes declared. Based on that ontological information provided by the modeller, it will highlight violations of ontological principles in the taxonomy so that the ontologist may fix it. Its most recent overview is described in Guarino & Welty’s book chapter [1] and there are handouts and slides that show some of the intermediate steps; a 1.5-page summary is included as section 5.2.2 in my textbook [2].

Besides that paper-based description [1], there have been two attempts to get the reasoning with the meta-properties going in a way that can exploit existing technologies, which are OntOWLClean [3] and OntOWL2Clean [4]. As the names suggest, those existing and widely-used mechanisms are OWL and the DL-based reasoners for OWL, and the latter uses OWL2-specific language features (such as role chains) whereas the former does not. As it happened, some of my former students of the OE course wanted to try the OntoOWLClean approach by Welty, and, as they were with three students in the mini-project team, they also had to make their own example taxonomy, and compare the two approaches. It is their—Todii Mashoko, Siseko Neti, and Banele Matsebula’s—report and materials we—Zola Mahlaza and I—have brushed up and rearranged into a tutorial on OntoClean with OWL and a DL reasoner with accompanying OWL files for the main stages in the process.

There are the two input ontologies in OWL (the domain ontology to clean and the ‘ontoclean ontology’ that codes the rules in the TBox), an ontology for the stage after punning the taxonomy into the ABox, and one after having assigned the meta-properties, so that students can check they did the steps correctly with respect to the tutorial example and instructions. The first screenshot below shows a section of the ontology after pushing the taxonomy into the ABox and having assigned the meta-properties. The second screenshot illustrates a state after having selected, started, and run the reasoner and clicked on “explain” to obtain some justifications why the ontology is inconsistent.

section of the punned ontology where meta-properties have been assigned to each new individual.

A selection of the inconsistencies (due to violating OntoClean rules) with their respective explanations

Those explanations, like shown in the second screenshot, indicate which OntoClean rule has been violated. Among others, there’s the OntoClean rule that (1) classes that are dependent may have as subclasses only those classes that are also dependent. The ontology, however, has: i) Father is dependent, ii) Male is non-dependent, and iii) Father has as subclass Male. This subsumption violates rule (1). Indeed, not all males are fathers, so it would be, at least, the other way around (fathers are males), but it also could be remodelled in the ontology such that father is a role that a male can play.

Let us look at the second generated explanation, which is about violating another OntoClean rule: (2) sortal classes have only as subclasses classes that are also sortals. Now, the ontology has: i) Ball is a sortal, ii) Sphere is a non-sortal, and iii) Ball has as subclass Sphere. This violates rule (2). So, the hierarchy has to be updated such that Sphere is not subsumed by Ball anymore. (e.g., Ball has as shape some Sphere, though note that not all balls are spherical [notably, rugby balls are not]). More explanations of the rule violations are described in the tutorial.

Seeing that there are several possible options to change the taxonomy, there is no solution ontology. We considered creating one, but there are at least two ‘levels’ that will influence what a solution may look like: one could be based on a (minimum or not) number of changes with respect to the assigned meta-properties and another on re-examining the assigned meta-properties (and then restructuring the hierarchy). In fact, and unlike the original OntoClean example, there is at least one case where there is a meta-property assignment that would generally be considered to be wrong, even though it does show the application of the OntoClean rule correctly. How best to assign a meta-property, i.e., which one it should be, is not always easy, and the student is also encouraged to consider that aspect of the method. Some guidance on how to best modify the taxonomy—like Father is-a Male vs. Father inheres-in some Male—may be found in other sections and chapters of the textbook, among other resources.

 

p.s.: this tutorial is the result of one of the activities to improve on the OE open textbook, which are funded by the DOT4D project, as was the tool to render the axioms in DL in Protégé. A few more things are in the pipeline (TBC).

 

References

[1] Guarino, N. and Welty, C. A. (2009). An overview of OntoClean. In Staab, S. and Studer, R., editors, Handbook on Ontologies, International Handbooks on Information Systems, pages 201-220. Springer.

[2] Keet, C. M. (2018). An introduction to ontology engineering. College Publications, vol 20. 344p.

[3] Welty, C. A. (2006). OntOWLClean: Cleaning OWL ontologies with OWL. In Bennett, B. and Fellbaum, C., editors, Proceedings of the Fourth International Conference on Formal Ontology in Information Systems (FOIS 2006), Baltimore, Maryland, USA, November 9-11, 2006, volume 150 of Frontiers in Artificial Intelligence and Applications, pages 347-359. IOS Press.

[4] Glimm, B., Rudolph, S., Volker, J. (2010). Integrated metamodeling and diagnosis in OWL 2. In Peter F. Patel-Schneider, Yue Pan, Pascal Hitzler, Peter Mika, Lei Zhang, Je_ Z. Pan, Ian Horrocks, and Birte Glimm, editors, Proceedings of the 9th International Semantic Web Conference, LNCS vol 6496, pages 257-272. Springer.

Logical and ontological reasoning services?

The SubProS and ProChainS compatibility services for OWL ontologies to check for good and ‘safe’ OWL object property expression [5] may be considered ontological reasoning services by some, but according others, they are/ought to be plain logical reasoning services. I discussed this issue with Alessandro Artale back in 2007 when we came up with the RBox Compatibility service [1]—which, in the end, we called an ontological reasoning service—and it came up again during EKAW’12 and the Ontologies and Conceptual Modelling Workshop (OCM) in Pretoria in November. Moreover, in all three settings, the conversation was generalized to the following questions:

  1. Is there a difference between a logical and an ontological reasoning service (be that ‘onto’-logical or ‘extra’-logical)? If so,
    1. Why, and what, then, is an ontological reasoning service?
    2. Are there any that can serve at least as prototypical example of an ontological reasoning service?

There’s still no conclusive answer on either of the questions. So, I present here some data and arguments I had and that I’ve heard so far, and I invite you to have your say on the matter. I will first introduce a few notions, terms, tools, and implicit assumptions informally, then list the three positions and their arguments I am aware of.

Some aspects about standard, non-standard, and ontological reasoning services

Let me first introduce a few ideas informally. Within Description Logics and the Semantic Web, a distinction is made between so-called ‘standard’ and ‘non-standard’ reasoning services. The standard reasoning services—which most of the DL-based reasoners support—are subsumption reasoning, satisfiability, consistency of the knowledge base, instance checking, and instance retrieval (see, e.g., [2,3] for explanations). Non-standard reasoning services include, e.g., glass-box reasoning and computing the least common subsumer, they are typically designed with the aim to facilitate ontology development, and tend to have their own plugin or extension to an existing reasoner. What these standard and non-standard reasoners have in common, is that they all focus on the (subset of first order predicate logic) logical theory only.

Take, on the other hand, OntoClean [4], which assigns meta-properties (such as rigidity and unity) to classes, and then, according to some rules involving those meta-properties, computes the class taxonomy. Those meta-properties are borrowed from Ontology in philosophy and the rules do not use the standard way of computing subsumption (where every instance of the subclass is also an instance of its super class and, thus, practically, the subclass has more or features or has the same features but with more constrained values/ranges). Moreover, OntoClean helps to distinguish between alternative logical formalisations of some piece of knowledge so as to choose the one that is better with respect to the reality we want to represent; e.g., why it is better to have the class Apple that has as quality a color green, versus the option of a class GreenObject that has shape apple-shaped. This being the case, OntoClean may be considered an ontological reasoning service. My SubProS and ProChainS [5] put constraints on OWL object property expressions so as to have safe and good hierarchies of object properties and property chains, based on the same notion of class subsumption, but then applied to role inclusion axioms: the OWL object sub-property (relationship, DL role) must be more constrained than its super-property and the two reasoning services check if that holds. But some of the flawed object property expressions do not cause a logical inconsistency (merely an undesirable deduction), so one might argue that the compatibility services are ontological.

The arguments so far

The descriptions in the previous paragraph contain implicit assumptions about the logical vs ontological reasoning, which I will spell out here. They are a synthesis from mine as well as other people’s voiced opinions about it (the other people being, among others and in alphabetical order, Alessandro Artale, Arina Britz, Giovanni Casini, Enrico Franconi, Aldo Gangemi, Chiara Ghidini, Tommie Meyer, Valentina Presutti, and Michael Uschold). It goes without saying they are my renderings of the arguments, and sometimes I state the things a little more bluntly to make the point.

1. If it is not entailed by the (standard, DL/other logic) reasoning service, then it is something ontological.

Logic is not about the study of the truth, but about the relationship of the truth of one statement and that of another. Effectively, it doesn’t matter what terms you have in the theory’s vocabulary—be this simply A, B, C, etc. or an attempt to represent Apple, Banana, Citrus, etc. conformant to what those entities are in reality—as it uses truth assignments and the usual rules of inference. If you want some reasoning that helps making a distinction between a good and a bad formalisation of what you aim to represent (where both theories are consistent), then that’s not the logician’s business but instead is relegated to the domain of whatever it is that ontologists get excited about. A counter-argument raised to that was that the early logicians were, in fact, concerned with finding a way to formalize reality in the best way; hence, not only syntax and semantics of the logic language, but also the semantics/meaning of the subject domain. A practical counter-example is that both Glimm et al [6] and Welty [7] managed to ‘hack’ OntoClean into OWL and use standard DL reasoners for it to obtain de desired inferences, so, presumably, then even OntoClean cannot be considered an ontological reasoning service after all?

2. Something ‘meta’ like OntoClean can/might be considered really ontological, but SubProS and ProChainS are ‘extra-logical’ and can be embedded like the extra-logical understanding of class subsumption, so they are logical reasoning services (for it is the analogue to class subsumption but then for role inclusion axioms).

This argument has to do with the notion of ‘standard way’ versus ‘alternative approach’ to compute something and the idea of having borrowed something from Ontology recently versus from mathematics and Aristotle somewhat longer ago. (note: the notion of subsumption in computing was still discussed in the 1980s, where the debate got settled in what is now considered the established understanding of class subsumption.) We simply can apply the underlying principles for class-subclass to one for relationships (/object properties/roles). DL/OWL reasoners and the standard view assume that the role box/object property expressions are correct and merely used to compute the class taxonomy only. But why should I assume the role box is fine, even when I know this is not always the case? And why do I have to put up with a classification of some class elsewhere in the taxonomy (or be inconsistent) when the real mistake is in the role box, not the class expression? Differently, some distinction seems to have been drawn between ‘meta’ (second order?), ‘extra’ to indicate the assumptions built into the algorithms/procedures, and ‘other, regular’ like satisfiability checking that we have for all logical theories. Another argument raised was that the ‘meta’ stuff has to do with second order logics, for which there are no good (read: sound and complete) reasoners.

3. Essentially, everything is logical, and services like OntoClean, SubProS, ProChainS can be represented formally with some clearly, precisely, formally, defined inferencing rules, so then there is no ontological reasoning, but there are only logical reasoning services.

This argument made me think of the “logic is everywhere” mug I still have (a goodie from the ICCL 2005 summer school in Dresden). More seriously, though, this argument raises some old philosophical debates whether everything can indeed be formalized, and provided any logic is fine and computation doesn’t matter. Further, it conflates the distinction, if any, between plain logical entailment, the notion of undesirable deductions (e.g., that a CarChassis is-a Perdurant [some kind of a process]), and the modeling choices and preferences (recall the apple with a colour vs. green object that has an apple-shape). But maybe that conflation is fine and there is no real distinction (if so: why?).

In my paper [5] and in the two presentations of it, I had stressed that SubProS and ProChainS were ontological reasoning services, because before that, I had tried but failed to convince logicians of the Type-I position that there’s something useful to those compatibility services and that they ought to be computed (currently, they are mostly not computed by the standard reasoners). Type-II adherents were plentiful at EKAW’12 and some at the OCM workshop. I encountered the most vocal Type-III adherent (mathematician) at the OCM workshop. Then there were the indecisive ones and people who switched and/or became indecisive. At the moment of writing this, I still lean toward Type-II, but I’m open to better arguments.

References

[1] Keet, C.M., Artale, A.: Representing and reasoning over a taxonomy of part-whole relations. Applied Ontology, 2008, 3(1-2), 91–110.

[2] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider (Eds). The Description Logics Handbook. Cambridge University Press, 2009.

[3] Pascal Hitzler, Markus Kroetzsch, Sebastian Rudolph. Foundations of Semantic Web Technologies. Chapman & Hall/CRC, 2009,

[4] Guarino, N. and Welty, C. An Overview of OntoClean. In S. Staab, R. Studer (eds.), Handbook on Ontologies, Springer Verlag 2009, pp. 201-220.

[5] Keet, C.M. Detecting and Revising Flaws in OWL Object Property Expressions. Proc. of EKAW’12. Springer LNAI vol 7603, pp2 52-266.

[6] Birte Glimm, Sebastian Rudolph, and Johanna Volker. Integrated metamodeling and diagnosis in OWL 2. In Peter F. Patel-Schneider, Yue Pan, Pascal Hitzler, Peter Mika, Lei Zhang, Jeff Z. Pan, Ian Horrocks, and Birte Glimm, editors, Proceedings of the 9th International Semantic Web Conference, volume 6496 of LNCS, pages 257-272. Springer, November 2010.

[7] Chris Welty. OntOWLclean: cleaning OWL ontologies with OWL. In B. Bennet and C. Fellbaum, editors, Proceedings of Formal Ontologies in Information Systems (FOIS’06), pages 347-359. IOS Press, 2006.

Lecture notes for the ontologies and knowledge bases course

The regular reader may recollect earlier posts about the ontology engineering courses I have taught at FUB, UH, UCI, Meraka, and UKZN. Each one had some sort of syllabus or series of blog posts with some introductory notes. I’ve put them together and extended them significantly now for the current installment of the Ontologies and Knowledge Bases Honours module (COMP718) at UKZN, and they are bound and printed into lecture notes for the enrolled students. These lecture notes are now online and I will add accompanying slides on the module’s webpage as we go along in the semester.

Given that the target audience is computer science students in their 4th year (honours), the notes are of an introductory nature. There are essentially three blocks: logic foundations, ontology engineering, and advanced topics. The logic foundations contain a recap of FOL, basics of Description Logics with ALC, all the DL-based OWL species, and some automated reasoning. The ontology engineering block covers top-down and bottom-up ontology development, and methods and methodologies, with top-down ontology development including mainly foundational ontologies and part-whole relations, and bottom-up the various approaches to extract knowledge from ‘legacy’ representations, such as from databases and thesauri. The advanced topics are balanced in two directions: one is toward ontology-based data access applications (i.e., an ontology-drive information system) and the other one has more theory with temporal ontologies.

Each chapter has a section with recommended/required reading and a set of exercises.

Unsurprisingly, the lecture notes have been written under time constraints and therefore the level of relative completeness of sections varies slightly. Suggestions and corrections are welcome!

72010 SemWebTech lecture 5: Methods and Methodologies

The previous two lectures have given you a basic idea about the two principal approaches for starting developing an ontology—top-down and bottom-up—but they do not constitute an encompassing methodology to develop ontologies. In fact, there is no proper, up-to-date comprehensive methodology for ontology development like there is for conceptual model development (e.g., [1]) or ‘waterfall’ versus ‘agile’ software development methodologies. There are many methods and, among others, the W3C’s Semantic Web best practices, though, which to a greater or lesser extent can form part of a comprehensive ontology development methodology.

As a first step towards methodologies that gives a general scope, we will look at a range of parameters that affect ontology development in one way or another [2]. There are four influential factors to enhance the efficiency and effectiveness of developing ontologies, which have to do with the purpose(s) of the ontology; what to reuse from existing ontologies and ontology-like artifacts and how to reuse them; the types of approaches for bottom-up ontology development from other legacy sources; and the interaction with the choice of representation language and reasoning services.

Second, methods that helps the ontologist in certain tasks of the ontology engineering process include, but are not limited to, assisting the modelling itself, how to integrate ontologies, and supporting software tools. We will take a closer look at OntoClean [3] that contributes to modelling taxonomies. One might ask oneself: who cares, after all we have the reasoner to classify our taxonomy anyway, right? Indeed, but that works only if you have declared many properties for the classes, which is not always the case, and the reasoner sorts out the logical issues, but not the ontological issues. OntoClean uses several notions from philosophy, such as rigidity, identity criteria, and unity [4, 5] to provide modelling guidelines. For instance, that anti-rigid properties cannot subsume rigid properties; e.g., if we have, say, both Student and Person in our ontology, the former is subsumed by the latter. The lecture will go into some detail of OntoClean.

If, on the other hand, you do have a rich ontology and not mostly a bare taxonomy, ‘debugging’ by availing of an automated reasoner is useful in particular with larger ontologies and ontologies represented in an expressive ontology language. Such ‘debugging’ goes under terms like glass box reasoning [6], justification [7], explanation [8], and pinpointing errors. While they are useful topics, we will spend comparatively little time on it, because it requires some more knowledge of Description Logics and its (mostly tableaux-based) reasoning algorithms that will be introduced only in the 2nd semester (mainly intended for the EMCL students). Those techniques use the automated reasoner to at least locate modelling errors and explain in the most succinct way why this is so, instead of just returning a bunch of inconsistent classes; proposing possible fixes is yet a step further (one such reasoning service will be presented in lecture 6 on Dec. 1).

Aside from parameters, methods, and tools, there are only few methodologies, which are even coarse-grained: they do not (yet) contain all the permutations at each step, i.e. what and how to do each step, given the recent developments. A comparatively comprehensive one is Methontology [10], which has been applied to various subject domains (e.g., chemicals, legal domain [9,11]) since its development in the late 1990s. While some practicalities are superseded with new [12] and even newer languages and tools, some of the core aspects still hold. The five main steps are: specification, conceptualization (with intermediate representations, such as in text or diagrams, like with ORM [1] and pursued by the modelling wiki MOKI that was developed during the APOSDLE project for work-integrated learning), formalization, implementation, and maintenance. Then there are various supporting tasks, such as documentation and version control.

Last, but not least, there are many tools around that help you with one method or another. WebODE aims to support Methontology, the NeOn toolkit aims to support distributed development of ontologies, RacerPlus for sophisticated querying, Protégé-PROMPT for ontology integration (there are many other plug-ins for Protégé), SWOOGLE to search across ontologies, OntoClean with Protégé, and so on and so forth. For much longer listings of tools, see the list of semantic web development tools, the plethora of ontology reasoners and editors, and range of semantic wiki projects engines and features for collaborative ontology development. Finding the right tool to solve the problem at hand (if it exists) is a skill of its own and it is a necessary one to find a feasible solution to the problem at hand. From a technologies viewpoint, the more you know about the goals, features, strengths, and weaknesses of available tools (and have the creativity to develop new ones, if needed), the higher the likelihood you bring a potential solution of a problem to successful completion.

References

[1] Halpin, T., Morgan, T.: Information modeling and relational databases. 2nd edn. Morgan Kaufmann (2008)

[2] Keet, C.M. Ontology design parameters for aligning agri-informatics with the Semantic Web. 3rd International Conference on Metadata and Semantics (MTSR’09) — Special Track on Agriculture, Food & Environment, Oct 1-2 2009 Milan, Italy. F. Sartori, M.A. Sicilia, and N. Manouselis (Eds.), Springer CCIS 46, 239-244.

[3] Guarino, N. and Welty, C. An Overview of OntoClean. in S. Staab, R. Studer (eds.), Handbook on Ontologies, Springer Verlag 2004, pp. 151-172

[4] Guarino, N., Welty, C.: A formal ontology of properties. In: Dieng, R., Corby, O. (eds.) EKAW 2000. LNAI, vol. 1937, pp. 97–112. Springer, Heidelberg (2000)

[5] Guarino, N., Welty, C.: Identity, unity, and individuality: towards a formal toolkit for ontological analysis. In: Proc. of ECAI 2000. IOS Press, Amsterdam (2000)

[6] Parsia, B., Sirin, E., Kalyanpur, A. Debugging OWL ontologies. World Wide Web Conference (WWW 2005). May 10-14, 2005, Chiba, Japan.

[7] M. Horridge, B. Parsia, and U. Sattler. Laconic and Precise Justifications in OWL. In Proc. of the 7th International Semantic Web Conference (ISWC 2008), Vol. 5318 of LNCS, Springer, 2008.

[8] Alexander Borgida, Diego Calvanese, and Mariano Rodriguez-Muro. Explanation in the DL-Lite family of description logics. In Proc. of the 7th Int. Conf. on Ontologies, DataBases, and Applications of Semantics (ODBASE 2008), LNCS vol 5332, 1440-1457. Springer, 2008.

[9] Fernandez, M.; Gomez-Perez, A. Pazos, A.; Pazos, J. Building a Chemical Ontology using METHONTOLOGY and the Ontology Design Environment. IEEE Expert: Special Issue on Uses of Ontologies, January/February 1999, 37-46.

[10] Gomez-Perez, A.; Fernandez-Lopez, M.; Corcho, O. Ontological Engineering. Springer Verlag London Ltd. 2004.

[11] Oscar Corcho, Mariano Fernández-López, Asunción Gómez-Pérez, Angel López-Cima. Building legal ontologies with METHONTOLOGY and WebODE. Law and the Semantic Web 2005. Springer LNAI 3369, 142-157.

[12] Corcho, O., Fernandez-Lopez, M. and Gomez-Perez, A. (2003). Methodologies, tools and languages for building ontologies. Where is their meeting point?. Data & Knowledge Engineering 46(1): 41-64.

Note: references 2, 3, and 9 are mandatory reading, 6, 7, and 10 recommended, and 1, 4, 5, 8, 11, and 12 are optional.

Lecture notes: lecture 5 – Methodologies

Course webpage

Transformations, again

A while ago I wrote about refining RO’s transformation_of relation with additional constraints to improve the representation of the intended meaning of the relation and what kind of entities the relata should be. In the meantime, a flurry of emails about the topic passed the revue on the BFO-discuss and OBO-relations lists (without a tangible concrete outcome), and this week my paper about it was accepted for oral presentation at the AI*IA’09 conference in December and will be published in an LNAI volume. Weehee.

By ‘completing’ one piece of work, one always finds more open issues to tackle, which holds also for this topic regarding about, and beyond, the transformation_of (or similar) relation. More on both the ontology and logic is, and soon will be, in the pipeline.


Refining constraints on RO’s transformation_of relation

Things change—develop, mature morph—but not everything in the same way. Representing this knowledge in biomedical ontologies faces issues on three fronts: what the category of the participating objects are, which type of relations they involve, and where constraints should be added. Taking the transformation­_of relation from the OBO Foundry Relation Ontology [1] as example, one can dress up this relation with various constraints that deal with the temporal aspects of the intention of that relation, but in such a way that it is still possible to represent it in an a-temporal ontology language. This can be done by using some basic notions of the OntoClean approach [2] to ontology engineering and generalising the notion of status classes from formal temporal conceptual data modelling into a novel status property. Criteria are identified, formulated in 17 additional constraints, and assessed on applicability for representing transformations more accurately, thereby enabling developers of bio(medical) ontologies to represent and relate entities more precisely, such as monocyte & macrophage and healthy & unhealthy organs. You can find the details in [3].

Rest me to add one comment for BFO-groupies: note that I am using OntoClean, not DOLCE, so please do not waste yours and my time with [a / another] rant that I am not allowed to use DOLCE for the realist-based RO, because I do not. OntoClean and DOLCE are orthogonal (hence, so are OntoClean and BFO & RO) and OntoClean is independent of the debate on realist-versus-other ontological commitment.

References

[1] Smith, B., et al.: Relations in biomedical ontologies. Genome Biology, 6 (2005) R46.
[2] Guarino, N., Welty, C.: An overview of OntoClean. In Staab, S., Studer, R., eds.: Handbook on ontologies. Springer Verlag (2004) 151-159
[3] C. Maria Keet. Constraints for representing transforming entities in bio-ontologies. KRDB Research Centre Technical Report KRDB09-2, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy. April 22, 2009.