# Tutorial: OntoClean in OWL and with an OWL reasoner

What is new here is that we turned a scientific paper into a tutorial and used an example that differs from the (in?)famous manual example to clean up a ‘dirty’ taxonomy.

I’m assuming you have at least heard of OntoClean, an ontology-inspired method to examine the taxonomy of an ontology. It is especially useful when the classes (/universals/concepts/..) have no or only a few properties or attributes declared. Based on ontological information provided by the modeller, in the form of meta-properties assigned to each class, it highlights violations of ontological principles in the taxonomy so that the ontologist may fix them. Its most recent overview is the book chapter by Guarino & Welty [1], and there are handouts and slides that show some of the intermediate steps; a 1.5-page summary is included as section 5.2.2 in my textbook [2].

Besides that paper-based description [1], there have been two attempts to get the reasoning with the meta-properties going in a way that exploits existing technologies: OntOWLClean [3] and OntOWL2Clean [4]. As the names suggest, those existing and widely-used technologies are OWL and the DL-based reasoners for OWL; the latter approach uses OWL 2-specific language features (such as role chains) whereas the former does not. As it happened, some of my former students of the OE course wanted to try the OntOWLClean approach by Welty and, as there were three students in the mini-project team, they also had to make their own example taxonomy and compare the two approaches. It is the report and materials of Todii Mashoko, Siseko Neti, and Banele Matsebula that Zola Mahlaza and I have brushed up and rearranged into a tutorial on OntoClean with OWL and a DL reasoner, with accompanying OWL files for the main stages in the process.

There are the two input ontologies in OWL (the domain ontology to clean and the ‘ontoclean ontology’ that codes the rules in the TBox), an ontology for the stage after punning the taxonomy into the ABox, and one after having assigned the meta-properties, so that students can check they did the steps correctly with respect to the tutorial example and instructions. The first screenshot below shows a section of the ontology after pushing the taxonomy into the ABox and having assigned the meta-properties. The second screenshot illustrates a state after having selected, started, and run the reasoner and clicked on “explain” to obtain some justifications why the ontology is inconsistent.

First screenshot: section of the punned ontology where meta-properties have been assigned to each new individual.

Second screenshot: a selection of the inconsistencies (due to violating OntoClean rules) with their respective explanations.

Those explanations, such as the ones shown in the second screenshot, indicate which OntoClean rule has been violated. Among others, there’s the OntoClean rule that (1) classes that are dependent may have as subclasses only those classes that are also dependent. The ontology, however, has: i) Father is dependent, ii) Male is non-dependent, and iii) Father has as subclass Male. This subsumption violates rule (1). Indeed, not all males are fathers, so it should be, at least, the other way around (fathers are males), but it could also be remodelled in the ontology such that father is a role that a male can play.

Let us look at the second generated explanation, which is about violating another OntoClean rule: (2) sortal classes have as subclasses only classes that are also sortals. Now, the ontology has: i) Ball is a sortal, ii) Sphere is a non-sortal, and iii) Ball has as subclass Sphere. This violates rule (2). So, the hierarchy has to be updated such that Sphere is not subsumed by Ball anymore (e.g., Ball has as shape some Sphere, though note that not all balls are spherical; notably, rugby balls are not). More explanations of the rule violations are described in the tutorial.
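To make the two rules concrete outside of OWL, here is a minimal sketch in plain Python. It is not the OWL encoding used in the tutorial (there, the rules sit in the TBox of the ‘ontoclean ontology’ and the reasoner does the work); the function name `find_violations` and the dictionary layout are made up for this illustration, and only the four meta-property assignments mentioned above are included.

```python
# Subsumptions as (subclass, superclass) pairs, taken from the two examples above.
subclass_of = [("Male", "Father"), ("Sphere", "Ball")]

# Meta-property assignments mentioned in the text; anything else is left unassigned.
dependent = {"Father": True, "Male": False}
sortal = {"Ball": True, "Sphere": False}


def find_violations(subclass_of, dependent, sortal):
    """Return (rule, subclass, superclass) triples for each violated rule."""
    found = []
    for sub, sup in subclass_of:
        # Rule (1): a dependent class may only have dependent subclasses.
        if dependent.get(sup) is True and dependent.get(sub) is False:
            found.append((1, sub, sup))
        # Rule (2): a sortal class may only have sortal subclasses.
        if sortal.get(sup) is True and sortal.get(sub) is False:
            found.append((2, sub, sup))
    return found


violations = find_violations(subclass_of, dependent, sortal)
for rule, sub, sup in violations:
    print(f"rule ({rule}) violated: {sup} has as subclass {sub}")
```

Classes without an assigned meta-property are simply skipped here, which is a simplification; in the tutorial every punned individual gets its meta-properties assigned before the reasoner is run.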

Seeing that there are several possible options to change the taxonomy, there is no solution ontology. We considered creating one, but there are at least two ‘levels’ that will influence what a solution may look like: one could be based on a (minimum or not) number of changes with respect to the assigned meta-properties and another on re-examining the assigned meta-properties (and then restructuring the hierarchy). In fact, and unlike the original OntoClean example, there is at least one case where there is a meta-property assignment that would generally be considered to be wrong, even though it does show the application of the OntoClean rule correctly. How best to assign a meta-property, i.e., which one it should be, is not always easy, and the student is also encouraged to consider that aspect of the method. Some guidance on how to best modify the taxonomy—like Father is-a Male vs. Father inheres-in some Male—may be found in other sections and chapters of the textbook, among other resources.

p.s.: this tutorial is the result of one of the activities to improve on the OE open textbook, which are funded by the DOT4D project, as was the tool to render the axioms in DL in Protégé. A few more things are in the pipeline (TBC).

References

[1] Guarino, N. and Welty, C. A. (2009). An overview of OntoClean. In Staab, S. and Studer, R., editors, Handbook on Ontologies, International Handbooks on Information Systems, pages 201-220. Springer.

[2] Keet, C. M. (2018). An introduction to ontology engineering. College Publications, vol 20. 344p.

[3] Welty, C. A. (2006). OntOWLClean: Cleaning OWL ontologies with OWL. In Bennett, B. and Fellbaum, C., editors, Proceedings of the Fourth International Conference on Formal Ontology in Information Systems (FOIS 2006), Baltimore, Maryland, USA, November 9-11, 2006, volume 150 of Frontiers in Artificial Intelligence and Applications, pages 347-359. IOS Press.

[4] Glimm, B., Rudolph, S., Völker, J. (2010). Integrated metamodeling and diagnosis in OWL 2. In Peter F. Patel-Schneider, Yue Pan, Pascal Hitzler, Peter Mika, Lei Zhang, Jeff Z. Pan, Ian Horrocks, and Birte Glimm, editors, Proceedings of the 9th International Semantic Web Conference, LNCS vol 6496, pages 257-272. Springer.

# Reblogging 2012: Fixing flaws in OWL object property expressions

From the “10 years of keetblog – reblogging: 2012”: There are several 2012 papers I (co-)authored that I like and would have liked to reblog—whatever their citation counts may be. Two are on theoretical, methodological, and tooling advances in ontology engineering using foundational ontologies in various ways, in collaboration with Francis Fernandez and Annette Morales following a teaching and research visit to Cuba (ESWC’12 paper on part-whole relations), and a dedicated Honours student who graduated cum laude, Zubeida Khan (EKAW’12 paper on foundational ontology selection). The other one, reblogged here, is of a more fundamental nature—principles of role [object property] hierarchies in ontologies—and ended up winning best paper award at EKAW’12; an extended version has been published in JoDS in 2014. I’m still looking for a student to make a proof-of-concept implementation (in short, thus far: when some are interested, there’s no money, and when there’s money, there’s no interest).

———–

OWL 2 DL is a very expressive language and, thanks to ontology developers’ persistent requests, has many features for declaring complex object property expressions: object sub-properties, (inverse) functional, disjointness, equivalence, cardinality, (ir)reflexivity, (a)symmetry, transitivity, and role chaining. A downside of this is that the more one can do, the higher the chance that flaws are introduced in the representation; hence, an unexpected or undesired classification or inconsistency may actually be due to a mistake in the object property box, not a class axiom. While there are nifty automated reasoners and explanation tools that help with the modelling exercise, the standard reasoning services for OWL ontologies assume that the axioms in the ‘object property box’ are correct and according to the ontologist’s intention. This may not be the case. Take, for instance, the following three examples, where either the assertion is not according to the intention of the modeller, or the consequence may be undesirable.

• Domain and range flaws: asserting hasParent $\sqsubseteq$ hasMother instead of hasMother $\sqsubseteq$ hasParent in accordance with their domain and range restrictions (i.e., a subsetting mistake; a more detailed example can be found in [1]), or declaring a domain or a range to be an intersection of disjoint classes;
• Property characteristics flaws: e.g., the family-tree.owl (when accessed on 12-3-2012) has hasGrandFather $\sqsubseteq$ hasAncestor and Trans(hasAncestor) so that transitivity unintentionally is passed down the property hierarchy, yet hasGrandFather is really intransitive (but that cannot be asserted in OWL);
• Property chain issues: for instance, the chain hasPart $\circ$ hasParticipant $\sqsubseteq$ hasParticipant in the pharmacogenomics ontology [2] forces the classes in class expressions using these properties (in casu, DrugTreatment and DrugGeneInteraction) either to be processes, due to the domain of the hasParticipant object property, or to be inconsistent.
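The first flaw in the list can be illustrated with a small Python sketch that compares the domain and range of a sub-property with those of its super-property. The class hierarchy and the domain and range declarations below are assumptions for the sake of illustration (they are not taken from an actual ontology), and the helper names `is_subclass` and `subproperty_is_compatible` are made up.

```python
# Hypothetical class hierarchy: each class points to its direct superclass.
superclass = {"Mother": "Parent", "Parent": "Person", "Person": None}

# Hypothetical (domain, range) declarations for the two object properties.
domain_range = {
    "hasParent": ("Person", "Parent"),
    "hasMother": ("Person", "Mother"),
}


def is_subclass(c, d):
    """True if class c equals d or is a (possibly indirect) subclass of d."""
    while c is not None:
        if c == d:
            return True
        c = superclass.get(c)
    return False


def subproperty_is_compatible(sub, sup):
    """Check that the domain and range of `sub` are within those of `sup`."""
    sub_dom, sub_ran = domain_range[sub]
    sup_dom, sup_ran = domain_range[sup]
    return is_subclass(sub_dom, sup_dom) and is_subclass(sub_ran, sup_ran)


intended = subproperty_is_compatible("hasMother", "hasParent")
flawed = subproperty_is_compatible("hasParent", "hasMother")
print(intended, flawed)
```

With these assumed declarations, the intended direction passes the check and the flawed assertion does not, because Parent is not a subclass of Mother.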

Unfortunately, reasoner output and explanation features in ontology development environments do not point to the actual modelling flaw in the object property box. This is because the implemented justification and explanation algorithms [3, 4, 5] consider logical deductions only, and because class axioms and assertions about instances take precedence over what ‘ought to be’ concerning object property axioms, so that only instances and classes can move about in the taxonomy. This makes sense from a logic viewpoint, but it is not enough from an ontology quality viewpoint: an object property inclusion axiom, be it in the property hierarchy, a domain or range axiom to type the property, a property’s characteristic (reflexivity etc.), or a property chain, may well be wrong, and this should be found as such, and corrections proposed.

So, we have to look at what type of mistakes can be made in object property expressions, how one can get the modeller to choose the ontologically correct options in the object property box so as to achieve a better quality ontology and, in case of flaws, how to guide the modeller to the root defect from the modeller’s viewpoint, and propose corrections. That is: the need to recognise the flaw, explain it, and to suggest revisions.

To this end, two non-standard reasoning services were defined, SubProS and ProChainS [6], in a paper that has recently been accepted at the 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12). The former is an extension to the RBox Compatibility Service for object subproperties by [1], so that it now also handles the object property characteristics in addition to the subsetting-way of asserting object sub-properties, and covers the OWL 2 DL features as a minimum. The latter is a new ontological reasoning service that checks whether the chain’s properties are compatible by assessing the domain and range axioms of the participating object properties. Both compatibility services exhaustively check all permutations and therewith pinpoint the root cause of the problem (if any) in the object property box. In addition, if a test fails, one or more proposals are made on how best to revise the identified flaw (depending on the flaw, this may include the option to ignore the warning and accept the deduction). Put differently: SubProS and ProChainS can be considered so-called ontological reasoning services, because for some of the detected flaws the ontology does not necessarily contain logical errors. These two services thus fall in the category of tools that focus on both logic and additional ontology quality criteria, aiming toward ontological correctness in addition to just a satisfiable logical theory (on this topic, see also the works on anti-patterns [7] and OntoClean [8]). Hence, they differ from other works on explanation and pinpointing mistakes, which concern logical consequences only [3,4,5], and SubProS and ProChainS also propose revisions for the flaws.
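To give a flavour of what a chain compatibility check could look like, here is a much simplified Python sketch. It is not the definition of ProChainS from the paper [6], which has more cases and also proposes revisions; the three tests below are my reading of ‘assessing the domain and range axioms of the participating object properties’. The domain of hasParticipant follows the property chain example above; the other declarations (owl:Thing everywhere else, written as Thing) and the function names are assumptions.

```python
# Hypothetical class hierarchy: each class points to its direct superclass.
superclass = {"Process": "Thing", "Thing": None}

# Assumed (domain, range) declarations; only the Process domain comes from the text.
domain_range = {
    "hasPart": ("Thing", "Thing"),
    "hasParticipant": ("Process", "Thing"),
}


def is_subclass(c, d):
    """True if class c equals d or is a (possibly indirect) subclass of d."""
    while c is not None:
        if c == d:
            return True
        c = superclass.get(c)
    return False


def check_chain(chain, sup):
    """Return a list of issues for the chain S1 o ... o Sn being included in sup."""
    issues = []
    # Adjacent properties: the range of one should be related to the domain of the next.
    for left, right in zip(chain, chain[1:]):
        ran, dom = domain_range[left][1], domain_range[right][0]
        if not (is_subclass(ran, dom) or is_subclass(dom, ran)):
            issues.append(f"range of {left} and domain of {right} are unrelated")
    # The chain starts in the domain of S1 and ends in the range of Sn.
    if not is_subclass(domain_range[chain[0]][0], domain_range[sup][0]):
        issues.append(f"domain of {chain[0]} is not within the domain of {sup}")
    if not is_subclass(domain_range[chain[-1]][1], domain_range[sup][1]):
        issues.append(f"range of {chain[-1]} is not within the range of {sup}")
    return issues


issues = check_chain(["hasPart", "hasParticipant"], "hasParticipant")
for issue in issues:
    print(issue)
```

Under these assumptions the sketch flags the domain of hasPart: anything that has a part that has a participant is forced into the domain of hasParticipant, which is the situation described for DrugTreatment and DrugGeneInteraction.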

SubProS and ProChainS were evaluated (manually) with several ontologies, including BioTop and the DMOP. The evaluation demonstrated that the proposed ontological reasoning services did isolate flaws and could propose useful corrections, which have been incorporated in the latest revisions of those ontologies.

Theoretical details, the definition of the two services, as well as detailed evaluation and explanation going through the steps can be found in the EKAW’12 paper [6], which I’ll present some time between 8 and 12 October in Galway, Ireland. The next phase is to implement an efficient algorithm and make a user-friendly GUI that assists with revising the flaws.

References

[1] Keet, C.M., Artale, A.: Representing and reasoning over a taxonomy of part-whole relations. Applied Ontology 3(1-2) (2008) 91–110

[2] Dumontier, M., Villanueva-Rosales, N.: Modeling life science knowledge with OWL 1.1. In: Fourth International Workshop OWL: Experiences and Directions 2008 (OWLED 2008 DC). (2008) Washington, DC (metro), 1-2 April 2008

[3] Horridge, M., Parsia, B., Sattler, U.: Laconic and precise justifications in OWL. In: Proceedings of the 7th International Semantic Web Conference (ISWC 2008). Volume 5318 of LNCS., Springer (2008)

[4] Parsia, B., Sirin, E., Kalyanpur, A.: Debugging OWL ontologies. In: Proceedings of the World Wide Web Conference (WWW 2005). (2005) May 10-14, 2005, Chiba, Japan.

[5] Kalyanpur, A., Parsia, B., Sirin, E., Grau, B.: Repairing unsatisfiable concepts in OWL ontologies. In: Proceedings of ESWC’06. Springer LNCS (2006)

[6] Keet, C.M. Detecting and Revising Flaws in OWL Object Property Expressions. 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12), Oct 8-12, Galway, Ireland. Springer, LNAI, 15p. (in press)

[7] Roussey, C., Corcho, O., Vilches-Blazquez, L.: A catalogue of OWL ontology antipatterns. In: Proceedings of K-CAP’09. (2009) 205–206

[8] Guarino, N., Welty, C.: An overview of OntoClean. In Staab, S., Studer, R., eds.: Handbook on ontologies. Springer Verlag (2004) 151–159

# Some ontology authoring guidelines to prevent pitfalls: TIPS

We showed the pervasiveness of pitfalls in ontologies earlier [1], and it is overdue to look at how to prevent them in a structured manner. From an academic viewpoint, preventing them is better, because it means you have a better grasp of ontology development. Following our KEOD’13 paper [1], we received a book chapter invitation from its organisers, and the Typical pItfall Prevention Scheme (TIPS) is described there. Here I include a ‘sneak preview’ selection of the 10 sets of guidelines (i.e., it is somewhat reworded and shortened for this blog post).

The TIPS are relevant in general, including for the latest OWL 2. They are structured in an order of importance in the sense of how one typically goes about developing an ontology at the level of ontology authoring, and they embed an emphasis with respect to occurrence of the pitfall so that common pitfalls can be prevented first. The numbers in brackets refer to the type of pitfall, and are the same numbering as in the OOPS! pitfall catalogue and in [1].

T1: Class naming and identification (includes P1, P2, P7, C2, and C5): Synonymy and polysemy should be avoided in naming a class: 1) distinguish the concept/universal itself from the names it can have (the synonyms), create just one class for it, and add the other names using rdfs:label annotations; 2) in case of polysemy (the same name has different meanings), try to disambiguate the term and refine the names. Concerning identifying classes, do not lump several together into one with an ‘and’ or ‘or’ (like a class TaskOrGoal or ShrubsAndBushes), but try to divide them into subclasses. Squeezing modality (like ‘can’, ‘may’, ‘should’) into the name is readable for you, but has no effect on reasoning (if you want that, choose another language), and it can sometimes be taken care of in a different way (like a canCook: the stove has the function or affordance to cook). Last, you should have a good URI indicating where the ontology will be published and a relevant name for the file.

T2: Class hierarchy (includes P3, P6, P17, and P21): A taxonomy is based on is-a relationships, meaning that classA is-a classB if and only if every instance of A is also an instance of B, and is-a is transitive. The is-a is present in the language already (subclassOf in OWL), so do not introduce it as an object property. Also, do not confuse is-a with instance-of: the latter is used for representing membership of an individual in a class (which also has a primitive in OWL). Consider the leaf classes of the hierarchy: are they still classes (entities that can have instances) or individuals (entities that cannot be instantiated anymore)? If the latter, then convert them into instances. What you typically want to avoid are cycles in the hierarchy, as then some class down in the hierarchy, and all of the classes in between, ends up as equivalent to one of its superclasses. Also try to avoid adding some class named Unknown, Other or Miscellaneous in a class hierarchy just because the set of sibling classes defined is incomplete.
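Cycles are easy to overlook in a large hierarchy, so here is a minimal sketch of how one could look for them with a depth-first search in Python. The hierarchy is represented as a plain dictionary from a class to its asserted direct superclasses; the class names and the function name `find_cycle` are made up, and a DL reasoner would report the same situation as a set of equivalent classes instead.

```python
def find_cycle(superclasses):
    """Return one cycle as a list of class names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(cls, path):
        colour[cls] = GREY
        path.append(cls)
        for sup in superclasses.get(cls, []):
            state = colour.get(sup, WHITE)
            if state == GREY:
                # sup is already on the current path: we closed a loop.
                return path[path.index(sup):] + [sup]
            if state == WHITE:
                cycle = visit(sup, path)
                if cycle:
                    return cycle
        path.pop()
        colour[cls] = BLACK
        return None

    for cls in list(superclasses):
        if colour.get(cls, WHITE) == WHITE:
            cycle = visit(cls, [])
            if cycle:
                return cycle
    return None


# Hypothetical hierarchies: the first contains the cycle A -> B -> C -> A.
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
acyclic = {"Dog": ["Mammal"], "Mammal": ["Animal"]}

print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

All classes on a reported cycle subsume each other, so they are equivalent, which is the effect the guideline warns about.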

T3: Domain and range of a class (includes P11 and P18): When you add an object or data property, answer the question “What is the most general class in the ontology for which this property holds?” and declare that class as domain/range of the property. If the answer happens to be multiple classes, then ensure you combine them with ‘or’, not as a simple list of those classes (which amounts to the intersection). Likewise, if the answer is owl:Thing, then try to combine several subclasses instead of using the generic owl:Thing (can the property really relate anything to anything?). For the range of a data property, you should take the answer to the question “What would be the format of data (strings of characters, positive numbers, dates, floats, etc.) used to fill in this information?” (the most general one is literal).
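The difference between a list of domain classes and their union can be seen with ordinary sets. The snippet below is only an analogy in Python with made-up class extensions (Person and Organisation and their members are hypothetical); it treats the classes as fixed sets of individuals, which is not how an OWL reasoner works, but it shows why the list is the more restrictive option.

```python
# Hypothetical extensions of two disjoint classes.
person = {"alice", "bob"}
organisation = {"acme"}

# Two separate domain axioms: every subject must be in BOTH classes.
allowed_with_list = person & organisation

# One domain axiom with a union: every subject must be in at least one class.
allowed_with_union = person | organisation

print(sorted(allowed_with_list))
print(sorted(allowed_with_union))
```

With the list, no individual qualifies as a subject of the property when the classes are disjoint, so in OWL any use of the property would lead to an inconsistency; with the union, members of either class qualify.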

T4: Equivalent relations (includes P12 and P27):

T5: Inverse relations (includes P5, P13, P25, and P26): For object properties that are declared inverses of each other, check that the domain class of one is the same class as the range of the other one, and vice versa (for a single object property, consider T6).

T6: Object property characteristics (includes P28 and P29): Go through the object properties and check their characteristics, such as symmetry, functionality, and transitivity. See also the SubProS reasoning service [2] to ensure you have ‘safe’ object property characteristics declared that will not lead to unexpected deductions. Concerning reflexivity, be sure to distinguish between the case where a property holds for all objects in your ontology (if so, declare it reflexive) and the case where it holds only for a particular relation and instances of the participating classes (then use the Self construct).

T7: Intended formalization (includes P14, P15, P16, P19, C1, and C4): As mentioned in T3, a property’s domain or range can consist of more than one class, which is usually a union of the classes, not the intersection of them. For a property’s usage in an axiom, there are typically three cases: (i) if there is at least one such relation (quite common), then use SomeValuesFrom/some/$\exists$; (ii) when ‘closing’ the relation, i.e., it doesn’t relate to anything else than the class(es) specified, then also add an AllValuesFrom/only/$\forall$; (iii) when stating that there is no such relation in which the class on the left-hand side participates, you have to be precise about what you really want to say: to state that no such relation exists, put the negation before the quantifier, but when there is a relation that is just not with some particular class, then the negation goes in front of the class on the right-hand side. For instance, a vegetarian pizza does have ingredients but not meat ($\neg\exists hasIngredient.Meat$), which is different from saying that it has as ingredient anything in the ontology (cucumber, beer, soft drink, marshmallow, chocolate, …) that is not meat ($\exists hasIngredient.\neg Meat$). Don’t create a ‘hack’ by introducing a class with negation in the name, like NotMeat, but use negation properly in the axiom. Finally, when you are convinced that all relevant properties for a class have been represented, convert it to a defined class (if not already done so), which gets you more deductions for free.
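The two placements of the negation can be compared on a tiny, fully known example. The Python sketch below evaluates both expressions over a made-up set of pizzas and ingredients (all names are hypothetical); note that it checks a closed, finite interpretation, which is a simplification compared to reasoning under OWL’s open world assumption.

```python
# Hypothetical data: which ingredients each pizza has, and which ones are meat.
has_ingredient = {
    "margherita": {"tomato", "mozzarella"},
    "pepperoni": {"tomato", "salami"},
}
meat = {"salami"}


def no_meat_ingredient(pizza):
    """Negation before the quantifier: no ingredient of the pizza is meat."""
    return not any(i in meat for i in has_ingredient[pizza])


def some_non_meat_ingredient(pizza):
    """Negation in front of the class: at least one ingredient is not meat."""
    return any(i not in meat for i in has_ingredient[pizza])


for pizza in has_ingredient:
    print(pizza, no_meat_ingredient(pizza), some_non_meat_ingredient(pizza))
```

The pepperoni pizza satisfies the second expression (it has tomato, which is not meat) but not the first, so only the first expression captures what one means with a vegetarian pizza.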

T8: Modelling aspects (includes P4, P23, and C3):

T9: Domain coverage and requirements (includes P9 and P10):

T10: Documentation and understandability (includes P8, P20, and P22): annotate!

I don’t know yet when the book with the selected papers from KEOD will be published, but I assume within the next few months. (date will be added here once I know).

References

[1] Keet, C.M., Suárez Figueroa, M.C., and Poveda-Villalón, M. (2013) The current landscape of pitfalls in ontologies. International Conference on Knowledge Engineering and Ontology Development (KEOD’13). 19-22 September, Vilamoura, Portugal.

[2] Keet, C.M. Detecting and Revising Flaws in OWL Object Property Expressions. EKAW’12. Springer LNAI vol 7603, pp. 252-266.
