Some ontology authoring guidelines to prevent pitfalls: TIPS

We showed the pervasiveness of pitfalls in ontologies earlier [1], and it is overdue to look at how to prevent them in a structured manner. From an academic viewpoint, preventing them is better than fixing them afterwards, because it means you have a better grasp of ontology development. Following our KEOD’13 paper [1], we received a book chapter invitation from its organisers, and the Typical pItfall Prevention Scheme (TIPS) is described there. Here I include a ‘sneak preview’ selection of the 10 sets of guidelines (i.e., it is somewhat reworded and shortened for this blog post).

The TIPS are relevant in general, including for the latest OWL 2. They are ordered by importance, in the sense of how one typically goes about developing an ontology at the level of ontology authoring, and they take into account how often each pitfall occurs, so that common pitfalls can be prevented first. The numbers in brackets refer to the type of pitfall, and are the same numbering as in the OOPS! pitfall catalogue and in [1].

T1: Class naming and identification (includes P1, P2, P7, C2, and C5): Synonymy and polysemy should be avoided in naming a class: 1) distinguish the concept/universal itself from the names it can have (the synonyms), create just one class for it, and add the other names using rdfs:label annotations; 2) in case of polysemy (the same name has different meanings), try to disambiguate the term and refine the names. Concerning identifying classes, do not lump several together into one with an ‘and’ or ‘or’ (like a class TaskOrGoal or ShrubsAndBushes), but try to divide them into subclasses. Squeezing modality (like ‘can’, ‘may’, ‘should’) into the name is readable for you, but has no effect on reasoning (if you want that, choose another language), and sometimes it can be taken care of in a different way (like a canCook: the stove has the function or affordance to cook). Last, you should have a good URI indicating where the ontology will be published and a relevant name for the file.
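
The naming advice in T1 lends itself to a simple automated check. The following is a minimal sketch in Python, not part of TIPS or of OOPS!: the function names and the word lists are made up for illustration, and it assumes class names are written in CamelCase. It flags names that join two notions with ‘And’/‘Or’ and names that contain a modal word.

```python
import re

# Hypothetical word lists for illustration; adapt them to your own ontology.
CONJUNCTIONS = {"And", "Or"}
MODALS = {"can", "may", "should", "must", "might"}


def split_camel_case(name):
    """Split a CamelCase or camelCase identifier into its word tokens."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)


def naming_warnings(name):
    """Return a list of heuristic warnings for one class or property name."""
    tokens = split_camel_case(name)
    warnings = []
    # A conjunction between two other words suggests several classes lumped together.
    if any(token in CONJUNCTIONS for token in tokens[1:-1]):
        warnings.append("lumps several classes together")
    # A modal word in the name has no effect on reasoning.
    if any(token.lower() in MODALS for token in tokens):
        warnings.append("modality in the name")
    return warnings


if __name__ == "__main__":
    for candidate in ["TaskOrGoal", "ShrubsAndBushes", "canCook", "Organ"]:
        print(candidate, naming_warnings(candidate))
```

This is only a heuristic: a name such as CanOpener would be flagged as well, so each warning still needs a human decision.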

T2: Class hierarchy (includes P3, P6, P17, and P21): A taxonomy is based on is-a relationships, meaning that classA is-a classB if and only if every instance of A is also an instance of B, and is-a is transitive. The is-a is present in the language already (subclassOf in OWL), so do not introduce it as an object property. Also, do not confuse is-a with instance-of: the latter is used for representing membership of an individual in a class (which also has a primitive in OWL). Consider the leaf classes of the hierarchy: are they still classes (entities that can have instances) or individuals (entities that cannot be instantiated anymore)? If the latter, then convert them into instances. What you typically want to avoid are cycles in the hierarchy, as then some class down in the hierarchy, and all of the classes in between, ends up as equivalent to one of its superclasses. Also try to avoid adding some class named Unknown, Other or Miscellaneous in a class hierarchy just because the set of sibling classes defined is incomplete.
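
Cycles in the asserted hierarchy can be found before a reasoner is even started. Below is a minimal sketch in Python, assuming the asserted subclass axioms have already been extracted into a plain dictionary that maps each class to its direct superclasses; the function name and the example classes are hypothetical. It runs a depth-first search and reports the first cycle it encounters.

```python
def find_cycle(subclass_of):
    """Return one cycle in the subclass graph as a list of class names, or None.

    subclass_of maps each class name to the set of its direct superclasses.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(cls):
        colour[cls] = GREY
        path.append(cls)
        for sup in sorted(subclass_of.get(cls, ())):
            state = colour.get(sup, WHITE)
            if state == GREY:
                # sup is already on the current path, so we walked in a circle.
                return path[path.index(sup):] + [sup]
            if state == WHITE:
                found = visit(sup)
                if found:
                    return found
        path.pop()
        colour[cls] = BLACK
        return None

    for cls in sorted(subclass_of):
        if colour.get(cls, WHITE) == WHITE:
            found = visit(cls)
            if found:
                return found
    return None


if __name__ == "__main__":
    hierarchy = {"Student": {"Person"}, "Person": {"Agent"}, "Agent": {"Student"}}
    print(find_cycle(hierarchy))
```

Every class on a reported cycle would be inferred equivalent to the others, which is rarely what the modeller intended.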

T3: Domain and range of a class (includes P11 and P18): When you add an object or data property, answer the question “What is the most general class in the ontology for which this property holds?” and declare that class as domain/range of the property. If the answer happens to be multiple classes, then ensure you combine them with ‘or’, not as a simple list of those classes (which amounts to the intersection). Likewise, if the answer is owl:Thing, then try to combine several subclasses instead of using the generic owl:Thing (can the property really relate anything to anything?). For the range of a data property, you should take the answer to the question “What would be the format of data (strings of characters, positive numbers, dates, floats, etc.) used to fill in this information?” (the most general one is literal).
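
To see why a list of domain classes differs from their union, it helps to look at the sets of individuals involved. The sketch below is a toy illustration in Python with made-up classes and individuals; it is not how a reasoner works internally. In OWL, domain axioms are used for inference, so every subject of the property is inferred to be a member of all listed classes at once, and the sketch models only which individuals could consistently be such a subject.

```python
# Hypothetical toy extensions of two classes (sets of individuals).
extension = {
    "Person": {"alice", "bob"},
    "Organisation": {"acme", "unesco"},
}


def listed_domain(classes):
    """Separate domain axioms: a subject must be in ALL listed classes."""
    return set.intersection(*(extension[c] for c in classes))


def union_domain(classes):
    """One domain axiom with 'or': a subject must be in AT LEAST ONE class."""
    return set.union(*(extension[c] for c in classes))


if __name__ == "__main__":
    classes = ["Person", "Organisation"]
    print("listed:", listed_domain(classes))
    print("union: ", union_domain(classes))
```

With the two classes sharing no individuals, the listed version admits nobody as subject of the property, whereas the union admits all four individuals.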

T4: Equivalent relations (includes P12 and P27):

T5: Inverse relations (includes P5, P13, P25, and P26): For object properties that are declared inverses of each other, check that the domain class of one is the same class as the range of the other one, and vice versa (for a single object property, consider T6).
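
The domain/range check for inverses is mechanical enough to script. Here is a minimal sketch in Python, assuming the declared domain and range of each property have been collected into a dictionary; the function name and the teaches/taughtBy properties are hypothetical examples.

```python
def inverse_mismatches(properties, inverses):
    """Report inverse pairs whose domains and ranges do not mirror each other.

    properties maps a property name to a (domain, range) pair of class names;
    inverses is a list of (p, q) pairs declared as inverses of each other.
    """
    problems = []
    for p, q in inverses:
        p_domain, p_range = properties[p]
        q_domain, q_range = properties[q]
        if p_domain != q_range:
            problems.append(
                f"domain of {p} ({p_domain}) differs from range of {q} ({q_range})"
            )
        if p_range != q_domain:
            problems.append(
                f"range of {p} ({p_range}) differs from domain of {q} ({q_domain})"
            )
    return problems


if __name__ == "__main__":
    declared = {
        "teaches": ("Lecturer", "Course"),
        "taughtBy": ("Course", "Person"),  # range is broader than Lecturer
    }
    for problem in inverse_mismatches(declared, [("teaches", "taughtBy")]):
        print(problem)
```

The comparison is purely syntactic on class names; two differently named but equivalent classes would be reported as a mismatch and need a manual look.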

T6: Object property characteristics (includes P28 and P29): Go through the object properties and check their characteristics, such as symmetry, functionality, and transitivity. See also the SubProS reasoning service [2], which helps ensure that the declared object property characteristics are ‘safe’ and will not lead to unexpected deductions. Concerning reflexivity, be sure to distinguish between the case where a property holds for all objects in your ontology (if so, declare it reflexive) and the case where it holds only for a particular relation and instances of the participating classes (then use the Self construct).
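
The difference between the two kinds of reflexivity can be made concrete on a small set of individuals. The following Python sketch is a toy model with made-up individuals and a hypothetical knows relation; it only illustrates the distinction and says nothing about how OWL reasoners implement it.

```python
def is_globally_reflexive(relation, all_individuals):
    """Reflexive property: every individual in the ontology relates to itself."""
    return all((x, x) in relation for x in all_individuals)


def is_locally_reflexive(relation, class_extension):
    """Self restriction on a class: every member of that class relates to itself."""
    return all((x, x) in relation for x in class_extension)


if __name__ == "__main__":
    individuals = {"alice", "bob", "rock1"}
    person = {"alice", "bob"}
    knows = {("alice", "alice"), ("bob", "bob"), ("alice", "bob")}
    print(is_globally_reflexive(knows, individuals))
    print(is_locally_reflexive(knows, person))
```

In this toy model every person knows themselves but the rock does not, so declaring knows reflexive would be too strong, while a Self restriction on the class of persons fits.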

T7: Intended formalization (includes P14, P15, P16, P19, C1, and C4): As mentioned in T3, a property’s domain or range can consist of more than one class, which is usually a union of the classes, not the intersection of them. For a property’s usage in an axiom, there are typically three cases: (i) if there is at least one such relation (quite common), then use SomeValuesFrom/some/\exists ; (ii) when ‘closing’ the relation, i.e., it doesn’t relate to anything else than the class(es) specified, then also add an AllValuesFrom/only/\forall ; (iii) when stating that there is no such relation in which the class on the left-hand side participates, you have to be precise about what you really want to say: to state that there is no such relation at all, put the negation before the quantifier, but when there is a relation that is just not with some particular class, then the negation goes in front of the class on the right-hand side. For instance, a vegetarian pizza does have ingredients but not meat (\neg\exists hasIngredient.Meat ), which is different from saying that it has as ingredients anything in the ontology (cucumber, beer, soft drink, marshmallow, chocolate, …) that is not meat (\exists hasIngredient.\neg Meat ). Don’t create a ‘hack’ by introducing a class with negation in the name, like a NotMeat, but use negation properly in the axiom. Finally, when you are convinced that all relevant properties for a class have been represented, convert it to a defined class (if not already done so), which gets you more deductions for free.
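
The two placements of the negation in the pizza example can be compared on a small, closed toy model. The sketch below is in Python with made-up pizzas and ingredients; note that it evaluates the expressions over a fixed list of known ingredients, whereas OWL reasons under the open-world assumption, so it only illustrates the set-theoretic meaning of the two expressions.

```python
# Hypothetical toy data: which ingredients count as meat, and what each pizza has.
meat = {"ham", "salami"}
ingredients = {
    "margherita": {"tomato", "mozzarella"},
    "hawaiian": {"tomato", "ham", "pineapple"},
    "carnivora": {"ham", "salami"},
    "plainBase": set(),
}


def has_no_meat(pizza):
    """Negation before the quantifier: no ingredient of the pizza is meat."""
    return not any(item in meat for item in ingredients[pizza])


def has_some_non_meat(pizza):
    """Negation before the class: at least one ingredient is not meat."""
    return any(item not in meat for item in ingredients[pizza])


if __name__ == "__main__":
    for pizza in ingredients:
        print(pizza, has_no_meat(pizza), has_some_non_meat(pizza))
```

The hawaiian pizza has some non-meat ingredient yet still contains meat, and the plain base has no meat but also no non-meat ingredient, so neither expression implies the other.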

T8: Modelling aspects (includes P4, P23, and C3):

T9: Domain coverage and requirements (includes P9 and P10):

T10: Documentation and understandability (includes P8, P20, and P22): annotate!

I don’t know yet when the book with the selected papers from KEOD will be published, but I assume within the next few months (the date will be added here once I know).

References

[1] Keet, C.M., Suárez Figueroa, M.C., and Poveda-Villalón, M. (2013) The current landscape of pitfalls in ontologies. International Conference on Knowledge Engineering and Ontology Development (KEOD’13). 19-22 September, Vilamoura, Portugal.

[2] Keet, C.M. (2012) Detecting and Revising Flaws in OWL Object Property Expressions. EKAW’12. Springer LNAI vol. 7603, pp. 252-266.
