Ontological realism, methodologies, and mud slinging: a few notes on the AO trilogy

In July, at the start of the MOWS’10 course on ontology engineering, I pointed to more background literature on the debate about ontology as reality representation: its principal references and the new comprehensive assessment of its problems by Gary Merrill [1]. I also included the note from the Applied Ontology journal editors that Barry Smith and Werner Ceusters were writing a comprehensive rebuttal, to which Merrill would respond in turn. They’re out now [2,3], and also freely available through the dedicated AO page.

At a cursory glance, with some juicy sentences catching the eye, Smith and Ceusters’ 50-page reply [2] seemed like a good pastime for the gray, rainy, and cold Sunday afternoon last week, and a good occasion to ponder if and how I would incorporate it in an updated version of the ontology engineering course. It, however, contains many harsh statements, with the main message that they’re doing a great thing with their so-called “realist methodology” and that Merrill’s critique is irrelevant. Merrill’s 30-page response to that [3], which I finished reading recently, is that Smith and Ceusters’ clarification made matters worse, thereby confirming that it is a misdirection.

So, what to make of all that? If I were a VIP in ontology engineering, I would ask the AO editors to write a proper reply to the Smith and Ceusters (BS & WC) paper. But I am not; hence, I will mention a few aspects on my blog only (which might do me more harm than good, but I hope not). I will start with a note on realism, then the usage of the term “application ontologies”, and finally claims about BS & WC’s “realist methodology” that is not a methodology.

Notes on realism

On the realism dimension of the debate, I have not much to say. I subscribe to what Merrill formulates as the “Empiricist Doctrine” [1], which states that “the terms of science… are to be taken to refer to the actually existing entities in the real world”, especially when it comes to ontologies for the life sciences and (bio)medicine. If you want an ontology of deities, fairies, or other story characters, that’s fine, too; just do not put them in a bio-ontology. What I had understood from the conversations, presentations, and papers of BS & WC is that if you accept the “Empiricist Doctrine”, then you must also go along with universals (as opposed to concepts). Merrill calls the latter component the “Universalist Doctrine”, where “the terms of science… are to be understood as referring directly to universals”, which is one of many metaphysical stances [1]. I do not know if I subscribe to universals and I do not care much about it. Although I did some philosophy of science and philosophy of nature a while ago and read up on other subjects in philosophy in the past few years, I am not a philosopher by training and do not know all the intricacies of all the alternatives around (but maybe I should).

Another reason for my misunderstanding (or rather, for conflating the two doctrines) is that descriptions and definitions in the BS & WC papers are not consistent throughout (elaborated on by Merrill [1,3]). For instance, in [4], ontology is taken as reality representation, but in [2] it is a representation of reality as described by science, i.e., as scientists understand it, or in other words: the representation of the theories. Thus, the things in the ontology are terms that do not have a ‘direct link’ to the actual entities, but go through the scientists’ minds with their conceptualizations of reality. This is quite a difference from [4]. Make of it what you like.

Last, the ‘funny’ thing is that when you use the Empiricist Doctrine it does not matter if you use BFO, DOLCE, GFO, or whichever foundational ontology for practical ontology development. The current formalisations of BFO, DOLCE, or any of the others do not state that the categories [unary predicates] denote either universals or concepts. Clearly, the communication of the informal intentions would be different if the top (owl:Thing or similar) in the ontology were called Universal or Concept, but in BFO it is called Entity and in DOLCE it is Particular. Thus, de facto, neither one commits to one philosophical doctrine or another in the top-level categorization and formalisation.

What are “application ontologies”?

Smith and Ceusters in [4] make a distinction between reference ontologies and application ontologies, the former intended to represent “settled science” and the latter that part of science that is in flux. This distinction, which is rather difficult to maintain, is discussed at length in [3]. What I wish to add, and which was only mentioned in passing in [3], is that the notion of ‘application ontologies’ is used quite differently elsewhere in the ontology enterprise. There it refers to OWL- or DL-formalised conceptual data models modelled in one of the common conceptual data modelling languages (UML, EER, ORM), which are not real ontologies. The discussion about the difference between an ontology and a conceptual data model is beyond the current scope, but it is important to note that the same term means something different in pretty much all other literature about ontologies. Perhaps BS & WC have not read that literature, given that they happily attack computer science, knowledge engineering, and conceptual modelling (section 3.1 in [2]) with ‘justifications’ that Wüster-the-businessman over at ISO is a telling example of knowledge engineering and conceptual modelling (he is not), and that it was the training in cognitive psychology we all got as computer scientists (we did not) that makes us confused and stick to concepts instead of buying into the universalist doctrine. Such statements are not helpful.

Either way, application ontology as a formal conceptual data model is definitely a more tenable definition [setting aside if one agrees with it] than application ontology for the non-settled science, because there is no crisp boundary between settled and non-settled science. As if the vague distinction were not enough already to complicate the debate: concepts are allowed to appear in BS & WC’s application ontologies.

About “methodologies”

Smith and Ceusters propose their “realist methodology” in section 1 of [2], but a methodology it is not—at least, not in the sense I, and (m)any other people in CS & IT, use the term. What BS & WC put forward is a set of principles. It does not say what to do, how, and when. And there is no empirical validation that the resultant ontologies are better (validation sensu a proper scientific experiment with subjects with/without using the ‘methodology’, measurable quality criteria, statistically significant, etc.).

An example of a fairly straightforward methodology for ontology development is METHONTOLOGY (among others [5]), and a more recent one for collaborative distributed ontology development is the NeOn Methodology [6]. The latter has a nice, fairly comprehensive overview picture of the different steps (see Fig. 1, below) that are described further in [6] (one aspect of this is the interactions between the different steps [7]). In my lectures, I like to be impartial and include a variety of options to sensitise ontology developers to the plethora of options (see, e.g., Sections 3 and 4 of the MOWS’10 course, which is an updated version of SemWebTech lecture 3+4, where the what comes before the how, outlined in SemWebTech lecture 5: Methods and Methodologies), but a set of principles that is labeled “methodology” is not something that fits in a real methodology section (though they may well fit in another module).

How can BS & WC even dare to propose a methodology for ontology development when disregarding all literature on ontology development (except for the OntoClean method)? If their methodology is so superior, then give me evidence why and how it is better than all the methodologies that have been proposed over the past 15 years or so. Spoon-feed me about the shortcomings of those procedures; that is, not a lecture about the realist vs anti-realist, conceptualist and what have you, but why I should not buy into collecting non-ontological resources, looking at ontology design patterns, providing intermediate steps for the formalization, and so forth.

Whilst reading section 1 of [2], I tried to extract a methodology, reading it with a positive attitude to try to make something of it, but could find little, and what I did extract is not enough for practical ontology development and maintenance. As an example, let us take the step of “non-ontological resource reuse” for the chosen subject domain. In an ontology engineering methodology, this includes options, such as assessing chosen sources (relevant thesauri, databases, natural language text), and methods for each option, i.e., the how-to for reusing the non-ontological resources, such as manual database reverse engineering steps vs. semi-automated tools (in, say, VisioModeler, or the Protégé plugin Lubyte developed [8]), data mining and clustering, the different methods to extract terms from text, etc. From [2], e.g. section 1.13, I gather that the only way to execute this step of “non-ontological resource reuse” is that domain experts manually read the scientific literature and manually add the knowledge to the ontology. No help from, say, KEGG, AGROVOC, ICD10, or ontologies that were already developed by other groups (all that should be ignored), let alone automating anything to find, say, candidate terms with NLP tools. That surely must be a joke (or oversight, or sheer ignorance) and does not reflect what happens during the development of OBO ontologies. Or take, e.g., METHONTOLOGY or MoKi’s stage of intermediate representations between the domain expert’s informal representation and the formalisation of it in a suitable logic language, such as pseudo-natural language, diagrams as syntactic sugar for the underlying logic, and the Protégé and OBOEdit ODEs: are they to be ignored, too? Of course not; well, I presume that that is not the intention of BS & WC’s “methodology”.

They may have enjoyed having written a trashing of 20 years of knowledge engineering and conceptual data modelling whose outputs apparently can be ignored, but there surely is room to learn a thing or two about it. After reading up on the related works on methodologies, they can make a real attempt at developing a methodology that satisfies the set of principles, be that by developing a methodology from scratch or integrating it into (or extending) existing methodologies. Until then, what is presented in section 1 of [2] will not—cannot—be added to a ‘methods and methodologies’ module in an ontology engineering course.

P.S.: Other views

A different online debate about realism in ontology engineering can be read over at Phil Lord’s blog (The Status quo farewell tour on realism, Why not?, and Why realism is wrong) and his paper together with Robert Stevens at PLoS ONE [9], versus David Sutherland’s Realism, Really? and Yes, really in favour of the realist approach for practical ontology development. Then there is the OBO-Foundry discussion list, and, e.g., a paper in FOIS’10 by Michel Dumontier and Robert Hoehndorf [10], and undoubtedly more papers about the issues raised in the AO trilogy will follow.

References

[1] Gary H. Merrill. Ontological realism: Methodology or misdirection? Applied Ontology, 5 (2010) 79–108.

[2] Barry Smith and Werner Ceusters. Ontological realism: A methodology for coordinated evolution of scientific ontologies. Applied Ontology, 5 (2010) 139–188.

[3] Gary H. Merrill. Realism and reference ontologies: Considerations, reflections and problems. Applied Ontology, 5 (2010) 189–221.

[4] Barry Smith. Beyond Concepts, or: Ontology as Reality Representation. Achille Varzi and Laure Vieu (eds.), Formal Ontology and Information Systems. Proceedings of the Third International Conference (FOIS 2004), Amsterdam: IOS Press, 2004, 73-84.

[5] Corcho, O., Fernandez-Lopez, M. and Gomez-Perez, A. (2003). Methodologies, tools and languages for building ontologies. Where is their meeting point?. Data & Knowledge Engineering 46(1): 41-64.

[6] Mari Carmen Suarez-Figueroa, Guadalupe Aguado de Cea, Carlos Buil, Klaas Dellschaft, Mariano Fernandez-Lopez, Andres Garcia, Asuncion Gomez-Perez, German Herrero, Elena Montiel-Ponsoda, Marta Sabou, Boris Villazon-Terrazas, and Zheng Yufei. NeOn Methodology for Building Contextualized Ontology Networks. NeOn Deliverable D5.4.1. 2008.

[7] Keet, C.M. Dependencies between Ontology Design Parameters. International Journal of Metadata, Semantics and Ontologies, 2010, 5(4): 265-284.

[8] Lina Lubyte. Techniques and Tools for the Design of Ontologies for Data Access. PhD Thesis, Free University of Bozen-Bolzano, KRDB Dissertation Series DS-2010-02, 2010.

[9] Lord, P. & Stevens, R. Adding a little reality to building ontologies for biology. PLoS One, 2010, 5(9), e12258. DOI: 10.1371/journal.pone.0012258.

[10] Dumontier, M. & Hoehndorf, R. Realism for scientific ontologies. In: Formal Ontology in Information Systems: Proceedings of the Sixth International Conference (FOIS 2010), 387–399. Amsterdam: IOS Press.

Fig. 1. Graphical depiction of different steps in ontology development, where each step has its methods and interactions with other steps (taken from [6]).

Nontransitive vs. intransitive direct part-whole relations in OWL

Confusing is-a with part-of is known to be a common mistake by novice ontology developers. Each time I taught the ontology engineering course, I included a session of 1-2 hours to explain some basic aspects of part-whole relations and, lo and behold, none of the participants made that mistake in the labs or mini-projects! One awkward thing did pop up there and on other occasions, though, which had to do with modelling direct parthood, and that does not go well at the moment, to say the least, for a plethora of reasons. Inclusion of direct parthood is not without philosophical quarrels, and the more I think of it, the more I dislike the relation, but somehow the issue appears often in the context of part-whole relations in ontologies. The observed underlying modelling issue, representing intransitivity versus nontransitivity, holds for any OWL object property anyway, so I will proceed with the general case with an example about giraffes.

Preliminaries

First of all, to clarify the terms in the post’s title. INtransitive means that for all x, y, z, if Rxy and Ryz then Rxz does not hold; formally, $\forall x, y, z (R(x,y) \land R(y,z) \rightarrow \neg R(x,z))$, and an option to state this in a Description Logic is to use role chaining: $R \circ R \sqsubseteq \neg R$. NONtransitive means that we cannot say either way whether the property is transitive or intransitive, i.e., in some cases it may be transitive but not in others. Direct parthood is to be understood as follows: if some part x is a direct part of y, then there is no other object z such that x is a part of z and z is a part of y; formally, $\forall x,y (dpo(x, y) \equiv \neg \exists z (partof(x,z) \land partof(z,y)))$. Whether direct parthood is in- or nontransitive is beside the point at this stage, so let us look now at what happens with it in an OWL ontology when one tries to model it one way or another.
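To make the three notions concrete, here is a minimal Python sketch that classifies a finite relation, given as a set of pairs, by checking every chain R(x,y), R(y,z). The function name and the toy relations are my own illustrations, not part of any ontology mentioned here. Note also that it inspects one concrete extension, whereas nontransitivity in an ontology is about what the axioms leave open.

```python
from itertools import product


def classify_transitivity(pairs):
    """Classify a finite binary relation given as a set of (x, y) pairs."""
    rel = set(pairs)
    # Collect (x, z) for every chain R(x, y), R(y, z) found in the relation.
    chains = [(x, z) for (x, y1), (y2, z) in product(rel, rel) if y1 == y2]
    if all(link in rel for link in chains):
        # Every chain is closed (vacuously so when there are no chains).
        return "transitive"
    if all(link not in rel for link in chains):
        # No chain is ever closed.
        return "intransitive"
    # Some chains are closed and others are not.
    return "nontransitive"


# Hypothetical toy extensions, for illustration only.
part_of = {("leaf", "twig"), ("twig", "branch"), ("leaf", "branch")}
direct_part = {("leaf", "twig"), ("twig", "branch")}
mixed = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}

print(classify_transitivity(part_of))      # transitive
print(classify_transitivity(direct_part))  # intransitive
print(classify_transitivity(mixed))        # nontransitive
```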

The OWL ontology and the reasoner

Given that I used the African Wildlife Ontology as a tutorial ontology earlier and the theme appeals to people, I will use it again here. Depending on what we do with the direct parthood relation in the ontology, Giraffe is, or is not, classified automatically as a subclass of Herbivore. Herbivore is a defined class, equivalent to, in Protégé 4.1 notation, (eats only plant) or (eats only (is-part-of some plant)), and Giraffe is a subclass of both Animal and eats only (leaf or Twig). Leaves are part of a twig, twigs of a branch, and branches of a tree that in turn is a subclass of plant. The is-part-of is, correctly according to mereology, included in the ontology as being transitive. Instead of all the is-part-of and is-proper-part-of between plant parts and plants in the AfricanWildlifeOntology1.owl, we model them using direct-part. AfricanWildlifeOntology4a.owl has direct-part as sister object property to is-part-of, AfricanWildlifeOntology4b.owl has it as sub-object property of is-part-of, and neither ontology has any “characteristics” (relational properties) checked for direct-part. Before running the reasoner to classify the taxonomy, what do you think will happen with our Giraffe in both cases?

In AfricanWildlifeOntology4a.owl, Giraffe is still a mere direct subclass of Animal, whereas with AfricanWildlifeOntology4b.owl, we do obtain the (desired) deduction that Giraffe is a Herbivore. That is, we obtain different results depending on where we put the uncharacterized direct-part object property in the RBox. Why is this so?

By not clicking the checkbox “transitive”, an object property is nontransitive, but not intransitive. In fact, we cannot represent explicitly that an object property is intransitive in OWL (see OWL guide and related documents). If we put the object property at the top level (or, as in Protégé 4.1, as immediate subproperty of topObjectProperty), then we obtain the behaviour as if the property were intransitive (and therefore Giraffe is not classified as a subclass of Herbivore). However, the direct-part property is really nontransitive in the ontology. When direct-part is put as subproperty of is-part-of, every direct-part assertion is also an is-part-of assertion, so the transitivity of is-part-of applies to the whole chain and direct-part in effect inherits that behaviour; therefore Giraffe is classified as a Herbivore (because now leaf and Twig are part of plant thanks to the transitivity).
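The difference between the two variants can be imitated with a small Python sketch. It is a toy simulation under strong simplifying assumptions, not a Description Logic reasoner: class-level axioms are flattened into plain pairs, and all names (`part_of_closure`, `is_herbivore`, the constants) are hypothetical.

```python
def part_of_closure(direct_part, direct_part_is_subproperty):
    """Return the is-part-of pairs that follow in this toy model."""
    # Variant 4b: every direct-part link is also an is-part-of link.
    # Variant 4a: the two properties are unrelated, so is-part-of starts empty.
    part_of = set(direct_part) if direct_part_is_subproperty else set()
    changed = True
    while changed:  # naive transitive closure of is-part-of
        changed = False
        for (x, y) in list(part_of):
            for (y2, z) in list(part_of):
                if y == y2 and (x, z) not in part_of:
                    part_of.add((x, z))
                    changed = True
    return part_of


def is_herbivore(food, part_of, plants):
    """Eats only plants, or eats only things that are part of some plant."""
    eats_only_plants = all(f in plants for f in food)
    eats_only_plant_parts = all(
        any((f, p) in part_of for p in plants) for f in food
    )
    return eats_only_plants or eats_only_plant_parts


# Simplified stand-ins for the axioms described in the post.
DIRECT_PART = {("leaf", "twig"), ("twig", "branch"), ("branch", "tree")}
PLANTS = {"tree"}  # tree is a subclass of plant
GIRAFFE_FOOD = {"leaf", "twig"}

variant_4a = part_of_closure(DIRECT_PART, direct_part_is_subproperty=False)
variant_4b = part_of_closure(DIRECT_PART, direct_part_is_subproperty=True)

print(is_herbivore(GIRAFFE_FOOD, variant_4a, PLANTS))  # False
print(is_herbivore(GIRAFFE_FOOD, variant_4b, PLANTS))  # True
```

In the sister-property variant nothing is known about is-part-of between the plant parts, so the Herbivore condition cannot be met; in the subproperty variant the closure yields leaf and twig as parts of tree.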

Obviously, it holds for any OWL/OWL 2 object property that one cannot assert intransitivity explicitly, that the transitivity of a superproperty applies to everything asserted with its subproperties, and that the behaviour of a nontransitive object property therefore depends on where you place it in the RBox, whether you like it or not.

How to go forward?

Direct parthood is called isComponentOf in the componency ontology design pattern and is a subproperty of isPartOf. Its inverse is called haspart_directly in the W3C best practices document on Simple Part-Whole relations [1], and is a subproperty of the transitive haspart. The componency.owl notes that isComponentOf is “hasPart relation without transitivity”, the stated “intent” on the ODP page is to “represent (non-transitively) that objects either are proper parts of other objects, or have proper parts”, and the W3C Best Practices document notes that, unlike mereological parthood, it is “not transitive”. Hence, if you include either one in your OWL ontology, you will not obtain the intended behaviour. Therefore, I do not recommend using either suggestion.

Setting aside the W3C’s best practices motivation for inclusion of haspart_directly—easier querying for immediate parts, but for the ontology purist this ought not to be the motivation for its inclusion—it is worth digging a little deeper into the semantics of the direct parthood. Maybe a modeller actually wants to represent collections with their members, like each Fleet has as direct parts more than one Ship, or constitution of objects, like clay is directly part of some vase? In both cases, however, we deal with meronymic part-whole relations, not mereological ones (see [2] and references therein); hence, they should not be subsumed by the mereological part-of relation anyway. They can be modelled as sister properties of the part-of relation and have the intended nontransitive behaviour as in, e.g., the pwrelations.owl ontology with a taxonomy of part-whole relations (that can be imported into the wildlife ontology).

Alternatively, there is always the option to choose a sufficiently expressive non-OWL language to represent the direct parthood and the rest of the subject domain and use one of the many first/second order theorem provers.

References

[1] Alan Rector and Chris Welty. Simple Part-Whole relations in OWL ontologies. W3C Editor’s draft, 11 August 2005.

[2] C. Maria Keet and Alessandro Artale. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, 2008, 3(1-2): 91-110.

IJMSO paper on dependencies between ontology design parameters online

At the time I wrote the previous post on dependencies between ontology design parameters in June, I was of the understanding that I would not be allowed to put my paper accepted for publication in the International Journal on Metadata, Semantics and Ontologies online on my homepage due to the restrictions mentioned in the small print in the copyright form. But upon having gone through the full printing procedure, Inderscience informed me that “the Author may post a postprint of the Article (defined as the Authors post-peer review,accepted paper submitted for final publication by Inderscience) on the authors personal web pages”.

So, the paper was published in volume 5, issue 4, of IJMSO recently and the nicely formatted version can be downloaded there. For those who do not have a subscription or do not want to pay for the improved layout, the scruffy postprint is now freely available and the complete reference is:

Keet, C.M. Dependencies between Ontology Design Parameters. International Journal of Metadata, Semantics and Ontologies, 2010, 5(4): 265-284. DOI: 10.1504/IJMSO.2010.035550.