In July, at the start of the MOWS’10 course on ontology engineering, I pointed to further background literature on the debate about ontology as reality representation: its principal references and the new comprehensive assessment of its problems by Gary Merrill [1]. I also included the note from the Applied Ontology (AO) journal editors that Barry Smith and Werner Ceusters were writing a comprehensive rebuttal, to which Merrill would respond in turn. Both are out now [2,3], and also freely available through the dedicated AO page.
At a cursory glance, which turned up some juicy sentences, Smith and Ceusters’ 50-page reply [2] seemed like a good pastime for the gray, rainy, and cold Sunday afternoon last week, and an occasion to ponder if and how I would incorporate it in an updated version of the ontology engineering course. It contains, however, many harsh statements, with the main message that they are doing a great thing with their so-called “realist methodology” and that Merrill’s critique is irrelevant. Merrill’s 30-page response to that [3], which I finished reading recently, argues that Smith and Ceusters’ clarification made matters worse and thereby confirms that it is a misdirection.
So, what to make of all that? If I were a VIP in ontology engineering, I would ask the AO editors to write a proper reply to the paper by Smith and Ceusters (BS & WC). But I am not; hence, I will mention a few aspects on my blog only (which might do me more harm than good, but I hope not). I will start with a note on realism, then the usage of the term “application ontologies”, and finally the claims about BS & WC’s “realist methodology”, which is not a methodology.
Notes on realism
On the realism dimension of the debate, I have not much to say. I subscribe to what Merrill formulates as the “Empiricist Doctrine” [1], which states that “the terms of science… are to be taken to refer to the actually existing entities in the real world”, especially when it comes to ontologies for the life sciences and (bio)medicine. If you want an ontology of deities, fairies, or other story characters, that’s fine, too; just do not put them in a bio-ontology. What I had understood from the conversations, presentations, and papers of BS & WC is that if you accept the “Empiricist Doctrine”, then you must also go along with universals (as opposed to concepts). Merrill calls the latter component the “Universalist Doctrine”, where “the terms of science… are to be understood as referring directly to universals”, which is one of many metaphysical stances [1]. I do not know if I subscribe to universals and I do not care much about that. Although I did some philosophy of science and philosophy of nature a while ago and read up on other subjects in philosophy in the past few years, I am not a philosopher by training and do not know all the intricacies of all the alternatives around (but maybe I should).
Another reason for my misunderstanding (or rather, for conflating the two doctrines) is that the descriptions and definitions in the BS & WC papers are not consistent throughout (elaborated on by Merrill [1,3]). For instance, in [4], ontology is taken as reality representation, but in [2] it is the representation of reality as described by science, i.e., as scientists understand it, or in other words: the representation of the theories. Thus, the things in the ontology are terms that do not have a ‘direct link’ to the actual entities, but pass through the scientists’ minds with their conceptualizations of reality. This is quite a difference from [4]. Make of it what you like.
Last, the ‘funny’ thing is that when you use the Empiricist Doctrine it does not matter whether you use BFO, DOLCE, GFO, or any other foundational ontology for practical ontology development. None of the current formalisations of BFO, DOLCE, or the others states that the categories [unary predicates] denote either universals or concepts. Clearly, the communication of the informal intentions would be different if the top (owl:Thing or similar) in the ontology were called Universal or Concept, but in BFO it is called Entity and in DOLCE it is Particular. Thus, de facto, neither one commits to one philosophical doctrine or the other in the top-level categorization and formalisation.
What are “application ontologies”?
Smith and Ceusters in [4] make a distinction between reference ontologies and application ontologies, the former intended to represent “settled science” and the latter that part of science that is in flux. This distinction, which is rather difficult to maintain, is discussed at length in [3]. What I wish to add, and which was only mentioned in passing in [3], is that the notion of ‘application ontologies’ is used quite differently elsewhere in the ontologies enterprise. There, it refers to OWL- or DL-formalised conceptual data models that were modelled in one of the common conceptual data modelling languages (UML, EER, ORM), but not to real ontologies. The discussion about the difference between an ontology and a conceptual data model is beyond the current scope, but it is important to note that the same term means something different in pretty much all other literature about ontologies. Perhaps BS & WC have not read that literature, given that they happily attack computer science, knowledge engineering, and conceptual modelling (section 3.1 in [2]) with ‘justifications’ such as that Wüster-the-businessman over at ISO is a telling example of knowledge engineering and conceptual modelling (he is not), and that it was the training in cognitive psychology we all got as computer scientists (we did not) that makes us confused and stick to concepts instead of buying into the Universalist Doctrine. Such statements are not helpful.
Either way, application ontology as a formal conceptual data model is definitely a more tenable definition [setting aside whether one agrees with it] than application ontology for non-settled science, because there is no crisp boundary between settled and non-settled science. And as if the vague distinction did not complicate the debate enough already: concepts are allowed to appear in BS & WC’s application ontologies.
About “methodologies”
Smith and Ceusters propose their “realist methodology” in section 1 of [2], but a methodology it is not—at least, not in the sense I, and (m)any other people in CS & IT, use the term. What BS & WC put forward is a set of principles. It does not say what to do, how, and when. And there is no empirical validation that the resultant ontologies are better (validation sensu a proper scientific experiment with subjects with/without using the ‘methodology’, measurable quality criteria, statistically significant, etc.).
An example of a fairly straightforward methodology for ontology development is METHONTOLOGY (among others [5]), and a more recent one for collaborative distributed ontology development is the NeOn Methodology [6]. The latter has a nice, fairly comprehensive overview picture of the interactions of the different steps (see Fig. 1, below), which are described further in [6] (one aspect of this, the interactions between the different steps, is addressed in [7]). In my lectures, I like to be impartial and include a variety of options to sensitise ontology developers to the plethora of options (see, e.g., Sections 3 and 4 of the MOWS’10 course, which is an updated version of SemWebTech lecture 3+4, where the what comes before the how, outlined in SemWebTech lecture 5: Methods and Methodologies), but a set of principles that is labeled “methodology” is not something that fits in a real methodology section (though it may well fit in another module).
How can BS & WC even dare to propose a methodology for ontology development when disregarding all literature on ontology development (except for the OntoClean method)? If their methodology is so superior, then give me evidence why and how it is better than all the methodologies that have been proposed over the past 15 years or so. Spoon-feed me the shortcomings of those procedures; that is, not a lecture about realist vs. anti-realist, conceptualist, and what have you, but why I should not buy into collecting non-ontological resources, looking at ontology design patterns, providing intermediate steps for the formalization, and so forth.
Whilst reading section 1 of [2], I tried to extract a methodology (that is, reading it with a positive attitude to try to make something of it) but could find little, and what I did extract is not enough for practical ontology development and maintenance. As an example, let us take the step of “non-ontological resource reuse” for the chosen subject domain. In an ontology engineering methodology, this step includes options, such as assessing relevant thesauri, databases, and natural language text, and methods for each option, i.e., the how-to for reusing the non-ontological resources: manual database reverse engineering steps vs. semi-automated tools (in, say, VisioModeler, or the Protégé plugin Lubyte developed [8]), data mining and clustering, the different methods to extract terms from text, etc. From [2], e.g., section 1.13, I gather that the only way to execute this step of “non-ontological resource reuse” is that domain experts manually read the scientific literature and manually add the knowledge to the ontology. No help from, say, KEGG, AGROVOC, ICD10, or ontologies that were already developed by other groups (all that should be ignored), let alone automating anything, such as finding candidate terms with NLP tools. That surely must be a joke (or oversight, or sheer ignorance) and does not reflect what happens during the development of OBO ontologies. Or take, e.g., the stage in METHONTOLOGY or MoKi of intermediate representations between the domain expert’s informal representation and its formalisation in a suitable logic language, such as pseudo-natural language and diagrams as syntactic sugar for the underlying logic, or the Protégé and OBOEdit ODEs: are they to be ignored, too? Of course not; well, I presume that that is not the intention of BS & WC’s “methodology”.
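To make “finding candidate terms automatically” a bit more concrete, here is a minimal sketch of what such a step could look like. It is not taken from [2], from any of the cited methodologies, or from any existing tool: the function name candidate_terms, the tiny stopword list, and the sample text are all made up for illustration, and real term extraction would use proper NLP tooling (tokenisation, part-of-speech filtering, termhood measures). It merely counts recurring one- and two-word sequences and offers them to the domain expert as candidates.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "are", "to", "by"}


def candidate_terms(text, max_len=2, min_count=2):
    """Return (term, count) pairs for word n-grams that recur in the text."""
    counts = Counter()
    # Split on sentence punctuation so that n-grams do not span sentences.
    for sentence in re.split(r"[.;:!?]", text.lower()):
        words = re.findall(r"[a-z][a-z-]*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Skip n-grams that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    # Keep only the n-grams seen at least min_count times.
    return [(term, c) for term, c in counts.most_common() if c >= min_count]


# Made-up sample text standing in for a domain corpus.
sample = (
    "The cell membrane encloses the cell. "
    "Transport proteins are located in the cell membrane. "
    "A transport protein moves ions across the membrane."
)

for term, count in candidate_terms(sample):
    print(count, term)
```

The output is only a list of suggestions; deciding which candidates actually end up in the ontology, and where, remains the modeller’s job.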
They may have enjoyed having written a trashing of 20 years of knowledge engineering and conceptual data modelling whose outputs apparently can be ignored, but there surely is room to learn a thing or two about it. After reading up on the related works on methodologies, they can make a real attempt at developing a methodology that satisfies the set of principles, be that by developing a methodology from scratch or integrating it into (or extending) existing methodologies. Until then, what is presented in section 1 of [2] will not—cannot—be added to a ‘methods and methodologies’ module in an ontology engineering course.
P.S.: Other views
A different online debate about realism in ontology engineering can be read over at Phil Lord’s blog (The Status quo farewell tour on realism, Why not?, and Why realism is wrong) and his paper together with Robert Stevens at PLoS ONE [9], versus David Sutherland’s Realism, Really? and Yes, really in favour of the realist approach for practical ontology development. Then there is the OBO-Foundry discussion list, and, e.g., a paper in FOIS’10 by Michel Dumontier and Robert Hoehndorf [10], and undoubtedly more papers about the issues raised in the AO trilogy will follow.
References
[1] Gary H. Merrill. Ontological realism: Methodology or misdirection? Applied Ontology, 5 (2010) 79–108.
[2] Barry Smith and Werner Ceusters. Ontological realism: A methodology for coordinated evolution of scientific ontologies. Applied Ontology, 5 (2010) 139–188.
[3] Gary H. Merrill. Realism and reference ontologies: Considerations, reflections and problems. Applied Ontology, 5 (2010) 189–221.
[4] Barry Smith. Beyond Concepts, or: Ontology as Reality Representation. Achille Varzi and Laure Vieu (eds.), Formal Ontology and Information Systems. Proceedings of the Third International Conference (FOIS 2004), Amsterdam: IOS Press, 2004, 73-84.
[5] Corcho, O., Fernandez-Lopez, M. and Gomez-Perez, A. (2003). Methodologies, tools and languages for building ontologies. Where is their meeting point?. Data & Knowledge Engineering 46(1): 41-64.
[6] Mari Carmen Suarez-Figueroa, Guadalupe Aguado de Cea, Carlos Buil, Klaas Dellschaft, Mariano Fernandez-Lopez, Andres Garcia, Asuncion Gomez-Perez, German Herrero, Elena Montiel-Ponsoda, Marta Sabou, Boris Villazon-Terrazas, and Zheng Yufei. NeOn Methodology for Building Contextualized Ontology Networks. NeOn Deliverable D5.4.1. 2008.
[7] Keet, C.M. Dependencies between Ontology Design Parameters. International Journal of Metadata, Semantics and Ontologies, 2010, 5(4): 265-284.
[8] Lina Lubyte. Techniques and Tools for the Design of Ontologies for Data Access. PhD Thesis, Free University of Bozen-Bolzano, KRDB Dissertation Series DS-2010-02, 2010.
[9] Lord, P. & Stevens, R. Adding a little reality to building ontologies for biology. PLoS One, 2010, 5(9), e12258. DOI: 10.1371/journal.pone.0012258.
[10] Dumontier, M. & Hoehndorf, R. Realism for scientific ontologies. In: Formal Ontology in Information Systems: Proceedings of the Sixth International Conference (FOIS 2010), 387–399. Amsterdam: IOS Press, 2010.