Ontologies for better communication

I just returned from an interesting summer school [1] about reasoning on the Web with rules and ontologies and their possible applications for the life sciences. As per usual, discussions about the covered material and related topics continued during the breaks and dinners. One of those aspects was about the aim of ontologies – that is, the plural that generally refers to the engineering artifacts, and not the singular that focuses on philosophical considerations.

Notwithstanding the pictures in Silvie Spreeuwenberg's and Michael Schroeder's presentations suggesting that ontologies (aim to or are used to) facilitate machine-machine communication, Ben Good was convinced that the ultimate aim of ontologies is to improve human-to-human communication. For instance, the Gene Ontology project [2] started because geneticists investigating genes of different types of organisms wanted to link up the data in their databases. This can be seen as linking and integrating databases, i.e. machine-machine communication, but also as humans from different research communities ultimately wanting to share a common vocabulary in order to communicate and do cross-species research. Likewise, ontologies for content negotiation and mediation among software agents can be seen in the light of human-to-human communication, although sometimes only if one stretches the context. For instance, if you book a ticket online, then your software interacts with the software of the ticket-seller, and the whole point is to reduce, or even eliminate, user-user interaction. Of course, when you extend the context, you could argue that behind the software of the ticket-seller there is, indirectly, a human being, although one who does not want to interact with you personally. The same holds for linking up other resources on the Web, like biological databases and workflows: you don't want to bother some curator personally, but to access the data directly through machine-machine communication, facilitated by an ontology that ensures your data and theirs are of the same type.

In addition, it has been noted before (e.g. [3]) that ontologies are for run-time access to represented knowledge in order to improve computation for data, information, and knowledge management, i.e. models intended to be machine-interpretable but not necessarily human-readable.

Another factor in favour of ontologies for human-to-human communication is the view of an ontology as a (formal) representation of a scientific theory, advocated by e.g. Barry Smith and colleagues [4]. I have made a more cautious observation [5]: it appeared to be a nice ‘side effect’ of the ontologies approach, at least in the scope of that research (semi-automated development of eco-ontologies based on STELLA models). According to the user, the disambiguated formal representation would make scientific discourse easier. However, this approach tends toward the philosophical notion of Ontology as a representation of reality, not the engineering artifacts of computer science & IT that have to have an immediate use, competitive advantage, and return on investment through database integration, software agent mediation and the like (see [6] on Ontology-driven Information Systems for a useful introduction).

A third aim for ontologies can be to improve communication between human and computer. However, at present this is more of an intermediate aim: better human-computer interaction during ontology development and maintenance serves to improve the quality of an ontology, which is then ultimately used for human-human or computer-computer communication. For instance, one can verbalize formal ontologies and conceptual models [7], which renders the formalisms as fixed-syntax pseudo-natural language statements that a domain expert can understand, instead of, say, first order logic or description logic axioms. This option has been available for quite some time for ORM models in English, supported in software like VisioModeler and Microsoft Visio, and is now available for 10 languages in DogmaModeler [8].
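To give a feel for what such a verbalization involves, here is a minimal sketch in Python. It is not the approach of [7] or of DogmaModeler: the constraint kinds, the template wording, and the names `TEMPLATES` and `verbalize` are all made up for illustration, and only English is covered.

```python
# Hypothetical fixed-syntax templates, one per kind of constraint.
# The wording is an assumption for illustration, not taken from any tool.
TEMPLATES = {
    "subsumption": "Each {sub} is a {sup}.",
    "mandatory": "Each {subject} {role} at least one {object}.",
    "uniqueness": "Each {subject} {role} at most one {object}.",
}


def verbalize(axiom):
    """Render one constraint (a dict with a 'kind' key) as a sentence."""
    template = TEMPLATES[axiom["kind"]]
    # Everything except 'kind' fills a slot in the template.
    slots = {key: value for key, value in axiom.items() if key != "kind"}
    return template.format(**slots)


if __name__ == "__main__":
    model = [
        {"kind": "subsumption", "sub": "Student", "sup": "Person"},
        {"kind": "mandatory", "subject": "Person", "role": "has", "object": "Name"},
        {"kind": "uniqueness", "subject": "Person", "role": "has", "object": "Name"},
    ]
    for axiom in model:
        print(verbalize(axiom))
```

Supporting another language in this sketch would mean supplying another set of templates; whether that is enough for languages with richer agreement rules is a separate question.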

Taking a consensus approach, one could say that it depends on the (sub-)goal of deploying the ontologies, which are intended primarily to improve either human-human or machine-machine communication. From an engineering perspective, the latter is certainly more important, but if one develops domain ontologies and interacts with the domain experts, putting an emphasis on the former may be more effective for achieving their adoption among a broader user base.

References

[1] ReasoningWeb http://reasoningweb.org/2006/.
[2] Gene Ontology http://www.geneontology.org/.
[3] Jarrar, M., Demey, J., Meersman, R. On Using Conceptual Data Modeling for Ontology Engineering. Journal on Data Semantics Special issue on “Best papers from the ER/ODBASE/COOPIS 2002 Conferences”, 2003, 1(1): 185-207.
[4] OBO Foundry http://www.obofoundry.org/.
[5] Keet, C.M. Factors affecting ontology development in ecology. Data Integration in the Life Sciences 2005 (DILS2005), Ludaescher, B, Raschid, L. (eds.). San Diego, USA, 20-22 July 2005. Lecture Notes in Bioinformatics 3615, Springer Verlag, 2005. pp46-62.
[6] Guarino, N. Formal Ontology and Information Systems. Formal Ontology in Information Systems, Proceedings of FOIS’98, Trento, Italy, Amsterdam: IOS Press. 1998.
[7] Jarrar, M., Keet, C.M., Dongilli, P. Multilingual verbalization of ORM conceptual models and axiomatized ontologies. STARLab Technical Report, Vrije Universiteit Brussel. February 2006.
[8] Technical reports with an example for each supported language (English, Dutch, German, Italian, Spanish, Catalan, French, Arabic, Russian, and Lithuanian): http://www.starlab.vub.ac.be/staff/mustafa/orm/verbalization/.

4 responses to “Ontologies for better communication”

  1. As one of the prime instigators of the discussion that led to this post and the leader of the opposition 🙂 I guess I should put my two cents in here. Reading your post reminds me of a question I had: do you consider ontologies any different from any other potential software component? From my perspective, they are an example of something fairly distinct from, say, the software that manages bit-level communications between my computer and your computer. They are distinct because they are generally meant to embody knowledge about a particular domain, where that domain generally has nothing to do with computer-computer interaction.

    I guess maybe the fundamental thing is that an ontology is meant for interaction with some ‘intelligent’ agent. Whether that agent is a person or a clever computer program, the value of the ontology is in its ability to somehow ‘match’ with some component of the intelligent agent’s knowledge. For example, a simple, important thing you can do with ontologies is extend search with reasoning. Searching for ‘developmental process’ using the Gene Ontology can get you things like ‘angiogenesis’ because the GO stores the knowledge that angiogenesis is a kind of development. Great, but it is useful only when the knowledge in the ontology somehow agrees with what the user ‘knows’. If the user does not ‘know’ anything then there is no value in interacting with the ontology (unless they are trying to learn from it). So, ontologies are for interaction with intelligent agents, people are by far the most common form of such agents, and thus, for the moment, ontologies are for interaction with people 😉
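    The search-with-reasoning example in this comment can be sketched in a few lines of Python. The is-a edges and gene annotations below are invented for illustration (they are not copied from the actual GO graph), and `ancestors` and `search` are hypothetical helper names.

```python
from collections import deque

# Illustrative is-a edges (child -> parents). These are assumptions made up
# for the example, not the real Gene Ontology structure.
IS_A = {
    "angiogenesis": ["blood vessel development"],
    "blood vessel development": ["developmental process"],
    "developmental process": ["biological process"],
    "cell division": ["cellular process"],
    "cellular process": ["biological process"],
}

# Hypothetical annotations: item -> the term it is annotated with.
ANNOTATIONS = {
    "gene_a": "angiogenesis",
    "gene_b": "cell division",
    "gene_c": "developmental process",
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is-a edges."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def search(query, annotations=ANNOTATIONS, is_a=IS_A):
    """Return items annotated with `query` or with any term it subsumes."""
    return sorted(
        item
        for item, term in annotations.items()
        if term == query or query in ancestors(term, is_a)
    )


if __name__ == "__main__":
    print(search("developmental process"))
```

    A plain string match on ‘developmental process’ would return only gene_c here; following the is-a edges also brings in gene_a, which is annotated with the more specific term.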

  2. On the question whether I consider ontologies any different from any other potential software component, I would initially have answered yes, were it not for the sentences you write after the question, namely that you consider ontologies to be fairly distinct from software that manages bit-level communication. In fact, one of the early distinctions made between ontologies and conceptual models was that conceptual models are mostly “off-line” models whereas ontologies demand run-time availability for interoperability, querying etc. With tools like iCOM and ConQuer/ActiveQuery, this distinction got somewhat blurred: e.g. iCOM has a mapping from ER and UML-like diagrams to DLR, which in turn is hooked up to a reasoner, hence the conceptual model is usable at runtime for other tasks. This aside, ontologies are, unless scribbled on paper, machine interpretable, hence can be, and are, part of the bit-level communications. But they are not just some piece of software: a piece of software is always implementation-dependent and therefore not particularly reusable. Even C++ libraries and design patterns are limited to OO design and programming, and have to be redesigned if you want to use some of them for database development. Ontologies, at least in principle, are at a higher level of abstraction by focusing more on what is represented than on the technology-dependent how. An ontology, or part of it, can be used to develop conceptual models that in turn form the basis for designing and programming a software component, so in that sense they are different from the (application) software component itself.

    The second salient part of your comment I would like to respond to is the attempt to justify the “intelligent” communication (be it with an agent or a human), the claim that the user must “know” and agree with the knowledge the ontology contains, and that there is supposedly “no value in interacting with the ontology” if the user doesn’t know. I beg to differ. Given the well-known Database Comprehension Problem (the difficulty of grasping the content at both the conceptual model level and the implemented database when the database becomes ‘too large’), an Ontology Comprehension Problem pops up even sooner, in particular with the bio-ontologies. Who knows all the details of the GO with its 18000 or so entity types and many more relations, let alone the FMA with its 72000 entity types and 1.9 million relations? What if OBO or some other more or less coordinated effort manages to link up ontologies spanning different levels of granularity in biology? Then the scientist can do focused searches. For instance, an expert on the microbiological and biochemical facets of cholera needs to know details about the intestine and some related entity types, but doesn’t care about, say, anatomical structures of the brain. There is no point in the researcher learning all of the intricacies of human anatomy and reinventing the wheel, and s/he wouldn’t ‘know’ enough about what exactly is in the ontology. In the other direction, wading through all the information about genetics takes way too much time, and then it is very efficient if there is an ontology built by geneticists-ontologists that the researcher can either query directly for the desired information or drill down in to just the information s/he’s looking for on genes involved in cholera infections.
    Or suppose a pharmacologist has as drug target a molecule of type x at location y with reachability through z (intravenous, oral,…). Then it will be very useful to ask the oracle (= those ideal ontologies, linked up or integrated, that cover your domain of interest) if type x happens to be located somewhere else in the body too, and if so, what x is involved in there, if that other location y’ happens to be part of the same system as y, and if they also share z (because then the potential side effects of the drug may become too serious).
    By doing so, quite a lot of automated reasoning/querying occurs (ultimately that “bit-level communication”), and while the user doesn’t comprehend the ontology fully, it is nevertheless a time-saving method of information retrieval from the researcher’s perspective. The user isn’t dumb but cannot know everything, so it would be smart indeed to efficiently and effectively use other people’s knowledge about nature that is represented in the bio-ontologies in order to speed up one’s own research, which then can be used by other people who happen to be (over?)specialized in their own field but need to dabble in another sub-discipline to answer a focussed question, and so forth. In this sense, large ontologies are a different kind of “information repositories” with the information at your fingertips, compared to, say, string searches in Google.
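    As a rough sketch of the pharmacologist's question to the ‘oracle’, assuming the linked-up ontologies can be flattened into three simple lookup tables: every entry below is hypothetical and has no biological authority, and `side_effect_candidates` is a made-up name.

```python
# Hypothetical facts, invented purely for illustration.
LOCATED_IN = {
    "molecule_x": {"small intestine", "stomach", "cerebral cortex"},
}
PART_OF_SYSTEM = {
    "small intestine": "digestive system",
    "stomach": "digestive system",
    "cerebral cortex": "nervous system",
}
REACHABLE_VIA = {
    "small intestine": {"oral"},
    "stomach": {"oral"},
    "cerebral cortex": {"intravenous"},
}


def side_effect_candidates(molecule, target_location, route):
    """Other locations of `molecule` in the same system as the target
    location that are reachable through the same route."""
    target_system = PART_OF_SYSTEM.get(target_location)
    hits = []
    for location in sorted(LOCATED_IN.get(molecule, set())):
        if location == target_location:
            continue  # the intended target is not a side effect
        same_system = PART_OF_SYSTEM.get(location) == target_system
        same_route = route in REACHABLE_VIA.get(location, set())
        if same_system and same_route:
            hits.append(location)
    return hits


if __name__ == "__main__":
    print(side_effect_candidates("molecule_x", "small intestine", "oral"))
```

    The user only needs to supply x, y and z; which other locations exist and how they relate is exactly the knowledge s/he does not have to carry around.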

    So, you still haven’t convinced me that ontologies are for interacting with people [only / mainly ]. I do see the usefulness of ontology development to get people to communicate, and collaborate, share etc etc, but that is just part of the whole “let’s do ontologies” approach.

  3. I never meant to imply that a consumer of an ontology had to know ~every detail of the knowledge represented within it in order to take advantage of it. I believe this is where you were getting that idea? “…If the user does not ‘know’ anything then there is no value in interacting with the ontology (unless they are trying to learn from it)…”. There is a rather substantial difference between “everything” and “anything”. In your examples there is a clear need for some knowledge on the part of the consumer: your cholera researcher clearly knows that intestines exist prior to making a request to get more details about their parts in the FMA…

    Let me change the direction of the discussion a little bit. I would say that we would both agree that ontologies (like lots of other things) are useful repositories of knowledge. Before going further, could you state your position on this question –

    What properties must an entity exhibit in order for it to ~use knowledge?

  4. No, I was quoting the “if the user does not ‘know’…” part from your first comment. What a/the minimal amount of information should be is another topic, but reasoning features can ‘sustain’ quite ignorant human users & software agents.

    On “what properties must an entity exhibit in order for it to use knowledge?”, there can be many readings of the question and long answers. Being finicky, I pose two counter questions before trying to answer the question:
    1) what do you have in mind with “using”, for an indication on the ‘how’ and using as in using by computers, by humans, or both?
    2) you mean “knowledge” in the sense of the standard drill in CS about distinctions with the sequence ‘from data to information to knowledge’? Even with that default distinction, things can get a bit blurred; for instance, for granular reasoning over ontologies, the ontology is my data source (even though it represents knowledge we have of a certain subject domain and/or it is an as accurate as possible representation of a piece of reality) – or would you consider that to be an example of “using knowledge”?
