Preliminary results on multilingual ontologies in Bantu languages

As the avid reader of this blog may remember, I wrote about isiZulu verbalization of ontologies before. That work presupposed that the isiZulu terms were stored in the ontology in some way, but it did not say anything about those details. In addition, with the 11 official languages in South Africa, some multilingualism may have to be catered for as well. Multilingual ontologies, be it for localization or internationalization of ontologies, are a hot topic: lots of results are becoming available, and one of the linguistic models for multilingual ontologies, ontolex-lemon, is a W3C Community Group result (specs). We, being my co-author Catherine Chavula and I, now have some first insights into that for Bantu languages, which are described in the paper Is Lemon Sufficient for Building Multilingual Ontologies for Bantu Languages? that was accepted recently at the 11th OWL: Experiences and Directions Workshop (OWLED’14), where I’ll present the paper (Riva del Garda, Italy, Oct 17-18, 2014).

The answer to the question in the title of the paper is a ‘not quite’. To justify that, we first identify the requirements for building Bantu lexica, be it in lemon format or another, with a focus in the paper on Chichewa (a language spoken widely in Malawi) and a bit on isiZulu. The Bantu noun class system is challenging, especially when taken together with the verb conjugation that is necessary for the OWL object properties. Noun classes are used to group nouns together, like masculine and feminine in some languages, but then based on semantic criteria, like whether the noun refers to a person, an animal, a long thin object, etc. Bantu languages have somewhere between 10 and 23 noun classes and they affect word forms. This in itself requires some creativity for creating a lexicon for an ontology, but the issue is exacerbated when considering the verbs, which are used to name object properties in an OWL ontology.

The common ontology development suggestion is to put a verb in 3rd person singular to name the object property. That won’t work that easily for Bantu languages, however: the noun class of the noun (of the OWL class) that plays the subject (or: the first class in, say, an all-some axiom) determines how a verb is conjugated. For instance, if a person (in noun class 1) eats something, it is udla (in isiZulu), whereas when a giraffe (in noun class 9 in isiZulu) eats something, it is idla. In lemon, this would amount to an awful lot of rules snuck into each lemon lexicon, hand-crafted for each OWL class where it applies (i.e., for those axioms in which a particular object property appears), and thus also with a lot of duplication, which is undesirable. Even when you know that the domain and range will be one OWL class (e.g., always person), the entry, which uses the lemon Morphology module, is non-trivial (fig 5 in the paper shows it for foaf:knows in Chichewa).
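
To make the udla/idla point a bit more concrete, here is a minimal Python sketch of a verb form being picked by the noun class of the subject. It is not how lemon or our lexicon does it; the table `SUBJECT_CONCORD` and the function `conjugate` are names made up for this illustration. The prefixes for noun classes 1 and 9 come from the example above, and the one for class 2 is added from standard isiZulu grammar descriptions.

```python
# Subject concords (verb prefixes) keyed by noun class number.
# Classes 1 and 9 follow the udla / idla example in the text; class 2 is
# added from standard isiZulu grammar descriptions. Illustrative only:
# a real lexicon would cover all noun classes and further verb morphology.
SUBJECT_CONCORD = {1: "u", 2: "ba", 9: "i"}


def conjugate(verb_root: str, noun_class: int) -> str:
    """Prefix the verb root with the subject concord of the subject's noun class."""
    try:
        return SUBJECT_CONCORD[noun_class] + verb_root
    except KeyError:
        raise ValueError(
            f"no subject concord listed for noun class {noun_class}"
        ) from None


print(conjugate("dla", 1))  # a person (noun class 1) eats: udla
print(conjugate("dla", 9))  # a giraffe (noun class 9) eats: idla
```

The point of the sketch is that one object property name cannot simply be stored as a single string: the surface form depends on which OWL class sits in the subject position of the axiom.
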
Annotating an ontology with noun classes and lemon is possible, but not immediately with an ‘out of the box’ lemon. The reason for this is that there was no linguistic resource that actually had sufficient information on the noun class system. So we had to develop a small noun class ontology so that it can be used in conjunction with other linguistic resources such as LexInfo. This is described in some detail in the paper. An example of the Chichewa nc:1 and nc:2 morphology using lemon rules is as follows:

[Figure fig3owled: Chichewa nc:1 and nc:2 morphology using lemon rules]

To put lemon to the test with this ncs ontology, Catherine made a version of FOAF in Chichewa using lemon, and did part of the GoodRelations ontology as well (available here). The foaf:person in Chichewa entry in the lexicon, which uses lemon, LexInfo, and the ncs ontology looks like this:

[Figure fig4owled: the foaf:person entry in the Chichewa lexicon]

The paper closes with some open issues that will have to be addressed to increase usability of lemon and ‘Bantu ontologies’, and we’re working on some of them (to be continued…).

The presentation of this paper and 10 other full presentations, 2 short presentations, several posters and demos, and two invited talks (by Nicola Guarino and Claudia d’Amato) are on the programme of OWLED’14. Registration is open, and I hope to see you there!

On ‘swapping’ your foundational ontology to increase interoperability

Over the past few years, I’ve been putting some effort into methods and tools, and some data collection and analysis, that would aid the use of foundational (top-level) ontologies in ontology engineering, such as DOLCE, GFO, and BFO, and some of their relations (mainly part-whole relations). This includes the Ontology Selection and Explanation Tool to choose the most suitable foundational ontology [1], OntoPartS [2] and OntoPartS-2 [3] for software-supported modeling of part-whole relations, and an experimental validation that using a foundational ontology does make a difference [4]. The latest addition is SUGOI (Software Used to Gain Ontology Interchangeability), initiated by Zubeida Khan’s idea mentioned in her (cum laude) MSc thesis, which I supervised.

In the meantime, SUGOI has been implemented, and we have used it to answer principally two questions:

  1. Is it feasible to automatically generate links between ontology Oa and foundational ontology Oy, given Oa is linked to Ox? Say, I have an ontology linked to BFO, then can I swap BFO for DOLCE?
  2. If there are issues with the former, what is causing them? Or: in practice, which entities of Ox are typically used for mappings with domain ontologies that may not be present, or present in an incompatible way, in Oy? Or: if not, then why not?

We tested this with 16 ontologies that are linked to a foundational ontology, and the results have just been accepted at the 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14) [5].

Now, I already know that some of you will say (and, in fact, have said!) that this is not feasible at all. Arguments on philosophical distinctions are there, yes, but not all of that appears in an OWL file and in the modeller’s view (see also an earlier post and references therein). Put differently: things are not as clear-cut and black-and-white as they may initially seem. We did observe a basic, or raw, ‘swapping success rate’ ranging from 2% for the PID ontology, when swapping the GFO it was aligned to for BFO, up to a whopping 82% for the IDO ontology, when swapping BFO for either DOLCE or GFO (averaging 36% for the real ontologies we tested with). Now, there.

So, what’s really happening? The success rate actually depends on several factors. Some entities in, say, BFO, while named differently, do have an equivalent in DOLCE or GFO, which may or may not be in a similar place in the ontology (if not, you still end up with an inconsistency, so we removed those as mappings); others do not. Those mappings have been investigated in detail [6], and, indeed, there aren’t many, but surely there are some. Several domain ontologies have alignments to only a few categories in a foundational ontology, others have more. If there aren’t many links, or they are predominantly to entities for which there exists an equivalence assertion, then your ‘swapping success rate’ (called raw interchangeability in the paper) is high. Thus, it is not that it is not feasible at all.
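
For those who prefer to see the idea in code, here is a rough Python sketch of the swapping step and the resulting percentage. It is not SUGOI’s actual algorithm or the paper’s exact definition: it assumes raw interchangeability is simply the share of links that can be retargeted through an equivalence mapping, it allows only one link per domain entity, and all entity names (`Dom:…`, `Ox:…`, `Oy:…`) as well as `swap_links` and `SwapResult` are invented placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SwapResult:
    swapped: dict[str, str]   # domain entity -> entity in the target ontology Oy
    unmapped: dict[str, str]  # domain entity -> Ox entity without a counterpart

    @property
    def raw_interchangeability(self) -> float:
        """Percentage of links that could be retargeted (assumed definition)."""
        total = len(self.swapped) + len(self.unmapped)
        return 100.0 * len(self.swapped) / total if total else 0.0


def swap_links(links: dict[str, str], mapping: dict[str, str]) -> SwapResult:
    """Retarget each link from Ox to Oy where the mapping has an equivalent."""
    swapped: dict[str, str] = {}
    unmapped: dict[str, str] = {}
    for domain_entity, ox_entity in links.items():
        oy_entity = mapping.get(ox_entity)
        if oy_entity is None:
            unmapped[domain_entity] = ox_entity
        else:
            swapped[domain_entity] = oy_entity
    return SwapResult(swapped, unmapped)


# Hypothetical domain ontology with four links into foundational ontology Ox.
links = {
    "Dom:A": "Ox:E1",
    "Dom:B": "Ox:E2",
    "Dom:C": "Ox:E3",
    "Dom:D": "Ox:E4",
}
# Hypothetical equivalence mapping: only two Ox entities have a match in Oy.
mapping = {"Ox:E1": "Oy:F1", "Ox:E2": "Oy:F2"}

result = swap_links(links, mapping)
print(f"{result.raw_interchangeability:.0f}%")  # 2 of 4 links retargeted: 50%
print(sorted(result.unmapped))                  # ['Dom:C', 'Dom:D']
```

The sketch shows why the rate is driven by which foundational entities a domain ontology happens to link to: a few links to well-mapped categories give a high percentage, and many links to categories without an equivalent give a low one.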

The interface of the online desktop version of SUGOI.

Sounds obvious when one puts it like that. But what about my ontology, you may wonder. Use SUGOI to find out. The log file shows what’s been done in the process, and computes those raw interchangeability metrics for you. SUGOI is ‘trivial’ to extend to include foundational ontologies other than DOLCE, BFO, and GFO: just the mapping files have to be added, and that doesn’t really change the algorithm.

We also looked at the data, especially for the ones with a low success rate, to figure out what causes it. It appeared that those that use DOLCE probably do so because it has some nice knowledge about attributive properties that are not represented (BFO) or represented in an incompatible way (GFO) elsewhere. Likewise, those ontologies that were linked to BFO or GFO and for which there was a lower interchangeability to DOLCE had quite a few links to aspects on roles, which aren’t in DOLCE proper, so that was causing a relatively lower success rate there (more details in the paper). We leave it up to the developers of the respective foundational ontologies to decide whether they want to fill that ‘gap’ in their ontology.

We also checked SUGOI’s output against ontologies that had been aligned manually to more than one foundational ontology by the developers. We could find only two that were: BioTop and the Stuff Ontology. Mainly, we found the odd error in alignment and a few ones missed by manual alignment, but with n=2, those results are quite at the level of interesting anecdote (observing that the plural of anecdote is not data).

Whether you want to swap, or offer your ontology aligned to more than one foundational ontology to increase its interoperability with other ontologies, is, clearly, your choice to make. If you decide to do so, you could do that manually, but SUGOI automates that process for you as much as possible. Both Zubeida and I plan to be at EKAW’14, hopefully also with a demo, so that you not only can test it with your ontology (which you can do already on the SUGOI page), but also gain some further detailed insights into the algorithm, the mapping files it used, and the consequences for your ontology.

References

[1] Khan, Z., Keet, C.M. ONSET: Automated Foundational Ontology Selection and Explanation. 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12), A. ten Teije et al. (Eds.). Oct 8-12, Galway, Ireland. Springer, LNAI 7603, 237-251.

[2] Keet, C.M., Fernandez-Reyes, F.C., Morales-Gonzalez, A. Representing mereotopological relations in OWL ontologies with OntoPartS. 9th Extended Semantic Web Conference (ESWC’12), Simperl et al. (eds.), 27-31 May 2012, Heraklion, Crete, Greece. Springer, LNCS 7295, 240-254.

[3] Keet, C.M., Khan, M.T., Ghidini, C. Ontology Authoring with FORZA. 22nd ACM International Conference on Information and Knowledge Management (CIKM’13). ACM proceedings, pp569-578. Oct. 27 – Nov. 1, 2013, San Francisco, USA.

[4] Keet, C.M. The use of foundational ontologies in ontology development: an empirical assessment. 8th Extended Semantic Web Conference (ESWC’11), G. Antoniou et al (Eds.), Heraklion, Crete, Greece, 29 May-2 June, 2011. Springer, Lecture Notes in Computer Science LNCS 6643, 321-335.

[5] Khan, Z.C., Keet, C.M. Feasibility of automated foundational ontology interchangeability. 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14). 24-28 Nov, 2014, Linköping, Sweden. Springer LNAI. (accepted)

[6] Khan, Z., Keet, C.M. Addressing issues in foundational ontology mediation. 5th International Conference on Knowledge Engineering and Ontology Development (KEOD’13), Vilamoura, Portugal, 19-22 September. Filipe, J. and Dietz, J. (Eds.), SCITEPRESS, pp5-16.