Version 1.5 of the textbook on ontology engineering is available now

“Extended and Improved!” some advertisement could say of the new v1.5 of “An introduction to ontology engineering” that I made available online today. It’s not that v1 was no good, but there were a few loose ends and I received funding from the Digital Open Textbooks for Development (DOT4D) project to turn the ‘mere pdf’ into a proper “textbook package” whilst meeting the DOT4D interests of, principally, student involvement, multilingualism, local relevance, and universal access. The remainder of this post briefly describes the changes to the pdf and the rest of the package.

The main changes to the book itself

With respect to contents in the pdf itself, the main differences with version 1 are:

  • a new chapter on modularisation, which is based on a part of the PhD thesis of my former student and meanwhile Senior Researcher at the CSIR, Dr. Zubeida Khan (Dawood).
  • more content in Chapter 9 on natural language & ontologies.
  • A new OntoClean tutorial (as Appendix A of the book, introduced last year), co-authored with Zola Mahlaza, which is integrated with Protégé and the OWL reasoner, rather than only paper-based.
  • There are about 10% more exercises and sample answers.
  • A bunch of typos and grammatical infelicities have been corrected and some figures were updated just in case (as the copyright status of those was unclear).

Other tweaks have been made in other sections to reflect these changes, and some of the wording here and there was reformulated to try to avoid some unintended parsing of it.

The “package” beyond a ‘mere’ pdf file

Since most textbooks, in computer science at least, are not just hardcopy textbooks or pdf-file-only entities, the OE textbook is not just that either. While some material for the exercises in v1 was already available on the textbook website, this has been extended substantially over the past year. The main additions are:

There are further extras that are not easily included in a book, yet possibly useful to have access to, such as a list of ontology verbalisers with references that Zola Mahlaza compiled and an errata page for v1.

Overall, I hope it will be of (some) more use than v1. If you have any questions or comments, please don’t hesitate to contact me. (Now with v1.5 there are fewer loose ends than with v1, yet there’s always more that can be done [in theory at least].)

p.s.: yes, there’s a new front cover, so as to make it easier to distinguish. It’s also a photo I took in South Africa, but this time standing on top of Table Mountain.

Computer ethics (SIPP) notes relevant to South Africa

Social issues and Professional Practice in IT & Computing (formerly known as ‘computer ethics’ in our curriculum) increased in prominence in curriculum guidelines in recent years. Also, there is an increase in popular and scientific literature on computer ethics especially since Big Data, the popularisation of Artificial Intelligence, and now the 4th Industrial Revolution. Most of the articles and books are focussed on ethical and social issues where SIPP is taught mostly, being in ‘the West’.

It is taught elsewhere as well. For instance, since the early 2000s, the Computer Science Department at the University of Cape Town has taught it as part of a Masters in IT conversion course and as a block in a first-year computer science course. While initial material and lecture notes were reused from one of those universities in ‘the West’, over time, attempts have been made to localise it to some extent at least. For instance, South Africa has its own version of the EU’s GDPR (the POPI Act), there is a South African IT organisation (IITPSA) with its code of conduct, and the country is the textbook case that illustrates the concept of leapfrogging with its wireless network (and perhaps also with the digital divide). In addition, some ‘aspects’ look different from a country that is classified as an emerging economy than from a high-income country; e.g., patent protection and Silicon Valley’s data collection vs. potentially stifling emerging local tech companies and digital colonialism, respectively.

Updating lecture notes takes time, and so it is typically a multi-author effort carried out every few years, as it is in this case. What differs from the previous main update is that, in line with teaching and with the times, the lecture notes are now publicly available for free on UCT’s “Open Educational Resources” site. It is with some hesitation, as it clearly does not have the quality of a textbook and we know of certain limitations that I would have liked to see addressed. Yet, I hope that it may be of some use already nonetheless, be it for people in the region or from ‘outside’ looking in.

I have contributed some sections as well, partially because I think it’s an interesting theme and partially because I have to teach it. I would have liked to add more, but time was running out (i.e., it’s a balancing act with other commitments, like research, teaching, and admin). With more time, the privacy chapter would have been updated better (e.g., also touching upon privacy in the context of the common practice of mobile phone sharing), emerging concepts would have been better integrated (e.g., digital colonialism, surveillance capitalism), some of the separate exercises could have been integrated, and so on and so forth. Alas, maybe a next time. (To any of my students reading this: some of these aspects are already integrated in the slides that are used in the CSC1016S lectures, which are running ahead in content compared to the written notes, and that is examinable content as well.)

Design rationale and overview of the African Wildlife tutorial ontologies

(update 30-7-2020: more details are described in the journal article published in the Journal of Biomedical Semantics)

There are several tutorial ontologies, which typically focus on illustrating one or two aspects of ontology development, notably language features and automated reasoning. This may suffice for one’s aims, but for an ontology engineering course, one would need to be able to illustrate a myriad of development factors and devise exercises for a wider range of tasks of ontology development. For instance, one would want to illustrate the use of ontology design patterns, competency questions, foundational ontologies, and science-based modelling practices, none of which is addressed easily by the popular tutorial ontologies (notably: wine and pizza), perhaps because they predate most of the advances made in ontology engineering research. Also, I have noticed that my students replicate examples from the exercises they carry out and from inspecting popular and easy-to-find ontologies. Marking the practical assignments, I got to see sandwich and ice cream and burger ontologies with toppings and value partitions, and software and mobile phone ontologies where laptop models are instances rather than classes. Not providing good and versatile examples holistically causes the propagation of sub-optimal ontology development at least in the exercises, which then also may negatively affect the development of an operational domain ontology that the graduates may have to develop later on.

I’ve been exploring alternatives and variants over the past 11 years in the ontology engineering courses that I have taught yearly to about 8-40 students/year. In an attempt to systematise and possibly generalise from that, I’ve identified 22 requirements that contribute to a good tutorial ontology, which concern the suitability of the subject domain (7 factors), the ease of demonstrating logics and reasoning tasks (7), and assistance with demonstrating engineering aspects (8). Its details are described in a technical report [1]. I don’t claim that it’s an exhaustive list, but that it is one that may help someone to develop their own tutorial ontology in a fun or interesting topic if they so wish—after all, not everyone is interested in pizzas, wines, African wildlife, pets, shirts, a small university, or Robert Stevens’ family.

I’ve tried out a variety of extant tutorial ontologies as well as a range of versions of the African Wildlife Ontology (AWO) over the years (early experiences), eventually settling for a set of 14 versions, all the way from the example from the Primer [2] to DOLCE- and BFO-aligned versions to versions translated into several languages, and some with possible answers to some of the exercises. A graphical rendering of the main classes and relations is shown in the following figure:

The versions of the AWO are summarised in the following table, which is also mentioned as annotation in the OWL files.

 

The AWO meets a majority of the 22 requirements, is mature by now, and it has been used yearly in an ontology engineering course or tutorial since 2010. Also, it links up with my ontology engineering textbook with relevant examples and exercises. The AWO provides a wide range of options concerning examples and exercises for ontology engineering well beyond illustrating only logic features and automated reasoning. For instance, it assists in demonstrating tasks about ontology quality, such as alignment to a foundational ontology and satisfying competency questions, versioning, and multilingual ontologies. It is, for example, easier to demonstrate alignment of a class Animal to DOLCE’s (Non-Agentive) Physical Object than, say, debating what Algorithm aligns with or descending into political debates on the gender binary or what constitutes a family. One can use the height or the colours of the plants and animals to discuss how to model attributes as qualities or dependent entities cf. OWL’s data properties or an artificial ValuePartition. Declare, say, the individual lion simba as an instance of Lion, rather than dealing with the confusion regarding grape varieties. Use the intuitively obvious disjointness between animals and plants, and subsequently sensitise modellers to the far-reaching effects of declaring domain and range axioms by first asserting that animals eat animals, and then adding that carnivorous plants eat insects. In addition, it links up easily to topics for ontology integration activities, such as with biodiversity data, wildlife trade, and tourism to create, e.g., an OBDA system with freely available data (e.g., taken from here) or an ontology-enhanced website for an organisation that offers environmentally sustainable safaris. More examples of broad usage options are described in section 2.3 in the tech report.
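To make the domain-axiom pitfall mentioned above a bit more tangible, here is a minimal sketch in plain Python. It is a toy check and not an OWL reasoner: the dictionaries, the function names, and the restriction to the domain axiom only (the range axiom is left out) are assumptions made for illustration, and the class names merely follow the AWO example.

```python
# Toy illustration only: a hand-rolled check, not an OWL reasoner.
# All names and data structures here are hypothetical.
subclass_of = {
    "Lion": {"Animal"},
    "Insect": {"Animal"},
    "CarnivorousPlant": {"Plant"},
}
disjoint = [frozenset({"Animal", "Plant"})]
# Domain axiom: whatever eats something is an Animal.
domain = {"eats": "Animal"}
# Axioms of the form "C eats some D".
some_values = [
    ("Lion", "eats", "Animal"),
    ("CarnivorousPlant", "eats", "Insect"),
]


def ancestors(cls):
    """Return all asserted (transitive) superclasses of cls."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(cls):
    """Superclasses of cls, including those forced by domain axioms."""
    types = {cls} | ancestors(cls)
    for subject, prop, _filler in some_values:
        if subject in types and prop in domain:
            forced = domain[prop]
            types |= {forced} | ancestors(forced)
    return types


def unsatisfiable_classes():
    """Classes that end up under two disjoint classes at once."""
    result = []
    for cls in sorted({subject for subject, _, _ in some_values}):
        types = inferred_types(cls)
        if any(pair <= types for pair in disjoint):
            result.append(cls)
    return result


print(unsatisfiable_classes())
```

In this sketch, the domain axiom forces CarnivorousPlant under Animal, which clashes with it being a Plant; a real reasoner would report the class as unsatisfiable for the same reason.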

The AWO is freely available under a CC-BY licence through the textbook’s webpage at https://people.cs.uct.ac.za/~mkeet/OEbook/ in this folder. A more comprehensive description of the requirements, design, and content is given in a technical report [1] for the time being.

 

References

[1] Keet, CM. The African Wildlife Ontology tutorial ontologies: requirements, design, and content. Technical Report 1905.09519. 23 May 2019. https://arxiv.org/abs/1905.09519.

[2] Antoniou, G., van Harmelen, F. A Semantic Web Primer. MIT Press, USA. 2003.

Tutorial: OntoClean in OWL and with an OWL reasoner

The novelty surrounding all things OntoClean described here, is that we made a tutorial out of a scientific paper and used an example that is different from the (in?)famous manual example to clean up a ‘dirty’ taxonomy.

I’m assuming you have at least heard of OntoClean, which is an ontology-inspired method to examine the taxonomy of an ontology, which may be useful especially when the classes (/universals/concepts/..) have no or only a few properties or attributes declared. Based on that ontological information provided by the modeller, it will highlight violations of ontological principles in the taxonomy so that the ontologist may fix it. Its most recent overview is described in Guarino & Welty’s book chapter [1] and there are handouts and slides that show some of the intermediate steps; a 1.5-page summary is included as section 5.2.2 in my textbook [2].

Besides that paper-based description [1], there have been two attempts to get the reasoning with the meta-properties going in a way that can exploit existing technologies, which are OntOWLClean [3] and OntOWL2Clean [4]. As the names suggest, those existing and widely-used mechanisms are OWL and the DL-based reasoners for OWL, and the latter uses OWL2-specific language features (such as role chains) whereas the former does not. As it happened, some of my former students of the OE course wanted to try the OntOWLClean approach by Welty, and, as there were three students in the mini-project team, they also had to make their own example taxonomy, and compare the two approaches. It is their report and materials (by Todii Mashoko, Siseko Neti, and Banele Matsebula) that we (Zola Mahlaza and I) have brushed up and rearranged into a tutorial on OntoClean with OWL and a DL reasoner with accompanying OWL files for the main stages in the process.

There are the two input ontologies in OWL (the domain ontology to clean and the ‘ontoclean ontology’ that codes the rules in the TBox), an ontology for the stage after punning the taxonomy into the ABox, and one after having assigned the meta-properties, so that students can check they did the steps correctly with respect to the tutorial example and instructions. The first screenshot below shows a section of the ontology after pushing the taxonomy into the ABox and having assigned the meta-properties. The second screenshot illustrates a state after having selected, started, and run the reasoner and clicked on “explain” to obtain some justifications why the ontology is inconsistent.

Section of the punned ontology where meta-properties have been assigned to each new individual.

A selection of the inconsistencies (due to violating OntoClean rules) with their respective explanations

Those explanations, like shown in the second screenshot, indicate which OntoClean rule has been violated. Among others, there’s the OntoClean rule that (1) classes that are dependent may have as subclasses only those classes that are also dependent. The ontology, however, has: i) Father is dependent, ii) Male is non-dependent, and iii) Father has as subclass Male. This subsumption violates rule (1). Indeed, not all males are fathers, so it would be, at least, the other way around (fathers are males), but it also could be remodelled in the ontology such that father is a role that a male can play.

Let us look at the second generated explanation, which is about violating another OntoClean rule: (2) sortal classes have only as subclasses classes that are also sortals. Now, the ontology has: i) Ball is a sortal, ii) Sphere is a non-sortal, and iii) Ball has as subclass Sphere. This violates rule (2). So, the hierarchy has to be updated such that Sphere is not subsumed by Ball anymore. (e.g., Ball has as shape some Sphere, though note that not all balls are spherical [notably, rugby balls are not]). More explanations of the rule violations are described in the tutorial.
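The two rule violations discussed above can also be mimicked in a few lines of plain Python, just to show the shape of the check. This is a sketch under stated assumptions: it is not the OWL encoding used in the tutorial (where the rules live in the TBox of the ‘ontoclean ontology’ and the reasoner does the work), and the dictionaries and the function name are made up for illustration.

```python
# Meta-property assignments as given in the two explanations above.
# True = has the meta-property, False = explicitly does not have it.
dependent = {"Father": True, "Male": False}
sortal = {"Ball": True, "Sphere": False}

# (subclass, superclass) pairs as asserted in the 'dirty' taxonomy.
subsumptions = [("Male", "Father"), ("Sphere", "Ball")]


def violations(subsumptions, dependent, sortal):
    """Return (rule number, subclass, superclass) for each violated rule."""
    found = []
    for sub, sup in subsumptions:
        # Rule (1): a dependent class may only have dependent subclasses.
        if dependent.get(sup) is True and dependent.get(sub) is False:
            found.append((1, sub, sup))
        # Rule (2): a sortal class may only have sortal subclasses.
        if sortal.get(sup) is True and sortal.get(sub) is False:
            found.append((2, sub, sup))
    return found


for rule, sub, sup in violations(subsumptions, dependent, sortal):
    print(f"rule ({rule}) violated: {sub} is asserted as subclass of {sup}")
```

Classes without an assigned meta-property are simply skipped here, which is a simplification; it only flags a subsumption when both classes have an explicit assignment.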

Seeing that there are several possible options to change the taxonomy, there is no solution ontology. We considered creating one, but there are at least two ‘levels’ that will influence what a solution may look like: one could be based on a (minimum or not) number of changes with respect to the assigned meta-properties and another on re-examining the assigned meta-properties (and then restructuring the hierarchy). In fact, and unlike the original OntoClean example, there is at least one case where there is a meta-property assignment that would generally be considered to be wrong, even though it does show the application of the OntoClean rule correctly. How best to assign a meta-property, i.e., which one it should be, is not always easy, and the student is also encouraged to consider that aspect of the method. Some guidance on how to best modify the taxonomy—like Father is-a Male vs. Father inheres-in some Male—may be found in other sections and chapters of the textbook, among other resources.

 

p.s.: this tutorial is the result of one of the activities to improve on the OE open textbook, which are funded by the DOT4D project, as was the tool to render the axioms in DL in Protégé. A few more things are in the pipeline (TBC).

 

References

[1] Guarino, N. and Welty, C. A. (2009). An overview of OntoClean. In Staab, S. and Studer, R., editors, Handbook on Ontologies, International Handbooks on Information Systems, pages 201-220. Springer.

[2] Keet, C. M. (2018). An introduction to ontology engineering. College Publications, vol 20. 344p.

[3] Welty, C. A. (2006). OntOWLClean: Cleaning OWL ontologies with OWL. In Bennett, B. and Fellbaum, C., editors, Proceedings of the Fourth International Conference on Formal Ontology in Information Systems (FOIS 2006), Baltimore, Maryland, USA, November 9-11, 2006, volume 150 of Frontiers in Artificial Intelligence and Applications, pages 347-359. IOS Press.

[4] Glimm, B., Rudolph, S., Völker, J. (2010). Integrated metamodeling and diagnosis in OWL 2. In Peter F. Patel-Schneider, Yue Pan, Pascal Hitzler, Peter Mika, Lei Zhang, Jeff Z. Pan, Ian Horrocks, and Birte Glimm, editors, Proceedings of the 9th International Semantic Web Conference, LNCS vol 6496, pages 257-272. Springer.

DL notation plugin for Protégé 5.x

Once upon a time… the Protégé ontology development environment used Description Logic (DL) symbols and all was well, for some users at least. Then Manchester Syntax came along as the new kid on the block, using hearsay and opinion and some other authors’ preferences for an alternative rendering to the DL notation [1]. Subsequently, everyone who used Protégé was forced to deal with those new and untested keywords in the interface, like ‘some’ and ‘only’ and such, rather than the DL symbols. That had another unfortunate side-effect, being that it hampers internationalisation, for it jumbles up things rather awkwardly when your ontology vocabulary is not in English, like, say, “jirafa come only (oja or ramita)”. Even in an English-as-first-language country, it turned out that under a controlled set-up, the DL axiom rendering in Protégé fared well in a fairly large experiment when compared to the Protégé interface with the sort of Manchester syntax with GUI [2], and also the OWL 2 RL rules rendering appears more positive in another (smaller) experiment [3]. Various HCI factors remain to be examined in more detail, though.

In the meantime, we didn’t fully reinstate the DL notation in Protégé in the way it was in Protégé v3.x from some 15 years ago, but with our new plugin, it will at least render the class expression in DL notation in the tool. This has the benefits that

  1. the modeller will receive immediate feedback during the authoring stage regarding a notation that may be more familiar to at least a knowledge engineer or expert modeller;
  2. it offers a natural language-independent rendering of the axioms with respect to the constructors, so that people may develop their ontology in their own language if they wish to do so, without being hampered by continuous code switching or the need for localisation; and
  3. it also may ease the transition from theory (logics) to implementation for ontology engineering novices.

Whether it needs to be integrated further among more components of the tabs and views in Protégé or other ODEs, is also a question for HCI experts to answer. The code for the DL plugin is open source, so you could extend it if you wish to do so.

The plugin itself is a jar file that can simply be dragged into the plugin folder of a Protégé installation (5.x); see the github repo for details. To illustrate it briefly, after dragging the jar file into the plugin folder, open Protégé, and add it as a view:

Then when you add some new axioms or load an ontology, select a class, and it will render all the axioms in DL notation, as shown in the following two screenshots from different ontologies:

For the sake of illustration, here’s the giraffe that eats only leaves or twigs, in the Spanish version of the African Wildlife Ontology:

The first version of the tool was developed by Michael Harrison and Larry Liu as part of their mini-project for the ontology engineering course in 2017, and it was brushed up for presentation beyond that just now by Michael Harrison (meanwhile an MSc student at CS@UCT), which was supported by a DOT4D grant to improve my textbook on ontology engineering and accompanying educational resources. We haven’t examined all possible ‘shapes’ that a class expression can take, but it definitely processes the commonly used features well. At the time of writing, we haven’t detected any errors.

p.s.: if you want your whole ontology exported at once in DL notation and to latex, for purposes of documentation generation, that is a different usage scenario and is already possible [4].

p.p.s.: if you want more DL notation, please let me know, and I’ll try to find more resources to make a v2 with more features.

References

[1] Matthew Horridge, Nicholas Drummond, John Goodwin, Alan Rector, Robert Stevens and Hai Wang (2006). The Manchester OWL syntax. OWL: Experiences and Directions (OWLED’06), Athens, Georgia, USA, 10-11 Nov 2006, CEUR-WS vol 216.

[2] E. Alharbi, J. Howse, G. Stapleton, A. Hamie and A. Touloumis. The efficacy of OWL and DL on user understanding of axioms and their entailments. The Semantic Web – ISWC 2017, C. d’Amato, M. Fernandez, V. Tamma, F. Lecue, P. Cudre-Mauroux, J. Sequeda, C. Lange and J. He (eds.). Springer 2017, pp20-36.

[3] M. K. Sarker, A. Krisnadhi, D. Carral and P. Hitzler, Rule-based OWL modeling with ROWLtab Protégé plugin. Proceedings of ESWC’17, E. Blomqvist, D. Maynard, A. Gangemi, R. Hoekstra, P. Hitzler and O. Hartig (eds.). Springer. 2017, pp 419-433.

[4] Cogan Shimizu, Pascal Hitzler, Matthew Horridge: Rendering OWL in Description Logic Syntax. ESWC (Satellite Events) 2017. Springer LNCS. pp109-113

ISAO 2018, Cape Town, ‘trip’ report

The Fourth Interdisciplinary School on Applied Ontology has just come to an end, after five days of lectures, mini-projects, a poster session, exercises, and social activities spread over six days from 10 to 15 September in Cape Town on the UCT campus. It’s not exactly fair to call this a ‘trip report’, as I was the local organizer and one of the lecturers, but it’s a brief recap ‘trip report kind of blog post’ nonetheless.

The scientific programme consisted of lectures and tutorials on:

The linked slides (titles of the lectures, above) reveal only part of the contents covered, though. There were useful group exercises and plenary discussions on the ontological analysis of medical terms such as what a headache is, a tooth extraction, blood, or aspirin, an exercise on putting into practice the design process of a conceptual modelling language of one’s liking (e.g.: how to formalize flowcharts, including an ontological analysis of what those elements are and ontological commitments embedded in a language), and trying to prove some theorems of parthood theories.

There was also a session with 2-minute ‘blitztalks’ by participants interested in briefly describing their ongoing research, which was followed by an interactive poster session.

It was the first time that an ISAO had mini-projects, which turned out to have had better outcomes than I expected, considering the limited time available for it. Each group had to pick a term and investigate what it meant in the various disciplines (task description); e.g.: what does ‘concept’ or ‘category’ mean in psychology, ontology, data science, and linguistics, and ‘function’ in manufacturing, society, medicine, and anatomy? The presentations at the end of the week by each group were interesting and most of the material presented there easily could be added to the IAOA Education wiki’s term list (an activity in progress).

What was not a first-time activity, was the Ontology Pub Quiz, which is a bit of a merger of scientific programme and social activity. We created a new version based on questions from several ISAO’18 lecturers and a few relevant questions created earlier (questions and answers; we did only questions 1-3,6-7). We tried a new format compared to the ISAO’16 quiz and JOWO’17 quiz: each team had 5 minutes to answer a set of 5 questions, and another team marked the answers. This set-up was not as hectic as the other format, and resulted in more within-team interaction cf. among all participants interaction. As in prior editions, some questions and answers were debatable (and there’s still the plan to make note of that and fix it—or you could write an article about it, perhaps :)). The students of the winning team received 2 years free IAOA membership (and chocolate for all team members) and the students of the other two teams received one year free IAOA membership.

Impression of part of the poster session area, moving into the welcome reception

As with the three previous ISAO editions, there was also a social programme, which aimed to facilitate getting to know one another, networking, and having time for scientific conversations. On the first day, the poster session eased into a welcome reception (after a brief wine lapse in the coffee break before the blitztalks). The second day had an activity to stretch the legs after the lectures and before the mini-project work, which was a Bachata dance lesson by Angus Prince from Evolution Dance. Not everyone was eager at the start, but it turned out to be an enjoyable and entertaining hour. Wednesday was supposed to be a hike up the iconic Table Mountain, but of all the dry days we’ve had here in Cape Town, on that day it was cloudy and rainy, so an alternative plan of indoor chocolate tasting in the Biscuit Mill was devised and executed. Thursday evening was an evening off (from scheduled activities, at least), and Friday early evening we had the pub quiz in the UCT club (the campus pub). Although there was no official planning for Saturday afternoon after the morning lectures, there was again an attempt at Table Mountain, concluding the week.

The participants came from all over the world, including relatively many from Southern Africa with participants coming also from Botswana and Mauritius, besides several universities in South Africa (UCT, SUN, CUT). I hope everyone has learned something from the programme that is or will be of use, enjoyed the social programme, and made some useful new contacts and/or solidified existing ones. I look forward to seeing you all at the next ISAO or, better, FOIS, in 2020 in Bolzano, Italy.

Finally, as a non-trip-report comment from my local chairing viewpoint: special thanks go to the volunteers Zubeida Khan for the ISAO website, Zola Mahlaza and Michael Harrison for on-site assistance, and Sam Chetty for the IT admin.

From ontology verbalisation to language learning exercises

I’m aware that to most people ‘playing with’ (investigating) ontologies and isiZulu does not sound particularly useful on the face of it. Yet, there’s some long-term future music, like eventually being able to generate patient discharge notes in one’s own language, which will do its bit to ameliorate the language barrier in healthcare in South Africa so that patients at least will adhere to the treatment instructions a little better, and therewith receive better quality healthcare. But benefits in the short term might serve a purpose as well. To that end, I proposed an honours project last year, which has been completed in the meantime, and one of the two interesting outcomes has made it into a publication already [1]. As you may have guessed from the title, it’s about automation for language learning exercises. The results will be presented at the 6th Workshop on Controlled Natural Language, in Maynooth, Ireland in about 2 weeks’ time (27-28 August). In the remainder of this post, I highlight the main contributions described in the paper.

First, regarding the post’s title, one might wonder what ontology verbalisation has to do with language learning. Nothing, really, except that we could reuse the algorithms from the controlled natural language (CNL) for ontology verbalisation to generate (computer-assisted) language learning exercises whose answers can be computed and marked automatically. That is, the original design of the CNL for things like pluralising nouns, verb conjugation, and negation that is used for verbalising ontologies in isiZulu in theory [2] and in practice [3], was such that the sentence generator is a detachable module that could be plugged in elsewhere for another task that needs such operations.

Practically, the student who designed and developed the back-end, Nikhil Gilbert, preferred Java over Python, so he converted most parts into Java, and added a bit more, notably the ‘singulariser’, a sentence scrabble, and a sentence generator. Regarding the sentence generator, this is used as part of the exercises & answers generator. For instance, we know that humans and the roles they play (father, aunt, doctor, etc.) are mostly in isiZulu’s noun classes 1, 2, 1a, 2a, or 3a, that those classes do not (or rarely?) have non-human nouns, and that it generally holds for all humans and their roles that they can ‘eat’, ‘talk’ etc. This makes it relatively easy to create a noun chain and a verb chain list to mix and match nouns with verbs accordingly (hurrah! for the semantics-based noun class system). Then, with the 231 nouns and 59 verbs in the newly constructed mini-corpus, the noun chain and the verb chain, 39501 unique question sentences could be generated, using the following overall architecture of the system:

Architecture of the CNL-driven CALL system. The arrows indicate which upper layer components make use of the lower layer components. (Source: [1])

From a CNL perspective as well as the language learning perspective, the actual templates for the exercises may be of interest. For instance, when a learner is learning about pluralising nouns and their associated verb, the system uses the following two templates for the questions and answers:

Q: <prefixSG+stem> <SGSC+VerbRoot+FV>
A: <prefixPL+stem> <PLSC+VerbRoot+FV>
Q: <prefixSG+stem> <SGSC+VerbRoot+FV> <prefixSG+stem>
A: <prefixPL+stem> <PLSC+VerbRoot+FV> <prefixPL+stem>

The answers can be generated automatically with the algorithms that generate the plural noun (from ‘prefixSG’ to ‘prefixPL’) and add the plural subject concord (from ‘SGSC’ to ‘PLSC’, in agreement with ‘prefixPL’), which were developed as part of the GeNI project on ontology verbalization. This can then be checked against what the learner has typed. For instance, a generated question could be umfowethu usula inkomishi and the correct answer generated (to check the learner’s response against) is abafowethu basula izinkomishi. Another example is generation of the negation from the positive, or, vv.; e.g.:

Q: <PLSC+VerbRoot+FV>
A: <PLNEGSC+VerbRoot+NEGFV>

For instance, the question may present batotoba and the correct answer is then abatotobi. In total, there are six different types of sentences, with two double, like the plural above, hence a total of 16 templates. It is not a lot, but it turned out it is one of the very few attempts to use a CNL in such a way: there is one paper that also will be presented at CNL’18 in the same session [4], and an earlier one [5] uses a fancy grammar system (that we don’t have yet computationally for isiZulu). This is not to be misunderstood as a claim that this is one of the first CNL/NLG-based systems for computer-assisted language learning (e.g., there’s assistance in essay writing, grammar concept question generation, reading understanding question generation), but curiously there is very little on CNLs or NLG for the standard entry-level type of questions to learn the grammar. Perhaps the latter is considered ‘boring’ for English by now, given all the resources. However, thousands of students take introduction courses in isiZulu each year, and some automation can alleviate the pressure of routine activities from the lecturers. We have done some evaluations with learners, with encouraging results, and plan to do some more, so that it may eventually transition to actual use in the courses; that is: TBC…
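As a rough illustration of how the pluralisation and negation templates above turn into question/answer pairs that can be marked automatically, here is a small sketch in Python. It is not the system’s code (that back-end is in Java), and the tiny lookup tables are a deliberate simplification that covers only the words from the examples in this post; the actual algorithms handle the phonological conditioning of prefixes and concords, which this sketch ignores. All function and variable names are hypothetical.

```python
# Simplified lookup tables, keyed by the singular noun class; they cover
# only the example words in this post and ignore phonological conditioning.
NOUN_PREFIX = {1: ("um", "aba"), 9: ("in", "izin")}   # (prefixSG, prefixPL)
SUBJECT_CONCORD = {1: ("u", "ba")}                    # (SGSC, PLSC)
NEG_SUBJECT_CONCORD_PL = {1: "aba"}                   # PLNEGSC


def noun(stem, noun_class, plural):
    """<prefixSG+stem> or <prefixPL+stem>."""
    return NOUN_PREFIX[noun_class][1 if plural else 0] + stem


def verb(root, subject_class, plural, final_vowel="a"):
    """<SGSC+VerbRoot+FV> or <PLSC+VerbRoot+FV>."""
    return SUBJECT_CONCORD[subject_class][1 if plural else 0] + root + final_vowel


def pluralise_exercise(subject, verb_root, obj=None):
    """Return (question, answer); subject and obj are (stem, noun class)."""
    def sentence(plural):
        words = [noun(*subject, plural), verb(verb_root, subject[1], plural)]
        if obj is not None:
            words.append(noun(*obj, plural))
        return " ".join(words)
    return sentence(False), sentence(True)


def negation_exercise(verb_root, subject_class):
    """Q: <PLSC+VerbRoot+FV>, A: <PLNEGSC+VerbRoot+NEGFV>."""
    question = verb(verb_root, subject_class, plural=True)
    answer = NEG_SUBJECT_CONCORD_PL[subject_class] + verb_root + "i"
    return question, answer


def mark(learner_response, expected):
    """Mark a typed response against the generated answer."""
    return learner_response.strip().lower() == expected


question, answer = pluralise_exercise(("fowethu", 1), "sul", ("komishi", 9))
print(question, "->", answer)
print(negation_exercise("totob", 1))
```

The exact-match marking in `mark` is the simplest option one could pick; a deployed system may well want to be more forgiving about, say, spacing or capitalisation.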

 

References

[1] Gilbert, N., Keet, C.M. Automating question generation and marking of language learning exercises for isiZulu. 6th International Workshop on Controlled Natural Language (CNL’18). IOS Press. Co. Kildare, Ireland, 27-28 August 2018. (in print)

[2] Keet, C.M., Khumalo, L. Toward a knowledge-to-text controlled natural language of isiZulu. Language Resources and Evaluation, 2017, 51(1): 131-157.

[3] Keet, C.M., Xakaza, M., Khumalo, L. Verbalising OWL ontologies in isiZulu with Python. The Semantic Web: ESWC 2017 Satellite Events, Blomqvist, E. et al. (eds.). Springer LNCS vol. 10577, 59-64.

[4] Lange, H., Ljunglof, P. Putting control into language learning. 6th International Workshop on Controlled Natural Language (CNL’18). IOS Press. Co. Kildare, Ireland, 27-28 August 2018. (in print)

[5] Gardent, C., Perez-Beltrachini, L. Using FB-LTAG Derivation Trees to Generate Transformation-Based Grammar Exercises. Proc. of TAG+11, Sep 2012, Paris, France. pp. 117-125.

An Ontology Engineering textbook

My first textbook “An Introduction to Ontology Engineering” (pdf) has just been released as an open textbook. I have revised, updated, and extended my earlier lecture notes on ontology engineering, amounting to about 1/3 more new content cf. its predecessor. Its main aim is to provide an introductory overview of ontology engineering and its secondary aim is to provide hands-on experience in ontology development that illustrates the theory.

The contents and narrative are aimed at advanced undergraduate and postgraduate level in computing (e.g., as a semester-long course), and the book is structured accordingly. After an introductory chapter, there are three blocks:

  • Logic foundations for ontologies: languages (FOL, DLs, OWL species) and automated reasoning (principles and the basics of tableau);
  • Developing good ontologies with methods and methodologies, the top-down approach with foundational ontologies, and the bottom-up approach to extract as much useful content as possible from legacy material;
  • Advanced topics, with a selection of sub-topics: Ontology-Based Data Access, interactions between ontologies and natural languages, and advanced modelling with additional language features (fuzzy and temporal).

Each chapter has several review questions and exercises to explore one or more aspects of the theory, as well as descriptions of two assignments that require using several sub-topics at once. More information is available on the textbook’s page [also here] (including the links to the ontologies used in the exercises), or you can click here for the pdf (7MB).

Feedback is welcome, of course. Also, if you happen to use it in whole or in part for your course, I’d be grateful if you would let me know. Finally, if this textbook is used half (or even a quarter) as much as the 2009/2010 blogposts have been visited (around 10K unique visitors since posting them), that would mean there are a lot of people learning about ontology engineering and then I’ll have achieved more than I hoped for.

UPDATE: meanwhile, it has been added to several open (text)book repositories, such as OpenUCT and the Open Textbook Archive, and it has been featured on unglue.it in the week of 13-8 (out of its 14K free ebooks).

Ontology pub quiz questions of ISAO 2016 and JOWO 2017

In 2016 when I was a PC chair of the International School for Applied Ontology (ISAO 2016), the idea of organising a contest for the participants turned into a pub quiz somehow. The lecturers provided one or more questions on the topics they’d be teaching and I added a few as well. This set of ISAO16 ontology pub quiz questions is now finally online. It comes with the warning that it is biased toward the topics covered at ISAO 2016, and it turned out that there were a few questions not well formulated and/or not everyone agreed with the answer.

Notwithstanding, it was deemed sufficiently ok as an idea, in that the general chair of the Joint Ontology Workshops (JOWO 2017) wanted one for JOWO 2017 as well. Several questions were thrown out of the ISAO16 set for various reasons and more general Ontology questions made their way in, as well as a few ‘fun’ and trivia ones in the hope of adding some more entertainment to the ontology pub quiz. The JOWO17 pub quiz question set with answers is now also online to play with, which, in my opinion, is a nicer set than the ISAO16 one. Here are a few questions to give you a taste of it:

  • Where/when can a pointless theory be relevant?
  • What is the goal of guerrilla ontology?
  • No Italian pizza has fruit as topping. Which of the following is (an)/are Italian pizza(s)? Pizza Hawaii, Pizza margherita, Pizza bianca romana (‘white roman pizza’)
  • When was the earliest published occurrence of the word “ontology”?

It turned out that it still was not entirely free of debate. If you disagree with one of the answers now, then let me paraphrase Stefano Borgo, who co-ran the JOWO17 pub quiz at the Irish pub in Bolzano on 23 September: maybe there’s something there to write up and submit a paper to FOIS 2018 :-). Or you can write it in the blog post comments section below, so that those questions will/should not be recycled and I can add longer answers to the questions.

Round 2 of the search engine, browser, and language bias mini-experiment

Exactly a year ago I did a mini-experiment to see whether search engine bias exists in South Africa as well. It did. The notable case was that Google in English on Safari on the Mac (GES) showed results for ‘politically interesting searches’ that had less information and leaned to the right side of the political spectrum in a way that raised cause for concern, as compared to Google in isiZulu in Firefox (GiF) and Bing in English in Firefox (BEF). I repeated the experiment in the exact same way, with some of the same queries and a few new ones that take into account current affairs; the only difference being that I used my Internet connection at home rather than at work. The same problem still exists, sometimes quite dramatically. As a recommendation, then: don’t use Google in English on Safari on the Mac unless you want to be in an “anti-government Democratic Alliance as centre-of-the-world” bubble.

To back it all up, I took screenshots again, ordered from left to right as GiF, GES, BEF, so you can check for yourself what users with different configurations see on the first page of the search results. The set of clearly different/biased results is listed first.

  • “EFF”, which in South Africa is a left populist opposition party, and internationally the abbreviation of the Electronic Frontier Foundation:

    “EFF” search

    GiF lists it as political party; GES in relation to the DA first and then as political party; BEF as political party and electronic frontier foundation.

  • “jacob zuma”, the current president of the country: GiF first has a google ad to oust zuma, then general info and news; GES with a google ad to oust zuma, comment by JZ’s son blaming the whites (probably fuelling racial divisiveness), then general info and news; BEF has general info and news.

    “jacob zuma” search

  • “ANC”, currently the largest political party nationally and in power: GiF has first a link to ANC site, one news, and for the rest contact info; GES has first ‘bad press’ for the ANC as top stories, then twitter, then the ANC website; BEF lists first the ANC site, then news and info.

    “ANC” search

  • “Manana”, who is the Higher Education deputy minister who faces allegations of mistreatment by female staff members in his department: GiF with news about the accusations; GES has negative news about the ANC women’s league and DA actions; BEF shows info about Manana and mixed it up with the Spanish mañana.

    “Manana” search

  • The autocomplete function when typing “ANC” was somewhat surprising: GiF also associates it with ‘eff news’, and ‘zuma’; GES doesn’t have ‘eff news’ to suggest, so autocomplete also seems to be determined by the client-side configuration; BEF has all sorts of things.

    exploring the autocomplete on “ANC”

  • “white monopoly capital” (long story): GiF shows general info and news; GES also shows general info and news, but with that inciting blaming the whites news item; BEF shows general info and news as well, but differently ordered from Google’s result.

    “white monopoly capital” search

  • “DA”, which in South Africa refers to the abbreviation of the Democratic Alliance opposition party (capitalist, for the rich): GiF lists the DA website and some news; GES shows news on DA action and opinion, then the DA website; BEF lists the DA site, some general info and disambiguation.

    “DA” search

  • “motion of no confidence”, which was held last week against Jacob Zuma (the motion failed, but not by a large margin): GiF has again that Google ad for the organization to oust Zuma, then info and mostly news (with 1 international news site [Al Jazeera]); GES has info then SA opinion pieces rather than news; BEF has news and info.

    “motion of no confidence” search

  • “FeesMustFall”, which was one of the tags of the student protests in 2015 and 2016 (for free higher education): GiF has general info and news; GES shows first two ads to join the campaign, then general info and news; BEF has info and news. So, this seems flipped cf. last year.

    “FeesMustFall” search

Then the set of searches of which the results are roughly the same. I had expected this for “Law on cookies in South Africa” and “Socialism”, for they were about the same last year as well. I wasn’t sure about “women’s month” (this month, August), given its history; there are slight differences, but not much. The interesting one, perhaps, was that “state capture gupta” also showed similar results across the three configurations, all of them showing results to pages that treat it as fact and at least some detailed background reading on it.

“Law on cookies in South Africa” search

“Socialism” search

“women’s month” search

 

 

“state capture gupta” search

Finally, last year the mini-experiment was motivated by lecture preparations for the “Social Issues and Professional Practice” block of CSC1016S that I’m scheduled to teach in the upcoming semester (if there won’t be protests, that is). As compared to last year, now I can also add a note on the Algorithmic Transparency and Accountability statement from the ACM, in addition to the ‘filter bubble’ and ‘search engine manipulation’ items. Maybe I should cook up an exercise for the students so we can get data rather than still being in the realm of anecdotes with my 20 searches and three configurations. If you did the same with a different configuration, please let me know.