A requirements catalogue for ontology languages

If you could ‘mail order’ a language for representing ontologies or fancy knowledge graphs, what features would you want it to have? Or, from an artefact development viewpoint: what requirements would it have to meet? Perhaps it may not be a ‘Christmas wish list’ in these days, but a COVID-19 lockdown ‘keep dreaming’ one instead, although perhaps it may even be feasible to realise if you don’t ask for too much. Either way, answering this on the spot may not be easy, and possibly incomplete. Therefore, I have created a sample catalogue, based on the published list of requirements and goals for OWL and CL, and I added a few more. The possible requirements to choose from currently are loosely structured into six groups: expressiveness/constructs/modelling features; features of the language as a whole; usability by a computer; usability for modelling by humans; interaction with ‘outside’, i.e., other languages and systems; and ontological decisions. If you think the current draft catalogue should be extended, please leave a comment on this post or contact the author, and I’ll update accordingly.


Expressiveness/constructs/modelling features

E-1 Equipped with basic language elements: predicates (1, 2, n-ary), classes, roles, properties, data-types, individuals, … [select or add as appropriate].

E-2 Equipped with language features/constraints/constructs: domain/range axioms, equality (for classes, for individuals), cardinality constraints, transitivity, … [select or add as appropriate].

E-3 Sufficiently expressive to express various commonly used ‘syntactic sugarings’ for logical forms or commonly used patterns of logical sentences.

E-4 Such that any assumptions about logical relationships between different expressions can be expressed in the logic directly.


Features of the language as a whole

F-1 It has to cater for meta-data; e.g., author, release notes, release date, copyright, … [select or add as appropriate].

F-2 An ontology represented in the language may change over time and it should be possible to track that.

F-3 Provide a general-purpose syntax for communicating logical expressions.

F-4 Unambiguous, i.e., not needed to have negotiation about syntactic roles of symbols, or translations between syntactic roles.

F-5 Such that every name has the same logical meaning at every node of the network.

F-6 Such that it is possible to refer to a local universe of discourse (roughly: a module).

F-7 Such that it is possible to relate the ontology to other such universes of discourse.

F-8 Specified with a particular semantics.

F-9 Should not make arbitrary assumptions about semantics.

F-10 Cater for internationalization (e.g., language tags, additional language model).

F-11 Extendable (e.g., regarding adding more axioms to same ontology, add more vocabulary, and/or in the sense of importing other ontologies).

F-12 Balance expressivity and complexity (e.g., for scalable applications, for decidable automated reasoning tasks).

F-13 Have a query language for the ontology.

F-14 Declared with Closed World Assumption.

F-15 Declared with Open World Assumption.

F-16 Use Unique Name Assumption.

F-17 Do not use Unique Name Assumption.

F-18 Ability to modify the language with the language features.

F-19 Ability to plug in language feature extensions; e.g., ‘loading’ a module for a temporal extension.


Usability by computer

UC-1 Be an (identifiable) object on the Web.

UC-2 Be usable on the Web.

UC-3 Using URIs and URI references that should be usable as names in the language.

UC-4 Using URIs to give names to expressions and sets of expressions, in order to facilitate Web operations such as retrieval, importation and cross reference.

UC-5 Have a serialisation in [XML/JSON/…] syntax.

UC-6 Have symbol support for the syntax in LaTeX/…

UC-7 Such that the same entailments are supported, everywhere on the network of ontologies.

UC-8 Able to be used by tools that can do subsumption reasoning/taxonomic classification.

UC-9 Able to be used by tools that can detect inconsistency.

UC-10 Possible to read and write in the document with simple tools, such as a text editor.

UC-11 Unabiguous and simple grammar to ensure parsing documents as simple as possible.


Usability & modelling by humans

HU-1 Easy to use

HU-2 Have at least one compact, human-readable syntax defined which can be used to express the entire language

HU-3 Have at least one compact, human-readable syntax defined so that it can be easily typed up in emails

HU-4 Such that no agent should be able to limit the ability of another agent to refer to any entity or to make assertions about any entity

HU-5 Such that a modeller is free to invent new names and use them in published content.

HU-6 Have clearly definined syntactic sugar, such as a controlled natural language for authoring or rendering the ontology or an exhaustive diagramamtic notation


Interaction with outside

I-1 Shareable (e.g., on paper, on the computer, concurrent access)

I-2 Interoperable (with what?)

I-3 Compatible with existing standards (e.g., RDF, OWL, XML, URIs, Unicode)

I-4 Support an open networks of ontologies

I-5 Possible to import ontologies (theories, files)

I-6 Option ot declare inter-ontology assertions


Ontological decisions

O-1 3-Dimensionalist commitment, where entities are in space but one doesn’t care about time

O-2 3-Dimensionalist with a temporal extension

O-3 4-Dimensionalist commitment, where entities are in spacetime

O-4 Standard view of relations and relationships (there is an order in which the entities participare)

O-5 Positionalist relations and relationships (there’s no order, but entities play a role in the relation/relationship)

O-6 Have additional primitives, such as for subsumption, parthood, collective, stuff, sortal, anti-rigid entities, … [select or add as appropriate]

O-7 Statements are either true or false

O-8 Statements may vague or uncertain; e.g., fuzzy, rough, probabilistic [select as appropriate]

O-9 There should be a clear separation between natural language and ontology

O-10 Ontology and natural language are intertwined


That’s all, for now.

What can you do when you have to stay at home?

Most people may not be used to having to stay at home. Due to a soccer (football) injury, I had to stay put for a long time, yet, I hardly ever got bored (lonely, at times, yes, but doing things makes one forget about that, be content with one’s own company, and get lots of new knowledge experiences along the way). As a silver lining of that—and since I’m missing out on some social activities now as well—I’m compiling a (non-exhaustive) ‘what to do?’ list, which may give you some idea(s) to make good use of the time spent at home, besides working for home if you can or have to. They’re structured in three main categories: enriching the mind, being creative, and exercising the body, and there’s an ‘other’ category at the end.


Enrich the mind


Leisure reading

If you haven’t signed up for the library, or aren’t allowed to go there anymore, here are a few sources that may distract you from the flood of COVID-19 news bites:

  • Old novels for free: The Gutenberg project, where people have scanned and typed up old books.
  • Newer novels for free: here’s an index of free books, or search for ‘public domain books’ in your favourite search engine.



  • A new language to read, speak, and write. Currently, the most popular site for that is probably Duolingo. If you’re short on a dictionary: Wordreference is good for, at least, Spanish, Italian, and English, Leo for German<->English, and isiZulu.net for isiZulu<->English, to name but a few.
  • A programming language. There are very many free lessons, textbooks, and video lectures for young and old. If you have never done this before, try Python.
  • Dance. See ‘exercises’ below.
  • Some academic topic. There are several websites with legally free textbooks, such as the Open Textbook Archive, and there is a drive toward open educational resources at several universities, including UCT’s OpenUCT (which also has our departmental course notes on computer ethics), and there are many MOOCs.
  • Science experiments at home. Yes, some of those can be done at home, and they’re fun to do. A few suggestions: here (for kids, with household stuff), and here, or here, among many sites.


Be creative



  • Keeping a diary may sound boring, but we live in interesting times. What you’re experiencing now may easily be blurred by whatever comes next. Write it down, so you can look back and reflect on the pandemic later.
  • Write stories (though maybe don’t go down the road of apocalypses). You think you’re not creative enough for that? Then try to re-tell GoT to someone who hasn’t seen the series, or write a modern-day version of, say, red riding hood or Romeo & Juliet.
  • Write about something else. For instance, writing this blog post took me as much time as I would otherwise have spent on two dance classes, this post took me three evenings + another 2-3 hours to write, and this series of posts eventually evolved into a textbook. Or you can add a few pages to Wikipedia.



These activities tend to call for lots of materials, but those shops are possibly closed already. The following list is an intersection of supermarket-materials and artsy creations.

  • Durable ‘bread’ figures with salt dough, for if you have no clay. Regular dough for bread perishes, but add lots of salt, and after baking it, it will remain good for years. The solid dough allows for many creations.
  • Food art with fruit and vegetables (and then eat it, of course); there are pictures for ideas, as well as YouTube videos.
  • Paper-folding and cutting to make decorations, like paper doll chains, origami, kirigami.
  • Painting with food paints or make your own paint. For instance, when cooking beetroot, the water turns very dark red-ish—don’t throw that away. iirc onion for yellow and spinach for green. This can be used for, among others, painting eggs and water-colour painting on paper. Or take a tea sieve and a toothbrush, cut out a desired figurine, dip the toothbrush in the colour-water and scrape it against the sieve to create small irregular drops and splashes.

  • Life-size toilet roll elephant figures… or even toilet roll art (optionally with paper) 😉
  • Knitting, sewing and all that. For instance, take some clothes that don’t fit anymore and rework it into something new (trousers into shorts, t-shirt as a top, insert colourful bands on the sides).
  • Colourful thread art, which requires only a hammer, nails, and >=1 colours of sewing threads.


Exercise that body

one of the many COVID-19 memes (source: passed by on FB)–Let’s try not to gain too much weight.

Barbie memes aside, it is very well possible to exercise at home, even if you have only about 1-2 square meters available. If you don’t: you get double the exercise by moving the furniture out of the way 🙂

  • Yoga and pilates. There are several websites with posters and sheets demonstrating moves.
  • Gym-free exercises, like running on the spot, making a ‘steps’ from two piles of books and a plank and doing those steps or take the kitchen mini-ladder or go up and down the stairs 20 times, push-ups, squats, crunches, etc. There are several websites with examples of such exercises. If you need weights but don’t have them: fill two 500ml bottles with water or sand. Even the NHS has a page for it, and there are many other sites with ideas.
  • Dance. True, for some dance styles, one needs a lot of space. Then again, think [back at/about] the clubs you frequent[ed]: they are crowded and there isn’t a lot of space, but you still manage(d) to dance and get tired. So, this is doable even with a small space available. For instance, the Kizomba World Project: while you’d be late for that now to submit a flashmob video, you still can practice it at home, using their instruction videos and dance together once all this is over. There are also websites with dance lessons (for-payment) and tons of free instruction videos on YouTube (e.g., for Salsa and Bachata—no partner? Search for ‘salsa shines’ or ‘bachata shines’ or footwork that can be done on your own, or try Bollywood or a belly dance workout [disclaimer: I did not watch these videos]).
  • Zumba in the living room?



Ontologically an awful category, but well, they still are good for keeping you occupied:


If you have more low-cost ideas that require little resources: please put them in the comments section.

p.s.: I did a good number of the activities listed above, but not all—yet.

Digital Assistants and AMAs with configurable ethical theories

About a year ago, there was a bit of furore in the newspapers on digital assistants, like Amazon Echo’s Alexa, Apple’s Siri, or Microsoft’s Cortana, in a smart home to possibly snitch on you if you’re the marijuana-smoking family member [1,2]. This may be relevant if you live in a conservative state or country, where it is still illegal to do so. Behind it is a multi-agent system that would do some argumentation among the stakeholders (the kids, the parents, and the police). That example sure did get the students’ attention in the computer ethics class I taught last year. It did so too with an undergraduate student—double majoring in compsci and philosophy—who opted to do the independent research module. Instead of the multiple actor scenario, however, we considered it may be useful to equip such a digital assistant, or an artificial moral agent (AMA) more broadly, with multiple moral theories, so that a user would be able to select their preferred theory and let the AMA make the appropriate decision for her on whichever dilemma comes up. This seems preferable over an at-most-one-theory AMA.

For instance, there’s the “Mia the alcoholic” moral dilemma [3]: Mia is disabled and has a new model of the carebot that can fetch her alcoholic drinks in the comfort of her home. At some point, she’s getting drunk but still orders the carebot to bring her one more tasty cocktail. Should the carebot comply? The answer depends on one’s ethical viewpoint. If you had answered with ‘yes’, you probably would not want to buy a carebot that would refuse to serve you, and likewise vv. But how to make the AMA culturally and ethically more flexible to be able to adjust to the user’s moral preferences?

The first step in that direction has now been made by that (undergrad) research student, George Rautenbach, which I supervised. The first component is a three-layered approach, with at the top layer a ‘general ethical theory’ model (called Genet) that is expressive enough to be able to model a specific ethical theory, such as utilitarianism, ethical egoism, or Divine Command Theory. This was done for those three and Kantianism, so as to have a few differences in consequence-based or not, the possible ‘patients’ of the action, sort of principles, possible thresholds and such. These reside in the middle layer. Then there’s Mia’s egoism, the parent’s Kantian viewpoint about the marijuana, a train company’s utilitarianism to sort out the trolley problem, and so on at the bottom layer, which are instantiations of the respective specific ethical theories in the middle layer.

The Genet model was evaluated by demonstrating that those four theories can be modelled with Genet and the individual theories were evaluated with a few use cases to show that the attributes stored are relevant and sufficient for those reasoning scenarios for the individuals. For instance, eventually, Mia’s egoism wouldn’t get her another drink fetched by the carebot, but as a Kantian, she would have been served.

The details are described in the technical report “Toward Equipping Artificial Moral Agents with multiple ethical theories” [4] and the models are also available for download as XML files and an OWL file. To get all this to work in a device, there’s still the actual reasoning component to implement (a few architectures exist for that) and for a user to figure out which theory they actually subscribe to so as to have the device configured accordingly. And of course, there is a range of ethical issues with digital assistants and AMAs, but that’s a topic perhaps better suited for the SIPP (FKA computer ethics) module in our compsci programme [5] and other departments.


p.s.: a genet is also an agile cat-like animal mostly living in Africa, just in case you were wondering about the abbreviation of the model.



[1] Swain, F. AIs could debate whether a smart assistant should snitch on you. New Scientist, 22 February 2019. Online: https://www.newscientist.com/article/2194613-ais-could-debatewhether-a-smart-assistant-should-snitch-on-you/ (last accessed: 5 March 2020).

[2] Liao, B., Slavkovik, M., van der Torre, L. Building Jiminy Cricket: An Architecture for Moral Agreements Among Stakeholders. ACM Conference on Artificial Intelligence, Ethics, and Society 2019, Hawaii, USA. Preprint: arXiv:1812.04741v2, 7 March 2019.

[3] Millar, J. An ethics evaluation tool for automating ethical decision-making in robotsand self-driving cars. Applied Artificial Intelligence, 30(8):787–809, 2016.

[4] Rautenbach, G., Keet, C.M. Toward equipping Artificial Moral Agents with multiple ethical theories. University of Cape Town. arxiv:2003.00935, 2 March 2020.

[5] Computer Science Department. Social Issues and Professional Practice in IT & Computing. Lecture Notes. 6 December 2019.

Dancing algorithms and algorithms for dance apps

Browsing through teaching material a few years ago, I had stumbled upon dancing algorithms, which illustrate common algorithms in computing using dance [1] and couldn’t resist writing about since I used to dance folk dances. More of them have been developed in the meantime. The list further below has them sorted by algorithm and by dance style, with links to the videos on YouTube. Related ideas have also been used in mathematics teaching, such as for teaching multiplication tables with Hip Hop singing and dancing in a class in Cape Town, dancing equations, mathsdance [2], and, stretching the scope a bit, rapping fractions elsewhere.

That brought me to the notion of algorithms for dancing, which takes a systematic and mathematical or computational approach to dance. For instance, the maths in salsa [2] and an ontology to describe some of dance [3], and a few more, which go beyond the hard-to-read Labanotation that is geared toward ballet but not pair dancing, let alone a four-couple dance [4] or, say, a rueda (multiple pairs in a circle, passing on partners). Since there was little for Salsa dance, I had proposed a computer science & dance project last year, and three computer science honours students were interested to develop their Salsational Dance App. The outcome of their hard work is that now there’s a demonstrated-to-be-usable API for the data structure to describe moves (designed for beats that counts in multiples of four), a grammar for the moves to construct valid sequences, and some visualization of the moves, therewith improving on the static information from Salsa is good that counted as baseline. The data structure is extensible to other dance styles beyond Salsa that have multiples of four, such as Bachata (without the syncopations, true).

In my opinion, the grammar is the coolest component, since it is both from a scientific and from an engineering perspective the most novel aspect and it was technically the most challenging task of the project. The grammar’s expressiveness remained within a context-free grammar, which is computationally still doable. This may be because of the moves covered—the usual moves one learns during a Salsa beginners course—or maybe in any case. The grammar has been tested to cover a series of test cases in the system, which all worked well (whether all theoretically physically feasible sequences feel comfortable to dance is a separate matter). The parsing is done by the JavaCC parser, which carries out a formal verification to check if the sequence of moves is valid, and it even does that on-the-fly. That is, when a user selects a move during the planning of a sequence of moves, it instantly re-computes which one(s) of the moves in the system can follow the last one selected, as can be seen in the following screenshot.

source: http://projects.cs.uct.ac.za/honsproj/cgi-bin/view/2019/baijnath_chetty_marajh.zip/DEDANCE_website/images/parser/ui4.PNG

Screenshot of planning a sequence of moves.

The grammar thus even has a neat ‘wrapper’ in the form of an end-user usable tool, which was evaluated by several members of Evolution Dance Company in Cape Town. Special thanks go to its owner, Mr. Angus Prince, who served also as external expert on the project. Some more screenshots, the code, and the project papers by the three students—Alka Baijnath, Jordy Chetty, and Micara Marajh—are available from the CS honours project archive.

The project also showed that much more can be done, not just porting it to other dance styles, but also still for salsa. This concerns not only the grammar, but also how to encode moves in a user-friendly way and how to link it up to the graphics so that the ‘puppets’ will dance the sequence of moves selected, as well as meeting other requirements, such as a mobile app as ‘cheat sheet’ to quickly check a move during a social dance evening and choreography planning. Based on my own experiences goofing around writing down moves, the latter (choreography) seems to be less hard to realise [documenting, at least] than the ‘any move’ scenario. Either way, the honours projects topics are being finalised around now. Hopefully, there will be another 2-3 students interested in computing and dance this year, so we can build up a larger body of software tools and techniques to support dance.


Dancing algorithms by type


– Quicksort as a Hungarian folk dance

– Bubble sort as a Hungarian dance, Bollywood style dance, and with synthetic music

– Shell sort as a Hungarian dance.

– Select sort as a gypsy/Roma dance

– Merge sort in ‘Transylvananian-Saxon’ dance style

– Insert sort in Romanian folk dance style

– Heap sort also as in Hungarian folk dance style

There are more sorting algorithms than these, though, so there’s plenty of choice to pick your own. A different artistic look at the algorithms is this one, with 15 sorts that almost sound like music (but not quite).


– Linear search in Flamenco style

– Binary search also in Flamenco style

Backtracking as a ballet performance


Dancing algorithms by dance style

European folk:

– Hungarian dance for the quicksort, bubble sort, shell sort, and heap sort;

– Roma (gypsy) dance for the select sort;

– Transylvananian-saxon dance for the merge sort;

– Romanian dance for an insert sort.

Latin American folk: Flamenco dancing a binary search and a linear search.

Bollywood dance where students are dancing a bubble sort.

Classical (Ballet) for a backtracking algorithm.

Modern (synthetic music) where a class attempts to dance a bubble sort.


That’s all for now. If you make a choreography for an algorithm, get people to dance it, record it, and want to have the video listed here, feel free to contact me and I’ll add it.



[1] Zoltan Katai and Laszlo Toth. Technologically and artistically enhanced multi-sensory computer programming education. Teaching and Teacher Education, 2010, 26(2): 244-251.

[2] Stephen Ornes. Math Dance. Proceedings of the National Academy of Sciences of the United States of America 2013. 110(26): 10465-10465.

[3] Christine von Renesse and Volker Ecke. Mathematics and Salsa Dancing. Journal of Mathematics and the Arts, 2011, 5(1): 17-28.

[4] Katerina El Raheb and Yannis Yoannidis. A Labanotation based ontology for representing dance movement. In: Gesture and Sign language in Human-Computer Interaction and Embodied Communication (GW’11). Springer, LNAI vol. 7206, 106-117. 2012.

[5] Michael R. Bush and Gary M. Roodman. Different partners, different places: mathematics applied to the construction of four-couple folk dances. Journal of Mathematics and the Arts, 2013, 7(1): 17-28.

Version 1.5 of the textbook on ontology engineering is available now

“Extended and Improved!” could some advertisement say of the new v1.5 of “An introduction to ontology engineering” that I made available online today. It’s not that v1 was no good, but there were a few loose ends and I received funding from the digital open textbooks for development (DOT4D) project to turn the ‘mere pdf’ into a proper “textbook package” whilst meeting the DOT4D interests of, principally, student involvement, multilingualism, local relevance, and universal access. The remainder of this post briefly describes the changes to the pdf and the rest of it.

The main changes to the book itself

With respect to contents in the pdf itself, the main differences with version 1 are:

  • a new chapter on modularisation, which is based on a part of the PhD thesis of my former student and meanwhile Senior Researcher at the CSIR, Dr. Zubeida Khan (Dawood).
  • more content in Chapter 9 on natural language & ontologies.
  • A new OntoClean tutorial (as Appendix A of the book, introduced last year), co-authored with Zola Mahlaza, which is integrated with Protégé and the OWL reasoner, rather than only paper-based.
  • There are about 10% more exercises and sample answers.
  • A bunch of typos and grammatical infelicities have been corrected and some figures were updated just in case (as the copyright stuff of those were unclear).

Other tweaks have been made in other sections to reflect these changes, and some of the wording here and there was reformulated to try to avoid some unintended parsing of it.

The “package” beyond a ‘mere’ pdf file

Since most textbooks, in computer science at least, are not just hardcopy textbooks or pdf-file-only entities, the OE textbook is not just that either. While some material for the exercises in v1 were already available on the textbook website, this has been extended substantially over the past year. The main additions are:

There are further extras that are not easily included in a book, yet possibly useful to have access to, such as list of ontology verbalisers with references that Zola Mahlaza compiled and an errata page for v1.

Overall, I hope it will be of some (more) use than v1. If you have any questions or comments, please don’t hesitate to contact me. (Now with v1.5 there are fewer loose ends than with v1, yet there’s always more that can be done [in theory at least].)

p.s.: yes, there’s a new front cover, so as to make it easier to distinguish. It’s also a photo I took in South Africa, but this time standing on top of Table Mountain.

A set of competency questions and SPARQL-OWL queries, with analysis

As a good beginning of the new year, our Data in Brief article Dataset of Ontology Competency Questions to SPARQL-OWL Queries Translations [1] was accepted and came online this week, which accompanies our Journal of Web Semantics article Analysis of Ontology Competency Questions and their Formalisations in SPARQL-OWL [2] that was published in December 2019—with ‘our’ referring to my collaborators in Poznan, Dawid Wisniewski, Jedrzej Potoniec, and Agnieszka Lawrynowicz, and myself. The former article provides extensive detail of a dataset we created that was subsequently used for analysis that provided new insights that is described in the latter article.

The dataset

In short, we tried to find existing good TBox-level competency questions (CQs) for available ontologies and manually formulate (i.e., formalise the CQ in) SPARQL-OWL queries for each of the CQs over said ontologies. We ended up with 234 CQs for 5 ontologies, with 131 accompanying SPARQL-OWL queries. This constitutes the first gold standard pipeline for verifying an ontology’s requirements and it presents the systematic analyses of what is translatable from the CQs and what not, and when not, why not. This may assist in further research and tool development on CQs, automating CQ verification, assessing the main query language constructs and therewith language optimisation, among others. The dataset itself is indeed independently reusable for other experiments, and has been reused already [3].

The key insights

The first analysis we conducted on it, reported in [2], revealed several insights. First, a larger set of CQs (cf. earlier work) indeed did increase the number of CQ patterns. There are recurring patterns in the shape of the CQs, when analysed linguistically; a popular one is What EC1 PC1 EC2? obtained from CQs like “What data are collected for the trail making test?” (a Dem@care CQ). Observe that, yes, indeed, we did decouple the language layer from the formalisation layer rather than mixing the two; hence, the ECs (resp. PCs) are not necessarily classes (resp. object properties) in an ontology. The SPARQL-OWL queries were also analysed at to what is really used of that query language, and used most often (see table 7 of the paper).

Second, these characteristics are not the same across CQ sets by different authors of different ontologies in different subject domains, although some patterns do recur and are thus somehow ‘popular’ regardless. Third, the relation CQ (pattern or not) : SPARQL-OWL query (or its signature) is m:n, not 1:1. That is, a CQ may have multiple SPARQL-OWL queries or signatures, and a SPARQL-OWL query or signature may be put into a natural language question (CQ) in different ways. The latter sucks for any aim of automated verification, but unfortunately, there doesn’t seem to be an easy way around that: 1) there are different ways to say the same thing, and 2) the same knowledge can be represented in different ways and therewith leading to a different shape of the query. Some possible ways to mitigate either is being looked into, like specifying a CQ controlled natural language [3] and modelling styles [4] so that one might be able to generate an algorithm to find and link or swap or choose one of them [5,6], but all that is still in the preliminary stages.

Meanwhile, there is that freely available dataset and the in-depth rigorous analysis, so that, hopefully, a solution may be found sooner rather than later.



[1] Potoniec, J., Wisniewski, D., Lawrynowicz, A., Keet, C.M. Dataset of Ontology Competency Questions to SPARQL-OWL Queries Translations. Data in Brief, 2020, in press.

[2] Wisniewski, D., Potoniec, J., Lawrynowicz, A., Keet, C.M. Analysis of Ontology Competency Questions and their Formalisations in SPARQL-OWL. Journal of Web Semantics, 2019, 59:100534.

[3] Keet, C.M., Mahlaza, Z., Antia, M.-J. CLaRO: a Controlled Language for Authoring Competency Questions. 13th Metadata and Semantics Research Conference (MTSR’19). 28-31 Oct 2019, Rome, Italy. Springer CCIS vol 1057, 3-15.

[4] Fillottrani, P.R., Keet, C.M. Dimensions Affecting Representation Styles in Ontologies. 1st Iberoamerican conference on Knowledge Graphs and Semantic Web (KGSWC’19). Springer CCIS vol 1029, 186-200. 24-28 June 2019, Villa Clara, Cuba. Paper at Springer

[5] Fillottrani, P.R., Keet, C.M. Patterns for Heterogeneous TBox Mappings to Bridge Different Modelling Decisions. 14th Extended Semantic Web Conference (ESWC’17). Springer LNCS vol 10249, 371-386. Portoroz, Slovenia, May 28 – June 2, 2017.

[6] Khan, Z.C., Keet, C.M. Automatically changing modules in modular ontology development and management. Annual Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT’17). ACM Proceedings, 19:1-19:10. Thaba Nchu, South Africa. September 26-28, 2017.

Computer ethics (SIPP) notes relevant to South Africa

Social issues and Professional Practice in IT & Computing (formerly known as ‘computer ethics’ in our curriculum) increased in prominence in curriculum guidelines in recent years. Also, there is an increase in popular and scientific literature on computer ethics especially since Big Data, the popularisation of Artificial Intelligence, and now the 4th Industrial Revolution. Most of the articles and books are focussed on ethical and social issues where SIPP is taught mostly, being in ‘the West’.

It is taught elsewhere as well. For instance, since the early 2000s, the Computer Science Department at the University of Cape Town has taught it as part of a Masters in IT conversion course and as a block in a first-year computer science course. While initial material and lecture notes were reused from one of those universities in ‘the West’, over time, attempts have been made to localise it to some extent at least. For instance, South Africa has its own version of EU’s GDPR (the POPI Act), there is a South African IT organisation (IITPSA) with its code of conduct, and is the textbook case that illustrates the concept of leapfrogging with its wireless network (and perhaps also with the digital divide). In addition, some ‘aspects’ look different from a country that is classified as an emerging economy than for a high-income country; e.g., as patent protection and Silicon Valley’s data collection vs. potentially stifling emerging local tech companies and digital colonialism, respectively.

Updating lecture notes takes time, and so it is typically a multi-author effort carried out every few years, as it is in this case. Differently from the previous main update, is that, in line with teaching and with the times, the lecture notes are now publicly available for free on UCT’s “Open Educational Resources” site. It is with some hesitation, as it clearly does not have the quality of a textbook and we know of certain limitations that I would have liked to be better. Yet, I hope that it may be of some use already nonetheless, be it for people in the region or from ‘outside’ looking in.

I have contributed some sections as well, partially because I think it’s an interesting theme and partially because I have to teach it. I would have liked to add more, but time was running out (i.e., it’s a balancing act with other commitments, like research, teaching, and admin). With more time, the privacy chapter would have been updated better (e.g., also touching upon privacy in the context of the common practice of mobile phone sharing), emerging concepts would have been better integrated (e.g., digital colonialism, surveillance capitalism), some of the separate exercises could have been integrated, and so on and so forth. Alas, maybe a next time. (To any of my students reading this: some of these aspects are already integrated in the slides that are used in the CSC1016S lectures, which are running ahead in content compared to the written notes, and that is examinable content as well.)

More and better TDD for ontology authoring

Test-driven development (TDD) for ontology authoring [1] has received attention previously, including its accompanying tool TDDOnto [2] that was subsequently improved upon into the (also open source) TDDonto2 tool [3]. The TDDonto2 demo paper [3] did not contain the technical details about the new-and-improved algorithms and specification for TDD testing that we claimed it had. They are published just now in the International Journal on Artificial Intelligence Tools, as the article entitled More Effective Ontology Authoring with Test-Driven Development and the TDDonto2 tool [4]. The better algorithms cover more OWL language features than the original v1 of the theory and tool and it includes a specification for TDD testing such that there is not just pass/fail/absent as test result, but specific outcomes of the TDD test that are more informative, like that the ontology will become incoherent if that axiom were to be added. Given that model, the general flow for a simple standard case of a single TDD test (though more axioms can be tested at once) is as follows:

simplified view of the extended TDD process (source: adapted from [4])

The elements in the figure that are coloured light grey are the steps covered by the specification for TDD testing, algorithms, and TDDonto2 tool that is introduced in the paper.

The paper’s title clearly also hints to another contribution: using TDDonto2 for ontology authoring is significantly more effective. It was compared against the commonly used (and test-last) Protégé interface, which showed that the participants completed a larger part of the task in less time and with fewer mistakes. It also requires fewer interactions (clicking and typing) in the interface, which we reported on in an earlier (longer) tech report [5].

screenshot of the outcome of running the four tests on the sample ontology, in TDDonto2

As usual with research, more can be done. This is especially with respect to the white boxes in the figure above, i.e., the other aspects that would contribute toward a complete TDD methodology for ontology development. One step that we have been working on, is the idea of turning competency questions into axioms for TDD, which now is doable from CQ to SPARQL-OWL query [6] (more about that later), a CNL that may contribute to the authoring [7], and trying to figure out the modelling styles more precisely [8], since they hamper automation of these first steps in the process to get those axioms into the TDD plugin in a user-friendly way.



[1] Keet, C.M., Lawrynowicz, A. Test-Driven Development of Ontologies. 13th Extended Semantic Web Conference (ESWC’16). Springer LNCS vol. 9678, 642-657. 29 May – 2 June, 2016, Crete, Greece.

[2] Lawrynowicz, A., Keet, C.M. The TDDonto Tool for Test-Driven Development of DL Knowledge bases. 29th International Workshop on Description Logics (DL’16). April 22-25, Cape Town, South Africa. CEUR WS vol. 1577.

[3] Davies, K. Keet, C.M., Lawrynowicz, A. TDDonto2: A Test-Driven Development Plugin for arbitrary TBox and ABox axioms. The Semantic Web: ESWC 2017 Satellite Events, Blomqvist, E et al. (eds.). Springer LNCS vol 10577, 120-125. Portoroz, Slovenia, May 28 – June 2, 2017.

[4] Davies, K., Keet, C.M., Lawrynowicz, A. More Effective Ontology Authoring with Test-Driven Development and the TDDonto2 tool. International Journal on Artificial Intelligence Tools, 2019, 28(7): 1950023.

[5] Keet, C.M., Davies, K., Lawrynowicz, A. More Effective Ontology Authoring with Test-Driven Development. Technical Report 1812.06015. December 2018

[6] Wisniewski, D., Potoniec, J., Lawrynowicz, A., Keet, C.M. Analysis of Ontology Competency Questions and their Formalisations in SPARQL-OWL. Journal of Web Semantics. (in print)

[7] Keet, C.M., Mahlaza, Z., Antia, M.-J. CLaRO: a Controlled Language for Authoring Competency Questions. 13th Metadata and Semantics Research Conference (MTSR’19). 28-31 Oct 2019, Rome, Italy. Springer CCIS. (in print)

[8] Fillottrani, P.R., Keet, C.M.. Dimensions Affecting Representation Styles in Ontologies. 1st Iberoamerican conference on Knowledge Graphs and Semantic Web (KGSWC’19). Springer CCIS vol. 1029, 186-200. 23-30 June 2019, Villa Clara, Cuba.

Localising Protégé with Manchester syntax into your language of choice

Some people like a quasi natural language interface in ontology development tools, which is why Manchester Syntax was proposed [1]. A downside is that it locks the ontology developer into English, so that weird chimaeras are generated in the interface if the author prefers another language for the ontology, such as, e.g., the “jirafa come only (oja or ramita)” mentioned in an earlier post and that was deemed unpleasant in an experiment a while ago [2]. Those who prefer the quasi natural language components will have to resort to localising Manchester syntax and the tool’s interface.

This is precisely what two of my former students—Adam Kaliski and Casey O’Donnell—did during their mini-project in the ontology engineering course of 2017. A localisation in Afrikaans, as the case turned out to be. To make this publicly available, Michael Harrison brushed up the code a bit and tested it worked also in the new version of Protégé. It turned out it wasn’t that easy to localise it to another language the way it was done, so one of my PhD students, Toky Raboanary, redesigned the whole thing. This was then tested with Spanish, and found to be working. The remainder of the post describes informally some main aspects of it. If you don’t want to read all that but want to play with it right away: here are the jar files, open source code, and localisation instructions for if you want to create, say, a French or Dutch variant.

Some sensible constraints, some slightly contrived ones (and some bad ones), for the purpose of showing the localisation of the interface for the various keywords. The view in English is included in the screenshot to facilitate comparison.

Some sensible constraints, some slightly contrived ones (and some bad ones), for the purpose of showing the localisation of the interface for the various keywords. The view in English is included in the screenshot to facilitate comparison.

The localisation functions as a plugin for Protégé as a ‘view’ component. It can be selected under “Windows – Views – Class views” and then Beskrywing for the Afrikaans and Descripción for Spanish, and dragged into the desired position; this is likewise for object properties.

Instead of burying the translations in the code, they are specified in a separate XML file, whose content is fetched during the rendering. Adding a new ‘simple’ (more about that later) language merely amounts to adding a new XML file with the translations of the Protégé labels and of the relevant Manchester syntax. Here are the ‘simple’ translations—i.e., where both are fixed strings—for Afrikaans for the relevant tool interface components:


Class Description



(Label in Afrikaans)

Equivalent To Dieselfde as
SubClass Of Subklas van
General Class axioms Algemene Klasaksiomas
SubClass Of (Anonymous Ancestor) Subklas van (Naamlose Voorvader)
Disjoint With Disjunkte van
Disjoint Union Of Disjunkte Unie van


The second set of translations is for the Manchester syntax, so as to render that also in the target language. The relevant mappings for Afrikaans class description keywords are listed in the table below, which contain the final choices made by the students who developed the original plugin. For instance, min and max could have been rendered as minimum and maksimum, but the ten minste and by die meeste were deemed more readable despite being multi-word strings. Another interesting bit in the translation is negation, where there has to be a second ‘no’ since Afrikaans has double negation in this construction, so that it renders it as nie <expression> nie. That final rendering is not grammatically perfect, but (hopefully) sufficiently clear:

An attempt at double negation with a fixed string

An attempt at double negation with a fixed string

Manchester OWL Keyword Afrikaans Manchester OWL

Keyword or phrase

some sommige
only slegs
min ten minste
max by die meeste
exactly precies
and en
or of
not nie <expression> nie
SubClassOf SubklasVan
EquivalentTo DieselfdeAs
DisjointWith DisjunkteVan


The people involved in the translations for the object properties view for Afrikaans are Toky, my colleague Tommie Meyer (also at UCT), and myself; snyding for ‘intersection’ sounds somewhat odd to me, but the real tough one to translate was ‘SuperProperty’. Of the four options that were considered—SuperEienskap, SuperVerwantskap, SuperRelasie, and SuperVerband SuperVerwantskap was chosen with Tommie having had the final vote, which is also a semantic translation, not a literal translation.

Screenshot of the object properties description, with comparison to the English

The Spanish version also has multi-word strings, but at least does not do double negation. On the other hand, it has accents. To generate the Spanish version, myself, my collaborator Pablo Fillottrani from the Universidad Nacional del Sur, Argentina, and Toky had a go at it in translating the terms. This was then implemented with the XML file. In case you do not want to dig into the XML file and not install the plugin either, but have a quick look at the translations, they are as follows for the class description view:


Class Description



(in Spanish)

Equivalent To Equivalente a
SubClass Of Subclase de
General Class axioms Axiomas generales de clase
SubClass Of (Anonymous Ancestor) Subclase de (Ancestro Anónimo)
Disjoint With Disjunto con
Disjoint Union Of Unión Disjunta de
Instances Instancias


Manchester OWL Keyword Spanish Manchester OWL Keyword
some al menos uno
only sólo
min al mínimo
max al máximo
and y
or o
not no
exactly exactamente
SubClassOf SubclaseDe
EquivalentTo EquivalenteA
DisjointWith DisjuntoCon

And here’s a rendering of a real ontology, for geo linked data in Spanish, rather than African wildlife yet again:

screenshot of the plugin behaviour with someone else’s ontology in Spanish

One final comment remains, which has to do with the ‘simple’ mentioned above. The approach of localisation presented here works only with fixed strings, i.e., the strings do not have to change depending on the context where it is uses. It won’t work with, say, isiZulu—a highly agglutinating and inflectional language—because isiZulu doesn’t have fixed strings for the Manchester syntax keywords nor for some other labels. For instance, ‘at least one’ has seven variants for nouns in the singular, depending on the noun class of the noun of the OWL class it quantifies over; e.g., elilodwa for ‘at least one’ apple, and esisodwa for ‘at least one’ twig. Also, the conjugation of the verb for the object property depends on the noun class of the noun of the OWL class, but in this case for the one that plays the subject; e.g., it’s “eats” in English for both humans and elephants eating, say, fruit, so one string for the name of the object property, but that’s udla and idla, respectively, in isiZulu. This requires annotations of the classes with ontolex-lemon or a similar approach and a set of rules (which we have, btw) to determine what to do in which case, which requires on-the-fly modifications to Manchester syntax keywords and elements’ names or labels. And then there’s still phonological conditioning to account for. It surely can be done, but it is not as doable as with the ‘simple’ languages that have at least a disjunctive orthography and much less genders or noun classes for the nouns.

In closing, while there’s indeed more to translate in the Protégé interface in order to fully localise it, hopefully this already helps as-is either for reading at least a whole axiom in one’s language or as stepping stone to extend it further for the other terms in the Manchester syntax and the interface. Feel free to extend our open source code.



[1] Matthew Horridge, Nicholas Drummond, John Goodwin, Alan Rector, Robert Stevens and Hai Wang (2006). The Manchester OWL syntax. OWL: Experiences and Directions (OWLED’06), Athens, Georgia, USA, 10-11 Nov 2016, CEUR-WS vol 216.

[2] Keet, C.M. The use of foundational ontologies in ontology development: an empirical assessment. 8th Extended Semantic Web Conference (ESWC’11), G. Antoniou et al (Eds.), Heraklion, Crete, Greece, 29 May-2 June, 2011. Springer LNCS 6643, 321-335.

[3] Keet, C.M., Khumalo, L. Toward a knowledge-to-text controlled natural language of isiZulu. Language Resources and Evaluation, 2017, 51:131-157. accepted version

Design rationale and overview of the African Wildlife tutorial ontologies

(update 30-7-2020: more details are described in the journal article published in the Journal of Biomedical Semantics)

There are several tutorial ontologies, which typically focus on illustrating one or two aspects of ontology development, notably language features and automated reasoning. This may suffice for one’s aims, but for an ontology engineering course, one would need to be able to illustrate a myriad of development factors and devise exercises for a wider range of tasks of ontology development. For instance, to illustrate the use of ontology design patterns, competency questions, foundational ontologies, and science-based modelling practices, neither of which is addressed easily by the popular tutorial ontologies (notably: wine and pizza), perhaps because they predate most of the advances made in ontology engineering research. Also, I have noticed that my students replicate examples from the exercises they carry out and from inspecting popular and easy-to-find ontologies. Marking the practical assignments, I got to see sandwich and ice cream and burger ontologies with toppings and value partitions, and software and mobile phone ontologies where laptop models are instances rather than classes. Not providing good and versatile examples holistically, causes the propagation of sub-optimal ontology development at least in the exercises, which then also may affect negatively the development of an operational domain ontology that the graduates may have to develop later on.

I’ve been exploring alternatives and variants over the past 11 years in the ontology engineering courses that I have taught yearly to about 8-40 students/year. In an attempt to systematise and possibly generalise from that, I’ve identified 22 requirements that contribute to a good tutorial ontology, which concern the suitability of the subject domain (7 factors), the ease of demonstrating logics and reasoning tasks (7), and assistance with demonstrating engineering aspects (8). Its details are described in a technical report [1]. I don’t claim that it’s an exhaustive list, but that it is one that may help someone to develop their own tutorial ontology in a fun or interesting topic if they so wish—after all, not everyone is interested in pizzas, wines, African wildlife, pets, shirts, a small university, or Robert Stevens’ family.

I’ve tried out a variety of extant tutorial ontologies as well as a range of versions of the African Wildlife Ontology (AWO) over the years (early experiences), eventually settling for a set of 14 versions, all the way from the example from the Primer [2] to DOLCE- and BFO-aligned to translated in several languages, and some with possible answers to some of the exercises. A graphical rendering of the main classes and relations is shown in the following figure:

The versions of the AWO are summarised in the following table, which is also mentioned as annotation in the OWL files.


The AWO meets a majority of the 22 requirements, is mature by now, and it has been used yearly in an ontology engineering course or tutorial since 2010. Also, it is links up with my ontology engineering textbook with relevant examples and exercises. The AWO provides a wide range of options concerning examples and exercises for ontology engineering well beyond illustrating only logic features and automated reasoning. For instance, it assists in demonstrating tasks about ontology quality, such as alignment to a foundational ontology and satisfying competency questions, versioning, and multilingual ontologies. For instance, it is easier to demonstrate alignment of a class Animal to DOLCE’s (Non-Agentive) Physical Object than, say, debating what Algorithm aligns with or descend into political debates on the gender binary or what constitutes a family. One can use the height or the colours of the plants and animals to discuss how to model attributes as qualities or dependent entities cf. OWL’s data properties or an artificial ValuePartition. Declare, say, de individual lion simba as an instance of Lion, rather than the confusion regarding grape varieties. Use intuitively obvious disjointness between animals and plants, and subsequently easy catches on sensitising modellers to the far-reaching effects of declaring domain and range axioms by first asserting that animals eat animals, and then adding that carnivorous plants eat insects. In addition, it links up easily to topics for ontology integration activities, such as with biodiversity data, wildlife trade, and tourism to create, e.g., an OBDA system with freely available data (e.g., taken from here) or an ontology-enhanced website for an organisation that offers environmentally sustainable safaris. More examples of broad usage options are described in section 2.3 in the tech report.

The AWO is freely available under a CC-BY licence through the textbook’s webpage at https://people.cs.uct.ac.za/~mkeet/OEbook/ in this folder. A more comprehensive description of the requirements, design, and content is described in a technical report [1] for the time being.



[1] Keet, CM. The African Wildlife Ontology tutorial ontologies: requirements, design, and content. Technical Report 1905.09519. 23 May 2019. https://arxiv.org/abs/1905.09519.

[2] Antoniou, G., van Harmelen, F. A Semantic Web Primer. MIT Press, USA. 2003.