From ontology verbalisation to language learning exercises

I’m aware that to most people ‘playing with’ (investigating) ontologies and isiZulu does not sound particularly useful on the face of it. Yet, there’s some long-term future music, like eventually being able to generate patient discharge notes in one’s own language, which will do its bit to ameliorate the language barrier in healthcare in South Africa so that patients at least will adhere to the treatment instructions a little better, and thereby receive better quality healthcare. But short-term benefits are worth having as well. To that end, I proposed an honours project last year, which has been completed in the meantime, and one of the two interesting outcomes has made it into a publication already [1]. As you may have guessed from the title, it’s about automation for language learning exercises. The results will be presented at the 6th Workshop on Controlled Natural Language, in Maynooth, Ireland, in about two weeks’ time (27-28 August). In the remainder of this post, I highlight the main contributions described in the paper.

First, regarding the post’s title, one might wonder what ontology verbalisation has to do with language learning. Nothing, really, except that we could reuse the algorithms from the controlled natural language (CNL) for ontology verbalisation to generate (computer-assisted) language learning exercises whose answers can be computed and marked automatically. That is, the original design of the CNL for things like pluralising nouns, verb conjugation, and negation that is used for verbalising ontologies in isiZulu in theory [2] and in practice [3], was such that the sentence generator is a detachable module that could be plugged in elsewhere for another task that needs such operations.

Practically, the student who designed and developed the back-end, Nikhil Gilbert, preferred Java over Python, so he converted most parts into Java, and added a bit more, notably the ‘singulariser’, a sentence scrabble, and a sentence generator. The sentence generator is used as part of the exercises & answers generator. For instance, we know that humans and the roles they play (father, aunt, doctor, etc.) are mostly in isiZulu’s noun classes 1, 2, 1a, 2a, or 3a, that those classes do not (or rarely?) have non-human nouns, and that it generally holds for all humans and their roles that they can ‘eat’, ‘talk’ etc. This makes it relatively easy to create a noun chain and a verb chain list to mix and match nouns with verbs accordingly (hurrah! for the semantics-based noun class system). Then, with the 231 nouns and 59 verbs in the newly constructed mini-corpus, the noun chain and the verb chain, 39501 unique question sentences could be generated, using the following overall architecture of the system:

Architecture of the CNL-driven CALL system. The arrows indicate which upper layer components make use of the lower layer components. (Source: [1])

From a CNL perspective as well as the language learning perspective, the actual templates for the exercises may be of interest. For instance, when a learner is learning about pluralising nouns and their associated verb, the system uses the following two templates for the questions and answers:

Q: <prefixSG+stem> <SGSC+VerbRoot+FV>
A: <prefixPL+stem> <PLSC+VerbRoot+FV>
Q: <prefixSG+stem> <SGSC+VerbRoot+FV> <prefixSG+stem>
A: <prefixPL+stem> <PLSC+VerbRoot+FV> <prefixPL+stem>

The answers can be generated automatically with the algorithms that generate the plural noun (from ‘prefixSG’ to ‘prefixPL’) and add the plural subject concord (from ‘SGSC’ to ‘PLSC’, in agreement with ‘prefixPL’), which were developed as part of the GeNI project on ontology verbalisation. The generated answer can then be checked against what the learner has typed. For instance, a generated question could be ‘umfowethu usula inkomishi’ and the correct answer generated (to check the learner’s response against) is ‘abafowethu basula izinkomishi’. Another example is generation of the negation from the positive, or vice versa; e.g.:

Q: <PLSC+VerbRoot+FV>
A: <PLNEGSC+VerbRoot+NEGFV>
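
As a rough illustration of how such question/answer templates can be filled in and marked, here is a minimal Python sketch. It is not the code from the paper (that back-end is in Java): the function names are hypothetical, and the lookup tables are toy assumptions that cover only the example words used in this post, not the full noun class system or any phonological conditioning.

```python
# Toy lookup tables (assumption): only the entries needed for the examples
# in this post. A real system covers all noun classes and their exceptions.
NOUN_PREFIX_PLURAL = {"um": "aba", "in": "izin"}  # prefixSG -> prefixPL
SUBJECT_CONCORD_PLURAL = {"u": "ba"}              # SGSC -> PLSC
NEG_SUBJECT_CONCORD = {"ba": "aba"}               # PLSC -> PLNEGSC
FINAL_VOWEL = "a"                                 # FV
NEG_FINAL_VOWEL = "i"                             # NEGFV


def verb(concord, root, final_vowel=FINAL_VOWEL):
    """Fill the <SC+VerbRoot+FV> slot."""
    return concord + root + final_vowel


def pluralise_exercise(subject, sg_sc, verb_root, obj=None):
    """Build (question, answer) for the pluralisation templates.

    subject and obj are (prefixSG, stem) pairs; obj is optional, which
    covers both the two-slot and the three-slot template.
    """
    nouns = [subject] + ([obj] if obj else [])
    q_nouns = [prefix + stem for prefix, stem in nouns]
    a_nouns = [NOUN_PREFIX_PLURAL[prefix] + stem for prefix, stem in nouns]
    question = [q_nouns[0], verb(sg_sc, verb_root)] + q_nouns[1:]
    answer = [a_nouns[0], verb(SUBJECT_CONCORD_PLURAL[sg_sc], verb_root)] + a_nouns[1:]
    return " ".join(question), " ".join(answer)


def negation_exercise(pl_sc, verb_root):
    """Build (question, answer) for the positive-to-negative template."""
    question = verb(pl_sc, verb_root)
    answer = verb(NEG_SUBJECT_CONCORD[pl_sc], verb_root, NEG_FINAL_VOWEL)
    return question, answer


def mark(learner_response, expected):
    """Mark a typed response; ignoring case and outer spaces is an assumption."""
    return learner_response.strip().lower() == expected.lower()


question, answer = pluralise_exercise(("um", "fowethu"), "u", "sul", ("in", "komishi"))
print(question, "->", answer)
print(negation_exercise("ba", "totob"))
print(mark(" Abafowethu basula izinkomishi ", answer))
```

The point of the design is that the answer is computed from the same parts as the question, so no answer key has to be written by hand.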

For instance, the question may present batotoba and the correct answer is then abatotobi. In total, there are six different types of sentences, with two double, like the plural above, hence a total of 16 templates. It is not a lot, but it turned out to be one of the very few attempts to use a CNL in such a way: there is one paper that will also be presented at CNL’18 in the same session [4], and an earlier one [5] uses a fancy grammar system (that we don’t have yet computationally for isiZulu). This is not to be misunderstood as a claim that this is one of the first CNL/NLG-based systems for computer-assisted language learning (e.g., there’s assistance in essay writing, grammar concept question generation, and reading understanding question generation), but, curiously, there is very little on CNLs or NLG for the standard entry-level type of questions to learn the grammar. Perhaps the latter is considered ‘boring’ for English by now, given all the resources. However, thousands of students take introductory courses in isiZulu each year, and some automation can alleviate the pressure of routine activities on the lecturers. We have done some evaluations with learners, with encouraging results, and plan to do some more, so that it may eventually transition to actual use in the courses; that is: TBC…

 

References

[1] Gilbert, N., Keet, C.M. Automating question generation and marking of language learning exercises for isiZulu. 6th International Workshop on Controlled Natural Language (CNL’18). IOS Press. Co. Kildare, Ireland, 27-28 August 2018. (in print)

[2] Keet, C.M., Khumalo, L. Toward a knowledge-to-text controlled natural language of isiZulu. Language Resources and Evaluation, 2017, 51(1): 131-157.

[3] Keet, C.M., Xakaza, M., Khumalo, L. Verbalising OWL ontologies in isiZulu with Python. The Semantic Web: ESWC 2017 Satellite Events, Blomqvist, E. et al. (eds.). Springer LNCS vol. 10577, 59-64.

[4] Lange, H., Ljunglöf, P. Putting control into language learning. 6th International Workshop on Controlled Natural Language (CNL’18). IOS Press. Co. Kildare, Ireland, 27-28 August 2018. (in print)

[5] Gardent, C., Perez-Beltrachini, L. Using FB-LTAG Derivation Trees to Generate Transformation-Based Grammar Exercises. Proc. of TAG+11, Sep 2012, Paris, France. pp. 117-125.

On ‘open access’ CS conference proceedings

It perhaps sounds nice and doing-good-like, for the doe-eyed ones at least: publish computer science conference proceedings as open access so that anyone in the world can access the scientific advances for free. Yay. Free access to scientific materials is good for a multitude of reasons. There’s a downside to the way some try to push this now, though, which amounts to making people pay for what used to be, and still mostly is, free already. I take issue with that. Instead of individualising a downside of open access by heaping more costs onto the individual researchers, the free flow of knowledge should be, and remain, a collectivised effort.

 

It is, and used to be, the case that most authors put the camera-ready copy (CRC) on their respective homepages and/or institutional repositories, typically even before the conference (e.g., mine are here). Putting the CRC on one’s website or in an openly accessible institutional repository seems to happen slightly less often now, even though it is legal to do so. I don’t know why. Even if it were not entirely legal, collective disobedience is not something that the publishers can easily fight. It doesn’t help that Google indexes the publisher quicker than the academics’ webpages, so the CRCs on the authors’ pages don’t turn up immediately in the search results even when the CRCs are online, but that would be a pathetic reason for not uploading the CRC. It’s a little extra effort to look up an author’s website, but acceptable as long as the file is still online and freely available.

Besides the established hallelujahs to principles of knowledge sharing, there has recently been a drive at various computer science (CS) conferences to make sure the proceedings will be open access (OA). Like for OA journal papers in an OA or hybrid journal, someone’s going to have to pay for the ‘article processing charges’. The instances that I’ve seen close-up put those costs for all papers of the proceedings in the conference budget and thereby increase the conference registration costs. Depending on 1) how good or bad the deal is that the organisers made, 2) how many people are expected to attend, and 3) how many papers will go in the volume, it hikes up the registration costs by some 50 euro. This is new money that the publishing house did not make before, and I’m pretty sure they wouldn’t offer an OA option if it were to result in them making less profit from the obscenely lucrative science publishing business.

So, who pays? Different universities have different funding schemes, as have different funders as to what they fund. For instance, there exist funds for contributing to OA journal article publishing (also at UCT, and Springer even has a list of OA funders in several countries), but that cannot be used in this case, for the OA costs are hidden in the conference registration fee. There are also conference travel funds, but they fund part of it or cap it to a maximum, and the more the whole thing costs, the greater the shortfall that one then will have to pay out of one’s own research fund or one’s own pocket.

A colleague (at another university) who’s pushing for OA for CS conference proceedings said that his institution is paying for all the OA anyway, not him; he can easily have principles, as it doesn’t cost him anything. Some academics have their universities pay for the conference proceedings access already, as part of the subscription package; it’s typically the higher-ranking technical universities that have access. Those I spoke to didn’t like the idea that now they’d have to pay for access in this way, for they already had ‘free’ (to them) access, whereas the registration fees come from their own research funds. For me, it is my own research funds as well, i.e., those funds that I have to scramble together through project proposal applications with their low acceptance rates. If I were to go to, or have papers at, say, 5 such conferences per year (in the past several years, it was more like double that), that’s the same amount as paying a student/scientific programmer for almost a week and about a monthly salary for the lowest-paid in South Africa, or travel costs or accommodation for the national CS&IT conference (or both) or its registration fees. That is, with increased registration fees to cover the additional OA costs, at least one of my students or I would lose out on participating in even a local conference, or students would be less exposed to doing research and obtaining programming experience that helps them to get a better job or a better chance at obtaining a scholarship for postgraduate studies. To name but a few trade-offs.

Effectively, the system has moved from “free access to the scientific literature anyway” (the online CRCs), to “free access plus losing money (i.e.: all that I could have done with it) in the process”. That’s not an improvement on the ground.

Further, my hard-earned research funds are mine, and I’d like to decide what to do with them, rather than having that decision taken for me. Who do the rich boys up North think they are to say that I should spend them on OA when the papers were already free, rather than giving a student an opportunity to go to a national conference or devise and implement an algorithm, or participate in an experiment etc.! (Setting aside them trying to reprimand and ‘educate’ me on the goodness; tsk! as if I don’t know that the free flow of scientific information is a good thing.)

Tell me, why should the OA principles trump the capacity building when the papers are free access already anyway? I’ve not seen OA advocates actually weighing up any alternatives on what would be the better good to spend money on. As to possible answers, note that an “it ought to be the case that there would be enough money for both” is not a valid answer in discussing trade-offs, nor is a “we might add a bit of patching up as conference registration reduction for those needy that are not in the rich inner core”, for it hardly ever happens, nor is an “it’s not much for each instance, you really should be able to cover it”, because many instances do add up. We all know that funding for universities and for research in general is being squeezed left, right, and centre in most countries, especially over the past 8-10 years, and such choices will have to be made, and already are being made. These are not just choices we face in Africa; this holds also in richer countries, like in the EU (fewer resources in relative or absolute terms and greater divides), although 250 euro (the 5 conferences scenario) won’t go as far there as in low-income countries.

Also, and regardless of the funding squeeze: why should we start paying for free access that already was de facto, and with most CS proceedings publishers also de jure, free access anyway? I’m seriously starting to wonder who’s getting kickbacks for promoting and pushing this sort of scheme. It’s certainly not me, nor would I take it if some publisher were to offer it to me, as it contributes to the flow of even more money from universities and research institutes to the profits of multinationals. If it’s not kickbacks, then to all those new ‘conference proceedings need to be OA’ advocates: why do you advocate paying for a right that we had for free? Why isn’t it enough for you to just pay for a principle yourself as you so desire, instead of insisting on forcing others to do so too, even when there is already a tacit and functioning agreement going on that realises that aim of free flow of knowledge?

Sure, the publisher has a responsibility to keep the papers available in perpetuity, which I don’t, and link rot does exist. One could easily write a script to search all academics’ websites and get the files, like citeseer used to do well. Such projects get funding for long-term archiving, like arxiv.org does as well, and philpapers, and SSRN as popular ones (see also a comprehensive list of preprint servers), and most institutions’ repositories, too (e.g., the CS@UCT pubs repository). So, the perpetuity argument can also be taken care of that way, without the researchers actually having to pay more.
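
To give an idea of how little such a script needs at its core, here is a minimal sketch in Python (standard library only) that pulls the PDF links out of an author’s publication page. It is an illustration under assumptions, not how citeseer works: the class and function names are made up, the page and URL are hypothetical, and fetching the pages, politeness towards servers, and de-duplication are deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkCollector(HTMLParser):
    """Collects the targets of <a href="..."> links that point to PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            # Resolve relative links against the page's own address.
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return absolute URLs of all PDF links found in the given HTML text."""
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.pdf_links


# Hypothetical publications page of some author.
page = (
    '<ul><li><a href="papers/cnl18.pdf">CRC</a></li>'
    '<li><a href="/about.html">About</a></li>'
    '<li><a href="https://example.org/files/fois18.PDF">CRC</a></li></ul>'
)
for link in find_pdf_links(page, "https://example.org/~author/"):
    print(link)
```

An archiving service would then download and store each of those files, which is the part that needs the long-term funding.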

Really, if you’re swimming in so much research money that you want to pay for a principle that was realised without costs to researchers, then perhaps instead do fund the event so that, say, some student grants can be given out, or so that it can contribute to some nice networking activity, or whatever part of the costs. The new “we should pay for OA, notwithstanding that no one was suffering when it was for free” attitude for CS conference proceedings is way too fishy to actually be honest; if you’re honest and not getting kickbacks, then it’s a very dumb thing to advocate for.

For the two events that I’m involved in where this scheme is happening, I admit I didn’t forcefully object at the time it was mentioned (nor had I really thought through the consequences). I should have, though. I will do so next time.

An Ontology Engineering textbook

My first textbook “An Introduction to Ontology Engineering” (pdf) has just been released as an open textbook. I have revised, updated, and extended my earlier lecture notes on ontology engineering, amounting to about 1/3 more new content cf. its predecessor. Its main aim is to provide an introductory overview of ontology engineering and its secondary aim is to provide hands-on experience in ontology development that illustrates the theory.

The contents and narrative are aimed at advanced undergraduate and postgraduate level in computing (e.g., as a semester-long course), and the book is structured accordingly. After an introductory chapter, there are three blocks:

  • Logic foundations for ontologies: languages (FOL, DLs, OWL species) and automated reasoning (principles and the basics of tableau);
  • Developing good ontologies with methods and methodologies, the top-down approach with foundational ontologies, and the bottom-up approach to extract as much useful content as possible from legacy material;
  • Advanced topics, with a selection of sub-topics: Ontology-Based Data Access, interactions between ontologies and natural languages, and advanced modelling with additional language features (fuzzy and temporal).

Each chapter has several review questions and exercises to explore one or more aspects of the theory, as well as descriptions of two assignments that require using several sub-topics at once. More information is available on the textbook’s page [also here] (including the links to the ontologies used in the exercises), or you can click here for the pdf (7MB).

Feedback is welcome, of course. Also, if you happen to use it in whole or in part for your course, I’d be grateful if you would let me know. Finally, if this textbook is used half (or even a quarter) as much as the 2009/2010 blogposts have been visited (around 10K unique visitors since posting them), that would mean there are a lot of people learning about ontology engineering and then I’ll have achieved more than I hoped for.

UPDATE: meanwhile, it has been added to several open (text)book repositories, such as OpenUCT and the Open Textbook Archive, and it has been featured on unglue.it in the week of 13-8 (out of its 14K free ebooks).

Ontology, part-whole relations, isiZulu and culture

The title is a mouthful, but it can go together. What’s interesting is that the ‘common’ list of part-whole relations is not exactly like that in isiZulu and Zulu culture.

Part-whole relations have been proposed over the past 30 years, for instance to relate a human heart to the human it is part of, Gauteng to South Africa, in which it is located (geographically a part of), and a slice of the cake to the cake it is a portion of, and they seemed well-established by now. The figure below provides an informal view of them.

Informal taxonomy of common part-whole relations (source: [2])

My co-author, Langa Khumalo, and I already had an inkling this hierarchy probably would not work for isiZulu, based, first, on a linguistic analysis to generate natural language [1], and, second, on the fact that the Shuter & Shooter English-isiZulu dictionary already lists 18 translations for just ‘part’ alone. Yet, if those ‘common’ part-whole relations are universal, the differences observed ought to be just an artefact of language, not ontological differences. To clear up the matter, we guided ourselves with the following questions:

  1. Which part-whole relations have been named in isiZulu, and to what extent are they not only lexically but also semantically distinct?
  2. Can all those part-whole relations be mapped with equivalence relations to the common part-whole relations?
  3. For those that cannot be mapped with equivalence relations: is the difference in meaning ontologically possibly interesting for ontology engineering?
  4. Is there something different as gleaned from isiZulu part-whole relations that is useful in improving the theoretical appreciation of part-whole relations?

To figure this out, we first took a bottom-up approach with evidence gathering, and then augmented it with further ontological analysis. Plodding through the isiZulu-English dictionaries got us 81 terms that had something to do with parts. 41 were discarded because they were not applicable upon closer inspection (e.g., idioms, or referring to creating parts cf. relating parts). Further annotations and examples were added, which reduced it to 28 (+ 3 that we had missed and that were added). Of those 28, we selected 13 for ontological analysis and formalisation. That selection was based on importance (like ingxenye) and included some that seemed a bit overly specific, like iqatha for portions of meat, and meat only. The hierarchy of the final selection is shown in the figure below, with an informal indication of what the relation relates.

Selected isiZulu terms with informal descriptions. (Source: [2])

They held up ontologically, i.e., some are the same as the ‘common’ ones, yet some others are really different, like hlanganyela for a collective (cf. individual object) being part of (participating in) an event. Admittedly, some of the domains/ranges aren’t very clearly delineated. For instance, isiqephu relates solid and ‘solid-like’ portions, as in, e.g., Zonke izicezu zesinkwa ziyisiqephu sesinkwa esisodwa ‘all slices of bread are a portion of some loaf of bread’. Where exactly that border of ‘solid-like’ is and when it really counts as a liquid (and thus isiqephu applies no more) is not yet clear; that’s a separate question orthogonal to the relation. Nonetheless, the investigation did clear up several things, especially the more precise umunxa that took me a while to unravel, which turned out to be a chain of parthood relations; e.g., the area where the fireplace is in the hut is a portion of the hut (sample use with the verbaliser: Onke amaziko angumunxa wexhiba). We didn’t touch upon really thorny issues that probably will deserve a paper of their own. For instance, the temporalised parthood isihlephu is used to relate a meaningful scattered part with identity to the whole it was part of, such as the broken-off ear of a cup that was part of the cup (but it cannot be used for the chip of the cup, as a chip isn’t identifiable in the same way as the ear is).

We did try to test the terms against the isiZulu National Corpus to see how the terms are used, but with the limited functionalities and tooling, not as much came out of it as we had hoped for. In any case, the detailed assessment of a section of the corpus did show the relevant uses were not contradicting the formalisation.

Further details can be found in our paper “On the ontology of part-whole relations in Zulu language and culture” that will be presented at the 10th International Conference on Formal Ontology in Information Systems 2018 (FOIS’18) that will be held from 17 to 21 September in Cape Town, South Africa.

As far as I know, this is the first such investigation. Checking out other languages a bit (mainly Spanish and German), and some related works on Turkish and Chinese, it might be the case that there, too, the ‘common’ part-whole relations are not exactly the same. We carried out the whole process systematically, which is described as such in the paper, so that anyone who’d like to do something like this for another language region and culture could follow the same procedure.

 

References

[1] Keet, C.M., Khumalo, L. On the verbalization patterns of part-whole relations in isiZulu. 9th International Natural Language Generation conference (INLG’16), September 5-8, 2016, Edinburgh, UK. ACL, 174-183.

[2] Keet, C.M., Khumalo, L. On the ontology of part-whole relations in Zulu language and culture. 10th International Conference on Formal Ontology in Information Systems 2018 (FOIS’18). IOS Press. 17-21 September, 2018, Cape Town, South Africa. (in print)

Not sorry at all—Review of “Sorry, not Sorry” by Haji Dawjee

Some papers are in the review pipeline for longer than they ought to be and the travel part of conference attendance is a good opportunity to read books. So, instead of writing more about research, here’s a blogpost with a book review, being Sorry, not sorry: Experiences of a brown woman in a white South Africa by South African journalist Haji Mohamed Dawjee. It’s one of those books I bought out of curiosity, as the title intrigued me on two aspects. First, the main title contradicts itself: if you’re not sorry, then don’t apologise for not being so. Second, the subtitle, as it can be useful to read what people who don’t get much media coverage have to say. It turned out to have been published only last month, so let me break with the usual pattern and write a review now rather than wait until the usual January installments.

The book contains 20 essays on Dawjee’s experiences, both broadly and through many specific events, and reflections on them, of growing up and working in South Africa. Depending on your background, you’ll find more or less recognisable points in it, or perhaps none at all and you’ll just eat the whole spiced dish served, but if you’re a woke South African white or think of yourself as a do-gooder white, you probably won’t like certain sections of it. As it is not my intention to write a very long review, I’ve picked a few essays to comment on, but there’s no clear single favourite among the essays. There are two essays that I think the book could have done without, but well, I suppose the author is asserting something with them that has something to do with the first essay and that I’m just missing the point. That first essay is entitled ‘We don’t really write what we like’ and relates back to Biko’s statement and essay collection I write what I like, not the Writing what we like essay collection of 2016. It describes the media landscape, the difficulties of people of colour to get published, and that their articles are always expected to have some relevance and insight (“having to be on the frontlines of critical thinking”) rather than some drivel that white guys can get away with, as “We too have nice experiences. We think about things and dream and have magic in us. We have fuzzy fables to share.” Dawjee doesn’t consider such airy-fairy stories by the white guys to be brave, but as exhibiting opportunity and privilege, and she wants to have that opportunity and privilege, too. This book, however, is mainly of the not-drivel and making-a-point sort of writing rather than flowery language devoid of a message.

For instance, she describes what it was like from the journalism side when Mandela died, and how the magazine she was working for changed her story describing a successful black guy into one “more Tsotsi-like”, because “[t]he obvious reason for the editorial manipulation was that no-one wanted a story of a good black kid. Only white kids are intrinsically exceptional.” (discussed in the essay ‘The curious case of the old white architect’). Several essays describe unpleasant behind-the-scenes experiences in journalism, such as at YOU magazine, and provide a context to her article Maid in South Africa that had as blurb “White people can walk their dogs, but not their children”, which apparently caused a shitstorm on social media. There was an opinion-piece response by one of Dawjee’s colleagues, “coming to my ‘rescue’”, who “needed to whitesplain my thoughts and sanitise them with her ‘wokeness’” (p190). It’s a prelude to finishing off on a high note (more about that further below), and illustrates one of the recurring topics: the major irritation with the do-gooders, woke whites, the ones who put themselves in the ‘good whites’ box and ‘liberal left’, but who nonetheless still contribute to systemic racism. This relates to Biko’s essay on the problems with white liberals and similar essays in his I write what I like, there described as a category, and in Dawjee’s book illustrated with multiple examples.

 

In an essay quite different in style, ‘Why I’m down with Downton Abbey’ (the TV series), Dawjee revels in the joys of seeing white servants doing the scurrying around, cooking, cleaning etc. for the rich. On the one hand, knowing a little of South African society by now, that is understandable. On the other hand, it leaves me wondering just how messed up the media are that people here still (this is not the first or second time I came across this topic) seem to think that up in Europe most or all families also have maids and gardeners. They don’t. As one Irish placard had put it, “clean up your own shite” is the standard, as is DIY gardening and cooking. Those chores, or joys, are done by the women, children, and men of the nuclear family, not hired help.

Related to that latter point (who’s doing the chores), two essays have to do with feminism and Islam. The essay title ‘And how the women of Islam did slay’ speaks for itself. And, yes, as Dawjee says, it cannot be repeated often enough that there were strong, successful, and intelligent women at the bedrock of Islam and women actually do have rights (unlike under Christianity); in case you want some references on women’s rights under Islam, have a look at the essay I wrote a while ago about it. ‘My mother, the true radical’ touches upon notions of feminism and who gets to decide who is feminist when and in what way.

 

I do not quite agree with Dawjee’s conclusion drawn from her Tinder experiences in ‘Tinder is a pocket full of rejection, in two parts’. On p129 she writes “Tinder in South Africa is nothing but fertile ground for race-based rejection.” If it were a straightforward case of just race-based swiping, then, statistically, I should have had lots of matches with SA white guys, as I surely look white with my pale skin, blue eyes, and dark blonde hair (that I ended up in the 0.6% ‘other’ box in the SA census in 2011 is a separate story). But, nada. In my 1.5 years of Tinder experiment in Cape Town, I never ever got a match with a white guy from SA either, but plenty of matches with blacks, broad and narrow. I still hypothesise that the lack of matches with the white guys is because I list my employer, which scares away men who do not like women who’ve enjoyed some higher education, as it has scared away countless men in several other countries as well. Having educated oneself out of the marriage market, it is also called. There’s a realistic chance that a majority of those South African whites that swiped left on Dawjee are racist, but, sadly, their distorted views on humanity include insecurities on more than one front, and I’m willing to bet that Dawjee having an honours degree under her belt will have contributed to it. That said, two anecdotes don’t make data, and an OKCupid-type of analysis like Rudder’s Dataclysm (review), but then of Tinder data, would be interesting so as to get to the bottom of that.

 

The two, imho, skippable essays are “Joining a cult is a terrible idea” (duh) and “Depression: A journal”. I’m not into too personal revelations, and would have preferred a general analysis on how society deals, or not, with mental illness, or, if something more concrete, to relate it to, say, the Life Esidimeni case from whichever angle.

 

Meandering around through the various serious subtopics and digressions, as a whole, the essays combine into chronicling the road taken by Dawjee to decolonise her mind, culminating in a fine series of statements in the last part of the last essay. She is not sorry for refusing to be a doormat, saying so, and the consequences that that will have for those who perpetuate and benefit from systemic racism, and she now lives from a position of strength rather than struggling and doubting as a receiver of it.

 

Overall, it was an interesting book and worthwhile to have read. The writing style is very accessible, so one can read the whole book in a day or so. In case you are still unsure whether you want to read it or not: there are free book extracts of ‘We don’t really write what we like’, ‘Begging to be white?’, and ‘And how the women of Islam did slay’ and, at the time of writing this blog post, one written review on News24 and Eusebius McKaiser’s Radio 702 interview with Dawjee (both also positive about the book).

‘Problem shopping’ and networking at IST-Africa’18 in Gaborone

There are several local and regional conferences in (Sub-Saharan) Africa with a focus on Africa in one way or another, be it for, say, computer science and information systems in (mainly) South Africa, computer networks in Africa, or for (computer) engineers. The IST-Africa series covers a broad set of topics, and papers must explicitly state how and where the research output is useful within an African context; hence, a considerable proportion of the scope is within the ICT for Development sphere. I had heard from colleagues it was a good networking opportunity, one of my students had obtained some publishable results during her CS honours project that could be whipped into paper-shape [1], I hadn’t been to Botswana before, and I’m on sabbatical so have some time. To make a long story short: the conference just finished, and I’ll write a bit about the experiences in the remainder of this post.

First, regarding the title of the post: I’m not quite an ICT4D researcher, but I do prefer to work on computer science problems that are based on actual problems that don’t have a solution yet, rather than invented toy examples. Many papers presented at the conference were elaborate on problem specification: the authors had gone out in the field and done the contextual inquiries, attitude surveys, and the like, so as to better understand the multifaceted problems themselves before working toward a solution that will actually work (cf. the white elephants littered around the continent). So, in a way, the conference also doubled as a ‘problem shopping’ event, though note that many solutions were presented as well. Here’s a brief smorgasbord of them:

  • Obstacles to eLearning in, say, Tanzania: internet access (40% only), lack of support, lack of local digital content, and too few data-driven analyses of experiments [2].
  • Digital content for healthcare students and practitioners in WikiTropica [3], which has the ‘usual’ problems of low resource needs (e.g., a textbook with lots of pictures that has to work on a mobile phone or tablet nonetheless), the last mile, and language. Also: how to get people to participate in developing such resources? That’s still an open question; students of my colleague Hussein Suleman have been trying to figure out how to motivate them. As to the 24 responses by participants to the question “…Which incentive do you need?”, the results were: 7 money/devices, 7 recognition, 4 none, 4 humanity/care/usefulness, 1 share & learn, and 1 not sure (my encoding).

    Content collaboration perceptions

    information sharing perceptions

    With respect to practices and attitudes toward information sharing, the answers were not quite encouraging (see thumbnails). Of course, all this is but a snapshot, but still.

  • The workshop on geospatial sciences & land administration had a paper on building a national database infrastructure that wasn’t free of challenges, among others: buying data is costly, data is available but lacks metadata, privacy issues, and data that has already been collected where one can’t ask for consent again to repurpose it (p16) [4].
  • How to overcome the (perceived to be the main) hurdle of lack of trust in electronic voting in Kenya [5]. In Thiga’s case, they let the students help code the voting software and kept things ‘offline’ with a local network in the voting room and the server in sight [5]. There were lively comments in the whole session on voting (session 8c), including privacy issues, auditability, whether blockchain could help (yes on auditability and also anonymity, but it consumes a lot of [too much?] electricity, according to a Namibian delegate also in attendance), and scaling up to the population or not (probably not for a while, due to digital literacy and access issues, in addition to the trust issue). The research and experiments continue.
  • Headaches of data integration in Buffalo City to get the water billing information system working properly [6]. There are the usual culprits in system integration from the information systems viewpoint (e.g., no buy-in by top management or users) that were held against the case in the city (cf. the CS side of the equation, like noisy data, gaps, vocabulary alignment etc.). Upon further inquiry, specific issues came to the surface, like water meters not having been read for several years, with customers paying some guesstimate all the while, and the interaction between paying for water (one system) and electricity (another system) causing problems for customers even when they have paid, among others [6]. A framework was proposed, but that hasn’t solved the actual data integration problem.

There were five parallel sessions over the three days (programme), so there are many papers to check out still.

As to networking with people in Africa, it was especially good to meet African ontologists and semantic web enthusiasts, to learn of the Botswana National Productivity Centre (a spellchecker might help, though it would need a bit more research for seTswana), and, completely unrelated, to end up bringing up the software-based clicker system we developed a few years ago (which still works). The sessions were well-attended (most of us having seen monkeys and beautiful sunsets, done game drives and such) and for many it was a unique opportunity, ranging from lucky postgrads with some funding to professors from the various institutions. A quick scan through the participants list showed that relatively many participants are affiliated with institutions from South Africa, Botswana, Tanzania, Kenya, and Uganda, but also a few from Cameroon, Burkina Faso, Senegal, Angola, and Malawi, among others, and a few from outside Africa, such as the USA, Finland, Canada, and Germany. There was also a representative from the EU’s DEVCO and from GEANT (the one behind Eduroam). Last, but not least, not only the Minister of Transport and Communication, Onkokame Kitso, was present at the conference’s opening ceremony, but also the brand new (39 days and counting) President of Botswana, Mokgweetsi Masisi.

No doubt there will be a 14th installment of the conference next year. The paper deadline tends to be in December and extended into January.

 

References

(papers are now only on the USB stick but will appear in IEEE Xplore soon)

[1] Mjaria F, Keet CM. A statistical approach to error correction for isiZulu spellcheckers. IST-Africa 2018.

[2] Mtebe J, Raphael C. A critical review of eLearning Research trends in Tanzania. IST-Africa 2018.

[3] Kennis J. WikiTropica: collaborative knowledge management in the field of tropical medicine and international health. IST-Africa 2018.

[4] Maphanyane J, Nkwae B, Oitsile T, Serame T, Jakoba K. Towards the Building of a Robust National Database Infrastructure (NSDI) Developing Country Needs: Botswana Case Study. IST-Africa 2018.

[5] Thiga M, Chebon V, Kiptoo S, Okumu E, Onyango D. Electronic Voting System for University Student Elections: The Case of Kabarak University, Kenya. IST-Africa 2018.

[6] Naki A, Boucher D, Nzewi O. A Framework to Mitigate Water Billing Information Systems Integration Challenges at Municipalities. IST-Africa 2018.

CFP 6th Controlled Natural Languages workshop

Here’s some advertisement to submit a paper to a great scientific event that has a constructive and stimulating atmosphere. How can one claim these positive aspects upfront, one might wonder. I happen to have participated in previous editions (e.g., this time and another time) and now I’m also a member of the organising committee for this 6th edition of the workshop, and we’ll do our best to make it a great event again.

 

——–

Final Call for Papers

Sixth Workshop on Controlled Natural Language (CNL 2018)

Submission deadline (All papers): 15 April 2018

Workshop: 27-28 August 2018 in Maynooth, Co Kildare, Ireland

This workshop on Controlled Natural Language (CNL) has a broad scope and embraces all approaches that are based on natural language and apply restrictions on vocabulary, grammar, and/or semantics.

The workshop proceedings will be published open access by IOS Press.

For further information, please see: http://www.sigcnl.org/cnl2018.html