Reblogging 2010: South African women on leadership in science, technology and innovation

From the “10 years of keetblog – reblogging: 2010”: while the post’s data are from 5 years ago, there’s still room for improvement. That said, it’s not nearly as bad as in some other countries, like the Netherlands (though the university near my home town improved from 1.6% to 5% women professors over the past 5 years). As for the places I worked post-PhD, the percentage of female academics with a full-time permanent contract: FUB-KRDB group 0% (still now), UKZN-CS-Westville: 12.5% (me; 0% now), UCT-CS: 42%.

South African Women on leadership in science, technology and innovation; August 13, 2010


Today I participated in the Annual NACI symposium on the leadership roles of women in science, technology and innovation in Pretoria, organized by the National Advisory Council on Innovation; I report on the symposium further below. As preparation for the symposium, I searched a bit to consult the latest statistics and see if there are any ‘hot topics’ or ‘new approaches’ to improve the situation.

General statistics and their (limited) analyses

The Netherlands used to be at the bottom end of the country league tables on women professors (from my time as elected representative in the university council at Wageningen University, I remember a UN table from ’94 or ’95 where the Netherlands was third from last among all countries). It has not improved much over the years. From Myklebust’s news item [1], I traced the statistics to Monitor Women Professors 2009 [2] (carried out by SoFoKleS, the Dutch social fund for the knowledge sector): less than 12% of the full professors in the Netherlands are women, with the Universities of Leiden, Amsterdam, and Nijmegen leading the national league table and the testosterone bastion Eindhoven University of Technology closing the ranks with a mere 1.6% (2 out of 127 professors are women). With the baby boom generation lingering on and clogging the pipeline for a while now, the average percentage increase has been about 0.5% a year, which is way too low to come even near the EU Lisbon Agreement Recommendation’s target of 25% by 2010, or even the Dutch target of 15%. This large cohort will retire soon, however, which, in the words of the report authors, makes for a golden opportunity to move toward gender equality more quickly. The report has also come up with a “Glass Ceiling Index” (GCI, the percentage of women in job category X-1 divided by the percentage of women in job category X) and, implicitly, an “elevator” index for men in academia. In addition to the hard data to back up the claim that the pipeline is leaking at all stages, they note it varies greatly across disciplines (see Table 6.3 of the report): in science, the most severe blockage is from PhD to assistant professor, in Agriculture, Technology, Economics, and Social Sciences it is the step from assistant to associate professor, and for Law, Language & Culture, and ‘miscellaneous’, the biggest hurdle is from associate to full professor.
Of all GCIs, the highest (2.7) is in Technology for the promotion from assistant to associate professor, whereas there is almost parity at that stage in Language & Culture (GCI of 1.1, the lowest value anywhere in Table 6.3).
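To make the GCI definition a bit more tangible, here is a minimal sketch in Python. It only implements the ratio as described above (share of women in rank X-1 divided by share of women in rank X); the function names and the percentages in `ranks` are made up for illustration and are not taken from the report.

```python
def glass_ceiling_index(pct_women_lower: float, pct_women_higher: float) -> float:
    """GCI for one promotion step: % women in rank X-1 divided by % women in rank X."""
    if pct_women_higher <= 0:
        raise ValueError("share of women in the higher rank must be positive")
    return pct_women_lower / pct_women_higher


# Hypothetical shares of women per rank (in percent), ordered from junior to senior.
# These numbers are illustrative only; they do NOT come from the Monitor report.
ranks = [
    ("PhD", 40.0),
    ("assistant professor", 30.0),
    ("associate professor", 15.0),
    ("full professor", 12.0),
]


def gci_per_step(ranks):
    """Return the GCI for each consecutive pair of ranks."""
    return {
        f"{lower} -> {higher}": round(glass_ceiling_index(p_lower, p_higher), 2)
        for (lower, p_lower), (higher, p_higher) in zip(ranks, ranks[1:])
    }


steps = gci_per_step(ranks)
worst_step = max(steps, key=steps.get)  # the step with the thickest 'glass ceiling'
print(steps)
print("Most severe blockage:", worst_step)
```

A GCI of 1 means parity between two consecutive ranks; the further above 1, the more women are lost at that promotion step.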

“When you’re left out of the club, you know it. When you’re in the club, you don’t see what the problem is.” Prof. Jacqui True, University of Auckland [4]

Elsewhere in ‘the West’, statistics can look better (see, e.g., The American Association of University Professors (AAUP) survey on women 2004-05), or are not great either (UK, see [3], but the numbers are a bit outdated). However, one can wonder about the meaning of such statistics. Take, for instance, the NYT article on a poll about paper rights vs. realities carried out by the Pew Research Center in 22 countries [4]: in France, some 100% paid lip service to being in favour of equal rights, yet 75% also said that men had a better life. It is only in Mexico (56%), Indonesia (55%) and Russia (52%) that the people who were surveyed said that women and men have achieved a comparable quality of life. But note that the latter statement is not the same as gender equality. And equal rights and opportunities by law do not automatically imply that the operational structures are non-discriminatory and an adequate reflection of the composition of society.

A table that has generated much attention and many questions over the years (but, as far as I know, no conclusive answers) is the one published in Science Magazine [5] (see figure below). Why is it the case that there are relatively many more women physics professors in countries like Hungary, Portugal, the Philippines and Italy than in, say, Japan, USA, UK, and Germany? Recent guesses at the answer (see blog comments) are as varied as the anecdotes mentioned in the paper.

Physics professors in several countries (Source: 5).

Barinaga’s [5] collection of anecdotes on several influential factors across cultures includes: a country’s level of economic development (longer established science propagates the highly patriarchal society of previous centuries), the status of science there (e.g., low and ‘therefore’ open to women), class structure (pecking order: rich men, rich women, poor men, poor women vs. gender structure rich men, poor men, rich women, poor women), educational system (science and mathematics compulsory subjects at school, all-girls schools), and the presence or absence of support systems for combining work and family life (integrated society and/or socialist vs. ‘Protestant work ethic’), but the anecdotes “cannot purport to support any particular conclusion or viewpoint”. It also notes that “Social attitudes and policies toward child care, flexible work schedules, and the role of men in families dramatically color women’s experiences in science”. More details on statistics of women in science in Latin America can be found in [6] and [7], which look a lot better than those of Europe.

Barbie the computer engineer

Bonder, in her analysis for Latin America [7], has an interesting table (cuadro 4) on the changing landscape for trying to improve the situation: data is one thing, but how to struggle, and which approaches, advertisements, and policies have been, can, or should be used to increase women’s participation in science and technology? Her list is certainly more enlightening than the lame “We need more TV shows with women forensic and other scientists. We need female doctor and scientist dolls.” (says Lotte Bailyn, a professor at MIT) or “Across the developed world, academia and industry are trying, together or individually, to lure women into technical professions with mentoring programs, science camps and child care.” [8], which only very partially address the issues described in [5]. Bonder notes shifts in approaches from focusing only on women/girls to both sexes, from change in attitude to change in structure, from change of women (taking men as the norm) to change in power structures, from focusing on formal opportunities to aiming to change the real opportunities in discriminatory structures, from making visible non-traditional role models to making visible the values, interests, and perspectives of women, and from the simplistic gender dimension to the broader articulation of gender with race, class, and ethnicity.

The NACI symposium

The organizers of the Annual NACI symposium on the leadership roles of women in science, technology and innovation provided several flyers and booklets with data about women and men in academia and industry, so let us start with those. Page 24 of Facing the facts: Women’s participation in Science, Engineering and Technology [9] shows the figures for women by occupation: 19% full professor, 30% associate professor, 40% senior lecturer, 51% lecturer, and 56% junior lecturer, which are in a race distribution of 19% African, 7% Coloured, 4% Indian, and 70% White. The high percentage of women’s participation (compared to, say, the Netherlands, as mentioned above) is somewhat overshadowed by the statistics on research output among South African women (p29, p31): female publishing scientists are just over 30% and women contributed only 25% of all article outputs. That low percentage clearly has to do with the lopsided distribution of women toward the lower end of the scale, with many junior lecturers who conduct much less research because they have a disproportionately heavy teaching load (a recurring topic during the breakout session). Concerning the distribution of grant holders in 2005, in the Natural & agricultural sciences, about 24% of the total grants (211 out of 872) were awarded to women and in engineering & technology it is 11% (24 out of 209 grants) (p38). However, in Natural & agricultural sciences, women make up 19% and in engineering and technology 3%, which, taken together with the grant percentages, shows that a disproportionately large share of women have been obtaining grants in recent years. This suggests that the ones who actually do make it to the advanced research stage are at least as good as, if not better than, their male counterparts. Last year, women researchers (PIs) received more than half of the grants and more than half of the available funds (table in the ppt presentation of Maharaj, which will be made available online soon).
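For those who like to check the arithmetic: the following small Python sketch recomputes the grant shares from the counts quoted above and compares them with the 19% and 3% figures for women in each field. The dictionary layout and the function names are mine, and the "over-representation ratio" is just my shorthand for grant share divided by women's share, not a measure from the booklet.

```python
# (grants awarded to women, total grants, % women in the field), as quoted in the text.
fields = {
    "natural & agricultural sciences": (211, 872, 19.0),
    "engineering & technology": (24, 209, 3.0),
}


def grant_share(awarded_to_women: int, total: int) -> float:
    """Percentage of all grants in a field that went to women."""
    return 100.0 * awarded_to_women / total


def over_representation(share_pct: float, women_pct: float) -> float:
    """Grant share relative to women's share of the field (1.0 = proportional)."""
    return share_pct / women_pct


results = {}
for field, (awarded, total, women_pct) in fields.items():
    share = grant_share(awarded, total)
    results[field] = (round(share, 1), round(over_representation(share, women_pct), 2))
    print(f"{field}: {share:.1f}% of grants, ratio {results[field][1]}")
```

A ratio above 1 in both fields is what the 'disproportionate' remark in the paragraph above refers to.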

Mrs Naledi Pandor, the Minister for Science and Technology, gave the opening speech of the event, which was a good and entertaining presentation. She talked about the lack of qualified PhD supervisors to open more PhD positions, where the latter is desired so as to move to the post-industrial, knowledge-based economy, which, in theory at least, should make it easier for women to participate than in an industrial economy. She also mentioned that one should not look at just the numbers, but instead at the institutional landscape so as to increase opportunities for women. Last, she summarized the “principles and good practice guidelines for enhancing the participation of women in the SET sector”, which are threefold: (1) sectoral policy guidelines, such as gender mainstreaming, transparent recruiting policies, and health and safety at the workplace, (2) workplace guidelines, such as flexible working arrangements, remuneration equality, mentoring, and improving communication lines, and (3) re-entry into the Science, Engineering and Technology (SET) environment, such as catch-up courses, financing fellowships, and remaining in contact during a career break.

Dr. Thema, former director of international cooperation at the Department of Science and Technology, added the issues of the excessive focus on administrative practicalities, the apartheid legacy and frozen demographics, and noted that where there is no women’s empowerment, this is in violation of the constitution. My apologies if I have written her name and details wrongly: she was a last-minute replacement for Prof. Immaculada Garcia Fernández, department of computer science at the University of Malaga, Spain. Garcia Fernández did make her slides available, which focused on international perspectives on women’s leadership in STI. Among many points, she notes that the working conditions for researchers “should aim to provide… both women and men researchers to combine work and family, children and career” and “Particular attention should be paid, to flexible working hours, part-time working, tele-working and sabbatical leave, as well as to the necessary financial and administrative provisions governing such arrangements”. She poses the question “The choice between family and profession, is that a gender issue?”

Dr. Romilla Maharaj, executive director for human and institutional capacity development at the National Research Foundation, came with much data from the same booklet I mentioned in the first paragraph, but little qualitative analysis of this data (there is some qualitative information). She wants to move from the notion of “incentives” for women to “compensation”. The aim is to increase the number of PhDs five-fold by 2018 (currently the rate is about 1200 each year), which is not going to be easy (recollect the comment by the Minister, above). Concerning policies targeted at women’s participation, they appear to be successful for white women only (in postdoc bursaries, white women even outnumber white men). In my opinion, this smells more of a class/race structure issue than a gender issue, as mentioned above and in [5]. Last, the focus of improvements, according to Maharaj, should be on institutional improvements. However, during the break-out session in the afternoon, which she chaired, she seemed to be selectively deaf on this issue. The problem statement for the discussion was the low research output by women scientists compared to men, and how to resolve that. Many participants reiterated the lack of research time due to the disproportionately heavy teaching load (compared to men) and what is known as ‘death by committee’, and the disproportionate number of (junior) lecturers who are counted in the statistics as scientists but, in practice, do no, or very little, research, thereby pulling down the overall statistics for women’s research output. Another participant wanted to see a further breakdown of the numbers by age group, as the suspicion was that it is old white men who produce most papers (who teach less, have more funds, supervise more PhD students etc.) (UPDATE 13-10-’10: I found some data that seems to support this).
In addition, someone pointed out that counting publications is one thing, but considering their impact (by citations) is another, for which no data was available, so a recommendation was made to investigate this further as well (and to set up a gender research institute, which apparently does not yet exist in South Africa). The pay-per-publication scheme implemented at some universities could thus backfire for women (who require the time and funds to do research in the first place so as to get at least a chance to publish good papers). Maharaj’s own summary of the break-out session was an “I see, you want more funds”, but that does not rhyme fully with the institutional change she mentioned earlier nor with the multi-faceted problems raised during the break-out session that did reveal institutional hurdles.

Prof. Catherine Odora Hoppers, DST/NRF South African Research Chair in Development Education (among many things), gave an excellent speech with provoking statements (or: calling a spade a spade). She noted that going into SET means entering an arena of bad practice and intolerance; to fix that, one first has to understand how bad culture reproduces itself. The problem is not the access, she said, but the terms and conditions. In addition, and as several other speakers had already alluded to as well, she noted that one has to deal with the ghosts of the past. She put this in a wider context of the history of science with the value system it propagates (Francis Bacon, my one-line summary of the lengthy quote: science as a means to conquer nature so that man can master and control it), and the ethics of SET: SET outcomes have, and have had, some dark results, where she used the examples of the atom bomb, gas chambers, how SET was abused by the white male belittling the native and that it has been used against the majority of people in South Africa, and climate change. She sees the need for a “broader SET”, meaning ethical, and (in my shorthand notation) with social responsibility and sustainability as essential components. She is putting this into practice by stimulating transdisciplinary research at her research group, and, at least as a first step: people from different disciplines must be able to talk to each other and understand each other.

To me, as an outsider, it was very interesting to hear what the current state of affairs is regarding women in SET in South Africa. While there were complaints, there were also suggestions for solutions, and it was clear from the data available that some improvements have been made over the years, albeit only in certain pockets. More people registered for the symposium than places available, and with some 120 attendees from academia and industry at all stages of the respective career paths, it was a stimulating mix of input that I hope will further improve the situation on the ground.


[1] Jan Petter Myklebust. THE NETHERLANDS: Too few women are professors. University World News, 17 January 2010, Issue: 107.

[2] Marinel Gerritsen, Thea Verdonk, and Akke Visser. Monitor Women Professors 2009. SoFoKleS, September 2009.

[3] Helen Hague. 9.2% of professors are women. Times Higher Education, May 28, 1999.

[4] Victoria Shannon. Equal rights for women? Survey says: yes, but…. New York Times/International Herald Tribune—The female factor, June 30, 2010.

[5] Marcia Barinaga. Overview: Surprises Across the Cultural Divide. Compiled in: Comparisons across cultures. Women in science 1994. Science, 11 March 1994 263: 1467-1496 [DOI: 10.1126/science.8128232]

[6] Beverley A. Carlson. Mujeres en la estadística: la profesión habla. Red de Reestructuración y Competitividad, CEPAL – SERIE Desarrollo productivo, nr 89. Santiago de Chile, Noviembre 2000.

[7] Gloria Bonder. Mujer y Educación en América Latina: hacia la igualdad de oportunidades. Revista Iberoamericana de Educación, Número 6: Género y Educación, Septiembre – Diciembre 1994.

[8] Katrin Bennhold. Risk and Opportunity for Women in 21st Century. New York Times/International Herald Tribune—The female factor, March 5, 2010.

[9] Anon. Facing the facts: Women’s participation in Science, Engineering and Technology. National Advisory Council on Innovation, August 2009.

Article on the Repository of Ontologies for MULtiple Uses (ROMULUS) in print with JoDS

Yay, the paper on the implementation work of my student’s MSc thesis has also made it into a journal: the Repository of Ontologies for MULtiple Uses (ROMULUS) populated with mediated foundational ontologies is now online-first [1] with the Journal on Data Semantics. It will appear in a special issue on extended and revised papers of MEDI’13, edited by Alfredo Cuzzocrea.

Although I have mentioned the beta release of the repository earlier and noted as well that the student, Zubeida Khan, has won the CSIR prize of best Masters, that was 1-2 years ago and more has happened in the meantime.

From the technological viewpoint, there are more features available now than in the beta and MEDI’13 releases, such as the automated foundational interchangeability [2], and there’s more detail on the technologies used as well as an extended EER diagram for ontology storage and annotation, and it has an updated comparison with other repositories and usage statistics. Overall, it is the first attempt to realise the vision of an ontology library that was posed some 12 years ago in WonderWeb Deliverable D18, and it thus ended up having more features than those D18 requirements for a foundational ontology library. Have a look at ROMULUS online.

From a theoretical viewpoint, besides now having a book chapter on the mappings between the foundational ontologies [3], the ‘storyline’ and need for it (known very well in ontology engineering already) has been framed into one where the repository of the foundational ontologies is also needed for ontology-driven conceptual data modelling. Why is that so? There is an increasing number of results on ontology-driven conceptual modelling (see ER’15 proceedings), which avails of foundational ontologies, such as UFO. There are multiple extensions to the conceptual modelling languages based on insights from ontology, and when they are based on different foundational ontologies, one can’t pick-and-choose anymore as there may be incompatibilities in how things are represented. Likewise, choosing one foundational ontology limits, or enables, one to model one thing but not another. For instance, some do have ‘substance’ or ‘amount of matter’ (wine, alcohol and the like), others do not, so that there is, in theory, no place for such things in one’s conceptual data model. That’s not good, or at least complicates matters, for an information system or database that needs to store data about, say, a food processing plant or animal fodder. The paper presents more such issues and how ROMULUS helps address them. Also, just as ROMULUS can help choose the most appropriate foundational ontology for ontology engineering and help analyse the foundational ontologies without reading umpteen papers on them first, it can do so for the conceptual modeller, be this through ONSET or the web-based querying of the ontologies and their alignments.

Finally, in case you think there are shortcomings to the repository to the extent you feel the need to develop your own one: the paper provides ample material on how to build one yourself. If you don’t want to go through that trouble, then contact Zubeida or me for the feature request, and we’ll try to squeeze it in with the other activities.



[1] Khan, Z.C., Keet, C.M. ROMULUS: a Repository of Ontologies for MULtiple USes populated with foundational ontologies. Journal on Data Semantics. DOI: 10.1007/s13740-015-0052-1 (in print)

[2] Khan, Z.C., Keet, C.M. Feasibility of automated foundational ontology interchangeability. 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14). K. Janowicz et al. (Eds.). 24-28 Nov, 2014, Linkoping, Sweden. Springer LNAI 8876, 225-237.

[3] Khan, Z.C., Keet, C.M. Foundational ontology mediation in ROMULUS. Knowledge Discovery, Knowledge Engineering and Knowledge Management: IC3K 2013 Selected Papers. A. Fred et al. (Eds.). Springer CCIS vol. 454, pp. 132-152, 2015. preprint

Yes, the protests reduce productivity of academics as well…

…and no, we’re not worried that we won’t get our bonus this year because as academics we don’t get any bonuses anyway. Just to answer two recent ‘interesting’ questions in these times of nation-wide student protests in South Africa. With everything that’s been going on here, writing a report on attending the 34th International Conference on Conceptual Modelling (ER’15) ended up lower on the list of activities, and by now it’s almost a month ago, so I’ll let that slip by, even though it was great and deserves attention. While I was in Stockholm for ER’15 and afterward for a week at FUB in Bolzano (Italy), nation-wide coordinated student protests were going on, and they still are, albeit with fewer participants. Most people who heard of it at ER and in Bolzano, as well as collaborators, only saw a brief international news item of the violence (police using stun grenades and rubber bullets) and assumed these were typical run-of-the-mill student protests that also happen in other countries. I think this one is different from others, and more complex. Fundamentally, the protests are about the (mostly) young generation expressing that post-apartheid South Africa hasn’t improved nearly enough (neither in the societal nor the educational nor the economic dimension) and demanding a better deal. So, here’s a coloured version of some of it, mainly intended for a non-South African readership to get a bit of an idea what’s going on and put some figures into perspective w.r.t. what I assume most of you are more familiar with. I could try to put up the pretence of objectivity, but I’m probably not objective. Some useful sources are news24, for quick short updates of events as they unfold, and Groundup, for some in-depth articles.


Main concrete issues

Over the past years, government funding of universities has been diminishing, with the shortfall being made up by yearly fee increases. This is an unsustainable financial model, and it excludes more and more qualifying students from studying at a university, especially since the student financial aid scheme hasn’t kept up and the fee increases are higher than the inflation rate and wage increases, which are 4-7% per year. The scheduled 10% for next year was the last straw. After the first week of protests, the students managed to get a commitment from Zuma on Oct 23 for a 0% fee increase for next year. While this is more than we achieved back in the ’90s in the Netherlands when we were protesting against fee increases (among other things), at that time, anyone who qualified could still get just about sufficient funds to attend university for 5 years to get a (Bachelors +) Masters degree (without it, I probably wouldn’t have gone to university either). That is not the case here, not even close: the scholarship (‘studiebeurs’ in NL) we had there then amounts to about 100000 Rand a year here now, and with the average monthly salary at 17000 Rand gross, that’s about half a parent’s net income/year for one year of study. But the average wage is not the kind of amount that leaves extras for saving. Apparently, for a nuclear-family household, one needs a sustained income of at least 500000/year to have enough to save over the years to pay for going to university: yes, at least twice the ‘jan modaal’/average income to be able to afford it. With South Africa having a shameful Gini coefficient of 0.71, go figure how many are in that category.
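To put those figures side by side, here is a back-of-the-envelope calculation in Python. It assumes the average salary figure is to be read as 17000 Rand per month gross and, for simplicity, works with gross income only; since net income is lower, the real share of one year of study would be somewhat higher than what is computed here. The variable names are mine.

```python
# Figures as quoted in the text (in Rand); the monthly salary is read as 17000 gross.
COST_OF_STUDY_PER_YEAR = 100_000    # rough equivalent of the old Dutch scholarship
AVERAGE_MONTHLY_GROSS = 17_000      # assumption: 17000 Rand/month, gross
INCOME_NEEDED_TO_SAVE = 500_000     # household income/year said to be needed

annual_gross = 12 * AVERAGE_MONTHLY_GROSS

# Share of one average gross annual income that one year of study would take up.
study_share_of_gross = COST_OF_STUDY_PER_YEAR / annual_gross

# How many average gross incomes the 'can afford to save for it' threshold amounts to.
income_multiple = INCOME_NEEDED_TO_SAVE / annual_gross

print(f"Average annual gross income: R{annual_gross}")
print(f"One year of study as share of gross income: {study_share_of_gross:.0%}")
print(f"Required income as multiple of the average: {income_multiple:.2f}x")
```

So even on gross income, one year of study is roughly half an average annual salary, and the savings threshold sits at well over twice that average.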

This was only the first core demand. Here, as in many universities across the world, there has been a drive for outsourcing of certain types of work—cleaning, garden maintenance and the like—in a drive for pushing down overhead costs. This might have looked good on the balance sheets at the time when the decisions were made, but the ‘collateral damage’ was that the outsourced workers did not get the benefits anymore that they had as employees of the university. Notably, the fee rebate for themselves and their family members. So, this is a double whammy for workers, making it even harder for their kids to go to university, for having to pay the full fee and for generally being on the really low pay scales that make attending university totally unaffordable and out of reach. At various points in or at the end of the second week of the protests, several universities (including UCT) committed to insourcing: when the contracts with the outsourcing companies terminate, they’ll become university employees again, with the fee rebate benefits.

That’s not all. A dastardly practice that cash-strapped universities resort to in a desperate attempt to get the unpaid or only partially paid fees from students (down to the last cent), is that when students still have outstanding fees to pay, they won’t get their final exam results and won’t be allowed to graduate. But that having-completed-the-degree-but-no-parchment-to-show-for-it limbo is precisely what prevents students from getting decent-paying jobs, or even a job at all, making it harder to pay off the remaining debt; double whammy here as well. Hence, the demand to clear such historical debt, or at least to let them graduate, so they can get a job and start paying back soon (2? 3? 5? years) thereafter. The latter is quite common in other countries, including the country where I studied. (Had they not had that pay-back-later system, many a door would have remained closed to me as well: I had to borrow money for 4 months because of delays due to a serious sports injury near the end of my studies, after the 5 years funded, see above.) This issue is mostly still unresolved in South Africa. To relate to elsewhere: there’s many a sob story about graduates in the USA with “crippling college debts”, but what’s really crippling for one’s career is being stuck with the debt but not having the proof of the degree even though you satisfied its requirements. There’s some 25-30% unemployment rate in South Africa, and a degree paper really does make a difference.


Fair play to them, and I hope they achieve the demands. I would be very hypocritical if I were to not support them, as I have benefited from those things they want to have, and I wish that all countries would have the system we had back in the 1990s. True, I was then at one of the fronts of protests against the breakdown of it, and what we had certainly was not perfect. However, compared to what it has descended into in the Netherlands and other EU countries, and the lamentable state of the funding systems (well: the lack thereof) in most countries of the world, it almost sounds like an education paradise nowadays: finish highest level of (fee-free) secondary school, sign up for a degree at a university of your preference[1], get enough funding for 5 years that covers fees, books, living expenses, and free public transport (condition: >=25% courses passed/year). It should be at least like that, if not better, everywhere.


Other issues intersecting with it

It is not just about access to higher education, though. Once in, there’s still the so-called ‘legacy of apartheid’ to put up with, which many a student wants to see changed. This sneer-quoted term surely includes the racism, which is, perhaps, the only thing non-SA readers from my generation and older may think of. Perhaps less obvious are the issues of the “dead white men”-infested curricula, especially in the humanities, or, to phrase it positively: how to change a Euro-centric curriculum to one that is more relevant to Africa? There are notable African writers, philosophers, etc etc., but they don’t feature much now.

There’s the oppressive space and naming of buildings, with the #RhodesMustFall movement but one instance of trying to change this (tl;dr: Cecil Rhodes was an über-badass among the badass colonisers, yet he had a statue in a central place on campus, which was removed earlier this year).

Government funding post-1994 has focussed primarily on making the lives of the poorest-of-the-poor less hard, by building houses, working on providing potable water, electricity, and the like. Poor students somehow were not allowed to complain, for having the privilege of going to university. However, really scraping by is hard. That’s not of the type ‘just about enough’ I mentioned above, where we could afford cheap food, clothing, and housing (the basic necessities in Maslow’s pyramid). For instance, at the university where I worked before (UKZN), a call to employees was put out in exam time at the end of the year to donate money so that the destitute students would be able to get a meal/day in exam time, as the alternative for them was no food at all. It was also not unusual that students were locked out of residence for not having paid (an unlocked lecture hall serving as make-shift sleeping place). The current protests created a space where such hardships were allowed to be voiced.

Then there’s the crazy police violence. It was not part of the original narrative for the protests, but it has become part of it. Universities here have a tendency to call in the police when there are protests. Once they’re in, they take over. Unpredictable horses and ‘refreshing’ water cannons are one thing (I know of those), and even tear gas (experienced that too), but rubber bullets (!) and the (wtf!) stun grenades, that’s of a yet different level of dastardliness. To add insult to injury, the police spokesperson even declared to be proud/satisfied that the police had acted with restraint. Compared to the massacre that Marikana was (police killed 34 strikers), I guess so, yes, but that certainly ought not to be the yardstick to measure up against. Although there are reports that some more recent protests did not remain peaceful from the protester-side, the protests were peaceful in the early days, when the police provoked with the violence. On a related note: I heard that during the protests, academics on the frontline couldn’t stop the police from charging, but a ‘buffer’ of white students could make them hesitate at least. I’ll leave that fact for you to chew on.

This is not all, but, for now, it’ll have to do for this item, lest the blog post ends up way too long.


On the academics side

On the whole, I have the impression that the majority of academics have been supportive of the initial students’ demands, if not from day one then in hindsight. There have been supportive open letters signed by lots of academics, and a bunch joined in the protests. I cannot recall many supportive statements explicitly from staff/academics unions, however, but this may also be due to news reporting, or perhaps there’s room for a more progressive union. Some are pushed out of their comfort zone and feel it’s a bit scary but ok actually, others desperately want to remain in their comfy bubble and are afraid. Some academics are yelled at for being just too melanin-deficient that they could not possibly support the cause (even when they actually do), but are perceived to be part of the problem; this kind of over-generalising isn’t the way to get more academics on board to support the students’ cause. There’s the term coconut (black on the outside, white on the inside); what would the reverse be? The ‘schoolkrijt’ liquorice sweets they sell at Pick ‘n Pay (white on the outside, some brown-ish mixture on the inside)? Or, better, just human.

UCT was closed for two weeks due to the protests, which was a management decision that most academics did not like. Not for disruption of the daily routine, but for the notion of closing that space where ideas are posed, discussed, analysed, debated, contested, and possibly some solutions found.

It is not at all clear whether admin staff and academics will have to cough up the shortfall due to government’s insufficient compensation of the 0% and the insourcing, so there may be an aftermath match there. The tl;dr of many articles: education is a public good, not an individualist benefit, so society should pay, and a university is not a corporation.

At the same time, we’re devising a range of scenarios to cope with changing situations (like how to handle exam disruption), inform students, adjust things (e.g., rescheduling of revision lectures, the content of the actual exam papers, setting an extra exam) and so on. This takes time away from research and from other activities academics do. Which brings me back to the post’s title: yes, our work is affected in that we don’t get as much done as we usually do, and things slip through (deadline missed, belated response to a student query). In the grand scheme of things, they are minor compared to your (from abroad) typeset-paper-chasing/article-review-invitation/…, and I hope you can bear with the occasional slight delay in my response (for the benefit of SA).

[1] provided you chose the right exam subjects (e.g., to study computer science, you needed maths, to study physics you needed physics as subject in your high school exam), and only medicine, physio, and dentistry were numerus fixus.

Reblogging 2009: Building bias into your database

From the “10 years of keetblog – reblogging: 2009”: The tl;dr of it: bad data management -> bad policy decisions, and how you can embed political preferences and prejudices in a conceptual data model.

While the post has a computing flavor to it especially on the database design and a touch of ontologies, it is surely also of general interest, because it gives some insight into the management of data that is used for policy-making in and for conflict zones. A nicer version of this blog post and the one after that made it into a paper-review article “Dirty wars, databases, and indices” in the Peace & Conflict Review journal (Fall 2009 issue) of the UN-mandated University for Peace in Costa Rica.

Building bias into your database; Jan 7, 2009

 p.s.: while I intended to write a post on attending the ER’15 conferences, the exciting times with the student protests in South Africa put that plan on the backburner for a few more days at least.


For developing bio-ontologies, if one follows Barry Smith and cs., then one is solely concerned with the representation of reality; moreover, it has been noted that ontologies can, or should be, seen as a representation of a scientific theory [1] or at least that they are an important part of doing science [2]. In that case, life is easy, not hard, for we have the established method of scientific inquiry to settle disputes (among others, by doing additional lab experiments to figure out more about reality). Domain- and application ontologies, as well as conceptual data models, for the enterprise universe of discourse require, at times, a consensus-based approach where some parts of the represented information are the outcome of negotiations and agreements among the stakeholders.

Going one step further on the sliding scale: for databases and application software for the humanities, and conflict databases in particular, one makes an ontology or conceptual data model conforming to one’s own (or the funding organisation’s) political convictions and with the desired conclusions in mind. Building data vaults seems to be the intended norm rather than the exception, hence, maintenance and usage and data analysis beyond the developers’ limited intentions, let alone integration, are a nightmare.

 In this post, I will outline some suggestions for building your own politicized representation—be it an ontology or conceptual data model—for armed conflict data, such as terrorist incidents, civil war, and inter-state war. I will discuss in the next post a few examples of conflict data analysis, both regarding extant databases and the ‘dirty war index’ application built on top of them. A later post may deal with a solution to the problems, but for now, it would already be a great help not to adhere to the tips below.

Tips for biasing the representation

In random order, you could do any of the following to pollute the model and hamper data analysis so as to ensure your data is scientifically unreliable but suitable to serve your political agenda.

1. Have a fairly flat taxonomy of types of parties; in fact, just two subtypes suffice: US and THEM, although one could subtype the latter into ‘they’, ‘with them’, and ‘for them’. The analogue, with ‘we’, ‘with us’, and ‘for us’ is too risky for potential of contagion of responsibility of atrocities and therefore not advisable to include; if you want to record any of it, then it is better to introduce types such as ‘unknown perpetrator’ or ‘not officially claimed event’ or ‘independent actor’.

2. Aggregate creatively. For instance, if some of the funding for your database comes from a building construction or civil engineering company, refine that section of target types, or include new target types only when you feel like it is targeted sufficiently often by the opponent to warrant a whole new tuple or table from then onwards. Likewise, some funding agencies would like to see a more detailed breakdown of types of victims by types of violence, some don’t. Last, be careful with the typology of arms used, in particular when your country is producing them; a category like ‘DIY explosive device’ helps masking the producer.

3. Under-/over-represent geography. Play with granularity (by city/village, region, country, continent) and categorization criteria (state borders, language, former chiefdoms, parishes, and so forth), e.g., include (or not) notions such as ‘occupied territory’ (related to the actors) and ‘liberated region’ or ‘autonomous zone’, or that an area may, or may not, be categorized or named differently at the same time. Above all, make the modelling decisions in an inconsistent way, so that no single dimension can be analysed properly.

4. Make an a-temporal model and pretend not to change it, but (a) allow non-traceable object migration so that defecting parties who used to be with US (see point 1) can be safely re-categorised as THEM, and (b) refine the hierarchy over time anyway so as to generate time-inconsistency for target types (see point 2) and geography (see point 3), in order to avoid time series analyses and prevent discovering possible patterns.

5. Have a minimal amount of classes for bibliographic information, lest someone would want to verify the primary/secondary sources that report on numbers of casualties and discovers you only included media reports from the government-censored newspapers (or the proxy-funding agency, or the rebel radio station, or the guerrilla pamphlets).

6. Keep natural language definitions for key concepts in a separate file, if recorded at all. This allows for time-inconsistency in operational definitions as well as ignorance of the data entry clerks so that each one can have his own ideas about where in the database the conflict data should go.

7. Minimize the use of database integrity constraints, hence, minimize representing constraints in the ontology to begin with, hence, use a very simple modelling language so you can blame the language for not representing the subject domain adequately.

I’m not saying all conflict databases use all of these tricks; but some use at least most of them, which ruins the credibility of those databases whose analysts actually did try to avoid these pitfalls (assuming there are such databases, that is). Optimism wants me to believe developers did not think of all those issues when designing the database. However, there is a tendency that each conflict researcher compiles his own data set and that each database is built from scratch.

For the current scope, I will set aside the problems with data collection and how to arrive at guesstimated semi-reliable approximations of deaths, severe injuries, rape, torture victims and so forth (see e.g. [3] and appendix B of [4]). Inherent problems with data collection are one thing and difficult to fix; bad modelling and dubious or partial data analysis are a whole different thing and doable to fix. I elaborate on the latter claim in the next post.


[1] Barry Smith. Ontology (Science). In: C. Eschenbach and M. Gruninger (eds.), Formal Ontology in Information Systems. Proceedings of FOIS 2008. preprint

[2] Keet, C.M. Factors affecting ontology development in ecology. Data Integration in the Life Sciences 2005 (DILS’05), Ludaescher, B, Raschid, L. (eds.). San Diego, USA, 20-22 July 2005. Lecture Notes in Bioinformatics LNBI 3615, Springer Verlag, 2005. pp46-62.

[3] Taback N (2008) The Dirty War Index: Statistical issues, feasibility, and interpretation. PLoS Med 5(12): e248. doi:10.1371/journal.pmed.0050248.

[4] Weinstein, Jeremy M. (2007). Inside rebellion—the politics of insurgent violence. Cambridge University Press. 402p.

Reblogging 2009: A collection of parameters for ontology design

From the “10 years of keetblog – reblogging: 2009”: How the paper introduced in this post came about is a story of its own (it was in the context of finding suitable ontologies for testing Ontology-Based Data Access systems). The short MTSR’09 paper that the post introduces was extended into a journal paper published in IJSMO in 2010.

A collection of parameters for ontology design; June 1, 2009


Ontology design is still more of an art than a science. A methodology, Methontology, does exist, but it does not cover all aspects of ontology development. Likewise, there are tools, such as Protégé and the NeOn toolkit, that make several steps in the whole procedure easier. But, with the plethora of resources around, where should one start developing one’s own domain ontology, what resources are available for reuse to speed up its development, for which purposes can the ontology be developed?

The novice ontology engineer would have to go through much of the extant literature, read case studies and draw their own conclusions on how to go about developing the ontology and/or also attend ontology engineering courses or summer schools, which is a rather high start-up cost.

To ameliorate this, but also to save myself from repeating such information informally, I gave it a try to condense that information in, effectively, 4 Springer-size pages [pdf] (plus 1 page intro and 1 page references) [1]. The paper contains a grouping of input parameters that determine effectiveness of ontology development and use, which are categorised along four dimensions: purpose, ontology reuse, ways of ontology learning, and the language and reasoning services.

The aim was to be brief, so while the list of parameters is long, the list of references is comparatively short—but the references are kept diverse and they do contain references to different paradigms around instead of just one. (A version with lots of references is in the making.)

The paper has several examples taken from the agriculture domain, building upon experiences gained in previous and current projects and related literature. It is noteworthy, however, that development of agri-ontologies is in its infancy. Then, for a relatively seasoned ontology engineer, most, if not all, parameters may be known to a greater or lesser extent already, but from the intended audience perspective, the paper was deemed to be a timely, much needed, and useful overview. My impression is that those reviewers’ comments say more about the knowledge transfer (well, the lack thereof) from one discipline to another than about the modellers and domain experts.

For those of you who are interested in agri-ontologies and would like to know more about the latest developments in that area, there is the (third) special track on agriculture, food and the environment during MTSR’09 in Milan 1-2 October.

[1] Keet, C.M. Ontology design parameters for aligning agri-informatics with the Semantic Web. 3rd International Conference on Metadata and Semantics (MTSR’09) — Special Track on Agriculture, Food & Environment, Oct 1-2 2009 Milan, Italy. Springer CCIS. to appear.

SA ICPC Regionals 2013 problem analysis

Our 2015 Southern Africa ICPC Regionals is coming up, and we have been using some of the 2013 SA problems for training purposes as well as a teaser/taste of what’s to come on the 24th of October (registration closes on Oct 10). While the training materials are on vula (the UCT CMS for courses), some hints to solve some of them may be of general interest. I’ll give a breakdown and a ‘spoiler alert’ for five of the eight problems. The problem-solving aspects and explanations in the training sessions were longer, but these short notes will already give you some useful starting points on where to look for implementation details.

The problems can be categorised into the following types:

  1. Isle of the birds – computational geometry
  2. Fitness training – simple ad hoc
  3. Similarity – string processing
  4. Railways – graphs
  5. Student IDs – string processing


Isle of the birds

There’s an island with trees, and the rubber band will enclose them all. That is, we need to find the polygon with corners of the outermost points enclosing the rest of the points. Thus, we need to compute a convex hull. How can that be done, and, more importantly, how can that be done efficiently? Computing the whole solution space is going to take too much time, as there can be between 3 and 15000 points. One technique is the sweep line (generally useful to check out), and one of those tailored to finding the convex hull is the Graham Scan algorithm: first, starting with the left-lowest point, scan the plane of points counter-clockwise to figure out where the points are, i.e., order them by angle (points on the same line are ignored), then, second, connect the points in a stepwise fashion from the bottom going counter-clockwise again: if the angle at the most recently connected point is >180, i.e., the new point makes a clockwise turn (compare values of the coordinates), then discard that most recently connected point and connect the one before it to the new point.
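To make the two passes concrete, here is a minimal Python sketch of a Graham scan. It is not the reference solution to the problem: the function names (`cross`, `graham_scan`) and the representation of trees as integer `(x, y)` tuples are assumptions for illustration. The turn test is done with a cross product, so no actual angles need to be computed.

```python
from functools import cmp_to_key

def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def graham_scan(points):
    """Return the corners of the convex hull in counter-clockwise order."""
    # Pivot: the lowest point, ties broken by the leftmost one.
    pts = sorted(set(points), key=lambda p: (p[1], p[0]))
    if len(pts) < 3:
        return pts
    pivot = pts[0]

    def by_angle(a, b):
        # First pass: order the other points counter-clockwise around the pivot.
        turn = cross(pivot, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1
        # Same direction from the pivot: nearer point first.
        da = (a[0] - pivot[0]) ** 2 + (a[1] - pivot[1]) ** 2
        db = (b[0] - pivot[0]) ** 2 + (b[1] - pivot[1]) ** 2
        return (da > db) - (da < db)

    hull = [pivot]
    for p in sorted(pts[1:], key=cmp_to_key(by_angle)):
        # Second pass: drop the last hull point while it causes a
        # clockwise turn or lies on a straight line with its neighbours.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull
```

The sort dominates the running time, so this is O(n log n), which is comfortable for 15000 points.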

Only 4 teams solved this problem at the 2013 regionals (including the winning team ‘if cats programmed computers’).


Fitness training

John cycles A km, Mary runs B km, starting and finishing at the same place using one circular route of M km. This can be computed with a straight-forward modulo operation. All 53 teams solved this problem at the 2013 regionals.
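The exact question of the problem statement is not reproduced here, so the following is only a sketch of the modulo building block, with a hypothetical helper name and the assumption that distances are whole kilometres: on a circular route of M km, the remainder tells you how far past the start someone is, and the quotient gives the number of full laps.

```python
def laps_and_offset(distance_km, route_km):
    """Full laps completed and the distance past the starting point (both in km)."""
    return divmod(distance_km, route_km)

def ends_at_start(distance_km, route_km):
    """True if covering distance_km on the circular route ends exactly at the start."""
    return distance_km % route_km == 0
```

For example, `laps_and_offset(17, 5)` gives 3 full laps with 2 km to spare, so that person does not finish at the starting point.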



Similarity

Spellchecking in the online search engine; well, given two words, what is the minimum cost of the change operation to go from word_A to word_B, given certain costs of additions, deletions, and character swaps? Comparing strings of characters has been around for quite a while, from spellchecking, to plagiarism checkers, to DNA sequence alignments, so surely a fine algorithm should be around for that already. Indeed: the minimum edit distance (Levenshtein distance) (nice explanation), where, instead of computing all possible options (very costly!), you fill in a table with the cheapest cost of turning each prefix of word_A into each prefix of word_B. The ‘tricky’ part is that the basic algorithm for the minimum edit distance counts each change as a cost of 1, whereas in this problem, some changes cost 2; hence, you will have to change those values in the standard algorithm (demo that lets you play with different costs).
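A minimal Python sketch of that table-filling, with the costs as parameters. The function name and the default cost values are assumptions for illustration (the actual costs come from the problem statement), and ‘character swap’ is read here as replacing one character by another.

```python
def weighted_edit_distance(word_a, word_b, ins_cost=1, del_cost=1, sub_cost=2):
    """Minimum total cost to turn word_a into word_b with the given operation costs."""
    rows, cols = len(word_a) + 1, len(word_b) + 1
    # dist[i][j] = cheapest way to turn the first i chars of word_a
    # into the first j chars of word_b.
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * del_cost          # delete everything
    for j in range(1, cols):
        dist[0][j] = j * ins_cost          # insert everything
    for i in range(1, rows):
        for j in range(1, cols):
            same = word_a[i - 1] == word_b[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                       # delete from word_a
                dist[i][j - 1] + ins_cost,                       # insert into word_a
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # keep or replace
            )
    return dist[-1][-1]
```

With all costs set to 1 this is the plain Levenshtein distance; the table has (len(word_A)+1) × (len(word_B)+1) cells, each filled in constant time.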

Only 2 teams solved this problem at the 2013 regionals (including the winning team ‘if cats programmed computers’).



Railways

Construct a railroad network between cities in the shape of a tree, but put in a bid for the second-cheapest option. So, we have lines and points, or: some graph algorithm. Two main groups are shortest path (Dijkstra, Bellman-Ford) and spanning tree. We need a minimum spanning tree (MST) to begin with. This reduces the option for the most suitable algorithm to Prim’s or Kruskal’s. Prim requires a particular starting vertex, Kruskal doesn’t. The problem statement doesn’t require a starting vertex, hence Kruskal’s algorithm is the one of choice (example). But then how do we get the second-best spanning tree? Also in this case, many have asked before (theoretically and practically; search online for both): take an edge with weight w that’s not in the MST and results in a cycle when added to the MST, and compare w with the weight v of the heaviest other edge in that cycle. Do this for every edge that is not in the MST, then, of those comparisons among the cycles, take the one with the lowest difference w - v, add the edge with weight w and remove the edge with weight v. There you have your second-best option.
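The recipe can be sketched in Python as follows. This is a simple, unoptimised version under stated assumptions: cities are numbered 0..n-1, edges are hypothetical `(weight, u, v)` tuples, and the function names are made up for illustration. It returns only the total cost of the second-best tree, and a tree that ties with the MST on cost counts as second-best.

```python
def kruskal(n, edges):
    """Return (total weight, tree edges, leftover edges) using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, tree, rest = 0, [], []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
            total += w
        else:
            rest.append((w, u, v))  # would close a cycle
    return total, tree, rest

def heaviest_on_path(adj, src, dst):
    """Heaviest edge weight on the unique tree path from src to dst."""
    stack = [(src, -1, float("-inf"))]
    while stack:
        node, prev, heaviest = stack.pop()
        if node == dst:
            return heaviest
        for nxt, w in adj[node]:
            if nxt != prev:
                stack.append((nxt, node, max(heaviest, w)))
    return None

def second_best_cost(n, edges):
    """Cost of the second-best spanning tree, or None if there is none."""
    total, tree, rest = kruskal(n, edges)
    if len(tree) != n - 1:
        return None  # graph is not connected
    adj = [[] for _ in range(n)]
    for w, u, v in tree:
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = None
    for w, u, v in rest:
        if u == v:
            continue  # a self-loop cannot replace a tree edge
        candidate = total + w - heaviest_on_path(adj, u, v)
        if best is None or candidate < best:
            best = candidate
    return best
```

Each leftover edge triggers one walk through the tree, so this is O(E·V) after the sort; whether that is fast enough depends on the input limits, which are not given here.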

Only 4 teams solved this problem at the 2013 regionals (including the winning team ‘if cats programmed computers’).


Student IDs

Generate student IDs from the students’ names, following a given pattern. Of itself, this is a somewhat laborious implementation. The only real issue is to keep track of what’s been processed of the string. Here, it is especially useful to first design the solution separately before delving into the murky code, as it otherwise will require a lot of test cases to check the corner cases (and remember you have only one machine). A nice way to design it is to use automata and only then to convert that into code.
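To illustrate the automaton-first approach, here is a small Python sketch for a made-up pattern, not the one from the problem statement: the hypothetical rule is that the input is ‘surname firstname’, and the ID consists of the first three letters of the surname followed by the first three letters of the first name, in upper case, padded with X. The states (SURNAME, GAP, FIRST, DONE) and the function name are likewise assumptions for illustration.

```python
def make_id(full_name):
    """Build an ID under the hypothetical rule described above."""
    state = "SURNAME"
    surname, first = [], []
    for ch in full_name.strip():
        if state == "SURNAME":
            if ch.isalpha():
                if len(surname) < 3:
                    surname.append(ch.upper())
            elif ch.isspace():
                state = "GAP"            # surname finished
            # anything else (hyphen, apostrophe) is skipped
        elif state == "GAP":
            if ch.isalpha():
                first.append(ch.upper())
                state = "FIRST"          # first name started
        elif state == "FIRST":
            if ch.isalpha():
                if len(first) < 3:
                    first.append(ch.upper())
            elif ch.isspace():
                state = "DONE"           # ignore any further names
    pad = lambda letters: "".join(letters).ljust(3, "X")
    return pad(surname) + pad(first)
```

Each branch corresponds to one state with its outgoing transitions, so the corner cases (extra spaces, punctuation, short names) can be checked on the drawn automaton before any code is typed.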

39 teams out of the 53 solved this problem at the 2013 regionals.



Just in case you’re trying out the remaining problems, and are banging your head against the wall or pulling your hair out: no team solved the Street lights (Problem B; looks like a maths problem, with floating point complication) and the Necklace (Problem G), and only 3 teams solved Matchstick maths (Problem D; ask a team member of ‘if cats programmed computers’, who solved it).

Reblogging 2008: Failing to recognize your own incompetence

From the “10 years of keetblog – reblogging: 2008”: On those uncomfortable truths on the difference between knowing what you don’t know and not knowing what you don’t know… (and one of the earlier Ig Nobel prize winners 15 years ago)

Failing to recognize your own incompetence; Aug 25, 2008


Somehow, each time when I mention to people the intriguing 2000 Ig Nobel prize winning paper “Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments” [1], they tend to send (non)verbal signals demonstrating a certain discomfort. Then I tone it down a bit, saying that one could argue about the set up of the experiment that led Kruger & Dunning to their conclusion. Now—well, based on material from a few years ago but I found out recently—I cannot honestly say that anymore either. A paper from the same authors, “Why people fail to recognize their own incompetence” [2], reports not only more of their experiments in different settings, but also different experiments by other researchers validating the basic tenet that ignorant and incompetent people do not realize they are incompetent but rather think more favourably of themselves—“tend to hold overinflated views of their skills”—than can be justified based on their performance.

Yeah, the shoe might fit. Or not. In addition to the lower end of the scale overestimating their competencies by a large margin, the converse happens, though to a lesser extent, at the other end of the scale, where top-experts underestimate their actual capabilities. The latter brings its own set of problems and research directions, which I will set aside for the remainder of this blog post. Instead, I will dwell a bit on those people bragging to know this that and the other, but, alas, do not perform properly and, moreover, do not even realize they do not. Facing a person who knows s/he does not have the required skills is one thing and generally s/he’s willing to listen and learn or say to not care about it, but those people who do not realize the knowledge/skills gap they have are, well, a hopeless bunch futile to waste your time on (unless you teach them anyway).

 Let us have a look what those psychologists provided to come to this conclusion. Aside from the experiment about jokes in the ’99 paper, which are at least (sub)culture-dependent, the data about the introductory-level psychology class taken by 141 students is quite telling. Right after the psych exam, the students were asked about their own estimate of performance & mastery of the course material (relative to other students in their class) and to estimate their raw score of the exam. These were the results ([2] p84, Fig.1):

If you think this kind of data is only observed with undergraduates in psychology, well, then check [2]’s references: debate teams, hunters about their firearms, medical residents (over)estimating their patient-interviewing techniques, medical lab technicians overestimating their knowledge of medical terminology; you name it, the same pattern, even if the subjects were offered the carrot of a monetary incentive in an attempt to have them assess themselves honestly.

Imagine going to a GP or doctor of a regional hospital who has the arrogance to know it all and does not call in a specialist in time. One can debate about the harmfulness or harmlessness of such cases. A very recent incident I observed was where x1 and x2 demanded from y to do nonsensical task z. Task z, exemplifying ignorance and incompetence of x1 and x2, was not carried out by y for it could not be done, but it was nevertheless used by x1 and x2 to “demonstrate” “(inherent) incompetence” of y because y did not do task z, whereas, in fact, the only thing it shows is that y, unlike x1 and x2, may actually have realized z could not be done, hence, understands z better than x1 and x2 do. One’s incompetence [in this case, of x1 and x2] can have far-reaching effects on others around oneself. Trying to get x1 and x2 to realize their shortcomings has not worked thus far. Dunning et al’s students, however, had exam results for unequivocal feedback and there was an additional test set up with a controlled setting where they had built in a lecture to teach the incompetent so as to rate their competencies better (which worked to some extent), but in real life those options are not always available. What options are available, if any? A prevalent approach I observed here in Italy (well, in academia at least) is that Italians tend to ignore those xs so as to limit as much as possible the ‘air time’ and attention they have, i.e., an avoidance strategy to leave the incompetent be, whereas, e.g., in the Netherlands people will tend to keep on talking until they have blisters on their tongues (figuratively) to try to get some sense in the xs’ heads, and yet others attempt to sweep things under the carpet and pray there will not appear any wobbles one could fall over.
Research directions, let alone some practical suggestions on “how to let people become aware of their intellectual and social deficiencies”—other than ‘teach them’—were not mentioned in the article, but made it to the list of future works.

 You might wonder: does this hold across cultures? The why of the ‘ignorant and unaware of it’ gives some clues that, in theory, culture may not have anything to do with it.

“In many intellectual and social domains, the skills needed to produce correct responses are virtually identical to those needed to evaluate the accuracy of one’s responses… Thus, if people lack the skills to produce correct answers, they are also cursed with an inability to know when their, or anyone else’s, are right or wrong. They cannot recognize their responses as mistaken, or other people’s responses as superior to their own.” ([2], p. 86—emphasis added)

The principal problem has to do with so-called meta-cognition, which “refers to the ability to evaluate responses as correct or incorrect”, and incompetence then entails that one cannot successfully complete such a task; this is a catch-22, but, as mentioned, ‘outside intervention’ through teaching appeared to work and other means are a topic of further investigation. Clearly, a culture of arrogance can make significant stats more significant, but it does not change the principle of the cause. In this respect, the start of the article aptly quotes Confucius: “Real knowledge is to know the extent of one’s ignorance”. Conversely, according to Whitehead (quoted on p. 86 of [2]), “it is not ignorance, but ignorance of ignorance, that is the death of knowledge”.


[1] Kruger, J., Dunning, D. Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 1999, 77: 1121-1134.

[2] Dunning, D., Johnson, K., Ehrlinger, J., Kruger, J. Why people fail to recognize their own incompetence. Current Directions in Psychological Science, 2003, 12(3): 83-87.

 p.s.: I am aware of the fact that I do not know much about psychology, so my rendering, interpretation, and usage of the content of those papers may well be inaccurate, although I fancy the thought that I understood them.