On ‘open access’ CS conference proceedings

It perhaps sounds nice and doing-good-like, for the doe-eyed ones at least: publish computer science conference proceedings as open access so that anyone in the world can access the scientific advances for free. Yay. Free access to scientific materials is good for a multitude of reasons. There’s a downside, though, to the way some try to push this now, which amounts to making people pay for what used to be, and still mostly is, free already. I take issue with that. Instead of individualising a downside of open access by heaping more costs onto individual researchers, the free flow of knowledge should be, and remain, a collectivised effort.

 

It is, and used to be, the case that most authors put the camera-ready copy (CRC) on their respective homepages and/or in institutional repositories, typically even before the conference (e.g., mine are here). Putting the CRC on one’s website or in an openly accessible institutional repository seems to happen slightly less often now, even though it is legal to do so. I don’t know why. Even if it were not entirely legal, collective disobedience is not something that the publishers can easily fight. It doesn’t help that Google indexes the publisher’s site quicker than the academics’ webpages, so the CRCs on the authors’ pages don’t turn up immediately in the search results even when the CRCs are online, but that would be a pathetic reason for not uploading the CRC. It’s a little extra effort to look up an author’s website, but acceptable as long as the file is still online and freely available.

Besides the established hallelujahs to principles of knowledge sharing, there has recently been a drive at various computer science (CS) conferences to make sure the proceedings will be open access (OA). Like for OA journal papers in an OA or hybrid journal, someone’s going to have to pay the ‘article processing charges’. The instances that I’ve seen close-up put those costs for all papers of the proceedings in the conference budget and thereby increase the conference registration costs. Depending on 1) how good or bad the deal is that the organisers made, 2) how many people are expected to attend, and 3) how many papers will go in the volume, it hikes up the registration costs by some 50 euro. This is new money for the publishing house that it did not make before, and I’m pretty sure they wouldn’t offer an OA option if it were to result in them making less profit from the obscenely lucrative science publishing business.
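As a back-of-envelope illustration of how those three factors combine, here is a minimal sketch. It assumes the publisher charges a flat OA fee per paper and that the organisers spread the total evenly over all paying attendees; the function name and the figures (100 euro per paper, 60 papers, 120 attendees) are hypothetical, picked only so that the result lands at the roughly 50 euro mentioned above. Actual deals will differ.

```python
def oa_surcharge_per_attendee(fee_per_paper: float,
                              n_papers: int,
                              n_attendees: int) -> float:
    """Extra registration cost per attendee when the total OA bill
    is spread evenly over everyone who registers (an assumption)."""
    if n_attendees <= 0:
        raise ValueError("need at least one paying attendee")
    total_oa_bill = fee_per_paper * n_papers
    return total_oa_bill / n_attendees


# Hypothetical figures: the deal (fee per paper), the size of the
# volume, and the expected attendance.
surcharge = oa_surcharge_per_attendee(fee_per_paper=100.0,
                                      n_papers=60,
                                      n_attendees=120)
print(f"OA surcharge per registration: {surcharge:.2f} euro")
```

A worse deal or a larger volume pushes the surcharge up, and a lower turnout than expected does so too, which is why the amount differs per event.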

So, who pays? Universities differ in their funding schemes, and funders differ in what they fund. For instance, there exist funds for contributing to OA journal article publishing (also at UCT, and Springer even has a list of OA funders in several countries), but those cannot be used in this case, for the OA costs are hidden in the conference registration fee. There are also conference travel funds, but they fund only part of the costs or cap them at a maximum, and the more the whole thing costs, the greater the shortfall that one then has to pay out of one’s own research fund or one’s own pocket.

A colleague (at another university) who’s pushing for OA for CS conference proceedings said that his institution is paying for all the OA anyway, not him. He can easily have principles, as it doesn’t cost him anything. Some academics have their universities pay for access to the conference proceedings already, as part of the subscription package; it’s typically the higher-ranking technical universities that have access. Those I spoke to didn’t like the idea that they’d now have to pay for access in this way: they already had ‘free’ (to them) access, whereas the registration fees come from their own research funds. For me, it is my own research funds as well, i.e., those funds that I have to scramble together through project proposal applications with their low acceptance rates. If I were to go to, or have papers at, say, 5 such conferences per year (in the past several years, it was more like double that), that’s the same amount as paying a student/scientific programmer for almost a week and about a monthly salary for the lowest-paid in South Africa, or travel costs or accommodation for the national CS&IT conference (or both) or its registration fees. That is, with increased registration fees to cover the additional OA costs, at least one of my students or I would lose out on participating in even a local conference, or students would be less exposed to doing research and obtaining programming experience that helps them to get a better job or a better chance at obtaining a scholarship for postgraduate studies. To name but a few trade-offs.

Effectively, the system has moved from “free access to the scientific literature anyway” (the online CRCs), to “free access plus losing money (i.e., all that I could have done with it) in the process”. That’s not an improvement on the ground.

Further, my hard-earned research funds are mine, and I’d like to decide what to do with them, rather than having that decision taken for me. Who do the rich boys up North think they are to say that I should spend them on OA when the papers were already free, rather than giving a student an opportunity to go to a national conference, or devise and implement an algorithm, or participate in an experiment, etc.! (Setting aside them trying to reprimand and ‘educate’ me on the goodness of it. Tsk! As if I don’t know that the free flow of scientific information is a good thing.)

Tell me, why should the OA principles trump capacity building when the papers are free to access already anyway? I’ve not seen OA advocates actually weighing up any alternatives on what would be the better good to spend money on. As to possible answers, note that an “it ought to be the case that there would be enough money for both” is not a valid answer in discussing trade-offs, nor is a “we might add a bit of patching up as conference registration reduction for those needy that are not in the rich inner core”, for it hardly ever happens, nor is an “it’s not much for each instance, you really should be able to cover it”, because many instances do add up. We all know that funding for universities and for research in general is being squeezed left, right, and centre in most countries, especially over the past 8-10 years, and such choices will have to be made, and are being made already. These are not just choices we face in Africa; this also holds in richer countries, like in the EU (fewer resources in relative or absolute terms and greater divides), although 250 euro (the 5 conferences scenario) won’t go as far there as in low-income countries.

Also, and regardless of the funding squeeze: why should we start paying for free access that already was de facto, and with most CS proceedings publishers also de jure, free access anyway? I’m seriously starting to wonder who’s getting kickbacks for promoting and pushing this sort of scheme. It’s certainly not me, nor would I take it if some publisher offered it to me, as it contributes to the flow of even more money from universities and research institutes to the profits of multinationals. If it’s not kickbacks, then to all those new ‘conference proceedings need to be OA’ advocates: why do you advocate paying for a right that we had for free? Why isn’t it enough for you to just pay for a principle yourself as you so desire, rather than insisting on forcing others to do so too, even when there is already a tacit and functioning agreement that realises that aim of free flow of knowledge?

Sure, the publisher has a responsibility to keep the papers available in perpetuity, which I don’t, and link rot does exist. One could easily write a script to search all academics’ websites and get the files, like CiteSeer used to do well. Such projects get funding for long-term archiving, as do arxiv.org, PhilPapers, and SSRN, to name popular ones (see also a comprehensive list of preprint servers), and most institutions’ repositories, too (e.g., the CS@UCT pubs repository). So, the perpetuity argument can also be taken care of that way, without the researchers actually having to pay more.
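To give an idea of how little is needed for the core of such a script, here is a minimal sketch of the link-harvesting step only. It assumes the HTML of an author’s page has already been fetched by other means; the class and function names, the sample page, and the example.org URLs are all made up for illustration, and this is not how CiteSeer itself works. It uses only Python’s standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkCollector(HTMLParser):
    """Collect absolute URLs of <a href=...> links that end in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page's own URL.
        absolute = urljoin(self.base_url, href)
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_urls.append(absolute)


def find_pdf_links(html, base_url):
    """Return all PDF links found in the given HTML, in page order."""
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.pdf_urls


# Hypothetical publications page of some author.
sample_page = """
<ul>
  <li><a href="papers/fois18.pdf">FOIS paper</a></li>
  <li><a href="/teaching/">Teaching</a></li>
  <li><a href="https://example.org/pubs/inlg16.PDF">INLG paper</a></li>
</ul>
"""
links = find_pdf_links(sample_page, "https://example.org/~author/")
for link in links:
    print(link)
```

A real harvester would still need polite crawling, deduplication, and a check that the PDF is indeed the paper, but none of that requires money from the individual researcher.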

Really, if you’re swimming in so much research money that you want to pay for a principle that was realised without costs to researchers, then perhaps fund the event instead, so that, say, some student grants can be given out, or so that it can contribute to some nice networking activity, or whatever part of the costs. The new “we should pay for OA, notwithstanding that no one was suffering when it was for free” attitude for CS conference proceedings is way too fishy to be honest; if you’re honest and not getting kickbacks, then it’s a very dumb thing to advocate for.

For the two events I’m involved in where this scheme is happening, I admit I didn’t forcefully object at the time it was mentioned (nor had I really thought through the consequences). I should have, though. I will do so next time.

An Ontology Engineering textbook

My first textbook “An Introduction to Ontology Engineering” (pdf) has just been released as an open textbook. I have revised, updated, and extended my earlier lecture notes on ontology engineering, amounting to about 1/3 more new content cf. its predecessor. Its main aim is to provide an introductory overview of ontology engineering and its secondary aim is to provide hands-on experience in ontology development that illustrates the theory.

The contents and narrative are aimed at advanced undergraduate and postgraduate level in computing (e.g., as a semester-long course), and the book is structured accordingly. After an introductory chapter, there are three blocks:

  • Logic foundations for ontologies: languages (FOL, DLs, OWL species) and automated reasoning (principles and the basics of tableau);
  • Developing good ontologies with methods and methodologies, the top-down approach with foundational ontologies, and the bottom-up approach to extract as much useful content as possible from legacy material;
  • Advanced topics, with a selection of sub-topics: Ontology-Based Data Access, interactions between ontologies and natural languages, and advanced modelling with additional language features (fuzzy and temporal).

Each chapter has several review questions and exercises to explore one or more aspects of the theory, as well as descriptions of two assignments that require using several sub-topics at once. More information is available on the textbook’s page [also here] (including the links to the ontologies used in the exercises), or you can click here for the pdf (7MB).

Feedback is welcome, of course. Also, if you happen to use it in whole or in part for your course, I’d be grateful if you would let me know. Finally, if this textbook is used half (or even a quarter) as much as the 2009/2010 blogposts have been visited (around 10K unique visitors since posting them), that would mean there are a lot of people learning about ontology engineering, and then I’ll have achieved more than I hoped for.

UPDATE: meanwhile, it has been added to several open (text)book repositories, such as OpenUCT and the Open Textbook Archive, and it has been featured on unglue.it in the week of 13-8 (out of its 14K free ebooks).

Ontology, part-whole relations, isiZulu and culture

The title is a mouthful, but it can go together. What’s interesting is that the ‘common’ list of part-whole relations is not exactly like that in isiZulu and Zulu culture.

Part-whole relations have been proposed over the past 30 years, such as to relate a human heart to the human it is part of, to state that Gauteng is located in South Africa (geographically a part of), and that the slice of the cake is a portion of the cake, and they seemed well-established by now. The figure below provides an informal view of them.

Informal taxonomy of common part-whole relations (source: [2])

My co-author, Langa Khumalo, and I already had an inkling that this hierarchy probably would not work for isiZulu, based, first, on a linguistic analysis to generate natural language [1], and, second, on the fact that the Shuter & Shooter English-isiZulu dictionary already lists 18 translations for just ‘part’ alone. Yet, if those ‘common’ part-whole relations are universal, the differences observed ought to be just an artefact of language, not ontological differences. To clear up the matter, we guided ourselves with the following questions:

  1. Which part-whole relations have been named in isiZulu, and to what extent are they not only lexically but also semantically distinct?
  2. Can all those part-whole relations be mapped with equivalence relations to the common part-whole relations?
  3. For those that cannot be mapped with equivalence relations: is the difference in meaning ontologically possibly interesting for ontology engineering?
  4. Is there something different as gleaned from isiZulu part-whole relations that is useful in improving the theoretical appreciation of part-whole relations?

To figure this out, we first took a bottom-up approach with evidence gathering, and then augmented it with further ontological analysis. Plodding through the isiZulu-English dictionaries got us 81 terms that had something to do with parts. 41 were discarded because they were not applicable upon closer inspection (e.g., referring to creating parts cf. relating parts, idioms). Further annotations and examples were added, which reduced it to 28 (+ 3 that we had missed and then added). Of those 28, we selected 13 for ontological analysis and formalisation. That selection was based on importance (like ingxenye) and on including some that seemed a bit overly specific, like iqatha for portions of meat, and meat only. The hierarchy of the final selection is shown in the figure below, with an informal indication of what the relation relates.

Selected isiZulu terms with informal descriptions. (Source: [2])

They held up ontologically, i.e., some are the same as the ‘common’ ones, yet some others are really different, like hlanganyela for a collective (cf. individual object) being part of (participating in) an event. Admittedly, some of the domains/ranges aren’t very clearly delineated. For instance, isiqephu relates solid and ‘solid-like’ portions, as in, e.g., Zonke izicezu zesinkwa ziyisiqephu sesinkwa esisodwa ‘all slices of bread are a portion of some loaf of bread’. Where exactly that border of ‘solid-like’ is, and when something really counts as a liquid (and thus isiqephu applies no more), is not yet clear; that’s a separate question orthogonal to the relation. Nonetheless, the investigation did clear up several things, especially the more precise umunxa that took me a while to unravel, which turned out to be a chain of parthood relations; e.g., the area where the fireplace is in the hut is a portion of the hut (sample use with the verbaliser: Onke amaziko angumunxa wexhiba). We didn’t touch upon really thorny issues that probably will deserve a paper of their own. For instance, the temporalised parthood isihlephu is used to relate a meaningful scattered part with identity to the whole it was part of, such as the broken-off ear of a cup that was part of the cup (but it cannot be used for the chip of the cup, as a chip isn’t identifiable in the same way as the ear is).

We did try to test the terms against the isiZulu National Corpus to see how the terms are used, but with the limited functionalities and tooling, not as much came out of it as we had hoped for. In any case, the detailed assessment of a section of the corpus did show the relevant uses were not contradicting the formalisation.

Further details can be found in our paper “On the ontology of part-whole relations in Zulu language and culture”, which will be presented at the 10th International Conference on Formal Ontology in Information Systems 2018 (FOIS’18), to be held from 17 to 21 September in Cape Town, South Africa.

As far as I know, this is the first such investigation. Having checked out other languages a bit (mainly Spanish and German), and some related works on Turkish and Chinese, it may well be the case that there, too, the ‘common’ part-whole relations are not exactly the same. We carried out the whole process systematically, and it is described as such in the paper, so that anyone who’d like to do something like this for another language region and culture could follow the same procedure.

 

References

[1] Keet, C.M., Khumalo, L. On the verbalization patterns of part-whole relations in isiZulu. 9th International Natural Language Generation conference (INLG’16), September 5-8, 2016, Edinburgh, UK. ACL, 174-183.

[2] Keet, C.M., Khumalo, L. On the ontology of part-whole relations in Zulu language and culture. 10th International Conference on Formal Ontology in Information Systems 2018 (FOIS’18). IOS Press. 17-21 September, 2018, Cape Town, South Africa. (in print)