Facts and opinions about Theory of Computation in Computer Science Curricula

Computing curricula are regularly reassessed and updated to reflect changes in the discipline, and, at the time of writing, the ACM/IEEE curriculum ‘CS2013’ (also called the Strawman Draft [1], about which I blogged earlier) is under review. South Africa does not have its own computing organisation for official accreditation, nor one that provides official guidelines on quality and curriculum planning, and therefore such international curricula guidelines provide the main guiding principles for curriculum development here. As is known, there’s theory and there’s practice, so a fact-based assessment about what the guidelines say and what is happening in the trenches may be in order. Here, I will focus on assessment of the curriculum with respect to one course in particular: theory of computation (ToC), consisting of, roughly, formal languages, automata, Turing machines, computability, and complexity, which introduces the mathematical and computational principles that are the foundations of CS, such as the foundations of programming languages and algorithms, and the limits of computation. One may wonder (well, at least I did): How are ToC topics implemented in CS curricula around the world at present? Are there any country or regional differences? What is the sentiment around teaching ToC in academia that may influence the former?

To answer these questions, I examined how it is implemented in computer science curricula around the world, and the sentiment around teaching it by means of two surveys. The first survey was an examination of curricula and syllabi of computer science degrees around the world on whether, and if so, how, they include ToC. The second one was an online opinion survey (about which I reported before). The paper describing the details [2] has been accepted at the 21st Annual Meeting of the Southern African Association for Research in Mathematics, Science, and Technology Education (SAARMSTE’13), and here I will cherry-pick a few salient outcomes. (or go straight to the answers of the three questions)

Syllabi survey

The 17 traditional and comprehensive universities from South Africa were consulted online on their offerings of CS programmes, as were 15 universities from continental Europe, 15 from the Anglo-Saxon countries, 6 from Asia, 7 from Africa other than South Africa, and 8 from Latin America. It appeared that 27% of the South African universities (University of KwaZulu-Natal, University of South Africa, University of Western Cape, and the University of Witwatersrand) have ToC in the CS curriculum, compared to 84% elsewhere in the world, which is depicted in Figure 1. The regional difference between Europe (93% inclusion of ToC) and the Anglo-Saxon countries (80%) may be an artefact of the sample size, but it deserves further analysis.

Figure 1. Comparison of ToC in the CS curriculum in South Africa and in other countries around the world (FLAT = formal languages and automata theory, i.e., not including topics such as Turing machines, computability, and complexity).

The level of detail of the ToC course contents as described in the course syllabi online varied (but see below for better data from the opinion survey), and only 43 universities (of the 59 with data) had sufficient information online regarding timing of ToC in their degree programme. Five universities had it explicitly in the MSc degree programme, with the rest mainly in year 2, 3, or 4, and 5 universities have it spread over 2 or more courses, whereas the rest offer it in a single course (sometimes with further advanced courses covering topics such as probabilistic automata and other complexity classes).

Opinion survey

The opinion survey was quite successful, with a response of n=77. 58 respondents had filled in their affiliation, and they were employed mainly at universities and research institutes: 12 respondents gave a South African academic affiliation and thus the majority of respondents were from around the world, including, among others, the USA, UK, Canada, Germany, Italy, Switzerland, Indonesia, China, Cuba, Brazil, and Argentina. A more detailed characterisation of the respondents, as well as the complete (anonymised) raw question answers with percentages, sorted by question (exported from LimeSurvey), are online at http://www.meteck.org/files/tocsurvey/.

There was a near-unanimous agreement (76 of the 77) that ToC should be in the programme, 74 (96% of the answers) have it currently in the programme, and 82% had it in their degree programme when they were a student. Overall, the timing when it is taught in the programme has varied little over the years (see Figure 2). Further, for 90% of the responses, ToC is core in the curriculum, and secure in the programme for 86% (only a few reasons were provided for “under threat/other”: that it has been removed from some specialisations but not all (in part due to the computer science vs. information systems tensions), or threatened due to low enrolment numbers).

Figure 2. Comparison between when the survey respondents did ToC in their degree and in which year in the programme it is taught at their university (where applicable).

Regarding the topics that should be part of a ToC course, the following is observed. The survey listed 46 topics and for each one, a possible answer [essential/extended/peripheral/no answer] could be given. The complete list of ToC topics ordered on percent ‘essential’ is shown in Table 1. In short: it is perceived decidedly that formal languages, automata, Turing machines, complexity, computability and decidability themes form part of one coherent offering, although the detail of the sub-topics covered may vary.

Table 1. Ordering of the 46 ToC topics, by calculating the percentage of responses that marked it as ‘essential’ out of the given answers.
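The ordering used for Table 1 can be illustrated with a minimal sketch. The topic names and answers below are made up for illustration and are not the survey data; the function name `percent_essential` is likewise an assumption. The sketch only shows the calculation described in the caption: the share of 'essential' among the answers actually given, with 'no answer' left out of the denominator.

```python
from collections import Counter


def percent_essential(answers):
    """Percentage of 'essential' among the given answers.

    'no answer' entries are excluded from the denominator.
    """
    given = [a for a in answers if a != "no answer"]
    if not given:
        return 0.0
    return 100.0 * Counter(given)["essential"] / len(given)


# Hypothetical answers for three hypothetical topics (NOT the survey data).
responses = {
    "Topic A": ["essential", "essential", "extended", "no answer"],
    "Topic B": ["essential", "peripheral", "peripheral", "extended"],
    "Topic C": ["essential", "essential", "essential", "essential"],
}

# Order topics from highest to lowest percentage 'essential'.
ranking = sorted(
    responses,
    key=lambda topic: percent_essential(responses[topic]),
    reverse=True,
)

print(ranking)  # ['Topic C', 'Topic A', 'Topic B']
```

Leaving 'no answer' out of the denominator means a topic that many respondents skipped is not penalised relative to one that everyone rated.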

Given the plentiful anecdotes, hearsay, and assertions in other articles about difficulties with ToC teaching and learning, the survey also included some questions about that. The data provided by the respondents do substantiate the existence of issues to some extent: 44% of the responses answered that there are no issues and everything runs smoothly, so a slight majority (56%) does report issues, which can be subdivided into 32% ‘causes problems in the academic system each year’ and 24% where ‘management/student affairs has gotten used to the fact there are problems’. Several respondents provided additional information regarding the issues, mentioning low pass rates (n=3), that students struggle because they do not see the usefulness of ToC for their career (n=4), that it also depends on the quality of the teacher (n=2), and low enrolment numbers (n=2). For 45%, the first-time pass rates remain below 60% and with 80% of the respondents, the pass rate remains below 80%. The correlation between pass rate and issues is 0.79 (n is too small to draw any conclusions for the other combinations of pass rates, class sizes, extrapolated course content, and having issues).

Discussion

There is much one can discuss with respect to the data (and more is included in the paper than I cover here in this blog post). Considering the curriculum analysis first, it can be summarized that ToC in CS is solidly in the programme, is oftentimes taught in a single course, and mostly in 2nd and 3rd year of the undergrad CS programme. Interestingly, there is a discrepancy between the ‘essential’ content according to the survey and the newly proposed ACM curriculum guidelines; compare Table 1 with Figure 3.

Figure 3. Proposed CS2013’s ToC topics in the Strawman draft (layout edited). 100% of tier-1 and >80% of tier-2 is core and has to be covered, and an undefined amount of the elective topics to facilitate track-development (no tracks have been defined in the CS2013 draft yet).

Considering the Strawman’s “core” topics, one may question the feasibility of imparting a real understanding of complexity classes P and NP without also touching upon computability and Turing machines. Furthermore, the hours indicated in Figure 3 are meant as minimum hours of face-to-face lectures (i.e., 8 lessons at a South African university, or at least almost 3 weeks of a standard 16 credit course), which, if this minimum is adhered to, amounts to a very superficial treatment of partial ToC topics. As an aside: my ToC students at UKZN now go through all of it (in accordance with the topics listed in the handbook). Comparing the ‘essentials’ list with the older curriculum guidelines [3, 4], however, one observes that they are much more in agreement.

Quite a bit has changed in the computing arena since the late 1980s, most notably the specialization and diversification of the field. ToC matters more for CS than for other recently recognised specialisations within computing (e.g., software engineering, net-centric computing, information systems, and computational biology), and this diversification is, or has to be, recognised by the curriculum developers [5, 6], which should result in putting more or less weight on the core topics (see [7] for a detailed analysis on sub-disciplines within computing and a proposed weighting of curriculum themes). But recall that the Strawman draft is about (different tracks within) CS only. The diversification and its effect on computing curricula is clearly noticeable only when one compares it with the Software Engineering curriculum guidelines [8]: these guidelines include only a little bit on finite state machines, grammars, and complexity and computability in the “Data structures and algorithms” and “Discrete structures II” themes. It may be the case that, in practice, those degree programmes called “computer science” indeed do contain the more fundamental topics, such as ToC (and logic, formal methods etc.), and that other ‘tracks’ actually have been given different names already and, hence, would have been filtered out unintentionally a priori in the data collection stage of the curriculum survey.

Concerning issues with teaching ToC, on an absolute scale, that 56% face issues with their ToC courses is substantial, and, conversely, it deserves a comparative analysis to uncover what it is that the other half does so as to not have such issues. Based on the comments in the survey and outside (follow-up emails with survey respondents), there are a few directions: it may help to demonstrate better the applicability of ToC topics in the students’ prospective career, to have experienced, good teachers, and to have appropriate preparation in prior courses to increase the pass rates. Further, having issues might be related to the quantity and depth of material covered in a ToC course with respect to nominal course load. The data also hint at another possible explanation: even with an 80-100% pass rate and no low enrolment, the ‘gotten used to the issues’ option was selected occasionally, and, vice versa, with a 41-60% pass rate, that everything runs smoothly, thereby indicating that having issues might also be relative to a particular university culture and expectations of students, academics, and management.

Answers to the questions

Looking again at the questions raised at the start, here are the (short) answers to them:

  1. How are ToC topics implemented in CS curricula around the world at present? ToC topics in the actual international curricula are more in line with the older curriculum guidelines of [3, 4] than the more recent versions that put less weight on ToC topics. The timing in the curriculum regarding when to teach ToC remains largely stable and for a majority is scheduled in the 2nd or 3rd year.
  2. Are there any country or regional differences? There are country/regional differences, the largest one being that ToC is taught at only 27% of the South African traditional and comprehensive universities versus at 84% of the consulted curricula elsewhere in the world. Even including those SA universities with partial ToC coverage does not make up for the differences with elsewhere in the world or any of the proposed CS curriculum guidelines. Other geographic or language-based differences are not deducible from the data, or: based on the data, region does not matter substantially regarding inclusion of ToC in the CS curriculum, except that the slight difference between Europe and the Anglo-Saxon countries deserves further attention.
  3. What is the sentiment around teaching ToC in academia that may influence the former? Opinion on ToC is overwhelmingly in favour of having it in the curriculum, and primarily in the 2nd or 3rd year. Also, a large list of topics is considered to be ‘essential’ to the course, and this list is substantially larger than the recent international curricula Strawman drafts’ core for ToC topics (and more like the Strawman drafts’ total ToC list). Despite noted issues with the course, the voices from the field clearly indicate that ToC is here to stay.

In closing (for now): ToC is solidly in the CS degree programme, and perhaps ought to be introduced more widely in South Africa. And just in case you think something along the lines of “well, we have pressing issues to solve in South Africa and no time for follies like doodling DFAs and tinkering with Turing machines”: CS and development of novel and good quality software requires an understanding of ToC topics. This holds, for instance, for developing a correct isiZulu grammar checker for text processing software or a parser for natural language processing, scalable image pattern recognition algorithms to monitor wildlife tracks with pictures taken in situ in, say, the Kruger park, an ontology-driven user interface for the Department of Science & Technology’s National Recordal System for indigenous knowledge management, and proper data integration to harmonize and streamline service delivery management, to name but a few application scenarios. Foreigners will not do all this for you (and they have their own problems they want to solve), or only for large consulting fees that otherwise could have been used to, among others, install potable water for the 1.3 million South Africans that don’t have it now, provide them closed toilets, ARV etc.

References

[1] ACM/IEEE Joint Task Force on Computing Curricula. (2012). Computer Science Curricula 2013 Strawman Draft (Feb. 2012). ACM/IEEE.

[2] Keet, C.M. An Assessment of Theory of Computation in Computer Science Curricula. 21st Annual Meeting of the Southern African Association for Research in Mathematics, Science, and Technology Education (SAARMSTE’13). Bellville, South Africa, January 14-17, 2013.

[3] Denning, P.J., Comer, D.E., Gries, D., Mulder, M.C., Tucker, A., Turner, A.J. & Young, P.R. (1989) Computing as a discipline. Communications of the ACM, 32(1), 9-23.

[4] UNESCO-IFIP. (1994). A modular curriculum in computer science. UNESCO and IFIP report ED/94/WS/13. 112p.

[5] Sahimi, M., Roach, S., Cuadros-Vargas, E. & Reed, D. (2012). Computer Science curriculum 2013: reviewing the Strawman report from the ACM/IEEE Task Team. In: Proceedings of the 43rd ACM technical Symposium on Computer Science Education (pp. 3-4). Raleigh, North Carolina, USA, February 29 – March 3, 2012. New York: ACM Conference Proceedings.

[6] Rosenbloom, P.S. (2004). A new framework for computer science and engineering. IEEE Computer, 37(11), 23-28.

[7] ACM/IEEE Joint Task Force on Computing Curricula. (2005). The overview report. ACM, AIS, IEEE-CS, September 30, 2005.

[8] ACM/IEEE Joint Task Force on Computing Curricula. (2004). Software Engineering 2004. ACM, IEEE-CS, August 23, 2004.

A successful EKAW’12 conference

Having returned four days ago from the 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12)—held in a sunny (!) and beautiful Galway from 8-12 October—I have not yet managed to read all the papers I checked off to read, but I don’t want to postpone the usual conference blogpost too much. So here it goes.

The main reasons why ‘successful’ is in the title of this post are that there were several interesting papers, I was (co-)author of two full papers (acceptance rate 15%) of which one won the best paper award, I received useful feedback on the contents of the papers, it was productive regarding meeting up, conversing about our research, and networking, and it was held in Galway. The remainder of this post briefly outlines some of that; there are Springer LNAI conference proceedings and most presentations have been uploaded on YouTube now.

There were three keynotes. Martin Hepp talked about the difference between ontologies and (more lightweight) web ontologies. Michael Uschold reflected on building the Enterprise Ontology and the lessons learned. Lee Harland provided a lot of information about “practical semantics” for the pharmaceutical industry to improve on the drug discovery process with, a.o., flexible data integration, the new W3C draft of the provenance data model, and quantitative data ontology in the Open PHACTS project.

There were several sessions spread over three whole days, grouped by the following topics: knowledge extraction and enrichment, natural language processing, linked data, ontology engineering and evaluation, social and cognitive aspects of knowledge representation, applications of knowledge engineering, and in-use papers.

Unsurprisingly, I’ll zoom in a bit on the ontology engineering contributions. There were several papers on improving the quality of an ontology. María Poveda-Villalón presented the OntOlogy Pitfall Scanner OOPS! tool that implements the current catalogue of 29 pitfalls [1], where pitfalls may be logical consistency issues or due to modeling or due to human understanding. Given an ontology, OOPS! evaluates it on those pitfalls and reports possible instances, which then can be corrected; e.g., a user defined a property to be the inverse of itself, swapped intersection and union in an expression, or omitted disjointness axioms. Concerning the latter, Sebastien Ferré’s Advocatus Diaboli (or: “pew! pew!”) may come in handy as well [2]: it lets one explore the ontology, find “absurd” conjuncts, and add an axiom to exclude them. Or: the aim of the Possible World Explorer is to reduce the number of possible worlds admitted by the ontology and therewith approximate the intended models better. My own contribution on Detecting and Revising Flaws in OWL Object Property Expressions [3], which won the best paper award, considers flaws in object property expressions, characterises good and safe role boxes/object property expressions, defines two tests to check for that in an ontology, and provides proposals for how to correct the mistakes (there’s an informal introduction in a previous blog post). In addition to these research contributions on finding and fixing flaws, there was also an in-use paper about that, though then applied to SKOS vocabularies [4], which won the best in-use paper award. The authors combine guidelines and constraints for SKOS in a new tool, Skosify, and evaluated 14 SKOS vocabularies and thesauri in some detail, therewith improving those artifacts.
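To give a flavour of what checking for one such pitfall could look like, here is a minimal sketch for the "property defined as the inverse of itself" example. It is not how OOPS! is implemented; the function name, the pair-based representation of inverse declarations, and the property names are all hypothetical, chosen only to illustrate the idea of scanning declared axioms and reporting candidates for review.

```python
def self_inverse_properties(inverse_axioms):
    """Return the properties that are declared as the inverse of themselves.

    inverse_axioms: iterable of (property, declared_inverse) pairs.
    This is a hypothetical, simplified stand-in for the inverse-of
    declarations one would read from an ontology file.
    """
    return sorted({p for p, q in inverse_axioms if p == q})


# Hypothetical declarations (not taken from any real ontology).
axioms = [
    ("hasPart", "partOf"),
    ("marriedTo", "marriedTo"),  # declared as its own inverse
    ("teaches", "taughtBy"),
]

flagged = self_inverse_properties(axioms)
print(flagged)  # ['marriedTo']
```

A flagged property is only a candidate for review by the modeller, in the same spirit as the "possible instances" reported by the tool.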

From a modelling/ontology viewpoint, the paper about derived roles [5] was really interesting: although I had thought about the basic temporal dimension of roles before, I had not done so in such detail as Mizoguchi and co-authors did. For instance, how should one represent ‘murderer’ or ‘examinee’? There is such a thing as an “original role” as we commonly know it, but also a “derived role”, where the meaning of the original role is slightly altered, based on the context of that role; e.g., an examinee not only being an examinee whilst writing the exam, but also when she is studying before the exam, and once one is a murderer during the act of killing, one remains ‘a murderer’ for the remainder of one’s life (though, obviously, not permanently stuck in an act of killing). These derived roles have further, more detailed, specifications, which are summarized in the paper.

Another aspect of foundational ontologies is using them in domain ontology development, and the step prior to that: how to figure out what the ‘best’ foundational ontology is for your project. I co-authored a paper about that with my MSc student Zubeida Khan: ONSET: Automated Foundational Ontology Selection and Explanation [6], which was presented by her and also featured at the demo session where colleagues provided suggestions for more nice features. As mentioned in earlier blogposts (e.g., here), features of foundational ontologies were analysed, as well as criteria for selection of a foundational ontology and needs by existing ontology development projects, which were both used to design a tool, ONSET, that helps with automated selection of a foundational ontology and providing an explanation of the computed selection. Riichiro Mizoguchi—from the YAMATO foundational ontology and who was also attending the conference—has provided the values for the criteria of their foundational ontology in the meantime (thank you!), and you will see an updated ONSET very soon.

Some tools have been evaluated more rigorously than others, and there are a myriad of evaluation approaches. One that stands out by having used the Systems Usability Scale and a funny video during the presentation, is the evaluation of the Live OWL Documentation Environment LODE that automatically generates documentation of your ontology in one HTML page [7]. One that stands out for its interesting results, is the paper about the effect of software-supported collaboration features in the ontology development environment [8]. Marco Rospocher presented the user evaluation done with the MoKi modeling wiki with and without its collaboration features and evaluated their effect on ontology development. The collaborative ontology development went better with such features.

More papers deserve attention here (and I may add them later once I have read the papers), as do the other people who attended: it was really pleasant to meet several of them again, and to meet some in person for the first time after reading their papers over the years (among others, and in alphabetical order: Claudia d’Amato, Matthieu d’Aquin, Aldo Gangemi, Chiara Ghidini, Patrick Lambrix, Riichiro Mizoguchi, Marco Rospocher, Mari Carmen Suárez-Figueroa, and Michael Uschold). To my pleasant surprise, there appear to be ontology enthusiasts in Senegal as well (Gaoussou Camara presented a poster about the use of the infectious diseases ontology).

The next EKAW conference in 2014 will be held in Sweden and I’m looking forward to participating again.

References

(note: I tried to find the freely available versions to link to, where I could not find them, the link points to the Springer page of the EKAW’12 proceedings)

[1] María Poveda-Villalón, Mari Carmen Suárez-Figueroa and Asunción Gómez-Pérez. Validating ontologies with OOPS!. EKAW’12. Springer LNAI vol 7603, pp 267-281.

[2] Sebastien Ferré and Sebastian Rudolph. Advocatus Diaboli – Exploratory enrichment of ontologies with negative constraints. EKAW’12. Springer LNAI vol 7603, pp 42-56.

[3] C. Maria Keet. Detecting and Revising Flaws in OWL Object Property Expressions. EKAW’12. Springer LNAI vol 7603, pp 252-266.

[4] Osma Suominen and Eero Hyvönen. Improving the quality of SKOS vocabularies with Skosify. EKAW’12. Springer LNAI vol 7603, pp 383-397.

[5] Kouji Kozaki, Yoshinobu Kitamura and Riichiro Mizoguchi. A model of derived roles. EKAW’12. Springer LNAI vol 7603, pp 227-236.

[6] Zubeida Khan and C. Maria Keet. ONSET: Automated Foundational Ontology Selection and Explanation. EKAW’12. Springer LNAI vol 7603, pp 237-251.

[7] Silvio Peroni, David Shotton and Fabio Vitali. The Live OWL Documentation Environment: A tool for the automatic generation of ontology documentation. EKAW’12. Springer LNAI vol 7603, pp 398-412.

[8] Chiara Di Francescomarino, Chiara Ghidini, and Marco Rospocher. Evaluating wiki-enhanced ontology authoring. EKAW’12. Springer LNAI vol 7603, pp 292-301.

Some ideas about what the Semantic Web will look like in 2022

Research into realizing a vision of the Semantic Web has been ongoing for little over 10 years, and a call has gone out to ponder, daydream, fantasize, think wishfully or with fear about “What will the Semantic Web look like 10 years from now?” (SW2022). A selection of the many ideas will be presented on November 11, 2012, at the SW2022 workshop, held in conjunction with the 11th International Semantic Web Conference (ISWC’12) in Boston, USA.

For the curious: all SW2022 papers that will be presented are online on the SW2022 page (scroll down to about half-way on the web page for the programme). I picked out a few that I will summarise and comment on below; my selection is based on topic and/or author(s) and/or curious title, and I am a co-author of one of the papers.

Abraham Bernstein will present the first main paper [1], on the “global brain Semantic Web”, where the Internet is going to serve as the analogue to a brain’s neurons. The ‘global brain’ is used as a metaphor (or revamped old-fashioned AI?) for “distributed interleaved human-machine computation”, or, in fancier, more marketable, terms, now also called “collective intelligence” and “social computing”. In short: put the human in the Semantic Web, both as part of the knowledge provider and as educated user. Bernstein zooms in on the need to be able to manage the “motivational diversity, cognitive diversity, and error diversity” with respect to the possibility of realizing this global brain Semantic Web. Alessandro Oltramari’s vision for a cognitive Semantic Web [2] is quite similar to Bernstein’s one, where the semantic web is tuned to the individual user and “it will be an emergent social network of human and artificial cognitive agents interacting in a hybrid environment, where the distinction between physical and virtual will be superseded by the very nature of the entities populating it, namely knowledge objects and knowledge agents” [2]. Compared to these, our vision of interoperability is somewhat more humble.

Oliver Kutz will present our paper [3] about interoperability among ontologies, to be realized with the Distributed Ontology Language (DOL) that is currently in the process of standardisation at ISO (scheduled to be finalized by 2015). DOL is a metalanguage for distributed ontologies that may be represented in different ontology languages (some of the technical details can be found in a recent paper that won the best paper award at FOIS’12 [4] and a few examples are described in [5]). Overall then, it would be nice if, by 2022, we have solved the interoperability issues not only among data, but also the ‘models’ (ontologies, services descriptions etc.) and, especially, their logic-based representation languages. For instance, being able to seamlessly link knowledge that is represented partially in OWL 2 DL and partially in an ontology represented in Common Logic or leaving an OBO ontology like that yet declare more semantics (e.g., cardinality constraints, property chains) ‘around’ it in a more expressive language for those who need it, and advanced features for modularization, which are all realistic usage scenarios with the DOL. Clearly, all this will need some tool support. Initial tools do exist—Hets for reasoning over heterogeneous ontologies and the Ontohub ontology repository—but more can and will have to be done to realize full interoperability.

The paper on the Semantic Web needs (vision?) for cultural heritage [6] offers nothing I did not already know. South Africa has its own programme in that area (albeit called “indigenous knowledge management”, not “cultural heritage”) and we did our own requirements analysis some time ago already [7, 8]. Our list of requirements matches the one by Vavliakis et al., and we have a technology maturity analysis, a set of OWL requirements, and actual use cases from the domain experts and users of the Department of Science & Technology’s National Recordal System project for indigenous knowledge management (about which I blogged before). That the topics will receive attention also at SW2022 hopefully increases the chance that those requirements will be investigated further, solved, and realized, which, in turn, will improve the software developed here and, ultimately, the people will benefit from it all.

Mutharaju [9] emphasizes the need for connectivity, personalization and abstraction. Regarding the latter, he notes that “There would be a need to provide multiple (and higher) levels of abstractions and facilitate drill-down mechanisms.” Yay! Maybe my work on granularity (among others, [10]) will find its way into implementations after all. Also, Mutharaju thinks that the Semantic Web may be of use for the benefit of the environment (e.g., calculating better traffic flow, using sensor data etc.).

A short paper scheduled for the panel session is entitled “The rise of the verb” [11], which I found a curious title: verbs are taken into account already, where a verb’s ontological foundation is, in the Semantic Web context, represented as an object property in OWL or reified under, say, DOLCE’s Perdurant. Considering the contents of the paper, a more suitable title could have been “action in the Semantic Web”: the paper’s introduction suggests adding something executable to the Semantic Web by means of JavaScript but where the instruction is specified at the knowledge level. Heiko Paulheim and Jeff Pan also argue in favour of language extensions, in particular so as to be able to handle imprecision/uncertainty [12].

Vander Sande and co-authors present a rather bleak vision of the Semantic Web [13], in that it could endanger humanity. They spend the full 6 pages on highlighting the myriad of dangers and the possible misuses of Semantic Web technologies. Among others: ‘semantic spam’ instead of the dumb variety we have gotten used to, where spammers take advantage of the Linked Open Data cloud and otherwise linked social network data to make the spam look more believable; polluting the LOD cloud through link spoofing; identity theft and provenance manipulation; and the Web of Things for autonomous computerized weaponry. One also could have added a follow-through of the saying that ‘knowledge is power’, where better and scaled-up knowledge management facilitates obtaining more power (and power corrupts, and absolute power corrupts absolutely). All this, in turn, goes back to the philosophical issues regarding responsibility in research, engineering, and technology and whether some field is inherently bad, neutral, or good, or whether the bad pops up only with some application scenarios where the technologies could possibly be used. For the Semantic Web, I think it is only the latter, but you may try to convince me otherwise.

Although I won’t be attending, it’s appreciated that the papers are online already, and I can imagine there will be some lively discussions at the SW2022 workshop.

References

[1] Abraham Bernstein. The Global Brain Semantic Web – Interleaving Human-Machine Knowledge and Computation. SW2022, Boston, Nov 11, 2012.

[2] Alessandro Oltramari. Enabling the cognitive Semantic Web. SW2022, Boston, Nov 11, 2012.

[3] Oliver Kutz, Christoph Lange, Till Mossakowski, C. Maria Keet, Fabian Neuhaus, Michael Grüninger. The Babel of Semantic Web tongues – in search of the Rosetta Stone of interoperability. SW2022, Boston, Nov 11, 2012.

[4] Till Mossakowski, Christoph Lange, Oliver Kutz. Three Semantics for the Core of the Distributed Ontology Language. In Michael Gruninger (Ed.), FOIS 2012: 7th International Conference on Formal Ontology in Information Systems, Graz, Austria.

[5] Christoph Lange, Till Mossakowski, Oliver Kutz, Christian Galinski, Michael Grüninger, Daniel Couto Vale. The Distributed Ontology Language (DOL): Use Cases, Syntax, and Extensibility, Terminology and Knowledge Engineering Conference (TKE’12). Madrid, Spain.

[6] Konstantinos N. Vavliakis, Georgios Th. Karagiannis and Pericles A. Mitkas. Semantic Web in Cultural heritage after 2020. SW2022, Boston, Nov 11, 2012.

[7] Thomas Fogwill, Ronell Alberts, C. Maria Keet. The potential for use of semantic web technologies in IK management systems. IST-Africa Conference 2012. May 9-11, Dar es Salaam, Tanzania.

[8] Ronell Alberts, Thomas Fogwill, C. Maria Keet. Several Required OWL Features for Indigenous Knowledge Management Systems. 7th Workshop on OWL: Experiences and Directions (OWLED 2012). 27-28 May, Heraklion, Crete, Greece. CEUR-WS Vol-849. 12p.

[9] Raghava Mutharaju. How I would like Semantic Web to be, for my children. SW2022, Boston, Nov 11, 2012.

[10] C. Maria Keet. A formal theory of granularity. PhD Thesis, KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy. 2008.

[11] Paul Groth. The rise of the verb. SW2022, Boston, Nov 11, 2012.

[12] Heiko Paulheim and Jeff Z. Pan. Why the Semantic Web should become more imprecise. SW2022, Boston, Nov 11, 2012.

[13] Miel Vander Sande, Sam Coppens, Davy Van Deursen, Erik Mannens and Rik Van De Walle. The terminator’s origins or how the Semantic Web could endanger humanity. SW2022, Boston, Nov 11, 2012.

MOOCs, computer-based teaching aids, and taking notes

This post initially started out to be directed toward the current COMP314 Theory of Computation students at UKZN, who, like last year, are coming to terms not only with the subject domain, but also with the fact that I write the lecture notes on the blackboard (which is green, btw). The post ended up containing some general reflections on the use/non-use of computer-based teaching aids and, by extension, the Massive Open Online Courses (MOOCs), with illustrations taken mainly from the ToC course.

First a note to the students: I am aware most of you don’t really like taking notes during the lectures and quite a few still do not do so—even though you know that you’re served a summary of the bulky textbook, saving you from having to summarise it yourself. But those who do make notes, or at least rewrite the notes from someone else or fill up the gaps in their own notes, go much more quickly through the exercises than those who do not. For instance, I occasionally gave as an exercise an example that was done in class on the board already. The diligent note-takers’ response is along the lines of “yeah, that was trivial, and, by the way, we did that already in class, here’s the solution. Give me a real challenge!” compared to starting the ‘exercise’ from scratch by those who did not take notes, and who do not even recollect we did it in class. In the end (and by observation, not scientific rigour of the double blind experiment), the former group completes more, and more advanced, exercises in the same or less amount of time. Maybe I should shout that from the rooftops.

Taking notes means you listen, read, and write, therewith processing the knowledge that I’m trying to get from my brain into yours. Doing something actively, versus passively listening (or, worse, letting it go into one ear and immediately release it through the other, or dozing off), makes you think about the material at least once, which then saves time later on because you’ll remember some, if not all, of it. In addition, imagine how fast I could go through the material if I would not have to take the time to write it on the board, but just click the down arrow, and therewith having the opportunity to cover even more material than I already do whereas you would hardly have time to think about the matter at all, let alone ask questions. Besides, note taking is a good exercise in general, because there often won’t be any handouts prepared for you once you’re out there in industry, so you might as well practice for those situations now. (See also Mole’s column on why to chalk it upon the blackboard.)

Nonetheless, unlike, say, 15-20 years ago when we did not know any better than reading the material beforehand and taking notes during the lectures, in the present-day slides-era, quite a few students are not happy with the note-taking effort and ‘quality of the teaching aids provided’, as they are used to the slides in other courses. I distribute the slides, too, for a course like ontologies and knowledge bases, because there is no textbook (I even wrote a 150-page set of lecture notes), but for Theory of Computation, we use a textbook (Hopcroft, Motwani, and Ullman) that has the details at undergraduate level. Nor were last year’s students happy with the note-taking. Some students searched online back then, and I vividly remember one student sharing his opinion about that with me, unsolicited: “ma’am, we searched online for better material, but it’s all just as bad as yours! So we won’t hold it against you”. Well, thanks.

Closely related to that ‘searching around online’, however, are possible less pleasant side effects. I will mention two. First, last year some students looked up alternative explanations for the diagonalization language and some things being incomputable, having come across a version of the barbershop paradox. Strictly speaking, it is not an alternative explanation of the same thing; or: if you are going to use that, it is a wrong analogy. I’ll explain why in class in a few weeks when we’ll cover chapters 8 and 9 of HMU.

Second, this year I have seen some non-notes students watching a YouTube video on the pumping lemma and on CFGs during the exercises in the lab; such repeats take up extra time. The note-takers, on the other hand, flicked through their pages in a fraction of the time, hence, were ahead in the exercises simply because they had used the lecture time more effectively. (And I generously assume that what was presented in the YouTube video was correct, which need not be the case; see previous point).

This does not mean there is no room for improvement regarding teaching aids for Theory of Computing, which I wrote about last year. This year I tried the AutoSim automata simulator and the CFG Grammar Editor again, which still do not have much uptake by the students, afaik, and I’m trying out Gradiance. The Gradiance system is an online MCQ question bank with explanations of the answers and hints in case the answer given was incorrect (i.e., turning the gained experience and knowledge about common conceptual mistakes into an educational feature), and one can assign homework and grade assignments. The dedication to do the (optional) homework has been dwindling since the 5th week of the semester, but the automatic grading and the option for self-study for the interested and/or determined-to-pass students can be great (the system is almost cheat-proof, but not entirely).

To get to the point after some meandering: some types of computer-based teaching aids can be useful, just not all of them all of the time. True, computing looks at what can be automated, and what can be automated efficiently, and so I could apply that notion to everything—up to totally automated course offerings, which is the direction that the MOOCs are going. However, computing also concerns solving problems, and being able to recognise when a problem is best solved by computation, and when other solutions may be more appropriate. For instance, low pass rates may be considered to be a problem, but this does not imply that e-learning is the solution to that; non-determinism and epsilon-transitions are concepts that are apparently not easy to grasp, and the simulator is more illustrative than my coloured chalk trying to simulate a run on the blackboard; in the pre-CMS era, course admin was a chore and there were, perhaps, instances of ‘lost’ assignment or project submissions (though that did not happen when I was a student in the ‘90s), which Moodle alleviates and prevents, respectively. So, software can indeed solve some problems.
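For those who would rather trace such a run in code than in coloured chalk, here is a minimal Python sketch of what a simulator does under the hood: it tracks the set of states the automaton could be in, and takes the epsilon-closure after each step. The automaton itself (states q0 to q3, accepting strings over {a, b} that end in ab) is a made-up example, not one from the course or from AutoSim, and using the empty string as the epsilon label is merely a convention of this sketch.

```python
from typing import Dict, List, Set, Tuple

EPS = ""  # label for epsilon-transitions (a convention of this sketch)

# Hypothetical epsilon-NFA over {a, b} that accepts strings ending in "ab".
DELTA: Dict[Tuple[str, str], Set[str]] = {
    ("q0", EPS): {"q1"},          # epsilon-move: leave q0 without reading input
    ("q1", "a"): {"q1", "q2"},    # non-determinism: stay, or guess this 'a' starts the final "ab"
    ("q1", "b"): {"q1"},
    ("q2", "b"): {"q3"},
}
START = "q0"
ACCEPTING = {"q3"}


def eps_closure(states: Set[str]) -> Set[str]:
    """All states reachable from `states` using only epsilon-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def run(word: str) -> List[Set[str]]:
    """Return the set of possible states before and after each input symbol."""
    current = eps_closure({START})
    trace = [current]
    for symbol in word:
        moved: Set[str] = set()
        for q in current:
            moved |= DELTA.get((q, symbol), set())
        current = eps_closure(moved)
        trace.append(current)
    return trace


def accepts(word: str) -> bool:
    """A word is accepted if some possible final state is accepting."""
    return bool(run(word)[-1] & ACCEPTING)
```

Calling run("ab") lists the possible states after each symbol, which is the same bookkeeping one would otherwise do by hand on the board.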

This brings me to the other end of the spectrum: the Massive Open Online Courses, or MOOCs; yes, there is even a MOOC for Theory of Computation, by Ullman himself. Which problems do they purport to solve, and do they? Despite reading a lot of pop-articles about it over the past year (see, e.g., the feature series from the Chronicle of Higher Education on MOOCs), it still is not clear to me with respect to the solutions for lecturing issues. One recurring argument in the flurry of news articles is that it is great [to/because it] give[s] the poor sods around the world some crumbs from the elite universities; well, the ‘poor’ with a good Internet connection and the money to pay for the data transfer, that is, not Joe and Joanne Soap in Chesterville, Soweto, etc., and the interested potential student has to have a good command of the English language as well.

Another recurring argument with the MOOCs is that one can learn from the best, and, en passant, implying, or even explicitly stating, that the MOOC lecturers are assumed to be better teachers by definition than anyone else. Sure, there are lecturers who teach stuff that is wrong, but how widespread is that, and what is the cause of that? If anyone has hard data about such claims, please let me know. For the sake of argument, let’s assume mistakes are widespread, and that it is because we, as non-elite university lecturers, are undereducated and incompetent teachers. Is a MOOC the solution? Or maybe giving lecturers the time to learn their material and prepare the lectures better, i.e., local capacity building? Not all of us are undereducated. People who have taught a course many times, do research in that field, and possibly even have written textbooks on the topic, tend to be better lecturers, because they generally have reflected more on the teaching and already have come across all possible conceptual mistakes the students make, can anticipate it, and therewith even prevent that from happening at least to a larger extent than a novice lecturer; those people are not all and only at Stanford, Harvard, and MIT.

Second, the MOOCs—for the time being, at least—are based on a push-mechanism of knowledge transfer, yet lecturing consists of much more than talking at the front of the classroom. There is interaction with questions and answers, there is context, and so on. For instance, regarding motivational context, one can introduce data mining in a database course with a story about the nearby Pick ‘n Pay at Westwood mall that the on-campus students go to, and how, once you have signed up for their customer loyalty card, PnP will find out you’re a student even if you did not say so on the application form. Or the problems with Johannesburg’s integrated services delivery management system as a real life example of data integration issues and the urgent need to have educated South African computer scientists to solve them. Or, given that UKZN’s CS students have had a lot of Java programming in the first and second year, to show them the Java language as a CFG in the theory of computation course. In addition, to the sometimes pleasant and sometimes frightened surprise of my students, I actually do know about half of the roughly 70 registered students by name, and know who can be prodded into action only when an exercise is marked with ‘super hard’, who does not want to know the answer straight away but just a little hint to move on him/herself to find the solution, the determined to solve it on their own, the insecure who needs a bit of encouragement, and so on.

To have a MOOC suit you, you would have to have done the same prior courses—unless the MOOC is only an introduction to topic x—, you would not care about the absence of a few context/motivational stories, your learning style has to match with that of the MOOC, and you have to be very disciplined on your own. To name but a few potential hurdles in the positive light.

Further, there are the exercises and tests. There are some new tools for the automatic grading of exercises and assignments (Gradiance would fit here, I suppose, but not the regular textbook exercises), or semi-automatic with software+TA, and virtual ‘MOOC study groups’ are popping up in social media. If you don’t know any better, it is probably great. Like thinking pizza is tasty all around the world—until you experience how it tastes in Naples. Such online groups are not bad—I participated in it myself when I was studying at the Open University UK, and it is better than no contact with other students at all—but it does not compare with the face-to-face meetings with fellow students, where the lectures are discussed, notes compared and brushed up, exercises discussed and solved, peer-explanation happens, students motivating one another, and so on.

Overall, then, if MOOCs are going to become the standard, the world will be poorer for it. For sidelining competent lecturers, de-incentivising weaker lecturers from acting on their responsibility to brush up their knowledge and skills, de-contextualising and hamburgerising courses, impoverishing the academic learning environment by narrowing down education to a mere push-mechanism of knowledge transfer, and dehumanizing students into boring conformity (eenheidsworst in Dutch). Add to that mix the cultural imperialism, and we are well on our way to a ‘brave new world’.

In the meantime, participate in the lectures and process the information, and take notes. I don’t think MOOCs will kill the regular universities, but imagine if you really were the last generation to go to (or work at) a real university… Exploit the advantages that a face-to-face university offers you, and cherish it while it lasts!

Fixing flaws in OWL object property expressions

OWL 2 DL is a very expressive language and, thanks to ontology developers’ persistent requests, has many features for declaring complex object property expressions: object sub-properties, (inverse) functional, disjointness, equivalence, cardinality, (ir)reflexivity, (a)symmetry, transitivity, and role chaining. A downside of this is that the more one can do, the higher the chance that flaws in the representation are introduced; hence, an unexpected or undesired classification or inconsistency may actually be due to a mistake in the object property box, not a class axiom. While there are nifty automated reasoners and explanation tools that help with the modeling exercise, the standard reasoning services for OWL ontologies assume that the axioms in the ‘object property box’ are correct and according to the ontologist’s intention. This may not be the case. Take, for instance, the following three examples, where either the assertion is not according to the intention of the modeller, or the consequence may be undesirable.

  • Domain and range flaws: asserting hasParent ⊑ hasMother instead of hasMother ⊑ hasParent in accordance with their domain and range restrictions (i.e., a subsetting mistake—a more detailed example can be found in [1]), or declaring a domain or a range to be an intersection of disjoint classes;
  • Property characteristics flaws: e.g., the family-tree.owl (when accessed on 12-3-2012) has hasGrandFather ⊑ hasAncestor and Trans(hasAncestor) so that transitivity unintentionally is passed down the property hierarchy, yet hasGrandFather is really intransitive (but that cannot be asserted in OWL);
  • Property chain issues: for instance the chain hasPart ∘ hasParticipant ⊑ hasParticipant in the pharmacogenomics ontology [2] that forces the classes in class expressions using these properties—in casu, DrugTreatment and DrugGeneInteraction—to be either processes due to the domain of the hasParticipant object property, or they will be inconsistent.

Unfortunately, reasoner output and explanation features in ontology development environments do not point to the actual modelling flaw in the object property box. This is because the implemented justification and explanation algorithms [3, 4, 5] consider logical deductions only and because class axioms and assertions about instances take precedence over what ‘ought to be’ concerning object property axioms, so that only instances and classes can move about in the taxonomy. This makes sense from a logic viewpoint, but it is not enough from an ontology quality viewpoint, as an object property inclusion axiom—being the property hierarchies, domain and range axioms to type the property, a property’s characteristics (reflexivity etc.), and property chains—may well be wrong, and this should be found as such, and corrections proposed.

So, we have to look at what type of mistakes can be made in object property expressions, how one can get the modeller to choose the ontologically correct options in the object property box so as to achieve a better quality ontology and, in case of flaws, how to guide the modeller to the root defect from the modeller’s viewpoint, and propose corrections. That is: the need to recognise the flaw, explain it, and to suggest revisions.

To this end, two non-standard reasoning services were defined, SubProS and ProChainS, in a paper [6] that has been accepted recently at the 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12). The former is an extension to the RBox Compatibility Service for object subproperties by [1] so that it now also handles the object property characteristics in addition to the subsetting-way of asserting object sub-properties and covers the OWL 2 DL features as a minimum. For the latter, a new ontological reasoning service is defined, which checks whether the chain’s properties are compatible by assessing the domain and range axioms of the participating object properties. Both compatibility services exhaustively check all permutations and therewith pinpoint the root cause of the problem (if any) in the object property box. In addition, if a test fails, one or more proposals are made on how best to revise the identified flaw (depending on the flaw, it may include the option to ignore the warning and accept the deduction). Put differently: SubProS and ProChainS can be considered so-called ontological reasoning services, because the ontology does not necessarily contain logical errors in some of the flaws detected, and these two services thus fall in the category of tools that focus on both logic and additional ontology quality criteria, by aiming toward ontological correctness in addition to just a satisfiable logical theory (on this topic, see also the works on anti-patterns [7] and OntoClean [8]). Hence, it is different from other works on explanation and pinpointing mistakes that concern logical consequences only [3,4,5], and SubProS and ProChainS also propose revisions for the flaws.
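To give a flavour of what such a compatibility check looks at, here is a heavily simplified Python sketch. It is not the definition of SubProS or ProChainS from [6]: the toy taxonomy, the domain and range declarations, and the three chain tests are assumptions made purely for illustration, and it only compares named domain and range classes against a tree-shaped class hierarchy.

```python
from dataclasses import dataclass
from typing import List, Optional

# Toy class taxonomy as child -> parent (hypothetical, for illustration only).
PARENT = {
    "Mother": "Parent",
    "Parent": "Person",
    "Person": "Thing",
    "Process": "Thing",
}


def is_subclass_of(sub: str, sup: str) -> bool:
    """True if `sub` is `sup` or lies below it in the toy taxonomy."""
    current: Optional[str] = sub
    while current is not None:
        if current == sup:
            return True
        current = PARENT.get(current)
    return False


@dataclass(frozen=True)
class ObjectProperty:
    name: str
    domain: str
    range_: str


def check_subproperty(sub: ObjectProperty, sup: ObjectProperty) -> List[str]:
    """Flag `sub ⊑ sup` if sub's domain or range is not at or below sup's."""
    issues = []
    if not is_subclass_of(sub.domain, sup.domain):
        issues.append(f"domain {sub.domain} of {sub.name} is not under domain {sup.domain} of {sup.name}")
    if not is_subclass_of(sub.range_, sup.range_):
        issues.append(f"range {sub.range_} of {sub.name} is not under range {sup.range_} of {sup.name}")
    return issues


def check_chain(first: ObjectProperty, second: ObjectProperty, target: ObjectProperty) -> List[str]:
    """Flag `first ∘ second ⊑ target` when domains and ranges do not line up (assumed tests)."""
    issues = []
    if not is_subclass_of(first.range_, second.domain):
        issues.append(f"range {first.range_} of {first.name} is not under domain {second.domain} of {second.name}")
    if not is_subclass_of(first.domain, target.domain):
        issues.append(f"domain {first.domain} of {first.name} is not under domain {target.domain} of {target.name}")
    if not is_subclass_of(second.range_, target.range_):
        issues.append(f"range {second.range_} of {second.name} is not under range {target.range_} of {target.name}")
    return issues


# Hypothetical declarations; the real ontologies declare these differently.
has_parent = ObjectProperty("hasParent", "Person", "Parent")
has_mother = ObjectProperty("hasMother", "Person", "Mother")
has_part = ObjectProperty("hasPart", "Thing", "Thing")
has_participant = ObjectProperty("hasParticipant", "Process", "Thing")
```

With these toy declarations, check_subproperty(has_mother, has_parent) reports nothing, whereas the reversed assertion check_subproperty(has_parent, has_mother) is flagged on the range, and check_chain(has_part, has_participant, has_participant) is flagged because hasPart is not restricted to processes. A real service would also have to deal with class expressions, property characteristics, and the proposals for revision.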

SubProS and ProChainS were evaluated (manually) with several ontologies, including BioTop and the DMOP, which demonstrated that the proposed ontological reasoning services indeed did isolate flaws and could propose useful corrections, which have been incorporated in the latest revisions of the ontologies.

Theoretical details, the definition of the two services, as well as detailed evaluation and explanation going through the steps can be found in the EKAW’12 paper [6], which I’ll present some time between 8 and 12 October in Galway, Ireland. The next phase is to implement an efficient algorithm and make a user-friendly GUI that assists with revising the flaws.

References

[1] Keet, C.M., Artale, A.: Representing and reasoning over a taxonomy of part-whole relations. Applied Ontology 3(1-2) (2008) 91–110

[2] Dumontier, M., Villanueva-Rosales, N.: Modeling life science knowledge with OWL 1.1. In: Fourth International Workshop OWL: Experiences and Directions 2008 (OWLED 2008 DC). (2008) Washington, DC (metro), 1-2 April 2008

[3] Horridge, M., Parsia, B., Sattler, U.: Laconic and precise justifications in OWL. In: Proceedings of the 7th International Semantic Web Conference (ISWC 2008). Volume 5318 of LNCS., Springer (2008)

[4] Parsia, B., Sirin, E., Kalyanpur, A.: Debugging OWL ontologies. In: Proceedings of the World Wide Web Conference (WWW 2005). (2005) May 10-14, 2005, Chiba, Japan.

[5] Kalyanpur, A., Parsia, B., Sirin, E., Grau, B.: Repairing unsatisfiable concepts in OWL ontologies. In: Proceedings of ESWC’06. Springer LNCS (2006)

[6] Keet, C.M. Detecting and Revising Flaws in OWL Object Property Expressions. 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12), Oct 8-12, Galway, Ireland. Springer, LNAI, 15p. (in press)

[7] Roussey, C., Corcho, O., Vilches-Blazquez, L.: A catalogue of OWL ontology antipatterns. In: Proceedings of K-CAP’09. (2009) 205–206

[8] Guarino, N., Welty, C.: An overview of OntoClean. In Staab, S., Studer, R., eds.: Handbook on ontologies. Springer Verlag (2004) 151–159

Moral responsibility in the Computing era (SEP entry)

The Stanford Encyclopedia of Philosophy intermittently has new entries that have to do with computing, like on the philosophy of computer science about which I blogged before, ethics of, among others, Internet research, and now Computing and Moral Responsibility by Merel Noorman [1]. The remainder of this post is about the latter entry that was added on July 18, 2012. Overall, the entry is fine, but I had expected more from it, which may well be because the ‘computing and moral responsibility’ topic needs some more work to mature and then maybe will give me the answers I was hoping to find already.

Computing—be this the hardware, firmware, software, or IT themes—interferes with the general notion of moral responsibility, hence, affects every ICT user at least to some extent, and the computer scientists, programmers etc. who develop the artifacts may themselves be morally responsible, and perhaps even the produced artifacts, too. This area of philosophical inquiry deals with questions such as “Who is accountable when electronic records are lost or when they contain errors? To what extent and for what period of time are developers of computer technologies accountable for untoward consequences of their products? And as computer technologies become more complex and behave increasingly autonomous can or should humans still be held responsible for the behavior of these technologies?”. To this end, the entry has three main sections, covering moral responsibility, the question whether computers can be moral agents, and the notion of (and the need for) rethinking the concept of moral responsibility.

First, it reiterates the general stuff about moral responsibility without the computing dimension, like that it has to do with the actions of humans and their consequences: “generally speaking, a person or group of people is morally responsible when their voluntary actions have morally significant outcomes that would make it appropriate to praise or blame them”, where the SEP entry dwells primarily on the blaming. Philosophers roughly agree that the following three conditions have to be met regarding being morally responsible (copied from the entry):

1. There should be a causal connection between the person and the outcome of actions. A person is usually only held responsible if she had some control over the outcome of events.

2. The subject has to have knowledge of and be able to consider the possible consequences of her actions. We tend to excuse someone from blame if they could not have known that their actions would lead to a harmful event.

3. The subject has to be able to freely choose to act in certain way. That is, it does not make sense to hold someone responsible for a harmful event if her actions were completely determined by outside forces.

But how are these to be applied? A few case examples of the difficulty of applying them in practice are given; e.g., the malfunctioning Therac-25 radiation machine (three people died from overdoses of radiation, primarily due to issues regarding the software), the Aegis software system that misidentified an Iranian civilian aircraft in 1988 as an attacking military aircraft, after which the US military decided to shoot it down (contrary to two other systems that had identified it correctly), killing all 290 people on board, the software to manage remote-controlled drones, and perhaps even the ‘filter bubble’. Who is to blame, if at all? These examples, and others I can easily think of, are vastly different scenarios, but they have not been identified, categorized, and treated as such. But if we do, then perhaps at least some general patterns can emerge and even rules regarding moral responsibility in the context of computing. Here’s my initial list of different kinds of cases:

  1. The hardware/software was intended for purpose X but is used for purpose Y, with X not being inherently harmful, whereas Y is; e.g., the technology of an internet filter for preventing kids from accessing adult-material sites is used to make a blacklist of sites that do not support government policy and subsequently the users vote for harmful policies, or, as a simpler one: using mobile phones to detonate bombs.
  2. The hardware/software is designed for malicious intents; ranging from so-called cyber warfare (e.g., certain computer viruses, denial-of-service attacks) to computing for physical war to developing and using shadow-accounting software for tax evasion.
  3. The hardware/software has errors (‘bugs’):
    i. The specification was wrong with respect to the intentionally understated or mis-formulated intentions, and the error is simply a knock-on effect;
    ii. The specification was correct, but a part of the subject domain is intentionally wrongly represented (e.g., the decision tree may be correctly implemented given the wrong representation of the subject domain);
    iii. The specification was correct, the subject domain represented correctly, but there’s a conceptual error in the algorithm (e.g., the decision tree was built wrongly);
    iv. The program code is scruffy and doesn’t do what the algorithm says it is supposed to do;
  4. The software is correct, but has the rules implemented as alethic or hard constraints versus deontic or soft constraints (not being allowed to manually override a default rule), effectively replacing human-bureaucrats with software-bureaucrats;
  5. Bad interface design to make the software difficult to use, resulting in wrong use and/or overlooking essential features;
  6. No or insufficient training of the users how to use the hardware/software;
  7. Insufficient maintenance of the IT system that causes the system to malfunction;
  8. Overconfidence in the reliability of the hardware/software:
    i. The correctness of the software, pretending that it always gives the right answer when it may not; e.g., assuming that the pattern matching algorithm for fingerprint matching is 100% reliable when it is actually only, say, 85%;
    ii. Assuming (extremely) high availability, when no extremely high availability system is in place; e.g., relying solely on electronic health records in a remote area whereas the system may be down right when it is crucial to access it in the hospital information system.
  9. Overconfidence in the information provided by or through the software; this is partially similar to 8-i, or the first example in item 1, and, e.g., willfully believing that everything published on the Internet is true despite the so-called ‘information warfare’ regarding the spreading of disinformation.

Where the moral responsibility lies can be vastly different depending on the case, and even within the case, it may require further analysis. For instance (and my opinions follow, not what is written in the SEP entry), regarding maintenance: a database for the electronic health records outgrows its prospective size or the new version of the RDBMS actually requires more hardware resources than the server has, with the consequence that querying the database becomes too slow in a critical case (say, to check whether patient A is allergic to medicine B that needs to be administered immediately): perhaps the system designer should have foreseen this, or perhaps management didn’t sign off on a purchase for a new server, but I think that the answer to the question of where the moral responsibility lies can be found. For mission-critical software, formal methods can be used, and if, as engineer, you didn’t and something goes wrong, then you are to blame. One cannot be held responsible for a misunderstanding, but when the domain expert says X of the subject domain and you have some political conviction that you prefer Y and build that into the software and that, then, results in something harmful, then you can be held morally responsible (item 3-ii). On human vs. software bureaucrat (item 4), the blame can be narrowed down when things go wrong: was it the engineer who didn’t bother with the possibility of exceptions, was there a technological solution for it at the time of development that was knowingly ignored (or none at all), or was it the client who willfully was happy avoiding such pesky individual exceptions to the rule? Or, another example, as the SEP entry questions (an example of item 1): can one hold the mobile phone companies responsible for having designed cell phones that also can be used to detonate bombs? In my opinion: no. Just in case you want to look for guidance, or even answers, in the SEP entry regarding such questions and/or cases: don’t bother, there are none.

More generally, the SEP entry mentions two problems for attributing blame and responsibility: the so-called problem of ‘many hands’ and the problem with physical and temporal distance. The former concerns the issue that there are many people developing the software, training the users, etc., and it is difficult to identify the individual, or even the group of individuals, who ultimately did the thing that caused the harmful effect. It is true that this is a problem, and especially when the computing hardware or software is complex and developed by hundreds or even thousands of people. The latter concerns the problem that the distance can blur the causal connection between action and event, which “can reduce the sense of responsibility”. But, in my opinion, just because someone doesn’t reflect much on her actions and may be willfully narrow-minded to (not) accept that, yes, indeed, those people celebrating a wedding in a tent in far-away Afghanistan are (well, were) humans, too, does not absolve one from the responsibility—neither the hardware developer, nor the software developer, nor the one who pushed the button—as distance does not reduce responsibility. One could argue it is only the one who pushed the button who made the judgment error, but the drone/fighter jet/etc. computer hardware and software are made for harmful purposes in the first place. Its purpose is to do harm to other entities—be this bombing humans or, say, a water purification plant such that the residents have no clean water—and all developers involved very well know this; hence, one is morally responsible from day one that one is involved in its development and/or use.

I’ll skip the entry’s section on computers as agents (AI software, robots), and whether they can be held morally responsible, just responsible, or merely accountable, or none of them, except for the final remark of that section, credited to Bruno Latour (emphasis mine):

[Latour] suggests that in all forms of human action there are three forms of agency at work: 1) the agency of the human performing the action; 2) the agency of the designer who helped shaped the mediating role of the artifacts and 3) the artifact mediating human action. The agency of artifacts is inextricably linked to the agency of its designers and users, but it cannot be reduced to either of them. For him, then, a subject that acts or makes moral decisions is a composite of human and technological components. Moral agency is not merely located in a human being, but in a complex blend of humans and technologies.

Given the issues with assigning moral responsibility with respect to computing, some philosophers ponder about doing away with it, and replace it with a better framework. This is the topic of the third section of the SEP entry, which relies substantially on Gotterbarn’s work on it. He notes that computing is ethically not a neutral practice, and that the “design and use of technological artifacts is a moral activity” (because the choice of one design and implementation over another does have consequences). Moreover, and more interesting, is that, according to the SEP entry, he introduces the notions of negative responsibility and positive responsibility. The former “places the focus on that which exempts one from blame and liability”, whereas the latter “focuses on what ought to be done”, and entails to “strive to minimize foreseeable undesirable events”. Computing professionals, according to Gotterbarn, should adopt the notion of positive responsibility. Later on in the section, there’s a clue that there’s some way to go before achieving that. Accepting accountability is more rudimentary than taking moral responsibility, or at least a first step toward moral responsibility. Nissenbaum (paraphrased in the SEP entry) has identified four barriers to accountability in society (at least back in 1997 when she wrote it): the above-mentioned problem of many hands, the acceptance of ‘bugs’ as an inherent element of large software applications, using the computer as scapegoat, and claiming ownership without accepting liability (read any software license if you doubt the latter). Perhaps that needs to be addressed before going on to the moral responsibility, or one reinforces the other? Dijkstra vents his irritation in one of his writings about software ‘bugs’—the cute euphemism dating back to the ‘50s—and instead proposes to use one of its correct terms: they are errors. 
Perhaps users should not be lenient with errors, which might compel developers to deliver a better or even error-free product, and/or we have to instill more of the positive responsibility in students and reduce their tolerance for errors. And/or what about rewriting the license agreements a bit, such as accepting responsibility provided the software is used in one of the prescribed and tested ways? We already had that when I was working for Eurologic more than 10 years ago: the storage enclosure was supposed to work in certain ways and was tested in a variety of configurations, and that is what we signed off on for our customers. If it was faulty in one of the tested system configurations after all, then that was our problem, and we’d incur the associated costs to fix it. To some extent, the same applied to our suppliers. Indeed, for software, that is slightly harder, but one could include in the license something along the lines of ‘X works on a clean machine and when common other packages w, y, and z are installed, but we can’t guarantee it when you’ve downloaded weird stuff from the Internet’; not perfect, but it is a step in the right direction. Does anyone have better ideas?

Last, the closing sentence is a useful observation, effectively stretching the standard notion of moral responsibility thanks to computing (emphasis added): “[it] is, thus, not only about how the actions of a person or a group of people affect others in a morally significant way; it is also about how their actions are shaped by technology.” But, as said, the details are yet to be thought through and worked out into general guidelines that can be applied.

References

[1] Merel Noorman. (forthcoming in 2012). Computing and Moral Responsibility. Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Zalta, E.N. (ed.).  Stable URL: http://plato.stanford.edu/archives/fall2012/entries/computing-responsibility/.

A new version of ONSET and more technical details are now available

After the first release of the foundational ONtology Selection and Explanation Tool ONSET half a year ago, we (Zubeida Khan and I) continued its development by adding SUMO, conducting a user evaluation, and writing a paper about it, which was recently accepted [1] at the 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12).

There are theoretical and practical reasons why using a foundational ontology improves the quality and interoperability of the domain ontology, be this by means of reusing DOLCE, BFO, GFO, SUMO, YAMATO, or another one, in part or in whole (see, e.g., [2,3] for some motivations). But domain ontology developers, and in particular those who are potentially interested in using a foundational ontology, do ask: which one of them would be best to use for the task at hand? That is not an easy question to answer, and hitherto it required a developer to pore over all the documentation, weigh the pros and cons for the scenario, make an informed decision, know exactly why, and be able to communicate that. This bottleneck has been solved with the ONSET tool. Or, at least: we claim it has, and the user evaluation supports this claim.

In short, ONSET, the foundational ONtology Selection and Explanation Tool, helps the domain ontology developer in this task. Upon answering one or more questions and, optionally, adding scaling to indicate that some criteria are more important to you than others, it computes the most suitable foundational ontology for that scenario and explains why this is so, including reporting any conflicting answers (if applicable). The questions themselves are divided into five different categories (Ontology, representation language, software engineering properties, applications, and subject domain), and there are “explain” buttons to clarify terms that may not be immediately clear to the domain ontology developer. (There are a few screenshots at the end of this post.)
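To give a rough idea of what such a selection could look like, here is a minimal sketch in Python. It is not ONSET's actual algorithm (that one is described in the paper): the function name `select_ontology`, the scoring rule (a weighted count of matching answers), and the placeholder ontology names "A" and "B" are all assumptions made purely for illustration.

```python
from collections import defaultdict


def select_ontology(answers, support, weights=None):
    """Pick the candidate with the highest weighted count of matching answers.

    answers: list of (category, answer) pairs chosen by the user.
    support: dict mapping (category, answer) -> set of candidate names
             that satisfy that answer (hypothetical criteria table).
    weights: optional dict mapping category -> scaling factor (default 1.0).
    Returns (best, reasons, conflicts), or (None, [], []) if nothing matches.
    """
    weights = weights or {}
    scores = defaultdict(float)
    reasons = defaultdict(list)
    for category, answer in answers:
        for candidate in support.get((category, answer), ()):
            scores[candidate] += weights.get(category, 1.0)
            reasons[candidate].append((category, answer))
    if not scores:
        return None, [], []
    # Ties are broken alphabetically, an arbitrary choice for this sketch.
    best = max(sorted(scores), key=lambda name: scores[name])
    # A conflict is an answer that some other candidate satisfies
    # but the selected one does not.
    conflicts = [
        pair for pair in answers
        if support.get(pair) and best not in support[pair]
    ]
    return best, reasons[best], conflicts


# Placeholder criteria, not taken from ONSET's real comparison table.
SUPPORT = {
    ("ontology", "x"): {"A", "B"},
    ("language", "owl"): {"A"},
    ("domain", "bio"): {"B"},
}
ANSWERS = [("ontology", "x"), ("language", "owl"), ("domain", "bio")]

print(select_ontology(ANSWERS, SUPPORT))
print(select_ontology(ANSWERS, SUPPORT, weights={"domain": 3.0}))
```

In this toy setup, scaling up the subject domain category changes the outcome from "A" to "B", and the answer that the selected candidate does not satisfy is reported back as a conflict, which mirrors the kind of explanation the tool gives.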

Behind the scenes is a detailed comparison of the features of DOLCE, BFO, GFO, and SUMO, and an efficient algorithm. The latter and the main interesting aspects of the former are included in the paper; the complete set of criteria is available in a file on the ONSET webpage. You can play with ONSET using your real or a fictitious ontology development scenario after downloading the jar file. If you don’t have a scenario and can’t come up with one, try one of the scenarios we used for the user evaluation (also online). The user evaluation consisted of 5 scenarios/problems that the 18 participants had to solve; half of them used ONSET and half of them did not. On average, the ‘accuracy’ (computed from selecting the appropriate foundational ontology and explaining why) was 3 times higher for those who used ONSET compared to those who did not. The ONSET users also did it slightly faster.

Thus, ONSET greatly facilitates selecting a foundational ontology. However, I concede that from the Ontology (philosophy) viewpoint, the real research component is, perhaps, only beginning. Among others, what is the real effect of the differences between those foundational ontologies on ontology development, if any? Is one category of criteria, or an individual criterion, always deemed more important than others? Are there one or more ‘typical’ combinations of criteria, and if so, is there a single particular foundational ontology suitable, and if not, where and why are the current ones insufficient? In the case of conflicts, which criteria do they typically involve? ONSET clearly can be a useful aid in investigating these questions, but answering them is left to future work. Either way, ONSET contributes to taking a scientific approach to comparing and using a foundational ontology in ontology development, and provides the hard arguments why.

We’d be happy to hear your feedback on ONSET, be this on the tool itself or when you have used it for a domain ontology development project. Also, the tool is very easy to extend thanks to the way it is programmed, so if you have your own pet foundational ontology that is not yet included in the tool, you may like to provide us with the values for the criteria so that we can include it.

Here are a few screenshots: of the start page, questions and an explanation, other questions, and the result (of a fictitious example):

Startpage of ONSET, where you select inclusion of additional questions that don’t make any difference right now, and where you can apply scaling to the five categories.

Section of the questions about ontological commitments and a pop-up screen once the related “Explain” button is clicked.

Another tab with questions. In this case, the user selected “yes” to modularity, upon which the tool expanded the question so that a way of modularisation can be selected.

Section of the results tab, after having clicked “calculate results” (in this case, of a fictitious scenario). Conflicting results, if any, will be shown here as well, and upon scrolling down, relevant literature is shown.

References

[1] Khan, Z., Keet, C.M. ONSET: Automated Foundational Ontology Selection and Explanation. 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12). Oct 8-12, Galway, Ireland. Springer, LNAI, 15p. (accepted)

[2] Keet, C.M. The use of foundational ontologies in ontology development: an empirical assessment. 8th Extended Semantic Web Conference (ESWC’11), G. Antoniou et al (Eds.), Heraklion, Crete, Greece, 29 May-2 June, 2011. Springer, Lecture Notes in Computer Science LNCS 6643, 321-335.

[3] Borgo, S., Lesmo, L. The attractiveness of foundational ontologies in industry. In: Proc. of FOMI’08, Amsterdam, The Netherlands, IOS Press (2008), 1-9.
