Some ideas about what the Semantic Web will look like in 2022

Research into realizing the vision of the Semantic Web has been ongoing for a little over 10 years, and a call went out to ponder, daydream, fantasize, and think wishfully or fearfully about the question “What will the Semantic Web look like 10 years from now?” (SW2022). A selection of the many ideas will be presented on November 11, 2012, at the SW2022 workshop, held in conjunction with the 11th International Semantic Web Conference (ISWC’12) in Boston, USA.

For the curious: all SW2022 papers that will be presented are online on the SW2022 page (scroll down to about halfway on the web page for the programme). I picked out a few to summarise and comment on below; my selection is based on topic and/or author(s) and/or curious title, and I am a co-author of one of the papers.

Abraham Bernstein will present the first main paper [1], on the “global brain Semantic Web”, where the Internet is going to serve as the analogue of a brain’s neurons. The ‘global brain’ is used as a metaphor (or revamped old-fashioned AI?) for “distributed interleaved human-machine computation”, or, in fancier, more marketable terms, now also called “collective intelligence” and “social computing”. In short: put the human in the Semantic Web, both as knowledge provider and as educated user. Bernstein zooms in on the need to be able to manage the “motivational diversity, cognitive diversity, and error diversity” that comes with realizing this global brain Semantic Web. Alessandro Oltramari’s vision of a cognitive Semantic Web [2] is quite similar to Bernstein’s: the Semantic Web is tuned to the individual user, and “it will be an emergent social network of human and artificial cognitive agents interacting in a hybrid environment, where the distinction between physical and virtual will be superseded by the very nature of the entities populating it, namely knowledge objects and knowledge agents” [2]. Compared to these, our vision of interoperability is somewhat more humble.

Oliver Kutz will present our paper [3] about interoperability among ontologies, to be realized with the Distributed Ontology Language (DOL) that is currently in the process of standardisation at ISO (scheduled to be finalized by 2015). DOL is a metalanguage for distributed ontologies that may be represented in different ontology languages (some of the technical details can be found in a recent paper that won the best paper award at FOIS’12 [4], and a few examples are described in [5]). Overall, then, it would be nice if, by 2022, we have solved the interoperability issues not only among data, but also among the ‘models’ (ontologies, service descriptions, etc.) and, especially, their logic-based representation languages. For instance: seamlessly linking knowledge that is represented partially in OWL 2 DL and partially in Common Logic, or leaving an OBO ontology as it is yet declaring more semantics (e.g., cardinality constraints, property chains) ‘around’ it in a more expressive language for those who need it, as well as advanced features for modularization; these are all realistic usage scenarios for DOL (see the sketch below). Clearly, all this will need some tool support. Initial tools do exist—Hets for reasoning over heterogeneous ontologies and the Ontohub ontology repository—but more can and will have to be done to realize full interoperability.
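
To give a flavour of such a scenario, here is a minimal sketch in a DOL-style notation. Mind that this is illustrative only: the ISO standard is not finalized, so the exact keywords are an assumption loosely based on the examples in [4, 5], and the ontology, class, and translation names are made up.

```
%% Hypothetical sketch only: DOL keywords approximated from [4,5];
%% the standard's final syntax may well differ.
distributed-ontology PartonomyExample

logic OWL syntax Manchester
ontology BasicAnatomy =        %% a lightweight (e.g., OBO-derived) OWL ontology
  Class: Cell
  Class: Organ
  ObjectProperty: partOf
end

logic CommonLogic
ontology AnatomyAxioms =       %% extra semantics declared 'around' it
  BasicAnatomy with translation OWL2CommonLogic
then
  %% e.g., parthood transitivity stated in full first-order logic (CLIF)
  (forall (x y z)
    (if (and (partOf x y) (partOf y z))
        (partOf x z)))
end
```

The point is the seams: the OWL part stays usable by plain OWL tools, while those who need the stronger axioms reason over the Common Logic part.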

The paper on the Semantic Web needs (vision?) for cultural heritage [6] offers nothing I did not already know. South Africa has its own programme in that area—albeit called “indigenous knowledge management”, not “cultural heritage”—and we did our own requirements analysis some time ago already [7, 8]. Our list of requirements matches the one by Vavliakis et al., and we have a technology maturity analysis, a set of OWL requirements, and actual use cases from the domain experts and users of the Department of Science & Technology’s National Recordal System project for indigenous knowledge management (about which I blogged before). That the topic will receive attention also at SW2022 hopefully increases the chance that those requirements will be investigated further, solved, and realized, which, in turn, will improve the software developed here so that, ultimately, people will benefit from it all.

Mutharaju [9] emphasizes the need for connectivity, personalization, and abstraction. Regarding the latter, he notes that “[t]here would be a need to provide multiple (and higher) levels of abstractions and facilitate drill-down mechanisms”. Yay! Maybe my work on granularity (among others, [10]) will find its way into implementations after all. Also, Mutharaju thinks that the Semantic Web may be used for the benefit of the environment (e.g., calculating better traffic flow, using sensor data, etc.).

A short paper scheduled for the panel session is entitled “The rise of the verb” [11], which I found a curious title: verbs are taken into account already, where a verb’s ontological foundation is, in the Semantic Web context, represented as an object property in OWL or reified under, say, DOLCE’s Perdurant (see the sketch below). Considering the contents of the paper, a more suitable title could have been “action in the Semantic Web”: the paper’s introduction suggests adding something executable to the Semantic Web by means of JavaScript, but where the instruction is specified at the knowledge level. Heiko Paulheim and Jeff Pan also argue in favour of language extensions, in particular so as to be able to handle imprecision/uncertainty [12].
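
Returning to the verb: the two usual options for a verb like ‘teaches’ can be sketched in OWL’s Manchester syntax as follows, where the class and property names are mine, made up for illustration, and the lines starting with # are glosses for the reader rather than part of the syntax.

```
# Option 1: the verb as a binary relation (an OWL object property)
ObjectProperty: teaches
    Domain: Lecturer
    Range: Course

# Option 2: the verb reified as a class, subsumed under a foundational
# ontology category such as DOLCE's Perdurant
Class: Teaching
    SubClassOf: Perdurant
    SubClassOf: hasParticipant some Lecturer
    SubClassOf: hasParticipant some Course
```

The reified option trades syntactic simplicity for the ability to say more about the event itself (its duration, location, and so on).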

Vander Sande and co-authors present a rather bleak vision of the Semantic Web [13]: it could endanger humanity. They spend the full 6 pages highlighting the myriad dangers and possible misuses of Semantic Web technologies. Among others: ‘semantic spam’ instead of the dumb variety we have gotten used to, where spammers take advantage of the Linked Open Data cloud and otherwise linked social network data to make the spam look more believable; polluting the LOD cloud through link spoofing; identity theft and provenance manipulation; and the Web of Things for autonomous computerized weaponry. One could also have added a follow-through of the saying that ‘knowledge is power’, where better and scaled-up knowledge management facilitates obtaining more power (and power corrupts, and absolute power corrupts absolutely). All this, in turn, goes back to the philosophical issues regarding responsibility in research, engineering, and technology, and whether some field is inherently bad, neutral, or good, or whether the bad pops up only with certain application scenarios in which the technologies could be used. For the Semantic Web, I think it is only the latter, but you may try to convince me otherwise.

Although I won’t be attending, it’s appreciated that the papers are online already, and I can imagine there will be some lively discussions at the SW2022 workshop.

References

[1] Abraham Bernstein. The Global Brain Semantic Web – Interleaving Human-Machine Knowledge and Computation. SW2022, Boston, Nov 11, 2012.

[2] Alessandro Oltramari. Enabling the cognitive Semantic Web. SW2022, Boston, Nov 11, 2012.

[3] Oliver Kutz, Christoph Lange, Till Mossakowski, C. Maria Keet, Fabian Neuhaus, Michael Grüninger. The Babel of Semantic Web tongues – in search of the Rosetta Stone of interoperability. SW2022, Boston, Nov 11, 2012.

[4] Till Mossakowski, Christoph Lange, Oliver Kutz. Three Semantics for the Core of the Distributed Ontology Language. In Michael Grüninger (Ed.), FOIS 2012: 7th International Conference on Formal Ontology in Information Systems, Graz, Austria.

[5] Christoph Lange, Till Mossakowski, Oliver Kutz, Christian Galinski, Michael Grüninger, Daniel Couto Vale. The Distributed Ontology Language (DOL): Use Cases, Syntax, and Extensibility. Terminology and Knowledge Engineering Conference (TKE’12), Madrid, Spain.

[6] Konstantinos N. Vavliakis, Georgios Th. Karagiannis and Pericles A. Mitkas. Semantic Web in Cultural heritage after 2020. SW2022, Boston, Nov 11, 2012.

[7] Thomas Fogwill, Ronell Alberts, C. Maria Keet. The potential for use of semantic web technologies in IK management systems. IST-Africa Conference 2012. May 9-11, Dar es Salaam, Tanzania.

[8] Ronell Alberts, Thomas Fogwill, C. Maria Keet. Several Required OWL Features for Indigenous Knowledge Management Systems. 7th Workshop on OWL: Experiences and Directions (OWLED 2012). 27-28 May, Heraklion, Crete, Greece. CEUR-WS Vol-849. 12p.

[9] Raghava Mutharaju. How I would like Semantic Web to be, for my children. SW2022, Boston, Nov 11, 2012.

[10] C. Maria Keet. A formal theory of granularity. PhD Thesis, KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy. 2008.

[11] Paul Groth. The rise of the verb. SW2022, Boston, Nov 11, 2012.

[12] Heiko Paulheim and Jeff Z. Pan. Why the Semantic Web should become more imprecise. SW2022, Boston, Nov 11, 2012.

[13] Miel Vander Sande, Sam Coppens, Davy Van Deursen, Erik Mannens and Rik Van De Walle. The terminator’s origins or how the Semantic Web could endanger humanity. SW2022, Boston, Nov 11, 2012.

MOOCs, computer-based teaching aids, and taking notes

This post initially started out as being directed toward the current COMP314 Theory of Computation students at UKZN, who, like last year’s students, are coming to terms not only with the subject domain, but also with the fact that I write the lecture notes on the blackboard (which is green, btw). The post ended up containing some general reflections on the use/non-use of computer-based teaching aids and, by extension, the Massive Open Online Courses (MOOCs), with illustrations taken mainly from the ToC course.

First a note to the students: I am aware most of you don’t really like taking notes during the lectures, and quite a few still do not do so—despite knowing that you’re being served a summary of the bulky textbook, saving you the effort of summarising it yourselves. But those who do take notes, or at least rewrite the notes from someone else or fill the gaps in their own notes, go much more quickly through the exercises than those who do not. For instance, I occasionally gave as an exercise an example that had been done in class on the board already. The diligent note-takers’ response is along the lines of “yeah, that was trivial, and, by the way, we did that already in class, here’s the solution. Give me a real challenge!”, whereas those who did not take notes start the ‘exercise’ from scratch and do not even recollect that we did it in class. In the end (by observation, not by the scientific rigour of a double-blind experiment), the former group completes more, and more advanced, exercises in the same amount of time or less. Maybe I should shout that from the rooftops.

Taking notes means you listen, read, and write, therewith processing the knowledge that I’m trying to get from my brain into yours. Doing something actively, versus passively listening (or, worse, letting it go into one ear and immediately releasing it through the other, or dozing off), makes you think about the material at least once, which then saves time later on because you’ll remember some, if not all, of it. In addition, imagine how fast I could go through the material if I did not have to take the time to write it on the board, but could just click the down arrow: I would have the opportunity to cover even more material than I already do, whereas you would hardly have time to think about the matter at all, let alone ask questions. Besides, note-taking is a good exercise in general, because there often won’t be any handouts prepared for you once you’re out there in industry, so you might as well practice for those situations now. (See also Mole’s column on why to chalk it up on the blackboard.)

Nonetheless, unlike, say, 15-20 years ago, when we did not know any better than to read the material beforehand and take notes during the lectures, in the present-day slides era quite a few students are not happy with the note-taking effort and the ‘quality of the teaching aids provided’, as they are used to getting the slides in other courses. I distribute slides, too, for a course like ontologies and knowledge bases, because there is no textbook (I even wrote a 150-page set of lecture notes), but for Theory of Computation we use a textbook (Hopcroft, Motwani, and Ullman) that has the details at undergraduate level. Last year’s students were not happy with the note-taking either. Some students searched online back then, and I vividly remember one student sharing his opinion about that with me, unsolicited: “ma’am, we searched online for better material, but it’s all just as bad as yours! So we won’t hold it against you”. Well, thanks.

Closely related to that ‘searching around online’, however, are some possibly less pleasant side effects. I will mention two. First, last year some students looked up alternative explanations for the diagonalization language and for some things being incomputable, and came across a version of the barber paradox. Strictly speaking, that is not an alternative explanation of the same thing; or: if you are going to use it, it is a wrong analogy. I’ll explain why in class in a few weeks, when we’ll cover chapters 8 and 9 of HMU.
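
For the impatient, the real argument (as in HMU chapter 9) is a genuine diagonalization, sketched below in brief; the enumeration pairs the i-th binary string with the i-th Turing machine under a canonical encoding.

```
% Enumerate all binary strings w_1, w_2, ... and all TMs M_1, M_2, ...
% The diagonalization language:
\[
  L_d = \{\, w_i \mid w_i \notin L(M_i) \,\}
\]
% If some TM M_j accepted L_d, then
\[
  w_j \in L_d \iff w_j \notin L(M_j) = L_d,
\]
% a contradiction; hence no TM accepts L_d, i.e., L_d is not
% recursively enumerable. No barber needed.
```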

Second, this year I have seen some of the non-note-takers watching a YouTube video on the pumping lemma and on CFGs during the exercises in the lab; such repeats take up extra time. The note-takers, on the other hand, flicked through their pages in a fraction of that time and hence were ahead in the exercises, simply because they had used the lecture time more effectively. (And I generously assume that what was presented in the YouTube video was correct, which need not be the case; see the previous point.)
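
(For those checking their notes: the lemma in question, in the form we use from HMU chapter 4, reads as follows.)

```
% Pumping lemma for regular languages (as in HMU, ch. 4):
\[
  L \text{ regular} \;\Rightarrow\; \exists n \;\, \forall w \in L
  \text{ with } |w| \ge n:\; w = xyz \text{ such that }
\]
\[
  |xy| \le n, \qquad |y| \ge 1, \qquad
  xy^k z \in L \text{ for all } k \ge 0.
\]
```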

This does not mean there is no room for improvement regarding teaching aids for Theory of Computation, which I wrote about last year. This year I tried the AutoSim automata simulator and the CFG Grammar Editor again, which still do not have much uptake by the students, afaik, and I’m trying out Gradiance. The Gradiance system is an online MCQ question bank with explanations of the answers and hints in case the answer given was incorrect (i.e., turning the gained experience and knowledge about common conceptual mistakes into an educational feature), and one can assign homework and grade assignments with it. The dedication to do the (optional) homework has been dwindling since the 5th week into the semester, but the automatic grading and the option of self-study for the interested and/or determined-to-pass students can be great (the system is almost cheat-proof, but not entirely).

To get to the point after some meandering: some types of computer-based teaching aids can be useful, just not all of them all of the time. True, computing looks at what can be automated, and what can be automated efficiently, and so I could apply that notion to everything—up to totally automated course offerings, which is the direction the MOOCs are going in. However, computing also concerns solving problems, and being able to recognise when a problem is best solved by computation and when other solutions may be more appropriate. For instance, low pass rates may be considered a problem, but this does not imply that e-learning is the solution to it; non-determinism and epsilon-transitions are concepts that are apparently not easy to grasp, and the simulator is more illustrative than my coloured chalk trying to simulate a run on the blackboard; and in the pre-CMS era, course admin was a chore and there were, perhaps, instances of ‘lost’ assignment or project submissions (though that did not happen when I was a student in the ’90s), which Moodle alleviates and prevents, respectively. So, software can indeed solve some problems.
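
To illustrate what the simulator animates better than chalk does: with an epsilon-transition, the machine can be in several states at once without reading any input, as in the following mini example (the state names are made up for illustration).

```
% Suppose an NFA has q0 --a--> q1 and q1 --eps--> q2, with no
% epsilon-transitions out of q0. Reading 'a' from q0 then puts the
% machine in both q1 and q2 simultaneously:
\[
  \hat{\delta}(q_0, a) \;=\; \mathrm{ECLOSE}(\delta(q_0, a))
  \;=\; \mathrm{ECLOSE}(\{q_1\}) \;=\; \{q_1, q_2\}.
\]
```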

This brings me to the other end of the spectrum: the Massive Open Online Courses, or MOOCs; yes, there is even a MOOC for Theory of Computation, by Ullman himself. Which problems do they purport to solve, and do they? Despite my reading a lot of pop articles about it over the past year (see, e.g., the feature series from the Chronicle of Higher Education on MOOCs), it still is not clear to me which lecturing issues they are the solution to. One recurring argument in the flurry of news articles is that it is great because it gives the poor sods around the world some crumbs from the elite universities; well, the ‘poor’ with a good Internet connection and the money to pay for the data transfer, that is, not Joe and Joanne Soap in Chesterville, Soweto, etc., and the interested potential student has to have a good command of the English language as well.

Another recurring argument for the MOOCs is that one can learn from the best, implying, or even explicitly stating, en passant, that the MOOC lecturers are by definition better teachers than anyone else. Sure, there are lecturers who teach stuff that is wrong, but how widespread is that, and what is the cause of it? If anyone has hard data about such claims, please let me know. For the sake of argument, let’s assume mistakes are widespread, and that this is because we, as non-elite university lecturers, are undereducated and incompetent teachers. Is a MOOC the solution? Or maybe giving lecturers the time to learn their material and prepare the lectures better, i.e., local capacity building? Not all of us are undereducated. People who have taught a course many times, do research in the field, and possibly even have written textbooks on the topic tend to be better lecturers, because they generally have reflected more on the teaching and have already come across all the possible conceptual mistakes students make, can anticipate them, and therewith even prevent them from happening, at least to a larger extent than a novice lecturer can; and those people are not all and only at Stanford, Harvard, and MIT.

Second, the MOOCs—for the time being, at least—are based on a push-mechanism of knowledge transfer, yet lecturing consists of much more than talking at the front of the classroom. There is interaction with questions and answers, there is context, and so on. For instance, regarding motivational context: introducing data mining in a database course with a story about the nearby Pick ‘n Pay at Westwood mall that the on-campus students go to, and how, once they have signed up for their customer loyalty card, PnP will find out they are students even if they did not say so on the application form. Or the problems with Johannesburg’s integrated services delivery management system as a real-life example of data integration issues and the urgent need for educated South African computer scientists to solve them. Or, given that UKZN’s CS students have had a lot of Java programming in the first and second year, showing them the Java language as a CFG in the theory of computation course (a flavour of which is sketched after this paragraph). In addition, sometimes to the pleasant, and sometimes frightened, surprise of my students, I actually know about half of the roughly 70 registered students by name, and know who can be prodded into action only when an exercise is marked ‘super hard’, who does not want to know the answer straight away but just wants a little hint so as to find the solution him/herself, who is determined to solve it on their own, who is insecure and needs a bit of encouragement, and so on.
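
As to that Java-as-a-CFG point: what is meant is a grammar fragment along the following lines, which is a simplified, illustrative excerpt rather than the official grammar from the Java Language Specification.

```
% Simplified, illustrative CFG fragment for Java-like statements
% (not the actual Java grammar):
\begin{align*}
  Stmt &\to \texttt{if (}\, Expr \,\texttt{)}\; Stmt\; \texttt{else}\; Stmt
        \mid \texttt{while (}\, Expr \,\texttt{)}\; Stmt
        \mid \texttt{\{}\; StmtList\; \texttt{\}} \\
  StmtList &\to Stmt\; StmtList \mid \varepsilon
\end{align*}
```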

To have a MOOC suit you, you would have to have done the same prior courses (unless the MOOC is only an introduction to topic x), you would have to not care about the absence of a few context/motivational stories, your learning style would have to match that of the MOOC, and you would have to be very disciplined on your own. To name but a few potential hurdles, put in a positive light.

Further, there are the exercises and tests. There are some new tools for the automatic grading of exercises and assignments (Gradiance would fit here, I suppose, but not the regular textbook exercises), or semi-automatic grading with software+TA, and virtual ‘MOOC study groups’ are popping up in social media. If you don’t know any better, that is probably great. Like thinking pizza is tasty all around the world, until you experience how it tastes in Naples. Such online groups are not bad—I participated in them myself when I was studying at the Open University UK, and it is better than no contact with other students at all—but they do not compare with face-to-face meetings with fellow students, where the lectures are discussed, notes are compared and brushed up, exercises are discussed and solved, peer explanation happens, students motivate one another, and so on.

Overall, then, if MOOCs are going to become the standard, the world will be poorer for it: for sidelining competent lecturers, for de-incentivising weaker lecturers from acting on their responsibility to brush up their knowledge and skills, for de-contextualising and hamburgerising courses, for impoverishing the academic learning environment by narrowing down education to a mere push-mechanism of knowledge transfer, and for dehumanizing students into boring conformity (eenheidsworst, literally ‘unity sausage’, in Dutch). Add to that mix the cultural imperialism, and we are well on our way to a ‘brave new world’.

In the meantime, participate in the lectures and process the information, and take notes. I don’t think MOOCs will kill the regular universities, but imagine if you really were the last generation to go to (or work at) a real university… Exploit the advantages that a face-to-face university offers you, and cherish it while it lasts!