Design rationale and overview of the African Wildlife tutorial ontologies

There are several tutorial ontologies, which typically focus on illustrating one or two aspects of ontology development, notably language features and automated reasoning. This may suffice for one’s aims, but for an ontology engineering course, one would need to be able to illustrate a myriad of development factors and devise exercises for a wider range of tasks of ontology development. For instance, to illustrate the use of ontology design patterns, competency questions, foundational ontologies, and science-based modelling practices, neither of which is addressed easily by the popular tutorial ontologies (notably: wine and pizza), perhaps because they predate most of the advances made in ontology engineering research. Also, I have noticed that my students replicate examples from the exercises they carry out and from inspecting popular and easy-to-find ontologies. Marking the practical assignments, I got to see sandwich and ice cream and burger ontologies with toppings and value partitions, and software and mobile phone ontologies where laptop models are instances rather than classes. Not providing good and versatile examples holistically, causes the propagation of sub-optimal ontology development at least in the exercises, which then also may affect negatively the development of an operational domain ontology that the graduates may have to develop later on.

I’ve been exploring alternatives and variants over the past 11 years in the ontology engineering courses that I have taught yearly to about 8-40 students/year. In an attempt to systematise and possibly generalise from that, I’ve identified 22 requirements that contribute to a good tutorial ontology, which concern the suitability of the subject domain (7 factors), the ease of demonstrating logics and reasoning tasks (7), and assistance with demonstrating engineering aspects (8). Its details are described in a technical report [1]. I don’t claim that it’s an exhaustive list, but that it is one that may help someone to develop their own tutorial ontology in a fun or interesting topic if they so wish—after all, not everyone is interested in pizzas, wines, African wildlife, pets, shirts, a small university, or Robert Stevens’ family.

I’ve tried out a variety of extant tutorial ontologies as well as a range of versions of the African Wildlife Ontology (AWO) over the years (early experiences), eventually settling for a set of 14 versions, all the way from the example from the Primer [2] to DOLCE- and BFO-aligned to translated in several languages, and some with possible answers to some of the exercises. A graphical rendering of the main classes and relations is shown in the following figure:

The versions of the AWO are summarised in the following table, which is also mentioned as annotation in the OWL files.

 

The AWO meets a majority of the 22 requirements, is mature by now, and it has been used yearly in an ontology engineering course or tutorial since 2010. Also, it is links up with my ontology engineering textbook with relevant examples and exercises. The AWO provides a wide range of options concerning examples and exercises for ontology engineering well beyond illustrating only logic features and automated reasoning. For instance, it assists in demonstrating tasks about ontology quality, such as alignment to a foundational ontology and satisfying competency questions, versioning, and multilingual ontologies. For instance, it is easier to demonstrate alignment of a class Animal to DOLCE’s (Non-Agentive) Physical Object than, say, debating what Algorithm aligns with or descend into political debates on the gender binary or what constitutes a family. One can use the height or the colours of the plants and animals to discuss how to model attributes as qualities or dependent entities cf. OWL’s data properties or an artificial ValuePartition. Declare, say, de individual lion simba as an instance of Lion, rather than the confusion regarding grape varieties. Use intuitively obvious disjointness between animals and plants, and subsequently easy catches on sensitising modellers to the far-reaching effects of declaring domain and range axioms by first asserting that animals eat animals, and then adding that carnivorous plants eat insects. In addition, it links up easily to topics for ontology integration activities, such as with biodiversity data, wildlife trade, and tourism to create, e.g., an OBDA system with freely available data (e.g., taken from here) or an ontology-enhanced website for an organisation that offers environmentally sustainable safaris. More examples of broad usage options are described in section 2.3 in the tech report.

The AWO is freely available under a CC-BY licence through the textbook’s webpage at https://people.cs.uct.ac.za/~mkeet/OEbook/ in this folder. A more comprehensive description of the requirements, design, and content is described in a technical report [1] for the time being.

 

References

[1] Keet, CM. The African Wildlife Ontology tutorial ontologies: requirements, design, and content. Technical Report 1905.09519. 23 May 2019. https://arxiv.org/abs/1905.09519.

[2] Antoniou, G., van Harmelen, F. A Semantic Web Primer. MIT Press, USA. 2003.

From ontology verbalisation to language learning exercises

I’m aware that to most people ‘playing with’ (investigating) ontologies and isiZulu does not sound particularly useful on the face of it. Yet, there’s the some long-term future music, like eventually being able to generate patient discharge notes in one’s own language, which will do its bit to ameliorate the language barrier in healthcare in South Africa so that patients at least will adhere to the treatment instructions a little better, and therewith receive better quality healthcare. But benefits in the short-term might serve something as well. To that end, I proposed an honours project last year, which has been completed in the meantime, and one of the two interesting outcomes has made it into a publication already [1]. As you may have guessed from the title, it’s about automation for language learning exercises. The results will be presented at the 6th Workshop on Controlled Natural Language, in Maynooth, Ireland in about 2 weeks time (27-28 August). In the remainder of this post, I highlight the main contributions described in the paper.

First, regarding the post’s title, one might wonder what ontology verbalisation has to do with language learning. Nothing, really, except that we could reuse the algorithms from the controlled natural language (CNL) for ontology verbalisation to generate (computer-assisted) language learning exercises whose answers can be computed and marked automatically. That is, the original design of the CNL for things like pluralising nouns, verb conjugation, and negation that is used for verbalising ontologies in isiZulu in theory [2] and in practice [3], was such that the sentence generator is a detachable module that could be plugged in elsewhere for another task that needs such operations.

Practically, the student who designed and developed the back-end, Nikhil Gilbert, preferred Java over Python, so he converted most parts into Java, and added a bit more, notably the ‘singulariser’, a sentence scrabble, and a sentence generator. Regarding the sentence generator, this is used as part of the exercises & answers generator. For instance, we know that humans and the roles they play (father, aunt, doctor, etc.) are mostly in isiZulu’s noun classes 1, 2, 1a, 2a, or 3a, that those classes do not (or rarely?) have non-human nouns and generally it holds for all humans and their roles that they can ‘eat’, ‘talk’ etc. This makes it relatively easy create a noun chain and a verb chain list to mix and match nouns with verbs accordingly (hurrah! for the semantics-based noun class system). Then, with the 231 nouns and 59 verbs in the newly constructed mini-corpus, the noun chain and the verb chain, 39501 unique question sentences could be generated, using the following overall architecture of the system:

Architecture of the CNL-driven CALL system. The arrows indicate which upper layer components make use of the lower layer components. (Source: [1])

From a CNL perspective as well as the language learning perspective, the actual templates for the exercises may be of interest. For instance, when a learner is learning about pluralising nouns and their associated verb, the system uses the following two templates for the questions and answers:

Q: <prefixSG+stem> <SGSC+VerbRoot+FV>
A: <prefixPL+stem> <PLSC+VerbRoot+FV>
Q: <prefixSG+stem> <SGSC+VerbRoot+FV> <prefixSG+stem>
A: <prefixPL+stem> <PLSC+VerbRoot+FV> <prefixPL+stem>

The answers can be generated automatically with the algorithms that generate the plural noun (from ‘prefixSG’ to ‘prefixPL’) and add the plural subject concord (from ‘SGSC’ to ‘PLSC’, in agreement with ‘prefixPL’), which were developed as part of the GeNI project on ontology verbalization. This can then be checked against what the learner has typed. For instance, a generated question could be umfowethu usula inkomishi and the correct answer generated (to check the learner’s response against) is abafowethu basula izinkomishi. Another example is generation of the negation from the positive, or, vv.; e.g.:

Q: <PLSC+VerbRoot+FV>
A: <PLNEGSC+VerbRoot+NEGFV>

For instance, the question may present batotoba and the correct answer is then abatotobi. In total, there are six different types of sentences, with two double, like the plural above, hence a total of 16 templates. It is not a lot, but it turned out it is one of the very few attempts to use a CNL in such way: there is one paper that also will be presented at CNL’18 in the same session [4], and an earlier one [5] uses a fancy grammar system (that we don’t have yet computationally for isiZulu). This is not to be misunderstood as that this is one of the first CNL/NLG-based system for computer-assisted language learning—e.g., there’s assistance in essay writing, grammar concept question generation, reading understanding question generation—but curiously very little on CNLs or NLG for the standard entry-level type of questions to learn the grammar. Perhaps the latter is considered ‘boring’ for English by now, given all the resources. However, thousands of students take introduction courses in isiZulu each year, and some automation can alleviate the pressure of routine activities from the lecturers. We have done some evaluations with learners—with encouraging results—and plan to do some more, so that it may eventually transition to actual use in the courses; that is: TBC…

 

References

[1] Gilbert, N., Keet, C.M. Automating question generation and marking of language learning exercises for isiZulu. 6th International Workshop on Controlled Natural language (CNL’18). IOS Press. Co. Kildare, Ireland, 27-28 August 2018. (in print)

[2] Keet, C.M., Khumalo, L. Toward a knowledge-to-text controlled natural language of isiZulu. Language Resources and Evaluation, 2017, 51(1): 131-157.

[3] Keet, C.M. Xakaza, M., Khumalo, L. Verbalising OWL ontologies in isiZulu with Python. The Semantic Web: ESWC 2017 Satellite Events, Blomqvist, E. et al. (eds.). Springer LNCS vol. 10577, 59-64.

[4] Lange, H., Ljunglof, P. Putting control into language learning. 6th International Workshop on Controlled Natural language (CNL’18). IOS Press. Co. Kildare, Ireland, 27-28 August 2018. (in print)

[5] Gardent, C., Perez-Beltrachini, L. Using FB-LTAG Derivation Trees to Generate Transformation-Based Grammar Exercises. Proc. of TAG+11, Sep 2012, Paris, France. pp117-125, 2012.

An Ontology Engineering textbook

My first textbook “An Introduction to Ontology Engineering” (pdf) is just released as an open textbook. I have revised, updated, and extended my earlier lecture notes on ontology engineering, amounting to about 1/3 more new content cf. its predecessor. Its main aim is to provide an introductory overview of ontology engineering and its secondary aim is to provide hands-on experience in ontology development that illustrate the theory.

The contents and narrative is aimed at advanced undergraduate and postgraduate level in computing (e.g., as a semester-long course), and the book is structured accordingly. After an introductory chapter, there are three blocks:

  • Logic foundations for ontologies: languages (FOL, DLs, OWL species) and automated reasoning (principles and the basics of tableau);
  • Developing good ontologies with methods and methodologies, the top-down approach with foundational ontologies, and the bottom-up approach to extract as much useful content as possible from legacy material;
  • Advanced topics that has a selection of sub-topics: Ontology-Based Data Access, interactions between ontologies and natural languages, and advanced modelling with additional language features (fuzzy and temporal).

Each chapter has several review questions and exercises to explore one or more aspects of the theory, as well as descriptions of two assignments that require using several sub-topics at once. More information is available on the textbook’s page [also here] (including the links to the ontologies used in the exercises), or you can click here for the pdf (7MB).

Feedback is welcome, of course. Also, if you happen to use it in whole or in part for your course, I’d be grateful if you would let me know. Finally, if this textbook will be used half (or even a quarter) as much as the 2009/2010 blogposts have been visited (around 10K unique visitors since posting them), that would mean there are a lot of people learning about ontology engineering and then I’ll have achieved more than I hoped for.

UPDATE: meanwhile, it has been added to several open (text)book repositories, such as OpenUCT and the Open Textbook Archive, and it has been featured on unglue.it in the week of 13-8 (out of its 14K free ebooks).

Yet another software-based clicker system: LetsThink – GoAnswer

Every now and then, I dabble into trying to improve on teaching, which in this case of the post’s topic, is in the form of an implementation project. Yes, one of those with working software as result, thanks to the programmer and CS honours student Yaseen Hamdulay who implemented it. In short: we now have a software-based audience response system that satisfies some pertinent requirements for its use in lectures: math support, figures, and diacritics, and one-click question release and showing the results so as not to hold up the lecture. It will be presented at UCT’s upcoming Teaching & Learning Conference on Oct 22-23.

The context

Peer instruction (PI) enables students to learn from each other. Unlike some other educational interventions, PI has been shown to improve the final grade by up to 34-45%, regardless the discipline it is used (computer science, genetics, zoology, psychology—you name it). On top of that, it generally receives positive feedback from students, because it makes the lectures more engaging, or in any case at least less dull.

From the practical side on how to go about using it in class, there are four principal options: expensive hardware-based ‘clickers’, software-based audience response systems (ARSs) with students’ smartphones or laptops, card and picture-taking (image processing), and none of that through indicating the concept test’s answer with one’s fingers on one’s chest. Regarding resources, the software-based ARS sits somewhere in the middle, which suits our setting well. There are quite a few software-based ARSs (see, e.g., this review). However, upon analysis, they all have issues hampering effective classroom use. For instance, they may not have a single-question release or are for-payment only (e.g., Socrative, Google Forms, Pinnion, and eClicker), or have an impractical results display—they make money from corporate accounts, so the tools are geared for that. The major drawbacks of all these systems are that at least the free versions, if not all, have very limited question setting and answer options, mostly having just plaintext. So, no pictures in the question, answers can be only 100 or 140 characters short, let alone displaying mathematical symbols. Perhaps surprisingly in the international arena, they don’t do diacritics either, even though multiple natural languages have them. For the lectures, this ends up really clumsily and cumbersome, where such questions also must have projected on the screen an ‘offline’ version and then toggling with the screen of the voting progress and results.

The problems

Thus, the main problem is that there is no software-based ARS that contains basic required features for smooth use of PI in the classroom irrespective of one’s discipline, hampering the uptake of this active learning intervention. The current software-based ARSs are prohibitive in uptake of PI in the classroom, yet in our—and many others—medium-level resource setting, extensive use of hardware clickers is not a sustainable option, nor are structural expenses of the user-based payment schemes an option, for they effectively discourage broad participation due to the by-participant payment scheme.

The solution

To address these issues, we developed an ARS that has the following features: support for figures, mathematical formulae, proper HTML text for display of diacritics, one-click question release, showing voting progress, results, question reset, saving of the voting results of a session (among others). This is a web-accessible ARS that can cater for various disciplines within one software system and that is simple and fast to work with.

The project page has several screenshots and the link to the tool. It requires a simple registration (at this stage, we don’t bother with email verification—so don’t lose your password), but is then also immediately usable.

To brighten up this post, here are some screenshots; others are on the project page and some have explanations in a previous post on using PI in a networks course, like the one that has maths and a picture in one.

Screenshot of a question with a picture:

Question with a figure

Question with a figure

Note that the lecturer controls (bottom left, and top-left to go back) are intentionally kept in small font. It’s easily readable from the lecturer’s machine, but it’s irrelevant to read for the audience.

Screenshot of a question with some maths, with first the interface for creating the question, then the one projected, and finally the student’s interface when voting. Note that this is just to give an impression that you can do it, not that it is in any way a sensible concept test. Anything that can be done with the latex plugin for html (MathJax) can be used in the question and in the answer box, but thus not your own macros. (For the curious: the answer to the question is A (read the problem).)

Interface for creating a question

Interface for creating a question

The question rendered in the presentation interface

The question rendered in the presentation interface

Small screenshot of the voting interface

Small screenshot of the voting interface

Screenshot of a question with diacritics:

Some question with diacritics.

Desktop/laptop interface of the question to answer, after having entered the question ID:

Section of the lecturer/admin interface, where you can choose to group questions

Exported results from the voting imported into a spreadsheet:

LetsThink-GoAnswer has been beta-tested with my networks students last semester, has been demo-ed in a CILT seminar on ARSs, and the occasional lecturer already is/will be experimenting with it. The code will be made available on GitHub soon. In the meantime, please don’t hesitate to contact me if you have any questions, and we’d love to hear your feedback!

Peer instruction with a computer networks course

Last year, I attempted to use Peer Instruction (PI) in the networks course, hoping that it would correlate with class attendance (it didn’t; see [1] and related post), but it was useful for various reasons nevertheless. So, I tried again this year and, imho, with better questions (the results are ambivalent regarding class attendance—TBC). But what are “better questions”? There’s no such thing as a ‘concept inventory’ for networks, like there is for physics where PI was applied to first [2]. The PI 4 CS website has several sets of PI questions and the ones I checked for theory of computation are really good (discussed here), but there are none for networks.

So I played around from scratch. One issue with last year’s installment was that software-based PI didn’t do the job, because none of the freely available tools have the feature to show figures, or have too many other shortcomings to use effectively in the classroom [1], therewith limiting my options. One of our students developed a funky PI tool this year (that I will write about later); suffice to say, I can do questions with math display and figures now, so that that issue was solved. I’d searched a bit on guidance on designing PI questions, to no avail, until after the course’s installment when preparing for this post, one useful result turned up, being Julie Schell’s blog post over at teach.com (a search engine indexing lag issue, it appears). Mine seem to be mostly fine with respect to those suggestions, so I will give some examples below.

Let me start with a PI question that essentially also returned in the exam, though it was reworded as a non-MCQ question (yes, just to rub it in to those students who did not attend class). A screenshot of the PI tool is shown below, where the graph comes straight from the course slides. The setting is that it is posed right after students try to come to grips with flow control (TCP buffer at the receiver) versus congestion (in the network), and within congestion, the effects of lost packets (that have to be resent, but only 1 arrives) versus delayed packets (packet resent, and both arrive at the receiver). What is the answer?

Screenshot of a PI question in our own software-based PI tool

Screenshot of a PI question in our own software-based PI tool

Clearly, the first distinction is to eliminate TCP buffer at the end versus network congestion issues. The buffer size is handled by TCP such that it will never overflow to lose any packets there (part of the protocol), so it must be congestion that causes the red line not to reach the dotted one, ever. With (know to be) lost packets, only one arrives; with delays, duplicates arrive. Due to having to handle duplicate packets, the receiver receives more packets than strictly needed, so the red line can never reach the maximum. Hence, (d) is the correct answer.

Another one that I like is the ARP one, which basically tests understanding of network layer services versus link layer services as a first pass, and then what ARP is actually doing. The question is as follows.

Consider the following network configuration and addresses. Host A wants to communicate with B. It did so yesterday. Which of the following will NOT happen/is NOT true?

networkARPa) Host 111.111.111.112 also receives a frame from A.

b) Router R will put <111.111.111.111, 74-29-9C-E8-FF-55, 20> in its ARP table, if not already there

c) A’s frame has destination addresses 222.222.222.222 and 49-BD-D2-C7-56-2A

d) Router R sends a frame containing addresses 1A-23-F9-CD-06-9B and 49-BD-D2-C7-56-2A, and 111.111.111.111 and 222.222.222.222

Read the explanation in [1] if you want to know why the answer is option (c). Or, take something what seems to be contradictory, but isn’t: statelessness and persistence (on HTTP). One of those think-and-analyse questions is the one on protocol design, notably regarding reliable transfer, and a hint of a gradation in goodness of answer, and therewith inherently some argumentation is needed regardless of what option one chooses.

Consider the following protocol that tries to solve reliable transfer over a channel with bit errors. It has several shortcomings, some worse than others. Which of the following is the worst?

rdt20PictureYou can choose between:

a) it can’t handle corrupt ACK/NAK packets

b) it gets stuck in the “wait” state when the ACK/NAK is lost

c) it uses superfluous NAKs

d) it is an inefficient stop-and-wait protocol

It requires one to ponder about what it means to have a really a bad protocol design (what is ‘worst’?). Getting stuck in a state forever—‘hanging’/’freezing’, if you will—is really, really bad. But, we’re dealing with a protocol “that tries to solve a channel with bit errors”, only. The scope is not lost packets, but corrupt packets, so it won’t get stuck in the “wait” state because a msg always arrives; hence, it cannot be (b). Then, what is worse of the other three? Superfluous NAKs, just because you can do it with ACKs + sequence numbers? Waiting for a response on your individual message, knowing that there are more efficient protocols that can pipeline packets with cumulative AKCs to reduce waiting and limit the cable being idle (option (b))? That surely is suboptimal, but what does that have to do with ‘worse’ from a viewpoint of reliable transfer as the only requirement? Or, still from a reliable transfer viewpoint, the receiver can’t figure out if the msg is an ACK or a NAK (option (a))? After all, with bit errors, it may well be the case that not only the msg arrives corrupted, but also the ACK/NAK. If the sender doesn’t know which one it is, and doesn’t know what to do with such a jumbled-up ACK-or-NAK packet, it cannot guarantee it has reliably transferred the packet. That’s certainly worse than having a cable idle (which is also a problem, one of performance, but not as bad). This reduces the option to either (a) or (d). A superfluous NAKs is, well, superfluous in the design, because one would have the handle those if they were there, and one could be more elegant in the design if they weren’t there, but it’s not as bad as not dealing with them at all. So, the answer is option (a).

These are just some of the questions that probe a bit deeper than the more commonly known rote learning for networks. It’s not that networks in CS is not devoid of rote learning—there’s quite a bit of it, in fact—but there surely are some things one would do well to try to understand.

If you want to have the full set of questions, don’t hesitate contact me; by then I hope to have them formatted in a more presentable way, with explanations for each question.

 

P.S.: Note to my dear students: yes, the latter question appeared in a slightly different format in the class test, as did the one on stateless and persistence, as did the congestion control one in the exam (in non-MCQ format). Awww… if only you would have attended lectures and had taken notes… It’s not that I want you to fail, just to show unequivocally some of the myriad of benefits of attending lectures.

References

[1] Keet, C.M. An Experiment with Peer Instruction in Computer Science to Enhance Class Attendance. 23rd Annual Meeting of the Southern African Association for Research in Mathematics, Science, and Technology Education (SAARMSTE’15), Huillet, E. (Ed.), pp319-331. 13-16 January 2015, Maputo, Mozambique.

[2] Crouch, C.H. & Mazur, E. (2001). Peer Instruction: Ten years of experience and results. American Journal of Physics, 69, 970-977.

Dancing Algorithms

Yes, it appears that the two can go together. Not in that the algorithms are dancing, but one can do a dance with a choreography such that it demonstrates an algorithm. Zoltan Katai and Laszlo Toth from Romania came up with the idea of this intercultural computer science education [1], with a theoretical motivation traced all the way back to Montessori. It has nothing to do with the scope of my earlier post on folk dancing and cultural heritage preservation, yet at the same time, it contributes to it: watching the videos of the dances immerses you in the folk music, rhythm, the traditional clothes of the region, and some typical steps and and movements used in their dances.

The context, in short: learning to program is not easy for most students—as our almost 900 first-year students are starting to experience from next week onwards—and especially understanding the workings of algorithms. Katai and Toth’s approach is to involve ‘playing out’ the algorithm with people, not by clumsily walking around, but using folk dance and music to make students understand and remember it more easily. They took several sorting algorithms to demonstrate the idea, and tested it on their students, demonstrating that it improved understanding significantly [1].

Perhaps because of my bias toward the dancing, I didn’t take note of the algorithm being danced-out when I watched it the first time, or perhaps it is useful to have read the core steps of the algorithm before watching anyway. You choose: watch the video of the selection sort algorithm—given a list, repeatedly select the smallest remaining element and move that to the ‘sorted’ section of the list—with a Gypsy (Roma) folk dance, or read further below what selection sort is. (note: the video goes on double speed in the middle, for it gets a bit repetitive.)

So, what was happening in the dance? We have one ‘comparer’ (x, for short) and one ‘compared with’ (y). The left-most dancer, with number 3 (our first value of x), starts to dance to the front, calls on the second one in line, being the guy with 0 (our first y), he swirls her back in the line in the spot he came from and stays at the front (0 being the new value of x), and calls on the next, the lady with the 1 (the new value of y), who gets back in the line; and so on till the last one (with number 6). Dancer 0 does a solo act and goes to the first spot: he’s now the first one in the ‘sorted’ part, and we have completed one iteration. Starting with the second main iteration: now number 3 is again at the front of the unsorted, and she dances again to the front (so the value of our x is 3 at this point), calling the second one in the unsorted list, who has number 1, so the lady with number 3 goes back in the unsorted again, and the dancer with 1 continues through the remainder of the list, has her solo, and joins the guy with the 0 in the sorted part, having completed the second main iteration. And so on until about 6:20 minutes into the video clip when the list is sorted and the dancers do a little closing act.

A bit more structured, the following is happening in the choreography of the dance in the video:

  1. Divide the list into a ‘sorted’ part (initially empty) and an ‘unsorted’ part (initially the list you want to sort)

  2. Do for as long as there’s more than one item in the ‘unsorted’ (find the smallest item in that list):

    1. select first element of the unsorted list (with some value, that we refer to with x)

    2. if we’re not at the end of the ‘unsorted’ list, then get the next element of the ‘unsorted’ list (with some value, let’s call that one y)

      1. if x < y, then y is put back in the same spot in the unsorted list, and we return to the start of step (b) to get the next item to test x against (being the one after the one we just tested)

      2. else (i.e., x > y), then the value of x takes y‘s spot in the unsorted list, we assign the value of y to x, and we return to the start of step (b) to get the next element from the unsorted list

    3. else (i.e., we’re at the end of the list and thus x is the lowest value), place (the value of) x in the next available spot in the ‘sorted’ part. Then go back to the start of step 2.

  3. Place the last item from ‘unsorted’ at the end of the ‘sorted’.

  4. Done (i.e., there’s nothing more in ‘unsorted’ to sort)

The algorithm itself is less cumbersome by not having those “let’s pick out one and come to the front” steps, but direct comparisons. I did not plan to include here, but do after all for it makes a nice sequence of artsy → informal analysis → semi-precise structure → specification the computer can work with. (Never mind that that was not the order things came about). Using our CSC1015F course material (2012 samples) that teaches Python, one of the possible sample code snippets is as follows:

def selection_sort ( values ):
   """Sort values using selection sort algorithm."""
   # iterate over outer positions in list
   for outer in range (len (values)):
      # assume first value is minimum
      minimum = outer
      # compare minimum to rest of list and update
      for inner in range (outer+1, len (values)):
         if values[inner] < values[minimum]:
            minimum = inner
      # swap minimum with outer position
      temp = values[minimum]
      values[minimum] = values[outer]
      values[outer] = temp
   return values

This is not the only way of achieving it, btw, and the 1017F lecture and lab lecture files have another version of achieving the same (I just thought that this one may be more readable by non-CS readers of the post).

Other danced sorting algorithms are insertion sort (video) with Romanian dance, and shell-sort (video) and bubble sort (video) on Hungarian dances, and more information is also available from the Algo-rythmics website. This isn’t new, and dances on many other tunes can be viewed on youtube, e.g., various bubble-sort dances. Reconstructing the bubble-sort algorithm from the dance below (5min, no fastforward) is an exercise left to the reader… likewise making dances on other algorithms (the ICPC’14 solution to the baggage problem seems like a fun candidate line dance-like), and the same dances but with other folk dances and music.

References

[1] Zoltan Katai and Laszlo Toth. Technologically and artistically enhanced multi-sensory computer programming education. Teaching and Teacher Education, 2010, 26(2): 244-251.

Updated ontology engineering lecture notes (2015)

It’s that time of the year again, in the southern hemisphere that is, where course preparations for the academic year are going on full steam ahead. Also this year, I’ll be teaching a CS honours course on ontology engineering. To that end, the lecture notes have been updated, though not in a major way like last year. Some sections have been shuffled around, there are a few new exercises, Chris’s update suggestion from last year on the OBO-OWL mapping has been included, and a couple of typos and odd sentences have been fixed.

Practically, this installment will be a bit different from previous years, as it has integrated a small project on Semantic Wikis, funded by CILT and OpenUCT. Set up, maintenance, and filling it with contents on ontology engineering topics will initially be done ‘in house’ by students enrolled in the course and not be generally available on the Web, but if all goes well, it’ll be accessible to everyone some time in April this year, and possibly included in the OER Commons.

Semantic MediaWiki’s features are fairly basic and there are a bunch of plugins and extensions I’ve seen listed, but I didn’t check whether they all worked with the latest SMW. If you have a particular suggestion, please leave a comment or send me an email. One thing I’m still wondering about particularly, but haven’t found a solution to, is whether there’s a plugin that lets you see the (lightweight) ontology when adding contents, so that it makes it easier to use terms in the text from the ontology’s vocabulary rather than find an having to process manually whatever (near)synonyms have been used throughout the pages (like, one contributor using ‘upper ontology’, another ‘foundational ontology’ and the third ‘top-level ontology’), and allow on-the-fly extensions of that ontology.