More lively Theory of Computation lectures with peer instruction

Although I don’t teach theory of computation these days, I did encounter something that I’m pretty sure of that would have made the lectures more lively compared to my attempts (blogged about earlier here), and deserves some ‘air time’: peer instruction [1].

My database systems, computer literacy, and networks students already encountered it during some of my lectures, but to briefly recap the general idea (pioneered by Eric Mazur): you insert ‘quiz questions’ (i.e., concept tests) into the lectures, students vote on the multiple choice question, then there is a discussion among the students about the answers, and a re-vote. This sounds perhaps too playful for higher education, but it has been shown to improve course grades by 34-45% [1, 2], thanks to, among others, active engagement, focusing on understanding principles, and student discussion involving ‘teaching each other’.

Even if one thinks it can be suitable for some courses but not others (like tough computer science courses), there’s even peer instruction course material for something so abstract as theory of computation, pioneered by Cynthia Lee and available through peerinstruction4cs. And it works [2].

The peer instruction slides for theory of computation and several other courses can be downloaded after registration. This I did, both out of curiosity and for getting ideas how I might improve the quizzes for the networks course I’m teaching, for which no materials are available yet. Here, I’ll give two examples of such questions for theory of computation to give an indication that it is feasible.

The first one is about tracing through an NFA, and grasping the notion that if any trace through the NFA computation ends in a final state with all input read, it accepts the string. The question is shown in the figure below, which I rendered with the AutoSim automata simulator.

Peer instruction quiz question about NFAs.

Peer instruction quiz question about NFAs.

A and D can be excluded ‘easily’: A has the second trace as [accept] but it rejects, and D has the first trace [reject] but it accepts (which can be checked nicely with the AutoSim file loaded into the tool). The real meat is with options B and C: if one trace accepts and the other rejects, then does the NFA as a whole accept the string or not? Yes, it does; thus, the answer is B.

The other question is further into the course material, on understanding reductions. Let us have ATM and it reduces to MYSTERY_LANGUAGE. Which of the following is true (given the above statement is true):

  1. a) You can reduce to MYSTERY_LANG from ATM.
  2. b) MYSTERY_LANG is undecidable.
  3. c) A decider for MYSTERY_LANG (if it exists) could be used to decide ATM.
  4. d) All of the above

Answering this during the lecture requires more than consulting a ‘cheat sheet’ on reductions. A is a rephrasing of the statement, so that holds (though this option may not be very fair with students whose first language is not English). You will have to remember that ATM is undecidable, and, if visually oriented, the following figure:

Graphical depiction of reduction

Graphical depiction of reduction

 

So, if ATM is undecidable (the “P1” on the left), then so is MYSTERY_LANGUAGE (the “P2” on the right); for the aficionados: there’s a theorem for that in Chapter 9 of [3], being ‘if there is a reduction from P1 to P2, then if P1 is undecidable, then so is P2’. So B holds. Thus, the answer is probably D, but let’s look at C just to be sure of D. Because ATM is ‘in’ MYSTERY_LANGUAGE, then if we can decide MYSTERY_LANGUAGE, then so can we decide ATM—it just happens to be the case we can’t (but if we could, it would have been possible). Thus, indeed, D is the answer.

Sure, there is more to theory of computation than quizzes alone, but feasible it is, and on top of that, student perception of peer instruction is apparently positive [4]. But before making the post too long on seemingly playful matters in education, I’m going to have a look at how that other game—the second semi-final—will unfold to see how interesting the world cup final is going be.

References

[1] Crouch, C.H., Mazur, E. (2001). Peer Instruction: Ten years of experience and results. American Journal of Physics, 69, 970-977.

[2] Bailey Lee, C., Garcia, S., Porter, L. (2013). Can Peer Instruction Be Effective in Upper-Division Computer Science Courses?. ACM Transactions on Computing Education, 13(3): 12-22.

[3] Hopcroft, J.E., Motwani, R. & Ullman, J.D. (2007). Introduction to Automata Theory, Languages, and Computation, 3rd ed., Pearson Education.

[4] Simon, B., Esper, S., Porter, L., Cutts, Q. (2013). Student experience in a student-centered peer instruction classroom. Proceedings of the International Computing Education Research Conference (ICER’13), ACM Conference Proceedings. pp129-136.

ACM ICPC 2014 solution to problem A – baggage

Some of you already know I was on-site coach for the UCT team at the ACM Inter Collegiate Programming Contest World Finals in Ekaterinburg, held from 22 to 26 June, 2014. The problems were unusually hard this year, and solving 4 out of the 10 problems already got teams into the bronze medal range [results]. The technical coach of the UCT team, Bruce Merry, has written an analysis of 6 problems on his blog (D, K, C, E, B, and F–update: now also I and G, J, L, and H), and over at TopCoder, SnapDragon discusses 4 of the 10 problems (A, E, F, G), except that SnapDragon does not provide a solution to A (and I don’t like his hint of brute-force code-and-try) [update 3-7: there's description of his solution here now]. Googling, I couldn’t find someone else discussing the solution to problem A, so here’s mine, which I solved on the plane from Moscow to Frankfurt (among other activities, and one among the four flights we had to take to arrive back in Cape Town).

The problem

Stripping the “baggage problem” to its essentials: you have a sequence of alternating Bs and As, which has to be reordered to first all the As then all the Bs, and the reordering has to occur moving two letters at the same time, and remain adjacent. For instance, with an n = 4, we have 2n characters, BABABABA, and in the end after sorting, it then will be AAAABBBB, and this has to occur in the minimum amount of moves. N is randomly chosen, and is an integer between 3 and 100. The cells on your tape are numbered from 0 to -2n+1, so with our n = 4, from -7 to 8, and the first B is on position 1.

Update (3-7): This problem appears to be “Tait’s counter puzzle”, which Peter Guthrie Tait, a Scottish mathematical physicist, described in 1884 [see description] (thanks to Davi Duarte who mentioned it in the topcoder thread).

Solution

Informally, there are two parts to the algorithm: move around the As and Bs to pair them as AA and BB, which cost you n/2 moves if n is even and n/2-rounded up moves if n is odd, and then sort those pairs in the remainder to a total of n moves. The first part occurs alternating moving AB from the right to the left, starting at the last AB (position 2n-2) and then every other 4 to its left, and BA from left to right, starting from position 3 and every other 4 positions to the right (i.e., 7, 11, etc.), and then the BBs are ferried to the right from left to right, and the AAs from the right to the left, also alternating. N=3 is an exception.

One can do this with a neat mathematical proof, but I found out with pattern creation and recognition, visualizing the provided sample input and moves for n = 5 and n = 8, which can be done in 5 resp. 8 steps (given in the problem statement as sample output). Here’s a description of that approach.

First, devising one for 3 is trivial, using the following moves (the only option), using the position indicated with the position of the first letter, and highlighting the ones that will be moved next:

Start: . . . . . . BABABA
Move 2 to -1, which gives . . . . ABB . . ABA
Move 5 to 2, which gives . . . . ABBBAA . .
Move 3 to -3, which gives . . AAABBB . . . .

This already suggests that the minimum amount of moves will always be n, because it is n moves also for n = 5 and n = 8. Next, devising one for n = 4, and knowing the moves for n = 3 and n = 5, I tried to work it out for BABABABA, i.e., the first AB, like with n = 3, BABABABA, and BABABABA like with the n = 5 case. The last one is the only one that worked in 4 moves, starting with moving 6 to -1, again. Again, because for all ns so far, the first move is to -1.

Let’s put this to the test with n = 7, with the ones to be moved in bold and the empty cells indicated with dots:

..BABABABABABABA

i.e., move the last AB (thus, position 12) as this worked for n=4 and n=5, then

ABBABABABABAB..A

i.e., move the first BA after position 1 (thus, position 3)

ABBA..BABABABBAA

i.e., move the last AB before a pair (thus, position 8)

ABBAABBAB..ABBAA

i.e., move the first BA after position 1 (thus, position 5). Then sorting the BBs and AAs:

ABBAAB..BBAABBAA
A..AABBBBBAABBAA
AAAAABBBBB..BBAA
AAAAABBBBBBB..AA
AAAAAAABBBBBBB..

This was enough for me to have discovered the pattern of alternating for the matching and alternating for the sorting.

What is ‘nasty’ in the problem description, retrospectively and for the pattern-based approach vs. a neat maths-y proof at least, is that the pattern deviates for n = 3, and the provided sample output for n = 8, although in 8 steps, is an alternative solution, not one that one obtains with the algorithm. With the approach as mentioned above, we have for n = 8 the following (I did that one to double-check):

..BABABABABABABABA
ABBABABABABABAB..A
ABBA..BABABABABBAA
ABBAABBABAB..ABBAA
ABBAABBA..BBAABBAA
A..AABBABBBBAABBAA
AAAAABBABBBB..BBAA
AAAAA..ABBBBBBBBAA
AAAAAAAABBBBBBBB..

This confirmed it works. It may look a bit craft-y, but the patterns show beautifully with larger n, which are shown below for the even and odd case (made just for the blog post, not in solving it); squint your eyes if it’s not immediately clear.

N = 16

..BABABABABABABABABABABABABABABABA
ABBABABABABABABABABABABABABABAB..A
ABBA..BABABABABABABABABABABABABBAA
ABBAABBABABABABABABABABABAB..ABBAA
ABBAABBA..BABABABABABABABABBAABBAA
ABBAABBAABBABABABABABAB..ABBAABBAA
ABBAABBAABBA..BABABABABBAABBAABBAA
ABBAABBAABBAABBABAB..ABBAABBAABBAA
ABBAABBAABBAABBA..BBAABBAABBAABBAA
A..AABBAABBAABBABBBBAABBAABBAABBAA
AAAAABBAABBAABBABBBB..BBAABBAABBAA
AAAAA..AABBAABBABBBBBBBBAABBAABBAA
AAAAAAAAABBAABBABBBBBBBB..BBAABBAA
AAAAAAAAA..AABBABBBBBBBBBBBBAABBAA
AAAAAAAAAAAAABBABBBBBBBBBBBB..BBAA
AAAAAAAAAAAAA..ABBBBBBBBBBBBBBBBAA
AAAAAAAAAAAAAAAABBBBBBBBBBBBBBBB..

N = 17

..BABABABABABABABABABABABABABABABABA
ABBABABABABABABABABABABABABABABAB..A
ABBA..BABABABABABABABABABABABABABBAA
ABBAABBABABABABABABABABABABAB..ABBAA
ABBAABBA..BABABABABABABABABABBAABBAA
ABBAABBAABBABABABABABABAB..ABBAABBAA
ABBAABBAABBA..BABABABABABBAABBAABBAA
ABBAABBAABBAABBABABAB..ABBAABBAABBAA
ABBAABBAABBAABBA..BABBAABBAABBAABBAA
ABBAABBAABBAABBAABB..BAABBAABBAABBAA
A..AABBAABBAABBAABBBBBAABBAABBAABBAA
AAAAABBAABBAABBAABBBBB..BBAABBAABBAA
AAAAA..AABBAABBAABBBBBBBBBAABBAABBAA
AAAAAAAAABBAABBAABBBBBBBBB..BBAABBAA
AAAAAAAAA..AABBAABBBBBBBBBBBBBAABBAA
AAAAAAAAAAAAABBAABBBBBBBBBBBBB..BBAA
AAAAAAAAAAAAA..AABBBBBBBBBBBBBBBBBAA
AAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBB..

Neat. Writing the code is left as an exercise to the reader.

I’m looking forward to the other problems (except the crane balancing problem C, whose description I did not like, at all), the upcoming regionals and next year’s ICPC World Finals in Morocco (if not winning, then to have at least a great time like we had this time)!

First steps for isiZulu natural language generation

Yes, Google Translate English-isiZulu does exist, but it has many errors (some very funny) and there’s a lot more to Natural Language Generation (NLG) than machine translation, such as natural language-based query interfaces that has some AI behind it, and they are needed, too [1]. Why should one bother with isiZulu? Muendane has his lucid opinions about that [2], and in addition to that, it is the first language of about 23% of the population of South Africa (amounting to some 10 million people), about half can speak it, and it is a Bantu language, which is spoken by nearly 300 million people—what works for isiZulu grammar may well be transferrable to its related languages. Moreover, it being in a different language family than the more well-resourced languages, it can uncover some new problems to solve for NLG, and facilitate access to online information without the hurdle of having to learn English or French first, as is the case now in Sub-Saharan Africa.

The three principal approaches for NLG are canned text, templates, and grammars. While I knew from previous efforts [3] that the template-based approach is very well doable but has its limitations, and knowing some basic isiZulu, I guessed it might not work with the template-based approach but appealing if it would (for a range of reasons), that no single template could be identified so far was the other end of the spectrum. Put differently: we had to make a start with something resembling the foundations of a grammar engine.

Langa Khumalo, with the Linguistics program and director of the University Language Planning and Development Office at the University of KwaZulu-Natal, and I have been trying to come up with isiZulu NLG. We have patterns and algorithms for (‘simple’) universal and existential quantification, subsumption, negation (class disjointness), and conjunction; or: roughly OWL 2 EL and a restricted version of ALC. OWL 2 EL fist neatly with SNOMED CT, and therewith has the potential for interactive healthcare applications with the isiZulu healthcare terminologies that are being developed at UKZN.

The first results on isiZulu NLG are described in [4,5], which was not an act of salami-slicing, but we had more results than that fitted in a single paper. The first paper [4] will appear in the proceedings ofthe 4th workshop on Controlled Natural language (CNL’14), and is about finding those patterns and, for the options available, an attempt at figuring out which one would be best. The second paper [5], which will appear in the 8th International Web Rule Symposium (RuleML’14) conference proceedings, is more about devising the algorithms to make it work and how to actually generate those sentences. Langa and I plan to attend both events, so you can ask us about the details either in Prague (18-20 Aug) or Galway (20-22 Aug) in person. In the meantime, the CRCs of the papers are online (here and here).

Regarding the technical aspects, the main reasons why we cannot get away with devising templates to generate isiZulu controlled natural language is that isiZulu is non-trivial:

  • There is a whole system of noun classes: nouns are grouped in one of the 17 noun classes, each with their own peculiarities, which is illustrated in Figure 1, below;
  • Agglutination, informally: putting lots of bits and pieces together to make a word. A selection of those so-called ‘concords’ is included in Figure 2, below;
  • Phonological conditioned copulatives, meaning that the ‘is a’ depends on the term that comes after it (ng or y); and
  • Complex verb conjugation.
isiZulu noun classes with an example (source: [5]).

isiZulu noun classes with an example (source: [5]).

A selection of isiZulu concords (source: [5])

A selection of isiZulu concords (source: [5])

What does this mean for the verbalization? In English, we use ‘Each…’ or ‘For all…’ for the universal quantifier and it doesn’t matter over which noun it is quantified. In isiZulu, it does. Each noun class has its own ‘each’ and ‘for all’, and it is not acceptable (understandable) to use one for the wrong noun class. For disjointness, like “Cup is not a Glass” ({\sf Cup \sqsubseteq \neg Glass} in DL), in English we have the ‘is not a’ regardless what comes before or after the subsumption+negation, but in isiZulu, the copulative is omitted, the first noun (OWL class, if you will) brings in a so-called negative subject concord, the second noun brings in a pronominal, and they are glued together (e.g., Indebe akuyona Ingilazi, where the second word is composed of aku + yona), and to top it off, each noun class has its own concord and pronomial. A seemingly simple conjunction—just an ‘and’ in English—has to be divided into an and-when-it-is-used-in-an-enumeration and an and-when-it-is-a-connective, and when it is used in an enumeration, it depends on the first letter of the noun that comes after the ‘and’. Existential quantification is even more of a hassle. The table below shows a very brief summary comparing typical patterns in English with those for isiZulu.

A few DL symbols, their typical verbalization options in English, and an indication of possible patterns (source: [4])

A few DL symbols, their typical verbalization options in English, and an indication of possible patterns (source: [4])

We did ask isiZulu speakers which of the possible options they preferred (in a survey, with Limesurvey localized to isiZulu), but there wasn’t an overwhelming consistent agreement among them except for one of the options for existential quantification (the –dwa option), although there was more agreement among the linguists than among the non-linguists, possibly due to dialect influences (results can be found in [4]).

If you don’t feel like reading the two papers, but still would like to have some general overview and examples, you also can check out the slides of the CS colloquium I gave last week. I managed to ‘lure in’ also ICT4D people—and then smack them with a bit of logic and algorithms—but the other option, being talking about the other paper accepted at RuleML, probably would have had to be a ‘cookie colloquium’ to get anyone to attend (more about that paper in another post—it is fascinating, but possibly of less interest to a broader audience). If you want to skip the tedious bits and just get a feel of how one of the algorithms works out: check out the example starting on slide 63, which shows the steps to go from {\sf \forall x (uSolwazi(x) \rightarrow \exists y (ufundisa(x, y) \land Isifundo(y)))} in FOL, or {\sf uSolwazi \sqsubseteq \exists ufundisa.Isifundo} in DL (“Each professor teaches at least one course”, if the vocabulary were in English), to “Bonke oSolwazi bafundisa isifundo esisodwa”.

Clearly, a lot remains to be done.

References

[1] Alberts, R., Fogwill, T., Keet, C.M. Several Required OWL Features for Indigenous Knowledge Management Systems. 7th Workshop on OWL: Experiences and Directions (OWLED’12). 27-28 May, Heraklion, Crete, Greece. CEUR-WS Vol-849. 12p

[2] Muendane, N.M. I am an African. 2006, Soultalk CC.

[3] Jarrar, M., Keet, C.M., Dongilli, P. Multilingual verbalization of ORM conceptual models and axiomatized ontologies. STARLab Technical Report, Vrije Universiteit Brussels, Belgium. February 2006.

[4] Keet, C.M., Khumalo, L. Toward verbalizing logical theories in isiZulu. 4th Workshop on Controlled Natural Language (CNL’14), 20-22 August 2014, Galway, Ireland. Springer LNAI. (in press)

[5] Keet, C.M., Khumalo, L. Basics for a grammar engine to verbalize logical theories in isiZulu. 8th International Web Rule Symposium (RuleML’14), August 18-20, 2014, Prague, Czech Republic. Springer LNCS (in press).

CFP OWLED 2014

Here’s a bit of unvarnished promotion for the 1tth OWL: Experience and Directions Workshop (I’m the co-program chair, with Valentina Tamma, and Bijan Parsia is general chair)

—-

CALL FOR PAPERS: 11th OWL: Experiences and Directions Workshop (OWLED)
Riva del Garda, October 17th – 18th, 2014 co-located with ISWC 2014

http://www.w3.org/community/owled/workshop-2014/

Important Dates (All deadlines are Hawaii time)
• Paper submission due: July 30, 2014 EXTENDED TO AUGUST 4, 2014!!!
• Acceptance notifications: September 5, 2014
• Final papers due: September 18, 2014
• OWLED workshop: 17-18 October, 2014

OWLED is now also a Community Group at the W3C. Everyone is invited to participate:

http://www.w3.org/community/owled/

 

The aim of the OWL: Experiences and Directions Workshop (OWLED) is to establish an
international forum for the OWL community, where practitioners in industry and
academia, tool developers and others interested in OWL can describe real and
potential applications, share experience and discuss requirements for language
extensions/modifications. OWL has become the representational model of choice for
supporting interoperability in many industries. This has been made possible thanks
also to the development of numerous OWL reasoning systems that efficiently deal with
both intensional (ontologies) and extensional (data) query answering. In this
edition we aim to bridge the gap with the reasoner evaluation community and welcome
the submission of papers describing challenging ontologies and/or tasks to be
represented in OWL and processed by OWL reasoners. It also welcomes proposals for
improving the OWL 2 standard.

This year, we would like to invite submissions of the following types of papers:

Technical papers:  All submissions must be in English and be no longer than 12 pages
(including references). Papers that exceed this limit will be rejected without
review.  These papers should present research, implementation experience, and
reports on the above and related topics. Space will be reserved for authors to
present their work at the workshop.

Short papers (4-6 pages, including references): These papers should present work
that is in an early stage and/or include publishable (novel) implemented systems
that are of interest to the OWLED community; and (in case of an implemented system),
can be demonstrated at the workshop.

All submissions must be must be in PDF, and must adhere to the Springer LNCS style.
For more details, see Springer’s Author Instructions:

http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0.

Papers can be submitted online using the Easychair Conference system:

https://www.easychair.org/conferences/?conf=owled2014

Papers related to any aspects of OWL and extensions, applications, theory, methods
and tools, are welcome.

Topics of interest include, but are not limited to:

  • Application driven requirements for OWL
  • Applications of OWL, particularly: from industry, or for data integration, for service interoperability, for sophisticated/non-obvious inference, for knowledge discovery
  • and within specific domains such as: law, bio and biomed, eLearning
  • Experience of using OWL: notably, highly expressive ontologies or the OWL 2 Profiles
  • Evaluation of OWL tools e.g. reasoners
  • Benchmarks for OWL tools
  • Performance and scalability issues and improvements
  • Extensions to OWL
  • OWL and Rules
  • Implementation techniques and experience reports
  • Non-standard reasoning service (implementation and requirements for)
  • Explanation
  • Ontology comprehension and verbalisation
  • Multilingual OWL
  • Modelling issues
  • Tools, including editors, visualisation, parsers and syntax checkers
  • Collaborative editing of ontologies
  • Versioning of OWL ontologies
  • Alignment of OWL ontologies
  • Modularity
  • Query answering with OWL
  • SPARQL and OWL
  • Linked Data and OWL

Dabbling into evaluating reasoners with the DMOP ontology

The Data Mining OPtimization ontology (DMOP) is a highly axiomatised ontology that uses almost all features of OWL 2 DL and the domain entities are linked to DOLCE, using all four main ‘branches’ of DOLCE. Some details are described in last year’s OWLED’13 paper [1] and a blog post. We did observe ‘slow’ reasoner performance to classify the ontology, however, like, between 10 and 20 minutes, varying across versions and machines. The Ontology Reasoner Evaluation (ORE’14) workshop (part of the Vienna Summer of Logic) was a nice motivation to have a look at trying to figure out what’s going on, and some initial results are described briefly in the 6 pages-short paper [2], which is co-authored with Claudia d’Amato, Agnieszka Lawrynowicz, and Zubeida Khan.

Those results are definitely what can be called interesting, even though we’re still at the level of dabbling into it from a reasoner user-centric viewpoint, and notably, from a modeller-centric viewpoint. The latter is what made us pose questions like “what effect does using feature x have on performance of the reasoner?”. No one knew, except for the informal feedback back I received at DL 2010 on [3] that reasoning with data types slows down things, and likewise when the cardinalities are high. That’s not an issue with DMOP, though.

So, the first thing we did was determining a baseline on a good laptop—your average modeller doesn’t have a HPC cluster readily at hand—and in an Ontology Development Environment, where the reasoner is typically accessed from. Some 9 minutes to classify the ontology (machine specs and further details in the paper).

The second steps were the analysis of one specific modeling construct (inverses), and what effect DOLCE has on the overall performance.

The reason why we chose representation of inverses is because in OWL 2 DL (cf. OLW DL), one can use the objectInverseOf(OP) to use the inverse of an object property instead of extending the ontology’s vocabulary and using InverseObjectProperties(OPE1 OPE2) to relate the property and its inverse. For instance, to use the inverse the property addresses in an axiom, one used to have to introduce a new property, addressed by, declare it inverse to addresses, and then use that in the axiom, whereas in OWL 2 DL, one can use ObjectInverseOf(addresses) in the axiom (in Protégé, the syntax is inverse(addresses)). That slashed computing the class hierarchy by at least over a third (and about half for the baseline). Why? We don’t know. Other features used in DMOP, such as punning and property chains, were harder to remove and are heavily used, so we didn’t test those.

The other one, removing DOLCE, is a bit tricky. But to give away the end results upfront: that made it 10 times faster! The ‘tricky’ part has to do with the notion of ‘linking to a foundational ontology’ (deserving of its own blog post). For DMOP, we had not imported but merged, and we did not merge everything from DOLCE and its ExtendedDnS, but only what was deemed relevant, being, in numbers, 43 classes, 78 object properties and 593 axioms. To make matters worse—from an evaluation viewpoint, that is—is that we reused heavily three DOLCE object properties, so we kept those three DOLCE properties in the evaluation file, as we suspected it otherwise would have affected the deductions too much and interfere with the DOLCE-or-not question (one also could argue that those three properties can be considered an integral part of DMOP). So, it was not a simple case of ‘remove the import statement and run the reasoner again’, but a ‘remove almost everything with a DOLCE URI manually and then run the reasoner again’.

Because computation was so ‘slow’, we wondered whether maybe cleverly modularizing DMOP could be the way to go, in case someone wants to use only a part of DMOP. We got as far as trying to modularize the ontology, which already was not trivial because DMOP and DOCLE are both highly axiomatised and with few, if any, relatively isolated sections amenable to modularization. Moreover, what it did show, is that such automated modularization (when it was possible) only affects the number of class and number of axioms, not the properties and individuals. So, the generated modules are stuck with properties and individuals that are not used in, or not relevant for, that module. We did not fix that manually. Also, putting back together the modules did not return it to the original version we started out with, missing 225 axioms out of the 4584.

If this wasn’t enough already, the DMOP with/without DOLCE test was performed with several reasoners, out of curiosity, and they gave different output. FaCT++ and MORe had a “Reasoner Died” message. My ontology engineering students know that, according to DOLCE, death is an achievement, but I guess that its reasoners’ developers would deem otherwise. Pellet and TrOWL inferred inconsistent classes; HermiT did not. Pellet’s hiccup had to do with datatypes and should not have occurred (see paper for details). TrOWL fished out a modeling issue from all of those 4584 axioms (see p5 of the paper), of the flavour as described in [4] (thank you), but with the standard semantics of OWL—i.e., not caring at all about the real semantics of object property hierarchies—it should not have derived an inconsistent class.

Overall, it feels like having opened up a can of worms, which is exciting.

References

[1] Keet, C.M., Lawrynowicz, A., d’Amato, C., Hilario, M. Modeling issues and choices in the Data Mining OPtimisation Ontology. 8th Workshop on OWL: Experiences and Directions (OWLED’13), 26-27 May 2013, Montpellier, France. CEUR-WS vol 1080.

[2] Keet, C.M., d’Amato, C., Khan, Z.C., Lawrynowicz, A. Exploring Reasoning with the DMOP Ontology. 3rd Workshop on Ontology Reasoner Evaluation (ORE’14). July 13, 2014, Vienna, Austria. CEUR-WS vol (accepted).

[3] Keet, C.M. On the feasibility of Description Logic knowledge bases with rough concepts and vague instances. 23rd International Workshop on Description Logics (DL’10), 4-7 May 2010, Waterloo, Canada.

[4] Keet, C. M. (2012). Detecting and revising flaws in OWL object property expressions. In Proc. of EKAW’12, volume 7603 of LNAI, pages 252–266. Springer.

8 years of keetblog

The 8-year anniversary swooshed by a few days ago, but, actually it’s really only completing today, as the first blog post with real content was published on April 18, 2006, about solving sudokus with constraint programming.

The top-post among the 186 posts (>9000 visits to that page alone) is still the introduction for two lectures on top-down and bottom-up ontology development that I wrote in November 2009 as part of the Semantic Web technologies MSc course at the Free University of Bolzano; anyone wishing to read an updated version: have a look at the 2014 lecture notes (its ‘Block II’). The post most commented on is about academia.edu, and then on my wish for a semantic search of insects.

The more ‘trivia’/fun ones—still having to do with science—are, I think, about the complexity of coffee and culinary evolution, but I may be biased (my first degree up to MSc was in food science). For some reason, there were more visitors reading about failing to recognize your own incompetence and some sneakiness of academia.edu than about food (and many other topics). Ah, well. A full list sorted by year is available on the list of blog posts page.

The frequency of posting is somewhat less than a few years ago and, consequently, the visits went down from about 1500/month during its heydays [well, years] to about 1000/month now, but that’s still not bad at all for a ‘dull’ blog, and I would like to thank you again, and even more so the fans (subscribers) and those of you who have taken the effort to like a post or to leave comments both online and offline! I hope it’s been an interesting read, or else enjoyable procrastination.

Ontology Engineering lecture notes for 2014 online

The lecture notes for the Ontology Engineering BSc honours in CS course are available online now. The file is updated compared to the COMP720 module (and those notes have been removed). The main changes consist of reordering the chapters in Block II and Block III, adding better or more explanations and examples in several sections, fixing typos, and updates to reflect advances made in the field. It again includes the DL primer written by Markus Kroetzsch, Ian Horrocks and Frantisek Simancik (saving me the time writing about that; thanks!).

As with the last three installments, the target audience is computer science students in their 4th year (honours), so the notes are of an introductory nature. It has three blocks after the introduction: logic foundations, ontology engineering, and advanced topics (the latter we will skip, as this is a shorter course). The logic foundations contain a recap of FOL and the notion of reasoning, the DL primer and the basics of automated reasoning with the Description Logics with ALC, the DL-based OWL species, and some practical automated reasoning. The ontology engineering block starts with methods and methodologies that give guidance how to commence actually developing an ontology, and how to avoid and fix issues. Subsequently, there are two chapters going into some detail of two ‘paths’ in the methodology, being top-down ontology development using foundational ontologies, and bottom-up ontology development to extract knowledge from other material, such as relational databases, thesauri, and natural language documents.

The advanced topics are optional this year, but I left them in the lecture notes, as they may pique your interest. Chapter 8 on Ontology-Based Data Access is a particular application scenario of ontologies that ‘spice up’ database applications. Chapter 9 touches upon a few sub-areas within ontologies: representing and reasoning with vagueness and uncertainty, extending the language to include also temporal knowledge, the use of ontologies to enhance conceptual data models, and a note on social aspects.

It is still an evolving document, and relative completeness of sections varies slightly, so it has to be seen in conjunction with the slides, lectures, and some additional documentation that will be made available on the course’s Vula site.

Suggestions and corrections are welcome! If you want to use a part of it in your own lectures and/or use the accompanying slides with it, please contact me.

Follow

Get every new post delivered to your Inbox.

Join 40 other followers

%d bloggers like this: