The TDDonto tool to try out TDD for ontology authoring

Last month I wrote about Test-Driven Development for ontologies, which is described in more detail in the ESWC’16 paper I co-authored with Agnieszka Lawrynowicz [1]. That paper does not describe much about the actual tool implementing the tests, TDDonto, although we have it and used it for the performance evaluation. Some more detail on its design and more experimental results are described in the paper “The TDDonto Tool for Test-Driven Development of DL Knowledge Bases” [2] that has just been published in the proceedings of the 29th International Workshop on Description Logics, which will take place next weekend in Cape Town (22-25 April 2016).

What we couldn’t include there in [2] is multiple screenshots to show how it works, but a blog is a fine medium for that, so I’ll illustrate the tool with some examples in the remainder of the post. It’s an alpha version that works. No usability and HCI evaluations have been done, but at least it’s a Protégé plugin rather than command line :).

First, you need to download the plugin from Agnieszka’s ARISTOTELES project page and place the jar file in the plugins folder of Protégé 5.0. You can then go to the Protégé menu bar, select Windows – Views – Evaluation views – TDDOnto, and place it somewhere on the screen and start using it. For the examples here, I used the African Wildlife Ontology tutorial ontology (AWO v1) from my ontology engineering course.

Make sure to have selected an automated reasoner, and classify your ontology. Now, type a new test in the “New test” field at the top, e.g. carnivore DisjointWith: herbivore, click “Add test”, select the checkbox of the test to execute, and click the “Execute test”: the status will be returned, as shown in the screenshot below. In this case, the “OK” says that the disjointness is already asserted or entailed in the ontology.

cdisjh

Now let’s do a TDD test that is going to fail (you won’t know upfront, of course); e.g., testing whether impalas are herbivores:

impalaFail

The TDD test failed because the subsumption is neither asserted nor entailed in the ontology. One can then click “add to ontology”, which updates the ontology:

impalaAdd

Note that the reasoner has to be run again after a change in the ontology.

Lets do two more: testing whether lion is a carnivore and that flower is a plan part. The output of the tests is as follows:

lionflower

It returns “OK” for the lion, because it is entailed in the ontology: a carnivore is an entity that eats only animals or parts thereof, and lions eat only herbivore and eats some impala (which are animals). The other one, Flower SubClassOf: PlantParts fails as “undefined”, because Flower is not in the ontology.

Ontologies do not have only subsumption and disjointness axioms, so let’s assume that impalas eat leaves and we want check whether that is in the ontology, as well as whether lions eat animals:

lionImpalaEats

The former failed because there are no properties for the impala in the AWO v1, the latter passed, because a lion eats impala, and impala is an animal. Or: the TDDOnto tool indeed behaves as expected.

Currently, only a subset of all the specified tests have been implemented, due to some limitations of existing tools, but we’re working on implementing those as well.

If you have any feedback on TDDOnto, please don’t hesitate to tell us. I hope to be seeing you later in the week at DL’16, where I’ll be presenting the paper on Sunday afternoon (24th) and I also can give a live demo any time during the workshop (or afterwards, if you stay for KR’16).

 

References

[1] Keet, C.M., Lawrynowicz, A. Test-Driven Development of Ontologies. 13th Extended Semantic Web Conference (ESWC’16). Springer LNCS. 29 May – 2 June, 2016, Crete, Greece. (in print)

[2] Lawrynowicz, A., Keet, C.M. The TDDonto Tool for Test-Driven Development of DL Knowledge bases. 29th International Workshop on Description Logics (DL’16). April 22-25, Cape Town, South Africa. CEUR WS vol. 1577.

Advertisements

Ontology authoring with a Test-Driven Development approach

Ontology development has its processes and procedures—conducting a domain analysis, the implementation, maintenance, and so on—which have been developed since the mid 1990s. These high-level information systems-like methodologies don’t tell you what and how to add the knowledge to the ontology, however, i.e., the ontology authoring stage is a ‘just do it’, but not how. There are, perhaps surprisingly, few methods for how to do that; notably, FORZA uses domain and range constraints and the reasoner to propose a suitable part-whole relation [1] and advocatus diaboli zooms in on disjointness constraints among classes [2]. In a way, they all use what can be considered as tests on the ontology before adding an axiom. This smells of notions that are well-known in software engineering: unit tests, test-driven development (TDD), and Agile, with the latter two relying on different methodologies cf. the earlier ones (waterfall, iterative, and similar).

Some of those software engineering approaches have been adjusted and adopted for ontology engineering; e.g., the Agile-inspired OntoMaven that uses the standard reasoning services as tests [3], eXtreme Design with ODPs [4] that have been prepared previously, and Vrandecic and Gangemi’s early exploration of possibilities for unit tests [5]. Except for Warrender & Lord’s TDD tests for subsumption [6], they are all test-last approaches (design, author, test), rather than a test-first approach (test, author, test). The test-first approach is called test-driven development in software engineering [7], which has been ported to conceptual modelling recently as well [8]. TDD is a step up from a “add something and lets see what the reasoner says” stance, because one has to think and check upfront first before doing. (Competency questions can help with that, but they don’t say how to add the knowledge.) The question that arises, then, is how such a TDD approach would look like for ontology development. Some of the more detailed questions to be answered are (from [9]):

  • Given the TDD procedure for programming in software engineering, then what does that mean for ontologies during ontology authoring?
  • TDD uses mock objects for incomplete parts of the code, and mainly for methods; is there a parallel to it in ontology development, or can that aspect of TDD be ignored?
  • In what way, and where, (if at all) can this be integrated as a methodological step in existing ontology engineering methodologies?

We—Agnieszka Lawrynowicz from Poznan University of Technology in Poland and I—answer these questions in our paper that was recently accepted at the 13th Extended Semantic Web Conference (ESWC’16), to be held in May 2016 in Crete, Greece: Test-Driven Development of Ontologies [9]. In short: we specified tests for the OWL 2 DL language features and basic types of axioms one can add, implemented it as a Protégé plug-in, and tested it on performance with 67 ontologies (result: great). The tool and test data can be downloaded from Agnieszka’s ARISTOTELES project page.

Now slightly less brief than that one-liner. The tests—like for class subsumption, domain and range, a property chain—can be specified in two principal ways, called TBox tests and ABox tests. The TBox tests rely solely on knowledge represented in the TBox, whereas for the ABox tests, mock individuals are explicitly added for a particular TBox axiom. For instance, the simple existential quantification, as shown below, where the TBox query is denoted in SPARQL-OWL notation.

Teq

Teqprime

 

 

 

 

 

For the implementation, there is (1) a ‘wrapping’ component that includes creating the mock entities, checking the condition (line 2 in the TBox test example in the figure above, and line 4 in ABox TDD test), returning the TDD test result, and cleaning up afterward; and (2) the interaction with the ontology doing the actual testing, such as querying the ontology and class and instance classification. There are several options to realise component (2) of the TBox TDD tests. The query can be sent to, e.g., OWL-BGP [10] that uses SPARQL-OWL and Hermit to return the answer (line 1), but one also could use just the OWL API and send it to the automated reasoner, among other options.

Because OWL-BGP and the other options didn’t cater for the tests with OWL’s object properties, such as a property chain, so a full implementation would require extending current tools, we decided to first examine performance of the different options for (2) for those tests that could be implemented with current tools so as to get an idea of which approach would be the best to extend, rather than gambling on one, implement all, and go on with user testing. This TDD tool got the unimaginative name TDDonto and can be installed as a Protégé plugin. We tested the performance with 67 TONES ontology repository ontologies. The outcome of that is that the TBox-based SPARQL-OWL approach is faster than the ABox TDD tests (except for class disjointness; see figure below), and the OWL API + reasoner for the TBox TDD tests is again faster in general. These differences are bigger with larger ontologies (see paper for details).

Test computation times per test type (ABox versus TBox-based SPARQL-OWL) and per the kind of the tested axiom (source: [9]).

Test computation times per test type (ABox versus TBox-based SPARQL-OWL) and per the kind of the tested axiom (source: [9]).

Finally, can this TDD be simply ‘plugged in’ into one of the existing methodologies? No. As with TDD for software engineering, it has its own sequence of steps. An initial sketch is shown in the figure below. It outlines only the default scenario, where the knowledge to be added wasn’t there already and adding it doesn’t result in conflicts.

Sketch of a possible ontology lifecycle that focuses on TDD, and the steps of the TDD procedure summarised in key terms (source: [9]).

Sketch of a possible ontology lifecycle that focuses on TDD, and the steps of the TDD procedure summarised in key terms (source: [9]).

The “select scenario” has to do with what gets fed into the TDD tests, and therewith also where and how TDD could be used. We specified three of them: either (a) the knowledge engineer provides an axiom, (b) a domain expert fills in some template (e.g., the ‘all-some’ one) and that software generates the axiom for the domain ontology (e.g., Professor \sqsubseteq \exists teaches.Course ), or (c) someone wrote a competency question that is either manually or automatically converted into an axiom. The “refactoring” could include a step for removing a property from a subclass when it is added to its superclass. The “regression testing” considers previous tests and what to do when any conflicts may have arisen (which may need an interaction with step 5).

There is quite a bit of work yet to be done on TDD for ontologies, but at least there is now a first comprehensive basis to work from. Both Agnieszka and I plan to go to ESWC’16, so I hope to see you there. If you want more details or read the tests with a nicer layout than how it is presented in the ESWC16 paper, then have a look at the extended version [11] or contact us, or leave a comment below.

 

References

[1] Keet, C.M., Khan, M.T., Ghidini, C. Ontology Authoring with FORZA. 22nd ACM International Conference on Information and Knowledge Management (CIKM’13). ACM proceedings. Oct. 27 – Nov. 1, 2013, San Francisco, USA. pp569-578.

[2] Ferre, S. and Rudolph, S. (2012). Advocatus diaboli exploratory enrichment of ontologies with negative constraints. In ten Teije, A. et al., editors, 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12), volume 7603 of LNAI, pages 42-56. Springer. Oct 8-12, Galway, Ireland

[3] Paschke, A., Schaefermeier, R. Aspect OntoMaven – aspect-oriented ontology development and configuration with OntoMaven. Tech. Rep. 1507.00212v1, Free University of Berlin (July 2015)

[4] Blomqvist, E., Sepour, A.S., Presutti, V. Ontology testing — methodology and tool. In: Proc. of EKAW’12. LNAI, vol. 7603, pp. 216-226. Springer (2012)

[5] Vrandecic, D., Gangemi, A. Unit tests for ontologies. In: OTM workshops 2006. LNCS, vol. 4278, pp. 1012-1020. Springer (2006)

[6] Warrender, J.D., Lord, P. How, What and Why to test an ontology. Technical Report 1505.04112, Newcastle University (2015), http://arxiv.org/abs/1505.04112

[7] Beck, K.: Test-Driven Development: by example. Addison-Wesley, Boston, MA (2004)

[8] Tort, A., Olive, A., Sancho, M.R. An approach to test-driven development of conceptual schemas. Data & Knowledge Engineering 70, 1088-1111 (2011)

[9] Keet, C.M., Lawrynowicz, A. Test-Driven Development of Ontologies. 13th Extended Semantic Web Conference (ESWC’16). Springer LNCS. 29 May – 2 June, 2016, Crete, Greece. (in print)

[10] Kollia, I., Glimm, B., Horrocks, I. SPARQL Query Answering over OWL Ontologies. In: Proc, of ESWC’11. LNCS, vol. 6643, pp. 382-396. Springer (2011)

[11] Keet, C.M., Lawrynowicz, A. Test-Driven Development of Ontologies (extended version). Technical Report, Arxiv.org http://arxiv.org/abs/1512.06211. Dec 19, 2015.

 

 

 

 

From data on conceptual models to optimal (logic) language profiles

There are manifold logic-based reconstructions of the main conceptual data modelling languages in a ‘gazillion’ of logics. The reasons for pursuing this line of work are good. In case you wonder, consider:

  • Automated reasoning over a conceptual data model to improve their quality and avoid bugs; e.g., an empty database table due to an inconsistency in the model (unsatisfiable class). Instead of costly debugging, one can catch it upfront.
  • Designing and executing queries with the model’s vocabulary cf. putting up with how the data is stored with its typically cryptic table and column names.
  • Test data generation in automation of software engineering.
  • Use it as ‘background knowledge’ during the query compilation stage (which helps optimizing it, so better performance querying a database).

Most of the research efforts on formalizing the conceptual data modelling languages have gone to capturing as much as possible of the modelling language, and therewith aiming to solve the first use case scenario. Runtime usage of conceptual models, i.e., use case scenarios 2-4 above, is receiving some attention, but it brings with it its own set of problems: which trade-offs are the best? That is, we know we can’t have both the modelling languages in their full glory formalised on some arbitrary (EXPTIME or undecidable) logic and have scalable runtime performance. But which subset to choose? There are papers where (logician) authors state something like ‘you don’t need keys in ER, so we ignore those’ or ‘let’s skip ternaries, as most relationships are binary anyway’ or ‘we sweep those pesky aggregation associations under the carpet’ or ‘hierarchies, disjointness and completeness are certainly important’. Who’s right? Or is neither one of them right?

So, we had all that data of the 101 UML, ER, EER, ORM, and ORM2 models analysed (see previous post and [1]). With that, we could construct evidence-based profiles based on the features that are actually used by modellers, rather than constructing profiles based on gut feeling or on one’s pet logic. We specified a core profile and one for each family of the conceptual data modelling languages under consideration (UML Class Diagrams, ER/EER, and ORM/ORM2). The details of the outcome can be found in our recently accepted paper “Evidence-based Languages for Conceptual Data Modelling Profiles” [1] that has been accepted at the 19th Conference on Advances in Databases and Information Systems (ADBIS’15), that will take place from September 8-12 in Poitiers, France. As with the other recent posts on conceptual data models, also this paper was co-authored with Pablo Fillottrani and is an output of our DST/MINCyT-funded bi-lateral project on the unification of conceptual data modelling languages (project overview).

To jump to the short answer: the core profile can be represented in \mathcal{ALNI} (called \mathcal{PL}_1 in [3], with PTIME subsumption), whereas the modelling language-specific profiles do not match any of the very many currently existing Description Logic languages with known computational complexity.

Now how we got into that situation. There are some formalization options to consider first, which can affect the complexity of the logic. Notably, 1) whether to use inverses or qualified number restrictions, and 2) whether to go for DL role components for UML’s association ends/ORM’s roles/ER’s relationship components with a 1:1 mapping, or to ignore that and formalise the associations/fact types/relationships only (and how to handle that choice then). Extending a logic language with inverses tends to be less costly computationally cf. qualified number restrictions, so we chose the former. The latter is more complicated to handle regardless the choice, which is partially due to the fact that they are surface aspects of an underlying difference in ontological commitment as to what relations are—so-called standard view versus positionalist—and how it is represented in the models (see discussion in the paper). For the core profile, the dataset of conceptual models justified binaries + standard view representation. In addition to that, the core profile has classes, attributes, mandatory and arbitrary (unqualified) cardinality, class subsumption, and single identification. That set covers 87.57% of all the entities in the models in the dataset (91.88% of the UML models, 73.29% of the ORM models, and 94.64% of the ER/EER models). Note there’s no disjointness or completeness (there were too few of them to merit inclusion) and no role and relationship subsumption, so there isn’t much one can deduce automatically, which is a bit of a bummer.

The UML profile extends the core only slightly, yet it covers 99.44% of the elements in the UML diagrams of the dataset: add cardinality on attributes, attribute value constraints, subsumption for DL roles (UML associations), and aggregation (they are plain associations since UML v2.4.1). This makes a “\mathcal{ALNHI}(D) ” DL that, as far as we know, hasn’t been investigated yet. That said, fiddling a bit by opting for unique name assumption and some constraints on cardinalities and role inclusion, it looks like DL\mbox{-}Lite^{\mathcal{HN}}_{core} [4] may suffice, which is NLOGSPACE in subsumption and AC^0 in data complexity.

For ER/EER, we need to add to the core the following to make it to 99.06% coverage: composite and multivalued attribute (remodelled), weak entity type with its identification constraint, ternaries, associative entity types, and multi-attribute identification. With some squeezing and remodelling things a bit (see paper), DL\mbox{-}Lite^{\mathcal{N}}_{core} [4] should do (also NLOGSPACE), though \mathcal{DLR}_{ifd} [5] will make the formalisation better to follow (though that DL has too many features and is EXPTIME-complete).

Last, the ORM/ORM2 profile, which is the largest to achieve a high coverage (98.69% of the elements in the models in the data set): the core profile + subsumption on roles (DL role components) and fact types (DL roles), n-aries, disjointness on roles, nested object types, value constraints, disjunctive mandatory, internal and external uniqueness, external identifier (compound reference scheme). There’s really no way to avoid the roles, n-aries, and disjointness. There’s no exactly fitting DL for this cocktail of features, though \mathcal{DLR}_{ifd} and $latex $\mathcal{CFDI}_{nc}^{\forall -} &s=-2$ [6] approximate it; however, the former has too much constructs and the latter too few. That said, \mathcal{DLR}_{ifd} is computationally not ‘well-behaved’, but with \mathcal{CFDI}_{nc}^{\forall -} we still can capture over 96% of the elements in the ORM models of the dataset and it’s PTIME (yup, tractable) [7].

The discussion section of the paper answers the research questions we posed at the beginning of the investigation and reflects on not only missing features, but also ‘useless’ ones. Perhaps we won’t make a lot of friends discussing ‘useless’ features, especially when some authors investigated exactly those features. Anyway, here it goes. Really, nominal are certainly not needed (and computationally costly to boot). We can only guess why there were so few disjointness and completeness constraints in the data set, and even when they were present, they were in the few models we got from textbooks (see data set for sources of the models); true, there weren’t a lot of class hierarchies, but still. The other thing that was a bit of a disappointment was that the relational properties weren’t used a lot. Looking at the relationships in the models, there were certainly opportunities for transitivity and more irreflexivity declarations. One of our current conjectures is that they have limited implementation support, so maybe modellers don’t see the point of adding such constraints; another could be that an ‘average modeller’ (whatever that means) doesn’t quite understand all the 11 that are available in ORM2.

Overall, while a bit disappointing for the use case scenario of reasoning over conceptual data models for inconsistency management, the results are actually very promising for runtime usage of conceptual data models. Maybe that of itself will generate more interest from industry in doing that analysis step before implementing a database or software application: instead of developing a conceptual data model “just for documentation and dust-gathering”, you’ll have one that also will add more, new, better advanced features to your application.

References

[1] Keet, C.M., Fillottrani, P.R. An analysis and characterisation of publicly available conceptual models. 34th International Conference on Conceptual Modeling (ER’15). Springer LNCS. 19-22 Oct, Stockholm, Sweden. (in print)

[2] Fillottrani, P.R., Keet, C.M. Evidence-based Languages for Conceptual Data Modelling Profiles. 19th Conference on Advances in Databases and Information Systems (ADBIS’15). Springer LNCS. Poitiers, France, Sept 8-11, 2015. (in print)

[3] Donini, F., Lenzerini, M., Nardi, D., Nutt, W. Tractable concept languages. In: Proc. of IJCAI’91. vol. 91, pp. 458-463. 1991.

[4] Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M. The DL-Lite family and relations. Journal of Artificial Intelligence Research, 2009, 36:1-69.

[5] Calvanese, D., De Giacomo, G., Lenzerini, M. Identification constraints and functional dependencies in Description Logics. In: Proc. of IJCAI’01, pp155-160, Morgan Kaufmann.

[6] Toman, D., Weddell, G. On adding inverse features to the Description Logic \mathcal{CFDI}_{nc}^{\forall} . In: Proc. of PRICAI 2014, pp587-599.

[7] Fillottrani, P.R., Keet, C.M., Toman, D. Polynomial encoding of ORM conceptual models in \mathcal{CFDI}_{nc}^{\forall -} . 28th International Workshop on Description Logics (DL’15). Calvanese, D., Konev, B. (Eds.), CEUR-WS vol. 1350, pp401-414. 7-10 June 2015, Athens, Greece.

First tractable encoding of ORM conceptual data models

For (relatively) many years I’ve been focusing on as-expressive-as-possible languages to represent information and knowledge, including the computationally impractical full first order logic, because one would/should want to be as precise as possible and required to represent the subject domain in an ontology and universe of discourse for the application in a conceptual data model. After all, one can always throw out the computationally unpleasant constructs later during the implementation stage, if the ontology or conceptual data model is intended for use at runtime, such as OBDA [1], test data generate for verification [2], and in the query compilation stage in RDBMSs [3]. The resulting slimmed theories/models may be different for different applications, but then at least the set of slimmed theories/models share their common understanding.

So, now I ventured in that area, not because there’s some logic x and conceptual modeling language y has to be forced into it, but it actually appears that many fancy construct/features are not used in publicly available conceptual data models anyway (see data set and xls with some analysis). The timing of the outcome of the analysis of the data set coincided with David Toman’s visit to UCT as part of his sabbatical and Pablo Fillottrani’s visit, who enjoyed the last exchange of our bi-lateral project on the unification of conceptual data modelling languages (project page). To sum up the issue we were looking at: the need for run-time usage of conceptual data models requires a tractable logic-based reconstruction of the conceptual models (i.e., in at most PTIME), which appeared to hardly exist or miss constructs important for conceptual models (regardless whether that was ORM, EER or UML Class Diagrams), or both.

The solution ended up to be a logic-based reconstruction for most of ORM2 using the \mathcal{CFDI}_{nc}^{\forall -} Description Logic, which also happens to be the first tractable encoding of (most of) ORM/ORM2. With this logic, several features important for conceptual models (i.e., occur relatively often) do have their proper encoding in the logic, notably n-aries, complex identification constraints, and n-ary role subsumption. The, admittedly quite tedious, mapping

Low resolution and small version of our DL15 poster summarising the contributions.

Low resolution and small version of our DL15 poster summarising the contributions.

captures over 96% of the constructs used in practice in the set of 33 ORM diagrams we analysed (see data set). Further, the results are easily transferable to EER and UML Class diagrams, with an even greater coverage. The results (and comparison with related works) are presented in our recently accepted paper at the 28th International Workshop on Description Logics (DL’15) that will take place form 7 to 11 June in Athens, Greece.

The list of accepted papers of DL’15 is available, listing 21 papers with long presentations, 16 papers with short presentation, and 26 papers with poster presentations. David will present our results in the poster session, as it’s probably of more relevance in the conceptual modelling community (and I’ll be marking exams then), and some other accepted papers cover more new ground, such as casting schema.org as a description logic, temporal query answering in EL, exact learning of ontologies, and more. The proceedings is will be online on CEUR-WS in the upcoming days as volume 1350. I’ve added a mini version of our poster on the right. I tried tikzposter, as they look really cool, but it doesn’t support figures (other than those made in latex), so I resorted to ppt (that doesn’t support math), wondering why these issues haven’t been solved by now.

Anyway, more about this topic is in the pipeline that I soon hope to be able to give updates on.

 

References

[1] Calvanese, D., Keet, C.M., Nutt, W., Rodriguez-Muro, M., Stefanoni, G. Web-based Graphical Querying of Databases through an Ontology: the WONDER System. ACM Symposium on Applied Computing (ACM SAC’10), March 22-26 2010, Sierre, Switzerland. pp 1389-1396.

[2] Toman, D., Weddell, G.E.: Fundamentals of Physical Design and Query Compilation. Synthesis Lectures on Data Management, Morgan & Claypool  Publishers (2011)

[3] Smaragdakis, Y., Csallner, C., Subramanian, R.: Scalable satisfiability checking and test data generation from modeling diagrams. Automation in Software Engineering 16, 73–99 (2009)

[4] Fillottrani, P.R., Keet, C.M., Toman, D. Polynomial encoding of ORM conceptual models in \mathcal{CFDI}_{nc}^{\forall -} . 28th International Workshop on Description Logics (DL’15). CEUR-WS vol xx., 7-10 June 2015, Athens, Greece.

Forum for AI Research 2015, Cape Town

In 10 day’s time, the (CAIR-driven) Forum for Artificial Intelligence Research 2015 (FAIR’15) Workshop will be held at UCT in Cape Town, South Africa, from March 30 to April 2. There are still some spaces available; registration is free, but please register (for catering purposes). What will you get for this ‘bargain price’? A lot of food for the mind!

FAIR’15 follows the same format as the previous 7 editions that went under various acronyms since 2008 (among others, MOWS, MOSS, MAIS, FAIR), with a mini-course, a tutorial, and postgraduate student presentations. This edition has the following on offer.

Ulrike Sattler (University of Manchester, UK) will present a mini-course on automated reasoners in the mornings. She will go into the details of what really happens when you click that menu option “start reasoner” and Protégé’s “?” that explains the deductions, and what are the factors that influence the reasoner’s performance.

David Toman (University of Waterloo, Canada) will present a 2-hour tutorial on using knowledge representation and reasoning (logic) for query optimization in relational databases and ontology-based data access (i.e., advanced aspects of database systems implementation).

Further, there are several sessions with postgraduate student presentations. Among others, Catherine Chavula will talk about new results (cf. [1]) in multilingual ontologies, Zubeida Khan will talk about foundational ontology interchangeability (details in [2]), and (very recently MSc cum laude graduated!) Nasubo Ongoma will present her thesis on logic-based temporal conceptual data modeling (including material from [3]). Gavin Rens will talk about probabilistic belief change, Kody Moodley on defeasible reasoning for description logics, Henriette Harmse about scenario testing with OWL, and Nishal Morar on taxonomic classification.

Aurona Gerber will give an overview of Data Science at CSIR, and for some more variety in the programme, I’ll talk about the stuff ontology [4]. Check the programme for all titles of the presentations and the abstracts of the mini-course and tutorial.

An important aim of FAIR is the networking among people in Southern Africa, and share and discuss informally our research in (predominantly) KR&R and related areas—so if the above topics sound interesting, or made you curious, or you would like to meet a potential MSc/PhD supervisor, you’re welcome to join (note: some basic knowledge of logics will be needed to understand the talks, though). If you have any questions, please don’t hesitate to contact one of the organisers, Arina Britz and me.

References

[1] Chavula, C., Keet, C.M. Is Lemon Sufficient for Building Multilingual Ontologies for Bantu Languages? 11th OWL: Experiences and Directions Workshop (OWLED’14). Keet, C.M., Tamma, V. (Eds.). Riva del Garda, Italy, Oct 17-18, 2014. CEUR-WS vol. 1265, 61-72.

[2] Khan, Z.C., Keet, C.M. Feasibility of automated foundational ontology interchangeability. 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14). K. Janowicz et al. (Eds.). 24-28 Nov, 2014, Linkoping, Sweden. Springer LNAI 8876, 225-237.

[3] Keet, C.M., Ongoma, E.A.N. Temporal Attributes: their Status and Subsumption. Asia-Pacific Conference on Conceptual Modelling (APCCM’15). Koehler, H., Saeki, M. (Eds.), Conferences in Research and Practice in Information Technology (CRPIT), Vol. 165. 27-30 January, 2015, Sydney, Australia.

[4] Keet, C.M. A core ontology of macroscopic stuff. 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14). K. Janowicz et al. (Eds.). 24-28 Nov, 2014, Linkoping, Sweden. Springer LNAI vol. 8876, 209-224.

Updated ontology engineering lecture notes (2015)

It’s that time of the year again, in the southern hemisphere that is, where course preparations for the academic year are going on full steam ahead. Also this year, I’ll be teaching a CS honours course on ontology engineering. To that end, the lecture notes have been updated, though not in a major way like last year. Some sections have been shuffled around, there are a few new exercises, Chris’s update suggestion from last year on the OBO-OWL mapping has been included, and a couple of typos and odd sentences have been fixed.

Practically, this installment will be a bit different from previous years, as it has integrated a small project on Semantic Wikis, funded by CILT and OpenUCT. Set up, maintenance, and filling it with contents on ontology engineering topics will initially be done ‘in house’ by students enrolled in the course and not be generally available on the Web, but if all goes well, it’ll be accessible to everyone some time in April this year, and possibly included in the OER Commons.

Semantic MediaWiki’s features are fairly basic and there are a bunch of plugins and extensions I’ve seen listed, but I didn’t check whether they all worked with the latest SMW. If you have a particular suggestion, please leave a comment or send me an email. One thing I’m still wondering about particularly, but haven’t found a solution to, is whether there’s a plugin that lets you see the (lightweight) ontology when adding contents, so that it makes it easier to use terms in the text from the ontology’s vocabulary rather than find an having to process manually whatever (near)synonyms have been used throughout the pages (like, one contributor using ‘upper ontology’, another ‘foundational ontology’ and the third ‘top-level ontology’), and allow on-the-fly extensions of that ontology.

Ontologies and Knowledge bases lecture notes for 2013

The lecture notes for the ontologies and knowledge bases module (COMP720) for semester 2 in 2013 are online available now. I’ve updated them compared to last year’s installment (mentioned here): in addition to the regular changes, like updates to reflect the advances made in the past year in ontology engineering, better explanations in several sections, and more examples, it includes the DL primer by Markus Kroetzsch, Ian Horrocks and Frantisek Simancik (saving me the time writing about that; thanks!), more exercises, and answers to selected exercises.

As last year, the target audience is computer science students in their 4th year (honours), so the notes are of an introductory nature. It has three blocks: logic foundations, ontology engineering, and advanced topics. The logic foundations contain a recap of FOL, the DL primer and the basics of automated reasoning with the Description Logics with ALC, the DL-based OWL species, and some practical automated reasoning. The ontology engineering block starts with top-down ontology development using foundational ontologies, then bottom-up ontology development to extract knowledge from ‘legacy’ representations, and finally (perhaps too briefly), methods and methodologies. The advanced topics are balanced in two directions, where the first one certainly will be covered and the second one if time permits: ontology-based data access applications (i.e., an ontology-drive information system) and temporal ontologies.

It is essentially still an evolving document, and relative completeness of sections varies slightly. Suggestions and corrections are welcome! If you want to use a part of it in your own lectures and/or use the accompanying slides with it, please contact me.