# OBDA/I Example in the Digital Humanities: food in the Roman Empire

A new installment of the Ontology Engineering module is about to start for the computer science honours students who selected it, so, in preparation, I was looking around for new examples of what ontologies and Semantic Web technologies can do for you, and that are at least somewhat concrete. One of those examples has an accompanying paper that is about to be published (can it be more recent than that?), which is on the production and distribution of food in the Roman Empire [1]. Although perhaps not many people here in South Africa care about what happened in the Mediterranean basin some 2000 years ago, it is a good showcase of what one could also do here with historical and archaeological information (e.g., an inter-university SA project on digital humanities started off a few months ago, and several academics and students at UCT contribute to the Bleek and Lloyd Archive of |xam (San) cultural heritage, among others). And the paper is relatively readable, also for the non-expert.

So, what is it about? Food was stored in pots (more precisely: amphorae) that had engravings on them with text about who, what, where, etc., and a lot of that has been investigated, documented, and stored in multiple resources, such as databases. None of the resources covers all data points, but to advance research and understanding about it and about food trading systems in general, the data has to be combined somehow and made easily accessible to the domain experts. That is, essentially, it is an instance of a data access and integration problem.

There are a couple of principal approaches to address that. It is usually done by an Extract-Transform-Load (ETL) of each separate resource into one database or digital library, and then putting a web-based front-end on top of it. There are many shortcomings to that solution, such as having to repeat the ETL procedure upon updates in the source databases, a single point of control, and the typically canned (i.e., fixed) queries of the interface. A more recent approach, of which the technologies finally are maturing, is Ontology-Based Data Access (OBDA) and Ontology-Based Data Integration (OBDI). I say “finally” here, as I can still remember very well the predecessors we struggled with some 7-8 years ago [2,3] (informally here, here, and here), and “maturing”, as the software has become more stable, has more features, and some of the things we had to do manually back then have been automated now. The general idea of OBDA/I applied to the Roman Empire food system is shown in the figure below.

OBDA in the EPnet system (Source: [1])

There are the data sources, which are federated (one ‘middle layer’, though still at the implementation level). The federated interface has mapping assertions to elements in the ontology. The user can then use the terms of the ontology (classes and their relations and attributes) to query the data, without having to know how the data is stored and without having to write page-long SQL queries. For instance, a query “retrieve inscriptions on amphorae found in the city of ‘Mainz’ containing the text ‘PNN’” would use just the terms in the ontology, say, Inscription, Amphora, City, found in, and inscribed on, plus any value constraint added (like the PNN), and the OBDA/I system takes care of the rest.

Interestingly, the authors of [1]—admittedly, three of them are former colleagues from Bolzano—used the same approach to setting up the ontology component as we did for [3]. While we will use the Protégé Ontology Development Environment in the OE module, it is not the best modelling tool to overcome the knowledge acquisition bottleneck. The authors modelled together with the domain experts in the much more intuitive ORM language, using the NORMA tool, and first represented whatever needed to be represented. This also included reuse of relevant related ontologies and non-ontology material, and modularizing it for better knowledge management, thereby ameliorating cognitive overload. A subset of the resultant ontology was then translated into the Web Ontology Language OWL (more precisely: OWL 2 QL, a tractable profile of OWL 2 DL), which is what is actually used in the OBDA system. We did that manually back then; now this can be done automatically (yay!).

Skipping here over the OBDI part and considering it done, the main third step in setting up an OBDA system is to link the data to the elements in the ontology. This is done in the mapping layer, essentially with assertions of the form “TermInTheOntology <- SQLqueryOverTheSource”. Abstracting from the actual syntax of the OBDA system and simplifying the query for readability (see the real one in the paper), an example to retrieve all Dressel 1 type amphorae, named Dressel1Amphora in the ontology, from all the data sources of the system would thus look as follows:

```
Dressel1Amphora <-
    SELECT ic.id
    FROM ic JOIN at ON at.carrier=ic.id
    WHERE at.type='DR1'
```

Or some such SQL query (typically larger than this one). This takes a bit of time to do, but has to be done only once, for these mappings are stored in a separate mapping file.
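Conceptually, such a mapping file is just a lookup from ontology terms to SQL queries over the sources. A minimal sketch of that idea in Python (the table and column names `ic`, `at`, `carrier`, and `type` follow the simplified query above and are hypothetical; the actual mapping syntax of an OBDA system differs):

```python
# Minimal sketch of an OBDA mapping store: ontology term -> SQL query.
# Table/column names (ic, at, carrier, type) follow the simplified
# example above and are hypothetical.

mappings = {
    "Dressel1Amphora":
        "SELECT ic.id FROM ic JOIN at ON at.carrier=ic.id "
        "WHERE at.type='DR1'",
}

def sql_for(term):
    """Return the SQL query that retrieves the instances of an ontology term."""
    return mappings[term]

print(sql_for("Dressel1Amphora"))
```

The point being: the translation is written down once, and every subsequent query over the ontology reuses it.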

The domain expert, then, when wanting to know about the Dressel 1 amphorae in the system, only has to ask ‘retrieve all Dressel1 amphorae’ rather than create the SQL query, and thus remains oblivious to which tables and columns are involved in obtaining the answer, and to the fact that some data entry person at some point had mysteriously decided not to use ‘Dressel1’ but his own abbreviation ‘DR1’.

The actual ‘retrieve all Dressel1 amphorae’ is then a SPARQL query over the ontology, e.g.,

`SELECT ?x WHERE {?x rdf:type :Dressel1Amphora .}`

which is surely shorter and therefore easier to handle for the domain expert than the SQL one. The OBDA system (-ontop-) takes this query and reasons over the ontology to see whether the query can be answered directly by it without consulting the data, or else can be rewritten given the other knowledge in the ontology (it can; see example 5 in the paper). The outcome of that process is then unfolded using the relevant mappings. From that, the complete SQL query is constructed, which is sent to the (federated) data source(s), which process the query as any relational database management system does and return the data to the user interface.
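To illustrate the rewrite-and-unfold idea (only the idea; this is not -ontop-’s actual algorithm), here is a toy sketch in Python: a query over a class is first rewritten into a union over that class and its subclasses according to the ontology, and each class is then unfolded into its mapped SQL query. The toy subclass axiom and the SQL strings are all illustrative:

```python
# Toy sketch of OBDA query rewriting and unfolding (not -ontop- itself).
# The subclass axiom, class names, and SQL below are illustrative only.

subclasses = {"Amphora": ["Dressel1Amphora"]}  # Dressel1Amphora is-a Amphora
mappings = {
    "Amphora":         "SELECT ic.id FROM ic",
    "Dressel1Amphora": "SELECT ic.id FROM ic JOIN at ON at.carrier=ic.id "
                       "WHERE at.type='DR1'",
}

def rewrite(cls):
    """Rewrite a query over cls into cls plus all its (transitive) subclasses."""
    result = [cls]
    for sub in subclasses.get(cls, []):
        result.extend(rewrite(sub))
    return result

def unfold(classes):
    """Unfold each class into its mapped SQL query and combine with UNION."""
    return "\nUNION\n".join(mappings[c] for c in classes if c in mappings)

print(unfold(rewrite("Amphora")))
```

So asking for all amphorae also fetches the Dressel 1 ones, even though they are stored under the ‘DR1’ code in the source.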

It is, perhaps, still unpleasant that domain experts have to put up with another query language, SPARQL, as the paper notes as well. Some efforts have gone into sorting out that ‘last mile’, such as using a (controlled) natural language to pose the query or reusing that original ORM diagram in some way, but more needs to be done. We tried the latter in [3]; that proof-of-concept worked with a neutered version of ORM, and we have screenshots and videos to prove it, but in working on extensions and improvements, a new student uploaded buggy code onto the production server, so that online resource doesn’t work anymore. We didn’t roll back and reinstall an older version, with me having moved to South Africa and the original student-developer, Giorgio Stefanoni, away studying for his MSc.

Note to OE students: This is by no means all there is to OBDA/I, but hopefully it has given you a bit of an idea. Read at least sections 1-3 of paper [1], and if you want to do an OBDA mini-project, then read also the rest of the paper and then Chapter 8 of the OE lecture notes, which discusses in a bit more detail the motivations for OBDA and the theory behind it.

References

[1] Calvanese, D., Liuzzo, P., Mosca, A., Remesal, J., Rezk, M., Rull, G. Ontology-Based Data Integration in EPNet: Production and Distribution of Food During the Roman Empire. Engineering Applications of Artificial Intelligence, 2016. To appear.

[2] Keet, C.M., Alberts, R., Gerber, A., Chimamiwa, G. Enhancing web portals with Ontology-Based Data Access: the case study of South Africa’s Accessibility Portal for people with disabilities. Fifth International Workshop OWL: Experiences and Directions (OWLED 2008), 26-27 Oct. 2008, Karlsruhe, Germany.

[3] Calvanese, D., Keet, C.M., Nutt, W., Rodriguez-Muro, M., Stefanoni, G. Web-based Graphical Querying of Databases through an Ontology: the WONDER System. ACM Symposium on Applied Computing (ACM SAC 2010), March 22-26 2010, Sierre, Switzerland. ACM Proceedings, pp1389-1396.

# Reblogging 2012: Fixing flaws in OWL object property expressions

From the “10 years of keetblog – reblogging: 2012”: There are several 2012 papers I (co-)authored that I like and would have liked to reblog—whatever their citation counts may be. Two are on theoretical, methodological, and tooling advances in ontology engineering using foundational ontologies in various ways, in collaboration with Francis Fernandez and Annette Morales following a teaching and research visit to Cuba (ESWC’12 paper on part-whole relations), and a dedicated Honours student who graduated cum laude, Zubeida Khan (EKAW’12 paper on foundational ontology selection). The other one, reblogged here, is of a more fundamental nature—principles of role [object property] hierarchies in ontologies—and ended up winning best paper award at EKAW’12; an extended version has been published in JoDS in 2014. I’m still looking for a student to make a proof-of-concept implementation (in short, thus far: when some are interested, there’s no money, and when there’s money, there’s no interest).

———–

OWL 2 DL is a very expressive language and, thanks to ontology developers’ persistent requests, has many features for declaring complex object property expressions: object sub-properties, (inverse) functional properties, disjointness, equivalence, cardinality, (ir)reflexivity, (a)symmetry, transitivity, and role chaining. A downside of this is that the more one can do, the higher the chance that flaws in the representation are introduced; hence, an unexpected or undesired classification or inconsistency may actually be due to a mistake in the object property box, not a class axiom. While there are nifty automated reasoners and explanation tools that help with the modelling exercise, the standard reasoning services for OWL ontologies assume that the axioms in the ‘object property box’ are correct and according to the ontologist’s intention. This may not be the case. Take, for instance, the following three examples, where either the assertion is not according to the intention of the modeller, or the consequence may be undesirable.

• Domain and range flaws: asserting hasParent $\sqsubseteq$ hasMother instead of hasMother $\sqsubseteq$ hasParent in accordance with their domain and range restrictions (i.e., a subsetting mistake—a more detailed example can be found in [1]), or declaring a domain or a range to be an intersection of disjoint classes;
• Property characteristics flaws: e.g., the family-tree.owl (when accessed on 12-3-2012) has hasGrandFather $\sqsubseteq$ hasAncestor and Trans(hasAncestor), so that transitivity unintentionally is passed down the property hierarchy, yet hasGrandFather is really intransitive (but that cannot be asserted in OWL);
• Property chain issues: for instance, the chain hasPart $\circ$ hasParticipant $\sqsubseteq$ hasParticipant in the pharmacogenomics ontology [2] forces the classes in class expressions using these properties—in this case, DrugTreatment and DrugGeneInteraction—to be either processes, due to the domain of the hasParticipant object property, or else inconsistent.
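The first kind of flaw lends itself to a mechanical check: if $R \sqsubseteq S$ is asserted, then the domain and range of R should be subsumed by those of S. A toy version of that check in Python (a strong simplification of the RBox compatibility idea; the little class taxonomy and the property declarations are illustrative only):

```python
# Toy check of the subsetting constraint on object subproperties:
# if R is a subproperty of S, then domain(R) must be subsumed by
# domain(S), and range(R) by range(S). Taxonomy/properties illustrative.

parents = {"Mother": "Person", "Person": None}  # subclass -> direct superclass

def subclass_of(sub, sup):
    """Reflexive-transitive subclass test over the toy taxonomy."""
    while sub is not None:
        if sub == sup:
            return True
        sub = parents.get(sub)
    return False

props = {  # property -> (domain, range)
    "hasParent": ("Person", "Person"),
    "hasMother": ("Person", "Mother"),
}

def compatible_subproperty(r, s):
    """True iff asserting r as a subproperty of s respects domain/range."""
    (dom_r, ran_r), (dom_s, ran_s) = props[r], props[s]
    return subclass_of(dom_r, dom_s) and subclass_of(ran_r, ran_s)

print(compatible_subproperty("hasMother", "hasParent"))  # intended direction: True
print(compatible_subproperty("hasParent", "hasMother"))  # flawed assertion: False
```

A real service, of course, works over the full OWL 2 class and property hierarchies rather than a hand-coded dictionary.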

Unfortunately, reasoner output and explanation features in ontology development environments do not point to the actual modelling flaw in the object property box. This is because the implemented justification and explanation algorithms [3, 4, 5] consider logical deductions only, and because class axioms and assertions about instances take precedence over what ‘ought to be’ concerning object property axioms, so that only instances and classes can move about in the taxonomy. This makes sense from a logic viewpoint, but it is not enough from an ontology quality viewpoint, as an object property inclusion axiom—that is, the property hierarchy, the domain and range axioms to type the property, a property’s characteristics (reflexivity etc.), and property chains—may well be wrong, and this should be detected as such, with corrections proposed.

So, we have to look at what types of mistakes can be made in object property expressions, how one can get the modeller to choose the ontologically correct options in the object property box so as to achieve a better quality ontology, and, in case of flaws, how to guide the modeller to the root defect from the modeller’s viewpoint and propose corrections. That is: recognise the flaw, explain it, and suggest revisions.

To this end, two non-standard reasoning services were defined [6], in a paper that has recently been accepted at the 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12): SubProS and ProChainS. The former is an extension to the RBox Compatibility Service for object subproperties of [1], so that it now also handles the object property characteristics in addition to the subsetting way of asserting object sub-properties, and covers the OWL 2 DL features as a minimum. The latter is a new ontological reasoning service, which checks whether a chain’s properties are compatible by assessing the domain and range axioms of the participating object properties. Both compatibility services exhaustively check all permutations and therewith pinpoint the root cause of the problem (if any) in the object property box. In addition, if a test fails, one or more proposals are made as to how best to revise the identified flaw (depending on the flaw, this may include the option to ignore the warning and accept the deduction). Put differently: SubProS and ProChainS can be considered so-called ontological reasoning services, because the ontology does not necessarily contain logical errors for some of the flaws detected; these two services thus fall in the category of tools that focus on both logic and additional ontology quality criteria, aiming toward ontological correctness in addition to just a satisfiable logical theory (on this topic, see also the works on anti-patterns [7] and OntoClean [8]). Hence, it is different from other works on explanation and pinpointing mistakes that concern logical consequences only [3,4,5], and SubProS and ProChainS also propose revisions for the flaws.
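The flavour of the chain check can be conveyed with a similar toy sketch (one plausible simplification, not the actual definition from [6]): for a chain $R \circ S \sqsubseteq T$ to be compatible, the range of the first property has to fit the domain of the second, and the chain’s overall domain and range have to fit those of the superproperty. All class and property declarations below are illustrative:

```python
# Toy chain-compatibility test in the spirit of ProChainS: for a chain
# R o S subsumed by T, check range(R) against domain(S), and the chain's
# endpoints against domain(T)/range(T). Declarations are illustrative.

parents = {"Process": None, "Molecule": None}  # subclass -> direct superclass

def subclass_of(sub, sup):
    """Reflexive-transitive subclass test over the toy taxonomy."""
    while sub is not None:
        if sub == sup:
            return True
        sub = parents.get(sub)
    return False

props = {  # property -> (domain, range)
    "hasPart":        ("Process", "Process"),
    "hasParticipant": ("Process", "Molecule"),
}

def chain_compatible(r, s, t):
    """Check the chain r o s subsumed-by t against domain and range axioms."""
    (dom_r, ran_r), (dom_s, ran_s), (dom_t, ran_t) = props[r], props[s], props[t]
    return (subclass_of(ran_r, dom_s)       # r's range feeds s's domain
            and subclass_of(dom_r, dom_t)   # chain's start fits t's domain
            and subclass_of(ran_s, ran_t))  # chain's end fits t's range

print(chain_compatible("hasPart", "hasParticipant", "hasParticipant"))  # True
print(chain_compatible("hasParticipant", "hasPart", "hasParticipant"))  # False
```

The declared domain of hasParticipant being a process is also exactly what pushes DrugTreatment and DrugGeneInteraction towards being processes in the example from [2] above.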

SubProS and ProChainS were evaluated (manually) with several ontologies, including BioTop and DMOP, which demonstrated that the proposed ontological reasoning services indeed isolate flaws and propose useful corrections, which have been incorporated in the latest revisions of those ontologies.

Theoretical details, the definition of the two services, as well as detailed evaluation and explanation going through the steps can be found in the EKAW’12 paper [6], which I’ll present some time between 8 and 12 October in Galway, Ireland. The next phase is to implement an efficient algorithm and make a user-friendly GUI that assists with revising the flaws.

References

[1] Keet, C.M., Artale, A.: Representing and reasoning over a taxonomy of part-whole relations. Applied Ontology 3(1-2) (2008) 91–110

[2] Dumontier, M., Villanueva-Rosales, N.: Modeling life science knowledge with OWL 1.1. In: Fourth International Workshop OWL: Experiences and Directions 2008 (OWLED 2008 DC). (2008) Washington, DC (metro), 1-2 April 2008

[3] Horridge, M., Parsia, B., Sattler, U.: Laconic and precise justifications in OWL. In: Proceedings of the 7th International Semantic Web Conference (ISWC 2008). Volume 5318 of LNCS., Springer (2008)

[4] Parsia, B., Sirin, E., Kalyanpur, A.: Debugging OWL ontologies. In: Proceedings of the World Wide Web Conference (WWW 2005). (2005) May 10-14, 2005, Chiba, Japan.

[5] Kalyanpur, A., Parsia, B., Sirin, E., Grau, B.: Repairing unsatisfiable concepts in OWL ontologies. In: Proceedings of ESWC’06. Springer LNCS (2006)

[6] Keet, C.M. Detecting and Revising Flaws in OWL Object Property Expressions. 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW’12), Oct 8-12, Galway, Ireland. Springer, LNAI, 15p. (in press)

[7] Roussey, C., Corcho, O., Vilches-Blazquez, L.: A catalogue of OWL ontology antipatterns. In: Proceedings of K-CAP’09. (2009) 205–206

[8] Guarino, N., Welty, C.: An overview of OntoClean. In Staab, S., Studer, R., eds.: Handbook on ontologies. Springer Verlag (2004) 151–159

# CFP Logics and Reasoning for Conceptual Models (LRCM 2016)

Just in case you don’t have enough to do these days, or want to ‘increase exposure’ when attending KR2016/DL2016/NMR2016 in Cape Town in April, or try to use it as another way in to attend KR2016/DL2016/NMR2016, or [fill in another reason]: please consider submitting a paper or an abstract to the Second Workshop on Logics and Reasoning for Conceptual Models (LRCM 2016):

```
================================================================
Second Workshop on Logics and Reasoning for Conceptual Models (LRCM 2016)
April 21, 2016, Cape Town, South Africa
http://lrcm2016.cs.uct.ac.za/
================================================================
Co-located with:
15th Int. Conference on Knowledge Representation and Reasoning (KR 2016)
http://kr2016.cs.uct.ac.za/
29th Int. Workshop on Description Logics (DL 2016)
http://dl2016.cs.uct.ac.za/
================================================================

There is an increase in complexity of information systems due to,
among others, company mergers with information system integration,
upscaling of scientific collaborations, e-government etc., which push
the necessity for good quality information systems. An information
system’s quality is largely determined in the conceptual modelling
stage, and avoiding or fixing errors of the conceptual model saves
resources during design, implementation, and maintenance. The size and
high expressivity of conceptual models represented in languages such
as EER, UML, and ORM require a logic-based approach in the
representation of information and adoption of automated reasoning
techniques to assist in the development of good quality conceptual
models. The theory to achieve this is still in its infancy, however,
with only a limited set of theories and tools that address subtopics
in this area. This workshop aims at bringing together researchers
working on the logic foundations of conceptual data modelling
languages and the reasoning techniques that are being developed so as
to discuss the latest results in the area.

**** Topics ****
Topics of interest include, but are not limited to:
- Logics for temporal and spatial conceptual models and BPM
- Deontic logics for SBVR
- Other logic-based extensions to standard conceptual modelling languages
- Unifying formalisms for conceptual schemas
- Decidable reasoning over conceptual models
- Dealing with finite and infinite satisfiability of a conceptual model
- Reasoning over UML state and behaviour diagrams
- Reasoning techniques for EER/UML/ORM
- Interaction between ontology languages and conceptual data modelling languages
- Tools for logic-based modelling and reasoning over conceptual models
- Experience reports on logic-based modelling and reasoning over conceptual models
- Logics and reasoning over models for Big Data

To this end, we solicit mainly theoretical contributions with regular
talks and implementation/system demonstrations and some modelling
experience reports to facilitate cross-fertilisation between theory
and praxis.  Selection of presentations is based on peer-review of
submitted papers by at least 2 reviewers, with a separation between
theory and implementation & experience-type of papers.

**** Submissions ****
We welcome submissions in LNCS style in the following two formats for
oral presentation:
- Extended abstracts of maximum 2 pages;
- Research papers of maximum 10 pages.
Both can be submitted in pdf format via the EasyChair website at
https://easychair.org/conferences/?conf=lrcm2016.

**** Important dates ****
Submission of papers/abstracts:  February 7, 2016
Notification of acceptance:      March 15, 2016
Workshop:                        April 21, 2016

**** Organisers ****
Diego Calvanese (Free University of Bozen-Bolzano, Italy)
Alfredo Cuzzocrea (University of Trieste and ICAR-CNR, Italy)
Maria Keet (University of Cape Town, South Africa)

**** PC Members ****
Alessandro Artale (Free University of Bozen-Bolzano, Italy)
Arina Britz (Stellenbosch University, South Africa)
Thomas Meyer (University of Cape Town, South Africa)
Marco Montali (Free University of Bozen-Bolzano, Italy)
Till Mossakowski (University of Magdeburg, Germany)
Anna Queralt (Barcelona Supercomputing Center, Spain)
Vladislav Ryzhikov (Free University of Bozen-Bolzano, Italy)
Pablo Fillottrani (Universidad Nacional del Sur, Argentina)
Szymon Klarman (Brunel University London, UK)
Roman Kontchakov (Birkbeck, University of London, UK)
Oliver Kutz (Free University of Bozen-Bolzano, Italy)
Ernest Teniente (Universitat Politecnica de Catalunya, Spain)
David Toman (University of Waterloo, Canada)
(Further invitations pending)

Depending on the number of submissions, the duration of the workshop
will be either half a day or a full day.
```

# A new selection of book reviews (from 2015)

By now a regular fixture for the new year (5th time in the 10th year of this blog), I’ll briefly comment on some of the fiction novels I have read the past year, then two non-fiction ones. They are in the picture on the right (minus The accidental apprentice). Unlike last year’s list, they’re all worthy of a read.

Fiction

The devil to pay by Hugh FitzGerald Ryan (2011). Although I’m not much of a history novel fan, the book is a fascinating read. It is a romanticised story based on the many historical accounts of Alice the Kyteler and her maidservant Petronilla de Midia, the latter being the first person to be tortured and burned at the stake for heresy in Ireland (on 3 Nov 1324, in Kilkenny, to be precise). Unlike the usual histories where men take centre stage, the protagonist, Alice the Kyteler, is a successful and rich businesswoman who had had four husbands (serially), and one thread through the story is a description of daily life in those Middle Ages for all people involved—rich, poor, merchants, craftsmen, monks, the English vs. the Irish, and so on. It is written as a snapshot of the life of the ordinary people who come and go, insignificant in the grander scheme of things. At some point, however, Alice and Petronilla are accused of sorcery on some made-up charges from people who want a bigger slice of the pie and are also motivated by envy, which brings to the foreground the second thread in the story: the power play between the Church, which actively tried to increase its influence in those days, the secular politics with non-church and/or atheist people in power, and the laws and functioning legal system at the time. This clash is what turned the every-day-life setting into one that ended up having been recorded in writing and remembered and analysed by historians. All did not end well for the main people involved, but there’s a small sweet revenge twist at the end.

Black widow society by Angela Makholwa (2013). Fast-paced, with lots of twists and turns, this highly recommendable South African crime fiction describes the gradual falling apart of a secret society of women who had their abusive husbands murdered. The adjective ‘exciting’ is probably not appropriate for such a morbid topic, but it’s written in a way that easily sucks you into the schemes and quagmires of the four main characters (The Triumvirate and their hired assassin), and wanting to know how they get out of the dicey situations. Spoiler alert: some do, some don’t. See also the short extract, and there’s an ebook version for those who’d prefer that over buying a hardcopy in South Africa (if you’re nearby, you can borrow my hardcopy, of course).

The accidental apprentice by Vikas Swarup (2012). It’s a nice read, but my memory of the details is a bit sketchy by now and I lent out the book; I recall liking it more for reading a novel about India by an Indian author rather than the actual storyline, even though I had bought it for the latter reason only. The story is about a young female sales clerk in India who has to pass several ‘life tests’ somehow orchestrated by a very rich businessman; if she passes, she can become CEO of his company. The life tests are about one’s character in challenging situations and inventiveness to resolve it. Without revealing too much of how it ends, I think it would make a pleasant Bollywood or Hollywood movie.

Moxyland by Lauren Beukes (2008). Science fiction set in Cape Town. It has a familiar SF setting of a dystopian future with more technology and people somehow ruled/enslaved by it, a divide between haves and have-nots, and a sinister authoritarian regime to suppress the masses. A few individuals try to act against it but get sucked into the system even more. It is not that great as a story, yet it is nice to read an SF novel that’s situated in the city I live in.

Muh by David Safier (2012). One of the cows in a herd on a farm in Germany finds out they’re all destined for the slaughterhouse, and the cow escapes with a few other cows and a bull to travel to the cows’ paradise on earth: India. The main part of the book is about that journey, interspersed with very obvious references to various religious ideas and prejudices. I bought it because I very much enjoyed the author’s other book, Mieses Karma (reviewed here). Muh was readable enough—which is more than can be said of the few half-read books lying around in a state of abandon—but not nearly as good and fun as Mieses Karma. On a different note, this book is probably only available in German.

Non-fiction

The Big Short by Michael Lewis (2010). The book chronicles the crazy things that happened in the financial sector that led to the inevitable crash in 2008. It reads like a suspense thriller, but it is apparently a true account of what happened inside the system, making it jaw-dropping. There are irresponsible people in the system, and there are other irresponsible people in the system. Some of them—the “misfits, renegades and visionaries”—saw it coming and bet that it would crash, making more money the bigger the misfortunes of others. Others didn’t see it coming, due to their feckless behaviour, laziness, greed, short-sightedness, ignorance and all that, so that they bought into bond/share/mortgage packages that could only go downhill and thus lost a lot of money. For those who are not economists or conversant in financial jargon, it is not always an easy read the more complex the crazy schemes get—that was also a problem for some of the people in the system, btw—but even if you read over some of the explanations of part of a scheme, the message will be clear: it’s so rotten. A movie based on the book just came out.

17 Contradictions and the end of capitalism by David Harvey (2014). There are good reviews of this book online (e.g., here and here), which see it as a good schematic introduction to Marxist political economy. I have little to add to that. In Harvey’s own words, the two aims of the book were “to define what anti-capitalism might entail… [and] to give rational reasons for becoming anti-capitalist in the light of the contemporary state of things.” Overall, dissecting and clearly describing the contradictions can indeed be fertile ground for helping to end capitalism, as contradictions are the weak spots of a system and cannot persist indefinitely. Its chapter 8, ‘Technology, work, and human disposability’, could be interesting reading material for a social issues and professional practice course on technology and society, with a subsequent discussion session or essay-writing exercise on it. Locally, in light of the student protests we recently had (discussed earlier): if you don’t have enough time to read the whole book, then check out at least chapters 13 ‘Social reproduction’ and 14 ‘Freedom and domination’, and, more generally with respect to society, chapter 17 ‘The revolt of human nature: universal alienation’, the conclusions & epilogue, and a few of the foundational contradictions, notably the ones on private property & common wealth and capital & labour.

Previous editions: books on (South) Africa from 2011, some more and also general books in 2012, book suggestions from 2013, and the mixed bag from 2014.