The Data Mining OPtimization ontology (DMOP) is a highly axiomatised ontology that uses almost all features of OWL 2 DL and the domain entities are linked to DOLCE, using all four main ‘branches’ of DOLCE. Some details are described in last year’s OWLED’13 paper [1] and a blog post. We did observe ‘slow’ reasoner performance to classify the ontology, however, like, between 10 and 20 minutes, varying across versions and machines. The Ontology Reasoner Evaluation (ORE’14) workshop (part of the Vienna Summer of Logic) was a nice motivation to have a look at trying to figure out what’s going on, and some initial results are described briefly in the 6 pages-short paper [2], which is co-authored with Claudia d’Amato, Agnieszka Lawrynowicz, and Zubeida Khan.
Those results are definitely what can be called interesting, even though we’re still at the level of dabbling into it from a reasoner user-centric viewpoint, and notably, from a modeller-centric viewpoint. The latter is what made us pose questions like “what effect does using feature x have on performance of the reasoner?”. No one knew, except for the informal feedback back I received at DL 2010 on [3] that reasoning with data types slows down things, and likewise when the cardinalities are high. That’s not an issue with DMOP, though.
So, the first thing we did was determining a baseline on a good laptop—your average modeller doesn’t have a HPC cluster readily at hand—and in an Ontology Development Environment, where the reasoner is typically accessed from. Some 9 minutes to classify the ontology (machine specs and further details in the paper).
The second steps were the analysis of one specific modeling construct (inverses), and what effect DOLCE has on the overall performance.
The reason why we chose representation of inverses is because in OWL 2 DL (cf. OLW DL), one can use the objectInverseOf(OP) to use the inverse of an object property instead of extending the ontology’s vocabulary and using InverseObjectProperties(OPE1 OPE2) to relate the property and its inverse. For instance, to use the inverse the property addresses in an axiom, one used to have to introduce a new property, addressed by, declare it inverse to addresses, and then use that in the axiom, whereas in OWL 2 DL, one can use ObjectInverseOf(addresses) in the axiom (in Protégé, the syntax is inverse(addresses)). That slashed computing the class hierarchy by at least over a third (and about half for the baseline). Why? We don’t know. Other features used in DMOP, such as punning and property chains, were harder to remove and are heavily used, so we didn’t test those.
The other one, removing DOLCE, is a bit tricky. But to give away the end results upfront: that made it 10 times faster! The ‘tricky’ part has to do with the notion of ‘linking to a foundational ontology’ (deserving of its own blog post). For DMOP, we had not imported but merged, and we did not merge everything from DOLCE and its ExtendedDnS, but only what was deemed relevant, being, in numbers, 43 classes, 78 object properties and 593 axioms. To make matters worse—from an evaluation viewpoint, that is—is that we reused heavily three DOLCE object properties, so we kept those three DOLCE properties in the evaluation file, as we suspected it otherwise would have affected the deductions too much and interfere with the DOLCE-or-not question (one also could argue that those three properties can be considered an integral part of DMOP). So, it was not a simple case of ‘remove the import statement and run the reasoner again’, but a ‘remove almost everything with a DOLCE URI manually and then run the reasoner again’.
Because computation was so ‘slow’, we wondered whether maybe cleverly modularizing DMOP could be the way to go, in case someone wants to use only a part of DMOP. We got as far as trying to modularize the ontology, which already was not trivial because DMOP and DOCLE are both highly axiomatised and with few, if any, relatively isolated sections amenable to modularization. Moreover, what it did show, is that such automated modularization (when it was possible) only affects the number of class and number of axioms, not the properties and individuals. So, the generated modules are stuck with properties and individuals that are not used in, or not relevant for, that module. We did not fix that manually. Also, putting back together the modules did not return it to the original version we started out with, missing 225 axioms out of the 4584.
If this wasn’t enough already, the DMOP with/without DOLCE test was performed with several reasoners, out of curiosity, and they gave different output. FaCT++ and MORe had a “Reasoner Died” message. My ontology engineering students know that, according to DOLCE, death is an achievement, but I guess that its reasoners’ developers would deem otherwise. Pellet and TrOWL inferred inconsistent classes; HermiT did not. Pellet’s hiccup had to do with datatypes and should not have occurred (see paper for details). TrOWL fished out a modeling issue from all of those 4584 axioms (see p5 of the paper), of the flavour as described in [4] (thank you), but with the standard semantics of OWL—i.e., not caring at all about the real semantics of object property hierarchies—it should not have derived an inconsistent class.
Overall, it feels like having opened up a can of worms, which is exciting.
References
[1] Keet, C.M., Lawrynowicz, A., d’Amato, C., Hilario, M. Modeling issues and choices in the Data Mining OPtimisation Ontology. 8th Workshop on OWL: Experiences and Directions (OWLED’13), 26-27 May 2013, Montpellier, France. CEUR-WS vol 1080.
[2] Keet, C.M., d’Amato, C., Khan, Z.C., Lawrynowicz, A. Exploring Reasoning with the DMOP Ontology. 3rd Workshop on Ontology Reasoner Evaluation (ORE’14). July 13, 2014, Vienna, Austria. CEUR-WS vol (accepted).
[3] Keet, C.M. On the feasibility of Description Logic knowledge bases with rough concepts and vague instances. 23rd International Workshop on Description Logics (DL’10), 4-7 May 2010, Waterloo, Canada.
[4] Keet, C. M. (2012). Detecting and revising flaws in OWL object property expressions. In Proc. of EKAW’12, volume 7603 of LNAI, pages 252–266. Springer.
Pingback: Journal paper on Data Mining OPtimization Ontology | Keet blog