Tools to access data through an ontology

Linking data to ontologies and, as a natural next step, Ontology-Based Data Access (OBDA), i.e., using an ontology to mediate access to data, such as querying the data through an ontology, is one of those requirements from the field that took some time to address theoretically, but the first working prototypes are now available. In particular, there is now a combination of an OBDA-enabled reasoner (DIG-Mastro) at the back-end and an OBDA-plugin for Protégé as editor front-end, together in one coherent solution. The tools and the latest accompanying papers [1,2,3] are in print and accessible online. For those of you who will go to OWLED 2008 or IIMAS’08 this week, you can see it working during the demo sessions, where they take the LUBM benchmark ontology together with the La Sapienza university database (27 tables and 250000 tuples), and two scenarios are worked out with the MiFID ontology and customer business processes.

The remainder of this post is a bit of a marketing exercise about it: the DIG-Mastro and OBDA-plugin for Protégé were developed in a collaboration between members of the KRDB group here at UniBz and the Romans from the DIS at “La Sapienza” university.

On the motivation side, the advantages of OBDA are that the ontology provides a semantic view of the application domain (as opposed to the gory details of the data), constraints expressed in the ontology can fix some incompleteness that tends to be present in especially legacy databases, and, in principle, it can provide the single-view for multiple databases underneath.

The engineers among you are probably well aware that an OWL-DL/OWL 1.1 type-level ontology in Protégé does not scale well if one wants to reason over it, let alone link it to data as well to do, e.g., automated instance classification. To allow for a scalable system, the DL-Lite family of Description Logic languages [4] was developed. Of this family, DL-LiteA is used for the implementation (DL-LiteA is in LogSpace in data complexity, just like efficient relational databases). The language’s features, i.e., what kind of things you can model, are described in [3,4] and are summarized and compared with other ontology languages in table 1 of [5]; they cover almost the same as what can be modelled with standard UML class diagrams and ER. In contrast to the more expressive DL-based ontology languages and accompanying reasoners, DIG-Mastro can actually deal with unions of conjunctive queries (UCQ) over large data sources and still reason efficiently.
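To make the "union of conjunctive queries" idea a bit more concrete: query answering in the DL-Lite family is commonly described as rewriting the user's query with the ontology's axioms, so that the result is a union of queries that can be evaluated directly over the data. The Python sketch below illustrates only the simplest case, a query for a single concept expanded with subclass axioms. It is not the DIG-Mastro algorithm; the function name, the concept names, and the axioms are all made up for illustration, and a real rewriting also has to handle roles, existentials, and more.

```python
# Toy illustration only: concept names and axioms are hypothetical,
# and this covers atomic subclass axioms only, nothing else of DL-Lite.

def rewrite_atomic_query(concept, inclusions):
    """Return the disjuncts of the UCQ for the query `concept(x)`.

    `inclusions` is a list of (sub, sup) pairs, each read as "sub is a
    subclass of sup". Every concept that is (transitively) subsumed by
    `concept` contributes one disjunct to the union.
    """
    disjuncts = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in inclusions:
            if sup == current and sub not in disjuncts:
                disjuncts.add(sub)
                frontier.append(sub)
    return sorted(disjuncts)


# Hypothetical university-style axioms.
inclusions = [
    ("Professor", "Person"),
    ("Student", "Person"),
    ("PhDStudent", "Student"),
]

ucq = rewrite_atomic_query("Person", inclusions)
print(" UNION ".join(f"{c}(x)" for c in ucq))
```

Asking for `Person(x)` thus becomes a union over `Person`, `PhDStudent`, `Professor`, and `Student`, which is why the data itself never has to state explicitly that a PhD student is a person.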

A far from trivial issue is how to link the data to the ontology; the theoretical details can be found in [3]. Complex mappings do not look very nice when written out (see fig.7 in [1] compared to the readable mappings in fig.1 in [2]), but the OBDA-plugin makes it a lot easier to create them (automation of this procedure is in the pipeline [6,7]), and once the GLAV mappings are defined, you can simply reuse them as often as you want. In short, the plugin allows you to describe the data sources and the mappings, send these descriptions to an OBDA-enabled reasoner, issue OBDA-specific queries, and view the results in the GUI. And yes, I’ve seen it working.
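For a rough feel of what such a mapping does, here is a minimal Python sketch using the standard `sqlite3` module. It is a strong simplification and not the plugin's mapping format: real GLAV mappings relate a query over the sources to a query over the ontology, whereas here each concept simply points to one SQL query. The table `staff`, its columns, the concept names, and the helper `answer` are all assumptions for the sake of the example.

```python
import sqlite3

# Hypothetical source database; the schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, role TEXT);
    INSERT INTO staff VALUES (1, 'Ada', 'prof'), (2, 'Bob', 'admin'), (3, 'Eve', 'prof');
    """
)

# Hypothetical mappings: ontology concept -> SQL query returning its instances.
mappings = {
    "Professor": "SELECT id FROM staff WHERE role = 'prof'",
    "Person": "SELECT id FROM staff",
}


def answer(concepts, conn, mappings):
    """Evaluate a union of atomic concept queries over the source.

    Each concept is replaced by its mapped SQL query, and the pieces are
    combined with UNION so the database does the actual work.
    """
    parts = [mappings[c] for c in concepts if c in mappings]
    if not parts:
        return []
    sql = " UNION ".join(parts)
    return sorted(row[0] for row in conn.execute(sql))


professors = answer(["Professor"], conn, mappings)
people = answer(["Person", "Professor"], conn, mappings)
print(professors, people)
```

The point of the design is that the user only ever mentions `Professor` or `Person`; which tables and columns hold that information stays inside the mappings, which can be defined once and reused.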

Here are two screenshots of part of the GUI in Protégé (copied from [2]), where the first shows RDBMS-to-Ontology mappings, and the second one a UCQ issued to the DIG-Mastro with query and results manageable through the OBDA-plugin.

RDBMS-to-Ontology mapping

SPARQL UCQ

What else do you want? 🙂

[1] Mariano Rodriguez-Muro, Lina Lubyte, and Diego Calvanese. Realizing Ontology Based Data Access: A plug-in for Protégé. In Proc. of the Workshop on Information Integration Methods, Architectures, and Systems (IIMAS 2008), 2008. Cancun, Mexico.

[2] Antonella Poggi, Mariano Rodriguez-Muro, and Marco Ruzzi. Ontology-based database access with DIG-Mastro and the OBDA Plugin for Protégé (Demo). In Proceedings of the Workshop OWLED 2008. Washington DC, USA, 1-2 April 2008.

[3] Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking Ontologies to Data. Journal on Data Semantics. X: 133-173, 2008.

[4] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning, 2007, 39(3):385-429.

[5] C. Maria Keet and Mariano Rodriguez. Toward using biomedical ontologies: trade-offs between ontology languages. AAAI 2007 Workshop Semantic eScience (SeS 2007). 23 July 2007, Vancouver, Canada.

[6] Lina Lubyte and Sergio Tessaris. Extracting ontologies from relational databases. Proceedings of the 20th International workshop on Description Logics (DL’07). Bressanone, Italy. CEUR-WS Vol-250, 387-394.

[7] L. Lubyte, S. Tessaris. Supporting the Design of Ontologies for Data Access. In Proc. of the 21st International Workshop on Description Logics (DL 2008). To appear.

10 responses to “Tools to access data through an ontology”

  1. Thanks for this, I’ve passed it along to a bunch of people that I know are making attempts to link big data to expressive ontologies. Hopefully it will help!

    I’d like to see more posts like this from you :). (and I like the new look and feel)
