Review of ‘The web was done by amateurs’ by Marco Aiello

Via one of those friend-of-a-friend likes on social media that popped up in my stream, I stumbled upon the recently published book “The web was done by amateurs” (there’s also a related talk) by Marco Aiello, which piqued my interest concerning both the title and the author. I met Aiello once in Trento, at a farewell party for him and a colleague, when he was leaving for Groningen. He probably doesn’t remember me, nor do I remember much of him—other than his lamentations about Italian academia and going for greener pastures. Turns out he’s done very well for himself academically, and this foray into writing for the general public has been, in my opinion, a fairly successful attempt.

The short book—it easily can be read in a weekend—starts in the first part with historical notes on who did what for the Internet (the infrastructure) and the multiple predecessor proposals and applications of hyperlinking across documents that Tim Berners-Lee (TBL) apparently was blissfully unaware of. It’s surely a more interesting and useful read than the first Google hit, the few factoids from the W3C, or the Wikipedia entries one can find with a simple search—or: it still pays off to read books in this day and age :). The second part is, for most readers perhaps, also history: the ‘birth’ of the Web and the browser wars in the mid-1990s.

Part III is, in my opinion, the most fun to read: it discusses various extensions to the original design of TBL’s Web that fix, or at least aim to fix, shortcomings of the Web’s basics, i.e., they’re presented as “patches” to patch up a too basic—or: rank-amateur—design of the original Web. They are, among others, persistence with cookies to mimic statefulness for Web-based transactions (for, e.g., buying things on the web), attempts to get executable content into the browser with Java (and ActiveX, Flash), and web services (from CORBA and service-oriented computing to REST, the cloud, and such). Interestingly, they all originate in the 1990s, in the time of the browser wars.

The book mentions more names from the distant and recent history of the Web than I knew of, so even I picked up a few things here and there. IIRC, they’re all men, though. Surely there would be at least one woman worthy of mention? I probably ought to know, but didn’t, so I searched the Web and easily stumbled upon the Internet Hall of Fame. That list includes Susan Estrada among the pioneers, who founded CERFnet, which “grew the network from 25 sites to hundreds of sites”, and, after that, Anriette Esterhuysen and Nancy Hafkin for the network in Africa, Qiheng Hu for doing this for China, and Ida Holz for the same in Latin America (in ‘global connections’). Web innovators specifically include Anne-Marie Eklund Löwinder for the DNS security extensions (DNSSEC, noted on p143 but not by its inventor’s name) and Elizabeth Feinler for the “first query-based network host name and address (WHOIS) server”; moreover, “she and her group developed the top-level domain-naming scheme of .com, .edu, .gov, .mil, .org, and .net, which are still in use today”.

One patch to the Web that I really missed in the overview of the early patches is “Web 2.0”. I know that, technologically, it is a trivial extension to TBL’s original proposal: the move from static web pages in 1:n communication—from content provider to many passive readers—to m:n communication with comment sections (fancy forms). That is, instead of the surfer being just a recipient of information, reading one webpage after another and thinking her own thing of it, she is able to respond and interact: the chatrooms, the article and blog comment features, and, in the 2000s, the likes of MySpace and Facebook. It got so many more people involved in it all.

Continuing with the book’s content, cloud computing and the fog (section 7.9) are from this millennium, as is what Aiello dubbed the “Mother of All Patches”: the Semantic Web. Regarding the latter, early on in the book (pp. vii-viii) there is already an off-hand comment that does not bode well: “Chap. 8 on the Semantic Web is slightly more technical than the rest and can be safely skipped.” (emphasis added). The way Chapter 8 is written, perhaps. Before discussing his main claim there, a few minor quibbles: it’s the Web Ontology Language OWL, not “Ontology Web Language” (p105), and there’s OWL 2 as the successor of the OWL of 2004. “RDF is a nifty combination of being a simple modeling language while also functioning as an expressive ontological language” (p104)—no: RDF is for representing data, not really for modeling, and most certainly would not be considered an ontology language (one can serialize an ontology in RDF/XML, but that’s different). The class satisfiability example: no, that’s not what it does, or: the simplification does not faithfully capture it; an example with a MammalFish that cannot have any instances (as a subclass of both Mammal and Fish, which are disjoint) would have served better (regardless of the real world).
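For readers who haven’t seen such an example before, a minimal sketch of an unsatisfiable class in OWL, serialised in Turtle (the class names are invented for illustration):

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <http://example.org/zoo#> .

:Mammal a owl:Class .
:Fish   a owl:Class ;
        owl:disjointWith :Mammal .

# MammalFish is declared a subclass of two disjoint classes, so an
# automated reasoner will flag it as unsatisfiable: it can never have
# any instances without making the ontology inconsistent.
:MammalFish a owl:Class ;
        rdfs:subClassOf :Mammal , :Fish .
```

That is what class satisfiability checking detects, irrespective of whether such a combination happens to exist in the real world.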

The main claim of Aiello regarding the Semantic Web, however, is that it’s time to throw in the towel, because there hasn’t been widespread uptake of Semantic Web technologies on the Web even though it was proposed already around the turn of the millennium. I lean towards that as well and have reduced the time spent on it in my ontology engineering course over the years, but I don’t want to throw out the baby with the bathwater just yet, for two reasons. First, scientific results tend to take a long time to trickle down. Second, I am not convinced that the ‘semantic’ part of the Web is the same level of end-user stuff as playing with HTML is. I still have an HTML book from 1997. It has instructions to “design your first page in 10 minutes!”. I cannot recall whether it was indeed <10 minutes, but it sure was fast back in 1998-1999 when I made my first pages as a non-IT-interested layperson. I’m not sure whether the whole semantics thing can be done even on the proverbial rainy Sunday afternoon, but the dumbed-down version with schema.org sort of works. This schema.org brings me to p110 of Aiello’s book, which states that Google can make do with just statistics for optimal search results because of its sheer volume (so bye-bye Semantic Web). But it is not just stats-based: even Google is trying with schema.org and its “knowledge graph”; admittedly, it’s extremely lightweight, but it’s more than stats-only. Perhaps the schema.org and knowledge graph sort of thing are to the Semantic Web what TBL’s proposal for the Web was to, say, the fancier HyperCard.
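For a flavour of that dumbed-down version: schema.org annotations are typically embedded in a page as a small JSON-LD block along the following lines (the property values here are illustrative only):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Book",
  "name": "The Web Was Done by Amateurs",
  "author": { "@type": "Person", "name": "Marco Aiello" }
}
</script>
```

A crawler that understands the schema.org vocabulary can then treat the page as stating “this page is about a book with this author”, rather than as a mere bag of words—lightweight semantics, but semantics nonetheless.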

I don’t know if people within the Semantic Web research community would think of its tooling as technologies for the general public. I suspect not. I consider the development and use of ontologies in ontology-driven information systems as part of the ‘back office’ technologies, notwithstanding my occasional attempts to explain to friends and family what sort of things I’m working on.

What I did find curious is that one of Aiello’s arguments for the Semantic Web’s failure was that “Using ontologies and defining what the meaning of a page is can be much more easily exploited by malicious users” (p110). It can be exploited, for sure, but statistics can go bad, very bad, too, especially regarding associations of search terms, the creepy amount of data collection on the Web, and bias built into machine learning algorithms. Search engine optimization is just the polite term for messing with ‘honest’ stats and algorithms. With the Semantic Web, it would be a conscious decision to mess around, and that’s easily traceable; with all the stats-based approaches, it can sneakily creep in whilst keeping up a veneer of impartiality, which is harder to detect. If it were a choice between two technology evils, I prefer the honest bastard cf. being stabbed in the back. (That the users of the current Web are opting for the latter does not make it the lesser of two evils.)

As to two possible new patches (not in the book, and one can debate whether they are patches), time will tell whether the few recent calls for “decentralizing” the Web will take hold, or more fine-grained privacy that also entails more fine-grained recording of events (e.g., TBL’s Solid project). The app-fication discussion (Section 10.1) was an interesting one—I hardly use mobile apps and so am not really into it—and the lock-in it entails is indeed a cause for concern for the Web and all it offers. Another section in Chapter 10 is on the IoT, which has sounded promising and potentially scary (what would the data-hungry ML algorithms of the Web infer from my fridge contents, and from that, about me??)—for the past 10 years or so. Lastly, the final chapter has the tempting-to-read title “Should a new Web be designed?”, but the answer is not a clear yes or no. Evolve, it will.

Would I have read the book if I weren’t on sabbatical now? Probably still, on an otherwise ‘lost-time’ intercontinental trip to a conference. So, overall, besides the occasional gap and a quibble here and there, the book is a nice read on the whole for any layperson interested in learning something about the ubiquitous Web, any expert who’s using only a little corner of it, and certainly for the younger generation, to get a feel for how the current Web came about and how technologies get shaped in praxis.


An Ontology Engineering textbook

My first textbook, “An Introduction to Ontology Engineering” (pdf), has just been released as an open textbook. I have revised, updated, and extended my earlier lecture notes on ontology engineering, amounting to about 1/3 more content cf. its predecessor. Its main aim is to provide an introductory overview of ontology engineering; its secondary aim is to provide hands-on experience in ontology development that illustrates the theory.

The contents and narrative are aimed at advanced undergraduate and postgraduate level in computing (e.g., as a semester-long course), and the book is structured accordingly. After an introductory chapter, there are three blocks:

  • Logic foundations for ontologies: languages (FOL, DLs, OWL species) and automated reasoning (principles and the basics of tableau);
  • Developing good ontologies with methods and methodologies, the top-down approach with foundational ontologies, and the bottom-up approach to extract as much useful content as possible from legacy material;
  • Advanced topics, with a selection of sub-topics: Ontology-Based Data Access, interactions between ontologies and natural languages, and advanced modelling with additional language features (fuzzy and temporal).

Each chapter has several review questions and exercises to explore one or more aspects of the theory, as well as descriptions of two assignments that require using several sub-topics at once. More information is available on the textbook’s page [also here] (including the links to the ontologies used in the exercises), or you can click here for the pdf (7MB).

Feedback is welcome, of course. Also, if you happen to use it in whole or in part for your course, I’d be grateful if you would let me know. Finally, if this textbook will be used half (or even a quarter) as much as the 2009/2010 blogposts have been visited (around 10K unique visitors since posting them), that would mean there are a lot of people learning about ontology engineering and then I’ll have achieved more than I hoped for.

UPDATE: meanwhile, it has been added to several open (text)book repositories, such as OpenUCT and the Open Textbook Archive, and it has been featured on unglue.it in the week of 13-8 (out of its 14K free ebooks).

Reblogging 2013: Release of the (beta version of the) foundational ontology library ROMULUS

From the “10 years of keetblog – reblogging: 2013”: also not easy to choose regarding research. Here, then, the first results of Zubeida Khan’s MSc thesis on the foundational ontology library ROMULUS, being the first post of several papers on the theoretical and methodological aspects of it (KCAP’13 poster, KEOD’13 paper, MEDI’13 paper, book chapter, EKAW’14 paper) and her winning the award for best Masters from the CSIR. The journal paper on ROMULUS has just been published last week in the Journal on Data Semantics, in a special issue edited by Alfredo Cuzzocrea.

Release of the (beta version of the) foundational ontology library ROMULUS; April 4

—————-

With the increase in ontology development and networked ontologies, both good ontology development and ontology matching for ontology linking and integration are becoming more pressing issues. Many contributions have been proposed in these areas. One of the ideas to tackle both—supposedly in one fell swoop—is the use of a foundational ontology. A foundational ontology aims to (i) serve as a building block in ontology development by providing the developer with guidance on how to model the entities in a domain, and (ii) serve as a common top level when integrating different domain ontologies, so that one can identify which entities are equivalent according to their classification in the foundational ontology. Over the years, several foundational ontologies have been developed, such as DOLCE, BFO, GFO, SUMO, and YAMATO, which have been used in domain ontology development. The problem that has arisen now is how to link domain ontologies that are mapped to different foundational ontologies.

To be able to do this in a structured fashion, the foundational ontologies have to be matched somehow, and ideally there has to be some software support for this. As early as 2003, this issue was foreseen already and the idea of a “WonderWeb Foundational Ontologies Library” (WFOL) proposed, so that—in the ideal case—different domain ontologies can commit to different but systematically related (modules of) foundational ontologies [1]. However, the WFOL remained just an idea, because it was not clear how to align those foundational ontologies and, at the time of writing, most foundational ontologies were still under active development, OWL was yet to be standardised, and there was scant stable software infrastructure. Within the Semantic Web setting, solving the implementation issues is within reach, yet the alignment is still to be carried out systematically (beyond the few partial comparisons in the literature).

We’re trying to solve these theoretical and practical shortcomings through the creation of the first such online library of machine-processable, aligned and merged foundational ontologies: the Repository of Ontologies for MULtiple USes, ROMULUS. This version contains alignments, mappings, and merged ontologies for DOLCE, BFO, and GFO, and some modularised versions thereof, as a start. It also has a section on logical inconsistencies; i.e., entities that were aligned manually and/or automatically and seemed to refer to the same thing—e.g., a mathematical set, a temporal region—but actually turned out not to (at least from a logical viewpoint), due to other ‘interfering’ axioms in the ontologies. What one should be doing with those is a separate issue, but at least it is now clear where the matching problems really are, down to the nitty-gritty entity level.
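A minimal, hypothetical sketch of such an ‘interfering axiom’ situation, in Turtle (the ontology prefixes, class names, and axioms are invented for illustration; the actual cases in ROMULUS are documented in the library itself):

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix a:    <http://example.org/ontoA#> .
@prefix b:    <http://example.org/ontoB#> .

# In ontology A, sets are abstract entities; in ontology B, they are
# classified under endurants. After aligning the top levels, a
# disjointness axiom 'interferes':
a:Set rdfs:subClassOf a:Abstract .
b:Set rdfs:subClassOf b:Endurant .
a:Abstract owl:disjointWith b:Endurant .

# Asserting the seemingly obvious mapping...
a:Set owl:equivalentClass b:Set .
# ...makes both classes unsatisfiable: any instance would have to be
# both Abstract and Endurant, which the disjointness forbids.
```

So the entities align on the name and the intuition, but asserting the equivalence as a logical mapping breaks the merged ontology.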

We performed a small experiment on the evaluation of the mappings (thanks to participants from DERI, Net2 funds, and Aidan Hogan), and we would like to have more feedback on the alignments and mappings. It is one thing that we, or some alignment tool, aligned two entities, another that asserting an equivalence ends up logically consistent (hence mapped) or inconsistent, and yet another what you think of the alignments, especially the ontology engineers. You can participate in the evaluation: you will get a small set of a few alignments at a time, and then you decide whether you agree, partially agree, or disagree with it, are unsure about it, or skip it if you have no clue.

Finally, ROMULUS also has a range of other features, such as ontology selection, a high-level comparison, browsing the ontology through WebProtégé, a verbalization of the axioms, and metadata. It is the first online library of machine-processable, modularised, aligned, and merged foundational ontologies around. A poster/demo paper [2] was accepted at the Seventh International Conference on Knowledge Capture (K-CAP’13), and papers describing details are submitted and in the pipeline. In the meantime, if you have comments and/or suggestions, feel free to contact Zubeida or me.

References

[1] Masolo, C., Borgo, S., Gangemi, A., Guarino, N., Oltramari, A. Ontology library. WonderWeb Deliverable D18 (ver. 1.0, 31-12-2003). (2003) http://wonderweb.semanticweb.org.

[2] Khan, Z., Keet, C.M. Toward semantic interoperability with aligned foundational ontologies in ROMULUS. Seventh International Conference on Knowledge Capture (K-CAP’13), ACM proceedings. 23-26 June 2013, Banff, Canada. (Accepted as poster & demo with short paper.)

OBDA/I Example in the Digital Humanities: food in the Roman Empire

A new installment of the Ontology Engineering module is about to start for the computer science honours students who selected it, so, in preparation, I was looking around for new examples of what ontologies and Semantic Web technologies can do for you, and that are at least somewhat concrete. One of those examples has an accompanying paper that is about to be published (can it be more recent than that?), which is on the production and distribution of food in the Roman Empire [1]. Although perhaps not many people here in South Africa might care about what happened in the Mediterranean basin some 2000 years ago, it is a good showcase of what one perhaps also could do here with the historical and archeological information (e.g., an inter-university SA project on digital humanities started off a few months ago, and several academics and students at UCT contribute to the Bleek and Lloyd Archive of |xam (San) cultural heritage, among others). And the paper is (relatively) very readable also to the non-expert.


So, what is it about? Food was stored in pots (more precisely: amphorae) that had engravings on them with text about who, what, where, etc., and a lot of that has been investigated, documented, and stored in multiple resources, such as databases. None of the resources covers all data points, but to advance research and understanding about it and about food trading systems in general, it has to be combined somehow and made easily accessible to the domain experts. That is, essentially, it is an instance of a data access and integration problem.

There are a couple of principal approaches to address that; it is usually done with an Extract-Transform-Load of each separate resource into one database or digital library, and then putting a web-based front-end on top of it. There are many shortcomings to that solution, such as having to repeat the ETL procedure upon updates in the source database, a single point of control, and the, typically only canned (i.e., fixed), queries of the interface. A more recent approach, of which the technologies finally are maturing, is Ontology-Based Data Access (OBDA) and Ontology-Based Data Integration (OBDI). I say “finally” here, as I can still remember very well the predecessors we struggled with some 7-8 years ago [2,3] (informally here, here, and here), and “maturing”, as the software has become more stable, has more features, and some of the things we had to do manually back then have been automated now. The general idea of OBDA/I applied to the Roman Empire Food system is shown in the figure below.

OBDA in the EPnet system (Source: [1])


There are the data sources, which are federated (one ‘middle layer’, though still at the implementation level). The federated interface has mapping assertions to elements in the ontology. The user then can use the terms of the ontology (classes and their relations and attributes) to query the data, without having to know how the data is stored and without having to write page-long SQL queries. For instance, a query “retrieve inscriptions on amphorae found in the city of ‘Mainz’ containing the text ‘PNN’” would use just the terms in the ontology—say, Inscription, Amphora, City, found in, and inscribed on—and any value constraint added (like the PNN), and the OBDA/I system takes care of the rest.

Interestingly, the authors of [1]—admittedly, three of them are former colleagues from Bolzano—used the same approach to setting up the ontology component as we did for [3]. While we will use the Protégé ontology development environment in the OE module, it is not the best modelling tool with which to overcome the knowledge acquisition bottleneck. The authors modelled together with the domain experts in the much more intuitive ORM language and tool NORMA, and first represented whatever needed to be represented. This included also reuse of relevant related ontologies and non-ontology material, and modularising it for better knowledge management, thereby reducing cognitive overload. A subset of the resultant ontology was then translated into the Web Ontology Language OWL (more precisely: OWL 2 QL, a tractable profile of OWL 2 DL), which is actually used in the OBDA system. We did that manually back then; now this can be done automatically (yay!).

Skipping here over the OBDI part and considering it done, the main third step in setting up an OBDA system is to link the data to the elements in the ontology. This is done in the mapping layer. This is essentially of the form “TermInTheOntology <- SQLqueryOverTheSource”. Abstracting from the current syntax of the OBDA system and simplifying the query for readability (see the real one in the paper), an example would thus have the following make up to retrieve all Dressel 1 type of amphorae, named Dressel1Amphora in the ontology, in all the data sources of the system:

Dressel1Amphora <-
    SELECT ic.id
    FROM ic JOIN at ON at.carrier = ic.id
    WHERE at.type = 'DR1'

Or some such SQL query (typically larger than this one). This takes up a bit of time to do, but has to be done only once, for these mappings are stored in a separate mapping file.

The domain expert, then, when wanting to know about the Dressel 1 amphorae in the system, would only have to ask ‘retrieve all Dressel 1 amphorae’ rather than create the SQL query, thus remaining oblivious to which tables and columns are involved in obtaining the answer, and to the fact that some data entry person at some point had mysteriously decided not to use ‘Dressel1’ but his own abbreviation ‘DR1’.

The actual ‘retrieve all Dressel1 amphorae’ is then a SPARQL query over the ontology, e.g.,

SELECT ?x WHERE { ?x rdf:type :Dressel1Amphora . }

which is surely shorter and therefore easier to handle for the domain expert than the SQL one. The OBDA system (-ontop-) takes this query and reasons over the ontology to see if the query can be answered directly by it without consulting the data, or else can be rewritten given the other knowledge in the ontology (it can, see example 5 in the paper). The outcome of that process then consults the relevant mappings. From that, the whole SQL query is constructed, which is sent to the (federated) data source(s), which processes the query as any relational database management system does, and returns the data to the user interface.


It is, perhaps, still unpleasant that domain experts have to put up with another query language, SPARQL, as the paper notes as well. Some efforts have gone into sorting out that ‘last mile’, such as using a (controlled) natural language to pose the query or reusing that original ORM diagram in some way, but more needs to be done. (We tried the latter in [3]; that proof-of-concept worked with a neutered version of ORM, and we have screenshots and videos to prove it, but in working on extensions and improvements, a new student uploaded buggy code onto the production server, so that online source doesn’t work anymore—and we didn’t roll back and reinstall an older version, with me having moved to South Africa and the original student-developer, Giorgio Stefanoni, away studying for his MSc.)


Note to OE students: This is by no means all there is to OBDA/I, but hopefully it has given you a bit of an idea. Read at least sections 1-3 of paper [1], and if you want to do an OBDA mini-project, then read also the rest of the paper and then Chapter 8 of the OE lecture notes, which discusses in a bit more detail the motivations for OBDA and the theory behind it.


References

[1] Calvanese, D., Liuzzo, P., Mosca, A., Remesal, J, Rezk, M., Rull, G. Ontology-Based Data Integration in EPNet: Production and Distribution of Food During the Roman Empire. Engineering Applications of Artificial Intelligence, 2016. To appear.

[2] Keet, C.M., Alberts, R., Gerber, A., Chimamiwa, G. Enhancing web portals with Ontology-Based Data Access: the case study of South Africa’s Accessibility Portal for people with disabilities. Fifth International Workshop OWL: Experiences and Directions (OWLED 2008), 26-27 Oct. 2008, Karlsruhe, Germany.

[3] Calvanese, D., Keet, C.M., Nutt, W., Rodriguez-Muro, M., Stefanoni, G. Web-based Graphical Querying of Databases through an Ontology: the WONDER System. ACM Symposium on Applied Computing (ACM SAC 2010), March 22-26 2010, Sierre, Switzerland. ACM Proceedings, pp1389-1396.

Updated ontology engineering lecture notes (2015)

It’s that time of the year again, in the southern hemisphere that is, where course preparations for the academic year are going on full steam ahead. Also this year, I’ll be teaching a CS honours course on ontology engineering. To that end, the lecture notes have been updated, though not in a major way like last year. Some sections have been shuffled around, there are a few new exercises, Chris’s update suggestion from last year on the OBO-OWL mapping has been included, and a couple of typos and odd sentences have been fixed.

Practically, this installment will be a bit different from previous years, as it has integrated a small project on Semantic Wikis, funded by CILT and OpenUCT. Set up, maintenance, and filling it with contents on ontology engineering topics will initially be done ‘in house’ by students enrolled in the course and not be generally available on the Web, but if all goes well, it’ll be accessible to everyone some time in April this year, and possibly included in the OER Commons.

Semantic MediaWiki’s features are fairly basic, and there are a bunch of plugins and extensions I’ve seen listed, but I didn’t check whether they all work with the latest SMW. If you have a particular suggestion, please leave a comment or send me an email. One thing I’m still wondering about in particular, but haven’t found a solution to, is whether there’s a plugin that lets you see the (lightweight) ontology when adding contents, so that it is easier to use terms in the text from the ontology’s vocabulary rather than having to find and manually process whatever (near-)synonyms have been used throughout the pages (like, one contributor using ‘upper ontology’, another ‘foundational ontology’, and the third ‘top-level ontology’), and that allows on-the-fly extensions of that ontology.

Results of the OWL feature popularity contest at OWLED 2014

One of the events on the OWLED 2014 programme—co-located with ISWC2014—was the OWL feature popularity contest, with the dual purpose of getting a feel for possible improvements to the OWL 2 standard and generating lively discussions (though the latter happened throughout the workshop already anyway). The PC co-chair, Valentina Tamma, and I had collected some questions ourselves and had solicited question suggestions from the participants beforehand, and we used a ‘software-based clicker’ (audience response system) during the session so that participants could vote and see results instantly. The remainder of this post contains the questions and the results. We left the questions open, so you still can vote by going to govote.at, filling in the number shown in the bottom-left of the screenshots, and trying to skew the outcome your way (voting is anonymous). I’ll check the results again in two weeks…

1. The first question referred back to discussions from around 2007 during the standardization process of OWL 2: several rather distinct features were discussed for OWL 2 that didn’t make it into the standard; do you (still) want any or all of them, if you ever did?

  • n-ary object properties, with n>2
  • constraints among different data properties, be this of the same object or different objects
  • unique name assumption
  • all of them!
  • I don’t really miss any of them

The results, below, show some preference for constraints among data properties, and overall a mild preference to at least have some of them, rather than none.

Voting results of question 1


2. Is there any common pattern for which you would propose syntactic sugar?

  • Strict partial ordering
  • Disjoint transitive roles
  • Witnessed universal/closure: adding an existential to a universal (Carral et al., OWLED14)
  • Witnessed universal/closure: adding a universal to an existential (raised in the bio-ontologies literature)
  • Specific patterns; e.g., episodes
  • Nothing really

The results, below, are a bit divided. Carral et al.’s paper, presented the day before, seems to have done some good convincing, given the three votes, and the strict partial ordering, i.e., a pattern for parthood, also received some votes, but about half of the respondents weren’t particularly interested in such things.

Voting results of question 2


3. Ignoring practicalities on (in)feasibility, which of the following set of features would you like to see OWL to be extended with most?

  • Temporal
  • Fuzzy and Probabilistic
  • Rough sets
  • I’m not interested in any of these extensions

The results show that some temporal extension is the clear winner, which practically isn’t going to be easy to do, unfortunately, because even minor temporal extensions cause dramatic jumps in complexity. Other suggestions for extensions made during the discussion were more on data properties (again) and a way to deal with measurement units.

Voting results of question 3


4. Which paradigm do you prefer in order to model / modify your ontologies in an ODE?

  • Controlled natural language
  • Diagram-based tool
  • Online collaborative tool
  • Dedicated ontology editor
  • Text editor
  • No preference
  • It depends on the task

Results again in the figure below. The interesting aspect is, perhaps, that no one had no preference, and no one preferred a diagram-based tool. Mostly it depends on the task, followed by a preference for a tool that caters for collaborative ontology development.

Voting results of question 4


5. There are four standardised optional syntaxes in OWL 2. If, due to time/resource constraints, tool compatibilities, etc., not all optional syntaxes could be accommodated in an “OWL 3.0”, which could be discontinued, according to you, if any?

  • OWL/XML
  • Functional style
  • Turtle
  • Manchester
  • They all should stay

The latter option, that they all should stay, was selected most among the participants, though not by a majority of voters, and I’m sure it would have ended up differently with more participants (based on discussions afterward). Note: by now, the voting was shown ‘live’ as the responses came in cf. the earlier hide-and-show.

Voting results of question 5


6. Turning around the question phrasing: Which feature do you like less?

  • Property chains
  • Key
  • Transitivity
  • The restrictions limiting the interactions between the different property characteristics (thus preventing certain patterns)
  • They are all useful to a greater or lesser extent

Options B and D (keys and the restrictions on the interactions between property characteristics) generated a lively debate, but the results show clearly that the participants who voted wanted to keep them all.
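For the first two options, a property chain and a key look as follows in Manchester syntax (the property and class names are made up for the example):

```
# hasParent o hasParent implies hasGrandparent
ObjectProperty: hasGrandparent
    SubPropertyChain: hasParent o hasParent

# named individuals with the same value for hasIdNumber are the same Person
Class: Person
    HasKey: hasIdNumber
```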

Voting results of question 6

7. Which of the following object property (OP) characteristics do you consider most important when developing an ontology?

  • reflexivity
  • irreflexivity
  • symmetry
  • asymmetry
  • antisymmetry
  • transitivity
  • acyclicity

This last question appeared to be a no-brainer among the choices, with transitivity the unanimous winner. It was raised whether functionality ought to have been included, which we intentionally had not done, for it is a different kind of constraint (cardinality/multiplicity) than the properties of properties are. The results most likely would have looked quite different if we had.
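Declaring the winning characteristic is a one-liner in, e.g., Manchester syntax (hasPart as a sample property). Note that the restriction from question 6 bites exactly here: a transitive property is non-simple, so one may not also declare it asymmetric or irreflexive in OWL 2 DL.

```
ObjectProperty: hasPart
    Characteristics: Transitive

# Not allowed in OWL 2 DL: combining Transitive with Asymmetric or
# Irreflexive on the same property, since the latter two may only be
# declared on simple properties.
```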

Voting results of question 7

The results were supposed to be on the OWLED community page, but I have it from a reliable source (the general chair of OWLED14, Bijan Parsia) that the software doesn’t seem to be very friendly and feature-rich, hence a quick post here. You can read Bijan’s live blogging of the presentations at OWLED there as well. The proceedings of the workshop are online as CEUR-WS vol. 1265.

Considering some stuff—scientifically

Yay, now I can say “I look into stuff” and actually be precise about what I have been working on (and get it published, too!), rather than just oversimplifying into vagaries about some of my research topics. The final title of the paper I settled on is not as funny as proposing a ‘pointless theory’ [1], though: it’s a Core Ontology of Macroscopic Stuff [2], which has been accepted at the 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14).

‘Stuff’, in philosophical terms, refers to those things that in natural language are typically indicated with mass nouns, i.e., things you can’t count other than in quantities, like gold, water, whipping cream, agar, milk, and so on. The motivation to look into that was both practical and theoretical. For instance, if you are working in the food industry, you have to be concerned with traceability of ingredients, so you will have to know which (bulk) ingredients originate from where. Then, if something goes wrong (say, an E. coli infection in a product for consumption), it is doable to find the source of the microbial contamination. Most people might not realize what happens in the production process; e.g., some quantity of milk comes from a dairy farm, and in the food processing plant some components of a portion of that milk are separated into parts (whey separated from the cheese-in-the-making, fat for butter, and the remainder buttermilk). To talk about parts and portions of such stuffs requires one to know about those stuffs and how to model them, so that there can be some computerized tracking system for swift responses.

On the theoretical side, philosophers were talking about hypothetical cases of sending molecules of mixtures to Venus and the Moon, which isn’t practically usable, in particular because it glosses over some important details, such as that milk is an emulsion and thus has a ‘minimum portion’, involving many molecules, for it to remain an emulsion. Foundational ontologies, which I like for their modelling guidance, didn’t come to the rescue either; e.g., DOLCE has Amount of Matter for stuffs but stops there, and BFO has none of it. Domain ontologies for food, but also in other areas such as ecology and biomedicine, each have their own way of modelling stuff, be this by source, usage, or some other criterion, making them incompatible because several different criteria are used. So, there was quite a gap. The core ontology of macroscopic stuff aims to bridge this gap.

This stuff ontology contains categories of stuff and is formalised in OWL. There are distinctions between pure stuffs and mixtures, and differences among the mixtures, e.g., true solutions vs. colloids among the homogeneous mixtures, and solid heterogeneous mixtures vs. suspensions among the heterogeneous mixtures, each with a set of defining criteria. So, Milk is an Emulsion by its very essence, regardless of whether you want to assign it the role of a beverage (ENVO ontology) or an animal-associated habitat (MEO ontology), Blood is a Sol (a type of colloid), and (table) Sugar a StructuredPureStuff. A basic alignment of the relations involved, such as granules, grains, and sub-stuffs (used in Cyc and BioTop, among others), is possible with the stuff ontology as well.
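A minimal sketch of those examples in Manchester syntax; the class names follow the paper’s terminology, but these axioms are illustrative simplifications, not copied from the actual ontology:

```
Class: Emulsion
    SubClassOf: Colloid

Class: Sol
    SubClassOf: Colloid

Class: Milk
    SubClassOf: Emulsion

Class: Blood
    SubClassOf: Sol

Class: Sugar
    SubClassOf: StructuredPureStuff
```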

The ontology refines both the DOLCE and BFO foundational ontologies and resolves the main types of interoperability issues with stuffs in domain ontologies, thereby also contributing to better ontology quality. To make the ontology usable, modelling guidelines are provided, with examples of inferences, a decision diagram, an outline of a template, and illustrations of solving the principal interoperability issues among domain ontologies (scroll down to the last part of the paper). The decision diagram, which also gives an informal idea of what is in the stuff ontology, is depicted below.

Decision diagram to select the principal kind of stuff (Source: [2])

You can access the stuff ontology on its own, as well as versions linked to DOLCE and BFO. I’ll be presenting it in Sweden at EKAW late November.

p.s.: come to think of it, maybe I should have called it smugly “a real ontology of substance”… (substance being another term used for stuff/matter)

References

[1] Borgo, S., Guarino, N., and Masolo, C. A Pointless Theory of Space Based On Strong Connection and Congruence. In: L. Carlucci Aiello, J. Doyle (eds.), Proceedings of the Fifth International Conference on Principles of Knowledge Representation and Reasoning (KR’96), Morgan Kaufmann, Cambridge, Massachusetts (USA), 5-8 November 1996, pp. 220-229.

[2] Keet, C.M. A Core Ontology of Macroscopic Stuff. 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW’14). 24-28 Nov, 2014, Linköping, Sweden. Springer LNAI. (accepted)