The third International Conference on Metadata and Semantics Research (MTSR’09) was held on 1-2 October in Milan at the DiSCO faculty of the University of Milano-Bicocca. The conference focuses on the engineering aspects of metadata and semantics in applications: usage of ontologies and ontology-like artifacts, applications using Semantic Web and other technologies, metrics, and surveys. It has two recurring application areas: eLearning, and semantics in agriculture, food & the environment, which took up a whole day (1/3 of the conference).
The general impression I had was that while everyone sees the advantages of the semantic layer with ontologies (and thesauri, structured controlled vocabularies etc.) and Semantic Web technologies, it is still more difficult to implement than initially perceived, in terms of (i) the time and effort it takes to get a system going and (ii) the learning curve of the people involved. In fact, quite a few people did not hide the hurdles they stumbled upon, which were either solved with additional human resources for the manual interventions, fixed with a new engineering solution or a workaround, or left as future work, either for themselves or as a thinly veiled hint to others. The hurdles include, but are not limited to: compatibility issues between the format of the original data and the desired or required format, maintaining backward compatibility, using multiple ontologies in one system, usage of only very basic querying (where people seemed to settle for the limitations immediately), and, generally, that Semantic Web tools, being predominantly academic prototypes, are too immature and unstable for deployment in operational information systems. On the positive side, there are also success stories where one can see the usefulness of ontologies and Semantic Web technologies.
Agriculture & Semantics Track
Within the Semantics & Agriculture, Food & Environment track, another large software application was presented, which was developed within the Seamless IP project. Ioannis Athanasiadis et al.’s paper [1] focuses on the 11 small ontologies (preferred over one large ontology) they had developed, which are used in a scientific workflow that combines agricultural data and knowledge from different applications with the EU Common Agricultural Policy (CAP) to compute consequences of possible policies and to devise suggestions for farmers. Noteworthy is that they reused the language attribute (xml:lang) of the RDFS label not for natural language translations of the concepts in the ontologies, but for programming languages. For instance, the concept Crop is represented in the OWL ontology and it is used across applications in the workflow: the owl:Class then has nested in it an <rdfs:label xml:lang="gms">C</rdfs:label>, so that Crop in the ontology is linked to C, which represents Crop in the application that was programmed in GAMS, and with <rdfs:label xml:lang="aps">Crop</rdfs:label> it is linked to the Agricultural Production and Externalities Simulator (APES) model written in C#. This solution would also address Daniel Martini’s [2] complaint that there are too few OWL tools for programming languages other than Java, even though those other languages are used more often in agricultural applications (Martini preferred C). The paper about networked ontologies by the Food and Agriculture Organisation of the UN discusses ongoing work and challenges for their case study about the fisheries domain within the NeOn project [3]; unfortunately, the FAO employees were conspicuous by their absence, so the latest results of their experiments could not be discussed.
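To make the xml:lang reuse described above a bit more concrete, here is a minimal sketch of how an application in the workflow might look up its own name for a concept. It is not the Seamless code: the class IRI http://example.org/crop#Crop, the function name application_labels, and the use of Python's standard library XML parser instead of a proper RDF library are all assumptions made for illustration. Only the two rdfs:label elements are taken from the example in the paper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
OWL_NS = "http://www.w3.org/2002/07/owl#"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of the xml: prefix

# Hypothetical ontology fragment; only the two labels come from the example.
ONTOLOGY = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/crop#Crop">
    <rdfs:label xml:lang="gms">C</rdfs:label>
    <rdfs:label xml:lang="aps">Crop</rdfs:label>
  </owl:Class>
</rdf:RDF>
"""


def application_labels(rdf_xml):
    """Map each class IRI to a dict of {xml:lang tag: label text}."""
    root = ET.fromstring(rdf_xml)
    labels = {}
    for cls in root.iter(f"{{{OWL_NS}}}Class"):
        iri = cls.get(f"{{{RDF_NS}}}about")
        if iri is None:
            continue  # skip anonymous classes
        per_application = labels.setdefault(iri, {})
        for label in cls.findall(f"{{{RDFS_NS}}}label"):
            tag = label.get(f"{{{XML_NS}}}lang")
            if tag:
                per_application[tag] = (label.text or "").strip()
    return labels


labels = application_labels(ONTOLOGY)
crop = labels["http://example.org/crop#Crop"]
print(crop["gms"])  # name used for Crop in the GAMS application
print(crop["aps"])  # name used for Crop in the APES model
```

The trade-off of this design is that the tags are application codes rather than natural language codes, so a generic ontology tool that selects labels by language may well display them as if they were translations.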
On the bright side, clear and unambiguous feel-good results thanks to the use of ontologies were presented as well. Ziemba, Cornejo and Beck developed and used ontologies for urban water conservation in Florida by linking the ontologies to digital libraries as well as using the ontologies to dynamically generate website content about events and news [4]. Ferrández of the University of Alicante presented the experiment done with Katia Vila of the University of Matanzas to enhance a question answering (QA) system with an ontology [5]. They had a so-called Open Domain QA system for Spanish, AliQAn, which failed to retrieve the correct answers when it was applied to scientific articles in the agricultural domain (about 2000 articles of the RCCA journal). So, Vila and Ferrández developed an ontology and dressed it up with WordNet and AGROVOC so as to have a Restricted Domain QA, and, lo and behold, it performed well; i.e., the correct answers were retrieved in 6 out of 8 samples (cf. 0 out of 8 without the ontology). When Vila and Ferrández looked into why this was the case, it appeared that the benefit of the ontology was in the correct identification of the question type, not specifically in the information retrieval part. For instance, the question “¿Cuáles son los metabolitos principales que vienen del tracto digestivo?” [“Which are the main metabolites that come from the digestive tract?”] was incorrectly classified as an entity-profession question by the Open Domain QA, whereas the ontology-supported one correctly classified it as an entity-substance question. Put differently, the Open Domain QA system does not understand the complexities of questions in the agricultural domain.
Zschocke of the United Nations University talked about quality education and centralized learning repositories for an integrated eLearning system for the Consultative Group on International Agricultural Research (CGIAR), and in particular about the associated metadata creation for dissemination activities [6]. Last, some explorative survey results and the general scope of the EU 5-year Trace project were presented by Kathryn Donnelly of Nofima [7]. Products and ingredients have to be traceable to the source, so data will have to be passed on among the companies, even though there is little standardization and little idea of how to do it other than with the TraceCore XML. They are still pondering where and how to use ontologies, if at all (for now they use a simple data list). The presentation of my own paper [8] was well-received, although perceived to be a bit overloaded with information; here’s the outline and summary posted earlier.
Main track and keynote
Luciano Serafini of the FBK was invited to give the keynote speech, which was about the APOSDLE project for e-learning and work-integrated learning in particular. Among the activities within the project was the development of “an integrated modelling methodology”, which resulted in a wiki, called MOKI. The approach consists of, primarily, splitting up the ontology development into informal modelling for the domain experts, a light formalization, and full axiomatization for the knowledge engineers so that one ends up with three different views for each element in the ontology. In addition, they found it useful to identify three roles in the ontology development process: the domain expert role, the knowledge engineer role, and the coach role to mediate between the former two.
Papers in the main track included, among others, the one by Federica Viti and colleagues of ItalBioNet about the system they developed for a systems biology approach to breast cancer (G2SBC), focusing on the enrichment of the system with 9 ontologies and ontology-like artifacts [9]. Matteo Palmonari of the University of Milano-Bicocca presented database schema integration into a “Semantic Peer Data Ontology”, which was augmented with a “Global Light Services Ontology” to deal with the services, so that the whole system can be queried to retrieve the data from the original sources (the queries are limited to attributes of a concept, i.e., projections on a table) [10].
Overall, it was useful to attend and listen to what people are doing with the idea of, and technologies for, adding more semantics to their applications. As for the agricultural domain, I have the impression that, like ecology, it has lots of ‘legacy systems’ (i.e., systems that are currently operational but may do better with more explicit semantics) in more diverse information systems than researchers in Semantic Web theory and tools assume. It also seems to run into more issues in making those systems “semantic web enabled” than medical and genomics information systems do.
References
Note: the references that do not have a URL are only online on the Springer website at the time of writing this post.
[1] I.N. Athanasiadis, A. E. Rizzoli, S. Janssen, E. Andersen, F. Villa. Ontology for seamless integration of agricultural data and models. In: Proceedings of MTSR’09. Springer CCIS 46, 282-293.
[2] D. Martini, M. Schmitz, J. Frisch, M. Kunisch. A Service Architecture for Facilitated Metadata Annotation and Resource Linkage Using agroXML and ReSTful Web Services. In: Proceedings of MTSR’09. Springer CCIS 46, 239-244.
[3] C. Caracciolo, J. Heguiabehere, M. Sini, J. Keizer. Networked Ontologies from the Fisheries Domain. In: Proceedings of MTSR’09. Springer CCIS 46, 257-262.
[4] L. Ziemba, C. Cornejo, H. Beck. A Water Conservation Digital Library Using Ontologies. In: Proceedings of MTSR’09. Springer CCIS 46, 263-269.
[5] K. Vila, A. Ferrández. Developing an Ontology for Improving Question Answering in the Agricultural Domain. In: Proceedings of MTSR’09. Springer CCIS 46, 245-256.
[6] T. Zschocke, J. Beniest. Assuring the Quality of Agricultural Learning Repositories: Issues for the Learning Object Metadata Creation Process of the CGIAR. In: Proceedings of MTSR’09. Springer CCIS 46, 226-238.
[7] K. A.-M. Donnelly, J. van der Roest, S. T. Hoskuldsson, P. Olsen, K. M. Karlsen. Improving Information Exchange in the Chicken Processing Sector Using Standardised Data Lists. In: Proceedings of MTSR’09. Springer CCIS 46, 312-321.
[8] C.M. Keet. Ontology design parameters for aligning agri-informatics with the Semantic Web. In: Proceedings of MTSR’09. Springer CCIS 46, 239-244.
[9] F. Viti, E. Mosca, I. Merelli, A. Calabria, R. Alfieri, L. Milanesi. Ontological enrichment of the Genes-to-Systems Breast Cancer database. In: Proceedings of MTSR’09. Springer CCIS 46, 171-182.
[10] D. Beneventano, F. Guerra, A. Maurino, M. Palmonari, G. Pasi, A. Sala. Unified Semantic Search of Data and Services. In: Proceedings of MTSR’09. Springer CCIS 46, 95-107.