African Wildlife Ontology tutorial ontologies

Following the invitation by Deshen Moodley from the University of KwaZulu-Natal to give a guest lecture for his Ontology and Knowledge Based Systems fourth year (honours) course, I looked up the African Wildlife Ontology he intended to use, which was introduced in the book A Semantic Web Primer by Grigoris Antoniou and Frank van Harmelen [1]. Given the state of that tutorial ontology, I could not resist fiddling with it to make it a little more comprehensive.

Googling for an existing version in OWL on the Web, I came across Guy Lapalme’s version that, however, gave me an error loading it in Protégé 4.1-beta due to the use of collection in the definition of Herbivore. Having removed that and changed the .xml extension to .owl, I named this version AfricanWildlifeOntology0.owl. The ontology has 10 classes and 3 object properties, among them the classes Lion, Giraffe, and Plant and the object properties eats and is-part-of. Note the annotations in the ontology that give an idea of what should be modelled (else: see 4.3.1 pages 119-133 in [1]). Upon running the reasoner, it will classify, among others, that Carnivore is a subclass of Animal.
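To get a feel for why a reasoner would derive that subsumption, here is a sketch in description logic notation. It assumes that Carnivore is defined as exactly those animals that eat some animal, which is how I read the annotations; the precise axiom in the file may differ, so treat the formalisation as an assumption and check it against the .owl file.

```latex
% Assumed definition (not verified against AfricanWildlifeOntology0.owl)
\[
  \mathit{Carnivore} \equiv \mathit{Animal} \sqcap \exists\, \mathit{eats}.\mathit{Animal}
\]
% An intersection is subsumed by each of its conjuncts, so the reasoner derives
\[
  \mathit{Carnivore} \sqsubseteq \mathit{Animal}
\]
```

The subsumption is not asserted anywhere; it follows from the definition, which is why it only shows up in the inferred class hierarchy.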

All this is not really exciting though, and the tutorial ontology is not of a particularly good quality. First, I added knowledge: I played with proper parthood and added a few more plant parts and animals, such as Impala, Warthog, and RockDassie, and also refined knowledge such that giraffes eat not only leaves but also twigs and there are omnivores, too. This version of the African Wildlife Ontology is named AfricanWildlifeOntology1.owl. With this additional knowledge, warthogs are classified as omnivores, lions as carnivores, giraffes as herbivores, and so on. We still miss out on having impalas classified as herbivores; what can—or should—you add to the ontology to achieve that?

However, adding classes and object properties to an ontology does not necessarily make a better quality ontology. One thing that does improve quality with respect to the subject domain is refining the represented knowledge so as to limit the possible models, such as stating that giraffes eat both leaves and twigs, and adding more characteristics to the object properties, for example that is-part-of is not only transitive but also reflexive, and that is-proper-part-of is transitive and irreflexive or asymmetric (the latter we can add thanks to the increased expressiveness of OWL 2 DL compared to OWL-DL). Another aspect is purely engineering: if you intend to put your ontology online, you should name the ontology (in Protégé select “refactor” and “change ontology URI”) so that its contents can be identified appropriately on the Semantic Web. Third, we can improve the ontology’s quality by using a foundational ontology.
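For reference, the object property characteristics mentioned here have the following standard first-order readings, written for a generic relation R (a stand-in for is-part-of or is-proper-part-of):

```latex
\begin{align*}
\text{transitive:}  \quad & \forall x, y, z\, \bigl( R(x,y) \land R(y,z) \rightarrow R(x,z) \bigr) \\
\text{reflexive:}   \quad & \forall x\, R(x,x) \\
\text{irreflexive:} \quad & \forall x\, \lnot R(x,x) \\
\text{asymmetric:}  \quad & \forall x, y\, \bigl( R(x,y) \rightarrow \lnot R(y,x) \bigr)
\end{align*}
```

Asymmetry always implies irreflexivity, and a relation that is both transitive and irreflexive is necessarily asymmetric, so for a transitive relation the two characteristics amount to the same constraint on the models.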

Foundational ontologies provide principal categories of kinds of entities and relations to give a basic structure to a reference or domain ontology. With it, you can avoid reinventing the wheel during ontology development by availing of outcomes from research into the foundations of ontologies and it can guide you to make the modelling process easier to carry out successfully. In addition, it facilitates linking your ontology with other ontologies that also adhere to a foundational ontology. Foundational ontologies contain basic categories such as IndependentContinuant/Endurant (roughly: to represent objects) and Occurrent/Perdurant (informally: processes), and Quality for representing attributes, and then their respective sub-categories, such as AmountOfMatter, Feature, PhysicalObject, Achievement, Function, and SpatialRegion; see, e.g., DOLCE, BFO, GFO, GUM, and SUMO.

For the sake of example, let us take DOLCE [2] to enrich the African Wildlife Ontology. To do this, we need to import into our wildlife ontology an OWLized version of DOLCE; in this case, we import DOLCE-lite.owl. Then, consider first the taxonomic component of DOLCE (see Wonderweb deliverable D18 Fig 2 p14 and Table 1 p15 or explore the imported ontology with its annotations). Where does Plant fit in in the DOLCE categorisation? Giraffes drink water: where should we put Water? Impalas run (fast); where should we put Running? Lions eat impalas, and in the process, the impalas die; where should we put Death? The answers can be found in AfricanWildlifeOntology2.owl. DOLCE is more than a taxonomy, and we can also inspect in more detail its object properties and reuse the already defined properties instead of re-inventing them. First, the African Wildlife Ontology’s is-part-of is the same as DOLCE’s part, and likewise for their respective inverses. Concerning the subject domain, here are a few modelling questions. The elephant’s Tusks (ivory) are made of Apatite (calcium phosphate, an amount of matter); which DOLCE relation can be reused? Giraffes eat leaves and twigs; how do Plant and Twig relate? How would you represent the Size (Height, Weight, etc.) of an average adult elephant; with DOLCE’s Quality or an OWL data property? Answers to the former two questions are included in AfricanWildlifeOntology2.owl.

How does it work out when we import BFO into AfricanWildlifeOntology1.owl? Aside from minor differences (e.g., Death is not a type of Achievement as in DOLCE, but a ProcessBoundary instead, and animals and plants are subtypes of Object), there is a major difference with respect to the object properties (BFO has none). A possible outcome of linking the wildlife ontology to BFO is included in AfricanWildlifeOntology3.owl. To do these last two exercises with DOLCE and BFO in a transparent and reusable way, however, we need a mapping between the two foundational ontologies. Even more so: if there were a proper mapping, only one of the two exercises would have sufficed and the software would have taken care of the mappings between the two. But, alas, such a mapping and implementation is yet to be done.

One could take the development a step further by adding types of part-whole relations [3] so as to be more precise than only a generic part-of relation (e.g., Root is a structural part of some Plant and NatureReserve is located-in some Country) and/or consider a Content Ontology Design Pattern [4], such as being more finicky about names for plants and animals with, perhaps, the Linnaean Taxonomy content pattern or adding some information on the Climatic Zone where the plants and animals live, and so on. (But note that regarding content, one also can take a bottom-up approach to ontology development with resources such as the Environment Ontology or pick and choose from ‘semantified’ Biodiversity Information Standards etc.)

References

[1] Antoniou, G, van Harmelen, F. A Semantic Web Primer. MIT Press, 2003.

[2] Masolo, C., Borgo, S., Gangemi, A., Guarino, N., Oltramari, A. WonderWeb Deliverable D18–Ontology library. WonderWeb. 2003.

[3] Keet, C.M. and Artale, A. Representing and Reasoning over a Taxonomy of Part-Whole Relations. Applied Ontology, IOS Press, 2008, 3(1-2): 91-110.

[4] Presutti, V., Gangemi, A., David, S., de Cea, G. A., Suárez-Figueroa, M. C., Montiel-Ponsoda, E., Poveda, M. A library of ontology design patterns: reusable solutions for collaborative design of networked ontologies. NeOn deliverable D2.5.1, Institute of Cognitive Sciences and Technologies (CNR). 2008.

South African women on leadership in science, technology and innovation

Today I participated in the Annual NACI symposium on the leadership roles of women in science, technology and innovation in Pretoria, organized by the National Advisory Council on Innovation; I report on it further below. As preparation for the symposium, I searched a bit to consult the latest statistics and see if there are any ‘hot topics’ or ‘new approaches’ to improve the situation.

General statistics and their (limited) analyses

The Netherlands used to be at the bottom end of the country league tables on women professors (from my time as elected representative in the university council at Wageningen University, I remember a UN table from ’94 or ’95 where the Netherlands was third from last of all countries). It has not improved much over the years. From Myklebust’s news item [1], I traced the statistics to Monitor Women Professors 2009 [2] (carried out by SoFoKleS, the Dutch social fund for the knowledge sector): less than 12% of the full professors in the Netherlands are women, with the Universities of Leiden, Amsterdam, and Nijmegen leading the national league table and the testosterone bastion Eindhoven University of Technology closing the ranks with a mere 1.6% (2 out of 127 professors are women). With the baby boom generation having clogged the pipeline for a while now, the average percentage increase has been about 0.5% a year, which is way too low to come even near the EU Lisbon Agreement Recommendation’s target of 25% by 2010, or even the Dutch target of 15%. This large cohort will retire soon, however, which, in the words of the report authors, makes for a golden opportunity to move toward gender equality more quickly. The report also has come up with a “Glass Ceiling Index” (GCI, the percentage of women in job category X-1 divided by the percentage of women in job category X) and, implicitly, an “elevator” index for men in academia. In addition to the hard data to back up the claim that the pipeline is leaking at all stages, they note it varies greatly across disciplines (see Table 6.3 of the report): in Science, the most severe blockage is from PhD to assistant professor; in Agriculture, Technology, Economics, and Social Sciences it is the step from assistant to associate professor; and for Law, Language & Culture, and ‘miscellaneous’, the biggest hurdle is from associate to full professor. Of all GCIs, the highest (2.7) is in Technology in the promotion from assistant to associate professor, whereas there is almost parity at that stage in Language & Culture (GCI of 1.1, the lowest value anywhere in Table 6.3).
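The Glass Ceiling Index as defined above is a simple ratio, which can be sketched in a few lines of Python. The function name and the input percentages are my own and purely illustrative; the inputs are made-up numbers chosen so that the ratio comes out at 2.7, and they are not taken from the report.

```python
def glass_ceiling_index(pct_women_lower: float, pct_women_higher: float) -> float:
    """GCI: percentage of women in job category X-1 divided by the percentage in category X."""
    if pct_women_higher <= 0:
        raise ValueError("the percentage of women in the higher category must be positive")
    return pct_women_lower / pct_women_higher


# Hypothetical percentages (not from the report): 27% women among assistant
# professors versus 10% among associate professors.
print(round(glass_ceiling_index(27.0, 10.0), 2))

# Equal percentages in both categories give a GCI of exactly 1 (parity).
print(glass_ceiling_index(20.0, 20.0))
```

A GCI of 1 thus means women are equally represented at both steps of the ladder, and the further above 1, the thicker the glass ceiling at that step.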

“When you’re left out of the club, you know it. When you’re in the club, you don’t see what the problem is.” Prof. Jacqui True, University of Auckland [4]

Elsewhere in ‘the West’, statistics can look better (see, e.g., The American Association of University Professors (AAUP) survey on women 2004-05), or are not great either (UK, see [3], but the numbers are a bit outdated). However, one can wonder about the meaning of such statistics. Take, for instance, the NYT article on a poll about paper rights vs. realities carried out by the Pew Research Center in 22 countries [4]: in France, some 100% paid lip service to being in favour of equal rights, yet 75% also said that men had a better life. It is only in Mexico (56%), Indonesia (55%) and Russia (52%) that the people who were surveyed said that women and men have achieved a comparable quality of life. But note that the latter statement is not the same as gender equality. And equal rights and opportunities by law do not automatically imply that the operational structures are non-discriminatory and an adequate reflection of the composition of society.

A table that has generated much attention and questions over the years (but, as far as I know, no conclusive answers) is the one published in Science Magazine [5] (see figure below). Why is it the case that there are relatively many more women physics professors in countries like Hungary, Portugal, the Philippines and Italy than in, say, Japan, USA, UK, and Germany? Recent guesses at the answer (see blog comments) are as varied as the anecdotes mentioned in the paper.

Physics professors in several countries (Source: 5).

Barinaga’s [5] collection of anecdotes of several influential factors across cultures includes: a country’s level of economic development (longer established science propagates the highly patriarchal society of previous centuries), the status of science there (e.g., low and ‘therefore’ open to women), class structure (pecking order: rich men, rich women, poor men, poor women vs. gender structure rich men, poor men, rich women, poor women), educational system (science and mathematics compulsory subjects at school, all-girls schools), and the presence or absence of support systems for combining work and family life (integrated society and/or socialist vs. ‘Protestant work ethic’), but the anecdotes “cannot purport to support any particular conclusion or viewpoint”. It also notes that “Social attitudes and policies toward child care, flexible work schedules, and the role of men in families dramatically color women’s experiences in science”. More details on statistics of women in science in Latin America can be found in [6] and [7], which look a lot better than those of Europe.

Barbie the computer engineer

Bonder, in her analysis for Latin America [7], has an interesting table (cuadro 4) on the changing landscape for trying to improve the situation: data is one thing, but how to struggle, and which approaches, advertisements, and policies have been, can, or should be used to increase women’s participation in science and technology? Her list is certainly more enlightening than the lame “We need more TV shows with women forensic and other scientists. We need female doctor and scientist dolls.” (says Lotte Bailyn, a professor at MIT) or “Across the developed world, academia and industry are trying, together or individually, to lure women into technical professions with mentoring programs, science camps and child care.” [8], which only very partially address the issues described in [5]. Bonder notes shifts in approaches from focusing only on women/girls to both sexes, from change in attitude to change in structure, from change of women (taking men as the norm) to change in power structures, from focusing on formal opportunities to changing the real opportunities in discriminatory structures, from making visible non-traditional role models to making visible the values, interests, and perspectives of women, and from the simplistic gender dimension to the broader articulation of gender with race, class, and ethnicity.

The NACI symposium

The organizers of the Annual NACI symposium on the leadership roles of women in science, technology and innovation provided several flyers and booklets with data about women and men in academia and industry, so let us start with those. Page 24 of Facing the facts: Women’s participation in Science, Engineering and Technology [9] shows the figures for women by occupation: 19% full professor, 30% associate professor, 40% senior lecturer, 51% lecturer, and 56% junior lecturer, which are in a race distribution of 19% African, 7% Coloured, 4% Indian, and 70% White. The high percentage of women’s participation (compared to, say, the Netherlands, as mentioned above) is somewhat overshadowed by the statistics on research output among South African women (p29, p31): female publishing scientists are just over 30% and women contributed only 25% of all article outputs. That low percentage clearly has to do with the lopsided distribution of women on the lower end of the scale, with many junior lecturers who conduct much less research because they have a disproportionately heavy teaching load (a recurring topic during the breakout session). Concerning the distribution of grant holders in 2005, in the Natural & agricultural sciences about 24% of the total grants (211 out of 872) were awarded to women, and in engineering & technology it is 11% (24 out of 209 grants) (p38). However, in Natural & agricultural sciences women make up 19%, and in engineering and technology 3%, which, taken together with the grant percentages, shows that a disproportionately large share of the grants has gone to women in recent years. This suggests that the women who do make it to the advanced research stage are at least as good as, if not better than, their male counterparts. Last year, women researchers (PIs) received more than half of the grants and more than half of the available funds (table in the ppt presentation of Maharaj, which will be made available online soon).
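The comparison between grant shares and women's share in each field is a small piece of arithmetic, sketched below in Python. The counts and percentages are the ones quoted above from the booklet; the function names and the "representation ratio" (grant share divided by share of women in the field) are my own shorthand for the comparison, not a measure used in the booklet.

```python
def share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Grants awarded to women out of all grants in 2005, as quoted from p38.
grants = {
    "natural & agricultural sciences": (211, 872),
    "engineering & technology": (24, 209),
}

# Percentage of women in each field, as quoted in the text.
women_in_field = {
    "natural & agricultural sciences": 19.0,
    "engineering & technology": 3.0,
}


def representation_ratio(field: str) -> float:
    """Grant share divided by share of women in the field (my own shorthand measure)."""
    awarded, total = grants[field]
    return share(awarded, total) / women_in_field[field]


for field in grants:
    awarded, total = grants[field]
    print(field, round(share(awarded, total), 1), round(representation_ratio(field), 2))
```

A ratio above 1 means women hold a larger share of the grants than of the positions in that field, which is the case for both fields here.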

Mrs Naledi Pandor, the Minister for Science and Technology, gave the opening speech of the event, which was a good and entertaining presentation. She talked about the lack of qualified PhD supervisors to open more PhD positions, where the latter is desired so as to move to the post-industrial, knowledge-based economy, which, in theory at least, should make it easier for women to participate than in an industrial economy. She also mentioned that one should not look at just the numbers, but instead at the institutional landscape so as to increase opportunities for women. Last, she summarized the “principles and good practice guidelines for enhancing the participation of women in the SET sector”, which are threefold: (1) sectoral policy guidelines, such as gender mainstreaming, transparent recruiting policies, and health and safety at the workplace, (2) workplace guidelines, such as flexible working arrangements, remuneration equality, mentoring, and improving communication lines, and (3) re-entry into the Science, Engineering and Technology (SET) environment, such as catch-up courses, financing fellowships, and remaining in contact during a career break.

Dr. Thema, former director of international cooperation at the Department of Science and Technology added the issues of the excessive focus on administrative practicalities, the apartheid legacy and frozen demographics, and noted that where there is no women’s empowerment, this is in violation of the constitution. My apologies if I have written her name and details wrongly: she was a last-minute replacement for Prof. Immaculada Garcia Fernández, department of computer science at the University of Malaga, Spain. Garcia Fernández did make available her slides, which focused on international perspectives on women leadership in STI. Among many points, she notes that the working conditions for researchers “should aim to provide… both women and men researchers to combine work and family, children and career” and “Particular attention should be paid, to flexible working hours, part-time working, tele-working and sabbatical leave, as well as to the necessary financial and administrative provisions governing such arrangements”. She poses the question “The choice between family and profession, is that a gender issue?”

Dr. Romilla Maharaj, executive director for human and institutional capacity development at the National Research Foundation, came with much data from the same booklet I mentioned in the first paragraph, but little qualitative analysis of this data (there is some qualitative information). She wants to move from the notion of “incentives” for women to “compensation”. The aim is to increase the number of PhDs five-fold by 2018 (currently the rate is about 1200 each year), which is not going to be easy (recollect the comment by the Minister, above). Concerning policies targeted at women’s participation, they appear to be successful for white women only (in postdoc bursaries, white women even outnumber white men). In my opinion, this smells more of a class/race structure issue than a gender issue, as mentioned above and in [5]. Last, the focus of improvements, according to Maharaj, should be on institutional improvements. However, during the break-out session in the afternoon, which she chaired, she seemed to be selectively deaf on this issue. The problem statement for the discussion was the low research output by women scientists compared to men, and how to resolve that. Many participants reiterated the lack of research time due to the disproportionately heavy teaching load (compared to men) and what is known as ‘death by committee’, and the disproportionate number of (junior) lecturers who are counted in the statistics as scientists but, in practice, do little or no research, thereby pulling down the overall statistics for women’s research output. Another participant wanted to see a further breakdown of the numbers by age group, as the suspicion was that it is old white men who produce most papers (who teach less, have more funds, supervise more PhD students etc.) (UPDATE 13-10-’10: I found some data that seems to support this). In addition, someone pointed out that counting publications is one thing, but considering their impact (by citations) is another, for which no data was available, so that a recommendation was made to investigate this further as well (and to set up a gender research institute, which apparently does not yet exist in South Africa). The pay-per-publication scheme implemented at some universities could thus backfire for women (who require the time and funds to do research in the first place so as to get at least a chance to publish good papers). Maharaj’s own summary of the break-out session was an “I see, you want more funds”, but that does not square fully with the institutional change she mentioned earlier nor with the multi-faceted problems raised during the break-out session that did reveal institutional hurdles.

Prof. Catherine Odora Hoppers, DST/NRF South African Research Chair in Development Education (among many things), gave an excellent speech with provocative statements (or: calling a spade a spade). She noted that going into SET means entering an arena of bad practice and intolerance; to fix that, one first has to understand how bad culture reproduces itself. The problem is not the access, she said, but the terms and conditions. In addition, and as several other speakers already had alluded to as well, she noted that one has to deal with the ghosts of the past. She put this in a wider context of the history of science with the value system it propagates (Francis Bacon, my one-line summary of the lengthy quote: science as a means to conquer nature so that man can master and control it), and the ethics of SET: SET outcomes have, and have had, some dark results, where she used the examples of the atom bomb, gas chambers, how SET was abused by the white male belittling the native and that it has been used against the majority of people in South Africa, and climate change. She sees the need for a “broader SET”, meaning ethical, and (in my shorthand notation) with social responsibility and sustainability as essential components. She is putting this into practice by stimulating transdisciplinary research at her research group, and, at least and as a first step: people from different disciplines must be able to talk to each other and understand each other.

To me, as an outsider, it was very interesting to hear what the current state of affairs is regarding women in SET in South Africa. While there were complaints, there were also suggestions for solutions, and it was clear from the data available that some improvements have been made over the years, albeit only in certain pockets. More people registered for the symposium than places available, and with some 120 attendees from academia and industry at all stages of the respective career paths, it was a stimulating mix of input that I hope will further improve the situation on the ground.

References

[1] Jan Petter Myklebust. THE NETHERLANDS: Too few women are professors. University World News, 17 January 2010, Issue: 107.

[2] Marinel Gerritsen, Thea Verdonk, and Akke Visser. Monitor Women Professors 2009. SoFoKleS, September 2009.

[3] Helen Hague. 9.2% of professors are women. Times Higher Education, May 28, 1999.

[4] Victoria Shannon. Equal rights for women? Surveys says: yes, but…. New York Times/International Herald Tribune—The female factor, June 30, 2010.

[5] Marcia Barinaga. Overview: Surprises Across the Cultural Divide. Compiled in: Comparisons across cultures. Women in science 1994. Science, 11 March 1994 263: 1467-1496 [DOI: 10.1126/science.8128232]

[6] Beverley A. Carlson. Mujeres en la estadística: la profesión habla. Red de Reestructuración y Competitividad, CEPAL – SERIE Desarrollo productivo, nr 89. Santiago de Chile, Noviembre 2000.

[7] Gloria Bonder. Mujer y Educación en América Latina: hacia la igualdad de oportunidades. Revista Iberoamericana de Educación, Número 6: Género y Educación, Septiembre – Diciembre 1994.

[8] Katrin Bennhold. Risk and Opportunity for Women in 21st Century. New York Times/International Herald Tribune—The female factor, March 5, 2010.

[9] Anon. Facing the facts: Women’s participation in Science, Engineering and Technology. National Advisory Council on Innovation, August 2009.

The complexity of… coffee

The KRR group I am visiting has created a “coffee reserve bank” for managing the use of the Illy coffee pods and the non-Illy espresso machine. Despite efforts, they did not manage to source a true Illy espresso machine in South Africa. Does it make a difference? You might think it does not—compared to the alternative of soluble coffee granules, the difference is indeed negligible—but one should not underestimate the complexities of the whole endeavour of making a good espresso. For instance, as the late Mr. Ernesto Illy has shown, the perfect timing for an espresso is 30 seconds. Not longer, not shorter.

To see why it is 30 seconds, take a look at Figure 1 (from the entertaining article in Scientific American [1] and freely available here). Less than 30 seconds results in a coffee with too few aroma compounds, and longer than 30 seconds causes the water to extract too many undesirable aroma compounds. Yes, there is a science behind the perfect coffee; there are 23,814 articles in PubMed on caffeine, 7740 hits for coffee (d.d. 3-8-10; searching the Food Science and Technology Abstracts costs money). Coffee is “a polyphasic colloidal system, in which water molecules are bound to the dispersed gas bubbles, oil droplets and solid fragments, all of which are less than five microns in size.”…

The cumulative chemical composition of espresso with increasing extraction time (Source: 1).

What else is essential for the perfect espresso?

First, there are the plants with their genetic composition (Coffea Arabica is the best) and the growth conditions. The latter includes the soil composition, altitude, amount of rainfall, sunlight, and temperature fluctuations.

Second, we have to take a look at the beans produced by the coffee plants and what is done with them. Deciding when to pick them and either to sun-dry or to wash them affects their chemical composition and, hence, the final taste. Bean batches are scanned for moldy beans using ultraviolet fluorescence, trichromatic mapping is used to generate a colour footprint, and photo-electric cells detect duds, so that each individual bean is analysed—at a rate of 400 beans per second (note: 50 are needed for 1 espresso.)

Third, there is the roasting process. The green bean that enters the roasting process has about 250 volatile molecule types, which is bumped up to over 800 during the roasting process. At a temperature of 185-240 degrees Celsius, the well-known Maillard reaction takes place between amino acids and sugars (the same reaction is responsible for the brown bread crust). Depending on the temperature of the roasting, it should be done for a longer (up to 40 minutes) or shorter (90 seconds) period of time, thereby affecting the final result (different times and temperatures cause different chemical reactions to take place). The aromas that are generated during this process are analysed by gas chromatography and with olfactometry, and subsequently mass spectrometry is used to identify the chemicals.

According to the New York Times obituary on Ernesto Illy, there are 114 quality-control checks between the arrival of the raw beans at the start of the food processing and shipping the roasted beans. So much for the coffee bean itself; on to brewing the coffee.

The extraction of the components from the ground coffee is done by heated water. Not discussed explicitly in the SciAm paper, but using plain physics and chemistry: what is extracted depends on the size of the granules (so that there is more or less surface to come into contact with the water), the temperature of the water, the pressure applied when the water percolates through the granules, and how long the water is in contact with the granules. The longer the two are in contact, the more gets extracted. In casu (continuing with information from [1]), with filter coffee, the contact is 4-6 minutes, which causes relatively large quantities of acids and caffeine to dissolve; go figure the extraction with the French press/cafetière method, which is also associated with elevated levels of cholesterol (sci and popsci article). During the 30 seconds percolation for espresso, less acid and only about 60-70% of the caffeine is dissolved into the liquid. (So, a “long espresso” gives you more caffeine.) As a side note, the water temperature for espresso is 92-94 degrees Celsius, the pressure goes up to 9 atmospheres, and when the hydraulic resistance of the bed of ground coffee (the pod or loose coffee) is slightly less than the pressure of the water, the water flows through at a rate of 1 ml per second, producing about 30 ml espresso coffee in 30 seconds.
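The last bit of arithmetic can be written down as a tiny Python sketch. It assumes the flow rate is roughly 1 millilitre per second and constant, which is what 30 ml in 30 seconds implies; the function name and the 45-second "long espresso" are my own illustrative choices, not figures from [1].

```python
def brewed_volume_ml(flow_rate_ml_per_s: float, seconds: float) -> float:
    """Volume in the cup, assuming a constant flow rate through the coffee bed."""
    return flow_rate_ml_per_s * seconds


# Standard espresso: about 1 ml per second for 30 seconds.
print(brewed_volume_ml(1.0, 30.0))

# Hypothetical "long espresso": same flow rate, 45 seconds of percolation.
print(brewed_volume_ml(1.0, 45.0))
```

The longer contact time gives a larger cup and, as noted above, more extracted caffeine, at the price of more of the undesirable compounds.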

For the inexperienced, here are some suggestions to check if the previous step is done right, using the crema, i.e., foam, on top of the coffee as a measure. This part is not easy to summarise, so here it is in the words of Ernesto Illy [1]:

If the color of the foam topping is light, it means that the espresso has been underextracted, probably because the grind was too coarse, the water temperature too low or the time too short. If the crema is very dark in hue and has a “hole” in the middle, it is likely that the consistency of the coffee grounds was too fine or the quantity of grounds was too large. An overextracted espresso exhibits either a white froth with large bubbles if the water was too hot or just a white spot in the center of the cup if the brewing time was too long.

If you want to know more about it: read the paper, check out the Association for the Science and Information on Coffee that organises scientific conferences on coffee (chemistry, genetics, processing etc etc) and gathers other scientific articles published elsewhere, and experiment yourself :).

To close with a quote from an interview with Illy in 2005 put online at The Written Word:

it is easy to try for oneself and realise why coffee is a cup that ticks. Place mugs of espresso, regular coffee, tea and cola. “The cola is sweet but you drink it and forget it, it has no after-taste; tea is rich in aroma but has little body and is low in taste; regular coffee does not have too much aroma — milk has already reduced its bitterness; and finally, there is the espresso which is rich in aroma and has a lasting impression of taste. It is a clear winner. With espresso, you cannot cheat on its flavour or taste.”

References

[1] Ernesto Illy. The complexity of coffee. Scientific American, June 2002, 86-92.