Journal paper on granular perspectives

My paper “The granular perspective as semantically enriched granulation hierarchy” [1], an invited extended version of the GrC’09 paper, has been accepted in the International Journal of Granular Computing, Rough Sets and Intelligent Systems.

Although the paper is rather logic-based and full of definitions, propositions, and proofs, the underlying idea is quite straightforward. One granulates data and information in multiple ways to generate (granulation) hierarchies that have levels containing more or less detail about one’s domain of interest. Such hierarchies can be as simple as a partonomy of, say, all parts of the human structural anatomy in medicine (cell, organ, body, etc.) or administrative boundaries in GIS (city, province, etc.). However, what the characteristics of such hierarchies are and what consequences they have on levels of granularity is left implicit throughout the literature on granularity. Being imprecise about them can easily yield nonsensical hierarchies that use different criteria to granulate information at different levels of detail, which is undesirable in general and for software implementations in particular, whereas having a way to declare that knowledge explicitly gives new opportunities for computation: for instance, querying at one’s desired level of detail instead of perpetual, time-consuming browsing, or simplifying the generation of partial views of an ontology suited to the context of the domain expert.

The paper describes a way to represent that hitherto implicit information in a structured and consistent manner throughout, and justifies why it makes sense to include exactly those properties of the hierarchies. Such a ‘dressed up’ hierarchy I call a granular perspective. Granular perspectives can be uniquely identified, hence distinguished, by formally representing their semantics using a granulation criterion (by what attributes or properties do you divide things up) and a type of granularity (how do you divide them up) used for granulation. For instance, the criterion could be human structural anatomy (cf., say, functional anatomy, or processes in the human body) and the mechanism one specific type of relation between the entities in the different levels, e.g., parthood (cf., say, subsumption).
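
To make the identification idea a bit more tangible, here is a minimal Python sketch. It is not the formalisation from the paper: the class name `GranularPerspective`, the `register` helper, and the example levels are made up for illustration. The only thing taken from the text above is that a perspective is identified by the combination of its granulation criterion and its type of granularity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GranularPerspective:
    """Illustrative only: a granulation hierarchy plus its identifying semantics."""
    criterion: str         # by what do you divide things up
    granularity_type: str  # how do you divide them up
    levels: tuple = ()     # level names, ordered from coarse to fine

    @property
    def identity(self):
        # A perspective is identified by criterion + type of granularity.
        return (self.criterion, self.granularity_type)


def register(perspectives, perspective):
    """Add a perspective to the dict, refusing a second one with the same identity."""
    if perspective.identity in perspectives:
        raise ValueError(f"perspective {perspective.identity} already exists")
    perspectives[perspective.identity] = perspective


perspectives = {}
register(perspectives, GranularPerspective(
    "human structural anatomy", "parthood", ("body", "organ", "cell")))
# Same criterion, different type of granularity: a distinct perspective.
register(perspectives, GranularPerspective(
    "human structural anatomy", "subsumption"))
print(sorted(perspectives))
```

Registering a second perspective with the same criterion and the same type of granularity raises a `ValueError`, which is the sketch's stand-in for "uniquely identified".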

Those perspectives can be connected to each other consistently, which can be done by a simple relation or using mereological relations, thereby facilitating cross-granular querying and other reasoning scenarios. (Nothing of that is implemented though—it’s good to [try to] have the theory sorted out first.)

There are some copyright restrictions on the paper that is in print at the moment, so if you want to have a copy, feel free to contact me. An earlier, non-self-standing, version with more ontological analysis can be found in two sections of Chapter 3 of my PhD thesis [2]. To my pleasant surprise, Vogt has, independently, applied my theory of granularity—including the perspectives and linking them—in an informal way with biological material entities [3], which some readers of this blog might find more motivational to start reading than the technical details and, admittedly, fairly simple examples I have used to illustrate it in the thesis. Several illustrations from the eco/GIS domain are described elsewhere [4].

References

[1] Keet, C.M. (2011). The granular perspective as semantically enriched granulation hierarchy. Int. J. Granular Computing, Rough Sets and Intelligent Systems, (in print).

[2] Keet, C.M. A formal theory of granularity. PhD Thesis, KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy. 2008.

[3] Vogt, L. Spatio-structural granularity of biological material entities. BMC Bioinformatics, 2010, 11:289.

[4] Keet, C.M. Structuring GIS information with types of granularity: a case study. VI International Conference on Geomatics (Geomatics’09), 10-12 February 2009, Havana, Cuba.

New book on Novel Developments in Granular Computing

Late 2008 I mentioned the forthcoming invited book chapter [1] I wrote for “Novel Developments in Granular Computing: Applications for Advanced Human Reasoning and Soft Computation”, edited by JingTao Yao. Finally, it has been published.

The topics of the book focus on modelling with/representation of granularity, rough sets and logic, data mining, classification, and fuzzy aspects; see the preface and abstracts of the 19 chapters. The free sample chapter is an interesting analysis by Yiyu Yao on Human-Inspired Granular Computing (see menu bar on the left of the page). My contribution is in the modelling section: basically, the book chapter is a self-contained version of chapter 2 of my PhD thesis, with some minor additions from chapters 4 and 5; in short:

Multiple different understandings and uses exist of what granularity is and how to implement it, where the former influences success of the latter with regards to storing granular data and using granularity for automated reasoning over the data or information, such as granular querying for information retrieval. We propose a taxonomy of types of granularity and discuss for each leaf type how the entities or instances relate within its granular level and between levels. Such distinctions give guidelines to a modeler to better distinguish between the types of granularity in the design phase and the software developer to improve on implementations of granularity. Moreover, these foundational semantics of granularity provide a basis from which to develop a comprehensive theory of granularity.

Anyone who has published with IGI before knows about the unusual editing policies and their preferred layout; hence, I will upload the latex-ed preprint soon… here is the preprint.

References

[1] Keet, C.M. A top-level categorization of types of granularity. In: Novel Developments in Granular Computing: Applications for Advanced Human Reasoning and Soft Computation. JingTao Yao (Ed.). IGI Global. 2010. pp81-117.

The, surprisingly somewhat belated, report on the Granular Computing Conference GrC’09

Last week I attended the IEEE International Conference on Granular Computing 2009 in Nanchang, China. I had done my preparations to report on it during the conference as well, but, alas, I was not allowed to even read my own blog, let alone post to it; see [footnote 1] below for some observations on non-accessible blogs. This being the situation, hereby then a slightly belated report.

Tengwang Ge in Nanchang, which was also depicted on the beautiful GrC'09 remembrance plate from Jiangxi University

Keynotes

T.Y. Lin of San José State University, USA (one of the initiators of “Granular Computing” as, in his view, a specialisation area of applied mathematics) spoke about the difference between keyword terms in Web searches and the concepts behind them, and he proposed to solve the issues with category theory. To my question why specifically category theory and not Semantic Web languages and technologies, I unfortunately did not get a clear answer. Xindong Wu of the University of Vermont talked about mining user patterns, aggregations, and user interest modelling with wildcards, where the use of wildcards provides much flexibility (and improvements) in specifying and mining the user patterns. Last, Xue-wen Chen of the University of Kansas first went through the usual introductory aspects of systems biology, to proceed to the actual topics of gene and protein networks. The technologies used are hidden Markov models, Bayesian networks, and a K-GIDDI divide-and-conquer biclustering algorithm that is touched up with gene ontology terms to, respectively: find 4 new genes (their functions) in D. melanogaster (fruit fly), figure out a gene network (80% of what was already known in the literature, but now done by automation), and obtain 95.42% correct function assignment of H. sapiens genes (verified against the 437 genes already annotated) out of an overall 618 function assignments. Not bad, not bad at all; his 2008/9 papers have the details.

Sessions

My paper on granular perspectives ([1] and summary here) was scheduled right at the start in the first of the parallel sessions, in the “Foundations of granular computing”. It was listened to with interest, and I received positive feedback; whether people will use it, time will tell. Hong Hu brought up the topic of dynamic similarity (e.g. the gradual changes from tadpole to frog) and how to deal with it, for which he proposed to use neural networks [2]. However, dealing with the standard ‘static’ similarity, i.e. comparing two objects at the same time, is already a widely researched area, and unsurprisingly, the last word has not been said yet about dynamic similarity either; in fact, perhaps they were the first ones. The same session had scheduled a paper with a preliminary, set-based, notion for a theory of granularity [3], which already looks ahead to using attributes of the objects (but that was not fully integrated in the theory yet), and gives “granule”, as a lump of objects in a granular level, an explicit place in the theory. Yinliang Zhao’s paper was about a way to granulate program code, in particular that of object-oriented programming languages, with a restriction, thus far, to single-inheritance class hierarchies (the programming version of taxonomies) [4].

In the afternoon, I went to the ‘Japanese session’, where, among others, Toyota presented an application paper on visualising the topics of, and navigating to, the 7000 or so Japanese laws to acquaint Japanese citizens with them (this year Japan changed its judicial system, which now has a citizen-judge, or ‘saiban-in’, system) [5]. Is this practically scalable to the Italian system? It should be, in theory at least, but with its more than 70000 laws, it will require more levels of granularity than the four they have so as to obtain appropriate overviews. In addition, the relations of the (hierarchically ordered) key terms in the texts of the full collection of Japanese laws have the characteristic of a so-called “small world network”, which makes it very suitable for visualisation. My experience with the Italian bureaucracy and its rules for the strangest things gives me the impression that that may not be the case with the Italian laws (and Italian laws can benefit from an automated consistency check, but that is a separate topic), but it is not a trivial exercise to actually verify or refute this hunch. As a tidbit of fun information about the relatedness of the keywords extracted from the Japanese laws: “Nation” scored highest with 1020 links, followed by “Money” with 981 links, which the presenter found curious enough to emphasise.

There were three sessions on rough sets: applications, theory, and computing. The applications session had a paper on using a “knowledge quantity” for the relative importance of attributes used to compute the rough sets and applying that to Chinese text categorisation using the “document frequency thresholding” characteristic [6]: while common terms appear to be important for global performance, rare terms “are the most informative” to be able to discern (make distinguishable) those documents from others and are, from a rough set perspective, therefore influential because they have most effect on the equivalence structure. Li [7], on the other hand, improved on the “extenics” company evaluation method by using rough sets so that the number of company indicators, such as “human capital” and “technological innovation ability”, could be reduced, and hence the company evaluation method simplified. On the theory side, there was, among others, a paper on neighborhood systems with respect to rough sets where a new “and” operator is introduced and, as the authors claim, is “different from traditional rough set approximations” [8]. The remainder of the paper to back up this claim is rather dense, but T.Y. Lin summarised his students’ work as showing that the lower and upper approximations in VPRS are special cases of the interior and closure in topological space. Last, Chen, Li and coauthors sought to dynamically update the upper and lower approximations of a rough set to reflect the changes in the underlying information system over time, and they presented the theory, algorithm, and experimental validation in [9,10].
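
For readers not familiar with the lower and upper approximations that keep coming back in these papers, here is a minimal Python sketch of the standard (Pawlak) definitions on a toy table. It is not the method of any of the cited papers; the table, the attribute names `sport` and `rare_term`, and the function names are invented for illustration. It does show, in miniature, why an attribute that splits an equivalence class (like a rare term) changes the approximations.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group objects that have identical values on the chosen attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximation of the target set of objects."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy data: four documents, two binary term attributes.
table = {
    "d1": {"sport": 1, "rare_term": 0},
    "d2": {"sport": 1, "rare_term": 0},
    "d3": {"sport": 1, "rare_term": 1},
    "d4": {"sport": 0, "rare_term": 0},
}
target = {"d1", "d3"}  # the documents we want to characterise

print(approximations(table, ["sport"], target))
print(approximations(table, ["sport", "rare_term"], target))
```

With only the common attribute, no equivalence class fits inside the target, so the lower approximation is empty; adding the rare attribute isolates d3 and puts it in the lower approximation, while the upper approximation stays {d1, d2, d3}.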

Start of the walk at Lushan mountain, after the flower garden

The House where Nobel Prize winner (literature, 1938) Pearl Buck lived on the Lushan Mountain

Other

As a social event, besides the conference dinner, we had a trip to Lushan mountain, which is a UNESCO world heritage site. Although I had to skip the walking sessions, the scenery is really beautiful and the temperature was comfortable. The “many old buildings” we visited turned out to be the missionaries’ outpost of about 100-150 years ago, including a not very protestant church that is (still/again?) used for weddings, and the home of Pearl Sydenstricker Buck, who won the Nobel Prize in Literature for her writings about life in China.

Each participant also received a beautiful present from the local organisation, Jiangxi University of Finance and Economics: a black ceramic plate with in gold-coloured imprints the name of the university, IEEE GrC 2009, and in the centre the famous building Tengwang Ge.

Travelling to China is a bit of a hassle with the visa, and knowing some Chinese (which I do not) will be useful for getting around and getting things done, but nevertheless I highly recommend visiting the country, be it for a conference or a holiday: the people are friendly and very helpful, the food is delicious, and there are lots of things to see and do.


References

  1. C. Maria Keet. From granulation hierarchy to granular perspective. In: Proceedings of the 5th IEEE international conference on Granular Computing 2009 (GrC’09). 17-19 August, Nanchang, China. IEEE Computer Society, 306-311.
  2. Hong Hu and Zhongzhi Shi. Machine learning as granular computing. In: Proc. of GrC’09. IEEE Computer Society, 229-234.
  3. Hong Li. Granule, Granular Set and Granular System. In: Proc. of GrC’09. IEEE Computer Society, 340-345.
  4. Yinliang Zhao. A step toward code granulation space. In: Proc. of GrC’09. IEEE Computer Society, 799-804.
  5. Tetsuya Toyota and Hajime Nobuhara. Hierarchical structure analysis and visualisation of Japanese law networks based on morphological analysis and granular computing. In: Proc. of GrC’09. IEEE Computer Society, 539-543.
  6. Yan Xu and Wang Bin. Knowledge management based on rough set. In: Proc. of GrC’09. IEEE Computer Society, 654-657.
  7. Yuan-yuan Li and Jun Yun. A comprehensive evaluation method based on extenics and rough set. In: Proc. of GrC’09. IEEE Computer Society, 381-383.
  8. Xibei Yang, Xinzhe Li and Tsau Young Lin. First GrC model — Neighborhood Systems: the most general rough set models. In: Proc. of GrC’09. IEEE Computer Society, 691-695.
  9. Weili Zou, Tianrui Li, Hongmei Chen, Xiaolan Ji. Approaches for incrementally updating approximations based on set-valued information systems while attribute values’ coarsening and refining. In: Proc. of GrC’09. IEEE Computer Society, 824-829.
  10. Hongmei Chen, Tianrui Li, Weibin Liu. Research on the approach of dynamically maintenance of approximations in rough set theory while attribute values coarsening and refining. In: Proc. of GrC’09. IEEE Computer Society, 45-48.

Notes

[footnote 1] Regular readers may recollect that Cuba did not block my blog, and that I have written a post there during the Informatica 2009 conference about the VIP session. This made me curious as to what type of blogs are (not) accessible here in China. Some observations (pages checked on 16 and 17 Aug 2009):

  1. WordPress: I did a random check of a few other wordpress blogs with full names as well as xxx.wordpress.com and .org types, such as Duncan’s and WP’s own blog with tips ’n tricks, to ascertain if it was just my blog being “timed out”, but all those blogs were “timed out”, too. I could access WP’s startpage.
  2. Blogspot: I tried Ben’s and FSP’s blogs, which had a “connection interrupted” message. The http://www.blogger.com had a quick “connection interrupted” message, idem http://www.blogspot.com.
  3. Typepad: the frontpage already “timed out”, idem specific typepad blogs.
  4. Other blogs that do not run through one of those blogging sites but have their own software running, such as those of Michael Nielsen, LogBlog, and Microbeworld are accessible, but not the asmblog of the American Society for Microbiology, such as Small things considered (“timed out”, although the ASM was accessible).
  5. Curiously, when I did a Google search on “blog filters china”, one of the top hits returned was the accessible Harvard blog called “internet and democracy blog”, but the first hit returned by the Google search was a news item at National Public Radio that Microsoft implements blog filters for China, which closes with the line “Microsoft’s blogging filter could be seen as taking American companies’ cooperation with censorship to a new level. Instead of merely blocking what Internet users can read, she says, Microsoft is now limiting what they can write.”

So, to whoever developed the filtering algorithms: there is room for improvement of your work; unless the owners of the three above-mentioned blogging sites already do this blocking themselves preemptively, which I hope is not the case. To whoever wants their blog to also reach the Chinese in China: for the time being, install your own blogging software.

Enhancing granulation hierarchies

While the paper entitled From granulation hierarchy to granular perspective for this year’s IEEE International Conference on Granular Computing (GrC’09) has been accepted for a while [1], it took some effort to get the colourful sticker (visa) glued into my passport that allows me entry into China, where the conference will be held.

The paper is a shorter, and perhaps also more readable, version of Section 3.3 of my thesis, where the considerations and argumentation of the ontological aspects are mostly left out, so that some explanatory text and the definitions, lemmas, and theorems remain. The aim is to augment so-called granulation hierarchies–those things you get when linking up different levels of granularity (or their data at different levels of detail)–with several attributes and a way to unambiguously identify such hierarchies, which I then call granular perspectives.

Here’s the abstract:

It is well-known that one can granulate data and information in multiple ways to generate a plethora of granulation hierarchies each with their levels of granularity. It is left implicit what the characteristics of such hierarchies are, and what consequences they have on levels of granularity. We propose a way to represent such additional information of granulation hierarchies by upgrading them to full granular perspectives and to provide a consistent way to uniquely identify, hence, distinguish, such perspectives based on their semantics by using a criterion for granulation and type of granularity used for granulation. In addition, with the chosen premises, definitions, and proven properties, we demonstrate some consequences for characterising levels of granularity within such granular perspectives.

If the 6 pages do not satisfy your appetite for the topic and you want to read more about properties and the criterion for granulation and see more examples, then Section 3.3 of the thesis will be useful. More consequences of granular perspectives on granular levels can be found in Section 3.4 of the thesis.

References

[1] Keet, C.M. From granulation hierarchy to granular perspective. IEEE International Conference on Granular Computing (GrC’09), Nanchang, China, August 17-19, 2009. IEEE Computer Society, pp .

Rough Sets Theory workshop in Milan

While it was exceptionally warm weather outside, we stayed inside in a comfortable atmosphere in one of the aulas at the University of Milano-Bicocca, which had organised the first Rough Sets Theory workshop, 25-27 May 2009. With an emphasis on theory: there are many applications of rough sets, but “Even though this attention to application is of great importance, it is not excluded that theoretical aspects concerning with foundations of rough sets, both logical and mathematical, must be taken into account.”

As I’m no expert on rough sets (but there is an interesting relationship between rough sets and granularity, which was the topic of my presentation), the different topics covered by the programme were very interesting to me and gave a useful overview of the range of research topics. As it appears, there’s plenty of work still to be done on rough sets theory, even though the basic description of rough sets is elegant and simple, and the ambience provided ample opportunity for exchange of ideas and lively discussions.

Topics ranged from using rough sets with ordinal data and substituting the indistinguishability relation with a dominance relation, presented by Salvatore Greco, to discussions on what the essential ‘ingredients’ of rough sets are, to presentations on definitions of rough sets by Mihir Chakraborty and on the differences between Pawlak rough sets and probabilistic rough sets by Yiyu Yao. For instance, on the latter, Pawlak rough sets consider qualitative aspects and have zero tolerance for errors, whereas probabilistic rough sets are about quantitative aspects and acceptance of error; Yao proposed a solution to deal with both, called decision-theoretic rough sets. Organizer Gianpiero Cattaneo also talked about foundational and mathematical aspects of rough sets, but then using a binary relations approach (more detailed information can be found here).
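
The contrast between zero tolerance and acceptance of error can be sketched in a few lines of Python. This is my own simplified rendering, not code from the talk: it takes the two thresholds `alpha` and `beta` as plain parameters (in decision-theoretic rough sets they would be derived from a loss function, which is not modelled here), and the function name and toy data are invented for illustration. Setting `alpha=1` and `beta=0` gives the Pawlak case.

```python
def three_regions(blocks, target, alpha=1.0, beta=0.0):
    """Split equivalence classes into positive, boundary and negative regions.

    A block goes to the positive region if the fraction of its objects
    that are in the target is at least alpha, to the negative region if
    that fraction is at most beta, and to the boundary otherwise.
    """
    positive, boundary, negative = set(), set(), set()
    for block in blocks:
        fraction = len(block & target) / len(block)
        if fraction >= alpha:
            positive |= block
        elif fraction <= beta:
            negative |= block
        else:
            boundary |= block
    return positive, boundary, negative


# Hypothetical equivalence classes over objects 1..10 and a target set.
blocks = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9, 10}]
target = {1, 2, 3, 5, 6, 7}

# Pawlak: no error tolerated, so two blocks end up in the boundary.
print(three_regions(blocks, target))
# Probabilistic: accept blocks that are at least 70% in, reject those at most 30% in.
print(three_regions(blocks, target, alpha=0.7, beta=0.3))
```

In the toy data the first block is 75% inside the target and the last one 25%, so the relaxed thresholds empty the boundary region, at the price of accepting object 4 and rejecting object 7 wrongly.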

Fertile ground for discussion and misunderstandings, due to the different backgrounds and assumptions of the attendees, was the notion of incompleteness. Simply put, given some ‘data table’ (which is not necessarily a database table), there may be null values, but what does that represent? Incomplete information? To make a long story short: it depends on the context (the semantics of the structure you use, language). Didier Dubois approached it from, among others, a setting of incomplete information in database integration and considered “ill-known attributes” and “ill-known rough sets” as cases of incomplete information about the data. Ill-known attributes are another rendering of the usage at the intensional level of a value range for an attribute of a class so that each object in the class’s extension does have only one value that falls within the defined range of allowed values. Ill-known rough sets are about the ill-observation of attribute values and the lack of discrimination of the set of attributes, and then there is the issue of “potential similarity”. His proposal is about covering-based generalisation of rough sets.

I have more notes of the presentations and the panel session, but I’ll leave it at that (for now at least). If you want to know more about these and the other programme topics, I’d suggest you attend next year’s workshop, but also related conferences may be of interest (e.g., RSKT, GrC, RSFDGrC) or, if you would like to see a closer link with fuzzy and with ontologies, then you may be interested in attending the WI-IAT’09 workshop on managing vagueness and uncertainty in the Semantic Web (VUSW’09) on 15-9-’09 in Milan.

Types of granularity and the TOG to facilitate modelling for the GIS domain

During and in between all the research traveling over the past half a year I also managed to write some papers. One of them is an invited book chapter [1] based on chapter 2 of my PhD thesis, i.e. the taxonomy of types of granularity with some additional material to make it self-standing to read. This book chapter, which will appear in Novel Developments in Granular Computing (edited by JingTao Yao) early next year, is to some readers, however, still rather abstract. To address the feedback asking how to apply these types of granularity and the TOG, I applied them and wrote a paper about using them to improve the representation of granulation hierarchies in the subject domain of geography and ecology. This case study for representing semantics of granularity in Geographic Information Systems [2] will be presented early next year at the Geomatics’09 conference. The abstract of what I like to think is, from a potential user perspective, a very readable paper (pdf) is as follows:

Dealing with granularity in the GIS domain is a well-known issue, and multiple data-centric engineering solutions have been developed to deal with finer- and coarser-grained data and information within one information system. These are, however, difficult to maintain and cumbersome for interoperability. To address these issues, we propose eight types of granularity and a facilitating basic theory of granularity to structure granulation hierarchies in the GIS domain. Several common hierarchies will be re-assessed and refined. It illustrates a methodology of first representing what one desires to consider for a GIS application, i.e., at the semantic layer, so as to enable reaping benefits of flexibility, reusability, transparency, and interoperability at the implementation layer.

A nice extra, for me at least, is that the Geomatics conference will be held in Havana, Cuba, as part of Informatica’09. Though I can understand Spanish and speak it a little (well, by mixing it with Italian), I do appreciate they are making it into a bilingual event with simultaneous translation Spanish/English. Looking at the preliminary programme (details online here soon), the following topics and people are already booked in: Oscar Corcho on semantics and the grid, Joep Crompvoets on spatial data, Michael Gould on data infrastructure, Robert Ward of the International Hydrographic Bureau about marine data and information, several ISO representatives on various topics, and more researchers on open source geoinformatics, precision agriculture, managing remote sensing data and other topics, which will be presented by people from, mainly, the Americas and Europe (Cuba, Brazil, Chile, UK, Belgium, Spain, Switzerland, Italy, and Germany, among others).

For those of you who cannot physically attend—be it for financial reasons or due to the blockade—but would have liked to be there: you also can register as a virtual participant.

[1] Keet, C.M. A top-level categorization of types of granularity. In: Novel Developments in Granular Computing: Applications for Advanced Human Reasoning and Soft Computation. JingTao Yao (Ed.). IGI Global. (in print, not yet online—contact me if you want to have a preprint).

[2] Keet, C.M. Structuring GIS information with types of granularity: a case study. VI International Conference on Geomatics, 10-12 February 2009, Havana, Cuba.

Granularity and no emergence in biology

This time a post that bears some distant relation to my thesis topic: granularity. About 1.5 years ago I got concerned that emergence, emergent properties, and emergent behaviour would complicate developing a formal theory of granularity, so I read up on the topic. While writing along the overview and analyzing both the philosophical aspects and proposed examples of emergence in biology, I came to the realization that it doesn’t complicate granularity, but on the contrary: that granularity actually serves as a useful methodology to investigate (hypothesized) emergence, in particular because of the modeling advantages and prospects for structured in silico simulations.

This is very nice for my granularity, but it takes 20-odd pages to support a useful application area of granularity that is not the focus area of applications (wandering off too far from the narrative), and thus takes up too much space in the thesis. So, I’m phasing it out. The problem is that I don’t know of any outlet where a cocktail of bio, IT, and philosophy would be publishable, because specialists of each discipline wouldn’t be too happy reading too much about the other two fields and can dismiss it because it is not necessarily detailed enough for their own field, even though the idea of combining granularity & (hypothesized) emergence may have some novelty to it. Interdisciplinarity has its drawbacks.

Things being as they are, I’m putting the pdf online after the printed paragraph had been gathering dust for some 1.5 years – for there might just be an interested reader out there. Comments are welcome of course!

Topics covered in the manuscript are:
1 Introduction
2 Renewed claims of emergence in biology
3 Emergence from a philosophical perspective
3.1 Epistemological emergence
3.2 Ontological emergence
3.3 Strong emergence
3.4 Weak emergence
3.4.1 Simulations
3.5 Examples
3.5.1 Example 1: pseudoplasmodium formation by cellular slime moulds
3.5.2 Example 2: horizontal gene transfer with metagenomics
4 Emergence and levels of granularity
4.1 Preliminaries of granularity
4.2 The irreducibility argument
4.3 Non-predictability and non-derivability
4.4 Characterisation of granular level from the viewpoint of emergence
5 Concluding remarks

The abstract of “Granularity as a modelling approach to investigate hypothesized emergence in biology” is as follows.

Abstract. Informal usage of emergence in biological discourse tends towards being of the epistemic type, but not ontological emergence, primarily due to our lack of knowledge about nature and limitations to how to model it. Philosophy adds clarification to better characterise the fuzzy notion of emergence in biology, but paradoxically it is the methodology of conducting scientific experiments that can give decisive answers. A renewed interest in whole-ism in (molecular) biology and simulations of complex systems does not imply emergent properties exist, but illustrates the realisation that things are more difficult and complex than initially anticipated. Usage of (weak- and epistemological) emergence in bioscience is a shorthand for ‘we have a gap in our knowledge about the precise relation(s) between the whole and its parts and possibly missing something about the parts themselves as well’, which amounts to absence of emergence in the philosophical sense. Given that the existence of emergent properties is not undisputed, we need better methodologies to investigate such claims. Granularity serves as one of these approaches to investigate postulated emergent properties. Specification of levels of granularity and their contents can provide a methodological modelling framework to enable structured examination of emergence from both a formal ontological modelling approach and the computational angle, and helps elucidate the required level of granularity to explain away emergence. I discuss some modelling considerations for a granularity framework and its relevance for the testability of emergence in computational implementations such as simulations.

Granulate and Conquer

Or so goes the fancy tagline for a particular problem-solving methodology, which predominantly comprises applied mathematics and IT (soft computing) [1], and addresses to a lesser extent the philosophical and ontological aspects [2][3]. More comprehensively, the field of Granular Computing combines efforts from philosophy, Artificial Intelligence, machine learning, database theory and data mining, (applied) mathematics with fuzzy logic and rough sets, among others. Themes addressed for computational problem solving tend to emphasise quantitative aspects of granularity, whereas the others put a higher emphasis on the qualitative component of granularity.

The “granulate and conquer” then amounts to a nice methodology to manage your data, information and knowledge. Applications can be as diverse as:

  • Using clustering techniques to make sense of mRNA expression patterns in microarray data [4], in this case applied to gene expression data of the malarial parasite Plasmodium falciparum;
  • Access control models in computer security, where, as Lin summarizes in [1], for each object p ∈ V there is a granule B(p) ⊆ U of objects that are in conflict; put differently: eventually, after taking into account the various access rights for the resources (like documents, folders etc.), the resulting granule contains the list of enemies. For the interested reader: details about all this go under the term Chinese Wall Security Policy Model;
  • Individual student-tailored study feedback. A slightly outdated description of such a system is given by McCalla and Greer [5], but anyone familiar with Computer-based Training and its ‘test exam’ facility has made use of this approach: after doing the test, for the wrong answers given, it tells you in which paragraph(s) of the study material you can find the explanation, so that you don’t have to go through all the material again and re-do only those few sections that you didn’t understand sufficiently.
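
To give a feel for the access-control item in the list above, here is a heavily simplified Python sketch of the "granule of enemies" idea. It is an assumption-laden toy and not Lin's formal model: the company names, the `conflicts` mapping, and the functions `conflict_granule` and `may_access` are all invented for illustration.

```python
def conflict_granule(conflicts, p):
    """Return B(p): the set of objects that are in conflict with object p."""
    return conflicts.get(p, set())


def may_access(conflicts, history, requested):
    """Allow access only if the requested object is not an 'enemy' of
    anything the subject has accessed before."""
    return all(requested not in conflict_granule(conflicts, p) for p in history)


# Hypothetical conflict-of-interest data: two banks and two oil companies.
conflicts = {
    "BankA": {"BankB"},
    "BankB": {"BankA"},
    "OilX": {"OilY"},
    "OilY": {"OilX"},
}
history = {"BankA"}  # this consultant has already seen BankA's files

print(may_access(conflicts, history, "BankB"))  # competitor of BankA
print(may_access(conflicts, history, "OilX"))   # unrelated sector
```

The first request is refused and the second is granted; once OilX has been accessed, OilY would become off limits as well.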

Although the applications are very diverse, there are some commonalities in the approaches and (oftentimes one-off) models created for a particular purpose. More specifically, the underlying semantics of how the granulation is done and the relation between the entities within a granular level (/granule/grain) is consistent – but there is not one single type of granule. A first attempt to categorise those types of granularity is made by [3], where a taxonomy is presented with seven leaf categories. The main distinctions made are between scale-dependent granularity and, for the lack of a better term, non-scale-dependent granularity. Further divisions include, among others, granulating according to some mathematical formula (e.g. seconds, minutes, hours, etc.), sorting by means of one type of (primitive) relation (e.g. [structural-]partOf, [spatially-]containedIn), and aggregation of the same collection of instances of one type that subsequently can be partitioned in various ways at lower levels of detail using semantic criteria where the entity at a lower level is a subtype of the type at the coarser-grained level (e.g. a collection of phone points and finer-grained land-line and mobile phone points). The distinctions described in the article can guide a conceptual modeller to better distinguish between the types of granularity when representing domain knowledge and can be of use to the software developer to improve applications that use granularity in one way or another.
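
Two of the distinctions just mentioned can be contrasted in a small Python sketch: granulating by a mathematical formula versus ordering levels by one type of relation. This is only an illustration, not the taxonomy itself; the conversion table, the `PART_OF` mapping, and both function names are assumptions chosen for the example.

```python
# Granulation by formula: time levels related by fixed conversion factors.
SECONDS_PER = {"second": 1, "minute": 60, "hour": 3600}


def granule_of(timestamp_s, level):
    """Index of the granule at the given level that contains the timestamp."""
    return timestamp_s // SECONDS_PER[level]


# Granulation by one type of relation: each level is part of the next coarser one.
PART_OF = {"cell": "organ", "organ": "body"}


def coarser_levels(level):
    """Follow the part-of relation upwards and list the coarser levels."""
    chain = []
    while level in PART_OF:
        level = PART_OF[level]
        chain.append(level)
    return chain


print(granule_of(3725, "minute"), granule_of(3725, "hour"))
print(coarser_levels("cell"))
```

In the first case moving between levels is pure arithmetic, whereas in the second case one has to traverse a relation between the entities, which is one reason the two types behave differently in implementations.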

Coincidentally, a conference on Granular Computing will take place within a few weeks, which will be held in Atlanta, USA, from 10 to 12 May 2006. There is little time left to register here.

References

[1] Lin, T.Y. Toward a Theory of Granular Computing. IEEE International Conference on Granular Computing (GrC06). 10-12 May 2006, Atlanta, USA. Draft online available

[2] Yao, Y.Y. Perspectives of Granular Computing. IEEE Conference on Granular Computing (GrC05), 1:85-90.

[3] Keet, C.M. A taxonomy of types of granularity. IEEE Conference in Granular Computing (GrC06). 10-12 May 2006, Atlanta, USA.

[4] Zhou, Y., Young, J.A., Santrosyan, A., Chen, K., Yan, S.F., Winzeler, E.A. In silico gene function prediction using ontology-based pattern identification. Bioinformatics, 2005 21(7):1237-1245.
Online information available at: http://carrier.gnf.org/publications/OPI

[5] McCalla, G.I., Greer, J.E. Granularity Hierarchies. Computers and Mathematics with Applications: Special Issue on Semantic Networks, 1992, 23:363-376.