Identification and keys in UML Class Diagrams

ER, its extended version EER, ORM, and it extended version ORM2 have several options for identification with keys/reference schemes. UML Class Diagrams, on the other hand, have internal, system-generated, identifiers, with a little-known and underspecified option for user-defined identifiers inspired by ER’s keys, whose description is buried in the standard on pp290-293 [1]. Although UML’s ease of system-generated identifiers relieves the burden of detailed conceptual analysis by the modeller, it is exactly making implicit subject domain semantics explicit that is crucial in the analysis stage; or: less analysis during the modelling stage stores up more problems down the road in terms of software bugs and interoperability. A uniform, or at least a structured and unambiguous approach, can reduce or even avoid inconsistencies in a conceptual data model, especially concerning taxonomies, achieve interoperability through less resource-consuming information integration, and if the identification mechanisms are the same across conceptual data modeling language, then unification among them comes a step closer. The present lack of harmonisation in handling identification hampers all this.

But which identification mechanism(s) could, or should, be specified more precisely for UML, and how does it rhyme with progress about identity in Ontology? How to incorporate it in UML to foster consistent usage? For instance, should the procedure to find and represent identity, or at least good keys, be a step in the modelling methodology, also be enforced in the CASE tool, or be part of the metamodel and/or in the conceptual modelling language itself?

The distinct implicit assumptions and explicit formalisations of extant identification schemes are elucidated in [2] for ER, EER, ORM, and ORM2. As it appears, not even ER/EER and ORM/ORM2 agree fully on keys and reference schemes, neither from a formal perspective (though the difference is minimal), nor from a methodological or CASE tool perspective. Both, however, do aim at strong or weak identification of entities with so-called natural keys (/semantic identifiers) in particular.

At the other end of the spectrum, Ontology does have something on offer, but the philosophers are only interested in identity of entities. Moreover, it is not just that they don’t agree on the details, but there is a tendency to admit it is nigh on inapplicable (see [3] for a good introduction). Watered-down versions of the notion of identity in philosophy that have been proposed in AI, are to used either only the necessary or only the sufficient conditions, which are at least somewhat applicable, be it as a basis for OntoClean [4], or in Description Logic languages (including OWL 2) with their primitive and defined classes. The details of the definitions, explanations, and pros and cons are presented in a ‘digested’ format in the first part of [2].

This analysis was subsequently used to increase the ontological foundations of UML in the second part of [2] to introduce two language enhancements for UML, being formally defined simple and compound identifiers and the notion of defined class, which also have a corresponding extension of UML’s metamodel. The proposed extensions focus on practical usability in conceptual data modelling, informed by ontology, and are approximations to the qualitative, relative, and synchronic identity, and to the notion of equivalent class (defined concept) in OWL (DL).

For those of you who do not care about the ‘unnecessary philosophizing’ (as one reviewer had put it) and justifications, there is a short (4-page) version [5] with the formal definitions and the UML extensions, which has been accepted at the SAICSIT’11 conference in Cape Town, South Africa. The longer version that provides explanation as to why the proposed extensions are the way they are is available as a SoCS technical report [2].

References

[1] Object Management Group. Ontology definition metamodel v1.0. Technical Report formal/2009-05-01, Object Management Group, 2009.

[2] Keet, C.M. Enhancing identification mechanisms in UML class diagrams with meaningful keys (extended version). Technical Report SoCS11-1, School of Computer Science, University of KwaZulu-Natal, South Africa. August 8, 2011.

[3] H. Noonan. Identity. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Fall 2008 edition, 2008.

[4] N. Guarino and C. Welty. Identity, unity, and individuality: towards a formal toolkit for ontological analysis. In W. Horn, editor, Proc. of ECAI’00. IOS Press, Amsterdam, 2000.

[5] Keet, C.M. Enhancing Identification Mechanisms in UML Class Diagrams with Meaningful Keys. SAICSIT Annual Research Conference 2011 (SAICSIT’11), Cape Town, October 3-5, 2011. ACM Conference Proceedings (in print).

Advertisement