An illustration of an “ERDP” to create an EER diagram: the dance school database

How do you develop a conceptual data model, such as an EER diagram, UML Class Diagram, or ORM model? Besides dropping icons here and there on an empty canvas, a few strategies exist for approaching it systematically, or at least in an assisted way, be it for ‘small data’ or for ‘big data’. One that I found useful to experiment with when I started out many years ago with the ‘small data’ cases was the Conceptual Schema Design Procedure (CSDP) for ORM, summarised in Table 1 below. That summary comes from Halpin’s white paper about Object-Role Modeling; the details span a few hundred pages in his book [Halpin01] and were extended further in his later works. Extended Entity-Relationship (EER) modelling is more popular than Object-Role Modeling (ORM), however, and yet there’s no such CSDP for it. The elements don’t have the same names, and the list of possible constraints to take into account is not the same in both families of languages either [KeetFillottrani15]. So, I amended the CSDP to make it work for EER.

Table 1. CSDP as summarised by Halpin in the white paper about Object-Role Modeling.

Step    Description
1       Transform familiar information examples into elementary facts, and apply quality checks
2       Draw the fact types, and apply a population check
3       Check for entity types that should be combined, and note any arithmetic derivations
4       Add uniqueness constraints, and check arity of fact types
5       Add mandatory role constraints, and check for logical derivations
6       Add value, set comparison and subtyping constraints
7       Add other constraints and perform final checks

Unsurprisingly, yes, it is feasible to rework the CSDP for ORM so that it is also of use for designing EER diagrams, in an “ERDP”, or ER Design Procedure, if you will. A basic first version is described in Chapter 4 of my new book, which is currently in press with Springer [Keet23] (and already available for pre-order from multiple online retailers). I padded the CSDP-like procedure a bit on both ends. There’s an optional preceding ‘step 0’ to explore the domain in preparation for a client meeting. Steps 1-7 are summarised in Table 2: listing the sample facts, drawing the core elements, and then adding constraints: cardinality, mandatory/optional participation, value, disjointness, and completeness. Step 7 mostly amounts to adding nothing more, since EER has fewer constraints than ORM. Later steps may include quality improvements and various additions that some, but not all, EER variants have.

Table 2. Revised basic CSDP for EER diagrams.

Step    Description
0       Universe of discourse (subject domain) exploration
1       Transform familiar or provided sample examples into elementary facts, and apply quality checks
2       Draw the entity types, relationships, and attributes
3       Check for entity types that should be combined or generalised
4       Add cardinality constraints, and check arity of fact types
5       Add mandatory/optional constraints
6       Add value constraints and subtyping constraints
7       Add any other constraints of the EER variant used and perform final checks
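
To make the order of the steps a little more tangible, here is a minimal Python sketch of how a diagram could be built up incrementally along the lines of Table 2. It is only an illustration under stated assumptions, not the procedure from the book or a modelling tool: the class names (EntityType, Participation, Relationship, Diagram), the open_issues check, and the sample facts about students, teachers, and courses are all hypothetical placeholders and are not taken from the dance school example. Steps 0, 3, and 6 are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MANY = -1  # sentinel for an unbounded maximum cardinality (an assumption of this sketch)


@dataclass
class EntityType:
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Participation:
    """One end of a relationship; constraints stay None until they are added."""
    entity: str
    min_card: Optional[int] = None  # step 5: 0 = optional, 1 = mandatory
    max_card: Optional[int] = None  # step 4: 1 or MANY


@dataclass
class Relationship:
    name: str
    ends: List[Participation]

    @property
    def arity(self) -> int:
        return len(self.ends)


@dataclass
class Diagram:
    entities: Dict[str, EntityType] = field(default_factory=dict)
    relationships: Dict[str, Relationship] = field(default_factory=dict)

    def open_issues(self) -> List[str]:
        """A step-7-style check: list relationship ends that still lack a constraint."""
        issues = []
        for rel in self.relationships.values():
            for end in rel.ends:
                if end.entity not in self.entities:
                    issues.append(f"{rel.name}: unknown entity type {end.entity}")
                if end.max_card is None:
                    issues.append(f"{rel.name}/{end.entity}: cardinality missing")
                if end.min_card is None:
                    issues.append(f"{rel.name}/{end.entity}: mandatory/optional missing")
        return issues


# Step 1: elementary facts (hypothetical placeholders, not the author's facts).
facts = [("Student", "enrols_in", "Course"), ("Teacher", "teaches", "Course")]

# Step 2: derive entity types and relationships from the facts, and add an attribute.
diagram = Diagram()
for subject, verb, obj in facts:
    for name in (subject, obj):
        diagram.entities.setdefault(name, EntityType(name))
    diagram.relationships[verb] = Relationship(
        verb, [Participation(subject), Participation(obj)]
    )
diagram.entities["Student"].attributes.append("name")

issues_before = diagram.open_issues()  # every end is still unconstrained here

# Steps 4 and 5: add (min, max) per relationship end. The values are invented
# for illustration and read as how often one instance of that entity type
# takes part in the relationship.
constraints = {
    ("enrols_in", "Student"): (0, MANY),
    ("enrols_in", "Course"): (1, MANY),
    ("teaches", "Teacher"): (1, 1),
    ("teaches", "Course"): (0, MANY),
}
for (rel_name, entity), (lo, hi) in constraints.items():
    for end in diagram.relationships[rel_name].ends:
        if end.entity == entity:
            end.min_card, end.max_card = lo, hi

issues_after = diagram.open_issues()
print(len(issues_before), len(issues_after))
```

Keeping the constraints as None until the corresponding step is one possible design choice: it lets a final check distinguish “not yet decided” from “decided to be optional”, which mirrors the idea of adding constraints in separate passes.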

The book’s chapter on conceptual data models also includes an example, sized to fit within the page budget alongside the rest of the chapter’s content. As bonus material, I have now made a longer example available on this page, which is about developing an EER diagram for a database to manage data for a dance school.

Picture of our group dancing the “Ball de pastors del pirineo”.

I did go through a ‘step 0’ to explore the subject domain, drawing on my own knowledge of dance schools, which was facilitated by having been a member of several dance schools over the years. The example then goes through the 7-step procedure. All this gets us from devising facts such as

in a step-wise fashion with intermediate partial models to the final one, in Information Engineering notation, as shown in the following image:

Figure 1. The final EER diagram at the end of “step 6” of the procedure.

The dance school model description also hints at what lies beyond step 7, such as automated reasoning and ontology-driven aspects (not included in this basic version), and the page has a few notes on notations. I used IE notation because I really like the visuals of the crow’s feet for cardinality, but there’s a snag: some textbooks use Chen’s or a ‘Chen-like’ notation. Therefore, I added those variants near the end of the page.

Are the resulting models any better with such a basic procedure than without? I don’t know; it has never been tested. We have around 450 students who will have to learn EER in the first semester of their second year in computer science, so there may be plenty of participants for an experiment to make the conclusions more convincing. If you’re interested in teaming up for the research to find out, feel free to email me. 

References

[Halpin01] Halpin, T. Information Modeling and Relational Databases. San Francisco: Morgan Kaufmann Publishers. 2001.

[KeetFillottrani15] Keet, C.M., Fillottrani, P.R. An ontology-driven unifying metamodel of UML Class Diagrams, EER, and ORM2. Data & Knowledge Engineering, 2015, 98:30-53.

[Keet23] Keet, C.M. The What and How of Modelling Information and Knowledge: From Mind Maps to Ontologies. Springer, in press. ISBN-10: 3031396944; ISBN-13: 978-3031396946.
