Teorey.book Page 77 Saturday, July 16, 2005 12:57 PM
CHAPTER 4 Requirements Analysis and Conceptual Data Modeling
4.5 Entity Clustering for ER Models

• Dominance grouping
• Abstraction grouping
• Constraint grouping
• Relationship grouping

These grouping operations can be applied recursively or used in a variety of combinations to produce higher-level entity clusters, that is, clusters at any level of abstraction. An entity or entity cluster may be an object that is subject to combinations with other objects to form the next higher level. That is, entity clusters have the properties of entities and can have relationships with any other objects at any equal or lower level. The original relationships among entities are preserved after all grouping operations, as illustrated in Figure 4.8.

Figure 4.9 Grouping operations: (a) dominance grouping, (b) abstraction grouping, (c) constraint grouping, (d) relationship grouping

Dominant objects or entities normally become obvious from the ER diagram or the relationship definitions. Each dominant object is grouped with all its related nondominant objects to form a cluster. Weak entities can be attached to an entity to make a cluster. Multilevel data objects using abstractions such as generalization and aggregation can be grouped into an entity cluster. The supertype or aggregate entity name is used as the entity cluster name. Constraint-related objects that extend the ER model to incorporate integrity constraints, such as the exclusive-OR, can be grouped into an entity cluster. Additionally, ternary or higher degree relationships potentially can be grouped into an entity cluster. The cluster represents the relationship as a whole.

4.5.3 Clustering Technique

The grouping operations and their order of precedence determine the individual activities needed for clustering. We can now learn how to build a root entity cluster from the elementary entities and relationships defined in the ER modeling process.
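
As a rough illustration of what building clusters from elementary entities and relationships involves, the following Python sketch collapses a set of objects into a named cluster and re-points any relationship that crosses the cluster boundary at the cluster itself. The class names (Model, Relationship), the group method, and the three sample relationships are assumptions made for this sketch; they are a simplified reading of Figure 4.10, not the book's notation or a prescribed algorithm.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A binary relationship between two named objects (entities or clusters)."""
    name: str
    left: str
    right: str


@dataclass
class Model:
    """One level of an ER diagram: object names plus their relationships."""
    objects: set
    relationships: list
    # cluster name -> names of the objects it was formed from (hypothetical bookkeeping)
    members: dict = field(default_factory=dict)

    def group(self, cluster_name, parts):
        """Return the next-higher level in which `parts` collapse into one cluster."""
        parts = set(parts)
        missing = parts - self.objects
        if missing:
            raise ValueError(f"unknown objects: {sorted(missing)}")

        def lift(obj):
            # Endpoints inside the cluster are re-pointed at the cluster itself.
            return cluster_name if obj in parts else obj

        lifted = []
        for r in self.relationships:
            if r.left in parts and r.right in parts:
                continue  # internal to the cluster; still recorded at the lower level
            lifted.append(Relationship(r.name, lift(r.left), lift(r.right)))

        # A new Model is returned, so the lower level is left untouched.
        return Model(
            objects=(self.objects - parts) | {cluster_name},
            relationships=lifted,
            members={**self.members, cluster_name: parts},
        )


# Simplified, assumed fragment of the diagram in Figure 4.10.
level1 = Model(
    objects={"Division", "Department", "Employee", "Project"},
    relationships=[
        Relationship("contains", "Division", "Department"),
        Relationship("has", "Department", "Employee"),
        Relationship("assigned-to", "Employee", "Project"),
    ],
)

# One round of grouping, then a recursive round that yields a single root object.
level2 = level1.group("Division/Department cluster", ["Division", "Department"])
root = level2.group("Root cluster", level2.objects)
```

Because group returns a new Model rather than modifying the old one, every level stays available for inspection, which mirrors the idea that the original relationships among entities are preserved after grouping. A cluster is stored as just another object name, so it can take part in later groupings exactly as an entity would.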
The clustering technique assumes that a top-down analysis has been performed as part of the database requirement analysis and that the analysis has been documented so that the major functional areas and subareas are identified. Functional areas are often defined by an enterprise’s important organizational units, business activities, or, possibly, by dominant applications for processing information. As an example, recall Figure 4.3 (reconstructed in Figure 4.10), which can be thought of as having three major functional areas: company organization (division, department), project management (project, skill, location, employee), and employee data (employee, manager, secretary, engineer, technician, prof-assoc, and desktop). Note that the functional areas are allowed to overlap. Figure 4.10 uses an ER diagram resulting from the database requirement analysis to show how clustering involves a series of bottom-up steps using the basic grouping operations. The following list explains these steps.

Figure 4.10 ER diagram: clustering technique (functional areas shown: company organization, project management, employee data)

1. Define points of grouping within functional areas. Locate the dominant entities in a functional area through natural relationships, local n-ary relationships, integrity constraints, abstractions, or just the central focus of many simple relationships.
   If such points of grouping do not exist within an area, consider a functional grouping of a whole area.
2. Form entity clusters. Use the basic grouping operations on elementary entities and their relationships to form higher-level objects, or entity clusters. Because entities may belong to several potential clusters, we need to have a set of priorities for forming entity clusters. The following set of rules, listed in priority order, defines the set that is most likely to preserve the clarity of the conceptual model:
   a. Entities to be grouped into an entity cluster should exist within the same functional area; that is, the entire entity cluster should occur within the boundary of a functional area. For example, in Figure 4.10, the relationship between Department and Employee should not be clustered unless Employee is included in the company organization functional area with Department and Division. In another example, the relationship between the supertype Employee and its subtypes could be clustered within the employee data functional area.
   b. If a conflict in choice between two or more potential entity clusters cannot be resolved (e.g., between two constraint groupings at the same level of precedence), leave these entity clusters ungrouped within their functional area. If that functional area remains cluttered with unresolved choices, define functional subareas in which to group unresolved entities, entity clusters, and their relationships.
3. Form higher-level entity clusters. Apply the grouping operations recursively to any combination of elementary entities and entity clusters to form new levels of entity clusters (higher-level objects). Resolve conflicts using the same set of priority rules given in step 2. Continue the grouping operations until all the entity representations fit on a single page without undue complexity. The root entity cluster is then defined.
4. Validate the cluster diagram.
   Check for consistency of the interfaces (relationships) between objects at each level of the diagram. Verify the meaning of each level with the end users.

The result of one round of clustering is shown in Figure 4.11, where each of the clusters is shown at level 2.

Figure 4.11 Clustering results

4.6 Summary

Conceptual data modeling, using either the ER or UML approach, is particularly useful in the early steps of the database life cycle, which involve requirements analysis and logical design. These two steps are often done simultaneously, particularly when requirements are determined from interviews with end users and modeled in terms of data-to-data relationships