Nielsen c71.tex V4 - 07/21/2009 3:53pm Page 1482
Part X Business Intelligence
Chapter 71: Building Multidimensional Cubes with Analysis Services

Change the KeyColumns property for the currently selected attribute by clicking the current value and then clicking the ellipsis button to launch the Key Columns dialog. The left pane of the Key Columns dialog shows each of the current key members. Use the left and right arrows to build a key in the right pane, as shown in Figure 71-4.

FIGURE 71-4 The Key Columns dialog

Likewise, add or change an attribute’s NameColumn binding by clicking the ellipsis button to invoke the Name Column dialog. Highlight the column that contains the desired value.

Hierarchies and attribute relationships

Once deployed, each attribute not specifically disabled becomes an attribute hierarchy for browsing and querying. The attribute hierarchy generally consists of two levels: the All level, which represents all possible values of the attribute, and a level named after the attribute itself that lists each value individually.

The Hierarchies and Levels pane of the Dimension Designer enables the creation of user hierarchies, which define drill-down paths by organizing attributes into multiple levels. For example, Figure 71-3 shows a user hierarchy that first presents the browser with a list of countries, which can be expanded into a list of states, then cities, and so on. Ultimately, the user will experience the dimension as some combination of attribute and user hierarchies.

One of the most important practices for optimizing cube performance is the careful construction of user hierarchies in conjunction with attribute relationships. This follows from how Analysis Services pre-calculates data summaries, called aggregations, to speed query performance. For example, totals by year or month might be pre-calculated along the time dimension.
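The idea behind pre-calculated aggregations can be sketched in a few lines of Python. This is only an illustration of the concept, not how Analysis Services stores or builds aggregations; the fact rows and function names are hypothetical. Month totals are computed once from the detail rows, and year totals are then derived from the much smaller set of month totals rather than from the detail rows again.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, sales amount).
facts = [
    (2008, 11, 120.0),
    (2008, 12, 80.0),
    (2009, 1, 50.0),
    (2009, 1, 25.0),
    (2009, 2, 75.0),
]


def aggregate_by_month(rows):
    """Summarize detail rows into one total per (year, month)."""
    totals = defaultdict(float)
    for year, month, amount in rows:
        totals[(year, month)] += amount
    return dict(totals)


def aggregate_by_year(month_totals):
    """Build year totals from month totals, never touching detail rows."""
    totals = defaultdict(float)
    for (year, _month), amount in month_totals.items():
        totals[year] += amount
    return dict(totals)


month_totals = aggregate_by_month(facts)
year_totals = aggregate_by_year(month_totals)
print(month_totals)
print(year_totals)
```

A query for yearly totals can be answered from year_totals directly, which is the kind of shortcut an aggregation provides.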
To understand attribute relationships, consider a simple time dimension with attributes for year, quarter, month, and day, with day relating to the fact table (the key attribute). By default, every attribute in a dimension is related directly to the key attribute, resulting in the default relationships shown in Figure 71-5(a) and (b). The value for each non-key level summarizes all the day values. Contrast this with the properly assigned relationships in Figure 71-5(c), in which values for the month level must reference all the day values, but the quarter level need only reference the months, and the year level need only reference the quarters.

FIGURE 71-5 Attribute relationships among the Day, Month, Quarter, and Year attributes: (a) Default Relationships, no Hierarchy; (b) Default Relationships with Hierarchy; (c) Correct Relationships with Hierarchy

When relationships are established in a dimension and a user hierarchy is created that mirrors those relationships, it is considered a natural hierarchy. The combination of creating the natural hierarchy and associated relationships is what enables effective aggregations to be created and used to speed query processing. Relationships describe how aggregations can be built, and only members of a natural hierarchy are considered in the aggregation creation process.

Best Practice

While natural hierarchies are important for cube performance, user hierarchies provide drill-down paths for users who will interactively browse the contents of a cube, so it is important to define paths that make sense to the user of the cube as well. Spend time exploring how various users think about the data being presented and adapt the design to their perspective.

Creating user hierarchies

Drag an attribute to an empty spot of the Hierarchies and Levels pane to start a new hierarchy.
Likewise, new levels can be added by dragging attributes onto an existing hierarchy. Remember to rename each hierarchy with a user-friendly title, as the default names are “Hierarchy,” “Hierarchy 1,” and so on.

The browser view is a good place to get a feel for how the user will experience the hierarchies as created. Right-click the dimension name in the Solution Explorer and choose Process to update the server with the latest dimension definition. Then switch to the Browser tab of the Dimension Designer, choose the hierarchy to view in the toolbar, and explore the hierarchy in the pane. If the latest changes do not appear in the Browser tab, press the refresh button in the toolbar. Notice the names, values, and ordering associated with each hierarchy, and adjust as needed. Note the differing icons that distinguish user hierarchies from attribute hierarchies while browsing.

Establishing attribute relationships

Switch to the Attribute Relationships tab of the Dimension Designer, which contains three panes. The diagram pane provides a graphical representation of the relationships like those shown in Figure 71-5. This mirrors the information presented in the Attribute Relationships pane, which shows a pair-wise list of relationships. Finally, the Attributes pane is a simple list of attributes, identical to the list shown on the Dimension Structure tab.

The relationships diagram will look like that in Figure 71-5(a) after the attributes have been defined but before any user hierarchy has been defined. This diagram shows how month, quarter, and year are all directly related to the day. Once the user hierarchy has been defined as described earlier, the relationships look like Figure 71-5(b), breaking each level of the hierarchy out into a separate box. One trick for interpreting the diagram is to read each arrow as “determines.” For example, knowing the day determines the corresponding month, knowing the month determines the quarter, and so on.
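The “determines” reading can be checked against sample data before a relationship is defined. The following Python sketch is an illustration only; the rows, column names, and the determines function are hypothetical and are not part of Analysis Services. It reports whether every value of a source attribute maps to exactly one value of a related attribute.

```python
def determines(rows, source, related):
    """Return True if each source value maps to exactly one related value."""
    seen = {}
    for row in rows:
        key, value = row[source], row[related]
        # setdefault stores the first value seen for this key and returns it;
        # any later, different value means the source does not determine it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows from a time dimension table.
days = [
    {"day": "2009-01-15", "month": "2009-01", "month_name": "January",
     "quarter": "2009-Q1", "year": 2009},
    {"day": "2009-02-03", "month": "2009-02", "month_name": "February",
     "quarter": "2009-Q1", "year": 2009},
    {"day": "2010-01-20", "month": "2010-01", "month_name": "January",
     "quarter": "2010-Q1", "year": 2010},
]

print(determines(days, "day", "month"))        # day determines month
print(determines(days, "month", "quarter"))    # month determines quarter
print(determines(days, "month_name", "year"))  # "January" spans two years
```

The last check fails because a month name alone repeats every year, which suggests why a month attribute in such a table would typically need a key that includes the year.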
Right-click the design surface to add a new relationship, or right-click an existing relationship to edit it; both operations launch the Edit Attribute Relationship dialog shown in Figure 71-6. Set the source and related attributes for each relationship until all the attributes are correctly related, as shown in Figure 71-5(c).

Two important properties are associated with each relationship: Type and Cardinality. The Relationship Type appears in the dialog as well as in the Properties pane, and can take on two values:

■ Rigid: Denotes that the values of these two attributes will have static relationships over time. If those relationships change, then a processing error will result. This is more efficient than the flexible alternative, however, because Analysis Services can retain aggregations when the dimension is processed. Examples include quarter’s relationship to month, and state’s relationship to city.
■ Flexible: Used for attribute values and member property values that change in relationship over time. Aggregations are updated when the dimension is processed to allow for changes. For example, the relationship between employee and department would be flexible to reflect the movement of employees between departments.

FIGURE 71-6 Relationship Editor

Note that flexible relationships appear as hollow arrows, and rigid relationships as solid arrows, in the designer.

The other important relationship property is Cardinality, which appears only in the Properties pane. Choose “Many” when there is a one-to-many relationship, such as the day-to-month relationship. Choose “One” when there is a one-to-one relationship, such as that between a customer ID and social security number.
When all user hierarchies have been defined, one-to-many relationships will tend to be represented as separate boxes, whereas one-to-one relationships will tend to appear in the same box. Boxes with more than one attribute represented can be toggled to show or hide attribute details; choose Expand All from the toolbar to see all attributes across the diagram.

You can set the Attribute Relationships diagram to auto-arrange boxes, or you can position boxes manually. To position boxes manually, turn off auto-arrange using the toolbar, then select the box and move it by its top border (dragging other portions of the shape will not move it).

Visibility and organization

Most cubes will have a large number of dimension attributes, which can overwhelm the user. Using familiar names will help, but the simplest way to combat attribute overload is to not expose attributes that won’t be useful to the user. Specific strategies include the following:

■ Delete attributes that are not useful to users. This includes items not well understood and any alternative language information that can be specified in the translation view.
■ Some attributes can be presented to users only within a user hierarchy. For example, when interpreting a list of cities without knowing their corresponding country and state information, it may be challenging to tell the difference between Paris, Texas, and Paris, France. For these cases, build the appropriate user hierarchy and set the AttributeHierarchyVisible property to False for the corresponding attributes. For example, setting the City attribute’s AttributeHierarchyVisible to False will hide the city attribute hierarchy itself while allowing the city to appear in any user hierarchies.
■ Attributes that will not be queried but are still needed for member properties, such as columns used only for sorting or calculations, can be fully disabled.
Set AttributeHierarchyEnabled to False and note how the attribute icon is now grayed out. Also set AttributeHierarchyOptimizedState to NotOptimized and AttributeHierarchyOrdered to False so that Analysis Services doesn’t spend unnecessary time processing. Most client tools now support displaying properties in query results, although filtering on properties can be slow.
■ For attributes that need to be modeled but are very infrequently used, consider setting their AttributeHierarchyVisible property to False. These attributes will not be available to users browsing cube data, but can still be referenced via MDX queries for custom applications.

Once the list of visible attribute and user hierarchies has been determined, consider organizing dimensions with more than a handful of visible hierarchies into folders. Attributes will organize under the folder name entered into the AttributeHierarchyDisplayFolder property, whereas user hierarchies have an equivalent property named DisplayFolder. In general, these properties should be left blank for the most frequently used hierarchies in a dimension so that those items will display at the root level.

Best Practice

Well-organized dimensions using well-understood names are essential to gaining acceptance for interactive applications, because most users will be overwhelmed by the number of available attributes. Excluding unused attributes not only helps simplify the user’s view of the data, it can greatly speed performance, especially for cubes with substantial calculations: the more attributes, the larger the number of cells each calculation must consider.

Basic setup checklist

After creating a basic dimension via either the Dimension Wizard or the Cube Wizard, review the following checklist, which outlines a first-order refinement.
This level of attention is adequate for the majority of circumstances:

■ Ensure that attribute names are clear and unambiguous in the context of all dimensions in the model. If changes are required, consider modifying names in the data source view and regenerating the dimension to keep all names consistent within the model.
■ Review each attribute’s source (KeyColumns and NameColumn properties) and ordering. Make frequent use of the browser view to check the results.
■ Create natural hierarchies and attribute relationships for every dimension to optimize aggregations and query speed.
■ Review stakeholder needs for any additional user hierarchies and add them.
■ Remove unneeded attributes and adjust visibility as outlined above.
■ Organize dimensions with many hierarchies into folders.

Best Practice Warnings

SQL Server 2008 implements best practice warnings throughout the Analysis Services design environment, but dimension design is the first place they are normally encountered. The warnings appear as blue underlines on the object in question, such as the dimension name as viewed in the Dimension Designer. Don’t confuse these advisories with actual errors, which appear as red underlines and which will prevent a design from operating. Best practice warnings flag designs that are valid but may not be optimal, depending on the application being built. For example, when a new dimension is created, a warning is generated that relationships have not yet been defined.

A full list of best practice warnings for the project is also generated in the Error List window whenever the cube is deployed. Read the advice associated with each of these, and if it does not apply to a given item, dismiss that particular warning by right-clicking in the Error List window and choosing Dismiss. Comments can be entered to document why a warning was dismissed.
Warnings that don’t apply to a given project can be disabled globally by right-clicking the project within the Solution Explorer and choosing Edit Database. Select the Warnings tab to see a list of all warning rules. Disabling a rule prevents a warning from being checked anywhere in the project. This same page provides both a list of individual warnings that have been dismissed and any comments provided.

Beyond regular dimensions

Dimension concepts described so far in this chapter have focused on the basic functionality common to most types of dimensions. It is somewhat challenging, however, to understand what exactly is meant by the “type” of a dimension. Some sources refer to dimensions as being of only two types: data mining and standard, with standard encompassing everything else. Each dimension also has a Type property that takes values such as Time, Geography, Customer, Accounts, and Regular, with Regular corresponding to everything not otherwise on the list. Furthermore, other characteristics of a dimension, such as parent-child organization, write-enabled dimensions, or linking a dimension from another database, can be thought of as different dimension types.

For clarity, this chapter limits the discussion to standard dimensions and uses “type” only in the context of the dimension property, but it is important to understand how “type” is overloaded when reading other documents.

Time dimension

Nearly every cube needs a time dimension, and a great many production cubes exist with poorly implemented time dimensions. Fortunately, the Dimension Wizard will automatically create a time dimension and a corresponding dimension table, and populate the table with data. Right-click the Dimensions folder in the Solution Explorer pane and choose New Dimension to start the wizard.

■ Select Creation Method: Select “Generate a time table in the data source.”
■ Define Time Periods: Choose the date range and periods that should appear in the dimension.
■ Select Calendars: In addition to the standard calendar, choose and configure any other calendars that should appear in the dimension.
■ Completing the Wizard: Modify the name if desired; leave the “Generate schema now” check box unchecked.

Review the structure of the dimension created by the wizard. Note that the dimension’s Type property is set to Time, and that each attribute has an appropriate type set as well: days, months, quarters, and so on. Perform the basic checklist on the dimension design and adjust as necessary. KeyColumns and NameColumn properties do not require attention, but names assigned to attributes and hierarchies can be adjusted to work for the target audience. Attribute relationships will require refinements. Once the dimension has been adjusted, click the link in the Data Source View pane to create the time dimension table using appropriate naming and location choices.

Assigning an attribute’s proper Type property provides documentation, and may enable features in applications that use a cube. Attribute types are also used for some features within Analysis Services, including Business Intelligence calculations.

Time dimensions can be developed from existing dimension tables as well, using the Dimension Wizard. The challenge with this approach is specifying the attribute type for each of the columns in the time dimension. Using the wizard to generate a similar dimension table can also act as a guide when integrating a custom time table.

A server time dimension is an alternative to a traditional time dimension that relies on an underlying relational table. The server time dimension is created internally to Analysis Services, and while not as flexible as the traditional approach, it can be a great shortcut for building a simple cube or quick prototype.
Create a server time dimension by starting the Dimension Wizard as described earlier, but choose “Generate a time table on the server” as the creation method.

Because server time dimensions do not have an underlying dimension table, they will not appear in the data source view, so the relationship to the fact table(s) cannot be described there. Instead, use the Cube Designer’s dimension usage view to establish relationships to selected fact tables (also known as measure groups).

Other dimension types

In addition to the time dimension, Analysis Services recognizes more than a dozen other dimension types, including Customers, Accounts, and Products. Included templates can define a table, similar to the process described for generating time dimensions. Start the Dimension Wizard, choose “Generate a non-time table in the data source,” and then select a template.

Existing tables can be cast as a special type as well by assigning the Type property for the dimension (such as Account) and the Type property for the dimension’s attributes (such as AccountNumber).

Parent-child dimensions

Most dimensions are organized into hierarchies that have a fixed number of levels, but certain business problems do not lend themselves to a fixed number of levels. For example, a minor organizational change may add a new level to the organizational chart. Relational databases solve this problem with self-referential tables. Analysis Services solves this problem using parent-child dimensions.

A self-referential table involves two key columns, for example an employee ID and a manager ID. To build the organizational chart, start with the president and look for employees that she manages; then look for the employees they manage, and so on. Often this relationship is expressed as a foreign key between the employee ID (the primary key) and the manager ID.
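The walk through a self-referential table described above can be sketched in Python. The sample rows, column order, and function name are hypothetical, and the sketch shows only the general technique of following manager IDs downward from the top; it says nothing about how Analysis Services processes a parent-child dimension.

```python
from collections import defaultdict

# Hypothetical self-referential rows: (employee_id, manager_id, name).
# The president has no manager, so manager_id is None.
employees = [
    (1, None, "Ana"),
    (2, 1, "Ben"),
    (3, 1, "Cy"),
    (4, 2, "Dee"),
]


def build_org_chart(rows):
    """Return indented names, one per line, starting from the top level."""
    names = {}
    reports = defaultdict(list)  # manager_id -> list of employee_ids
    for employee_id, manager_id, name in rows:
        names[employee_id] = name
        reports[manager_id].append(employee_id)

    lines = []

    def walk(employee_id, depth):
        lines.append("  " * depth + names[employee_id])
        for report_id in reports.get(employee_id, []):
            walk(report_id, depth + 1)

    # Start with employees that have no manager, then descend.
    for root_id in reports.get(None, []):
        walk(root_id, 0)
    return lines


print("\n".join(build_org_chart(employees)))
```

Because the depth comes from the data rather than from a fixed set of columns, adding a new layer of management changes the output without any change to the table structure.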
When such a relationship exists on the source table, the Dimension Wizard will suggest the appropriate parent-child relationship. In the employee table example, the employee ID attribute will be configured with the Usage property set to Key, while the manager ID attribute will be configured with a Usage of Parent. Other important properties for configuring a parent-child dimension include the following:

■ RootMemberIf: As set on the parent attribute, this property tells Analysis Services how to identify the top level of the hierarchy. Values include ParentIsBlank (null or zero), ParentIsSelf (parent and key values are the same), and ParentIsMissing (parent row not found). The default value is all three, ParentIsBlankSelfOrMissing.
■ OrderBy: The OrderBy of the Parent attribute will organize the hierarchy’s display.
■ NamingTemplate: By default, each level in the hierarchy is named simply Level 01, Level 02, etc. Change this naming by clicking the ellipsis button on the parent attribute’s NamingTemplate property and specifying a naming pattern in the Level Naming Template dialog. Levels can be given specific names, or a numbered scheme can be specified using an asterisk to denote the level number’s location.
■ MembersWithData: As set on the parent attribute, this property controls how non-leaf members with data are displayed. Under the default setting, NonLeafDataVisible, Analysis Services will repeat parent members at the leaf level to display their corresponding data. For example, if you browse a cube using a parent-child employee dimension to display sales volume by salesperson, then the sales manager’s name will show first at the manager level and then again at the employee level so that it can be associated with the sales the manager made. The alternative setting, NonLeafDataHidden, will not repeat the parent name or show data associated with it.
This can be disconcerting in some displays because the totals do not change, so the sum of the detail rows will not match the total: in the sales manager example, the two will differ by the sales manager’s contribution.
■ MembersWithDataCaption: When MembersWithData is set to NonLeafDataVisible, this parent attribute property instructs Analysis Services how to name the generated leaf members. Left at the default, blank, generated leaf members will have the same names as their corresponding parents. To change the default name generation, enter any string, using an asterisk to represent the parent name. For example, “* (mgr)” will cause the string “(mgr)” to be appended to each sales manager’s name.
■ UnaryOperatorColumn: This is a custom rollup function often used with account dimensions, enabling the values associated with different types of accounts to be added or subtracted from the parent totals as needed. Set on the parent attribute, this property identifies a column in the source data table that contains operators to direct how totals are constructed. The column is expected to contain “+” for items that should be added to the total, “-” for items that should be subtracted, and “~” for items that should be ignored. The column can also contain “*” to multiply a value and the current partial total, or “/” to divide a value by the partial total, but these operators produce different results depending on which values are accumulated first. To control the order of operation, a second column can be added as an attribute in the parent-child dimension, given the type of sequence. For example, “+” and “-” operators could be used to calculate a net from a series of debit and credit accounts. Blank operators are treated as “+”.

Once the parent-child relationship is configured, the parent attribute presents a multi-level view of the dimension’s data.
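The additive part of the unary operator rollup described above can be sketched in Python. This is a simplified illustration under stated assumptions: the account names and the roll_up function are hypothetical, only “+”, “-”, “~”, and blank operators are handled, and the order-dependent “*” and “/” operators are deliberately left out. It is not the Analysis Services implementation.

```python
def roll_up(children):
    """Combine child values into a parent total using unary operators.

    children is a list of (name, operator, value) tuples. A blank
    operator is treated as "+", matching the rule stated in the text.
    """
    total = 0.0
    for name, operator, value in children:
        if operator in ("+", ""):
            total += value
        elif operator == "-":
            total -= value
        elif operator == "~":
            continue  # ignored: contributes nothing to the parent
        else:
            # "*" and "/" depend on accumulation order and are out of
            # scope for this sketch.
            raise ValueError(f"unsupported operator {operator!r} on {name}")
    return total


# Hypothetical child accounts under a "Net" parent account.
accounts = [
    ("Sales", "+", 1000.0),
    ("Returns", "-", 150.0),
    ("Cost of goods", "-", 400.0),
    ("Headcount", "~", 12.0),
    ("Other income", "", 50.0),
]

net = roll_up(accounts)
print(net)
```

Here the credits and debits net out to 500.0, and the headcount row is carried in the dimension without distorting the financial total.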
In addition, all the other attributes of the dimension are available and behave normally. The basic setup checklist applies to a parent-child dimension, although the name of the parent attribute will likely need to be adjusted within the dimension instead of in the data source view, given the unique usage.

Dimension refinements

Once a dimension has been built, a large number of properties are available to refine its behavior and that of its attributes. This section details some of the more common and less obvious refinements possible.

Hierarchy (All) level and default member

The (All) level is added to the top of each hierarchy by default, and represents every member in that hierarchy. At query time, the (All) level allows everything in a hierarchy to be included, without listing each member out separately. In fact, any hierarchy not explicitly included in a query is implicitly included using its (All) level. For example, a query that explicitly returns products sold by state implicitly returns products sold by state for all years, all months, all customers, etc.

By default, the name of the (All) level will be All, which is quite practical and sufficient for most applications, but it is possible to give the (All) level a different name by setting the dimension property AttributeAllMemberName or the user hierarchy property AllMemberName. For example, the top level of the employee dimension could be changed to “Everyone.”

Regardless of name, the (All) member is also the default member, implicitly included in any query for which that dimension is not explicitly specified. The default member can be changed by setting the attribute’s DefaultMember property. This property should be set with care.
For example, setting the DefaultMember for the year attribute to 2009 will cause every query that does not explicitly specify the year to return data for only 2009. Default members can also be set to conflict: setting the DefaultMember for the year to 2009 and the month to August 2008 will cause any query that does not explicitly specify year and month to return no data.

Default members are often set when data included in a cube is not commonly queried. Consider a cube populated with sales transactions that are mostly successful but sometimes fail due to customer credit or other problems. Nearly everyone who queries the cube will be interested in the volume and amount of successful transactions. Only someone doing failure analysis will want to view transactions other than successful ones. Thus, setting the status dimension’s default member to success would simplify queries for the majority of users.

Another option is to eliminate the (All) level entirely by setting an attribute’s IsAggregatable property to false. When the (All) level is eliminated, either a DefaultMember must be specified or one will be chosen at random at query time. In addition, the attribute can participate in user hierarchies only at the top level, because appearing in a lower level would require the attribute to be aggregated.

Grouping dimension members

The creation of member groups, or discretization, is the process of grouping the values of a many-valued attribute into discrete “buckets” of data. This is a very useful approach for representing a large number of continuous values, such as annual income or commute distance. Enable the feature on an attribute by setting the DiscretizationBucketCount property to the number of groups to be created and by choosing a DiscretizationMethod from the list.
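To make the idea of buckets concrete, the following Python sketch groups sorted values into a requested number of buckets with roughly equal member counts. It is an assumption-laden illustration: the function name and income values are hypothetical, and the equal-count rule is a simplification chosen for clarity, not the algorithm Analysis Services applies for any of its discretization methods.

```python
def equal_count_buckets(values, bucket_count):
    """Split values into buckets holding roughly equal numbers of members.

    Returns a list of (lowest value, highest value, member count) tuples,
    one per non-empty bucket, in ascending order.
    """
    ordered = sorted(values)
    n = len(ordered)
    buckets = []
    for i in range(bucket_count):
        # Integer arithmetic spreads any remainder across later buckets.
        start = i * n // bucket_count
        end = (i + 1) * n // bucket_count
        chunk = ordered[start:end]
        if chunk:
            buckets.append((chunk[0], chunk[-1], len(chunk)))
    return buckets


# Hypothetical annual incomes, including one extreme value.
incomes = [18_000, 22_000, 25_000, 31_000, 40_000,
           41_000, 55_000, 72_000, 90_000, 250_000]

for low, high, count in equal_count_buckets(incomes, 3):
    print(f"{low} - {high}: {count} members")
```

Note how the extreme value stretches the range of the last bucket without changing the member counts much; grouping by fixed-width ranges instead would leave most buckets nearly empty for data like this.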
A DiscretizationMethod setting of Automatic will result in reasonable groupings for most applications. Automatic allows Analysis Services to choose an algorithm to match the data being grouped. Should the Automatic setting not yield acceptable groupings, try other methods. Once the groupings have been created they are not necessarily static; changes to the underlying data may cause new groupings to be calculated during cube processing.

An attribute that is being grouped must not have any member properties; that is, other attributes cannot rely on a discretized attribute as the source of their aggregations. If the attribute to be discretized must participate in the natural hierarchy (for example, if it is the key or greatly affects performance), consider adding a second dimension attribute based on the same column to provide the grouped view. Take care to configure the attribute’s source columns and ordering because the OrderBy property will determine both how the data is examined in creating member groups and the order in which those groups are displayed.

Cubes

A cube brings the elements of the design process together and exposes them to the user, combining data sources, data source views, dimensions, measures, and calculations in a single container. A cube can contain data (measures) from many fact tables organized into measure groups.

The data to be presented in Analysis Services is generally modeled with as few cubes and databases as is reasonable, with advantages to both the designer and the end user. Users that need only a narrow slice of what is presented in the resulting cube can be accommodated by defining a perspective, rather like an Analysis