Nielsen c72.tex V4 - 07/21/2009 3:55pm Page 1522 Part X Business Intelligence ■ NotTop20ProductList: Builds the list of everything not in the top 20. The ‘‘except’’ set operator ( -) is used to remove the top 2 0 products from the list of all products. That list is filtered to exclude empty members and is in turn ordered by order count descending. ■ Average Top20ProductList Order Count: Calculates the average order count across the set of top 20 products. Similar aggregate functions, including SUM, MIN, MAX,andMEDIAN, share this syntax: AVG(set [, numeric_expression]). In practice, this calculation would likely be implemented as part of the next calculation, but it’s included here to show one calculation depending on another. ■ Difference from Top20 Products: Difference b etween a given product’s order count and the top 20 average. ■ Top 20 Products: Created as part of the product hierarchy to get a row to display showing the top 20 average. Because this row will display for a couple of measures, the numeric_expression is omitted so that it is calculated in the context of the cell being displayed. Dimension considerations There are several things to understand about dimensions and the properties of the cube being queried, as they affect query execution, including the following: ■ MdxMissingMemberMode: This dimension property, when set to true, causes invalid mem- bers specified as part of a query to be ignored without generating an error. For example, if an axis is defined as {[Product].[Product].[Mountain-100 Silver, 38], [Product] .[Product].[Banana]}and Banana is not a valid product name, then no error will be generated. Instead, the result will list the mountain bike and not the fruit. When MdxMissingMemberMode is set to false, an error is generated for invalid member names. MDX scripts (calculations described within the cube definition) always throw an error for missing members, regardless of this property setting. ■ IgnoreUnrelatedDimensions:Whentrue, this measure group property tells MDX to ignore dimensions unrelated to a measure being queried. For e xample, the employee dimen- sion of Adventure Works is not related to the Internet measures because no salesperson is involved with an Internet sale. Thus, the query SELECT {[Measures].[Internet Sales Amount]} ON COLUMNS, {[Employee].[Employee].[Employee].Members} ON ROWS FROM [Adventure Works] will list every employee with the total Internet sales amount, satisfying both the requirement to list all employees and the requirement to ignore the unrelated employee dimension when evaluating Internet sales. The alternative setting would result in null values being returned for every employee. An IgnoreUnrelatedDimensions setting of true is both the default and the more flexible option, but it requires some care by MDX query writers. ■ If a default member has been specified for a hierarchy, then results will be limited to that default unless a nother value for that hierarchy is explicitly listed in the query. For example, if [Date].[Calendar].[Calendar Year].&[2003] is the default member for year, then referencing [Date].[Calendar].[Month].&[2004]&[6]in a query without referencing the calendar year hierarchy will result in no data being returned. To retrieve the June 2004 data, 1522 www.getcoolebook.com Nielsen c72.tex V4 - 07/21/2009 3:55pm Page 1523 Programming MDX Queries 72 either reference the [All] level of calendar year, or, if the cube developer has suppressed the [All] level, reference the 2004 member of the year hierarchy. Default members are generally avoided, but they can be useful in some situations, and a query writer should be aware of any hierarchy that has a default member. ■ Autoexists vs. Non Empty: Placing the set {[Date].[Calendar Year].[Calendar Year] .Members * [Date].[Calendar].[Month].Members}on a query axis will result in listing the year 2001 with the months from 2001, the year 2002 with the months from 2002, etc. Why doesn’t the cross-join result in a true cross-product (e.g., 2001 appears with months from 2002) between the years and months? Analysis Services automatically detects which members of the hierarchies exist with each other, and returns only the valid combinations. This behavior is referred to as autoexists, and it only functions for hierarchies within a single dimension. Non Empty is used to further restrict sets to only those that have corresponding measure values. MDX Scripting Sets and calculations like those described in this chapter can be created directly within the cube. Using the Business Intelligence Development Studio, open the Cube Designer for the cube of interest and switch to the C alculations tab. For more information on designing cubes specifically, see Chapter 71, ‘‘Building Multidi- mensional Cubes with Analysis Services.’’ The cube contains a single declarative script that describes all the calculations and sets, although, by default, the designer presents this script as a series of forms (see Figure 72-3). Even if no calculated members or sets exist in the cube, a single CALCULATE statement should exist, instructing the cube to populate non-leaf cells within the cube. New in 2008 S QL Server 2008 exposes the ‘‘Associated Measure Group’’ and ‘‘Display Folder’’ properties of calculated members directly in the calculation definition (refer to Figure 72-3). These items can still be set by choosing Calculation Properties from the Cube menu, but their new location simplifies design. Likewise, Display folders can also be specified for named sets in this manner. Specifying these groupings organizes the client’s browsing experience. For example, placing calculated measures into measure groups displays calculated and physical measures in the same groupings. Calculated members and named sets Click on an existing calculated member (refer to Figure 72-3) or select New Calculated Member from the toolbar. A form appears with several options: ■ Name: Name of the calculated member, without its parent hierarchy. 1523 www.getcoolebook.com Nielsen c72.tex V4 - 07/21/2009 3:55pm Page 1524 Part X Business Intelligence FIGURE 72-3 Calculations tab of the Business Intelligence Development Studio Cube Designer ■ Parent Hierarchy: Hierarchy to which this member should be added. For measures, this will be simply Measures; for other hierarchies, use the built-in navigation to locate the appropriate dimension.hierarchy combination. ■ Parent Member: Only applies to multi-level hierarchies; specify the parent member of the calculated member. Essentially, this specifies the drill-down path in the multi-level hierarchy to reach the calculated member. ■ Expression: The formula that calculates the appropriate member value; equivalent to the expression used for a WITH MEMBER definition. ■ Format String: Optional format string; generally specified for measures. ■ Visible: Calculations are sometimes made not visible when they form the basis for other calculations but are not themselves valuable to the end-user. ■ Non-Empty Behavior: Select one or more measures to specify how to determine whether a specific cell will be empty. If not specified, the calculated member must be evaluated at every possible cell to determine whether it is empty. If specified, the listed physical measure(s) are used to determine whether a cell will be empty, dramatically speeding the calculation. 1524 www.getcoolebook.com Nielsen c72.tex V4 - 07/21/2009 3:55pm Page 1525 Programming MDX Queries 72 ■ Color and Font Expressions: Display attributes can be changed, as long as the client is using software that supports the appropriate display, based on any MDX expression. For example, values within budget could appear in green, and those outside of budget could be displayedinred. Sets are defined in a similar fashion to calculated members, but only the set name and defining expres- sion need to be specified. New in 2008 S ets created in SQL Server 2008 can be specified as either static or dynamic . Static sets are evaluated only once when the CREATE SET statement is evaluated. Dynamic sets are evaluated each time they are used in a query. Therefore, in addition to the low-overhead static sets available in SQL Server 2005, dynamic sets handle definitions that change depending on context, such as the 10 top-selling products, which may change depending upon the month being queried. Adding Business Intelligence The Cube Designer’s Business Intelligence Wizard can add calculations to a cube from standard tem- plates. Templates include currency conversion, combining values based on a chart of accounts, and time- based calculations such as moving averages and period to date. Each template has individual require- ments and purposes documented in Books Online, but time calculations are described here because they are most widely applicable. Using time intelligence requires a properly configured time dimension, with attribute types assigned, based on a dimension table (not a server dimension). Time intelligence is generally added late in the cube development cycle so that all the manually built calculated members are available for creating time-based calculations. Start the wizard by opening the cube of interest in the Cube Designer and selecting Add Business Intelligence from the toolbar. The wizard presents a series of pages: ■ Choose Enhancement: Select Define time intelligence. ■ Choose Target Hierarchy and Calculations: Calculations defined by the wizard will apply to a single time hierarchy. If the cube has multiple roles (e.g., Order vs. Ship date) or calendar types (e.g., Calendar vs. Fiscal), multiple runs of the wizard will be required to create calcula- tions for different target hierarchies. Generally, a multi-level hierarchy is chosen as the target hierarchy. Choose the target hierarchy at the top of the page, and then choose the calculations to be created for this hierarchy (e.g., Twelve Month Moving Average). ■ Define Scope of Calculations: Choose the measures that will be averaged, summarized, and so on, by the time calculations. ■ Completing the Wizard: Review the changes the wizard will make to the cube. The wizard adds the following: a named calculation to the data source view in the time table, a new hierarchy in the time dimension to c ontain the calculated members, and the MDX script that defines the 1525 www.getcoolebook.com Nielsen c72.tex V4 - 07/21/2009 3:55pm Page 1526 Part X Business Intelligence calculated members. Calculation results are accessed by queries that combine the target hierarchy and the hierarchy containing the calculated members. Results will be one of the following: a value when one can be calculated, null when not enough data is available, or ‘‘NA’’ if the calculation does not apply to the cell (e.g., a 12-month average calculation in a cell corresponding to one year). Summary MDX provides a way to define and query constructs within Analysis Services databases much the way SQL provides those capabilities for relational databases. Unlike SQL, MDX accommodates the multi-dimensional data b y specifying sets along multiple axes that identify the geometry of the resulting cell set. Functions are available to generate, order, and filter sets. The WITH keyword can be used to create building blocks for larger and more complex queries. MDX can also be used as the basis for calculations and sets defined within a cube definition. These MDX scripts are an excellent place to include commonly used calculations and groupings that are then available to any query. These features combine to create an extraordinarily rich and efficient data query and analysis environment. 1526 www.getcoolebook.com Nielsen c73.tex V4 - 07/21/2009 3:59pm Page 1527 Authoring Reports with Reporting Services IN THIS CHAPTER Report Definition Language Creating a report Data sources that connect a report to data Datasets that query data for a report Using XML and Web Services as a data source Tablix: The new workhorse of Reporting Services Simplified group definition in Reporting Services 2008 R eporting Services delivers a powerful toolset for report authoring. The Report Designer in Visual Studio provides robust capabilities for developers, while nontechnical users can build and update reports using the Report Builder. This chapter demonstrates how to build reports using Visual Studio. Building good reports requires an odd and often conflicting set of skills. Reports bridge the gap between nontechnical decision-makers and the database you’ve worked so hard to make understandable, robust, complete, consistent, and stable. Given this, it is worth stating the result that report developers strive to achieve: ■ Speed and availability ■ Accuracy and timeliness ■ The right amount of detail — not too much, not too little ■ Consistent, organized, and easily interpreted formatting and presentation This chapter explores the anatomy of a report, demonstrates the steps required to create a report, and covers several additional features of Reporting Services to satisfy nearly any reporting need. Anatomy of a Report A Reporting Services report consists of data sources, datasets, parameters, and the report layout or design all wrapped up in an XML file that describes the report. This section explains each of these report components. 1527 www.getcoolebook.com Nielsen c73.tex V4 - 07/21/2009 3:59pm Page 1528 Part X Business Intelligence Report Definition Language (RDL) The Report Definition Language (RDL) is an open XML schema used to represent data retrieval and lay- out information for a report. This includes elements to define datasets, parameters, charts, tables, and so on — everything needed to retrieve the r eport data and format it for display. This report definition in Reporting Services is nothing more than an XML file that conforms to the RDL specification. Microsoft provides two tools for creating RDL report definitions so that you don’t have to handwrite the XML: Visual Studio and Report Builder. This chapter focuses on building reports using Visual Studio. The Report Builder tool is part of the Report Manager deployment and provides end-users (nontechnical) with the ability to author and update reports. See Chapter 74, ‘‘Administering Reporting Services,’’ for more information on how to con- figure and deploy reports. One powerful facet of Reporting Services is the capability to extend the RDL schema. Because the RDL schema is an open schema, it is possible to accommodate advanced or custom scenarios by adding ele- ments and attributes to it. This can be accomplished by hand or programmatically using the classes in the System.XML namespace. It is also possible to programmatically build, deploy, and execute reports, which means the possibilities are endless when it comes to creating custom report authoring tools, authoring reports on-the-fly, and integrating reports into applications. For example, a developer could use the XmlTextWriter .NET class to programmatically create an RDL report definition, and then use the Report Server Web Service to deploy and render the report from within an application. The Report Server Web Service also con- tains methods to manage nearly all aspects of the report server. Data Sources A data source contains the connection information for a database or data file and includes the data source type, connection string, and credentials. A data source is required to retrieve data for a report. It can be defined and stored within a single report, or it can be shared by multiple reports within the project. Like a report, shared data sources can be deployed to the r eport server. Both shared data sources and r eport-specific data sources can be modified using the R eport Manager once they are deployed to the report server. Data Source types While it is possible to extend Reporting Services by creating a custom data extension, several data source types are available out of the box: ■ Microsoft SQL Server ■ Microsoft SQL Server Analysis Services ■ OLE DB (Object Linking and Embedding for Databases) ■ XML (Extensible Markup Language) ■ ODBC (Open Database Connectivity) ■ Report Server Model ■ Oracle ■ SAP NetWeaver BI 1528 www.getcoolebook.com Nielsen c73.tex V4 - 07/21/2009 3:59pm Page 1529 Authoring Reports with Reporting Services 73 ■ Hyperion Essbase ■ DB2 ■ Teradata Data source connection strings Adding a new data source as either a shared or a repo rt-specific source will invoke the Data Source Properties dialog. Give the source a meaningful name, choose the data source type, and enter the con- nection string. For most connection types, pressing the Edit button will assist in building the connection string. The XML data source type does not provide an editor. To connect to an XML data source, select the XMLdatatypeandprovideaURLastheconnectionstring. The following example shows a connection string to connect to the Report Server Web Service on the local machine. http://localhost/reportserver/reportservice2006.asmx The following connection string shows how to connect to an XML file named StateList.xml located on the local web server: http://localhost/StateList.xml Using expressions in a connection string Connection strings can contain expressions, allowing the connection string to be determined at runtime. For example, the following connection string will connect to the server and database specified by the parameter: ="data source=" & Parameters!ServerName.Value & ";initial catalog=" & Parameters!Database.Value Adding parameters to the report for the server and database enables the user to specify the data source of the report. Because reports often execute under the credentials of someone other than the end-user, care must be taken to avoid inappropriate data access. Setting data source credentials The data source credentials can be set to use Windows Authentication database authentication (such as SQL Server Authentication) or none. The credentials can be stored in the Report Server database, or users can be prompted for the credentials upon report execution. The best configuration option depends on your network and security environment. For more information about configuring report and data access, see the ‘‘Managing Roles’’ and ‘‘Managing Security’’ sections in Chapter 74, ‘‘Administering Reporting Services.’’ Reporting Services datasets Once a data source is defined, it can be used as the b asis for queries that result in Reporting Services datasets. Datasets provide the data that can appear on a report. Because a single report can use several data sources and datasets, a report can integrate data from multiple databases, XML sources, or ODBC sources, and so on. 1529 www.getcoolebook.com Nielsen c73.tex V4 - 07/21/2009 3:59pm Page 1530 Part X Business Intelligence New in 2008 T he report design environment has been dramatically changed and enhanced in SQL Server 2008 compared to the 2005 version. The Report Data pane provides easy access to Data Sources, Datasets, Fields, and Parameters. The Grouping pane likewise shows all the row and column groups defined in the reports. Enhanced object display and positioning, right-click menus, and pop-up properties dialogs streamline the design experience. Defining a new report using the wizard prompts for a query to define the new report’s initial dataset, which is handy for simple reports. Otherwise data sources and their corresponding datasets can be added in the Report Data pane in the design environment. Defining or changing a dataset will launch the Dataset Properties dialog, which includes several tabs: ■ Query: Can consist of most any string that is appropriate to the data source, including SQL statements, stored procedure calls, or XML queries ■ Parameters: Displays the parameters used in the query and the mapping to the report’s parameters ■ Fields: Lists a ll fields returned by the query and the corresponding namesusedinthedataset. When the names of the fields can’t be determined at design time, these names must be manu- ally entered. Calculated fields may also be added here, which helps centralize expressions that are referred to frequently in the report (e.g., defining profit as [revenue]-[expenses] enables profit to be referred to throughout the report as if it came from the source data). ■ Options: Provides source-specific options such as collation ■ Filters: Allows the definition of filters on the data returned from the source. Filters are run in-memory after the entire dataset has been returned, and are often used in conjunction with report parameters to enable users to display a subset of the data without querying the source a second time. The resulting dataset is available for use in report layout. The c hallenge of properly defining a dataset frequently involves properly setting up the associated parameters. Query parameters and report parameters Parameters can be used to empower the user to specify the report criteria and control report format- ting. Query parameters are included in the dataset definition and are used to select or filter data. For example, the parameter @CategoryID can be used to return the subcategories for a selected category, as demonstrated in the following example: Select * From Subcategory Where CategoryID = @CategoryID Query parameters can also be used to specify the values passed to a stored procedure. The following example calls a stored procedure named uspSubcategories that contains a parameter for the CategoryID: EXEC uspSubcategories @CategoryID 1530 www.getcoolebook.com Nielsen c73.tex V4 - 07/21/2009 3:59pm Page 1531 Authoring Reports with Reporting Services 73 Report parameters can also be used to control report formatting and behavior. For example, adding a Boolean parameter can control the visibility of report items used to display additional details in the report (see ‘‘Working with Expressions’’ l ater in this chapter). Many options are available to control the way parameters are presented in the report header. For example, setting the parameter data type to Boolean will render the parameter as radio buttons, enabling the user to select True or False. Setting the data type to Integer and adding a list of available values will render the parameter as a drop-down list. Selecting a parameter’s Multi-Value option and entering available values will render as a multi-value drop-down list (see Figure 73-1). FIGURE 73-1 Multi-value drop-down list for colors to be included in a report Three options are available for specifying parameter values and parameter defaults: None, Specify values ,andGet values from a query. Selecting None adds a textbox to the report, enabling the user to enter a ny value. Providing values using the Specify values and Get values from a query options adds a drop-down list to the report, enabling the user to select from the a vailable values. The Specify values option enables the report author to manually enter the parameter values as part of the parameter definition, while the Get values from a query option retrieves the parameter values from a dataset. Report content and layout Report content and layout is defined using the Design tab in the Report Designer. Reports contain header, body, and footer sections (header and footer are not displayed by default in 2008). Many report items are available to accommodate virtually any report formatting requirement. New in 2008 T he Table and Matrix items have been combined into a single, more flexible Tablix item in the report definition. Anytime a Table, Matrix, or List item is placed on the report surface, it is rendered in the .RDL code as a Tablix element. This enables the report author to use any feature from the three items regardless of which control is used in the report design. By default, the Table and Matrix cells each contain a textbox, making it easy to display data in each cell, while the List item contains a rectangle by default, making it easy to nest other items within the List. Table 73-1 contains the report items included with Visual Studio 2008. 1531 www.getcoolebook.com . custom data extension, several data source types are available out of the box: ■ Microsoft SQL Server ■ Microsoft SQL Server Analysis Services ■ OLE DB (Object Linking and Embedding for Databases) ■. V4 - 07/21/2009 3:59pm Page 1530 Part X Business Intelligence New in 2008 T he report design environment has been dramatically changed and enhanced in SQL Server 2008 compared to the 2005 version members, but only the set name and defining expres- sion need to be specified. New in 2008 S ets created in SQL Server 2008 can be specified as either static or dynamic . Static sets are evaluated only once