Ebook: The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses – Part 2. Contents: Chapter 10, Architecture for the Front Room; Chapter 11, Infrastructure and Metadata; Chapter 12, A Graduate Course on the Internet and Security; Chapter 13, Creating the Architecture Plan and Selecting Products; Chapter 14, A Graduate Course on Aggregates; Chapter 15, Completing the Physical...

Overview

The front room is the public face of the warehouse. It’s what the business users see and work with day to day. In fact, for most folks, the user interface is the data warehouse. They don’t know (or care) about all the time, energy, and resources behind it; they just want answers. Unfortunately, the data they want to access is complex. The dimensional model helps reduce the complexity, but businesses are rife with rules and exceptions that must be included in the warehouse so we can analyze and understand them. This complexity only gets worse when we reach the implementation phase and add more elements to the design (like aggregate tables) to achieve better performance.

The primary goal of the warehouse should be to make information as accessible as possible, to help people get the information they need. To accomplish this, we need to build a layer between the users and the information that will hide some of the complexities and help them find what they are looking for. That is the primary purpose of the data access services layer. Figure 10.1 shows the major stores and services to be found in the front room.

Figure 10.1 Front room technical architecture

This chapter is laid out much like the back room chapter. First, we review the data stores that support the front room. Next, we discuss the types of services that are needed in the front room to deliver information to the end users and manage the environment. We describe the general characteristics of data access tools, followed by a discussion about data mining. Finally, we take a moment to discuss the impact of the Internet on the front room architecture.

This chapter is required reading for the technical architects and end-user application developers. The other project team members may find this material interesting at a high level when evaluating products in the data access marketplace. As usual, the project manager needs to spend some time reviewing this chapter to be able to interact effectively with tool vendors and manage expectations of the business community.

Front Room Data Stores

Once the answer set to a specific data request leaves the presentation server, it usually ends up on the user’s desktop. Alternatively, the result set can be fed into a local data mart or a special-purpose downstream system. This section looks at the architecture issues around front-end tools and other data stores downstream from the warehouse.

Access Tool Data Stores

As data moves into the front room and closer to the user, it becomes more diffused. Users can generate hundreds of ad hoc queries and reports in a day. These are typically
centered on a specific question, investigation of an anomaly, or tracking the impact of a program or event. Most individual queries yield result sets with fewer than 10,000 rows, and a large percentage have fewer than 1,000 rows. These result sets are stored in the data access tool, at least temporarily. Much of the time, the results are actually transferred into a spreadsheet and analyzed further.

Some data access tools work with their own intermediate application server. In some cases, this server provides an additional data store to cache the results of user queries and standard reports. This cache provides much faster response time when it receives a request for a previously retrieved result set.

Standard Reporting Data Stores

As more transaction systems migrate to client/server packages, the tasks performed by the old mainframe reporting systems are being left undone or are being done poorly. As a result, client/server-based standard reporting environments are beginning to pop up in the marketplace. These applications usually take advantage of the data warehouse as a primary data source. They may use multiple data stores, including a separate reporting database that draws from the warehouse and the operational systems. They may also have a report library or cache of some sort that holds a preexecuted set of reports to provide lightning-fast response time.
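
The result cache described above, whether it lives in an intermediate application server or in a report library, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ResultCache` class, the `sales` table, and the choice to key the cache on the exact query text plus its parameters are hypothetical and do not describe any particular product.

```python
import sqlite3


class ResultCache:
    """Hypothetical result-set cache sitting between a tool and the database."""

    def __init__(self, connection):
        self.connection = connection
        self.store = {}   # (query text, parameters) -> list of result rows
        self.hits = 0
        self.misses = 0

    def fetch(self, sql, params=()):
        # Key on the exact query text and parameters (an illustrative choice).
        key = (sql.strip(), tuple(params))
        if key in self.store:
            self.hits += 1
            return self.store[key]
        # Cache miss: run the query against the database and remember the rows.
        self.misses += 1
        rows = self.connection.execute(sql, params).fetchall()
        self.store[key] = rows
        return rows

    def invalidate(self):
        # Assumed policy: clear everything after each warehouse load
        # so that stale result sets are never served.
        self.store.clear()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("East", 100.0), ("West", 250.0), ("East", 50.0)],
    )
    cache = ResultCache(conn)
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    cache.fetch(query)   # first request goes to the database
    cache.fetch(query)   # repeat request is answered from the cache
    print(cache.hits, cache.misses)
```

The main design question in a real deployment is invalidation: a cached result set is only trustworthy until the next load changes the underlying data, which is why the sketch exposes an explicit `invalidate` method.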

Personal Data Marts

The idea of a personal data mart seems like a whole new market if you listen to vendors who have recently released tools positioned specifically for this purpose. Actually, the idea is as old as the personal computer and dBase. People have been downloading data into dBase, Access, FoxPro, and even Excel for years. What is new is that industrial-strength database tools have made it to the desktop. The merchant database vendors all have desktop versions that are essentially full-strength, no-compromise relational databases. There are also new products on the market that take advantage of data compression and indexing techniques to give amazing capacity and performance on a desktop computer.

Tip: Be careful with personal data marts. The temptation to solve a problem by throwing data at it is strong, and it is made more seductive by the ease with which we can use new local database tools. But we are essentially talking about the difference between a prototype and a production system. It’s easy to populate the database the first time, but you need to be able to keep it updated, in sync, and secure. Otherwise, you’ll end up with another stovepipe data mart and a maintenance headache you didn’t plan for.

Personal data marts are going to
spread. You should plan for this component and make it easy to take advantage of standard warehouse tools and processes (like metadata, job scheduling, event notification, etc.). Personal data marts may require a replication framework to ensure they are always in sync with the data warehouse.

The personal data mart is also the home turf of many of the MOLAP products. These products were born in the PC/NT environment and were created to target individual power users or departments with specific reporting needs, like the financial reporting group. They will continue to play an important role in this personal segment of the marketplace.

Disposable Data Marts

The disposable data mart is a set of data created to support a specific short-lived business situation. It is similar to the personal data mart, but it is intended to have a limited life span. For example, a company may be launching a significant promotion or new product or service (e.g., acquisition analysis or product recall) and want to set up a special launch control room.

Theoretically, the business needs of the disposable data mart can be met from the rest of the data warehouse. In
practice, it may make sense to create a separate data mart. There may be temporary external data sources to feed in or internal sources that are not yet in the warehouse. There may be security reasons for creating a separate environment, as when a company is evaluating a merger or acquisition candidate. The disposable data mart also allows the data to be designed specifically for the event, applying business rules and filters to create a simple sandbox for the analysts to play in.

Application Models

Data mining is the primary example of an application model. Data mining is a confusing area mainly because it isn’t one entity. It’s a collection of powerful analysis techniques for making sense out of very large data sets. From a data store point of view, each of these analytical processes usually sits on a separate machine (or at least in a separate process) and works with its own data drawn from the data warehouse. Often, it makes sense to feed the results of a data mining process back into the warehouse to use as an attribute in one of the dimensions. Credit rating and churn scores are good examples of data mining output that would be valuable in the context of the rest of the data in the warehouse. We’ll return to data mining later in this chapter.

Downstream Systems

As
the data warehouse becomes the authoritative data source for analysis and reporting, other systems are drawn to it as the data source of choice. The basic purpose of these systems is still reporting, but they tend to fall closer to the operational edge of the spectrum.

While these systems are typically transaction oriented, they gain significant value by including some of the history in the warehouse. Good examples are budgeting systems that pull some of their input from the warehouse (e.g., monthly average phone charges by office last year) and forecasting systems that draw on as many years of history as possible and whatever causal data might be available.

Another interesting application that has been growing in popularity is the use of warehouse data to support customer interactions. Many sales force automation systems are pulling in as much information as they can about a company’s relationship with its customers. The same is true on the customer support side. When the phone rings in the call center, it can be extremely helpful to have access to the customer’s order history, aligned with payments, credits, and open orders, all on the same screen. These applications draw from data in the data warehouse, but are enabled in separate environments.

Front Room Services for Data Access

There isn’t much
in the way of standalone data access services in most data warehouses today. Most of what exists is hard-wired into the front-end tools, which are the primary data stores and service providers of the front room. Two major forces are dragging the data access services out of the front-end tools and moving them into the applications layer. First, the buying power of the data warehouse market is putting pressure on database vendors to improve their products specifically for data warehousing. Second, the demand for Web-based tools is causing tool vendors to slim down their desktop clients and move some of the shared functionality to an application server. In the best of all possible data warehouses, the data access services would be independent of specific tools, available to all, and add as much value to the data access process as the data management services in the back room.

Data access services cover five major types of activities in the data warehouse: warehouse or metadata browsing; access and security; activity monitoring; query management; and standard reporting. As you gather architectural requirements, keep an eye out for the following kinds of functionality that would reside in the data access services layer.

Warehouse Browsing

Warehouse browsing takes advantage of the metadata catalog to support the users in
their efforts to find and access the information they need. Ideally, a user who needs business information should be able to start with some type of browsing tool and peruse the data warehouse to look for the appropriate subject area.

Tip: The warehouse browser should be dynamically linked to the metadata catalog to display currently available subject areas and the data elements within those subjects. It should be able to pull in the definitions and derivations of the various data elements and show a set of standard reports that include those elements. Once the user finds the item of interest, the browser should provide a link to the appropriate resource: a canned report, a tool, or a report scheduler.

This sounds like a lot of work, but the payback is a self-sufficient user community. We’ve seen home-grown systems that provide much of this functionality. Historically, these browsers were built on the Web or with tools like Visual Basic, Microsoft Access, and even desktop help systems. Moving forward, companies that are rolling their own are mostly using Web-based tools to provide some portion of this service.

Providing warehouse browsing services has not been the main focus of most data
warehouses. In general, the front-end tool has been the beginning and end of the navigation process. A user opens the tool, and whatever they see is what they can get to. Fortunately, front ends have grown more sophisticated and now use metadata to define subsets of the database to simplify the user’s view. They also provide ways to hook into the descriptive metadata to provide column names and comments.

Recently, several tools specifically designed to provide this kind of browsing capability have come on the market. One interesting twist is that a data modeling tool company has released a warehouse metadata browsing tool. This makes perfect sense in that the data modeling tool is one of the most likely places to capture the descriptive metadata about the model.

Access and Security Services

Access and security services facilitate a user’s connection to the database. This can be a major design and management challenge. We’ve dedicated Chapter 12 to a graduate-level discussion of access and security. Our goal in this section is merely to present an overview of how access and security fit into the architecture.

Access and security rely on authorization and authentication services where the user is identified and access rights are determined or access is refused. For our purposes, authentication means some method of verifying that you
are who you say you are. There are several levels of authentication, and how far you go depends on how sensitive the data is. A simple, constant password is the first level, followed by a system-enforced password pattern and periodically required changes. Beyond the password, it is also possible to require some physical evidence of identity, like a magnetic card. There are hardware- and network-based schemes that work from a preassigned IP address, particularly on dial-in connections. Authentication is really one of those infrastructure services that the warehouse should be able to count on.

On the database side, we strongly encourage assignment of a unique ID to each user. Although it means more work maintaining IDs, it helps in tracking warehouse usage and in identifying individuals who need help.

Once we’ve identified someone to our satisfaction, we need to determine what they are authorized to see. Some of this depends on the corporate culture. In some companies, management wants people to see only a limited range of information. For example, regional managers can only see sales and expense information for their regions. We believe the value of a data warehouse is correlated with the richness and breadth of the data sources provided. Therefore, we encourage our clients to make the warehouse as broadly available as possible.
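
To make the regional manager example concrete, here is a minimal sketch of authorization applied as a row filter once the user has been identified by a unique ID. The `USER_REGIONS` mapping, the user IDs, and the `sales` table are all hypothetical names invented for this illustration; a real warehouse would typically hold such rules in the database or in a security service rather than in application code.

```python
import sqlite3

# Hypothetical authorization table: unique user ID -> regions the user may see.
# None means the user is authorized for every region.
USER_REGIONS = {
    "mgr_east": {"East"},
    "analyst_all": None,
}


def authorized_sales(connection, user_id):
    """Return only the sales rows this user is authorized to see."""
    if user_id not in USER_REGIONS:
        # Unknown users are refused outright.
        raise PermissionError(f"unknown user: {user_id}")
    regions = USER_REGIONS[user_id]
    sql = "SELECT region, amount FROM sales"
    params = []
    if regions is not None:
        if not regions:
            return []  # known user with no regions assigned sees nothing
        # Restrict the query to the regions assigned to this user.
        placeholders = ", ".join("?" for _ in regions)
        sql += f" WHERE region IN ({placeholders})"
        params = sorted(regions)
    sql += " ORDER BY region, amount"
    return connection.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("East", 100.0), ("West", 250.0), ("East", 50.0)],
    )
    print(authorized_sales(conn, "mgr_east"))      # East rows only
    print(authorized_sales(conn, "analyst_all"))   # every row
```

Even in this toy form, the filter has to be maintained for every user and applied to every query, which hints at the maintenance and computational overhead that limiting access can carry.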

Authorization is a much more complex problem in the warehouse than authentication, because limiting access can have significant maintenance and computational overhead, especially in a relational environment.

Activity Monitoring Services

Activity monitoring involves capturing information about the use of the data warehouse. There are several excellent reasons to include resources in your project plan to create an activity monitoring capability centered around four areas: performance, user support, marketing, and planning.

• Performance. Gather information about usage, and apply that information to tune the warehouse more effectively. The DBA can use the data to see which tables and columns are most often joined, selected, aggregated, and filtered. In many cases, this can lead to changes in the aggregate tables, the indexes, and fundamental changes in the schema design.

• User support. The data warehouse team should monitor newly trained users to ensure they have successful experiences with the data warehouse in the weeks following training. Also, the team should be in the habit of monitoring query text
occasionally throughout the day. This will help the team understand what users are doing, and it can also help them intervene to assist users in constructing more efficient queries.

• Marketing. Publish simple usage statistics to inform management of how their investment is being used. A nice growth curve is a wonderful marketing tool, and a flat or decreasing curve might be motivating for the warehouse team.

• Planning. Monitor usage growth, average query time, concurrent user counts, database sizes, and load times to quantify the need and timing for capacity increases. This information also could support a mainframe-style charge-back system, if necessary.

Like many of the services we’ve discussed, you can build a rudimentary version of an activity monitor yourself or buy a more full-featured package. Chapter 15, which focuses on physical design, has additional information on activity monitoring. There are packages on the market specifically designed to monitor data warehouse user activity. Many of the query management tools also offer some level of query monitoring as a natural byproduct of managing the query process. Some of the front-end tools offer rudimentary activity monitoring support as well.

Query Management Services

Query management services are the set of capabilities that manage the exchange between the query formulation, the execution of the query on the database, and the return of the result set to the desktop. These services arguably have the broadest impact on user interactions with the database. The following paragraphs describe the major query management services you will likely want to include in your architecture. Each of the items in the list has a corresponding business requirement. For example, many of the query formulation services are driven by a need to create certain kinds of reports that are difficult for simple SQL generators to handle. We’ll explore some of these capabilities further in the tool selection section in Chapter 13. Note that many of these services are metadata driven.

• Content simplification. These techniques attempt to shield the user from the complexities of the data and the query language before any specific queries are formulated. This includes limiting the user’s view to subsets of the tables and columns, predefined join rules (including columns, types, and path preferences), and standard filters.

Content simplification metadata is usually specific to the front-end tool rather than a generally available service. The simplification rules are usually hidden in the front-end tool’s metadata repository.
Tip: Content simplification metadata is usually created by the tool administrator during the tool implementation. Today, there are no standards at all for this type of information.

• Query reformulation. As we saw in Chapter 6, query formulation can be extremely complex if you want to solve real-world business problems. Tool developers have been struggling with this problem for decades, and have come up with a range of solutions, with varying degrees of success. The basic problem is that most interesting business questions require a lot of data manipulation. Even simple-sounding questions like “How much did we grow last year?” or “Which accounts grew by more than 100 percent?” can be a challenge to the tool.

The query reformulation service needs to parse an incoming query and figure out how it can best be resolved. Query retargeting, as described in the next section, is the simplest form of reformulation. Beyond that, a query reformulation service should be able to generate complex SQL, including subqueries and unions. Many of these queries require multipass SQL, where the results of the first query are part of the formulation of the second query. Since data access tools provide most of the original query formulation capabilities, we discuss this further in the data access tools section later in this chapter.
• Query retargeting and multipass SQL. The query retargeting service parses the incoming query, looks up the elements in the metadata to see where they actually exist, and then redirects the query or its components as appropriate. This includes simple redirects, heterogeneous joins, and set functions such as union and minus. This simple-sounding function is actually what makes it possible to host separate fact tables on separate hardware platforms. It allows us to query data from two fact tables, like manufacturing costs and customer sales, on two different servers, and seamlessly integrate the results into a customer contribution report.

• Aggregate awareness. Aggregate awareness is a special case of query retargeting where the service recognizes that a query can be satisfied by an available aggregate table rather than summing up detail records on the fly. For example, if someone asks for sales by month from the daily table, the service would reformulate the query to run against the monthly fact table. The user gets better performance and doesn’t need to know there are additional fact tables out there.

The aggregate navigator is the component that provides this aggregate
awareness. In the same way that indexes are automatically chosen by the database software, the aggregate navigator facility automatically chooses aggregates. The aggregate navigator sits above the DBMS and intercepts the SQL sent by the requesting client, as illustrated in Figure 10.2. The best aggregate navigators are independent of the end user tools and provide the aggregate navigation benefit for all clients sending SQL to the DBMS. An aggregate navigator that is embedded in the end user tool is isolated to that specific tool and creates a problem for the DBA who must support multiple tools in a complex environment.

Figure 10.2 The aggregate navigator.

A good aggregate navigator maintains statistics on all incoming SQL and not only reports on the usage levels of existing aggregates but suggests additional aggregates that should be built by the DBA.

• Date awareness. The date awareness service allows the user to ask
for items like current year-to-date and prior year-to-date sales without having to figure out the specific date ranges. This usually involves maintaining attributes in the Periods dimension table to identify the appropriate dates.

• Query governing. Unfortunately, it’s relatively easy to create a query that can bring the data warehouse to its knees, especially a large database. Almost every warehouse has a list of queries from hell. These are usually poorly formed and often incorrect queries that lead to a nested loop of full table scans on the largest table in the database. Obviously, you’d like to stop these before they happen. After good design and good training, the next line of defense against these runaway queries is a query governing service.

Query governing is still in its nascent stages. With many tools, you can place a simple limit on the number of minutes a query can run or the number of rows it can return. The problem with these limits is that they are imposed after the fact. If you let a query run for an hour before you kill it, an hour of processing time is lost. Besides, the user who submitted it probably suspects it would have finished in the next minute or two if you hadn’t killed it. To govern queries effectively, the service needs to be able to estimate the effort of executing
a query before it is actually run. This can be accomplished in some cases by getting the query plan from the database optimizer and using its estimate. A sophisticated query manager could also keep records of similar queries and use previous performance as an indicator of cost. It can then check to see if the user has permission to run such a long query, ask if the user wants to schedule it for later execution, or just tell the user to reformulate it.

Query Service Locations

There are three major options for where query services can be located in the architecture: on the desktop, on an application server, or in the database. Today, most of these services are delivered as part of the front-end toolset and reside on the desktop. In fact, all of the major front-end tool providers have had to develop many of these services over the years. The problem is that everything they’ve developed is locked inside their tools. The tools have become large and costly, and other tools are unable to take advantage of the query management infrastructure already in place for the first tool. This is a good strategy from the vendor’s point of view because it locks their customers into a major dollar and time investment. However, it’s not so good for the business when multiple tools are needed or demanded to meet multiple business requirements.
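To make the query governing service described earlier in this section more concrete, here is a minimal sketch rather than a production design. It assumes Python's built-in sqlite3 module as a stand-in for the warehouse DBMS, asks the optimizer for its plan with SQLite's EXPLAIN QUERY PLAN before the query runs, and treats a full scan of any table named in a hypothetical BIG_TABLES set as the sign of a runaway query. The table, index, and function names are illustrative assumptions, and the wording of SQLite's plan output is not guaranteed to stay the same across versions.

```python
import sqlite3

# Hypothetical list of fact tables that are too large to scan casually.
BIG_TABLES = {"sales"}


def scans_big_table(conn, sql):
    """Ask the optimizer for its plan and report a full scan of a big table."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    for row in plan:
        # The last column holds the readable plan step, e.g. "SCAN sales" or
        # "SEARCH sales USING INDEX ..." (older versions say "SCAN TABLE sales").
        words = row[-1].replace("TABLE ", "").split()
        if len(words) >= 2 and words[0] == "SCAN" and words[1] in BIG_TABLES:
            return True
    return False


def govern(conn, sql, user_may_run_long_queries):
    """Decide what to do with a query before it is executed."""
    if not scans_big_table(conn, sql):
        return "run"
    if user_may_run_long_queries:
        return "run"
    return "schedule_or_reformulate"


# Tiny demonstration schema; a real warehouse would already exist.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (customer_id INTEGER, sale_date TEXT, amount REAL);
    CREATE INDEX idx_sales_customer ON sales (customer_id);
    """
)

indexed_query = "SELECT SUM(amount) FROM sales WHERE customer_id = 42"
runaway_query = "SELECT SUM(amount) FROM sales WHERE amount > 100"

print(govern(conn, indexed_query, user_may_run_long_queries=False))
print(govern(conn, runaway_query, user_may_run_long_queries=False))
print(govern(conn, runaway_query, user_may_run_long_queries=True))
```

The decision is made before any rows are read, which is the point of governing up front instead of killing a query after an hour. A fuller governor could also record how long similar queries took in the past and use that history as a second estimate of cost.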
Some front-end tool vendors have created their own three-tier architecture and located many of these services on an application server between the desktop front-end and the database. Architecturally, this works well because it allows any client to take advantage of a shared resource. The client can concentrate on presenting the query formulation and report creation environment and need not carry the additional burden of query management. It also allows the query to be directed to multiple databases, potentially on multiple database platforms on multiple systems. The application server can own the task of combining the result sets as appropriate. Unfortunately, few standards for these application servers exist yet, so they are relatively proprietary.

There are also stand-alone middleware products that provide many of the data access services described above. Unfortunately, the major alternatives in this group are also proprietary, limited to a specific hardware or database platform.

Database vendors are moving to include some of these services in the core database engine. This is significantly better than having them trapped in the front-end tool because all front-end tools can then take advantage of the service. On the other hand, it is a little more limiting than the application-server approach because it makes it difficult to support cross-machine or cross-database awareness.
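The application-server approach, where one shared service directs a query to more than one database and combines the result sets, can be sketched in a few lines. The example below is only an illustration under stated assumptions: two in-memory SQLite databases stand in for two separate servers, and the METADATA dictionary, the table names, and the function names are all hypothetical. The service looks up where each measure lives, sends one query to each database, and merges the answers into a customer contribution report.

```python
import sqlite3

# Two in-memory SQLite databases stand in for two separate servers.
sales_db = sqlite3.connect(":memory:")
costs_db = sqlite3.connect(":memory:")

sales_db.executescript(
    """
    CREATE TABLE customer_sales (customer TEXT, revenue REAL);
    INSERT INTO customer_sales VALUES
        ('Acme', 100.0), ('Acme', 50.0), ('Bolt', 80.0);
    """
)
costs_db.executescript(
    """
    CREATE TABLE manufacturing_costs (customer TEXT, cost REAL);
    INSERT INTO manufacturing_costs VALUES
        ('Acme', 90.0), ('Bolt', 30.0), ('Bolt', 20.0);
    """
)

# Hypothetical metadata: which connection, table, and column hold each measure.
METADATA = {
    "revenue": (sales_db, "customer_sales", "revenue"),
    "cost": (costs_db, "manufacturing_costs", "cost"),
}


def fetch_measure(measure):
    """Retarget the request to the database that owns the measure."""
    conn, table, column = METADATA[measure]
    sql = f"SELECT customer, SUM({column}) FROM {table} GROUP BY customer"
    return dict(conn.execute(sql).fetchall())


def customer_contribution():
    """Run one pass per server, then combine the two result sets."""
    revenue = fetch_measure("revenue")
    cost = fetch_measure("cost")
    customers = sorted(set(revenue) | set(cost))
    return {c: revenue.get(c, 0.0) - cost.get(c, 0.0) for c in customers}


for customer, contribution in customer_contribution().items():
    print(customer, contribution)
```

Because the lookup and the merge live in the shared service, any client tool that calls customer_contribution() gets the same answer without knowing that two servers are involved.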
As you gain experience with these services, you’ll see how many of them would be much more valuable if they were based either in a common application layer or in the database platform itself rather than in the desktop tool. We encourage you to explore the marketplace and communicate your requirements for these kinds of services to your tool and database vendors.

Standard Reporting Services

Standard reporting provides the ability to create production-style fixed-format reports that have limited user interaction, a broad audience, and regular execution schedules. The application templates described in Chapter 17 are essentially a casual kind of standard report. At the formal end of the spectrum, large standard reporting systems tend to surface when the ERP system cannot handle the workload of operational transactions and reporting. Be careful not to take this on as a side effort of the data warehouse. Full-scale standard reporting is a big job that involves its own set of requirements and services. In this case, there should be a standard reporting project solely responsible for managing this effort.

Of course, the data warehouse needs to support standard reports regardless of whether there is
a large-scale standard reporting environment. In fact, most of the query activity on many warehouses today comes from what could be considered standard reporting. In some ways, this idea of running production reports in an end user environment seems inappropriate, but it is actually a natural evolution. Often, analyses that are developed in an ad hoc fashion become standard reports. The ability to put these into a managed reporting environment is an obvious requirement. They will need to be run on a regular basis and made available to a broad base of consumers either on a push or pull basis (e.g., e-mail or Web posting). Most of the front-end tool developers include some form of this reporting capability in their products. Requirements for standard reporting tools include:

• Report development environment. This should include most of the ad hoc tool functionality and usability.

• Report execution server. The report execution server offloads running the reports and stages them for delivery, either as finished reports in a file system or in a custom report cache.

• Parameter- or variable-driven capabilities. For example, you can change the Region name in one
parameter and have an entire set of reports run based on that new parameter value.

• Time- and event-based scheduling of report execution. A report can be scheduled to run at a particular time of day or after a value in some database table has been updated.

• Iterative execution. For example, provide a list of regions and create the same report for each region. Each report could then be a separate file e-mailed to each regional manager. This is similar to the concept of a report section or page break, where every time a new value of a given column is encountered, the report starts over on a new page with new subtotals, except it generates separate files.

• Flexible report definitions. These should include compound document layout (graphs and tables on the same page) and full pivot capabilities for tables.

• Flexible report delivery:
– Via multiple delivery methods (e-mail, Web, network directory, desktop directory, and automatic fax)
– In the form of multiple result types (data access tool file, database table, spreadsheet)

• User accessible publish and subscribe. Users should be able to make reports they’ve created available to their departments or to the whole company. Likewise, they should be able to
subscribe to reports others have made and receive copies or notification whenever the report is refreshed or improved.

• Report linking. This is a simple method for providing drill-down. If you have pre-run reports for all the departments in a division, you should be able to click on a department name in the division summary report and have the department detail report show up.

• Report library with browsing capability. This is a kind of metadata reference that describes each report in the library, when it was run, and what its content is. A user interface is provided that allows the user to search the library using different criteria.

• Mass distribution. Simple, cheap access tools for mass distribution (Web-based).

• Report environment administration tools. The administrator should be able to schedule, monitor, and troubleshoot report problems from the administrator’s module. This also includes the ability to monitor usage and weed out unused reports.

Future Access Services

It’s worth taking a few moments to speculate on the direction of access services so we can anticipate where future services might fit into our architecture.

• Authentication and authorization.
Logging on to the network once will be enough to identify you to any system you want to work with. If you need to go into the financial system to check on an order status or go to the data warehouse to see a customer’s entire history, one logon should give you access to both. Beyond that, a common security mechanism will tell the warehouse which security groups you belong to and which permissions you have. In Chapter 12 we describe the state of the market for “directory servers” that will fulfill this single logon function.

• Push toward centralized services. Data access services soon will migrate either to the application server or back to the database. Three forces are driving this change. The first is the leverage the warehouse team gets by implementing one set of access services (and associated metadata) and making it available to a range of front-end tools. The second is the push that tools are getting from the Web. To function on the Web, vendors have to slim down the desktop footprint. One obvious way to do this is to move the access services to an application server. The third is the competition among database vendors to grab a piece of the data warehouse market. Once one vendor implements a service like aggregate awareness, the rest have to follow.
• Vendor consolidation. There are too many front-end tool vendors for the market to support in the long run. The Web push will cause some of them to slip. Once a few clear leaders emerge, the rest will begin falling quickly.

Tip: The implication for architecture is that unless you get lucky and pick a winner, you should expect a tool migration within three years.

• Web-based customer access. Another implication of Web access to the warehouse is that businesses might view the Web as a means of providing customers with direct access to their information, similar to the lookup services provided by express package delivery companies today. For example, a credit card company might provide significant value to its corporate customers by allowing them to analyze their employees’ spending patterns directly, without having to stage the data in-house. Or, any manufacturer or service provider might be able to provide customers with monthly summaries of their purchases, sliced in various interesting ways. The security, maintenance, and infrastructure issues are significant, but the business value might be significant as well.

Desktop Services

Only a few services actually live on the desktop, but they are arguably the most important services in the warehouse. These services are found in
the front-end tools that provide users with access to the data in the warehouse. Much of the quality of the user’s overall experience with the warehouse will be determined by how well these tools meet their needs. To them, the rest of the warehouse is plumbing; they just want it to work (and things get messy if it doesn’t).

This section first looks at the different types of users and kinds of information needs that typically exist in a business. Next, it reviews the categories of tools available to meet those needs. Then, it examines each category for the specific capabilities a tool in that category should provide. Your architecture will draw from this list of capabilities and augment it with needs that are specific to your business. This list of capabilities will then be the primary guide for the front-end tool technology evaluation described in Chapter 13.

Multiple Consumer Types

Folks in the IS organization often forget this, but people vary significantly in terms of the depth and quality of their technological capabilities. We often have been surprised by how difficult it is for many people in the business community to understand what we thought was simple technology. The warehouse needs to support a range of technical skill levels and degrees of analytical sophistication. Figure 10.3 shows where these users fall across a technical skill level spectrum and what kinds of tools are appropriate to support their needs.
Figure 10.3 Technology styles.

Usage area by user type:

General computer use
  Paper User: None
  Push-Button: E-mail, some word processing
  Simple Ad Hoc: Word processing, spreadsheets, presentations
  Power User: Macros, utilities, Web publishing

Data warehouse
  Paper User: Rely on others to navigate
  Push-Button: Standard reports, default parameters, EIS
  Simple Ad Hoc: Create simple queries, modify existing queries, browse/change parameters, navigate
  Power User: Build full queries from scratch, direct database access