Data Mining Concepts and Techniques part 2 ppsx
... means: Bin 1: 9, 9, 9; Bin 2: 22, 22, 22; Bin 3: 29, 29, 29. Smoothing by bin boundaries: Bin 1: 4, 4, 15; Bin 2: 21, 21, 24; Bin 3: 25, 25, 34. Figure 2.11 Binning methods for data smoothing. 1. Binning: ... for smeared data. 2.3 Data Cleaning 63 Sorted data for price (in dollars): 4, 8, 15, 21, 21, 24, 25, 28, 34. Partition into (equal-frequency) bins:...
Uploaded: 08/08/2014, 18:22
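The binning excerpt in the entry above can be reproduced with a few lines of Python. This is a minimal sketch, not the book's code: the function names are hypothetical, the input is assumed to be already sorted, and ties in boundary smoothing are assumed to go to the lower boundary (the excerpt does not say how ties are broken).

```python
def equal_frequency_bins(sorted_values, bin_size):
    """Split already-sorted values into consecutive bins of bin_size items."""
    return [sorted_values[i:i + bin_size]
            for i in range(0, len(sorted_values), bin_size)]


def smooth_by_means(bins):
    """Replace every value in a bin with that bin's mean."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of its bin's min and max.

    Assumption: a value equidistant from both boundaries goes to the lower one.
    """
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are sorted, so these are min and max
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


# Sorted price data from the excerpt (in dollars).
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_frequency_bins(prices, 3)
means = smooth_by_means(bins)
boundaries = smooth_by_boundaries(bins)
print(bins)
print(means)
print(boundaries)
```

With three values per bin, the smoothed bins match the values quoted in the excerpt for both smoothing by bin means and smoothing by bin boundaries.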
... efficiently. 8 Mining Stream, Time-Series, and Sequence Data Our previous chapters introduced the basic concepts and techniques of data mining. The techniques studied, however, were for simple and structured ... time-series streams, spatiotemporal data streams, and video and audio data streams. 8.2 Mining Time-Series Data “What is a time-series database?” A tim...
Uploaded: 08/08/2014, 18:22
... Values 61 2.3.2 Noisy Data 62 2.3.3 Data Cleaning as a Process 65 2.4 Data Integration and Transformation 67 2.4.1 Data Integration 67 2.4.2 Data Transformation 70 2.5 Data Reduction 72 2.5.1 Data ... Descriptive Data Summarization 51 2.2.1 Measuring the Central Tendency 51 2.2.2 Measuring the Dispersion of Data 53 2.2.3 Graphic Displays of Basic De...
Uploaded: 08/08/2014, 18:22
Data Mining Concepts and Techniques part 3 docx
... dimensions city, item, and year. 4.1 Efficient Methods for Data Cube Computation 177 [Figure: tree of node counts (labels such as root, a1, a2, b*, c*, d*, BCD), garbled in extraction] ... processing, and data mining. We also introduce on-line analytical mining (OLAM), a powerful paradigm that integrates OLAP with data mining technology....
Uploaded: 08/08/2014, 18:22
Data Mining Concepts and Techniques part 4 potx
... count sales count Asia 15 300 120 1000 135 1300; Europe 12 250 150 1200 162 1450; North America 28 450 200 1800 228 2250; all regions 45 1000 470 4000 525 5000. Generalized data can be presented graphically, ... L_{k−1} (2) for each itemset l_2 ∈ L_{k−1} (3) if (l_1[1] = l_2[1]) ∧ (l_1[2] = l_2[2]) ∧ ··· ∧ (l_1[k−2] = l_2[k−2]) ∧ (l_1[k−1] < l_2[k−1]) then { (4) c = l...
Uploaded: 08/08/2014, 18:22
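The pseudocode fragment in the entry above is the join condition used when generating Apriori candidates: two frequent (k−1)-itemsets are joined when their first k−2 items agree and the last item of the first is smaller than the last item of the second. A minimal sketch of that join only, assuming each itemset is a sorted tuple; the function name and the sample itemsets are hypothetical, and the pruning step that follows in the full algorithm is not shown because the excerpt cuts off before it.

```python
def apriori_join(prev_frequent):
    """Join step: build k-item candidates from sorted (k-1)-item tuples."""
    candidates = []
    for l1 in prev_frequent:
        for l2 in prev_frequent:
            # First k-2 items equal, and the last item of l1 precedes that of l2.
            if l1[:-1] == l2[:-1] and l1[-1] < l2[-1]:
                candidates.append(l1 + (l2[-1],))
    return candidates


# Hypothetical frequent 2-itemsets, used only to exercise the join.
frequent_2 = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5)]
candidates_3 = apriori_join(frequent_2)
print(candidates_3)
```

The strict `<` on the last item is what keeps each candidate from being generated twice.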
Data Mining Concepts and Techniques part 5 ppt
... coverage(R1) = 2/14 = 14.28% and accuracy(R1) = 2/2 = 100%. 296 Chapter 6 Classification and Prediction that satisfy the test. The right branch out of N is labeled no so that D_2 corresponds to ... obtain SplitInfo_A(D) = −(4/14) × log_2(4/14) − (6/14) × log_2(6/14) − (4/14) × log_2(4/14) = 1.557. From Example 6.1, we have Gain(income) = 0.029. Therefore, GainRatio...
Uploaded: 08/08/2014, 18:22
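The split-information expression in the entry above is easy to check numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the partition sizes 4, 6, and 4 out of 14 tuples are read off the excerpt, and the gain ratio is taken as Gain(A) divided by SplitInfo_A(D), the usual C4.5 definition, since the excerpt is truncated before it states the formula. Evaluating the expression directly gives a split information of about 1.557.

```python
from math import log2


def split_info(partition_sizes):
    """Split information of a partitioning, given the size of each partition."""
    total = sum(partition_sizes)
    return -sum((n / total) * log2(n / total)
                for n in partition_sizes if n > 0)


def gain_ratio(gain, partition_sizes):
    """Gain ratio, assumed here to be Gain(A) / SplitInfo_A(D)."""
    return gain / split_info(partition_sizes)


# Partition sizes taken from the excerpt: 4, 6, and 4 of 14 tuples.
income_split_info = split_info([4, 6, 4])
# Gain(income) = 0.029 is the value quoted in the excerpt.
income_gain_ratio = gain_ratio(0.029, [4, 6, 4])
print(round(income_split_info, 3), round(income_gain_ratio, 3))
```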
Data Mining Concepts and Techniques part 6 ppt
... asymmetric binary variables. [Plot of the points x_1 = (1, 2) and x_2 = (3, 5)] Euclidean distance = (2^2 + 3^2)^(1/2) = 3.61; Manhattan distance = 2 + 3 = 5. Figure 7.1 Euclidean and Manhattan distances ... as d(i, j) = sqrt(w_1 |x_i1 − x_j1|^2 + w_2 |x_i2 − x_j2|^2 + ··· + w_n |x_in − x_jn|^2). (7.8) Weighting can also be applied to the Manhattan and Minkowski distances. 7...
Uploaded: 08/08/2014, 18:22
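The distance figures in the entry above can be reproduced directly. A minimal sketch with hypothetical function names; the weighted variant assumes the weighted Euclidean form, that is, the square root of the weighted sum of squared differences, and the weights in the usage lines are illustrative only.

```python
from math import sqrt


def euclidean(x, y):
    """Straight-line distance between two points of equal dimension."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def weighted_euclidean(x, y, weights):
    """Euclidean distance with one non-negative weight per dimension."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


# The two points from the excerpt.
x1, x2 = (1, 2), (3, 5)
d_euclid = euclidean(x1, x2)
d_manhattan = manhattan(x1, x2)
# Unit weights are an illustrative choice; they reduce to the plain distance.
d_weighted = weighted_euclidean(x1, x2, (1, 1))
print(round(d_euclid, 2), d_manhattan, round(d_weighted, 2))
```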
Data Mining Concepts and Techniques part 8 potx
... substructures. 9. Metadata mining. Metadata are data about data. Metadata provide semi-structured data about unstructured data, ranging from text and Web data to multimedia databases. It is useful for data ... proposed by Chandrasekaran and Franklin [CF02], Gehrke, Korn, and Srivastava [GKS01], Dobra, Garofalakis, Gehrke, and Rastogi [DGGR02], and Madden, Shah,...
Uploaded: 08/08/2014, 18:22
Data Mining Concepts and Techniques part 9 pot
... multimedia data mining focuses on image data mining. Mining text data and mining the World Wide Web are studied in the two subsequent 638 Chapter 10 Mining Object, Spatial, Multimedia, Text, and Web Data where ... [Table with columns t_1 to t_7] d_1: 0, 4, 10, 8, 0, 5, 0; d_2: 5, 19, 7, 16, 0, 0, 32; d_3: 15, 0, 0, 4, 9, 0, 17; d_4: 22, 3, 12, 0, 5, 15, 0; d_5: 0, 7, 0, 9, 2, 4, 12. 634 Chapter 10 Mining...
Uploaded: 08/08/2014, 18:22
Data Mining Concepts and Techniques part 10 pot
... molecules. In Proc. 2002 Int. Conf. Data Mining (ICDM’02), pages 211–218, Maebashi, Japan, Dec. 2002. [BBD+02] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream ... streaming data. In Proc. 2002 Int. Conf. Very Large Data Bases (VLDB’02), pages 203–214, Hong Kong, China, Aug. 2002. [CF03] C. Cooper and A. Frieze. A...
Uploaded: 08/08/2014, 18:22