Vicenç Torra

52.2 Preprocessing Data

Collected data usually contain errors, either introduced on purpose (e.g., to protect confidentiality, as in privacy-preserving Data Mining) or caused by incorrect data handling. Such errors make data processing difficult, because incorrect models might be inferred from the erroneous data. The situation is even more noticeable in multi-database Data Mining (Wrobel, 1997; Zhong et al., 1999). In this framework, models have to be extracted from data distributed among several databases. The data are then usually inconsistent: attributes in different databases are not codified in a unified way (they might have different names), and the domains of the attributes are not the same.

Information fusion techniques make it possible to deal with some of these difficulties. We describe below some of the techniques currently used to address these problems, namely re-identification algorithms in multi-database Data Mining, and fusion and aggregation operators for improving the quality of data (for both multi-database and single-source Data Mining).

52.2.1 Re-identification Algorithms

In the construction of models from multiple databases, re-identification methods play a central role. Their purpose is to link data descriptions that belong to the same object although they are distributed over different data files. To formalize such methods, consider a database A and a database B, both containing information about the same individuals, the former described in terms of attributes $A_1, \ldots, A_n$ and the latter in terms of attributes $B_1, \ldots, B_m$. In this setting, two groups of algorithms can be distinguished: record linkage methods and scheme matching methods.

Record Linkage (or Record Matching) Methods

Given a record r in A, such methods consist of finding all records in B that correspond to the same individual as r. Different methods rely on different assumptions about the attributes $A_i$ and $B_i$ and about the underlying model for the data.
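As a rough illustration of the linkage task, the sketch below pairs each record of A with its nearest record in B. It is only a toy under strong assumptions that the text does not make: both files hold the same numerical attributes in the same order, the values are on comparable scales, and Euclidean distance is a sensible dissimilarity. The function name `link_records` and the data are hypothetical.

```python
import math


def link_records(file_a, file_b):
    """Return, for each record in file_a, the index of its nearest record in file_b.

    Assumption (not from the chapter): records are tuples of numbers over the
    same attributes, and plain Euclidean distance is an adequate dissimilarity.
    """
    links = []
    for r in file_a:
        # Distance from r to every candidate record in file_b.
        distances = [math.dist(r, s) for s in file_b]
        # Keep the index of the closest candidate.
        links.append(min(range(len(file_b)), key=distances.__getitem__))
    return links


# Hypothetical toy data: file_b holds noisy, shuffled copies of file_a
# (attributes: age, monthly income).
file_a = [(25.0, 1800.0), (47.0, 3200.0), (33.0, 2500.0)]
file_b = [(46.0, 3150.0), (34.0, 2480.0), (25.5, 1790.0)]

print(link_records(file_a, file_b))  # [2, 0, 1]
```

In practice the attributes would normally be standardized before distances are computed, and a record may have no true counterpart in the other file; the sketch ignores both issues.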
Classical methods assume that both files share a large enough set of attributes and that these attributes have the same domain, so that difficulties in re-identifying records are caused solely by errors in the data. In this setting, different algorithms have been developed: probabilistic record linkage and distance-based record linkage. More recently, a cluster-based approach was introduced, similar in spirit to distance-based record linkage. Probabilistic record linkage is described in detail in (Jaro, 1989), which applies the method to the 1985 Census of Tampa (Florida). A more up-to-date description of the field is given in (Winkler, 1995a) and (Winkler, 1995b). Distance-based record linkage was proposed by Pagliuca and Seri (1999). Both methods are reviewed and compared in (Torra and Domingo-Ferrer, 2003). The cluster-based approach is given in (Bacher et al., 2002).

New methods (Torra, 2000b; Domingo-Ferrer and Torra, 2003) have recently been proposed that weaken the conditions required of the variables. Variables are no longer assumed to be the same in both files, only similar. In this case, re-identification is based on the assumption that some structural information is present in both files and can be extracted from the relationships between objects. The methods differ in the way these relationships are expressed.

52 Information Fusion

For example, Torra (2000c) expresses such similarities by means of partitions: similar objects are clustered together, while dissimilar objects are placed in different clusters. Torra (2000b, 2004) computes aggregates for records and uses these aggregates to compute similarities between records.

Scheme Matching Methods

These methods establish proper correspondences between two or more database schemes. The most typical situation is to find correspondences between attributes.
Methods that deal with this situation are the so-called attribute correspondence identification methods. Given the attributes $A_1, \ldots, A_n$ of a database A, such methods find the attributes among $B_1, \ldots, B_m$ of database B that describe the same concept. Such relationships can be one-to-one, many-to-one or many-to-many. In the first case, an attribute in one file corresponds to an attribute in the second file (although the two can use different domains, or the same domain with different granularities). In the second case, one attribute in one file is represented by several attributes in the second file (e.g., address in one file vs. street and number in the second file). In the third case, only complex relations can be established between attributes. For examples of the first case see, e.g., (Bergamaschi et al., 2001) and (Do and Rahm, 2002); for an example of the second case see, e.g., (Borkar et al., 2000). The third case is not presently studied in the literature.

Structure-level matching methods are an alternative to attribute correspondence identification methods; they deal with more general situations that do not reduce directly to attribute correspondence. See (Rahm and Bernstein, 2001) for a review.

Although we have divided re-identification algorithms into two classes, some algorithms can be applied to both situations, because the attribute correspondence identification problem is similar to the record re-identification problem. On this basis, a similar algorithm was applied in (Torra, 2000c) and (Domingo-Ferrer and Torra, 2003) to problems of both classes.

52.2.2 Fusion to Improve the Quality of Data

Once data from several sources are known to refer to the same individual, they can be fused to obtain data of better quality (in some applications this is known as data consolidation) or data with a broader scope (data that cover aspects that the data supplied by a single source do not).
Fusion methods are also applied when data are supplied by a single source but at different time instants. This is the case, for example, of the moving average over a window for temporal variables; see, e.g., (Tsumoto, 2003).

Aggregation operators are the particular functions used to combine data into a new and more reliable datum. Aggregation operators (Calvo, Mayor and Mesiar, 2002) exist for data of several types. For example, there are operators for:

• Numerical data: e.g., the arithmetic mean, the weighted mean, and fuzzy integrals such as the Choquet and Sugeno integrals (Choquet, 1954; Sugeno, 1974).
• Linguistic or categorical data: e.g., the median and the majority or plurality rule (Roberts, 1991).

Other operators exist for combining more complex data, such as partitions or dendrograms (hierarchical structures). See (Day, 1986) on methods for partitions and dendrograms.

The operators differ in the assumptions they make about sources and domains. The family of operators encompassed by the Choquet integral (an operator for numerical data) illustrates this situation. See (Torra, 2003) or (Calvo, Kolesárová, Komorníková and Mesiar, 2002) for a more detailed account of the Choquet integral family and a detailed description of aggregation operators for numerical data.

The arithmetic mean ($\sum_i a_i / N$) is the simplest aggregation operator for numerical scales. However, its definition implies that all sources have the same importance. The weighted mean ($\sum_i p_i a_i$ with $p_i \geq 0$ and $\sum_i p_i = 1$) removes this assumption by letting the user assign weights to the sources (e.g., one sensor is twice as reliable as another). Alternatively, the OWA operator (Yager, 1988) assigns importance to the values rather than to the sources. That is, the OWA operator can model situations in which smaller values are more important than larger ones or, in other words, in which compensation is allowed between values.
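The three operators just described can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function names and the sensor readings are hypothetical. The OWA weights are applied to the values sorted in decreasing order, which is the usual convention for this operator.

```python
def _check_weights(weights, n):
    """Weights must be non-negative, sum to 1, and match the number of values."""
    if len(weights) != n:
        raise ValueError("one weight per value is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")


def arithmetic_mean(values):
    # Every source counts the same.
    return sum(values) / len(values)


def weighted_mean(values, p):
    # p[i] is the importance of the source that supplied values[i].
    _check_weights(p, len(values))
    return sum(w * a for w, a in zip(p, values))


def owa(values, w):
    # w[i] is attached to the i-th largest value, whatever source supplied it.
    _check_weights(w, len(values))
    ordered = sorted(values, reverse=True)
    return sum(wi * a for wi, a in zip(w, ordered))


# Hypothetical readings from four sensors; the last one looks anomalous.
readings = [10.0, 12.0, 11.0, 50.0]

print(arithmetic_mean(readings))                      # 20.75
print(weighted_mean(readings, [0.4, 0.2, 0.2, 0.2]))  # first sensor weighted twice as much
print(owa(readings, [0.0, 0.5, 0.5, 0.0]))            # 11.5
```

The last call puts zero weight on the largest and the smallest value, so the result depends only on the two middle readings, regardless of which sensors produced them.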
The OWA operator can also diminish the importance of outliers. The WOWA operator (Torra, 1997), like the OWA with importances defined by Yager in 1996, accounts for both the importance of the values and the importance of the sources. That is, the operator allows for compensation while, at the same time, some sources can be declared more important than others. Finally, Choquet integrals (Choquet, 1954) overcome an implicit assumption of the weighted mean, namely that all sources are independent. The Choquet integral, which is defined as the integral of a function (the values to be aggregated) with respect to a fuzzy measure, makes it possible to consider sources that interact.

Aggregation Operators: Formalization

In ordered scales D (ordinal or numerical scales), aggregation operators are typically functions that combine N values and that satisfy idempotency or unanimity (when all inputs are the same, the output is that same value), monotonicity (when any input is increased, the output does not decrease) and compromise (the output lies between the minimum and the maximum of the input values). Formally, aggregation operators are functions C (from Consensus) from $D^N$ to $D$ that satisfy:

1. $C(a, a, \ldots, a) = a$
2. if $a_i \leq b_i$ for all $i$, then $C(a_1, a_2, \ldots, a_N) \leq C(b_1, b_2, \ldots, b_N)$
3. $\min_i a_i \leq C(a_1, a_2, \ldots, a_N) \leq \max_i a_i$

In some applications, the role of the information sources is important and some prior knowledge about their relevance or reliability is available. In this case, the set of sources $X = \{x_1, \ldots, x_N\}$ is considered in the aggregation process, and the value that source $x_i$ supplies is expressed by means of a function f from X into D, so that $f(x_i) = a_i$. Knowledge about the sources is then given as functions on X (or on subsets of X). For example, the weighted mean corresponds to $\sum_i p(x_i) f(x_i)$.
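To make the role of knowledge about subsets of sources more concrete, the sketch below implements the usual discrete form of the Choquet integral, which the chapter does not spell out: the sources are sorted by increasing value, and each increment in value is multiplied by the measure of the set of sources whose value is at least that large. The names `choquet` and `additive_measure`, the three sources and all numbers are hypothetical. The measure is stored as a dictionary from frozensets of sources to numbers and is assumed to be monotone, with value 0 on the empty set and 1 on the whole set.

```python
from itertools import combinations


def choquet(values, mu):
    """Discrete Choquet integral of `values` (source -> value) w.r.t. `mu`.

    `mu` maps frozensets of sources to numbers; it is assumed to be monotone
    with mu[frozenset()] == 0 and mu[all sources] == 1.
    """
    order = sorted(values, key=values.get)  # sources by increasing value
    total, previous = 0.0, 0.0
    for i, x in enumerate(order):
        upper = frozenset(order[i:])        # sources with value >= values[x]
        total += (values[x] - previous) * mu[upper]
        previous = values[x]
    return total


def additive_measure(p):
    """Measure of a set = sum of the weights of its sources (no interaction)."""
    sources = list(p)
    return {frozenset(c): sum(p[x] for x in c)
            for r in range(len(sources) + 1)
            for c in combinations(sources, r)}


values = {"s1": 0.6, "s2": 0.8, "s3": 0.2}   # f(x_i): value supplied by each source
p = {"s1": 0.5, "s2": 0.3, "s3": 0.2}        # p(x_i): weight of each source

mu = additive_measure(p)
weighted = sum(p[x] * values[x] for x in values)
print(choquet(values, mu), weighted)          # both are 0.58 up to rounding

# Non-additive variant: s1 and s2 are treated as partly redundant, so together
# they count for less than the sum of their individual weights.
redundant = dict(mu)
redundant[frozenset({"s1", "s2"})] = 0.6
print(choquet(values, redundant))             # about 0.5
```

With an additive measure the result coincides with the weighted mean; lowering the measure of the pair {s1, s2} lowers the aggregate, which is one simple way in which interaction between sources shows up.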
In this setting, a Choquet integral is an integral of the function f with respect to a fuzzy measure $\mu$ defined on the subsets of X. Each $\mu(A)$, for $A \subseteq X$, measures the reliability or importance of the sources in A.

52.3 Building Data Models

The main goal of Data Mining is the extraction of knowledge from data. Information fusion methods also play a role in this process. In fact, Data Mining techniques are usually rooted in a particular knowledge representation formalism or a particular data model. Information fusion can be used, on the one hand, as one such data model and, on the other hand, to combine different data models into a more accurate one. We review both situations below.

52.3.1 Data Models Using Aggregation Operators

Aggregation operators, like any mathematical function, can be used for data modeling. They are appropriate for modeling situations that are compatible with the properties of these operators: the variable to be modeled is monotonic with respect to the inputs, the operator satisfies idempotency, and the output lies between the minimum and the maximum of the input values. Additional constraints must be considered when a particular aggregation operator is selected. For example, when the weighted mean is used as the model, independence of the attributes is required.

Although aggregation operators on their own are restricted to data that satisfy these properties, they can be used in composite models (e.g., hierarchical models) when the data do not satisfy such constraints. The validity of aggregation operators in these more complex situations rests on the following two results:

1. Two-step Choquet integrals with constant permit the approximation of any monotonic function to the desired level of accuracy. See (Murofushi and Narukawa, 2002) for details.
2. Hierarchies of quasi-weighted means and hierarchies of Choquet integrals with constant can be used to approximate any arbitrary function to the desired level of detail. This result is given in (Narukawa and Torra, 2003).

The construction of an appropriate model, based on experimental data or on an expert's experience, requires the selection of an appropriate aggregation function as well as of its parameters. When composite models are used, the architecture of the model must also be selected. Research has been carried out on operator and parameter selection. In the next section, we concentrate on methods for learning parameters from examples; that is, we consider the case in which there is a set of (input, output) pairs that the model is intended to reproduce. Other research relies on an expert who gives relevant information for the selection of the operator or of the parameters of a given operator. See, e.g., Saaty's Analytic Hierarchy Process (Saaty, 1980) or O'Hagan's selection of OWA operator weights from the degree of compensation (O'Hagan, 1988).

Learning Methods for Aggregation Operators

At present there exist many methods for learning parameters from examples once the aggregation operator has been selected. Filev and Yager (1998) developed a method for the OWA operator that is also appropriate when the selected operator is the weighted mean. A more effective method was presented in (Torra, 1999) and extended in (Torra, 2002) to quasi-weighted means.

The first approach in the literature to consider the learning of fuzzy measures for Choquet integrals is (Tanaka and Murofushi, 1989). The same problem, under alternative assumptions, has been considered in, among others, (Marichal and Roubens, 2000; Imai et al., 2000; Wang et al., 1999; Torra, 2000a).
While the first two papers present general methods, the last two are for constrained fuzzy measures, namely Sugeno λ-measures and distorted probabilities. A recent survey of learning methods for the Choquet integral is given in (Grabisch, 2003). For more details on learning methods for aggregation operators, see the excellent work by Beliakov (2003).

52.3.2 Aggregation Operators to Fuse Data Models

A second use of aggregation operators for data modeling corresponds to the so-called ensemble methods (Dietterich, 1997). In this case, aggregation is used to combine several different models constructed from different subsets of objects (as in Bagging (Breiman, 1996) and Boosting (Schapire, 1990)) or based on different machine learning approaches.

The operators used in the literature depend on the kind of problem considered and on the models built. Two cases can be distinguished:

Regression Problems

The output of each individual module is a numerical value. Accordingly, aggregation operators for numerical data are used. Arithmetic and weighted means are the most commonly used operators (see, e.g., (Merz, 1999)). Nevertheless, fuzzy integrals, such as the Choquet or Sugeno integrals, can also be used.

Classification Problems

The output of each individual module typically corresponds to a categorical value (the class of a particular instance); therefore, aggregation operators for categorical values (in nominal scales, i.e., non-ordered categorical domains) are considered. The plurality rule (voting or weighted voting procedures) is the most commonly used aggregation operator (see, e.g., (Bauer and Kohavi, 1999) and (Merz, 1999)). In a recent study (Kuncheva, 2003), several fuzzy combination methods were compared in classification problems with some nonfuzzy techniques such as (weighted) majority voting.

52.4 Information Extraction

Information fusion can be used in Data Mining as a method for information extraction.
In particular, the following applications can be envisioned:

52.4.1 Summarization

Fusion methods are used to build summaries so that the quantity of available data is reduced and the data can be represented more compactly. For example, Detyniecki (2000) uses aggregation operators to build summaries of video sequences, and Kacprzyk, Yager and Zadrozny (2000) and Yager (2003) use aggregation operators to build linguistic summaries (fuzzy rules) of databases. Methods for dimensionality reduction or multidimensional scaling (Cox and Cox, 1994) can also be seen from the perspective of information fusion and summarization: they reduce either the dimensionality of the records (the number of variables) or the number of records (e.g., clustering to build prototypes from data).

52.4.2 Knowledge from Aggregation Operators

For some information fusion methods, the parameters can be used to extract information from the data. This is the case for the Choquet integral. As noted above, this operator can aggregate values from sources that are not independent. Accordingly, when a data model is built using a Choquet integral, the corresponding fuzzy measure can be interpreted, and interactions between variables emerge from the analysis of the measure. Grabisch (2000) describes the use of Choquet integral models to extract features from data.

1. Preprocessing data
   a. Re-identification algorithms: record and scheme matching methods
   b. Fusion to improve the quality of data: aggregation operators
2. Building data models
   a. Models based on aggregation operators: learning methods for aggregation operators
   b. Aggregation operators to fuse data models: ensemble methods
3. Information extraction
   a. Summarization: dimensionality reduction and linguistic summaries
   b. Knowledge from aggregation operators: interpreting operators' parameters

Fig. 52.1. Main topics and methods for Information Fusion in Data Mining and Knowledge Discovery.

52.5 Conclusions

In this chapter we have reviewed the application of information fusion in Data Mining and knowledge discovery. We have seen that aggregation operators and information fusion methods can be applied for three main uses: preprocessing data, building data models and information extraction. Figure 52.1 gives a more detailed account of these applications.

Acknowledgments

Partial support of the MCyT under the contract STREAMOBILE (TIC2001-0633-C03-01/02) is acknowledged.

References

Bacher, J., Brand, R., Bender, S., Re-identifying register data by survey data using cluster analysis: an empirical study, Int. J. of Unc., Fuzziness and KBS, 10:5 589-607, 2002.
Beliakov, G., How to Build Aggregation Operators from Data, Int. J. of Intel. Syst., 18 903-923, 2003.
Bergamaschi, S., Castano, S., Vincini, M., Beneventano, D., Semantic integration of heterogeneous information sources, Data and Knowledge Engineering, 36 215-249, 2001.
Bauer, E., Kohavi, R., An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting and Variants, Machine Learning, 36 105-139, 1999.
Borkar, V. R., Deshmukh, K., Sarawagi, S., Automatically extracting structure from free text addresses, Bulletin of the Technical Committee on Data Engineering, 23:4 27-32, 2000.
Breiman, L., Bagging predictors, Machine Learning, 24 123-140, 1996.
Calvo, T., Mayor, G., Mesiar, R. (Eds.), Aggregation Operators: New Trends and Applications, Physica-Verlag: Springer, 2002.
Calvo, T., Kolesárová, A., Komorníková, M., Mesiar, R., Aggregation Operators: Properties, Classes and Construction Methods, in T. Calvo, G. Mayor, R. Mesiar (Eds.), Aggregation Operators: New Trends and Applications, Physica-Verlag, 3-123, 2002.
Choquet, G., Theory of Capacities, Ann. Inst. Fourier, 5 131-296, 1954.
Cox, T. F., Cox, M. A. A., Multidimensional scaling, Chapman and Hall, 1994.
Day, W. H. E., Special issue on comparison and consensus of classifications, Journal of Classification, 3, 1986.
Detyniecki, M., Mathematical Aggregation Operators and their Application to Video Querying, PhD Dissertation, University of Paris VI, Paris, France, 2000.
Dietterich, T. G., Machine-Learning Research: Four Current Directions, AI Magazine, Winter 1997, 97-136.
Do, H.-H., Rahm, E., COMA - A system for flexible combination of schema matching approaches, Proc. of the 28th VLDB Conference, Hong Kong, China, 2002.
Domingo-Ferrer, J., Torra, V., Disclosure Risk Assessment in Statistical Microdata Protection via advanced record linkage, Statistics and Computing, 13 343-354, 2003.
Filev, D., Yager, R. R., On the issue of obtaining OWA operator weights, Fuzzy Sets and Systems, 94 157-169, 1998.
Grabisch, M., Fuzzy integral for classification and feature extraction, in M. Grabisch, T. Murofushi, M. Sugeno (Eds.), Fuzzy Measures and Integrals, Physica-Verlag, 415-434, 2000.
Grabisch, M., Modelling data by the Choquet integral, in V. Torra (Ed.), Information Fusion in Data Mining, Springer, 135-148, 2003.
Imai, H., Miyamori, M., Miyakosi, M., Sato, Y., An algorithm based on alternative projections for a fuzzy measures identification problem, Proc. of the Iizuka conference, Iizuka, Japan (CD-ROM), 2000.
Jaro, M. A., Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida, J. of the American Stat. Assoc., 84:406 414-420, 1989.
Kacprzyk, J., Yager, R. R., Zadrozny, S., A fuzzy logic based approach to linguistic summaries in databases, Int. J. of Applied Mathematical Computer Science, 10 813-834, 2000.
Kuncheva, L. I., "Fuzzy" Versus "Nonfuzzy" in Combining Classifiers Designed by Boosting, IEEE Trans. on Fuzzy Systems, 11:6 729-741, 2003.
Marichal, J.-L., Roubens, M., Determination of weights of interacting criteria from a reference set, European Journal of Operational Research, 124:3 641-650, 2000.
Merz, C. J., Using Correspondence Analysis to Combine Classifiers, Machine Learning, 36 33-58, 1999.
Merz, C. J., Pazzani, M. J., Combining regression estimates, Machine Learning, 36 9-32, 1999.
Murofushi, T., Narukawa, Y., A Characterization of multi-level discrete Choquet integral over a finite set (in Japanese), Proc. of the 7th Workshop on Evaluation of Heart and Mind, 33-36, 2002.
Narukawa, Y., Torra, V., Choquet integral based models for general approximation, in I. Aguiló, L. Valverde, M. T. Escrig (Eds.), Artificial Intelligence Research and Development, IOS Press, 39-50, 2003.
O'Hagan, M., Aggregating template or rule antecedents in real-time expert systems with fuzzy set logic, Proceedings of the 22nd Annual IEEE Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, 681-689, 1988.
Pagliuca, D., Seri, G., Some Results of Individual Ranking Method on the System of Enterprise Accounts Annual Survey, Esprit SDC Project, Deliverable MI-3/D2, 1999.
Rahm, E., Bernstein, P. A., A survey of approaches to automatic schema matching, The VLDB Journal, 10 334-350, 2001.
Roberts, F. S., On the indicator function of the plurality function, Mathematical Social Sciences, 22 163-174, 1991.
Saaty, T. L., The Analytic Hierarchy Process, McGraw-Hill, New York, 1980.
Schapire, R. E., The strength of weak learnability, Machine Learning, 5:2 197-227, 1990.
Sugeno, M., Theory of Fuzzy Integrals and its Applications, PhD Dissertation, Tokyo Institute of Technology, Tokyo, Japan, 1974.
Tanaka, A., Murofushi, T., A learning model using fuzzy measure and the Choquet integral, Proc. of the 5th Fuzzy System Symposium, 213-217, Kobe, Japan (in Japanese), 1989.
Torra, V., The Weighted OWA operator, Int. J. of Intel. Systems, 12 153-166, 1997.
Torra, V., On the learning of weights in some aggregation operators: The weighted mean and the OWA operators, Mathware and Soft Computing, 6 249-265, 1999.
Torra, V., Learning weights for Weighted OWA operators, Proc. IEEE Int. Conf. on Industrial Electr. Control and Instrumentation, 2000a.
Torra, V., Re-identifying Individuals using OWA operators, Proc. of the 6th Int. Conference on Soft Computing (CD-ROM), Iizuka, Fukuoka, Japan, 2000b.
Torra, V., Towards the re-identification of individuals in data files with Non-common variables, Proc. of the 14th ECAI, 326-330, Berlin, 2000c.
Torra, V., Learning weights for the quasi-weighted means, IEEE Trans. on Fuzzy Systems, 10:5 653-666, 2002.
Torra, V., On some aggregation operators for numerical information, in V. Torra (Ed.), Information Fusion in Data Mining, Springer, 9-26, 2003.
Torra, V., OWA operators in data modeling and re-identification, IEEE Trans. on Fuzzy Systems, 12:5, 2004.
Torra, V., Domingo-Ferrer, J., Record linkage methods for multidatabase Data Mining, in V. Torra (Ed.), Information Fusion in Data Mining, Springer, 101-132, 2003.
Tsumoto, S., Discovery of Temporal Knowledge in Medical Time-Series Databases using Moving Average, Multiscale Matching and Rule Induction, in V. Torra (Ed.), Information Fusion in Data Mining, Springer, 79-100, 2003.
Wang, Z., Leung, K. S., Wang, J., A genetic algorithm for determining nonadditive set functions in information fusion, Fuzzy Sets and Systems, 102 462-469, 1999.
Winkler, W. E., Matching and record linkage, in B. G. Cox (Ed.), Business Survey Methods, Wiley, 355-384, 1995.
Winkler, W. E., Advanced methods for record linkage, American Statistical Association, Proceedings of the Section on Survey Research Methods, 467-472, 1995.
Wrobel, S., An Algorithm for Multi-relational Discovery of Subgroups, in J. Komorowski et al. (Eds.), Principles of Data Mining and Knowledge Discovery, Lecture Notes in Artificial Intelligence 1263, Springer, 367-375, 1997.
Yager, R. R., On ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans. on SMC, 18 183-190, 1988.
Yager, R. R., Quantifier Guided Aggregation Using OWA operators, Int. J. of Int. Systems, 11 49-73, 1996.
Yager, R. R., Data Mining Using Granular Linguistic Summaries, in V. Torra (Ed.), Information Fusion in Data Mining, Springer, 211-229, 2003.
Zhong, N., Yao, Y. Y., Ohsuga, S., Peculiarity Oriented Multi-Database Mining, in J. Zytkow, J. Rauch (Eds.), Principles of Data Mining and Knowledge Discovery, Lecture Notes in Artificial Intelligence 1704, Springer, 136-146, 1999.

53 Parallel and Grid-Based Data Mining – Algorithms, Models and Systems for High-Performance KDD

Antonio Congiusta, Domenico Talia, and Paolo Trunfio

DEIS – University of Calabria
acongiusta,talia,trunfio@deis.unical.it

Summary. Data Mining is often a computing-intensive and time-consuming process. For this reason, several Data Mining systems have been implemented on parallel computing platforms to achieve high performance in the analysis of large data sets. Moreover, when large data repositories are coupled with geographical distribution of data, users and systems, more sophisticated technologies are needed to implement high-performance distributed KDD systems. Since computational Grids emerged as privileged platforms for distributed computing, a growing number of Grid-based KDD systems have been proposed. In this chapter we first discuss different ways to exploit parallelism in the main Data Mining techniques and algorithms, then we discuss Grid-based KDD systems. Finally, we introduce the Knowledge Grid, an environment that makes use of standard Grid middleware to support the development of parallel and distributed knowledge discovery applications.

Key words: Parallel Data Mining, Grid-based Data Mining, Knowledge Grid, Distributed Knowledge Discovery

53.1 Introduction

Today, information overload is as much of a problem as the shortage of information. In our daily activities we often deal with flows of data much larger than we can understand and use.
Thus we need a way to sift those data to extract what is interesting and relevant for our activities. Knowledge discovery in large data repositories can find what is interesting in them and represent it in an understandable way (Berry and Linoff, 1997).

Mining large data sets requires powerful computational resources. In fact, Data Mining algorithms working on very large data sets take a very long time to produce results on conventional computers. One approach to reducing response time is sampling. However, in some cases reducing the data might result in inaccurate models, and in other cases it is not useful at all (e.g., outlier identification). Another approach is parallel computing. High-performance computers and parallel Data Mining algorithms can offer a very efficient way to mine very large data sets (Freitas and Lavington, 1998; Skillicorn, 1999) by analyzing them in parallel.

O. Maimon, L. Rokach (eds.), Data Mining and Knowledge Discovery Handbook, 2nd ed., DOI 10.1007/978-0-387-09823-4_53, © Springer Science+Business Media, LLC 2010