... of the data mining task,
the nature of the available data, and the skills and preferences of the data
miner.
Data mining comes in two flavors—directed and undirected. Directed data
mining ... that, on a technical level, the data mining effort is working and
the data is reasonably accurate. This can be quite comforting. If the data and
the data mining techniques applied to it are powerful ... the data mining
effort itself. If we cannot measure the results of mining the data, then we
cannot learn from the effort and there is no virtuous cycle.
Measurements of past efforts and ad...
...
performance and wide area data mining systems for over ten years. More recently, he has
worked on standards and testbeds for data mining. He has an AB in Mathematics from
Harvard University and ... the data mining group in the centre. He has been working on distributed data mining
algorithms and systems development. He is also working on network infrastructure for
global wide data mining ... J., To, H.W., and
Yang, D. Large scale data mining: Challenges and responses. Proc. of the Third Int’l Conference on Knowledge
Discovery and Data Mining.
Goil, S., Alum, S., and Ranka, S....
... level
data, 96
publications
Building the Data Warehouse (Bill
Inmon), 474
Business Modeling and Data Mining
(Dorian Pyle), 60
Data Preparation for Data Mining
(Dorian Pyle), 75
The Data ... 89–90
metadata repository, 484, 491
methodologies
data correction, 72–74
data exploration, 64–68
data mining process, 54–55
data selection, 60–64
data transformation, 74–76
data translation, ...
Business Modeling and Data Mining, 60
Data Preparation for Data Mining, 75
C
calculations, probabilities, 133–135
call detail databases, 37...
Data mining comes in two flavors—directed and undirected. Directed data
mining ... hours
for reports
System of record for data Copy of data
Descriptive and repetitive Creative
First, problems being addressed by data mining differ from operational
problems: a data mining ... the data mining
effort itself. If we cannot measure the results of mining the data, then we
cannot learn from the effort and there is no virtuous cycle.
Measurements of past efforts and ad...
...
before. The newly discovered relationships suggest new hypotheses to test
and the data mining process begins all over again.
Lessons Learned
Data mining comes in two forms. Directed data mining ...
mining techniques used to generate the scores. It is worth noting, however,
that many of the data mining techniques in this book can and have been ... California based on data that excludes calls to Los Angeles.
Step Six: Transform Data to Bring
Information to the Surface
Once the data has been assembled and major data problems fixed, the data...
... standard deviation (strictly speaking,
this is the standard error but the two are equivalent for our purposes) and the
mean value and the sample size for a proportion. This is called the standard ... reasons as well. For instance, it is one way of
taking several variables and converting them to similar ranges. This can be
useful for several data mining techniques, such as clustering and neural ...
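The two ideas in the passage above, the standard error of a proportion and standardizing variables so they fall in similar ranges, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `standard_error_proportion` and `standardize` are hypothetical, and the choice of the population standard deviation (dividing by n rather than n - 1) is an assumption, not something the text specifies.

```python
import math


def standard_error_proportion(p: float, n: int) -> float:
    """Standard error of a proportion p observed in a sample of size n.

    Uses the usual formula sqrt(p * (1 - p) / n).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


def standardize(values):
    """Convert values to z-scores: subtract the mean, divide by the std dev.

    Assumption: the population standard deviation (divide by n) is used.
    A constant column has zero spread, so it maps to all zeros.
    """
    n = len(values)
    if n == 0:
        return []
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0.0:
        return [0.0] * n
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    # Hypothetical example: a 5 percent response rate in a sample of 1,000.
    print(standard_error_proportion(0.05, 1000))
    # Two variables on very different scales end up in comparable ranges.
    print(standardize([20000.0, 45000.0, 120000.0]))  # e.g. incomes
    print(standardize([1.0, 3.0, 7.0]))               # e.g. counts
```

After standardization each variable has mean 0 and standard deviation 1, which is why distance-based techniques such as clustering are not dominated by whichever variable happens to have the largest raw units.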
Looking at Discrete Values
Much of the data used in data mining is discrete by nature, rather than
continuous. Discrete data shows up in the form...
... customers in California for the challenger and everyone else for the
champion.
■■ Use the 5 percent lowest and 5 percent highest value customers for the
challenger, and everyone else for the champion. ... in several areas:
■■ Data miners tend to ignore measurement error in raw data.
■■ Data miners assume that there is more than enough data and processing
power.
■■ Data mining assumes dependency ... that might be picked up by data mining algorithms.
One major difference between business data and scientific data is that the
latter has many continuous values and the former has many discrete...
... common for neural networks are the logistic and the hyperbolic tangent.
The major difference between them is the range of their outputs, between 0 and
1 for the logistic and between –1 and 1 for ... generalize and learn from data
mimics, in some sense, our own ability to learn from experience. This ability is
useful for data mining, and it also makes neural networks an exciting area for
research, ... test set to see how well it performs.
7. Apply the model generated by the network to predict outcomes for
unknown inputs.
Fortunately, data mining software now performs most of these steps
automatically....
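The two activation functions named earlier in this excerpt, the logistic and the hyperbolic tangent, differ mainly in their output ranges. The sketch below illustrates that difference; the function names are hypothetical and the code is not taken from any particular neural network package.

```python
import math


def logistic(x: float) -> float:
    """Logistic function 1 / (1 + e^-x); outputs lie between 0 and 1."""
    # Branch on the sign of x so math.exp never receives a large positive
    # argument, which would raise OverflowError.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hyperbolic_tangent(x: float) -> float:
    """Hyperbolic tangent; outputs lie between -1 and 1."""
    return math.tanh(x)


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, logistic(x), hyperbolic_tangent(x))
```

The two functions have the same S shape and are related by tanh(x) = 2 * logistic(2x) - 1, so the choice between them is largely a choice of output range.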
... detection is
used to evaluate editorial zones for a major daily newspaper.
Searching for Islands of Simplicity
In Chapter 1, where data mining techniques are classified as directed or
undirected, ... inexplicable and perhaps unimportant.
In a broader sense, however, clustering can be a directed activity because
clusters are sought for some business purpose. In marketing, clusters formed
for a ... applied to data. These patterns can be turned into new features of the data,
for use in conjunction with other directed data mining techniques.
Automatic Cluster...
... and Lo [24] examine network performance for supporting Internet telephony.
They use UDP to collect data between several sites, and then apply a static playout buffer and loss
compensation to determine ... take
occasional “snapshots” of Internet performance, obtaining new data between new sites. Our new
traces add to the overall amount of data available on Internet performance. Secondly, most of
the ... existing protocols for resource discovery (and finding them
lacking for wide area applications), we present a scalable protocol for wide area service discovery,
which is ideal for discovery of gateways,...
... Al-Attar, 1998, ‘Data Mining – Beyond Algorithms’, http://www.attar.com/tutor/mining.htm.
[2] Berry, J. A. Michael; Linoff, Gordon, 1997, Data Mining Techniques: For Marketing, Sales, and
Customer ... 1998, ‘A data mining application for issuing predictions, summarizing the data and
revealing interesting phenomena’, http://www.wizsoft.com/why.html.
[7] Mihelis G.; Grigoroudis E.; and Siskos ...
[Flowchart: ... of Data Set (training and test set); Filling the empty cells; MUSA Final Analysis; Is the Data Set Complete? (Yes / No); Selection of complete questionnaires]
CUSTOMER SATISFACTION USING DATA MINING
TECHNIQUES
Nikolaos...
... BASED DATA MINING TECHNIQUES
The objective of data mining is to extract valuable information from one’s data, to discover the ‘hidden
gold’. In Decision Support Management terminology, data mining ... process in which one searches for patterns of information in data (Parsaye, 1997).
Figure 2: Rule Induction process
Data mining techniques are based on data retention and data distillation. Rule induction ... Al-Attar, 1998, ‘Data Mining – Beyond Algorithms’, http://www.attar.com/tutor/mining.htm.
[2] Berry, J. A. Michael; Linoff, Gordon, 1997, Data Mining Techniques: For Marketing, Sales, and
Customer...