Data Mining

Author Robert Stahlbock; Sven F. Crone; Stefan Lessmann

Publisher Springer Nature

Format Page Fidelity

Print ISBN 9781441912794

Edition 1

Publication year 2010

6.090 kr.

Description

Table of Contents

  • Preface
  • Contents
  • 1 Data Mining and Information Systems: Quo Vadis?
  • Robert Stahlbock, Stefan Lessmann, and Sven F. Crone
  • 1.1 Introduction
  • 1.2 Special Issues in Data Mining
  • 1.2.1 Confirmatory Data Analysis
  • 1.2.2 Knowledge Discovery from Supervised Learning
  • 1.2.3 Classification Analysis
  • 1.2.4 Hybrid Data Mining Procedures
  • 1.2.5 Web Mining
  • 1.2.6 Privacy-Preserving Data Mining
  • 1.3 Conclusion and Outlook
  • References
  • Part I Confirmatory Data Analysis
  • 2 Response-Based Segmentation Using Finite Mixture Partial Least Squares
  • Christian M. Ringle, Marko Sarstedt, and Erik A. Mooi
  • 2.1 Introduction
  • 2.1.1 On the Use of PLS Path Modeling
  • 2.1.2 Problem Statement
  • 2.1.3 Objectives and Organization
  • 2.2 Partial Least Squares Path Modeling
  • 2.3 Finite Mixture Partial Least Squares Segmentation
  • 2.3.1 Foundations
  • 2.3.2 Methodology
  • 2.3.3 Systematic Application of FIMIX-PLS
  • 2.4 Application of FIMIX-PLS
  • 2.4.1 On Measuring Customer Satisfaction
  • 2.4.2 Data and Measures
  • 2.4.3 Data Analysis and Results
  • 2.5 Summary and Conclusion
  • References
  • Part II Knowledge Discovery from Supervised Learning
  • 3 Building Acceptable Classification Models
  • David Martens and Bart Baesens
  • 3.1 Introduction
  • 3.2 Comprehensibility of Classification Models
  • 3.2.1 Measuring Comprehensibility
  • 3.2.2 Obtaining Comprehensible Classification Models
  • 3.2.2.1 Building Rule-Based Models
  • 3.2.2.2 Combining Output Types
  • 3.2.2.3 Visualization
  • 3.3 Justifiability of Classification Models
  • 3.3.1 Taxonomy of Constraints
  • 3.3.2 Monotonicity Constraint
  • 3.3.3 Measuring Justifiability
  • 3.3.4 Obtaining Justifiable Classification Models
  • 3.4 Conclusion
  • References
  • 4 Mining Interesting Rules Without Support Requirement: A General Universal Existential Upward Closure
  • Yannick Le Bras, Philippe Lenca, and Stéphane Lallich
  • 4.1 Introduction
  • 4.2 State of the Art
  • 4.3 An Algorithmic Property of Confidence
  • 4.3.1 On UEUC Framework
  • 4.3.2 The UEUC Property
  • 4.3.3 An Efficient Pruning Algorithm
  • 4.3.4 Generalizing the UEUC Property
  • 4.4 A Framework for the Study of Measures
  • 4.4.1 Adapted Functions of Measure
  • 4.4.1.1 Association Rules
  • 4.4.1.2 Contingency Tables
  • 4.4.2 Expression of a Set of Measures of Ddconf
  • 4.5 Conditions for GUEUC
  • 4.5.1 A Sufficient Condition
  • 4.5.2 A Necessary Condition
  • 4.5.3 Classification of the Measures
  • 4.6 Conclusion
  • References
  • 5 Classification Techniques and Error Control in Logic Mining
  • Giovanni Felici, Bruno Simeone, and Vincenzo Spinelli
  • 5.1 Introduction
  • 5.2 Brief Introduction to Box Clustering
  • 5.3 BC-Based Classifier
  • 5.4 Best Choice of a Box System
  • 5.5 Bi-criterion Procedure for BC-Based Classifier
  • 5.6 Examples
  • 5.6.1 The Data Sets
  • 5.6.2 Experimental Results with BC
  • 5.6.3 Comparison with Decision Trees
  • 5.7 Conclusions
  • References
  • Part III Classification Analysis
  • 6 An Extended Study of the Discriminant Random Forest
  • Tracy D. Lemmond, Barry Y. Chen, Andrew O. Hatch, and William G. Hanley
  • 6.1 Introduction
  • 6.2 Random Forests
  • 6.3 Discriminant Random Forests
  • 6.3.1 Linear Discriminant Analysis
  • 6.3.2 The Discriminant Random Forest Methodology
  • 6.4 DRF and RF: An Empirical Study
  • 6.4.1 Hidden Signal Detection
  • 6.4.1.1 Training on T1, Testing on J2
  • 6.4.1.2 Prediction Performance for J2 with Cross-validation
  • 6.4.2 Radiation Detection
  • 6.4.3 Significance of Empirical Results
  • 6.4.4 Small Samples and Early Stopping
  • 6.4.5 Expected Cost
  • 6.5 Conclusions
  • References
  • 7 Prediction with the SVM Using Test Point Margins
  • Süreyya Özögür-Akyüz, Zakria Hussain, and John Shawe-Taylor
  • 7.1 Introduction
  • 7.2 Methods
  • 7.3 Data Set Description
  • 7.4 Results
  • 7.5 Discussion and Future Work
  • References
  • 8 Effects of Oversampling Versus Cost-Sensitive Learning for Bayesian and SVM Classifiers
  • Alexander Liu, Cheryl Martin, Brian La Cour, and Joydeep Ghosh
  • 8.1 Introduction
  • 8.2 Resampling
  • 8.2.1 Random Oversampling
  • 8.2.2 Generative Oversampling
  • 8.3 Cost-Sensitive Learning
  • 8.4 Related Work
  • 8.5 A Theoretical Analysis of Oversampling Versus Cost-Sensitive Learning
  • 8.5.1 Bayesian Classification
  • 8.5.2 Resampling Versus Cost-Sensitive Learning in Bayesian Classifiers
  • 8.5.3 Effect of Oversampling on Gaussian Naive Bayes
  • 8.5.3.1 Random Oversampling
  • 8.5.3.2 Generative Oversampling
  • 8.5.3.3 Comparison to Cost-Sensitive Learning
  • 8.5.4 Effects of Oversampling for Multinomial Naive Bayes
  • 8.6 Empirical Comparison of Resampling and Cost-Sensitive Learning
  • 8.6.1 Explaining Empirical Differences Between Resampling and Cost-Sensitive Learning
  • 8.6.2 Naive Bayes Comparisons on Low-Dimensional Gaussian Data
  • 8.6.2.1 Gaussian Naive Bayes on Artificial, Low-Dimensional Data
  • 8.6.2.2 A Note on ROC and AUC
  • 8.6.3 Multinomial Naive Bayes
  • 8.6.4 SVMs
  • 8.6.5 Discussion
  • 8.7 Conclusion
  • Appendix
  • References
  • 9 The Impact of Small Disjuncts on Classifier Learning
  • Gary M. Weiss
  • 9.1 Introduction
  • 9.2 An Example: The Vote Data Set
  • 9.3 Description of Experiments
  • 9.4 The Problem with Small Disjuncts
  • 9.5 The Effect of Pruning on Small Disjuncts
  • 9.6 The Effect of Training Set Size on Small Disjuncts
  • 9.7 The Effect of Noise on Small Disjuncts
  • 9.8 The Effect of Class Imbalance on Small Disjuncts
  • 9.9 Related Work
  • 9.10 Conclusion
  • References
  • Part IV Hybrid Data Mining Procedures
  • 10 Predicting Customer Loyalty Labels in a Large Retail Database: A Case Study in Chile
  • Cristián J. Figueroa
  • 10.1 Introduction
  • 10.2 Related Work
  • 10.3 Objectives of the Study
  • 10.3.1 Supervised and Unsupervised Learning
  • 10.3.2 Unsupervised Algorithms
  • 10.3.2.1 Self-Organizing Map
  • 10.3.2.2 Sammon Mapping
  • 10.3.2.3 Curvilinear Component Analysis
  • 10.3.3 Variables for Segmentation
  • 10.3.4 Exploratory Data Analysis
  • 10.3.5 Results of the Segmentation
  • 10.4 Results of the Classifier
  • 10.5 Business Validation
  • 10.5.1 In-Store Minutes Charges for Prepaid Cell Phones
  • 10.5.2 Distribution of Products in the Store
  • 10.6 Conclusions and Discussion
  • Appendix
  • References
  • 11 PCA-Based Time Series Similarity Search
  • Leonidas Karamitopoulos, Georgios Evangelidis, and Dimitris Dervos
  • 11.1 Introduction
  • 11.2 Background
  • 11.2.1 Review of PCA
  • 11.2.2 Implications of PCA in Similarity Search
  • 11.2.3 Related Work
  • 11.3 Proposed Approach
  • 11.4 Experimental Methodology
  • 11.4.1 Data Sets
  • 11.4.2 Evaluation Methods
  • 11.4.3 Rival Measures
  • 11.5 Results
  • 11.5.1 1-NN Classification
  • 11.5.2 k-NN Similarity Search
  • 11.5.3 Speeding Up the Calculation of APEdist
  • 11.6 Conclusion
  • References
  • 12 Evolutionary Optimization of Least-Squares Support Vector Machines
  • Arjan Gijsberts, Giorgio Metta, and Léon Rothkrantz
  • 12.1 Introduction
  • 12.2 Kernel Machines
  • 12.2.1 Least-Squares Support Vector Machines
  • 12.2.2 Kernel Functions
  • 12.2.2.1 Conditions for Kernels
  • 12.3 Evolutionary Computation
  • 12.3.1 Genetic Algorithms
  • 12.3.2 Evolution Strategies
  • 12.3.3 Genetic Programming
  • 12.4 Related Work
  • 12.4.1 Hyperparameter Optimization
  • 12.4.2 Combined Kernel Functions
  • 12.5 Evolutionary Optimization of Kernel Machines
  • 12.5.1 Hyperparameter Optimization
  • 12.5.2 Kernel Construction
  • 12.5.3 Objective Function
  • 12.6 Results
  • 12.6.1 Data Sets
  • 12.6.2 Results for Hyperparameter Optimization
  • 12.6.3 Results for EvoKMGP
  • 12.7 Conclusions and Future Work
  • References
  • 13 Genetically Evolved kNN Ensembles
  • Ulf Johansson, Rikard König, and Lars Niklasson
  • 13.1 Introduction
  • 13.2 Background and Related Work
  • 13.3 Method
  • 13.3.1 Data Sets
  • 13.4 Results
  • 13.5 Conclusions
  • References
  • Part V Web Mining
  • 14 Behaviorally Founded Recommendation Algorithm for Browsing Assistance Systems
  • Peter Géczy, Noriaki Izumi, Shotaro Akaho, and Kôiti Hasida
  • 14.1 Introduction
  • 14.1.1 Related Works
  • 14.1.2 Our Contribution and Approach
  • 14.2 Concept Formalization
  • 14.3 System Design
  • 14.3.1 A Priori Knowledge of Human–System Interactions
  • 14.3.2 Strategic Design Factors
  • 14.3.3 Recommendation Algorithm Derivation
  • 14.4 Practical Evaluation
  • 14.4.1 Intranet Portal
  • 14.4.2 System Evaluation
  • 14.4.3 Practical Implications and Limitations
  • 14.5 Conclusions and Future Work
  • References
  • 15 Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis
  • Scott Ryan and Lutz Hamel
  • 15.1 Introduction
  • 15.2 Method
  • 15.2.1 Hypotheses and Goals
  • 15.2.2 General Methodology
  • 15.2.3 The 2006 Congressional and Gubernatorial Elections
  • 15.2.4 Sporting Events and Reality Television Programs
  • 15.2.5 Movie Box Office Receipts and Music Sales
  • 15.2.6 Replication
  • 15.3 Results and Discussion
  • 15.3.1 The 2006 Congressional and Gubernatorial Elections
  • 15.3.2 Sporting Events and Reality Television Programs
  • 15.3.3 Movie and Music Album Results
  • 15.4 Conclusion
  • References
  • Part VI Privacy-Preserving Data Mining
  • 16 Avoiding Attribute Disclosure with the (Extended) p-Sensitive k-Anonymity Model
  • Traian Marius Truta and Alina Campan
  • 16.1 Introduction
  • 16.2 Privacy Models and Algorithms
  • 16.2.1 The p-Sensitive k-Anonymity Model and Its Extension
  • 16.2.2 Algorithms for the p-Sensitive k-Anonymity Model
  • 16.3 Experimental Results
  • 16.3.1 Experiments for p-Sensitive k-Anonymity
  • 16.3.2 Experiments for Extended p-Sensitive k-Anonymity
  • 16.4 New Enhanced Models Based on p-Sensitive k-Anonymity
  • 16.4.1 Constrained p-Sensitive k-Anonymity
  • 16.4.2 p-Sensitive k-Anonymity in Social Networks
  • 16.5 Conclusions and Future Work
  • References
  • 17 Privacy-Preserving Random Kernel Classification of Checkerboard Partitioned Data
  • Olvi L. Mangasarian and Edward W. Wild
  • 17.1 Introduction
  • 17.2 Privacy-Preserving Linear Classifier for Checkerboard Partitioned Data
  • 17.3 Privacy-Preserving Nonlinear Classifier for Checkerboard Partitioned Data
  • 17.4 Computational Results
  • 17.5 Conclusion and Outlook
  • References

Additional information

Choose an option

E-book rental for 365 days, E-book rental for 120 days, E-book rental for 90 days, E-book rental for 30 days, E-book rental for 60 days, E-book rental for 180 days, E-book to own

Reviews

There are no reviews yet.

Be the first to review “Data Mining”

Your email address will not be published. Required fields are marked *
