Data Mining

Author Robert Stahlbock; Sven F. Crone; Stefan Lessmann

Publisher Springer Nature

Format Page Fidelity

Print ISBN 9781441912794

Edition 1

Publication year 2010

6.090 kr.

Description

Table of Contents

  • Preface
  • Contents
  • 1 Data Mining and Information Systems: Quo Vadis?
  • Robert Stahlbock, Stefan Lessmann, and Sven F. Crone
  • 1.1 Introduction
  • 1.2 Special Issues in Data Mining
  • 1.2.1 Confirmatory Data Analysis
  • 1.2.2 Knowledge Discovery from Supervised Learning
  • 1.2.3 Classification Analysis
  • 1.2.4 Hybrid Data Mining Procedures
  • 1.2.5 Web Mining
  • 1.2.6 Privacy-Preserving Data Mining
  • 1.3 Conclusion and Outlook
  • References
  • Part I Confirmatory Data Analysis
  • 2 Response-Based Segmentation Using Finite Mixture Partial Least Squares
  • Christian M. Ringle, Marko Sarstedt, and Erik A. Mooi
  • 2.1 Introduction
  • 2.1.1 On the Use of PLS Path Modeling
  • 2.1.2 Problem Statement
  • 2.1.3 Objectives and Organization
  • 2.2 Partial Least Squares Path Modeling
  • 2.3 Finite Mixture Partial Least Squares Segmentation
  • 2.3.1 Foundations
  • 2.3.2 Methodology
  • 2.3.3 Systematic Application of FIMIX-PLS
  • 2.4 Application of FIMIX-PLS
  • 2.4.1 On Measuring Customer Satisfaction
  • 2.4.2 Data and Measures
  • 2.4.3 Data Analysis and Results
  • 2.5 Summary and Conclusion
  • References
  • Part II Knowledge Discovery from Supervised Learning
  • 3 Building Acceptable Classification Models
  • David Martens and Bart Baesens
  • 3.1 Introduction
  • 3.2 Comprehensibility of Classification Models
  • 3.2.1 Measuring Comprehensibility
  • 3.2.2 Obtaining Comprehensible Classification Models
  • 3.2.2.1 Building Rule-Based Models
  • 3.2.2.2 Combining Output Types
  • 3.2.2.3 Visualization
  • 3.3 Justifiability of Classification Models
  • 3.3.1 Taxonomy of Constraints
  • 3.3.2 Monotonicity Constraint
  • 3.3.3 Measuring Justifiability
  • 3.3.4 Obtaining Justifiable Classification Models
  • 3.4 Conclusion
  • References
  • 4 Mining Interesting Rules Without Support Requirement: A General Universal Existential Upward Closure
  • Yannick Le Bras, Philippe Lenca, and Stéphane Lallich
  • 4.1 Introduction
  • 4.2 State of the Art
  • 4.3 An Algorithmic Property of Confidence
  • 4.3.1 On UEUC Framework
  • 4.3.2 The UEUC Property
  • 4.3.3 An Efficient Pruning Algorithm
  • 4.3.4 Generalizing the UEUC Property
  • 4.4 A Framework for the Study of Measures
  • 4.4.1 Adapted Functions of Measure
  • 4.4.1.1 Association Rules
  • 4.4.1.2 Contingency Tables
  • 4.4.2 Expression of a Set of Measures of Ddconf
  • 4.5 Conditions for GUEUC
  • 4.5.1 A Sufficient Condition
  • 4.5.2 A Necessary Condition
  • 4.5.3 Classification of the Measures
  • 4.6 Conclusion
  • References
  • 5 Classification Techniques and Error Control in Logic Mining
  • Giovanni Felici, Bruno Simeone, and Vincenzo Spinelli
  • 5.1 Introduction
  • 5.2 Brief Introduction to Box Clustering
  • 5.3 BC-Based Classifier
  • 5.4 Best Choice of a Box System
  • 5.5 Bi-criterion Procedure for BC-Based Classifier
  • 5.6 Examples
  • 5.6.1 The Data Sets
  • 5.6.2 Experimental Results with BC
  • 5.6.3 Comparison with Decision Trees
  • 5.7 Conclusions
  • References
  • Part III Classification Analysis
  • 6 An Extended Study of the Discriminant Random Forest
  • Tracy D. Lemmond, Barry Y. Chen, Andrew O. Hatch, and William G. Hanley
  • 6.1 Introduction
  • 6.2 Random Forests
  • 6.3 Discriminant Random Forests
  • 6.3.1 Linear Discriminant Analysis
  • 6.3.2 The Discriminant Random Forest Methodology
  • 6.4 DRF and RF: An Empirical Study
  • 6.4.1 Hidden Signal Detection
  • 6.4.1.1 Training on T1, Testing on J2
  • 6.4.1.2 Prediction Performance for J2 with Cross-validation
  • 6.4.2 Radiation Detection
  • 6.4.3 Significance of Empirical Results
  • 6.4.4 Small Samples and Early Stopping
  • 6.4.5 Expected Cost
  • 6.5 Conclusions
  • References
  • 7 Prediction with the SVM Using Test Point Margins
  • Süreyya Özögür-Akyüz, Zakria Hussain, and John Shawe-Taylor
  • 7.1 Introduction
  • 7.2 Methods
  • 7.3 Data Set Description
  • 7.4 Results
  • 7.5 Discussion and Future Work
  • References
  • 8 Effects of Oversampling Versus Cost-Sensitive Learning for Bayesian and SVM Classifiers
  • Alexander Liu, Cheryl Martin, Brian La Cour, and Joydeep Ghosh
  • 8.1 Introduction
  • 8.2 Resampling
  • 8.2.1 Random Oversampling
  • 8.2.2 Generative Oversampling
  • 8.3 Cost-Sensitive Learning
  • 8.4 Related Work
  • 8.5 A Theoretical Analysis of Oversampling Versus Cost-Sensitive Learning
  • 8.5.1 Bayesian Classification
  • 8.5.2 Resampling Versus Cost-Sensitive Learning in Bayesian Classifiers
  • 8.5.3 Effect of Oversampling on Gaussian Naive Bayes
  • 8.5.3.1 Random Oversampling
  • 8.5.3.2 Generative Oversampling
  • 8.5.3.3 Comparison to Cost-Sensitive Learning
  • 8.5.4 Effects of Oversampling for Multinomial Naive Bayes
  • 8.6 Empirical Comparison of Resampling and Cost-Sensitive Learning
  • 8.6.1 Explaining Empirical Differences Between Resampling and Cost-Sensitive Learning
  • 8.6.2 Naive Bayes Comparisons on Low-Dimensional Gaussian Data
  • 8.6.2.1 Gaussian Naive Bayes on Artificial, Low-Dimensional Data
  • 8.6.2.2 A Note on ROC and AUC
  • 8.6.3 Multinomial Naive Bayes
  • 8.6.4 SVMs
  • 8.6.5 Discussion
  • 8.7 Conclusion
  • Appendix
  • References
  • 9 The Impact of Small Disjuncts on Classifier Learning
  • Gary M. Weiss
  • 9.1 Introduction
  • 9.2 An Example: The Vote Data Set
  • 9.3 Description of Experiments
  • 9.4 The Problem with Small Disjuncts
  • 9.5 The Effect of Pruning on Small Disjuncts
  • 9.6 The Effect of Training Set Size on Small Disjuncts
  • 9.7 The Effect of Noise on Small Disjuncts
  • 9.8 The Effect of Class Imbalance on Small Disjuncts
  • 9.9 Related Work
  • 9.10 Conclusion
  • References
  • Part IV Hybrid Data Mining Procedures
  • 10 Predicting Customer Loyalty Labels in a Large Retail Database: A Case Study in Chile
  • Cristián J. Figueroa
  • 10.1 Introduction
  • 10.2 Related Work
  • 10.3 Objectives of the Study
  • 10.3.1 Supervised and Unsupervised Learning
  • 10.3.2 Unsupervised Algorithms
  • 10.3.2.1 Self-Organizing Map
  • 10.3.2.2 Sammon Mapping
  • 10.3.2.3 Curvilinear Component Analysis
  • 10.3.3 Variables for Segmentation
  • 10.3.4 Exploratory Data Analysis
  • 10.3.5 Results of the Segmentation
  • 10.4 Results of the Classifier
  • 10.5 Business Validation
  • 10.5.1 In-Store Minutes Charges for Prepaid Cell Phones
  • 10.5.2 Distribution of Products in the Store
  • 10.6 Conclusions and Discussion
  • Appendix
  • References
  • 11 PCA-Based Time Series Similarity Search
  • Leonidas Karamitopoulos, Georgios Evangelidis, and Dimitris Dervos
  • 11.1 Introduction
  • 11.2 Background
  • 11.2.1 Review of PCA
  • 11.2.2 Implications of PCA in Similarity Search
  • 11.2.3 Related Work
  • 11.3 Proposed Approach
  • 11.4 Experimental Methodology
  • 11.4.1 Data Sets
  • 11.4.2 Evaluation Methods
  • 11.4.3 Rival Measures
  • 11.5 Results
  • 11.5.1 1-NN Classification
  • 11.5.2 k-NN Similarity Search
  • 11.5.3 Speeding Up the Calculation of APEdist
  • 11.6 Conclusion
  • References
  • 12 Evolutionary Optimization of Least-Squares Support Vector Machines
  • Arjan Gijsberts, Giorgio Metta, and Léon Rothkrantz
  • 12.1 Introduction
  • 12.2 Kernel Machines
  • 12.2.1 Least-Squares Support Vector Machines
  • 12.2.2 Kernel Functions
  • 12.2.2.1 Conditions for Kernels
  • 12.3 Evolutionary Computation
  • 12.3.1 Genetic Algorithms
  • 12.3.2 Evolution Strategies
  • 12.3.3 Genetic Programming
  • 12.4 Related Work
  • 12.4.1 Hyperparameter Optimization
  • 12.4.2 Combined Kernel Functions
  • 12.5 Evolutionary Optimization of Kernel Machines
  • 12.5.1 Hyperparameter Optimization
  • 12.5.2 Kernel Construction
  • 12.5.3 Objective Function
  • 12.6 Results
  • 12.6.1 Data Sets
  • 12.6.2 Results for Hyperparameter Optimization
  • 12.6.3 Results for EvoKMGP
  • 12.7 Conclusions and Future Work
  • References
  • 13 Genetically Evolved kNN Ensembles
  • Ulf Johansson, Rikard König, and Lars Niklasson
  • 13.1 Introduction
  • 13.2 Background and Related Work
  • 13.3 Method
  • 13.3.1 Data Sets
  • 13.4 Results
  • 13.5 Conclusions
  • References
  • Part V Web Mining
  • 14 Behaviorally Founded Recommendation Algorithm for Browsing Assistance Systems
  • Peter Géczy, Noriaki Izumi, Shotaro Akaho, and Kôiti Hasida
  • 14.1 Introduction
  • 14.1.1 Related Works
  • 14.1.2 Our Contribution and Approach
  • 14.2 Concept Formalization
  • 14.3 System Design
  • 14.3.1 A Priori Knowledge of Human–System Interactions
  • 14.3.2 Strategic Design Factors
  • 14.3.3 Recommendation Algorithm Derivation
  • 14.4 Practical Evaluation
  • 14.4.1 Intranet Portal
  • 14.4.2 System Evaluation
  • 14.4.3 Practical Implications and Limitations
  • 14.5 Conclusions and Future Work
  • References
  • 15 Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis
  • Scott Ryan and Lutz Hamel
  • 15.1 Introduction
  • 15.2 Method
  • 15.2.1 Hypotheses and Goals
  • 15.2.2 General Methodology
  • 15.2.3 The 2006 Congressional and Gubernatorial Elections
  • 15.2.4 Sporting Events and Reality Television Programs
  • 15.2.5 Movie Box Office Receipts and Music Sales
  • 15.2.6 Replication
  • 15.3 Results and Discussion
  • 15.3.1 The 2006 Congressional and Gubernatorial Elections
  • 15.3.2 Sporting Events and Reality Television Programs
  • 15.3.3 Movie and Music Album Results
  • 15.4 Conclusion
  • References
  • Part VI Privacy-Preserving Data Mining
  • 16 Avoiding Attribute Disclosure with the (Extended) p-Sensitive k-Anonymity Model
  • Traian Marius Truta and Alina Campan
  • 16.1 Introduction
  • 16.2 Privacy Models and Algorithms
  • 16.2.1 The p-Sensitive k-Anonymity Model and Its Extension
  • 16.2.2 Algorithms for the p-Sensitive k-Anonymity Model
  • 16.3 Experimental Results
  • 16.3.1 Experiments for p-Sensitive k-Anonymity
  • 16.3.2 Experiments for Extended p-Sensitive k-Anonymity
  • 16.4 New Enhanced Models Based on p-Sensitive k-Anonymity
  • 16.4.1 Constrained p-Sensitive k-Anonymity
  • 16.4.2 p-Sensitive k-Anonymity in Social Networks
  • 16.5 Conclusions and Future Work
  • References
  • 17 Privacy-Preserving Random Kernel Classification of Checkerboard Partitioned Data
  • Olvi L. Mangasarian and Edward W. Wild
  • 17.1 Introduction
  • 17.2 Privacy-Preserving Linear Classifier for Checkerboard Partitioned Data
  • 17.3 Privacy-Preserving Nonlinear Classifier for Checkerboard Partitioned Data
  • 17.4 Computational Results
  • 17.5 Conclusion and Outlook
  • References

Additional information

Choose an option

E-book rental for 365 days, E-book rental for 120 days, E-book rental for 90 days, E-book rental for 30 days, E-book rental for 60 days, E-book rental for 180 days, E-book to own

Reviews

There are no reviews yet.

Be the first to review “Data Mining”

Your email address will not be published. Required fields are marked *
