On-site Pricing

Course fee and EPTO/APTO units differ for on-site training.

Data Mining Techniques: Theory and Practice

Business Knowledge Series course

Duration: 3.0 days
CEUs: 1.8
Available for on-site training or can be scheduled at any SAS training facility if demand warrants.

Presented by Michael J. A. Berry or Gordon S. Linoff, founders of Data Miners, Inc. and co-authors of Data Mining Techniques and Mastering Data Mining

Explore the inner workings of data mining techniques and how to make them work for you. Students are taken through all the steps of a data mining project, beginning with problem definition and data selection, and continuing through data exploration, data transformation, sampling, partitioning, modeling, and assessment.

Learn how to

Who should attend

Business analysts, their managers, and statisticians

Prerequisites
No prior knowledge of statistical or data mining tools is required.
Course Contents
Introduction to Data Mining
  • what is data mining?
  • directed and undirected data mining
  • models
  • profiling and prediction
Data Mining Methodology
  • why have a methodology?
  • how data miners can inadvertently learn things that are not true
  • translating business problems into data mining problems
  • the importance of model stability
  • finding the right input variables
  • sampling to create balanced model sets
  • partitioning to create training, validation, and test sets
  • data preparation
  • model assessment
Data Exploration
  • developing intuition about data
  • data structure
  • data types
  • data values
  • exploring distributions
  • summary statistics
  • histograms
  • using SAS Enterprise Miner for data exploration
Statistics and Regression
  • the null hypothesis
  • statistical significance
  • confidence bounds
  • variance and standard deviation
  • standardized values
  • correlation
  • linear regression
  • logistic regression
  • using SAS Enterprise Miner to build regression models
Decision Trees
  • decision trees as data exploration and classification tools
  • decision trees for modeling and scoring
  • decision trees for variable selection
  • alternate representations of decision trees
  • algorithms used to build decision trees
  • splitting criteria
  • recognizing instability and overfitting in decision tree models
  • capturing interactions between variables
  • using SAS Enterprise Miner to build decision trees
Neural Networks
  • origins of neural networks
  • neural networks compared with regression
  • the algorithms used to train neural networks
  • data preparation requirements for neural networks
  • picking appropriate inputs for neural networks
  • creating neural network models using SAS Enterprise Miner
Memory-Based Reasoning
  • similarity and distance
  • distance metrics appropriate for different kinds of data
  • the role of the training set in MBR
  • combining the votes of several neighbors
  • other K-nearest neighbor techniques
  • collaborative filtering
  • using the SAS Enterprise Miner MBR node
Clustering
  • more on similarity and distance
  • the K-means algorithm
  • divisive clustering
  • agglomerative clustering
  • data preparation for clustering
  • interpreting clusters
  • finding clusters with SAS Enterprise Miner
Survival Analysis
  • origins of survival analysis
  • how business data is different from clinical data
  • hazards and hazard charts
  • retention curves and survival curves
  • calculating survival from retention
  • calculating hazards empirically
  • parametric hazard models
  • censoring
  • competing risks
  • survival-based forecasting
  • using SAS code in SAS Enterprise Miner to create survival curves
Miscellaneous Techniques
  • link analysis
  • genetic algorithms
  • association rules
  • using SAS Enterprise Miner to discover associations in retail data
Putting Data Mining Techniques to Work
  • formulating the business problem as a data mining problem
  • finding the tool that fits the problem
Software
This course addresses SAS Enterprise Miner.
Course Materials
Students receive a hard copy of the course notes and, in some courses, can choose to take home a copy of the course data.

Not currently scheduled.

