Data Preparation for Data Mining
This course introduces programming techniques that analysts use to transform raw data into a form suitable for predictive modeling. It uses SAS programming extensively.
Learn how to
- extract relevant data
- transform transactions or event data
- use non-numeric data, including controlling degrees of freedom
- manage exceptions and extremes.
Who should attend
Data mining and IT professionals with SAS DATA step programming experience
Prerequisites
This course assumes some experience in both data mining and SAS programming.
Course Contents
Introduction
- raw data structures
- predictive modeling data structure
- overview of data preparation challenges
Extracting Relevant Data
- data difficulties
- assessing available data
- accessing available data
- drawing a representative target sample
- drawing an uncontaminated input sample
Transforming Transactions or Event Data
- advantages and disadvantages of transactions data
- common transaction structures
- defining the time horizon
- fixed and variable time horizon methods
- implementing common transaction transformations
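As an illustration of the fixed time horizon idea listed above, the following is a minimal sketch, not taken from the course materials. The course itself uses SAS; Python is used here only for illustration. The record layout, the function name rollup_fixed_horizon, and the sample values are all hypothetical. The sketch rolls transaction rows up to one row per customer, using only transactions that fall in a fixed window ending just before a reference date.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical transaction records: (customer_id, transaction_date, amount).
transactions = [
    ("A", date(2023, 1, 5), 20.0),
    ("A", date(2023, 3, 10), 35.0),
    ("A", date(2023, 6, 1), 50.0),    # falls on the reference date, so excluded
    ("B", date(2022, 11, 20), 15.0),  # falls before the window opens, so excluded
    ("B", date(2023, 4, 2), 40.0),
]

def rollup_fixed_horizon(rows, reference_date, horizon_days):
    """Summarize each customer's transactions in the window
    [reference_date - horizon_days, reference_date)."""
    window_start = reference_date - timedelta(days=horizon_days)

    # Keep only in-window transactions, grouped by customer.
    grouped = defaultdict(list)
    for customer_id, tx_date, amount in rows:
        if window_start <= tx_date < reference_date:
            grouped[customer_id].append((tx_date, amount))

    # One output row per customer: count, total amount, and recency.
    features = {}
    for customer_id, items in grouped.items():
        last_date = max(d for d, _ in items)
        features[customer_id] = {
            "count": len(items),
            "total": sum(a for _, a in items),
            "recency_days": (reference_date - last_date).days,
        }
    return features

features = rollup_fixed_horizon(transactions, date(2023, 6, 1), horizon_days=180)
```

Excluding the reference date itself keeps information from the prediction period out of the inputs. Customers with no transactions in the window do not appear in the result, so a real preparation step would also need to decide how to represent them.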
Using Non-Numeric Data
- definitions and difficulties of non-numeric data
- miscoding and multicoding detection
- controlling degrees of freedom
- geocoding
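One common way to control the degrees of freedom of a categorical input is to collapse rarely occurring levels into a single catch-all level. The sketch below is an assumption about what such a step might look like, not the course's method. It is written in Python for illustration although the course uses SAS, and the function name collapse_rare_levels, the threshold, and the sample data are hypothetical.

```python
from collections import Counter

def collapse_rare_levels(values, min_count, other_label="OTHER"):
    """Replace every level seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

# Hypothetical categorical input with two levels that occur only once.
occupations = ["nurse", "teacher", "nurse", "pilot", "teacher", "nurse", "welder"]
collapsed = collapse_rare_levels(occupations, min_count=2)

# The number of distinct levels drops from 4 to 3.
levels_before = len(set(occupations))
levels_after = len(set(collapsed))
```

Fewer distinct levels means fewer dummy variables, and therefore fewer parameters for a model to estimate from the same data.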
Managing Exceptions and Extremes
- difficulties with outliers, missing and non-applicable values, and extreme distributions
- detection of exceptions and extremes
- remedies for exceptional and extreme values
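Two widely used remedies for missing and extreme values are imputation with a missing-value indicator and capping at fixed bounds. The sketch below shows one possible combination of the two. It is not drawn from the course materials, it uses Python only for illustration (the course uses SAS), and the function name impute_and_cap, the bounds, and the sample values are hypothetical.

```python
from statistics import median

def impute_and_cap(values, lower, upper):
    """Fill missing values (None) with the median of the observed values,
    flag which values were missing, and cap the results to [lower, upper]."""
    observed = [v for v in values if v is not None]
    fill = median(observed)

    # The indicator preserves the fact that a value was missing.
    missing_flag = [1 if v is None else 0 for v in values]

    cleaned = []
    for v in values:
        x = fill if v is None else v
        cleaned.append(min(max(x, lower), upper))
    return cleaned, missing_flag

# Hypothetical input with one missing value and one extreme value.
incomes = [42.0, None, 38.0, 950.0, 51.0]
cleaned, missing_flag = impute_and_cap(incomes, lower=0.0, upper=200.0)
```

The median is used as the fill value because, unlike the mean, it is not pulled upward by the extreme value of 950.0.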
Software
This course addresses Base SAS, SAS Enterprise Miner, and SAS/STAT.
Course Materials
Students receive a hardcopy of the course notes and, in some courses, can choose to take home a copy of the course data.
Not currently scheduled. Available for on-site training, or can be scheduled at any SAS training facility if demand warrants.