About the Tasks That You Will Perform

You have already set up the project and defined the input data source that you will use in this example. Now, you will import the data and perform the following tasks, which help you learn the properties of the input data and prepare it for subsequent modeling:
  1. You will explore the statistical properties of the variables in the input data set. The results that are generated in this step will give you an idea of which variables are most useful in predicting the target response (whether a person donates or not) in this data set.
  2. You will partition the data into two data sets, a training data set and a validation data set. Such partitioning is common practice in data mining: the model is fitted to the training data and then checked against the validation data, which helps you develop a model that is not overfitted to a particular set of data.
  3. You will specify how SAS Enterprise Miner should handle missing values of predictor variables.
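To make the idea behind task 2 concrete, the following is a minimal conceptual sketch in Python. It is not how SAS Enterprise Miner performs partitioning, which is done through its own interface. The function name `partition`, the 70/30 split, the seed, and the `id` and `donated` fields are all assumptions chosen for illustration.

```python
import random


def partition(rows, train_fraction=0.7, seed=12345):
    """Shuffle a copy of rows and split it into (training, validation).

    train_fraction and seed are illustrative defaults, not values
    taken from SAS Enterprise Miner.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical input: 100 records with a binary target.
rows = [{"id": i, "donated": i % 4 == 0} for i in range(100)]
train, valid = partition(rows)
print(len(train), len(valid))
```

Because the two sets share no records, performance measured on the validation set reflects how the model behaves on data it was not fitted to.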
Tip
It is always a good idea to plot the input data and to check it for missing values before you proceed to model building. Knowing the statistical properties of your input data is essential for building an accurate and robust predictive model.
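As a rough illustration of the missing-value check described in the tip, the sketch below counts missing values per variable and fills them with the mean of the observed values. This is only one possible handling strategy and is an assumption here, not a description of what SAS Enterprise Miner does. The helper names `missing_counts` and `mean_impute` and the `age` and `income` variables are hypothetical.

```python
from statistics import mean


def missing_counts(rows, variables):
    """Return {variable: number of rows where it is None or absent}."""
    return {v: sum(1 for r in rows if r.get(v) is None) for v in variables}


def mean_impute(rows, variable):
    """Return new rows with missing values of one numeric variable
    replaced by the mean of its observed values."""
    observed = [r[variable] for r in rows if r.get(variable) is not None]
    if not observed:
        # Nothing to base a mean on; return unchanged copies.
        return [dict(r) for r in rows]
    fill = mean(observed)
    return [
        dict(r, **{variable: fill if r.get(variable) is None else r[variable]})
        for r in rows
    ]


# Hypothetical records; None marks a missing value.
records = [
    {"age": 30, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 50, "income": None},
]
counts = missing_counts(records, ["age", "income"])
imputed = mean_impute(records, "age")
print(counts)
print([r["age"] for r in imputed])
```

Counting missing values first shows which variables are affected and by how much, which informs the choice of handling strategy before any model is built.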