| Number of Input Variables | Number of Observations Processed |
|---|---|
| <100 | 80,000 |
| 100–200 | 40,000 |
| >200 | 20,000 |

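
The lookup in the table above can be sketched as a small function. This is a minimal illustration only; the name `observations_processed` is hypothetical and not part of any product API.

```python
def observations_processed(n_input_variables: int) -> int:
    """Return the number of observations processed for a given
    number of input variables, following the table above."""
    if n_input_variables < 100:
        return 80_000
    if n_input_variables <= 200:  # the 100-200 band, inclusive
        return 40_000
    return 20_000  # more than 200 input variables
```

For example, a data set with 150 input variables falls in the 100–200 band, so 40,000 observations would be processed.
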
| Condition | Rare Event: Yes | Rare Event: No |
|---|---|---|
| total number of observations < number of observations being processed OR total number of events < (0.10 * number of observations being processed) | Sample the data so that there is a 10:1 ratio of non-events to events. | no sampling |
| total number of events > (0.10 * number of observations being processed) | Sample the following proportion of the rare events: | stratified sampling |

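
The decision rules in the table above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the product's implementation: the names `SamplingAction` and `choose_sampling` are hypothetical, the first row is checked before the second when both conditions hold, and the case where the number of events exactly equals 10% of the observations being processed (which the table does not cover) is treated like the second row. The sketch only selects an action; it does not compute the rare-event sampling proportion.

```python
from enum import Enum


class SamplingAction(Enum):
    RATIO_10_TO_1 = "sample to a 10:1 ratio of non-events to events"
    NO_SAMPLING = "no sampling"
    RARE_EVENT_PROPORTION = "sample a proportion of the rare events"
    STRATIFIED = "stratified sampling"


def choose_sampling(total_obs: int, total_events: int,
                    n_processed: int, rare_event: bool) -> SamplingAction:
    """Pick a sampling action from the table's two condition rows."""
    threshold = 0.10 * n_processed

    # Row 1: fewer observations than would be processed, or few events.
    if total_obs < n_processed or total_events < threshold:
        if rare_event:
            return SamplingAction.RATIO_10_TO_1
        return SamplingAction.NO_SAMPLING

    # Row 2: events exceed 10% of the observations being processed.
    # Assumption: equality with the threshold also lands here.
    if rare_event:
        return SamplingAction.RARE_EVENT_PROPORTION
    return SamplingAction.STRATIFIED
```

For example, `choose_sampling(50_000, 1_000, 80_000, rare_event=True)` selects the 10:1 ratio, because the data set has fewer observations than the 80,000 that would be processed.
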
| Name | Age | Gender | Income | Treatment | Purchase |
|---|---|---|---|---|---|
| Ricardo | 29 | M | 33000 | Y | Y |
| Susan | 35 | F | 51000 | Y | N |
| Jeremy | 49 | M | 110000 | N | Y |