Usage Note 46266: Using non-integer Frequency variable values as weights in SAS® Enterprise Miner™
There is no standard practice for implementing weight variables in a data mining problem. Therefore, SAS Enterprise Miner does not support the use of Weight variables. The following excerpt from SAS Enterprise Miner Help discusses treating a Frequency variable with non-integer values as a Weight variable. You can find this content in either of two ways:
For more information, see the Predictive Modeling chapter in SAS Enterprise Miner Help.
The Frequency Variable and Weighted Estimation
All of the modeling nodes allow you to specify a frequency variable. Typically, the values of the frequency variable are nonnegative integers. The data are treated as if each case were replicated as many times as the value of the frequency variable.
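The replication interpretation can be checked with a small sketch. This is plain Python rather than Enterprise Miner code, and the data values and variable names are made up for illustration. It compares a frequency-weighted mean with the mean of a data set in which each case is physically repeated.

```python
# Hypothetical toy data: target values and their integer frequencies.
values = [2.0, 5.0, 9.0]
freqs = [3, 1, 2]

# Frequency-weighted mean: sum(f * y) / sum(f).
weighted_mean = sum(f * y for f, y in zip(freqs, values)) / sum(freqs)

# The same statistic, computed by repeating each case f times.
replicated = [y for f, y in zip(freqs, values) for _ in range(f)]
replicated_mean = sum(replicated) / len(replicated)

print(weighted_mean, replicated_mean)
```

Both computations should agree, which is the sense in which a frequency value stands in for repeated cases.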
Unlike most SAS procedures, the modeling nodes in Enterprise Miner accept values for a frequency variable that are not integers without truncating the fractional part. Thus, you can use a frequency variable to perform weighted analyses. However, Enterprise Miner does not provide explicit support for sampling weights, noise-variance weights, or other analyses where the weight variable does not represent the frequency of occurrence of each case.
If the frequency variable represents sampling weights or noise-variance weights, the point estimates of regression coefficients and neural network weights will be valid. But if the frequency variable does not represent actual frequencies, then standard errors, significance tests, and statistics such as MSE, AIC, and SBC may be invalid.
If you want to do weighted estimation under the usual assumption for weighted least-squares that the weights are inversely proportional to the noise variance (error variance) of the target variable, you can obtain statistically correct results by specifying frequency values that add up to the sample size.
If you want to use sampling weights that are inversely proportional to the sampling probability of each case, you can get approximate estimates for MSE and related statistics in the Regression and Neural Network nodes by specifying frequencies that add up to the effective sample size. A pessimistic approximation to the effective sample size is provided by
[sum(w(i))]^2 / sum(w(i)^2),
where w(i) is a sampling weight for case i.
This trick will not work properly with the Tree node.
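The rescaling described in the excerpt can be sketched outside of Enterprise Miner. The following is a minimal Python illustration, not SAS code. The function names and the sample weights are hypothetical. It computes the approximate effective sample size from the formula above and then scales a set of weights so that they add up to a chosen total.

```python
def effective_sample_size(weights):
    """Approximate effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def rescale_to_total(weights, target_total):
    """Scale weights proportionally so that they add up to target_total."""
    factor = target_total / sum(weights)
    return [w * factor for w in weights]


# Hypothetical sampling weights for four cases.
sampling_weights = [1.0, 1.0, 2.0, 4.0]

# Sampling weights: frequencies that add up to the effective sample size.
n_eff = effective_sample_size(sampling_weights)
freq_for_sampling = rescale_to_total(sampling_weights, n_eff)

# Noise-variance weights: frequencies that add up to the number of cases.
freq_for_wls = rescale_to_total(sampling_weights, len(sampling_weights))

print(n_eff, freq_for_sampling, freq_for_wls)
```

With equal weights the formula returns the actual number of cases. Unequal weights give a smaller value, which is why the excerpt calls the approximation pessimistic. The rescaled values would then be supplied as the frequency variable.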
Operating System and Release Information
SAS System, SAS Enterprise Miner:
| z/OS | 6.1 | | | |
| Microsoft® Windows® for 64-Bit Itanium-based Systems | 6.1 | | | |
| Microsoft Windows Server 2003 Datacenter 64-bit Edition | 6.1 | | | |
| Microsoft Windows Server 2003 Enterprise 64-bit Edition | 6.1 | | | |
| Microsoft Windows XP 64-bit Edition | 6.1 | | | |
| Microsoft® Windows® for x64 | 6.1 | | | |
| Microsoft Windows 95/98 | 6.1 | | | |
| Microsoft Windows 2000 Advanced Server | 6.1 | | | |
| Microsoft Windows 2000 Datacenter Server | 6.1 | | | |
| Microsoft Windows 2000 Server | 6.1 | | | |
| Microsoft Windows 2000 Professional | 6.1 | | | |
| Microsoft Windows NT Workstation | 6.1 | | | |
| Microsoft Windows Server 2003 Datacenter Edition | 6.1 | | | |
| Microsoft Windows Server 2003 Enterprise Edition | 6.1 | | | |
| Microsoft Windows Server 2003 Standard Edition | 6.1 | | | |
| Microsoft Windows Server 2003 for x64 | 6.1 | | | |
| Microsoft Windows Server 2008 | 6.1 | | | |
| Microsoft Windows Server 2008 for x64 | 6.1 | | | |
| Microsoft Windows XP Professional | 6.1 | | | |
| Windows 7 Enterprise 32 bit | 6.1 | | | |
| Windows 7 Enterprise x64 | 6.1 | | | |
| Windows 7 Home Premium 32 bit | 6.1 | | | |
| Windows 7 Home Premium x64 | 6.1 | | | |
| Windows 7 Professional 32 bit | 6.1 | | | |
| Windows 7 Professional x64 | 6.1 | | | |
| Windows 7 Ultimate 32 bit | 6.1 | | | |
| Windows 7 Ultimate x64 | 6.1 | | | |
| Windows Millennium Edition (Me) | 6.1 | | | |
| Windows Vista | 6.1 | | | |
| Windows Vista for x64 | 6.1 | | | |
| 64-bit Enabled AIX | 6.1 | | | |
| 64-bit Enabled HP-UX | 6.1 | | | |
| 64-bit Enabled Solaris | 6.1 | | | |
| HP-UX IPF | 6.1 | | | |
| Linux | 6.1 | | | |
| Linux for x64 | 6.1 | | | |
| Solaris for x64 | 6.1 | | | |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
| Date Modified: | 2012-04-10 15:27:11 |
| Date Created: | 2012-04-10 14:46:12 |