Usage Note 22220: Does SAS have procedures for bootstrapping, crossvalidation, and jackknifing?
Some procedures in SAS software implement these methods in the context of the analyses that they perform:
- In SAS/STAT software, PROC MULTTEST can use bootstrap or permutation resampling (see the BOOTSTRAP and PERMUTATION options in the PROC MULTTEST statement) to adjust the p-values for the tests that are computed by the procedure.
- In SAS/STAT software, PROC DISCRIM uses cross validation (see the CROSSVALIDATE, CROSSLIST, and CROSSLISTERR options in the PROC DISCRIM statement) to obtain nearly unbiased estimates of the classification error rates in a discriminant analysis.
- In SAS/STAT software, PROC LOGISTIC uses a one-step approximation of cross validation to obtain predicted probabilities. See the PREDPROBS=CROSSVALIDATE option in the OUTPUT statement and the CTABLE option in the MODEL statement.
- In SAS/STAT software, PROC SURVEYSELECT can draw independent, replicated samples from a data set. Use the REP= option to request multiple bootstrap (with replacement) or permutation (without replacement) resamples, the METHOD= option to select sampling with or without replacement, and the SAMPSIZE= option to control the size of the samples.
- In SAS/STAT software, the GAM, LOESS, and TPSPLINE procedures can use cross validation to choose the smoothing parameter. See the METHOD=GCV option in the MODEL statement of PROC GAM and the SELECT= option in the MODEL statement of PROC LOESS. PROC TPSPLINE uses cross validation by default.
- In SAS/STAT software, PROC PLS enables you to choose the number of extracted factors by cross validation. See the CV= option in the PROC PLS statement.
- In SAS/STAT software, PROC MODECLUS can use likelihood cross validation to choose the smoothing parameter. See the CROSS and CROSSLIST options in the PROC MODECLUS statement.
- In SAS/STAT software, PROC CALIS computes the expected cross validation index (ECVI), which measures how well a model predicts future sample covariances.
- In SAS/STAT software, PROC MI can use bootstrap resampling: it draws a simple random sample with replacement from the input data set to compute the initial estimate or to obtain overdispersed starting values for multiple chains. See the INITIAL=EM(BOOTSTRAP) option in the MCMC statement. In addition, the propensity score method applies an approximate Bayesian bootstrap imputation. See the PROPENSITY option in the MONOTONE statement.
- In SAS/STAT software, PROC GLMSELECT provides leave-one-out and k-fold cross validation for estimating prediction error. Cross validation can be used as the selection criterion for selecting model effects, as a stopping rule for the selection process, and as the criterion for final model determination.
- In SAS/STAT software, PROC QUANTREG implements the Markov chain marginal bootstrap (MCMB) general resampling method of He and Hu (2002) to provide confidence intervals for regression quantile estimates. These intervals, available with the CI=RESAMPLING option in the PROC QUANTREG statement, provide some robustness to heteroscedasticity.
- In SAS/STAT software, beginning in SAS 9.2, jackknife variance estimation is available in all of the survey data analysis procedures (SURVEYMEANS, SURVEYREG, SURVEYLOGISTIC, SURVEYPHREG, and SURVEYFREQ). See the VARMETHOD=JACKKNIFE option in the PROC statement of each procedure.
- In SAS/Genetics software, PROC HAPLOTYPE uses jackknifing to estimate the standard errors of haplotype frequencies.
- In SAS/Genetics software, PROC ALLELE can compute bootstrap confidence intervals for estimates of allele frequencies and disequilibrium coefficients. See the BOOTSTRAP= option in the PROC ALLELE statement.
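As a language-neutral illustration of what the cross validation options in the list above estimate, the following Python sketch computes a k-fold cross validation estimate of prediction error for a one-predictor least squares fit. It is not SAS code and does not reproduce the algorithm of any procedure. The function names fit_line and kfold_cv_mse, the sample data, and the random assignment of observations to folds are assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kfold_cv_mse(xs, ys, k, seed=0):
    """Mean squared prediction error estimated by k-fold cross validation."""
    n = len(xs)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)           # random fold assignment (an assumption)
    folds = [idx[i::k] for i in range(k)]      # k disjoint folds covering every observation
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training folds only, then predict the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return sq_err / n


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2, 6.8, 8.1]

cv4 = kfold_cv_mse(xs, ys, k=4)                # 4-fold cross validation
loo = kfold_cv_mse(xs, ys, k=len(xs))          # leave-one-out: one observation per fold
print(cv4, loo)
```

Setting k equal to the number of observations gives leave-one-out cross validation, in which case the estimate no longer depends on how the observations are shuffled.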
In addition, macros are available that perform bootstrap and jackknife analyses for simple random samples. They compute approximate standard errors, bias-corrected estimates, and confidence intervals that assume a normal sampling distribution. To use these macros, you need to know enough about the SAS macro language to write simple macros.
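The quantities that such macros report for a simple random sample can be sketched in a few lines. The following Python code is only an illustration under stated assumptions; it is not the macros themselves and it is not SAS syntax. The names jackknife, bootstrap, and normal_ci, the sample data, the number of resamples, and the choice to center the normal interval on the bias-corrected estimate are all assumptions made for the example.

```python
import random
import statistics


def jackknife(data, stat):
    """Return (bias-corrected estimate, standard error) from leave-one-out replicates."""
    n = len(data)
    theta = stat(data)
    reps = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    rep_mean = sum(reps) / n
    bias = (n - 1) * (rep_mean - theta)
    se = ((n - 1) / n * sum((r - rep_mean) ** 2 for r in reps)) ** 0.5
    return theta - bias, se


def bootstrap(data, stat, reps=2000, seed=0):
    """Return (bias-corrected estimate, standard error) from with-replacement resamples."""
    rng = random.Random(seed)
    n = len(data)
    theta = stat(data)
    boot = [stat(rng.choices(data, k=n)) for _ in range(reps)]
    bias = sum(boot) / reps - theta
    return theta - bias, statistics.stdev(boot)


def normal_ci(estimate, se, z=1.96):
    """Approximate 95% interval that assumes a normal sampling distribution."""
    return estimate - z * se, estimate + z * se


# Hypothetical simple random sample, for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]

jk_est, jk_se = jackknife(data, statistics.mean)
bt_est, bt_se = bootstrap(data, statistics.median)
print(jk_est, jk_se, normal_ci(jk_est, jk_se))
print(bt_est, bt_se, normal_ci(bt_est, bt_se))
```

For the sample mean, the jackknife bias estimate is zero and the jackknife standard error equals the usual standard error of the mean, which makes the mean a convenient check before you apply either method to a less tractable statistic.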
Operating System and Release Information
| Product Family | Product | System |
| SAS System | SAS/STAT | All |
| SAS System | SAS/Genetics | z/OS |
| | | Microsoft® Windows® for 64-Bit Itanium-based Systems |
| | | Microsoft Windows Server 2003 Datacenter 64-bit Edition |
| | | Microsoft Windows Server 2003 Enterprise 64-bit Edition |
| | | Microsoft Windows XP 64-bit Edition |
| | | Microsoft® Windows® for x64 |
| | | Microsoft Windows 95/98 |
| | | Microsoft Windows 2000 Advanced Server |
| | | Microsoft Windows 2000 Datacenter Server |
| | | Microsoft Windows 2000 Server |
| | | Microsoft Windows 2000 Professional |
| | | Microsoft Windows NT Workstation |
| | | Microsoft Windows Server 2003 Datacenter Edition |
| | | Microsoft Windows Server 2003 Enterprise Edition |
| | | Microsoft Windows Server 2003 Standard Edition |
| | | Microsoft Windows XP Professional |
| | | Windows Millennium Edition (Me) |
| | | Windows Vista |
| | | 64-bit Enabled AIX |
| | | 64-bit Enabled HP-UX |
| | | 64-bit Enabled Solaris |
| | | AIX |
| | | HP-UX |
| | | HP-UX IPF |
| | | Linux |
| | | Linux for x64 |
| | | Linux on Itanium |
| | | OpenVMS Alpha |
| | | OpenVMS on HP Integrity |
| | | Solaris |
| | | Solaris for x64 |
| | | Tru64 UNIX |
| Type: | Usage Note |
| Priority: | low |
| Date Modified: | 2006-06-01 06:50:24 |
| Date Created: | 2002-12-16 10:56:36 |