%RsquareV(version, <macro options>)
The RsquareV macro always attempts to check for a later version of itself. If it is unable to do this (such as if there is no active internet connection available), the macro will issue the following message:
RsquareV: Unable to check for newer version
The computations performed by the macro are not affected by the appearance of this message.
Version | Update Notes |
1.3 | Fixed incomplete removal of observations with missing values. |
1.1 | Fixed error in KEEP statement when Base SAS® is used. |
1.0 | Initial coding |
%inc "<location of your file containing the RsquareV macro>";
Following this statement, you can call the RsquareV macro. See the Results tab for examples.
Before calling the macro, fit the full model and save the response and predicted values in a data set. This is usually done by including an OUTPUT statement with the PRED= option in the modeling procedure. Use this data set as input when fitting the reduced model, and save the reduced model's predicted values in an output data set under a different variable name than the one used for the full model's predicted values. Specify the data set containing the observed responses and both sets of predicted values in the data= option of the macro. This process is illustrated in the examples in the Results tab.
The following parameters are required when using the RsquareV macro:
The following parameters are optional:
R²V for a single model is obtained by fitting the model of interest and an intercept-only model using the same data and response distribution. A data set containing the observed responses and the predicted values from both models is required. If a FREQ and/or WEIGHT statement is used to fit the model of interest, the same statement must be used when fitting the reduced model. The FREQ variable must be included in the data set read by the macro. The WEIGHT variable is not needed by the macro.
Partial R²V comparing a full model and a nested submodel can also be computed. The submodel is reduced from the full model by removing (constraining to zero) some of its parameters. Use the macro as above, but instead of the intercept-only model, fit the reduced model of interest and save its predicted values. The result is the partial R²V, which assesses the effect of the parameters in the full model that are constrained in the reduced model. For an ordinary linear regression model with normal response, this is the same as the squared partial correlation provided by the PCORR2 option in PROC REG. If the full and reduced models differ by a single parameter, then the square root of the partial R²V (with sign matching the parameter's sign) is the partial R associated with that parameter.
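The correspondence with PCORR2 can be checked with a small sketch such as the following. It is illustrative only: it assumes the RsquareV macro has already been defined with %include, it uses the SASHELP.CLASS sample data, and the data set names ClassFull and ClassBoth and the variable names pf and pr are hypothetical. The macro parameters are the ones used in the examples in this note.

```sas
/* Full model: PCORR2 displays the squared partial correlations. */
/* Predicted values are saved as pf (hypothetical name).         */
proc reg data=sashelp.class;
   model weight = height age / pcorr2;
   output out=ClassFull p=pf;
run; quit;

/* Reduced model: AGE is removed. Predicted values are saved as pr */
/* in a second data set that also carries pf from the full model.  */
proc reg data=ClassFull;
   model weight = height;
   output out=ClassBoth p=pr;
run; quit;

/* Partial R-square (V) for AGE, given HEIGHT. */
%RsquareV(data=ClassBoth, response=weight, pfull=pf, psub=pr, dist=normal)
```

If the statement above holds, the partial value reported by the macro should match the squared partial correlation that PROC REG displays for AGE in the full model.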
Penalized R²V, adjusted for the additional parameters in the full model, is provided when the numbers of parameters in the full and reduced models are specified.
While the RsquareV macro does not directly support BY group processing, this capability can be provided by the RunBY macro which can run the modeling procedure and the RsquareV macro repeatedly for each of the BY groups in your data. See the RunBY macro documentation for details on its use. Also see the example titled "BY group processing" in the Results tab above.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
In addition to the following examples, see this note which uses the RsquareV macro to assess the relative importance of the effects in a generalized linear model.
proc genmod data=Crabs;
   class color spine;
   model satellite = color spine width|width weight|weight / dist=poisson;
   output out=Preds p=px;
run;
These statements use the Preds data set to fit the intercept-only model. The OUTPUT statement adds the predicted values from this model as variable p1 to data set Preds.
proc genmod data=Preds;
   model satellite = / dist=poisson;
   output out=Preds p=p1;
run;
The following statement defines the RsquareV macro and makes it available for use. Specify the path of the saved macro code on your system between the quotes.
%include "<location of your file containing the RsquareV macro>";
The following statement calls the macro and computes R²V and penalized R²V. The response variable and the predicted values from both models are specified, as well as the number of parameters in the model of interest.
%RsquareV(data=Preds, response=satellite, pfull=px, psub=p1, dist=poisson, nparmfull=10)
The computed R²V for the first model is 0.1515. Adjusting this for the number of parameters included in the model, the penalized R²V is 0.1046.
Note that these results do not change when a quasi-likelihood is used to fit the model. For example, adding a dispersion parameter to the Poisson likelihood with the SCALE= option in the MODEL statement results in use of a quasi-likelihood function for model estimation. The SCALE=PEARSON option produces an estimated dispersion parameter of 3.2354, which inflates the parameter covariance matrix and therefore the standard errors of the model parameters. However, the model parameter estimates are unaffected, so the predicted values are the same, resulting in the same R²V statistics.
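As a sketch of the quasi-likelihood fit described above, the first example could be repeated with SCALE=PEARSON added to both MODEL statements. The Crabs data set is the one used earlier in this note. The output data set name PredsQL and the variable names pxq and p1q are hypothetical, chosen only to keep these results separate from the earlier ones.

```sas
/* Model of interest, fit with a Pearson-estimated dispersion parameter */
proc genmod data=Crabs;
   class color spine;
   model satellite = color spine width|width weight|weight
         / dist=poisson scale=pearson;
   output out=PredsQL p=pxq;
run;

/* Intercept-only model fit the same way; predictions are added as p1q */
proc genmod data=PredsQL;
   model satellite = / dist=poisson scale=pearson;
   output out=PredsQL p=p1q;
run;

/* Same macro call as before, pointing at the new predicted values */
%RsquareV(data=PredsQL, response=satellite, pfull=pxq, psub=p1q,
          dist=poisson, nparmfull=10)
```

Because the parameter estimates, and therefore the predicted values, are unchanged by the dispersion parameter, this call should reproduce the statistics reported for the first example.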
Next, partial R²V is computed comparing a full model containing color, width, and weight with a reduced model that removes the five parameters of color and width. First, the full model is fit.
proc genmod data=Crabs;
   class color spine;
   model satellite = color width|width weight|weight / dist=poisson;
   output out=Preds p=pf;
run;
The reduced model is fit next, and its predicted values are added to the OUTPUT OUT= data set.
proc genmod data=Preds;
   model satellite = weight|weight / dist=poisson;
   output out=Preds p=pr;
run;
Calling the macro produces the partial R²V comparing the full and reduced models.
%RsquareV(data=Preds, response=satellite, pfull=pf, psub=pr, dist=poisson)
The partial R²V is 0.0205, suggesting relatively small additional contributions by the color and width predictors.
proc gampl data=sashelp.Vote1980 plots seed=12345;
   model LogVoteRate = spline(Pop) spline(Edu) spline(Houses) spline(Income)
                       spline(Longitude Latitude);
   id LogVoteRate Pop Edu Houses Income Longitude Latitude;
   output out=gamout p=pgamf;
run;
The following statements fit the intercept-only model, add its predicted values to the output data set, and call the macro.
proc gampl data=gamout plots seed=12345;
   model LogVoteRate = ;
   id LogVoteRate Pop Edu Houses Income Longitude Latitude pgamf;
   output out=gamout p=pgam1;
run;

%RsquareV(data=gamout, response=LogVoteRate, pfull=pgamf, psub=pgam1, dist=normal, nparmfull=48.70944, nparmsub=2)
R²V for this model is 0.74.
Compare this to an ordinary regression model with linear effects for each of the predictors.
proc reg data=gamout;
   model LogVoteRate=pop edu houses income longitude latitude;
   output out=gamregout p=preg;
run; quit;
R² for the linear regression model is 0.59. The following macro call provides the partial and adjusted partial R²V.
%RsquareV(data=gamregout, response=LogVoteRate, pfull=pgamf, psub=preg, dist=normal, nparmfull=48.71, nparmsub=7)
The partial value, 0.38, is the proportion of total variation left over from the reduced model that is accounted for by the full model, approximately (0.74 - 0.59)/(1 - 0.59). That is, the splines account for about 38% of the total variation left over from the linear regression model.
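The approximation quoted above can be reproduced with a short DATA step. This is only a worked check of the arithmetic using the two rounded values from this example; the variable names are hypothetical, and the macro computes its result from the predicted values, not from this formula.

```sas
data _null_;
   r2_full    = 0.74;   /* R-square (V) of the spline model, as reported above */
   r2_reduced = 0.59;   /* R-square of the linear regression model             */

   /* Share of the variation left unexplained by the reduced model */
   /* that the full model goes on to explain                       */
   partial_approx = (r2_full - r2_reduced) / (1 - r2_reduced);

   put partial_approx= 6.4;   /* written to the SAS log */
run;
```

The ratio is 0.15/0.41, or roughly 0.37. The small gap from the 0.38 reported by the macro is consistent with the two inputs having been rounded to two decimals.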
The following uses the low birth weight data presented in Hosmer and Lemeshow (2000). In the statements below, a WHERE statement is included in the LOGISTIC modeling steps to subset the input data to one level of the BY variable, RACE. The special macro variables, _BYx and _LVLx, are used by the RunBY macro to fit the models to each BY group and then to run the RsquareV macro. The BYlabel macro variable is also used to label the displayed results with the BY group definition. Since the RsquareV macro writes its own titles, a FOOTNOTE statement is used instead of a TITLE statement to provide the label.
%macro code();
   proc logistic data=lowbirth;
      where &_BY1=&_LVL1;
      model low(event="1")=;
      output out=lb p=pnull;
      title "&BYlabel";
   run;
   proc logistic data=lb;
      where &_BY1=&_LVL1;
      model low(event="1")=age lwt ftv;
      output out=lb p=pfull;
      title "&BYlabel";
   run;
   footnote "Above for &BYlabel";
   %RsquareV(response=low, dist=binomial, psub=pnull, pfull=pfull);
   footnote;
%mend;

%RunBY(data=lowbirth, by=race)
Right-click on the link below and select Save to save the RsquareV macro definition to a file. It is recommended that you name the file RsquareV.sas.
Download and save RsquareV.sas
Type: | Sample |
Topic: | Analytics ==> Regression |
Date Modified: | 2021-05-13 11:02:32 |
Date Created: | 2017-03-21 11:26:29 |
Product Family: | SAS System |
Product: | SAS/STAT |
Hosts: | z/OS, z/OS 64-bit, OpenVMS VAX, Microsoft® Windows® for 64-Bit Itanium-based Systems, Microsoft Windows Server 2003 Datacenter 64-bit Edition, Microsoft Windows Server 2003 Enterprise 64-bit Edition, Microsoft Windows XP 64-bit Edition, Microsoft® Windows® for x64, OS/2, Microsoft Windows 8 Enterprise 32-bit, Microsoft Windows 8 Enterprise x64, Microsoft Windows 8 Pro 32-bit, Microsoft Windows 8 Pro x64, Microsoft Windows 8.1 Enterprise 32-bit, Microsoft Windows 8.1 Enterprise x64, Microsoft Windows 8.1 Pro 32-bit, Microsoft Windows 8.1 Pro x64, Microsoft Windows 10, Microsoft Windows 95/98, Microsoft Windows 2000 Advanced Server, Microsoft Windows 2000 Datacenter Server, Microsoft Windows 2000 Server, Microsoft Windows 2000 Professional, Microsoft Windows NT Workstation, Microsoft Windows Server 2003 Datacenter Edition, Microsoft Windows Server 2003 Enterprise Edition, Microsoft Windows Server 2003 Standard Edition, Microsoft Windows Server 2003 for x64, Microsoft Windows Server 2008, Microsoft Windows Server 2008 R2, Microsoft Windows Server 2008 for x64, Microsoft Windows Server 2012 Datacenter, Microsoft Windows Server 2012 R2 Datacenter, Microsoft Windows Server 2012 R2 Std, Microsoft Windows Server 2012 Std, Microsoft Windows XP Professional, Windows 7 Enterprise 32 bit, Windows 7 Enterprise x64, Windows 7 Home Premium 32 bit, Windows 7 Home Premium x64, Windows 7 Professional 32 bit, Windows 7 Professional x64, Windows 7 Ultimate 32 bit, Windows 7 Ultimate x64, Windows Millennium Edition (Me), Windows Vista, Windows Vista for x64, 64-bit Enabled AIX, 64-bit Enabled HP-UX, 64-bit Enabled Solaris, ABI+ for Intel Architecture, AIX, HP-UX, HP-UX IPF, IRIX, Linux, Linux for x64, Linux on Itanium, OpenVMS Alpha, OpenVMS on HP Integrity, Solaris, Solaris for x64, Tru64 UNIX |