Note: Beginning in SAS® 9.4M6 (TS1M6), a version of this macro is available in the SAS/STAT® Autocall library as the NLEST macro and does not need to be downloaded and defined before use. To access features in more recent versions of the macro (see History), download and run as described in Usage below.
%nlest(v)
The macro always attempts to check for a later version of itself. If it is unable to do this (such as if there is no active internet connection available), the macro will issue the following message in the log:
NOTE: Unable to check for newer version of the NLEST macro.
The computations performed by the macro are not affected by the appearance of this message. However, this check can be avoided by specifying nochk as the first macro argument. This can be useful if your machine has no connection to the internet.
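As a minimal sketch of the nochk option described above, the following call skips the version check. It assumes version 2.0 or later of the macro. The names logmod and funcs are placeholders for your own item store and function data set.

```sas
/* Skip the check for a newer macro version (nochk requires NLEST 2.0 or later). */
/* logmod and funcs are placeholder names, not objects created by this note.     */
%nlest(nochk, instore=logmod, fdata=funcs)
```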
Version | Update Notes |
2.1 | Fixed incorrect computation of p-value when df= is specified. |
2.0 | nochk can be specified as the first (version) parameter. |
1.9 | Added optional where= and covdrop=. This version of NLEST is available in the SAS/STAT Autocall Library beginning in SAS 9.4M8 (TS1M8). |
1.8 | Added optional print= and null=. |
1.6 | This release of the macro can be called using the preferred name NLEST and is available in the SAS/STAT Autocall Library beginning in SAS 9.4M6 (TS1M6). |
1.51 | Minor fix to version printing. |
1.5 | Added optional title= and listnames=. |
1.4 | When df= is omitted, large-sample Wald statistics are produced. When score= is omitted, results are saved in data set EST in addition to being displayed. |
1.3 | Fixes error that occurred when input model has only a single parameter such as an intercept-only model. |
1.2 | SAS/IML® no longer required. |
1.1 | Added optional score= and outscore=. |
1.0 | Initial coding |
%inc "<location of your file containing the NLEST macro>";
Following this statement, you can call the macro. See the Results tab for examples.
There are two basic uses of the macro:
%nlest(shownames, instore=logmod)
%nlest(instore=logmod, fdata=funcs)
Following are the parameters available with the macro. The necessary model information is provided to the macro by specifying either instore= or both inest= and incovb=. If the modeling procedure provides a STORE statement for saving the fitted model, instore= is generally the better method for providing the model information. Additionally, if shownames is not specified as the first argument, either fdata= or f= (or both) is required.
Compatibility error when using inest= and incovb=
The incovb= data set should have the same number of observations (rows) and variables (columns) as the number of rows in the inest= data set in order to be compatible. Otherwise, an error message is issued that indicates the relevant numbers of rows and columns. If the incovb= data set contains numeric variables other than those containing the covariance matrix, they should be removed in order to avoid a compatibility error. This can be done either by preprocessing the data set to remove the extraneous variables or by specifying them in covdrop= (requires version 1.9 or later of the macro).
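The two remedies just described might look like the following sketch. All data set and variable names here (parms, covmat, covmat2, extra) are hypothetical, and the function in f= is only a stand-in. Run the macro with shownames first to confirm the parameter names for your own model.

```sas
/* Option 1: remove the extraneous numeric variable before calling the macro. */
/* parms, covmat, covmat2, and extra are hypothetical names for illustration. */
data covmat2;
   set covmat(drop=extra);
run;
%nlest(inest=parms, incovb=covmat2, f=exp(B_p1))

/* Option 2 (NLEST version 1.9 or later): let the macro drop the variable. */
%nlest(inest=parms, incovb=covmat, covdrop=extra, f=exp(B_p1))
```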
When the modeling procedure uses GLM parameterization of CLASS variables (PARAM=GLM in the CLASS statement, which is the default in many procedures), the NLEST macro typically displays the following warning message in the log. This warning can be ignored.
WARNING: The final Hessian matrix is not positive definite, and therefore the estimated covariance matrix is not full rank and may be unreliable. The variance of some parameter estimates is zero or some parameters are linearly related to other parameters.
BY group or domain processing
The NLEST macro does not directly support BY group processing (such as for the analysis of multiply imputed data) or processing of domains from a survey analysis. That is, it cannot directly be used to process all results from a modeling procedure that was run using a BY or DOMAIN statement. However, this capability can be provided by the RunBY macro, which can run the NLEST macro repeatedly for each of the BY groups in your data. Version 1.9 or later of the NLEST macro and version 1.1 or later of the RunBY macro are required. See the RunBY macro documentation (SAS Note 66249) for details about its use. Additionally, you can use where= to allow NLEST to process the results of one BY group or domain by specifying an appropriate condition to select that BY group or domain. See Example 2 in the Results tab above.
Output data set
When score= is not specified, the results from the macro are automatically saved in data set EST. When score= is specified, the f= function is evaluated for each observation in the score= data set and saved in a data set named NLEST, by default, or as named in outscore=.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
In addition to the examples below, several more examples of using the NLEST (or NLEstimate) macro can be found in these notes:
The example titled "Gamma Distribution Applied to Life Data" in the GENMOD documentation (SAS Note 22930) presents failure times of 201 machine parts from two manufacturers, denoted A and B. These GENMOD statements fit the log-linked gamma model and use the LSMEANS statement to estimate the manufacturer means. The STORE statement saves the model in an item store for later use by the NLEST macro.
proc genmod data=lifdat;
   class mfg;
   model lifetime = mfg / dist=gamma link=log;
   lsmeans mfg / e ilink diff exp;
   store out=gammod;
run;
The values in the Estimate column in the "MFG Least Squares Means" table are the estimates of the linear combinations of model parameters defined in the "Coefficients for MFG Least Squares Means" table, which is produced by the E option. That is, the MFG="A" estimate, 6.1501, is the Row1 linear combination defined as 1*Intercept+1*MFGA, and similarly for MFG="B". The Intercept, MFGA, and MFGB estimated model parameters are shown in the "Analysis Of Maximum Likelihood Parameter Estimates" table.
Note that since the linear predictor in this gamma model estimates the log gamma mean, linear combinations of the model parameters, such as those from the LSMEANS, ESTIMATE, or CONTRAST statements, also estimate the log gamma mean. Therefore, 6.1501 is the estimated log mean for manufacturer A. The ILINK option in the LSMEANS statement applies the inverse of the link function to the estimates. In this log-linked model, that means that the estimates are exponentiated. The resulting mean estimates are presented in the column labeled "Mean". The estimated mean lifetimes are 468.74 for manufacturer A and 459.51 for manufacturer B.
The DIFF option computes the pairwise differences among the LS-mean estimates and presents them in the "Differences of MFG Least Squares Means" table. This produces a difference of the log means, or equivalently a log ratio of means, in the Estimate column. The ratio of means is estimated to be exp(6.1501-6.1302) = 468.74/459.51 = 1.0201. The EXP option exponentiates this difference, producing an estimated ratio of means in the Exponentiated column. Note that the EXP option also exponentiates the estimates in the "MFG Least Squares Means" table, resulting in an Exponentiated column that, in this case, reproduces the Mean column from the ILINK option.
Note that the difference in means cannot be directly estimated by PROC GENMOD using the LSMEANS (or ESTIMATE) statement. Instead, you can estimate the difference in log means (log mean ratio) or ratio of means as shown above. Of course, a point estimate of the mean difference can be computed from the estimated means: 468.74 - 459.51 = 9.23. However, a confidence interval for this difference is not easily obtained. The mean difference is this nonlinear combination of parameters:
mean(A) - mean(B) = exp(Intercept+MFGA) - exp(Intercept)
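For reference, standard errors for nonlinear functions like this one are typically obtained with the delta method, which is the approach used by the ESTIMATE statement in PROC NLMIXED. The sketch below writes it out for the mean difference. The symbols are introduced here only for illustration: \beta_1 is the intercept, \beta_2 is the MFG A parameter, and \hat\Sigma is their estimated covariance matrix.

```latex
\begin{aligned}
g(\beta) &= \exp(\beta_1 + \beta_2) - \exp(\beta_1), \\
\nabla g(\beta) &=
  \begin{pmatrix}
    \exp(\beta_1 + \beta_2) - \exp(\beta_1) \\
    \exp(\beta_1 + \beta_2)
  \end{pmatrix}, \\
\widehat{\operatorname{Var}}\bigl(g(\hat\beta)\bigr) &\approx
  \nabla g(\hat\beta)^{\top}\, \hat\Sigma\, \nabla g(\hat\beta), \\
\text{large-sample 95\% limits:}\quad &
  g(\hat\beta) \pm 1.96\,\sqrt{\widehat{\operatorname{Var}}\bigl(g(\hat\beta)\bigr)} .
\end{aligned}
```

The resulting interval is symmetric about the point estimate, which is why a Wald interval for a mean difference can include negative values.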
This nonlinear function can be estimated with the NLEST macro. In order to specify the function to estimate, the names of the parameters must be known. Running the NLEST macro with shownames as the first argument displays the parameter names. The model information is provided in the instore= macro parameter by specifying the name of the item store saved previously by the STORE statement.
%nlest(shownames, instore=gammod)
The resulting table displays the parameter names that you need to write the functions that you want to estimate.
The nonlinear function above can then be written and specified in the f= macro parameter.
%nlest( instore=gammod, label=Mean Diff, f=exp(B_p1+B_p2)-exp(B_p1) )
The estimated mean difference is 9.2309, agreeing with the value computed above. A large-sample 95% confidence interval for the mean difference is (-132.77, 151.23). Note that a warning message is produced in this example because the default GLM parameterization, which is not a full-rank parameterization, is used for the CLASS variable, MFG. The results are still correct. If a full-rank parameterization were used, such as the reference parameterization requested with the PARAM=REF option in the CLASS statement, the warning would not appear and the results would be the same.
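The full-rank alternative mentioned above could be tried with a sketch like the following. The item store name gammod_ref is hypothetical. The function in f= assumes that B_p1 is still the intercept and B_p2 is the MFG A parameter, which the shownames call is included to confirm.

```sas
/* Same gamma model, but with full-rank reference parameterization for MFG. */
/* gammod_ref is a hypothetical item store name chosen for this sketch.     */
proc genmod data=lifdat;
   class mfg / param=ref;
   model lifetime = mfg / dist=gamma link=log;
   store out=gammod_ref;
run;

/* Confirm the parameter names before writing the function. */
%nlest(shownames, instore=gammod_ref)

/* Assumes B_p1 is the intercept and B_p2 is the MFG A parameter. */
%nlest(instore=gammod_ref, label=Mean Diff, f=exp(B_p1+B_p2)-exp(B_p1))
```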
The function to estimate can also be specified in a data set using the fdata= macro parameter. This method is particularly useful when there are multiple functions to estimate. The following statements create a data set with the required variables, LABEL and F, for estimating the manufacturer means and their difference. Each observation in the data set specifies a function to estimate in F and a label for the function in LABEL.
data fd;
   length label f $32767;
   infile datalines delimiter=',';
   input label f;
   datalines;
Mfg A,exp(B_p1+B_p2)
Mfg B,exp(B_p1)
Mean Diff,exp(B_p1+B_p2)-exp(B_p1)
;
%nlest(instore=gammod, fdata=fd)
The mean estimates match those from the LSMEANS statement above and the difference estimate is the same as from the previous NLEST call.
Note that the desired model and nonlinear functions of the parameters can also be estimated by using PROC NLMIXED, whose ESTIMATE statement can estimate nonlinear functions of model parameters. This is fairly easy for a model like the gamma model above because the gamma distribution is directly supported in the MODEL statement of NLMIXED. It can be more difficult for distributions that are not directly available, such as the multinomial, beta, lognormal, and other distributions, as well as special distributions like truncated or zero-inflated distributions. The NLEST macro uses NLMIXED and its ESTIMATE statement by adopting the fitted model parameters and covariance matrix supplied by other modeling procedures. This enables you to fit the desired model using the most natural and convenient procedure and can greatly simplify estimation of nonlinear functions of model parameters, such as differences of means in generalized models when using a link function other than the identity link.
These statements fit the gamma model above (using reference parameterization for MFG) and again estimate the means and mean difference. The use of very large degrees of freedom (df=1e8) essentially produces a large-sample test and confidence interval like the default from the NLEST macro.
proc nlmixed data=lifdat;
   mu=exp(b1+b2*(mfg="A"));
   model lifetime ~ gamma(scale,mu/scale);
   estimate "Mfg A" exp(b1+b2) df=1e8;
   estimate "Mfg B" exp(b1) df=1e8;
   estimate "Mean Diff" exp(b1+b2)-exp(b1) df=1e8;
run;
The results are very close to those from the NLEST macro above differing only because of minor estimation method differences.
The NLEST macro cannot directly process results from a modeling procedure that uses a BY statement. However, the RunBY macro (SAS Note 66249) can be used to run the NLEST macro on all BY groups in the saved results. To process a single BY group, where= can be used. The capabilities shown below require version 1.9 or later of the NLEST macro and version 1.1 or later of the RunBY macro.
The following uses the insurance data in the example titled "Poisson Regression" in the Getting Started section of the GENMOD documentation (SAS Note 22930). The differences in age rates are estimated by car size using separate models fit to the data for each car size. These statements fit and save the models using the STORE statement.
proc sort data=insure;
   by car;
run;
proc genmod data=insure;
   by car;
   class age;
   model c=age / dist=poisson offset=ln;
   store out=insmodel;
run;
The following NLEST macro call specifies a condition in where= to select the large car model and estimate the difference in age rates in large cars.
%nlest(instore=insmodel, where=car='large', f=exp(b_p1+b_p2)-exp(b_p1), label=Rate Difference, title=Difference in age rates for large cars)
The next statements process all of the BY groups using the RunBY macro. The NLEST macro call is placed in the CODE macro and the condition in where= is changed to use the special macro variables, _BYx and _LVLx. Since the BY groups are defined by the levels of the CAR variable in the INSURE data set, data=insure and by=car are specified in the RunBY macro call. The BYlabel macro variable is specified in title= in NLEST to label the displayed results with the BY group definition.
%macro code();
   %nlest(instore=insmodel, where=&_BY1=&_LVL1,
          f=exp(b_p1+b_p2)-exp(b_p1), listnames=no,
          label=Rate Difference, title=Difference in age Rates for &BYlabel)
%mend;
%RunBY(data=insure, by=car)
Domain analysis from the survey analysis procedures is similar to BY processing in that multiple models are fit to the domains identified in the DOMAIN statement. The following uses the data in the example titled "Domain Analysis" in the SURVEYREG documentation (SAS Note 22930). Dichotomized versions of the body weight and age variables, BW2 and AGEGR, are created using the RANK procedure. The SURVEYLOGISTIC procedure then estimates a logistic model in each of the two domains identified by the two levels of the CANCER variable. The fitted models are saved using the STORE statement.
It is of interest to estimate the probability of low body weight (BW2=0) in each age group and to estimate the relative risk (ratio) of these probabilities. The probabilities are estimated by the ILINK option in the LSMEANS statement.
proc rank data=cancer groups=2 out=c2;
   var bodyweight age;
   ranks bw2 agegr;
run;
proc surveylogistic data=c2;
   class agegr / param=glm;
   strata strata;
   cluster psu;
   weight ObservationWt;
   model bw2(event="0") = agegr;
   domain cancer;
   store out=logdom;
   lsmeans agegr / ilink;
run;
The ratio of the age probabilities can be estimated using the NLEST macro by specifying the appropriate function of model parameters. For each domain, the fitted logistic model has an intercept parameter and, because GLM parameterization is used, a parameter for each age group with the last age parameter set to zero. The macro names these parameters B_p1, B_p2, and B_p3. The event probability (mean) for the low age group is estimated by applying the inverse of the logit link to the sum of the intercept and first age parameter. The expression, logistic(B_p1+B_p2), does this computation. Similarly, logistic(B_p1) estimates the mean in the high age group. The following NLEST call uses where= to compute the ratio of these mean estimates in only the cancer (CANCER=1) domain. The relative risk is tested against a null hypothesis value of 1 by specifying null=1.
%nlest(instore=logdom, where=cancer=1, null=1, f=logistic(b_p1+b_p2)/logistic(b_p1), label=Relative Risk, title=Relative Risk in cancer domain)
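Written out, the function specified in f= above is the ratio shown below, where logistic(x) is the inverse of the logit link. The symbols \beta_1 and \beta_2 are used here only as shorthand for B_p1 and B_p2, and RR for the relative risk.

```latex
\operatorname{logistic}(x) = \frac{1}{1 + e^{-x}}, \qquad
\mathrm{RR}
  = \frac{\operatorname{logistic}(\beta_1 + \beta_2)}{\operatorname{logistic}(\beta_1)}
  = \frac{1 + e^{-\beta_1}}{1 + e^{-(\beta_1 + \beta_2)}} .
```

Because a relative risk of 1 corresponds to equal probabilities in the two age groups, null=1 is the natural null value for this function.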
The relative risk can be computed for both domains by using the RunBY macro in the same way as for BY processing above.
%macro code();
   %nlest(instore=logdom, where=&_BY1=&_LVL1, null=1,
          f=logistic(b_p1+b_p2)/logistic(b_p1), listnames=no,
          label=Relative Risk, title=Relative Risk for &BYlabel)
%mend;
%RunBY(data=c2, by=cancer)
The analysis of multiply imputed data also involves BY processing to run the intended analysis procedure for each imputation data set. So, the use of the NLEST macro follows a similar procedure as above to provide the desired estimates for each imputation. The estimates from the NLEST macro can then be combined using the MIANALYZE procedure.
Suppose that the original data containing missing values is multiply imputed using the MI procedure, resulting in a data set, IMPUT, containing multiple imputations of the original data. The blocks of observations containing the imputations are indexed by the variable _IMPUTATION_. Further suppose that the intended analysis on the original data involves fitting a logistic model and that the goal is to estimate the odds ratio of a continuous predictor. The binary response, Y, has levels 0 and 1, with Y=1 representing the event of interest. Variable X is the single continuous predictor in the model. Then, given the IMPUT data set, the following statements conduct the analysis on each imputation data set and save the fitted models in item store MOD. The ODS EXCLUDE and ODS SELECT statements are used to suppress the displayed results from the multiple analyses.
ods exclude all;
proc logistic data=imput;
   by _imputation_;
   model y(event="1") = x;
   store mod;
run;
ods select all;
Using the above approach for BY processing, the NLEST macro can now be used inside the RunBY macro to estimate the odds ratio for each imputation data set. Note that the two parameters in the model are the intercept and the X parameter, so the name referring to the X parameter is B_P2. The odds ratio for X is computed as exp(B_P2), and this is specified in f= in the NLEST call. listnames=no and print=no are specified to suppress all displayed results from the multiple runs of the macro. In the RunBY macro, the _BY1 macro variable in this case is the _IMPUTATION_ variable provided by PROC MI. Its values are 1, 2, 3, ... and are represented by the _LVL1 macro variable. The NLEST macro automatically saves the results from each run in data set EST. Since this data set does not contain the _IMPUTATION_ variable and value for each run, it is added in the DATA EST step. The APPEND procedure is used to accumulate the odds ratio results in data set ALL.
%macro code;
   %nlest(instore=mod, f=exp(b_p2), where=&_BY1=&_LVL1, listnames=no, print=no)
   data est;
      set est;
      _imputation_=&_LVL1;
   run;
   proc append base=all data=est;
   run;
%mend;
%runby(data=imput, by=_imputation_)
Finally, the NLEST results in data set ALL from the multiple imputations can be combined into a single estimate of the odds ratio using the MIANALYZE procedure. The odds ratio estimates are in the variable Estimate and their standard errors are in the variable StandardError.
proc sort data=all;
   by Label;
run;
proc mianalyze data=all;
   by Label;
   modeleffects Estimate;
   stderr StandardError;
run;
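For orientation, the combination that PROC MIANALYZE performs follows Rubin's rules. The sketch below uses notation introduced here for illustration: m is the number of imputations, \hat{Q}_i is the value of Estimate from imputation i, and \mathrm{SE}_i is the corresponding value of StandardError.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m} \sum_{i=1}^{m} \hat{Q}_i
  && \text{(combined estimate)} \\
\bar{W} &= \frac{1}{m} \sum_{i=1}^{m} \mathrm{SE}_i^{2}
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1} \sum_{i=1}^{m} \bigl(\hat{Q}_i - \bar{Q}\bigr)^{2}
  && \text{(between-imputation variance)} \\
T &= \bar{W} + \Bigl(1 + \frac{1}{m}\Bigr) B
  && \text{(total variance of } \bar{Q}\text{)}
\end{aligned}
```

The standard error reported for the combined odds ratio is the square root of T.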
Right-click the link below and select Save to save the NLEST/NLEstimate macro definition to a file. It is recommended that you name the file nlest.sas.
Type: | Sample |
Topic: | Analytics ==> Regression SAS Reference ==> Procedures ==> NLMIXED SAS Reference ==> Macro |
Date Modified: | 2024-08-14 18:52:11 |
Date Created: | 2016-08-11 14:16:41 |