Call an R Analysis from IMLPlus

The example in Chapter 4: Calling SAS Procedures submits SAS statements to call the REG procedure. That example performs a linear regression of the wind_kts variable on the min_pressure variable of the Hurricanes data. The following program repeats the same analysis, but does it by submitting statements to R:

declare DataObject dobj;
dobj = DataObject.CreateFromFile("Hurricanes");
dobj.GetVarData( "wind_kts", w );                         /* Step 1 */
dobj.GetVarData( "min_pressure", p );

/* send matrices to R */
run ExportMatrixToR( w, "Wind" );                         /* Step 2 */
run ExportMatrixToR( p, "Pressure" );

print "--------------  In R   ---------------";           /* Step 3 */
submit / R;
  Model    <- lm(Wind~Pressure, na.action="na.exclude")         # 3a
  ParamEst <- coef(Model)                                       # 3b
  Pred     <- fitted(Model)
  Resid    <- residuals(Model)
  print (ParamEst)                                              # 3c
endsubmit;

print "-----------  In SAS/IML  -------------";
run ImportMatrixFromR( pe, "ParamEst" );                  /* Step 4 */
print pe[r={"Intercept" "min_pressure"}];

/* add variables to the DataObject */
dobj.AddVarFromR( "R_Pred", "Pred" );                     /* Step 5 */
dobj.AddVarFromR( "R_Resid", "Resid" );
ScatterPlot.Create(dobj, "min_pressure", "R_Resid");

The output from this program is shown in Figure 11.3. The program consists of the following steps:

  1. The GetVarData method of the DataObject class copies the data for the wind_kts and min_pressure variables into SAS/IML vectors named w and p.

  2. These vectors are sent to R by the ExportMatrixToR module. The names of the corresponding R vectors that contain the data are Wind and Pressure.

  3. The SUBMIT statement with the R option is used to send statements to R. Note that comments in R begin with a hash mark (#, also called a number sign or a pound sign).

    a. The lm function computes a linear model of Wind as a function of Pressure. The na.action= option specifies how the model handles missing values (which in R are represented by NA). In particular, the na.exclude option leaves observations with missing values out of the fit, but keeps their positions in the residual and predicted values by filling those positions with NA. The residual and predicted vectors therefore have the same length as the original data, which makes it easier to merge the R results with the original data.

    b. Various information is retrieved from the linear model and placed into R vectors named ParamEst, Pred, and Resid.

    c. The parameter estimates are printed in R, as shown in Figure 11.3.

  4. The ImportMatrixFromR module transfers the ParamEst vector from R into a SAS/IML vector named pe. This vector is printed by the SAS/IML PRINT statement.

  5. The Pred and Resid vectors are added to the DataObject. The new variables are given the names R_Pred and R_Resid. A scatter plot of the residual values versus the explanatory variable is created, similar to Figure 6.1.
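
The effect of the na.exclude option in step 3a can be checked in a stand-alone R session. The following statements are a minimal sketch, not part of the program above: the six data values are made up for illustration, and the names fit_exclude and fit_omit are hypothetical. The sketch compares na.exclude with na.omit, which drops observations with missing values from the results:

```r
# Made-up data standing in for the Hurricanes variables (assumption);
# the third response value is missing.
Pressure <- c(1010, 1000, 990, 980, 970, 960)
Wind     <- c(30, 45, NA, 70, 85, 100)

fit_exclude <- lm(Wind ~ Pressure, na.action = na.exclude)
fit_omit    <- lm(Wind ~ Pressure, na.action = na.omit)

# na.exclude: results are padded with NA where the observation was missing
length(fitted(fit_exclude))       # 6, same length as the input
length(residuals(fit_exclude))    # 6
is.na(fitted(fit_exclude)[3])     # TRUE for the missing observation

# na.omit: the observation is dropped, so the results are shorter
length(fitted(fit_omit))          # 5

# Both fits use the same five complete observations,
# so the parameter estimates agree
all.equal(coef(fit_exclude), coef(fit_omit))
```

Because the padded vectors line up with the original observations, they can be added to the data as new variables without any extra bookkeeping.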

Figure 11.3: Calling an R Analysis

--------------  In R   ---------------
(Intercept)    Pressure 
1333.354893   -1.291374 


-----------  In SAS/IML  -------------

          pe

Intercept    1333.3549
min_pressure -1.291374


Note that you cannot directly transfer the contents of the Model object. Instead, various R functions are used to extract portions of the Model object, and those pieces are transferred.
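
One way to see the difference between the Model object and the pieces extracted from it is to inspect both in a stand-alone R session. The following sketch uses made-up x and y values (an assumption for illustration only). It shows that the object returned by lm is a list of mixed components, whereas coef returns a plain named numeric vector, which is the kind of object that maps naturally onto a matrix:

```r
# Made-up data for illustration (assumption)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
Model <- lm(y ~ x)

class(Model)        # "lm"
is.list(Model)      # TRUE: a list of heterogeneous components
names(Model)        # component names, such as coefficients and residuals

# Extractor functions return simple vectors
ParamEst <- coef(Model)
is.numeric(ParamEst)    # TRUE
names(ParamEst)         # "(Intercept)" "x"
length(fitted(Model))   # 5, one predicted value per observation
```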

As an alternative to steps 1 and 2, you can call the ExportToR method in the DataObject class. The ExportToR method writes an entire DataObject to an R data frame. For example, after creating the DataObject you could use the following statements to create an R data frame named Hurr:

dobj.ExportToR("Hurr");
submit / R;
  Model <- lm(wind_kts~min_pressure, data=Hurr, na.action="na.exclude")
endsubmit;

The R language is case-sensitive, so you must use the correct case to refer to variables in a data frame.
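
The following stand-alone R statements sketch what happens when the case does not match. The data frame here is a three-row, made-up stand-in for Hurr (an assumption for illustration), and the misspelled name Wind_Kts is deliberate:

```r
# Small made-up stand-in for the Hurr data frame (assumption)
Hurr <- data.frame(wind_kts     = c(30, 45, 70),
                   min_pressure = c(1010, 1000, 980))

"wind_kts" %in% names(Hurr)    # TRUE
"Wind_Kts" %in% names(Hurr)    # FALSE: the case differs

is.null(Hurr$Wind_Kts)         # TRUE: no such column

# A formula that uses the wrong case fails because R cannot find the variable
result <- try(lm(Wind_Kts ~ min_pressure, data = Hurr), silent = TRUE)
inherits(result, "try-error")  # TRUE
```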

The SUBMIT statement for R supports parameter substitution from SAS/IML matrices, just as it does for SAS statements. For example, you can substitute the names of analysis variables into a SUBMIT block by using the following statements:

YVar = "wind_kts";
XVar = "min_pressure";
submit XVar YVar / R;
  Model <- lm(&YVar ~ &XVar, data=Hurr, na.action="na.exclude")
  print (Model$call)
endsubmit;

Figure 11.4 shows the result of the print(Model$call) statement. The output shows that the values of the YVar and XVar matrices were substituted into the SUBMIT block.

Figure 11.4: Parameter Substitutions in a SUBMIT Block

lm(formula = wind_kts ~ min_pressure, data = Hurr, na.action = "na.exclude")
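
The output in Figure 11.4 contains the variable names rather than &YVar and &XVar because R records the call as it receives it, after the substitution has taken place. The following stand-alone R sketch illustrates this with the statement written out as R would see it. The four-row data frame is made up for illustration, and the na.action= option is left out to keep the sketch short:

```r
# Small made-up stand-in for the Hurr data frame (assumption)
Hurr <- data.frame(wind_kts     = c(30, 45, 70, 85),
                   min_pressure = c(1010, 1000, 980, 970))

# The statement as R sees it once the names have been substituted
Model <- lm(wind_kts ~ min_pressure, data = Hurr)

# The stored call contains the literal variable names
print(Model$call)
call_text <- deparse(Model$call)
call_text
```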