The example in Chapter 4, Calling SAS Procedures, submits SAS statements to call the REG procedure. That example performs a linear regression of the wind_kts variable on the min_pressure variable of the Hurricanes data. The following program repeats the same analysis, but does it by submitting statements to R:
declare DataObject dobj;
dobj = DataObject.CreateFromFile("Hurricanes");
dobj.GetVarData( "wind_kts", w );                /* Step 1 */
dobj.GetVarData( "min_pressure", p );

/* send matrices to R */
run ExportMatrixToR( w, "Wind" );                /* Step 2 */
run ExportMatrixToR( p, "Pressure" );

print "-------------- In R ---------------";     /* Step 3 */
submit / R;
   Model    <- lm(Wind~Pressure, na.action="na.exclude")   # 3a
   ParamEst <- coef(Model)                                 # 3b
   Pred     <- fitted(Model)
   Resid    <- residuals(Model)
   print (ParamEst)                                        # 3c
endsubmit;

print "----------- In SAS/IML -------------";
run ImportMatrixFromR( pe, "ParamEst" );         /* Step 4 */
print pe[r={"Intercept" "min_pressure"}];

/* add variables to the DataObject */
dobj.AddVarFromR( "R_Pred", "Pred" );            /* Step 5 */
dobj.AddVarFromR( "R_Resid", "Resid" );
ScatterPlot.Create(dobj, "min_pressure", "R_Resid");
The output from this program is shown in Figure 11.3. The program consists of the following steps:
1. The GetVarData method of the DataObject class copies the data for the wind_kts and min_pressure variables into SAS/IML vectors named w and p.
2. These vectors are sent to R by the ExportMatrixToR module. The corresponding R vectors that contain the data are named Wind and Pressure.
3. The SUBMIT statement with the R option is used to send statements to R. Note that comments in R begin with a hash mark (#, also called a number sign or a pound sign).
   a. The lm function computes a linear model of Wind as a function of Pressure. The na.action= option specifies how the model handles missing values (which in R are represented by NA). In particular, the na.exclude option specifies that observations with missing values are left out of the fit but are not dropped from the residual and predicted values: those positions are filled with NA, so the result vectors have the same length as the input data. This makes it easier to merge the R results with the original data.
   b. Various information is retrieved from the linear model and placed into R vectors named ParamEst, Pred, and Resid.
   c. The parameter estimates are printed in R, as shown in Figure 11.3.
4. The ImportMatrixFromR module transfers the ParamEst vector from R into a SAS/IML vector named pe. This vector is printed by the SAS/IML PRINT statement.
5. The Pred and Resid vectors are added to the DataObject. The new variables are given the names R_Pred and R_Resid. A scatter plot of the residual values versus the explanatory variable is created, similar to Figure 6.1.
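The effect of the na.exclude option in step 3a can be checked in a plain R session. The following is a minimal sketch that compares it with na.omit. The six data values are made up for illustration and are not taken from the Hurricanes data, and the names ModelOmit and ModelExclude are hypothetical.

```r
# Made-up data for illustration only (not the Hurricanes data).
# Observation 3 has a missing Pressure; observation 4 has a missing Wind.
Pressure <- c(1005, 990, NA, 970, 950, 985)
Wind     <- c(  35,  60, 80,  NA, 120,  70)

ModelOmit    <- lm(Wind ~ Pressure, na.action = "na.omit")
ModelExclude <- lm(Wind ~ Pressure, na.action = "na.exclude")

# Both models are fit to the same four complete observations,
# so the parameter estimates agree.
all.equal(coef(ModelOmit), coef(ModelExclude))

# The difference is in the length of the extracted vectors.
length(residuals(ModelOmit))             # 4: complete cases only
length(residuals(ModelExclude))          # 6: same length as Wind
which(is.na(residuals(ModelExclude)))    # observations 3 and 4 are NA
```

Because the vectors from the na.exclude model line up row by row with the input, they can be added to the original data without any reindexing.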
Note that you cannot directly transfer the contents of the Model object. Instead, various R functions are used to extract portions of the Model object, and those pieces are transferred.
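To see why the pieces are extracted first, it can help to inspect the model object in R. The sketch below uses made-up data values, and the names CoefTable and RSquare are hypothetical. It shows that the fitted model is a list of class "lm" rather than a numeric array, whereas the extracted pieces are ordinary numeric vectors and matrices.

```r
# Made-up data for illustration only (not the Hurricanes data).
Pressure <- c(1005, 990, 975, 970, 950, 985)
Wind     <- c(  35,  60,  80,  90, 120,  70)
Model <- lm(Wind ~ Pressure)

class(Model)        # "lm"
is.numeric(Model)   # FALSE: the model is a list of components

# Extract plain numeric pieces from the model.
ParamEst  <- coef(Model)                   # named numeric vector of length 2
CoefTable <- summary(Model)$coefficients   # 2 x 4 matrix: estimates, std. errors, t, p
RSquare   <- summary(Model)$r.squared      # single number

is.numeric(ParamEst)   # TRUE
dim(CoefTable)         # 2 4
```

Numeric pieces such as these are the kind of R objects that a matrix transfer can reasonably handle; the composite Model object is not.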
As an alternative to steps 1 and 2, you can call the ExportToR method in the DataObject class. The ExportToR method writes an entire DataObject to an R data frame. For example, after creating the DataObject you could use the following statements to create an R data frame named Hurr:
dobj.ExportToR("Hurr");
submit / R;
   Model <- lm(wind_kts~min_pressure, data=Hurr, na.action="na.exclude")
endsubmit;
The R language is case-sensitive, so you must use the correct case when you refer to variables in a data frame.
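The following sketch illustrates the case sensitivity in a plain R session. The small data frame is a made-up stand-in for Hurr, and the name Result is hypothetical. A formula that spells a variable with the wrong case fails because R cannot find an object by that name.

```r
# Made-up stand-in for the Hurr data frame (illustration only).
Hurr <- data.frame(wind_kts     = c(35, 60, 80, 120),
                   min_pressure = c(1005, 990, 975, 950))

"wind_kts" %in% names(Hurr)   # TRUE
"Wind_Kts" %in% names(Hurr)   # FALSE: the case does not match

# A misspelled case in the formula raises an error, which is caught here.
Result <- tryCatch(
   lm(Wind_Kts ~ min_pressure, data = Hurr),
   error = function(e) "error"
)
Result                        # "error"
```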
The SUBMIT statement for R supports parameter substitution from SAS/IML matrices, just as it does for SAS statements. For example, you can substitute the names of analysis variables into a SUBMIT block by using the following statements:
YVar = "wind_kts";
XVar = "min_pressure";
submit XVar YVar / R;
   Model <- lm(&YVar ~ &XVar, data=Hurr, na.action="na.exclude")
   print (Model$call)
endsubmit;
Figure 11.4 shows the result of the print(Model$call) statement. The output shows that the values of the YVar and XVar matrices were substituted into the SUBMIT block.
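The substitution can be pictured as a text replacement that happens before R evaluates the statement. The sketch below emulates that idea entirely within R, using sub, parse, and eval. It is an illustration only and is not how SAS/IML Studio implements parameter substitution; the data frame is a made-up stand-in for Hurr, and the names Template and Stmt are hypothetical.

```r
# Made-up stand-in for the Hurr data frame (illustration only).
Hurr <- data.frame(wind_kts     = c(35, 60, 80, 120),
                   min_pressure = c(1005, 990, 975, 950))

YVar <- "wind_kts"
XVar <- "min_pressure"

# The statement as written, with placeholders for the two variable names.
Template <- 'lm(&YVar ~ &XVar, data=Hurr, na.action="na.exclude")'

# Replace each placeholder with the value of the corresponding variable.
Stmt <- sub("&YVar", YVar, Template, fixed = TRUE)
Stmt <- sub("&XVar", XVar, Stmt,     fixed = TRUE)
Stmt   # the text that R actually evaluates

# Evaluate the substituted text; the call records the substituted names.
Model <- eval(parse(text = Stmt))
print(Model$call)
```

In this emulation, R never sees the placeholders: the recorded call contains wind_kts and min_pressure.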