SAS/IML Papers A-Z

A
Session 7660-2016:
A SAS® Macro for Generating Random Numbers of Skew-Normal and Skew-T Distributions
This paper presents a SAS® macro for generating random numbers from skew-normal and skew-t distributions and for computing the quantiles of these distributions. The results are similar to those generated by the sn package in R.
Read the paper (PDF) | Download the data file (ZIP) | View the e-poster or slides (PDF)
Alan Silva, University of Brasilia
Paulo Henrique Dourado da Silva, Banco do Brasil
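As an illustration of the kind of generator the abstract describes, the following is a minimal Python sketch, not the paper's macro. It assumes the standard stochastic representation of the skew-normal distribution (a weighted sum of a half-normal and an independent normal variate) and builds the skew-t by dividing by the square root of a scaled chi-square variate. The function names and the `alpha`, `nu`, `loc`, and `scale` parameters are hypothetical, and quantile computation is not covered.

```python
import math
import random


def skew_normal(alpha, rng, loc=0.0, scale=1.0):
    """Draw one skew-normal variate with shape parameter alpha.

    Assumed representation: Z = delta*|U0| + sqrt(1 - delta^2)*U1,
    where U0 and U1 are independent standard normals and
    delta = alpha / sqrt(1 + alpha^2).
    """
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u0 = abs(rng.gauss(0.0, 1.0))
    u1 = rng.gauss(0.0, 1.0)
    z = delta * u0 + math.sqrt(1.0 - delta * delta) * u1
    return loc + scale * z


def skew_t(alpha, nu, rng, loc=0.0, scale=1.0):
    """Draw one skew-t variate with shape alpha and nu degrees of freedom.

    Assumed representation: a skew-normal variate divided by sqrt(W / nu),
    where W is chi-square with nu degrees of freedom, independent of Z.
    """
    z = skew_normal(alpha, rng)
    w = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) as Gamma(nu/2, scale 2)
    return loc + scale * z / math.sqrt(w / nu)


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed so the run is repeatable
    sample = [skew_normal(5.0, rng) for _ in range(10000)]
    print("sample mean:", sum(sample) / len(sample))
```

With `alpha = 0` both functions reduce to the symmetric normal and Student t cases, which is a convenient sanity check.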
B
Session 7200-2016:
Bayesian Inference for Gaussian Semiparametric Multilevel Models
Bayesian inference for complex hierarchical models with smoothing splines is typically intractable, requiring approximate inference methods for use in practice. Markov Chain Monte Carlo (MCMC) is the standard method for generating samples from the posterior distribution. However, for large or complex models, MCMC can be computationally intensive, or even infeasible. Mean Field Variational Bayes (MFVB) is a fast deterministic alternative to MCMC. It provides an approximating distribution that has minimum Kullback-Leibler distance to the posterior. Unlike MCMC, MFVB efficiently scales to arbitrarily large and complex models. We derive MFVB algorithms for Gaussian semiparametric multilevel models and implement them in SAS/IML® software. To improve speed and memory efficiency, we use block decomposition to streamline the estimation of the large sparse covariance matrix. Through a series of simulations and real data examples, we demonstrate that the inference obtained from MFVB is comparable to that of PROC MCMC. We also provide practical demonstrations of how to estimate additional posterior quantities of interest from MFVB either directly or via Monte Carlo simulation.
Read the paper (PDF) | Download the data file (ZIP)
Jason Bentley, The University of Sydney
Cathy Lee, University of Technology Sydney
I
Session 11420-2016:
Integrating SAS® and R to Perform Optimal Propensity Score Matching
In studies where randomization is not possible, imbalance in baseline covariates (confounding by indication) is a fundamental concern. Propensity score matching (PSM) is a popular method to minimize this potential bias, matching individuals who received treatment to those who did not, to reduce the imbalance in pre-treatment covariate distributions. PSM methods continue to advance as computing resources expand. Optimal matching, which selects the set of matches that minimizes the average difference in propensity scores between matched pairs, has been shown to outperform less computationally intensive methods. However, many find the implementation daunting. SAS/IML® software allows the integration of optimal matching routines that execute in R, such as the R optmatch package. This presentation walks through performing optimal PSM in SAS® by implementing R functions and assessing whether covariate trimming is necessary prior to PSM. It covers the propensity score analysis in SAS, the matching procedure, and the post-matching assessment of covariate balance using SAS/STAT® 13.2 and SAS/IML procedures.
Read the paper (PDF)
Lucy D'Agostino McGowan, Vanderbilt University
Robert Greevy, Department of Biostatistics, Vanderbilt University
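To make the difference between optimal and greedy matching concrete, here is a small Python sketch. It is not the optmatch algorithm and not the presenters' code: it finds the optimal one-to-one match by brute force over all assignments, which is only feasible for a handful of subjects. The propensity scores and function names are hypothetical.

```python
import itertools


def total_distance(treated, controls, assignment):
    """Sum of absolute propensity score differences for a given assignment.

    assignment[i] is the index of the control matched to treated subject i.
    """
    return sum(abs(treated[i] - controls[j]) for i, j in enumerate(assignment))


def greedy_match(treated, controls):
    """Match each treated subject, in order, to its nearest unused control."""
    available = list(range(len(controls)))
    assignment = []
    for score in treated:
        best = min(available, key=lambda j: abs(score - controls[j]))
        assignment.append(best)
        available.remove(best)
    return assignment


def optimal_match(treated, controls):
    """Brute-force search for the assignment with the smallest total distance.

    Requires len(controls) >= len(treated); cost grows factorially, so this
    is for illustration only.
    """
    candidates = itertools.permutations(range(len(controls)), len(treated))
    return list(min(candidates, key=lambda a: total_distance(treated, controls, a)))


if __name__ == "__main__":
    treated = [0.50, 0.60]   # hypothetical propensity scores
    controls = [0.58, 0.30]  # hypothetical propensity scores
    for name, match in (("greedy", greedy_match), ("optimal", optimal_match)):
        pairs = match(treated, controls)
        print(name, pairs, round(total_distance(treated, controls, pairs), 2))
```

In this toy example the greedy pass gives the first treated subject its nearest control and leaves a poor match for the second, while the optimal assignment accepts a slightly worse first match to reduce the total distance.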
M
Session 9080-2016:
MCMC in SAS®: From Scratch or by PROC
Markov chain Monte Carlo (MCMC) algorithms are an essential tool in Bayesian statistics for sampling from various probability distributions. Many users prefer to use an existing procedure to code these algorithms, while others prefer to write an algorithm from scratch. We demonstrate the various capabilities in SAS® software to satisfy both of these approaches. In particular, we first illustrate the ease of using the MCMC procedure to define a structure. Then we step through the process of using SAS/IML® to write an algorithm from scratch, with examples of a Gibbs sampler and a Metropolis-Hastings random walk.
Read the paper (PDF)
Chelsea Lofland, University of California, Santa Cruz
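For readers unfamiliar with the "from scratch" approach the abstract mentions, the following is a minimal Python sketch of a Metropolis-Hastings random walk. It is not the paper's SAS/IML code. The target (a standard normal log density), the step size, and all function names are assumptions chosen for illustration.

```python
import math
import random


def log_standard_normal(x):
    """Log density of the target, up to an additive constant (assumed target)."""
    return -0.5 * x * x


def random_walk_metropolis(log_density, start, step, n_draws, rng):
    """Random-walk Metropolis sampler with a symmetric normal proposal.

    Returns the list of draws and the observed acceptance rate.
    """
    draws = []
    current = start
    current_lp = log_density(current)
    accepted = 0
    for _ in range(n_draws):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws.append(current)
    return draws, accepted / n_draws


if __name__ == "__main__":
    rng = random.Random(9080)  # fixed seed so the run is repeatable
    draws, rate = random_walk_metropolis(log_standard_normal, 0.0, 2.4, 20000, rng)
    kept = draws[2000:]  # discard an initial burn-in segment
    print("acceptance rate:", rate)
    print("posterior mean estimate:", sum(kept) / len(kept))
```

Because the proposal is symmetric, the proposal densities cancel in the acceptance ratio; an asymmetric proposal would need the full Metropolis-Hastings correction.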
O
Session SAS1760-2016:
Outlier Detection Using the Forward Search in SAS/IML® Studio
In cooperation with the Joint Research Centre - European Commission (JRC), we have developed a number of innovative techniques to detect outliers on a large scale. In this session, we show the power of SAS/IML® Studio as an interactive tool for exploring and detecting outliers using customized algorithms that were built from scratch. The JRC uses this for detecting abnormal trade transactions on a large scale. The outliers are detected using the Forward Search, which starts from a central subset in the data and subsequently adds observations that are close to the current subset based on regression (R-student) or multivariate (Mahalanobis distance) output statistics. The implementation of this algorithm and its applications were done in SAS/IML Studio and converted to a macro for use in the IML procedure in Base SAS®.
Read the paper (PDF) | Download the data file (ZIP)
Jos Polfliet, SAS
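The multivariate variant of the Forward Search described above can be sketched in a few lines of Python. This is a simplified illustration restricted to two-dimensional data, not the JRC or SAS/IML Studio implementation. The choice of starting subset (points nearest the coordinate-wise median) and all function names are assumptions.

```python
import math
import statistics


def mean_and_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(point, center, cov):
    """Mahalanobis distance using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance is singular; enlarge the starting subset")
    dx, dy = point[0] - center[0], point[1] - center[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(d2, 0.0))


def forward_search(points, initial_size):
    """Grow a subset one observation at a time, refitting at every step.

    Returns the order in which observations first enter the subset and a
    trace of (subset size, smallest distance among observations outside it).
    """
    n = len(points)
    median = (statistics.median([p[0] for p in points]),
              statistics.median([p[1] for p in points]))
    ranked = sorted(range(n), key=lambda i: math.dist(points[i], median))
    subset = ranked[:initial_size]
    entry_order = list(subset)
    trace = []
    while len(subset) < n:
        center, cov = mean_and_cov([points[i] for i in subset])
        dist = [mahalanobis(p, center, cov) for p in points]
        members = set(subset)
        trace.append((len(subset),
                      min(dist[i] for i in range(n) if i not in members)))
        # The next subset holds the observations closest to the current fit.
        subset = sorted(range(n), key=lambda i: dist[i])[:len(subset) + 1]
        newcomers = [i for i in subset if i not in entry_order]
        entry_order.extend(newcomers)
    return entry_order, trace
```

Observations that enter last, and steps where the traced distance jumps sharply, are the natural candidates to inspect as outliers.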
S
Session 10260-2016:
SAS® Macro for Generalized Method of Moments Estimation for Longitudinal Data with Time-Dependent Covariates
Longitudinal data with time-dependent covariates is not readily analyzed as there are inherent, complex correlations due to the repeated measurements on the sampling unit and the feedback process between the covariates in one time period and the response in another. A generalized method of moments (GMM) logistic regression model (Lalonde, Wilson, and Yin 2014) is one method for analyzing such correlated binary data. While GMM can account for the correlation due to both of these factors, it is imperative to identify the appropriate estimating equations in the model. Cai and Wilson (2015) developed a SAS® macro using SAS/IML® software to fit GMM logistic regression models with extended classifications. In this paper, we expand the use of this macro to allow for continuous responses and as many repeated time points and predictors as possible. We demonstrate the use of the macro through two examples, one with binary response and another with continuous response.
Read the paper (PDF)
Katherine Cai, Arizona State University
Jeffrey Wilson, Arizona State University
Session 10960-2016:
SAS® and R: A Perfect Combination for Sports Analytics
Revolution Analytics reports more than two million R users worldwide. SAS® has the capability to use R code, but users have discovered a slight learning curve to performing certain basic functions such as getting data from the web. R is a functional programming language while SAS is a procedural programming language. These differences create difficulties when first making the switch from programming in R to programming in SAS. However, SAS/IML® software enables integration between the two languages by enabling users to write R code directly into SAS/IML. This paper details the process of using the SAS/IML command Submit /R and the R package XML to get data from the web into SAS/IML. The project uses public basketball data for each of the 30 NBA teams over the past 35 years, taken directly from Basketball-Reference.com. The data was retrieved from 66 individual web pages, cleaned using R functions, and compiled into a final data set composed of 48 variables and 895 records. The seamless compatibility between SAS and R provides an opportunity to use R code in SAS for robust modeling. The resulting analysis provides a clear and concise approach for those interested in pursuing sports analytics.
View the e-poster or slides (PDF)
Matt Collins, The University of Alabama
Taylor Larkin, The University of Alabama
Session 2060-2016:
Simultaneous Forecasts of Multiple Interrelated Time Series with Markov Chain Model
In forecasting, there are often situations where several time series are interrelated: components of one time series can transition into and from other time series. A Markov chain forecast model may readily capture such intricacies through the estimation of a transition probability matrix, which enables a forecaster to forecast all the interrelated time series simultaneously. A Markov chain forecast model is flexible in accommodating various forecast assumptions and structures. Implementation of a Markov chain forecast model is straightforward using SAS/IML® software. This paper demonstrates a real-world application in forecasting a community supervision caseload in Washington State. A Markov model was used to forecast five interrelated time series in the midst of turbulent caseload changes. This paper discusses the considerations and techniques in building a Markov chain forecast model at each step. Sample code using SAS/IML is provided. Anyone interested in adding another tool to their forecasting technique toolbox will find that the Markov approach is useful and has some unique advantages in certain settings.
Read the paper (PDF)
Gongwei Chen, Caseload Forecast Council
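The core of a Markov chain forecast, estimating a transition probability matrix and applying it repeatedly to the current state counts, can be sketched as follows in Python. This is not the paper's SAS/IML sample code. The state layout, the transition counts, and the function names are hypothetical, and the sketch assumes a closed system with no new admissions.

```python
def estimate_transition_matrix(counts):
    """Row-normalize observed transition counts into probabilities.

    counts[i][j] is the number of cases that moved from state i to state j
    in one period.
    """
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("every state needs at least one observed transition")
        matrix.append([c / total for c in row])
    return matrix


def forecast(caseload, matrix, periods):
    """Project all state counts forward together, one period at a time.

    Returns a list of state vectors, starting with the current caseload.
    """
    path = [list(caseload)]
    n_states = len(matrix)
    for _ in range(periods):
        current = path[-1]
        nxt = [sum(current[i] * matrix[i][j] for i in range(n_states))
               for j in range(n_states)]
        path.append(nxt)
    return path


if __name__ == "__main__":
    # Hypothetical three-state example; the third state is absorbing.
    counts = [[80, 15, 5],
              [10, 70, 20],
              [0, 0, 100]]
    matrix = estimate_transition_matrix(counts)
    for step, state in enumerate(forecast([1000.0, 500.0, 0.0], matrix, 4)):
        print(step, [round(v, 1) for v in state])
```

Because every series is driven by the same matrix, the forecasts stay mutually consistent: cases leaving one state always appear in another.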
U
Session 11823-2016:
Upgrade from ARIMA to ARIMAX to Improve Forecasting Accuracy of Nonlinear Time-Series: Create Your Own Exogenous Variables Using Wavelet Analysis
This paper proposes a technique that uses wavelet analysis (WA) to improve the forecasting accuracy of the autoregressive integrated moving average (ARIMA) model for nonlinear time series. Because ARIMA assumes linear correlation and relies on conventional seasonality adjustment methods (that is, differencing, X11, and X12), the model might fail to capture nonlinear patterns. Rather than directly modeling such a signal, we use WA to decompose it into less complex components such as trend, seasonality, process variations, and noise. Then, we use these components as exogenous variables in the autoregressive integrated moving average with explanatory variable (ARIMAX) model. We describe the background of WA. Then, we present the code and a detailed explanation of WA based on multi-resolution analysis (MRA) in SAS/IML® software. The idea and mathematical basis of ARIMA and ARIMAX are also given. Next, we demonstrate our technique in forecasting applications using SAS® Forecast Studio. The demonstrated time series are nonlinear in nature and come from different fields. The results suggest that WA effects are good regressors in ARIMAX, which captures nonlinear patterns well.
Read the paper (PDF)
Woranat Wongdhamma, Oklahoma State University
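A multi-resolution analysis splits a series into a smooth component plus detail components that add back up to the original, and those components can then serve as candidate regressors. The Python sketch below uses the Haar wavelet because it reduces to block averages; the wavelet family used in the paper is not stated in the abstract, so Haar is an assumption, as are the function names.

```python
def block_means(series, width):
    """Replace each consecutive block of the given width by its mean."""
    out = []
    for start in range(0, len(series), width):
        block = series[start:start + width]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


def haar_mra(series, levels):
    """Haar multi-resolution analysis.

    Returns (smooth, details), where smooth is the level-`levels`
    approximation and details[j - 1] is the level-j detail series.
    All components have the same length as the input, and
    smooth + sum(details) reproduces the input.
    """
    if len(series) % (2 ** levels) != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    previous = list(series)
    details = []
    for level in range(1, levels + 1):
        smooth = block_means(series, 2 ** level)
        # The detail at this level is what the coarser smooth leaves out.
        details.append([a - b for a, b in zip(previous, smooth)])
        previous = smooth
    return previous, details


if __name__ == "__main__":
    series = [2.0, 4.0, 6.0, 8.0, 5.0, 3.0, 1.0, 7.0]  # hypothetical data
    smooth, details = haar_mra(series, 2)
    print("smooth:", smooth)
    for level, detail in enumerate(details, start=1):
        print("detail", level, detail)
```

The smooth series plays the role of a trend, while the detail series capture progressively slower fluctuations as the level increases.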
W
Session SAS4201-2016:
Writing Packages: A New Way to Distribute and Use SAS/IML® Programs
SAS/IML® 14.1 enables you to author, install, and call packages. A package consists of SAS/IML source code, documentation, data sets, and sample programs. Packages provide a simple way to share SAS/IML functions. An expert who writes a statistical analysis in SAS/IML can create a package and upload it to the SAS/IML File Exchange. A nonexpert can download the package, install it, and immediately start using it. Packages provide a standard and uniform mechanism for sharing programs, which benefits both experts and nonexperts. Packages are very popular with users of other statistical software, such as R. This paper describes how SAS/IML programmers can construct, upload, download, and install packages. They're not wrapped in brown paper or tied up with strings, but they'll soon be a few of your favorite things!
Read the paper (PDF)
Rick Wicklin, SAS