SAS Global Forum 2017 Proceedings

UNIX and Linux SAS® administrators, have you ever been greeted by one of these statements as you walk into the office, before you have had your first cup of coffee? "Power outage!" "SAS servers are down." "I cannot access my reports." Have you frantically tried to restart the SAS servers to avoid lost productivity, only to miss a step in the process and cause further delays while other work piles up? If you have had this experience, you understand the benefit of a utility that automates the management of these multi-tiered deployments. Until recently, there was no method for automatically starting and stopping multi-tiered services in an orchestrated fashion. Instead, you had to use time-consuming manual procedures to manage SAS services. These procedures were also prone to human error, which could result in corrupted services and additional time lost debugging and resolving the issues that the manual process introduced. To address this challenge, SAS Technical Support created the SAS Local Services Management (SAS_lsm) utility, which provides automated, orderly management of your SAS® multi-tiered deployments. This paper demonstrates the deployment and usage of the SAS_lsm utility. Now, go grab a coffee, and let's see how SAS_lsm can make life less chaotic.

Read the paper (PDF)

In this paper, we introduce a SAS/IML® program for Classification Accuracy and Classification Consistency (CA/CC) that provides useful resources to test analysts and psychometricians. Our program extends the capabilities of SAS® by offering CA/CC statistics not only for dichotomous items, but also for polytomous items. Classification Decision (CD) is a method to categorize examinees into achievement groups based on cut scores (Quinn and Cheng, 2013). CD has been predominantly used in educational and vocational situations such as admissions, selection, placement, or certification. This method needs to be accurate because its outcomes matter to examinees' professional and academic futures. CA/CC statistics are indices representing the precision of CD, and they need to be reported in order to affirm the validity of the CD. Classification Accuracy is the degree to which the classification of observed scores matches the classification of true scores, and Classification Consistency is the degree to which examinees are classified in the same category when taking two parallel test forms (Lee, 2010). Under item response theory (IRT), there are two methods to calculate CA/CC: the Rudner (2001) approach and the Lee (2010) approach. This research deals with both approaches for CA/CC at the examinee level.
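
As an illustration of examinee-level CA/CC, the following Python sketch follows the general idea usually attributed to the Rudner approach: treat the true ability as normally distributed around the estimate, and integrate that distribution between the cut scores. It is not the SAS/IML program from the paper. The function names, the normality assumption, and the use of the sum of squared category probabilities as the consistency index are assumptions made for this example.

```python
from math import erf, sqrt


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of N(mean, sd^2)."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def category_probabilities(theta_hat, se, cuts):
    """Probability that the true ability falls in each category.

    Assumption: true ability ~ N(theta_hat, se^2); `cuts` are cut scores
    on the ability scale.
    """
    cdf_values = [0.0]
    cdf_values += [normal_cdf(c, theta_hat, se) for c in sorted(cuts)]
    cdf_values.append(1.0)
    return [cdf_values[i + 1] - cdf_values[i] for i in range(len(cdf_values) - 1)]


def examinee_ca_cc(theta_hat, se, cuts):
    """Return (accuracy, consistency) for one examinee.

    accuracy:    probability mass in the category the estimate falls into
    consistency: chance that two independent parallel administrations agree,
                 taken here (an assumption) as the sum of squared probabilities
    """
    probs = category_probabilities(theta_hat, se, cuts)
    observed = sum(theta_hat >= c for c in cuts)  # index of the observed category
    accuracy = probs[observed]
    consistency = sum(p * p for p in probs)
    return accuracy, consistency


if __name__ == "__main__":
    # Hypothetical examinee: estimate 1.0, standard error 0.5, one cut score at 0.
    ca, cc = examinee_ca_cc(1.0, 0.5, [0.0])
    print(f"CA = {ca:.4f}, CC = {cc:.4f}")
```

Averaging these per-examinee values over a sample would give group-level indices. The Lee (2010) approach works on the summed-score distribution instead and is not shown here.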

View the e-poster or slides (PDF)

A/B testing is a form of statistical hypothesis testing on two business options (A and B) to determine which is more effective in the modern Internet age. The challenge for startups or new product businesses leveraging A/B testing is twofold: a small number of customers and a poor understanding of their responses. This paper shows you how to use the IML and POWER procedures to reassess the sample size in adaptive, multiple-stage business designs based on conditional power arguments, using the data observed at the previous business stage.
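
To make the idea of a conditional power argument concrete, here is a minimal Python sketch of the standard B-value formulation for a one-sided z-test. It is not the paper's PROC IML or PROC POWER code. The function name, the default significance level, and the "current trend" default for the drift are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim, info_fraction, alpha=0.025, drift=None):
    """Probability of rejecting at the final analysis, given the interim z-statistic.

    z_interim     : z-statistic observed at the interim (previous stage)
    info_fraction : share of the planned information already observed, in (0, 1)
    alpha         : one-sided significance level of the final test
    drift         : assumed expected final z-statistic; by default (an assumption)
                    the current trend, z_interim / sqrt(info_fraction)
    """
    t = info_fraction
    if not 0.0 < t < 1.0:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    b_value = z_interim * sqrt(t)          # B(t) = Z(t) * sqrt(t)
    if drift is None:
        drift = b_value / t                # extrapolate the observed trend
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    remaining = 1.0 - t
    return 1.0 - _STD_NORMAL.cdf((z_crit - b_value - drift * remaining) / sqrt(remaining))


if __name__ == "__main__":
    # Hypothetical interim look halfway through the planned sample.
    for z in (0.5, 1.0, 1.5, 2.0):
        print(f"z = {z:.1f}: conditional power = {conditional_power(z, 0.5):.3f}")
```

A sample size reassessment would typically increase the size of the next stage until this probability reaches a chosen target. Doing so validly also requires a final test that controls the type I error rate, which this sketch does not address.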

Read the paper (PDF)

The analysis of longitudinal data requires a model that correctly accounts for both the inherent correlation among the responses that results from the repeated measurements and the feedback between the responses and predictors at different time points. Lalonde, Wilson, and Yin (2013) developed an approach based on the generalized method of moments (GMM) for identifying and using valid moment conditions to account for time-dependent covariates in longitudinal data with binary outcomes. However, the model developed using this approach does not provide information about the specific relationships that exist across time points. We present a SAS® macro that extends the work of Lalonde, Wilson, and Yin by using valid moment conditions to estimate and evaluate the relationships between the response and predictors at different time periods. The performance of this method is compared to previously established results.

Read the paper (PDF)

The mean-variance model might be the most famous model in the financial field. It can determine the optimal portfolio if you know every asset's expected return and the covariance matrix of the returns. The tangency portfolio is a type of optimal portfolio: it has the best trade-off between expected return (mean) and risk (variance), that is, the highest expected excess return per unit of risk, among all portfolios of risky assets. This paper uses sample data to get the tangency portfolio using SAS/IML® code.
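
For reference, the textbook closed form for the tangency portfolio sets the weights proportional to the inverse covariance matrix times the excess expected returns, then rescales them to sum to 1. The Python sketch below illustrates that formula. It is not the paper's SAS/IML code, and the asset figures, the risk-free rate, and the function names are made up for the example.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def tangency_weights(expected_returns, covariance, risk_free=0.0):
    """Weights proportional to inverse(covariance) @ (mu - risk_free), summing to 1."""
    excess = [mu - risk_free for mu in expected_returns]
    raw = solve_linear_system(covariance, excess)
    total = sum(raw)
    if abs(total) < 1e-12:
        raise ValueError("weights cannot be normalized (they sum to zero)")
    return [w / total for w in raw]


if __name__ == "__main__":
    # Hypothetical two-asset example with uncorrelated returns.
    mu = [0.10, 0.05]
    cov = [[0.04, 0.00],
           [0.00, 0.01]]
    print(tangency_weights(mu, cov, risk_free=0.0))
```

With uncorrelated assets, each weight is proportional to the asset's excess return divided by its variance, so the lower-variance asset receives the larger share in this example.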

Read the paper (PDF) | View the e-poster or slides (PDF)

Dynamic social networks can be used to monitor the constantly changing nature of interactions and relationships between people and groups. The size and complexity of modern dynamic networks can make this task extremely challenging. Using the combination of SAS/IML®, SAS/QC®, and R, we propose a fast approach to monitor dynamic social networks. A discrepancy score at the edge level was developed to measure the unusualness of the observed social network. Then, multivariate and univariate change-point detection methods were applied to the aggregated discrepancy score to identify the edges and vertices that have experienced changes. Stochastic block model (SBM) networks were simulated to demonstrate this method using SAS/IML and R. PROC SHEWHART and PROC CUSUM in SAS/QC and PROC SGRENDER heat maps were applied to the aggregated discrepancy score to monitor the dynamic social network. The combination of SAS/IML, SAS/QC, and R makes an ideal tool for monitoring dynamic social networks.
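
The univariate change-point step can be pictured with a one-sided tabular CUSUM applied to a series of aggregated discrepancy scores. The Python sketch below is a generic version of that chart, not the PROC CUSUM configuration used by the authors. The score series, target, allowance, and decision threshold are hypothetical values chosen for illustration.

```python
def upper_cusum(scores, target, allowance, threshold):
    """One-sided (upper) tabular CUSUM.

    scores    : sequence of aggregated discrepancy scores, one per time point
    target    : in-control mean of the score
    allowance : slack subtracted at each step (often half the shift to detect)
    threshold : decision interval; an alarm is raised when the statistic exceeds it

    Returns the CUSUM path and the list of time indices that raised an alarm.
    """
    statistic = 0.0
    path = []
    alarms = []
    for time_index, score in enumerate(scores):
        # Accumulate deviations above target + allowance, never dropping below 0.
        statistic = max(0.0, statistic + (score - target - allowance))
        path.append(statistic)
        if statistic > threshold:
            alarms.append(time_index)
    return path, alarms


if __name__ == "__main__":
    # Hypothetical scores: in control for three periods, then shifted upward.
    scores = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]
    path, alarms = upper_cusum(scores, target=0.0, allowance=0.5, threshold=2.5)
    print(path)    # [0.0, 0.0, 0.0, 1.5, 3.0, 4.5]
    print(alarms)  # [4, 5]
```

In this made-up series the shift begins at index 3 and the chart signals at index 4, one period later, because the statistic needs time to accumulate past the threshold.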

View the e-poster or slides (PDF)

The SAS/IML® language excels in handling matrices and performing matrix computations. A new feature in SAS/IML 14.2 is support for nonmatrix data structures such as tables and lists. In a matrix, all elements are of the same type: numeric or character. Furthermore, all rows have the same length. In contrast, SAS/IML 14.2 enables you to create a structure that contains many objects of different types and sizes. For example, you can create an array of matrices in which each matrix has a different dimension. You can create a table, which is an in-memory version of a data set. You can create a list that contains matrices, tables, and other lists. This paper describes the new data structures and shows how you can use them to emulate other structures such as stacks, associative arrays, and trees. It also presents examples of how you can use collections of objects as data structures in statistical algorithms.

Read the paper (PDF)

This paper presents %SURVEYGENMOD, a SAS® macro developed in SAS/IML® as an upgrade of the %SURVEYGLM macro developed by Silva and Silva (2014), to deal with complex survey designs in generalized linear models (GLMs). The new capabilities are the negative binomial distribution, the zero-inflated Poisson (ZIP) model, the zero-inflated negative binomial (ZINB) model, and the ability to obtain estimates for domains. The R function svyglm (Lumley, 2004) and Stata software were used for comparison, and the results showed that the estimates generated by the %SURVEYGENMOD macro are close to those from the R function and Stata.
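
For readers unfamiliar with the zero-inflated Poisson model mentioned above, the following Python sketch writes out its probability mass function and a survey-weighted log-likelihood for an intercept-only model. It is not the %SURVEYGENMOD macro. The function names are hypothetical, and weighting each observation's log-likelihood contribution by its sampling weight (a pseudo-likelihood) is an assumption about how a complex design might enter, not a description of the macro's internals.

```python
from math import exp, lgamma, log


def zip_pmf(count, rate, zero_prob):
    """Zero-inflated Poisson probability of observing `count`.

    rate      : Poisson mean (> 0) of the count process
    zero_prob : probability (between 0 and 1) of a structural zero
    """
    if count < 0:
        return 0.0
    poisson = exp(-rate + count * log(rate) - lgamma(count + 1))
    structural_zero = zero_prob if count == 0 else 0.0
    return structural_zero + (1.0 - zero_prob) * poisson


def weighted_zip_loglik(counts, weights, rate, zero_prob):
    """Sum of weight * log-probability over observations (pseudo-log-likelihood)."""
    return sum(w * log(zip_pmf(y, rate, zero_prob)) for y, w in zip(counts, weights))


if __name__ == "__main__":
    # Hypothetical counts with sampling weights.
    counts = [0, 0, 1, 3, 0, 2]
    weights = [1.5, 2.0, 1.0, 0.5, 2.5, 1.0]
    print(weighted_zip_loglik(counts, weights, rate=1.2, zero_prob=0.3))
```

In a regression setting, `rate` and `zero_prob` would each depend on covariates through a link function, and the weighted log-likelihood would be maximized numerically.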