In 2013, the University of North Carolina (UNC) at Chapel Hill initiated enterprise-wide use of SAS® solutions for reporting and data transformations. Just over one year later, the initial rollout was scheduled to go live to an audience of 5,500 users as part of an adoption of PeopleSoft ERP for Finance, Human Resources, Payroll, and Student systems. SAS® Visual Analytics was used for primary report delivery as an embedded resource within the UNC Infoporte, an existing portal. UNC made the date. With the SAS solutions, UNC delivered the data warehouse and initial reports on the same day that the ERP systems went live. After the success of the initial launch, UNC continues to develop and evolve the solution with additional technologies, data, and reports. This presentation touches on a few of the elements required for a medium- to large-sized organization to integrate SAS solutions such as SAS Visual Analytics and SAS® Enterprise Business Intelligence within its infrastructure.
Jonathan Pletzke, UNC Chapel Hill
In the very near future you will likely encounter Hadoop. It is rapidly displacing database management systems in the corporate world and is rearing its head in the SAS® world. If you think now is the time to learn how to use SAS with Hadoop, you are in luck. This workshop is the jump start you need. This workshop introduces Hadoop and shows you how to access it by using SAS/ACCESS® Interface to Hadoop. During the workshop, we show you how to do the following: configure your SAS environment so that you can access Hadoop data; use the Hadoop FILENAME statement; use the HADOOP procedure; and use SAS/ACCESS Interface to Hadoop (including performance tuning).
Diane Hatcher, SAS
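As a taste of what the workshop covers, here is a hedged sketch of the three access methods. The HDFS paths, server name, and credentials are placeholders, and the exact options required vary by Hadoop distribution and SAS/ACCESS configuration:

    /* FILENAME access method: read a file directly from HDFS */
    filename hdfsin hadoop '/user/sasdemo/export.csv';
    data work.export;
       infile hdfsin dsd firstobs=2;
       input id sales;
    run;

    /* PROC HADOOP: submit HDFS commands from SAS */
    proc hadoop username='sasdemo' verbose;
       hdfs mkdir='/user/sasdemo/staging';
    run;

    /* SAS/ACCESS Interface to Hadoop: treat Hive tables as a SAS libref */
    libname hdp hadoop server='hive-server.example.com' user=sasdemo;
    proc means data=hdp.sales_fact;
       var amount;
    run;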
This paper provides an overview of analysis of data derived from complex sample designs. General discussion of how and why analysis of complex sample data differs from standard analysis is included. In addition, a variety of applications are presented using PROC SURVEYMEANS, PROC SURVEYFREQ, PROC SURVEYREG, PROC SURVEYLOGISTIC, and PROC SURVEYPHREG, with an emphasis on correct usage and interpretation of results.
Patricia Berglund, University of Michigan
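For readers new to the survey procedures, a minimal sketch of the common design syntax follows; the data set and design variable names are hypothetical:

    proc surveymeans data=work.survey mean stderr;
       strata stratum;      /* design strata */
       cluster psu;         /* primary sampling units */
       weight samp_wt;      /* analysis weight */
       var bmi;
    run;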
Managing the large-scale displacement of people and communities caused by a natural disaster has historically been reactive rather than proactive. Following a disaster, data is collected to inform and prompt operational responses. In many countries prone to frequent natural disasters, such as the Philippines, large amounts of longitudinal data are collected and available to apply to new disaster scenarios. However, because of the nature of natural disasters, it is difficult to analyze all of the data until long after the emergency has passed. For this reason, little research and analysis have been conducted to derive deeper analytical insight for proactive responses. This paper demonstrates the application of SAS® analytics to this data and establishes predictive alternatives that can improve conventional storm responses. Humanitarian organizations can use this data to understand displacement patterns and trends and to optimize evacuation routing and planning. Identifying the main contributing factors and leading indicators for the displacement of communities in a timely and efficient manner prevents detrimental incidents at disaster evacuation sites. Using quantitative and qualitative methods, responding organizations can make data-driven decisions that innovate and improve approaches to managing disaster response on a global basis. The benefits of creating a data-driven analytical model include reduced response time, improved health and safety of displaced individuals, and more effective use of scarce resources. The International Organization for Migration (IOM), an intergovernmental organization, is one of the first-response organizations on the ground in most emergencies, and it is the global co-lead for the Camp Coordination and Camp Management (CCCM) cluster in natural disasters. This paper shows how SAS® Visual Analytics and SAS® Visual Statistics were used in the Philippines in response to Super Typhoon Haiyan in November 2013 to develop increasingly accurate models for better emergency preparedness. Using data collected from IOM's Displacement Tracking Matrix (DTM), the final analysis shows how to better coordinate service delivery to evacuation centers sheltering large numbers of displaced individuals, applying accurate hindsight to develop foresight on how to better respond to emergencies and disasters. Predictive models build on patterns found in historical and transactional data to identify risks and opportunities. The capacity to predict trends and behavior patterns related to displacement and mobility has the potential to enable the IOM to respond in a more timely and targeted manner. By predicting the locations of displacement, numbers of persons displaced, number of vulnerable groups, and sites at most risk of security incidents, humanitarians can respond quickly and more effectively with the appropriate resources (material and human) from the outset. The end analysis uses the SAS® Storm Optimization model combined with human mobility algorithms to predict population movement.
Lorelle Yuen, International Organization for Migration
Kathy Ball, Devon Energy
We have to pull data from several data files in creating our working databases. The simplest use of SAS® hash objects greatly reduces the time required to draw data from many sources when compared to the use of multiple PROC SORT and MERGE steps.
Andrew Dagis, City of Hope
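A minimal sketch of the technique, using hypothetical WORK data sets: a single hash lookup replaces sorting both sources and merging them.

    data work.combined;
       if 0 then set work.demographics;            /* define AGE and SEX in the PDV */
       if _n_ = 1 then do;
          declare hash demo(dataset: 'work.demographics');
          demo.defineKey('patient_id');
          demo.defineData('age', 'sex');
          demo.defineDone();
       end;
       set work.labs;                              /* no sorting required */
       if demo.find() = 0;                         /* keep lab rows with a matching patient */
    run;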
Over the years, the SAS® Business Intelligence platform has proved its importance in this big data world with its suite of applications that enable us to efficiently process, analyze, and transform huge amounts of business data. Within the data warehouse universe, 'batch execution' sits at the heart of SAS Data Integration technologies. On a day-to-day basis, batches run, and the current status of the batch is generally sent out to the team or to the client as a 'static' e-mail or report. From experience, we know that these don't provide much insight into the real 'bits and bytes' of a batch run. Imagine if the status of the running batch were automatically captured in one central repository and presented in a web browser on your computer or iPad. All this can be achieved without asking anybody to send reports, and with all 'post-batch' queries answered automatically with a click. This paper presents a framework designed specifically to automate the reporting aspects of SAS batches. And yes, it is all about collecting statistics of the batch, so we call it 'BatchStats.'
Prajwal Shetty, Tesco HSC
Kaiser Permanente Northwest is contractually obligated for regulatory submissions to Oregon Health Authority, Health Share of Oregon, and Molina Healthcare in Washington. The submissions consist of Medicaid Encounter data for medical and pharmacy claims. SAS® programs are used to extract claims data from Kaiser's claims data warehouse, process the data, and produce output files in HIPAA ASC X12 and NCPDP format. Prior to April 2014, programs were written in SAS® 8.2 running on a VAX server. Several key drivers resulted in the conversion of the existing system to SAS® Enterprise Guide® 5.1 running on UNIX. These drivers were: the need to have a scalable system in preparation for the Affordable Care Act (ACA); performance issues with the existing system; incomplete process reporting and notification to business owners; and a highly manual, labor-intensive process of running individual programs. The upgraded system addressed these drivers. The estimated cost reduction was from $1.30 per reported encounter to $0.13 per encounter. The converted system provides for better preparedness for the ACA. One expected result of ACA is significant Medicaid membership growth. The program has already increased in size by 50% in the preceding 12 months. The updated system allows for the expected growth in membership.
Eric Sather, Kaiser Permanente
We regularly speak with organizations running established SAS® 9.1.3 systems that have not yet upgraded to a later version of SAS®. Often this is because their current SAS 9.1.3 environment is working fine, and no compelling event to upgrade has materialized. Now that SAS 9.1.3 has moved to a lower level of support and some very exciting technologies (Hadoop, cloud, ever-better scalability) are more accessible than ever using SAS® 9.4, the case for migrating from SAS 9.1.3 is strong. Upgrading a large SAS ecosystem with multiple environments, an active development stream, and a busy production environment can seem daunting. This paper aims to demystify the process, suggesting outline migration approaches for a variety of the most common scenarios in SAS 9.1.3 to SAS 9.4 upgrades, and a scalable template project plan that has been proven at a range of organizations.
David Stern, SAS
So you are still writing SAS® DATA steps and SAS macros and running them through a command-line scheduler. When work comes in, there is only one person who knows that code, and they are out--what to do? This paper shows how SAS applies extract, transform, load (ETL) modernization techniques with SAS® Data Integration Studio to gain resource efficiencies and to break down the ETL black box. We share the fundamentals (metadata foldering and naming standards) that ensure success, along with steps to ease into the pool while iteratively gaining benefits. Benefits include self-documenting code visualization, impact analysis on jobs and tables affected by change, and supportability by interchangeable bench resources. We conclude by demonstrating how SAS® Visual Analytics is being used to monitor service-level agreements and provide actionable insights into job-flow performance and scheduling.
Brandon Kirk, SAS
The DATA step enables you to read, write, and manipulate many types of data. As data evolves to a more free-form state, the ability of SAS® to handle character data becomes increasingly important. This presentation, expanded and enhanced from an earlier version, addresses character data from multiple vantage points. For example, what is the default length of a character string, and why does it appear to change under different circumstances? Special emphasis is given to the myriad functions that can facilitate the processing and manipulation of character data. This paper is targeted at a beginning to intermediate audience.
Andrew Kuligowski, HSN
Swati Agarwal, OptumInsight
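One of the paper's motivating questions, the default length of a character string, can be illustrated in a few lines; this is a hedged sketch:

    data _null_;
       x = 'Hello';                 /* length of X is fixed at 5 by this first assignment */
       x = catx(' ', x, 'world');   /* result is silently truncated back to 'Hello' */
       y = propcase('SAS GLOBAL FORUM');
       put x= y=;                   /* x=Hello  y=Sas Global Forum */
    run;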
With all the talk of 'big data' and 'visual analytics' we sometimes forget how important it is, and often how hard it is, to get external data into SAS®. In this paper, we review some common data sources such as delimited sources (for example, CSV), as well as structured flat files, and the programming steps needed to successfully load these files into SAS. In addition to examining the INFILE and INPUT statements, we look at some methods for dealing with bad data. This paper assumes only basic SAS skills, although the topic can be of interest to anyone who needs to read external files.
Peter Eberhardt, Fernwood Consulting Group Inc.
Audrey Yeo, Athlene
This session is an introduction to predictive analytics and causal analytics in the context of improving outcomes. The session covers the following topics: 1) Basic predictive analytics vs. causal analytics; 2) The causal analytics framework; 3) Testing whether the outcomes improve because of an intervention; 4) Targeting the cases that have the best improvement in outcomes because of an intervention; and 5) Tweaking an intervention in a way that improves outcomes further.
Jason Pieratt, Humana
Data analysis begins with cleaning up data, calculating descriptive statistics, and examining variable distributions. Before more rigorous statistical analysis begins, many statisticians perform basic inferential statistical tests, such as chi-square and t tests, to assess unadjusted associations. These tests help guide the direction of the more rigorous statistical analysis. We present how to perform chi-square and t tests, explain how to interpret the output and where to look for the association or difference based on the hypothesis being tested, and propose next steps for further analysis using example data.
Maribeth Johnson, Georgia Regents University
Jennifer Waller, Georgia Regents University
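A hedged sketch of the two tests using the SASHELP.HEART sample data:

    proc freq data=sashelp.heart;
       tables sex*status / chisq;    /* chi-square test of association */
    run;

    proc ttest data=sashelp.heart;
       class sex;                    /* two-sample t test by group */
       var cholesterol;
    run;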
In Bayesian statistics, Markov chain Monte Carlo (MCMC) algorithms are an essential tool for sampling from probability distributions. PROC MCMC provides ready-made implementations of these algorithms. However, it is often desirable to code an algorithm from scratch, especially in academia, where students are expected to be able to understand and code an MCMC algorithm themselves. That SAS® can accomplish this is relatively unknown, yet it is quite straightforward. We use SAS/IML® to demonstrate methods for coding an MCMC algorithm, with examples of a Gibbs sampler and a Metropolis-Hastings random walk.
Chelsea Lofland, University of California Santa Cruz
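As a flavor of the approach, here is a hedged sketch of a random-walk Metropolis sampler in SAS/IML® targeting a standard normal density; the proposal scale and chain length are illustrative tuning choices:

    proc iml;
       start logTarget(x);
          return (-0.5 * x##2);              /* log density of a standard normal target */
       finish;

       call randseed(4321);
       n = 10000;
       chain = j(n, 1, .);
       x = 0;                                /* starting value */
       do i = 1 to n;
          e = j(1, 1, .);
          call randgen(e, 'Normal', 0, 0.5); /* random-walk proposal step */
          prop = x + e;
          u = j(1, 1, .);
          call randgen(u, 'Uniform');
          if log(u) < logTarget(prop) - logTarget(x) then x = prop;  /* Metropolis accept/reject */
          chain[i] = x;
       end;
       print (mean(chain))[label='sample mean'] (std(chain))[label='sample std'];
    quit;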
With the expansive new features in SAS® Visual Analytics 7.1, you can now take control of the graph data while viewing a report. Using parameterized expressions, calculated items, custom categories, and prompt controls, you can now change the measures or categories on a graph from a mobile device or web viewer. View your data from different perspectives while using the same graph. This paper demonstrates how you can use these features in SAS® Visual Analytics Designer to create reports in which graph roles can be dynamically changed with the click of a button.
Kenny Lui, SAS
The DATA step has served SAS® programmers well over the years, but the new, exciting, and powerful DS2 offers a significant alternative, introducing an object-oriented programming environment. It enables users to effectively manipulate complex data and efficiently manage programming through additional data types, programming structure elements, user-defined methods, and shareable packages, as well as threaded execution. This tutorial is based on our experience getting started with DS2 and learning to use it to access, manage, and share data in a scalable, standards-based way. It helps SAS users of all levels get started with DS2 and understand its basic functionality by practicing its features.
Peter Eberhardt, Fernwood Consulting Group Inc.
Xue Yao, Winnipeg Regional Health Authority
Graduate students often need to explore data and summarize multiple statistical models into tables for a dissertation. The challenges of data summarization include coding multiple, similar statistical models, and summarizing these models into meaningful tables for review. The default method is to type (or copy and paste) results into tables. This often takes longer than creating and running the analyses. Students might spend hours creating tables, only to have to start over when a change or correction in the underlying data requires the analyses to be updated. This paper gives graduate students the tools to efficiently summarize the results of statistical models in tables. These tools include a macro-based SAS/STAT® analysis and ODS OUTPUT statement to summarize statistics into meaningful tables. Specifically, we summarize PROC GLM and PROC LOGISTIC output. We convert an analysis of hospital-acquired delirium from hundreds of pages of output into three formatted Microsoft Excel files. This paper is appropriate for users familiar with basic macro language.
Elisa Priest, Texas A&M University Health Science Center
Ashley Collinsworth, Baylor Scott & White Health/Tulane University
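A hedged sketch of the core idea, using the SASHELP.HEART sample data: ODS OUTPUT captures a procedure's result tables as ordinary data sets, which can then be exported (the PROC EXPORT step assumes SAS/ACCESS® Interface to PC Files for DBMS=XLSX):

    ods output ParameterEstimates=work.pe OddsRatios=work.or;
    proc logistic data=sashelp.heart;
       model status(event='Dead') = ageatstart diastolic;
    run;

    /* the captured tables are ordinary data sets, ready for formatting or export */
    proc export data=work.or outfile='odds_ratios.xlsx' dbms=xlsx replace;
    run;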
Intervals have been a feature of Base SAS® for a long time, enabling SAS users to work with commonly (and not-so-commonly) defined periods of time such as years, months, and quarters. With the release of SAS®9, there are more options and capabilities for intervals and their functions. This paper first discusses the basics of intervals in detail, and then discusses several of the enhancements to the interval feature, such as the ability to select how the INTCK function defines interval boundaries and the ability to create your own custom intervals beyond multipliers and shift operators.
Derek Morgan
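A small illustration of the INTCK boundary-method enhancement mentioned above; a hedged sketch:

    data _null_;
       d1 = '31JAN2015'd;
       d2 = '01FEB2015'd;
       disc = intck('month', d1, d2);        /* 1: a month boundary was crossed */
       cont = intck('month', d1, d2, 'c');   /* 0: less than one full month elapsed */
       put disc= cont=;
    run;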
We have many options for performing merges or joins these days, each with various advantages and disadvantages. Depending on how you perform your joins, different checks can help you verify whether the join was successful. In this presentation, we look at some sample data, use different methods, and see what kinds of tests can be done to ensure that the results are correct. If the join is performed with PROC SQL and two criteria are fulfilled (the number of observations in the primary data set has not changed [presuming a one-to-one or many-to-one situation], and a variable that should be populated is not missing), then the merge was successful.
Emmy Pahmer, inVentiv Health
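A hedged sketch of the two checks, with hypothetical ORDERS and PRODUCTS tables:

    proc sql;
       create table work.joined as
       select o.*, p.price
       from work.orders o left join work.products p
          on o.product_id = p.product_id;

       /* check 1: row count unchanged (many-to-one join) */
       select count(*) as n_orders from work.orders;
       select count(*) as n_joined from work.joined;

       /* check 2: a variable that should be populated is not missing */
       select nmiss(price) as n_unmatched from work.joined;
    quit;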
Today, employers and universities alike must choose the most talented individuals from a large pool. However, it is difficult to determine whether a student's A+ in English means that he or she can write as proficiently as another student who writes as a hobby. As a result, there are now dozens of ways to compare individuals on one spectrum or another. For example, the ACT and SAT enable universities to view a student's performance on a test given to all applicants in order to help determine whether they will be successful. High schools use students' GPAs in order to compare them to one another for academic opportunities. The WorkKeys Exam enables employers to rate prospective employees on their abilities to perform day-to-day business activities. Rarely do standardized tests and in-class performance get compared to each other. We used SAS® to analyze the GPAs and WorkKeys Exam results of 285 seniors who attend the Phillip O Berry Academy. The purpose was to compare class standing to what a student can prove he or she knows in a standardized environment. Emphasis is on the use of PROC SQL in SAS® 9.3 rather than DATA step processing.
Jonathan Gomez Martinez, Phillip O Berry Academy of Technology
Christopher Simpson, Phillip O Berry Academy of Technology
This paper explains best practices for using temporary files in SAS® programs. These practices include using the TEMP access method, writing to the WORK directory, and ensuring that you leave no litter files behind. An additional special temporary file technique is described for mainframe users.
Rick Langston, SAS
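A minimal sketch of the TEMP access method, which leaves no litter files behind:

    filename scratch temp;          /* created in WORK, deleted automatically at session end */

    data _null_;
       file scratch;
       put 'intermediate results';
    run;

    data _null_;
       infile scratch;
       input;
       put _infile_;
    run;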
Programming SAS® has just been made easier, now that SAS 9.4 has incorporated the Lua programming language into the heart of the SAS System. With its elegant syntax, modern design, and support for data structures, Lua offers you a fresh way to write SAS programs, getting you past many of the limitations of the SAS macro language. This paper shows how you can get started using Lua to drive SAS, via a quick introduction to Lua and a tour through some of the features of the Lua and SAS combination that make SAS programming easier. SAS macro programming is also compared with Lua, so that you can decide where you might benefit most by using either language.
Paul Tomas, SAS
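As a flavor of driving SAS from Lua, here is a hedged sketch; the @name@ substitution shown assumes PROC LUA's macro-style resolution of local Lua variables inside sas.submit:

    proc lua;
    submit;
       local dsn = 'sashelp.class'
       -- loop in Lua; @dsn@ and @stat@ resolve from the local variables
       for _, stat in ipairs({'mean', 'max'}) do
          sas.submit([[proc means data=@dsn@ @stat@; var height; run;]])
       end
    endsubmit;
    run;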
Do you need to deliver business insight and analytics to support decision-making? Using SAS® Enterprise Guide®, you can access the full power of SAS® for analytics, without needing to learn the details of SAS programming. This presentation focuses on the following uses of SAS Enterprise Guide: exploring and understanding (getting a feel for your data and for its issues and anomalies); visualizing (looking at the relationships, trends, and surprises); consolidating (starting to piece together the story); and presenting (building the insight and analytics into a presentation using SAS Enterprise Guide).
Marje Fecht, Prowerk Consulting
The use of Bayesian methods has become increasingly popular in modern statistical analysis, with applications in numerous scientific fields. In recent releases, SAS® software has provided a wealth of tools for Bayesian analysis, with convenient access through several popular procedures in addition to the MCMC procedure, which is designed for general Bayesian modeling. This paper introduces the principles of Bayesian inference and reviews the steps in a Bayesian analysis. It then uses examples from the GENMOD and PHREG procedures to describe the built-in Bayesian capabilities, which became available for all platforms in SAS/STAT® 9.3. Discussion includes how to specify prior distributions, evaluate convergence diagnostics, and interpret the posterior summary statistics.
Maura Stokes, SAS
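As a hedged sketch of the built-in capabilities, a BAYES statement added to an otherwise ordinary PROC GENMOD call requests posterior sampling; the data set and variables are hypothetical, and default priors apply unless overridden:

    proc genmod data=work.trial;
       class treatment;
       model events = treatment / dist=poisson link=log;
       bayes seed=1 nmc=10000 outpost=work.post;   /* posterior draws saved for diagnostics */
    run;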
As many leadership experts suggest, growth happens only when it is intentional. Growth is vital to the immediate and long-term success of our employees as well as our employers. As SAS® programming leaders, we have a responsibility to encourage individual growth and to provide the opportunity for it. With an increased workload yet fewer resources, initial and ongoing training seem to be deemphasized as we are pressured to meet project timelines. The current workforce continues to evolve with time and technology. More important than simply providing the opportunity for training, individual trainees need the motivation for any training program to be successful. Although many existing principles for growth remain true, how such principles are applied needs to evolve with the current generation of SAS programmers. The primary goal of this poster is to identify the critical components that we feel are necessary for the development of an effective training program, one that meets the individual needs of the current workforce. Rather than proposing a single 12-step program that works for everyone, we think that identifying key components for enhancing the existing training infrastructure is a step in the right direction.
Amber Randall, Axio Research
Proving difference is the point of most statistical testing. In contrast, the point of equivalence and noninferiority tests is to prove that results are substantially the same, or at least not appreciably worse. An equivalence test can show that a new treatment, one that is less expensive or causes fewer side effects, can replace a standard treatment. A noninferiority test can show that a faster manufacturing process creates no more product defects or industrial waste than the standard process. This paper reviews familiar and new methods for planning and analyzing equivalence and noninferiority studies in the POWER, TTEST, and FREQ procedures in SAS/STAT® software. Techniques that are discussed range from Schuirmann's classic method of two one-sided tests (TOST) for demonstrating similar normal or lognormal means in bioequivalence studies, to Farrington and Manning's noninferiority score test for showing that an incidence rate (such as a rate of mortality, side effects, or product defects) is no worse. Real-world examples from clinical trials, drug development, and industrial process design are included.
John Castelloe, SAS
Donna Watts, SAS
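A hedged sketch of Schuirmann's TOST in PROC TTEST for a lognormal bioequivalence comparison; the data set and variable names are hypothetical, and 0.8 to 1.25 are the conventional bioequivalence bounds:

    proc ttest data=work.pharm dist=lognormal tost(0.8, 1.25);
       paired test*reference;    /* paired test/reference bioequivalence comparison */
    run;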
As organizations strive to do more with fewer resources, many modernize their disparate PC operations to centralized server deployments. Administrators and users share many concerns about using SAS® on a Microsoft Windows server. This paper outlines key guidelines, plus architecture and performance considerations, that are essential to making a successful transition from PC to server. This paper outlines the five key considerations for SAS customers who will change their configuration from PC-based SAS to using SAS on a Windows server: 1) Data and directory references; 2) Interactive and surrounding applications; 3) Usability; 4) Performance; 5) SAS Metadata Server.
Kate Schwarz, SAS
Donna Bennett, SAS
Margaret Crevar, SAS
Are you looking to track changes to your SAS® programs? Do you wish you could easily find errors, warnings, and notes in your SAS logs? Looking for a convenient way to find point-and-click tasks? Want to search your SAS® Enterprise Guide® project? How about a point-and-click way to view SAS system options and SAS macro variables? Or perhaps you want to upload data to the SAS® LASR™ Analytics Server, view SAS® Visual Analytics reports, or run SAS® Studio tasks, all from within SAS Enterprise Guide? You can find these capabilities and more in SAS Enterprise Guide. Knowing what tools are at your disposal and how to use them will put you a step ahead of the rest. Come learn about some of the newer features in SAS Enterprise Guide 7.1 and how you can leverage them in your work.
Casey Smith, SAS
During grad school, students learn SAS® in class or on their own for a research project. Time is limited, so faculty have to focus on what they know are the fundamental skills that students need to successfully complete their coursework. However, real-world research projects are often multifaceted and require a variety of SAS skills. When students transition from grad school to a paying job, they might find that in order to be successful, they need more than the basic SAS skills that they learned in class. This paper highlights 10 insights that I've had over the past year during my transition from grad school to a paying SAS research job. I hope this paper will help other students make a successful transition. Top 10 insights: 1. You still get graded, but there is no syllabus. 2. There isn't time for perfection. 3. Learn to use your resources. 4. There is more than one solution to every problem. 5. Asking for help is not a weakness. 6. Working with a team is required. 7. There is more than one type of SAS®. 8. The skills you learned in school are just the basics. 9. Data is complicated and often frustrating. 10. You will continue to learn both personally and professionally.
Lauren Hall, Baylor Scott & White Health
Elisa Priest, Texas A&M University Health Science Center
Companies spend vast amounts of resources developing and enhancing proprietary software to clean their business data. Save time and obtain more accurate results by leveraging the SAS® Quality Knowledge Base (QKB), formerly a DataFlux® Data Quality technology. Tap into the existing QKB rules for cleansing contact information or product data, or easily design your own custom rules using the QKB editing tools. The QKB enables data management operations such as parsing, standardization, and fuzzy matching for contact information such as names, organizations, addresses, and phone numbers, or for product data attributes such as materials, colors, and dimensions. The QKB supports data in native character sets in over 38 locales. A single QKB can be shared by multiple SAS® Data Management installations across your enterprise, ensuring consistent results on workstations, servers, and massive parallel processing systems such as Hadoop. In this breakout, a SAS R&D manager demonstrates the power and flexibility of the QKB, and answers your questions about how to deploy and customize the QKB for your environment.
Brian Rineer, SAS
While there has been tremendous progress in technologies related to data storage, high-performance computing, and advanced analytic techniques, organizations have only recently begun to comprehend the importance of parallel strategies that help manage the cacophony of concerns around access, quality, provenance, data sharing, and use. While data governance is not new, the drumbeat around it, along with master data management and data quality, is approaching a crescendo. Intensified by the increase in consumption of information, expectations about ubiquitous access, and highly dynamic visualizations, these factors are also circumscribed by security and regulatory constraints. In this paper, we provide a summary of what data governance is and its importance. We go beyond the obvious and provide practical guidance on what it takes to build out a data governance capability appropriate to the scale, size, and purpose of the organization and its culture. Moreover, we discuss best practices in the form of requirements that highlight what we think is important to consider as you provide that tactical linkage between people, policies, and processes to the actual data lifecycle. To that end, our focus includes the organization and its culture, people, processes, policies, and technology. Further, we include discussions of organizational models as well as the role of the data steward, and provide guidance on how to formalize data governance into a sustainable set of practices within your organization.
Greg Nelson, ThotWave
Lisa Dodson, SAS
This presentation provides a brief introduction to logistic regression analysis in SAS. Learn the differences between linear regression and logistic regression, including ordinary least squares versus maximum likelihood estimation. Learn to understand LOGISTIC procedure syntax, use continuous and categorical predictors, and interpret output from ODS Graphics.
Danny Modlin, SAS
For decades, mixed models have been used by researchers to account for random sources of variation in regression-type models. Now they are gaining favor in business statistics for giving better predictions for naturally occurring groups of data, such as sales reps, store locations, or regions. Learn how predictions based on a mixed model differ from predictions in ordinary regression, and see examples of mixed models with business data.
Catherine Truxillo, SAS
Text data constitutes more than half of the unstructured data held in organizations. Buried within the narrative of customer inquiries, the pages of research reports, and the notes in servicing transactions are the details that describe concerns, ideas, and opportunities. The manual effort historically needed to develop a training corpus is no longer required, making it simpler to gain the insight buried in unstructured text. Combining the ease of machine learning with the specificity of linguistic rules, SAS Contextual Analysis helps analysts identify and evaluate the meaning of the electronic written word. From a single point-and-click GUI, the process of developing text models is guided and visually intuitive. This presentation walks through the text model development process with SAS Contextual Analysis. The results are in SAS format, ready for text-based insights to be used in any other SAS application.
George Fernandez, SAS
SAS/ETS provides many tools to improve the productivity of the analyst who works with time series data. This tutorial will take an analyst through the process of turning transaction-level data into a time series. The session will then cover some basic forecasting techniques that use past fluctuations to predict future events. We will then extend this modeling technique to include explanatory factors in the prediction equation.
Kenneth Sanford, SAS
When ODS Graphics was introduced a few years ago, it gave SAS users an array of new ways to generate graphs. One of those ways is with Statistical Graphics procedures. Now, with just a few lines of simple code, you can create a wide variety of high-quality graphs. This paper shows how to produce single-celled graphs using PROC SGPLOT and paneled graphs using PROC SGPANEL. This paper also shows how to send your graphs to different ODS destinations, how to apply ODS styles to your graphs, and how to specify properties of graphs, such as format, name, height, and width.
Susan Slaughter, Avocet Solutions
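A hedged sketch of both procedures and an ODS destination, using the SASHELP.CLASS sample data:

    ods pdf file='graphs.pdf' style=journal;   /* route output to a PDF with the JOURNAL style */

    proc sgplot data=sashelp.class;
       scatter x=height y=weight / group=sex;  /* single-celled scatter plot */
    run;

    proc sgpanel data=sashelp.class;
       panelby sex;                             /* one panel per group */
       histogram height;
    run;

    ods pdf close;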
The SAS® Macro Language is a powerful tool for extending the capabilities of the SAS® System. This hands-on workshop teaches essential macro coding concepts, techniques, tips, and tricks to help beginning users learn the basics of how the macro language works. Using a collection of proven macro language coding techniques, attendees learn how to write and process macro statements and parameters; replace text strings with macro (symbolic) variables; generate SAS code using macro techniques; manipulate macro variable values with macro functions; create and use global and local macro variables; construct simple arithmetic and logical expressions; interface the macro language with the SQL procedure; store and reuse macros; troubleshoot and debug macros; and develop efficient and portable macro language code.
Kirk Paul Lafler, Software Intelligence Corporation
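A small, hedged example of the kind of macro the workshop builds up to, showing parameters, %SYSFUNC, and conditional macro logic:

    %macro summarize(dsn, var);
       %if %sysfunc(exist(&dsn)) %then %do;
          proc means data=&dsn n mean min max;
             var &var;
          run;
       %end;
       %else %put ERROR: data set &dsn does not exist.;
    %mend summarize;

    %summarize(sashelp.class, height)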
Graduate students encounter many challenges when conducting health services research using real world data obtained from electronic health records (EHRs). These challenges include cleaning and sorting data, summarizing and identifying present-on-admission diagnosis codes, identifying appropriate metrics for risk-adjustment, and determining the effectiveness and cost effectiveness of treatments. In addition, outcome variables commonly used in health service research are not normally distributed. This necessitates the use of nonparametric methods in statistical analyses. This paper provides graduate students with the basic tools for the conduct of health services research with EHR data. We will examine SAS® tools and step-by-step approaches used in an analysis of the effectiveness and cost-effectiveness of the ABCDE (Awakening and Breathing Coordination, Delirium monitoring/management, and Early exercise/mobility) bundle in improving outcomes for intensive care unit (ICU) patients. These tools include the following: (1) ARRAYS; (2) lookup tables; (3) LAG functions; (4) PROC TABULATE; (5) recycled predictions; and (6) bootstrapping. We will discuss challenges and lessons learned in working with data obtained from the EHR. This content is appropriate for beginning SAS users.
Ashley Collinsworth, Baylor Scott & White Health/Tulane University
Elisa Priest, Texas A&M University Health Science Center
SAS® users are already familiar with the FCMP procedure and the flexibility it provides them in writing their own functions and subroutines. However, did you know that FCMP also allows you to call functions written in C? Did you know that you can create and populate complex C structures and use C types in FCMP? With the PROTO procedure, you can define function prototypes, structures, enumeration types, and even small bits of C code. This paper gets you started on how to use the PROTO procedure and, in turn, how to call your C functions from within FCMP and SAS.
Andrew Henrick, SAS
Karen Croft, SAS
Donald Erdman, SAS
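A hedged sketch of the round trip: PROC PROTO compiles a small C function, PROC FCMP wraps it, and a DATA step calls the wrapper. The package and function names are illustrative:

    proc proto package=work.protolib.cpkg;
       double c_add(double a, double b);
       externc c_add;
          double c_add(double a, double b)
          {
             return a + b;
          }
       externcend;
    run;

    proc fcmp inlib=work.protolib outlib=work.funcs.wrappers;
       function add2(a, b);
          return (c_add(a, b));   /* call the C function from FCMP */
       endsub;
    run;

    options cmplib=(work.funcs work.protolib);
    data _null_;
       total = add2(3, 4);
       put total=;                /* total=7 */
    run;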
The first task to accomplish our SAS® 9.4 installation goal is to create an Amazon Web Services (AWS) secured EC2 (Elastic Compute Cloud) instance called a Virtual Private Cloud (VPC). Through a series of wizard-driven dialog boxes, the SAS administrator selects virtual CPUs (vCPUs, which have about a 2:1 ratio to cores), memory, storage, and network performance considerations via regional availability zones. Then, there is a prompt to create a VPC that will be housed within the EC2 instance, along with a major component called subnets. A step to create a security group is next, which enables the SAS administrator to specify all of the VPC firewall port rules required for the SAS 9.4 application. Next, the EC2 instance is reviewed and a security key pair is either selected or created. Then the EC2 instance launches. At this point, Internet connectivity to the EC2 instance is granted by attaching an Internet gateway and its route table to the VPC and allocating and associating an elastic IP address along with a public DNS. The second major task involves establishing connectivity to the EC2 instance and a method of downloading SAS software. In the case of the Linux Red Hat instance created here, PuTTY is configured to use the EC2 instance's security key pair (.ppk file). In order to transfer files securely to the EC2 instance, a tool such as WinSCP is installed and uses the PuTTY connection for secure FTP. The Linux OS is then updated, and VNC Server is installed and configured so that the SAS administrator can use a GUI. Finally, a Firefox web browser is installed to download the SAS® Download Manager. After downloading the SAS Download Manager, a SAS depot directory is created on the Linux file system and the SAS Download Manager is run once we have provided the software order number and SAS installation key. Once the SAS software depot has been loaded, we can verify the success of the SAS software depot's download by running the SAS depot checker. The next pre-installation task is to take care of some Linux OS housekeeping. Local users are created: for example, the SAS installation ID (sas) and other IDs such as sassrv, lsfadmin, lsfuser, and sasdemo. Specific directory permissions are set for the installer ID sas. The ulimit settings for open files and max user processes are increased, and directories are created for a SAS installation home and configuration directory. Some third-party tools that are required for SAS 9.4, such as Python, are installed. Then the Korn shell and other required Linux packages are installed. Finally, the SAS Deployment Manager installation wizard is launched and the multiple dialog boxes are filled out, with many defaults accepted and Next clicked. SAS administrators should consider running the SAS Deployment Manager twice: first to solely install the SAS software, and then later to configure it. Finally, after the SAS Deployment Manager completes, SAS post-installation tasks are completed.
Jeff Lehmann, Slalom Consulting
The telecommunications industry is the fastest-changing business ecosystem in this century. Therefore, handset campaigning to increase loyalty is a top issue for telco companies. However, these handset campaigns carry great fraud and payment risks if the companies do not have the ability to classify and assess customers properly according to their risk propensity. For many years, telco companies managed the risk with business rules such as customer tenure, until the launch of analytics solutions into the market. But few business rules apply to new customers, which restricts telco companies in selling handsets to them. On the other hand, with increasing competitive pressure on telco companies, it is necessary to use external credit data to sell handsets to new customers. Credit bureau data was a good opportunity to measure and understand the behaviors of applicants. But using external data required system integration and real-time decision systems. For those reasons, we needed a solution that enables us to predict risky customers and then integrate risk scores and all information into one real-time decision engine for optimized handset application vetting. After an assessment period, the SAS® Analytics platform and SAS® Real-Time Decision Manager (RTDM) were chosen as the most suitable solution because they provide a flexible, user-friendly interface, high integration, and fast deployment capability. In this project, we built a process that includes three main stages to transform the data into knowledge: data collection, predictive modelling, and deployment and decision optimization. a) Data Collection: We designed a specific, daily updated data mart that connects internal payment behavior, demographics, and customer experience data with external credit bureau data. In this way, we can turn data into meaningful knowledge for a better understanding of customer behavior. b) Predictive Modelling: To use the company's potential, it is critically important to use an analytics approach based on state-of-the-art technologies. We built nine models to predict customer propensity to pay. As a result of better classification of customers, we obtained satisfactory results in designing collection scenarios and the decision model for handset application vetting. c) Deployment and Decision Optimization: Knowledge is not enough to reach success in business; it should be turned into optimized decisions and deployed in real time. For this reason, we have been using SAS® Predictive Analytics tools and SAS Real-Time Decision Manager to turn data into knowledge and knowledge into strategy and execution. With this system, we are now able to assess customers properly and to sell handsets even to brand-new customers as part of the application vetting process. As a result, while decreasing nonpayment risk, we generated extra revenue from brand-new contracted customers. In three months, 13% of all handset sales were concluded via the RTDM. Another benefit of the RTDM is a 30% cost saving on external data inquiries. Thanks to the RTDM, Avea has become the first operator in the Turkish telco industry to use bureau data.
Hurcan Coskun, Avea
Research has shown that the top five percent of patients can account for nearly fifty percent of the total healthcare expenditure in the United States. Using SAS® Enterprise Guide® and PROC LOGISTIC, a statistical methodology was developed to identify factors (for example, patient demographics, diagnostic symptoms, comorbidity, and the type of procedure code) associated with the high cost of healthcare. Analyses were performed using the FAIR Health National Private Insurance Claims (NPIC) database, which contains information about healthcare utilization and cost in the United States. The analyses focused on treatments for chronic conditions, such as trans-myocardial laser revascularization for the treatment of coronary heart disease (CHD) and pressurized inhalation for the treatment of asthma. Furthermore, bubble plots and heat maps were created using SAS® Visual Analytics to provide key insights into potentially high-cost treatments for heart disease and asthma patients across the nation.
Jeff Dang, FAIR Health
Over the years, very few published studies have discussed ways to improve the performance of two-stage predictive models. This study, based on 10 years (1999-2008) of data from 130 US hospitals and integrated delivery networks, is an attempt to demonstrate how we can leverage the Association node in SAS® Enterprise Miner™ to improve the classification accuracy of the two-stage model. We prepared the data with imputation operations and data cleaning procedures. Variable selection methods and domain knowledge were used to choose 43 key variables for the analysis. The prominent association rules revealed interesting relationships between prescribed medications and patient readmission/no-readmission. The rules with lift values greater than 1.6 were used to create dummy variables for use in the subsequent predictive modeling. Next, we used two-stage sequential modeling, where the first stage predicted if the diabetic patient was readmitted and the second stage predicted whether the readmission happened within 30 days. The backward logistic regression model outperformed competing models for the first stage. After including dummy variables from an association analysis, many fit indices improved, such as the validation ASE to 0.228 from 0.238, cumulative lift to 1.56 from 1.40. Likewise, the performance of the second stage was improved after including dummy variables from an association analysis. Fit indices such as the misclassification rate improved to 0.240 from 0.243 and the final prediction error to 0.17 from 0.18.
Girish Shirodkar, Oklahoma State University
Goutam Chakraborty, Oklahoma State University
Ankita Chaudhari, Oklahoma State University
Why did my merge fail? How did that variable get truncated? Why am I getting unexpected results? Understanding how the DATA step actually works is the key to answering these and many other questions. In this paper, two independent consultants with a combined three decades of SAS® programming experience share a treasure trove of knowledge aimed at helping the novice SAS programmer take his or her game to the next level by peering behind the scenes of the DATA step. We touch on a variety of topics, including compilation versus execution, the program data vector, and proper merging techniques, with a focus on good programming practices that help the programmer steer clear of common pitfalls.
Joshua Horstman, Nested Loop Consulting
Britney Gilbert, Juniper Tree Consulting
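One behind-the-scenes behavior the paper examines, compile-time variable attributes, can be seen in a few lines; a hedged sketch:

    data _null_;
       name = 'Al';          /* compile phase fixes NAME at length 2 */
       name = 'Alexandra';   /* execution: value silently truncated to 'Al' */
       put name=;
    run;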
This presentation teaches the audience how to use ODS Graphics. Now part of Base SAS®, ODS Graphics are a great way to easily create clear graphics that enable any user to tell their story well. SGPLOT and SGPANEL are two of the procedures that can be used to produce powerful graphics that used to require a lot of work. The core of the procedures is explained, as well as some of the many options available. Furthermore, we explore the ways to combine the individual statements to make more complex graphics that tell the story better. Any user of Base SAS on any platform will find great value in the SAS ODS Graphics procedures.
Chuck Kincaid, Experis
Organizations are loading data into Hadoop platforms at an extraordinary rate. However, in order to extract value from these platforms, the data must be prepared for analytic exploitation. As the volume of data grows, it becomes increasingly important to reduce data movement, as well as to leverage the computing power of these distributed systems. This paper provides a cursory overview of SAS® Data Loader, a product specifically aimed at these challenges. We cover the underlying mechanisms of how SAS Data Loader works, as well as how it's used to profile, cleanse, transform, and ultimately prepare data for analytics in Hadoop.
Keith Renison, SAS
This paper introduces Jeffreys interval for one-sample proportion using SAS® software. It compares the credible interval from a Bayesian approach with the confidence interval from a frequentist approach. Different ways to calculate the Jeffreys interval are presented using PROC FREQ, the QUANTILE function, a SAS program of the random walk Metropolis sampler, and PROC MCMC.
Wu Gong, The Children's Hospital of Philadelphia
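The QUANTILE-function version is simply the equal-tailed interval of the Beta(x+1/2, n-x+1/2) posterior implied by the Jeffreys prior; a hedged sketch with illustrative counts:

    data _null_;
       x = 35; n = 100; alpha = 0.05;            /* 35 successes out of 100 trials */
       lower = quantile('BETA', alpha/2,   x + 0.5, n - x + 0.5);
       upper = quantile('BETA', 1-alpha/2, x + 0.5, n - x + 0.5);
       put lower= upper=;
    run;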
Examples include: how to join when your data is perfect; how to join when your data does not match and you have to manipulate it in order to join; how to create a subquery; and how to use a subquery.
Anita Measey, Bank of Montreal, Risk Capital & Stress Testing
Across the languages of SAS® are many golden nuggets--functions, formats, and programming features just waiting to impress your friends and colleagues. Learning SAS over 30+ years, I have collected a few, and I offer them to you in this presentation.
Peter Crawford, Crawford Software Consultancy Limited
A utility's meter data is a valuable asset that can be daunting to leverage. Consider that one household or premise can produce over 35,000 rows of information, consisting of over 8 MB of data per year. Thirty thousand meters collecting fifteen-minute-interval data with forty variables equates to 1.2 billion rows of data. Using SAS® Visual Analytics, we provide examples of leveraging smart meter data to address business needs around revenue protection, meter operations, and customer analysis. Key analyses include identifying consumption on inactive meters, potential energy theft, and stopped or slowing meters; and supporting all customer classes (for example, residential, small commercial, and industrial) and their data with different time intervals and frequencies.
Tom Anderson, SAS
Jennifer Whaley, SAS
Banks can create a competitive advantage in their business by using business intelligence (BI) and by building models. In the credit domain, the best practice is to build risk-sensitive models (Probability of Default, Exposure at Default, Loss-given Default, Unexpected Loss, Concentration Risk, and so on) and implement them in decision-making, credit granting, and credit risk management. There are models and tools on the next level built on these models and that are used to help in achieving business targets, risk-sensitive pricing, capital planning, optimizing of ROE/RAROC, managing the credit portfolio, setting the level of provisions, and so on. It works remarkably well as long as the models work. However, over time, models deteriorate and their predictive power can drop dramatically. Since the global financial crisis in 2008, we have faced a tsunami of regulation and accelerated frequency of changes in the business environment, which cause models to deteriorate faster than ever before. As a result, heavy reliance on models in decision-making (some decisions are automated following the model's results--without human intervention) might result in a huge error that can have dramatic consequences for the bank's performance. In my presentation, I share our experience in reducing model risk and establishing corporate governance of models with the following SAS® tools: model monitoring, SAS® Model Manager, dashboards, and SAS® Visual Analytics.
Boaz Galinson, Bank Leumi
Electricity is an extremely important product for society. In Brazil, the electric sector is regulated by ANEEL (Agência Nacional de Energia Elétrica), and one of the regulated aspects is power loss in the distribution system. In 2013, 13.99% of all injected energy was lost in the Brazilian system. Commercial loss is one of the power loss classifications, which can be countered by inspections of the electrical installation in a search for irregularities in power meters. CEMIG (Companhia Energética de Minas Gerais) currently serves approximately 7.8 million customers, which makes it unfeasible (in financial and logistic terms) to inspect all customer units. Thus, the ability to select potential inspection targets is essential. In this paper, logistic regression models, decision tree models, and the Ensemble model were used to improve the target selection process in CEMIG. The results indicate an improvement in the positive predictive value from 35% to 50%.
Sergio Henrique Ribeiro, Cemig
Iguatinan Monteiro, CEMIG
This paper describes the new features added to the macro facility in SAS® 9.3 and SAS® 9.4. New features described include the /READONLY option for macro variables, the %SYSMACEXIST macro function, the %PUT &= feature, and new automatic macro variables such as &SYSTIMEZONEOFFSET.
Rick Langston, SAS
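A hedged sketch exercising each feature; the macro and variable names are illustrative:

    %let duration = 5;
    %put &=duration;                  /* writes DURATION=5 to the log */

    %global / readonly maxiter=100;   /* attempts to reassign MAXITER now fail */

    %put %sysmacexist(cleanup);       /* 1 if macro CLEANUP is compiled, else 0 */
    %put &systimezoneoffset;          /* current offset from UTC, in seconds */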
SAS® Visual Analytics provides users with a unique view of their company by monitoring products and identifying opportunities and threats, making it possible to make recommendations, set a price strategy, and accelerate or brake product growth. In SAS Visual Analytics, you can see in one report the required return, a competitor analysis, and a comparison of realized results versus predicted results. Reports can be used to obtain a vision of the whole company and include several hierarchies (for example, by business unit, by segment, by product, by region, and so on). SAS Visual Analytics enables senior executives to easily and quickly view information. You can also use tracking indicators that are used by the insurance market.
Jacqueline Fraga, SulAmerica Cia Nacional de Seguros
SAS® data sets have PROC DATASETS, and SAS catalogs have PROC CATALOG. Find out what the little-known PROC CATALOG can do for you!
Louise Hadden, Abt Associates Inc.
It is a common task to reshape your data from long to wide for the purpose of reporting or analytical modeling and PROC TRANSPOSE provides a convenient way to accomplish this. However, when performing the transpose action on large tables stored in a database management system (DBMS) such as Teradata, the performance of PROC TRANSPOSE can be significantly compromised. In this case, it is more efficient for the DBMS to perform the transpose task. SAS® provides in-database processing technology in PROC SQL, which allows the SQL explicit pass-through method to push some or all of the work to the DBMS. This technique has facilitated integration between SAS and a wide range of data warehouses and databases, including Teradata, EMC Greenplum, IBM DB2, IBM Netezza, Oracle, and Aster Data. This paper uses the Teradata database as an example DBMS and explains how to transpose a large table that resides in it using the SQL explicit pass-through method. The paper begins with comparing the execution time using PROC TRANSPOSE with the execution time using SQL explicit pass-through. From this comparison, it is clear that SQL explicit pass-through is more efficient than the traditional PROC TRANSPOSE when transposing Teradata tables, especially large tables. The paper explains how to use the SQL explicit pass-through method and discusses the types of data columns that you might need to transpose, such as numeric and character. The paper presents a transpose solution for these types of columns. Finally, the paper provides recommendations on packaging the SQL explicit pass-through method by embedding it in a macro. SAS programmers who are working with data stored in an external DBMS and who would like to efficiently transpose their data will benefit from this paper.
Tao Cheng, Accenture
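A hedged sketch of the pass-through pattern with a hypothetical Teradata table MYDB.SALES_LONG; the connection macro variables are placeholders, and the CASE expressions perform the pivot inside the database:

    proc sql;
       connect to teradata (user=&td_user password=&td_pw server=&td_srv);
       create table work.sales_wide as
       select * from connection to teradata (
          select id,
                 max(case when month = 1 then sales end) as sales_jan,
                 max(case when month = 2 then sales end) as sales_feb,
                 max(case when month = 3 then sales end) as sales_mar
          from mydb.sales_long
          group by id
       );
       disconnect from teradata;
    quit;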
If your data do not meet the assumptions for a standard parametric test, you might want to consider using a permutation test. By randomly shuffling the data and recalculating a test statistic, a permutation test can calculate the probability of getting a value equal to or more extreme than an observed test statistic. With the power of matrices, vectors, functions, and user-defined modules, the SAS/IML® language is an excellent option. This paper covers two examples of permutation tests: one for paired data and another for repeated measures analysis of variance. For those new to SAS/IML® software, this paper offers a basic introduction and examples of how effective it can be.
John Vickery, North Carolina State University
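A hedged sketch of the paired case: a sign-flip permutation test on paired differences stored in a hypothetical WORK.PAIRED data set:

    proc iml;
       use work.paired;
       read all var {diff};
       close work.paired;

       obsStat = mean(diff);                       /* observed mean difference */
       n = nrow(diff);
       nperm = 10000;
       count = 0;
       call randseed(27513);
       signs = j(n, 1, .);
       do i = 1 to nperm;
          call randgen(signs, 'Bernoulli', 0.5);
          permStat = mean((2*signs - 1) # diff);   /* random +/-1 sign flips */
          if abs(permStat) >= abs(obsStat) then count = count + 1;
       end;
       pvalue = count / nperm;
       print obsStat pvalue;
    quit;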
Inpatient treatment is the most common type of treatment ordered for patients who have a serious ailment and need immediate attention. Using a data set about diabetes patients downloaded from the UCI Network Data Repository, we built a model to predict the probability that the patient will be rehospitalized within 30 days of discharge. The data has about 100,000 rows and 51 columns. In our preliminary analysis, a neural network turned out to be the best model, followed closely by the decision tree model and regression model.
Nikhil Kapoor, Oklahoma State University
Ganesh Kumar Gangarajula, Oklahoma State University
A common complaint of employers is that educational institutions do not prepare students for the types of messy data and multi-faceted requirements that occur on the job. No organization has data that resembles the perfectly scrubbed data sets in the back of a statistics textbook. The objective of the Annual Report Project is to quickly bring new SAS® users to a level of competence where they can use real data to meet real business requirements. Many organizations need annual reports for stockholders, funding agencies, or donors. Or, they need annual reports at the department or division level for an internal audience. Being tapped as part of the team creating an annual report used to mean weeks of tedium, poring over columns of numbers in 8-point font in (shudder) Excel spreadsheets, but no more. No longer painful: with a few SAS procedures and functions, reporting can be easy and, dare I say, fun. All analyses are done using SAS® Studio (formerly SAS® Web Editor) of SAS OnDemand for Academics. This paper uses an example with actual data for a report prepared to comply with federal grant funding requirements as proof that, yes, it really is that simple.
AnnMaria De Mars, AnnMaria De Mars
This paper presents a methodology developed to define and prioritize feeders with the least satisfactory performances for continuity of energy supply, in order to obtain an efficiency ranking that supports a decision-making process regarding investments to be implemented. Data Envelopment Analysis (DEA) was the basis for the development of this methodology, in which the input-oriented model with variable returns to scale was adopted. To perform the analysis of the feeders, data from the utility geographic information system (GIS) and from the interruption control system was exported to SAS® Enterprise Guide®, where data manipulation was possible. Different continuity variables and physical-electrical parameters were consolidated for each feeder for the years 2011 to 2013. They were separated according to the geographical regions of the concession area, according to their location (urban or rural), and then grouped by physical similarity. Results showed that 56.8% of the feeders could be considered as efficient, based on the continuity of the service. Furthermore, the results enable identification of the assets with the most critical performance and their benchmarks, and the definition of preliminary goals to reach efficiency.
Victor Henrique de Oliveira, Cemig
Iguatinan Monteiro, CEMIG
A bubble map can be a useful tool for identifying trends and visualizing the geographic proximity and intensity of events. This session shows how to use PROC GEOCODE and PROC GMAP to turn a data set of addresses and events into a map of the United States with scaled bubbles depicting the location and intensity of the events.
Caroline Cutting, Warren Rogers Associates
REST is being used across the industry for designing networked applications because it provides lightweight and powerful alternatives to web services such as SOAP and Web Services Description Language (WSDL). Because REST is based entirely on HTTP, SAS® already provides everything you need to make REST calls and to process structured and unstructured data alike. Learn how PROC HTTP and other SAS language features make it simple and secure to use REST.
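For example, a GET request against a JSON service might look like the following sketch (the URL is a placeholder; the JSON LIBNAME engine used to parse the response is available in recent SAS® 9.4 releases):

   filename resp temp;

   proc http
      url="https://api.example.com/v1/items"
      method="GET"
      out=resp;
      headers "Accept" = "application/json";
   run;

   libname items json fileref=resp;    /* parse the JSON response */
   proc print data=items.root;  run;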
Joseph Henry, SAS
Risk managers and traders know that some knowledge loses its value quickly. Unfortunately, due to the computationally intensive nature of risk, most risk managers use stale data. Knowing your positions and risk intraday can provide immense value. Imagine knowing the portfolio risk impact of a trade before you execute. This paper shows you a path to doing real-time risk analysis leveraging capabilities from SAS® Event Stream Processing Engine and SAS® High-Performance Risk. Event stream processing (ESP) offers the ability to process large amounts of data with high throughput and low latency, including streaming real-time trade data from front-office systems into a centralized risk engine. SAS High-Performance Risk enables robust, complex portfolio valuations and risk calculations quickly and accurately. In this paper, we present techniques and demonstrate concepts that enable you to more efficiently use these capabilities together. We also show techniques for analyzing SAS High-Performance data with SAS® Visual Analytics.
Albert Hopping, SAS
Arvind Kulkarni, SAS
Ling Xiang, SAS
As a consequence of the financial crisis, banks are required to stress test their balance sheet and earnings based on prescribed macroeconomic scenarios. In the US, this exercise is known as the Comprehensive Capital Analysis and Review (CCAR) or Dodd-Frank Act Stress Testing (DFAST). In order to assess capital adequacy under these stress scenarios, banks need a unified view of their projected balance sheet, incomes, and losses. In addition, the bar for these regulatory stress tests is very high regarding governance and overall infrastructure. Regulators and auditors want to ensure that the granularity and quality of data, model methodology, and assumptions reflect the complexity of the banks. This calls for close internal collaboration and information sharing across business lines, risk management, and finance. Currently, this process is managed in an ad hoc, manual fashion. Results are aggregated from various lines of business using spreadsheets and Microsoft SharePoint. Although the spreadsheet option provides flexibility, it brings ambiguity into the process and makes the process error-prone and inefficient. This paper introduces a new SAS® stress testing solution that can help banks define, orchestrate, and streamline the stress-testing process for easier traceability, auditability, and reproducibility. The integrated platform provides greater control, efficiency, and transparency to the CCAR process. This will enable banks to focus on more value-added analysis such as scenario exploration, sensitivity analysis, capital planning and management, and model dependencies. Lastly, the solution was designed to leverage existing in-house platforms that banks might already have in place.
Wei Chen, SAS
Shannon Clark
Erik Leaver, SAS
John Pechacek
Reporting Best Practices
Trina Gladwell, Bealls Inc
Given the challenges of data security in today's business environment, how can you protect the data that is used by SAS® Visual Analytics? SAS® has implemented security features in its widely used business intelligence platform, including row-level security in SAS Visual Analytics. Row-level security specifies who can access particular rows in a LASR table. Throughout this paper, we discuss two ways of implementing row-level security for LASR tables in SAS® Visual Analytics--interactively and in batch. Both approaches link table-based permission conditions with identities that are stored in metadata.
Zuzu Williams, SAS
SAS® formats can be used in so many different ways! Even the most basic SAS format use (modifying the way a SAS data value is displayed without changing the underlying data value) holds a variety of nifty tricks, such as nesting formats, formats that affect various style attributes, and conditional formatting. Add in picture formats, multi-label formats, using formats for data cleaning, and formats for joins and table lookups, and we have quite a bag of tricks for the humble SAS format and for PROC FORMAT, which is used to generate them. This paper describes a few very useful programming techniques that employ SAS formats. While the paper is appropriate for the newest SAS user, it also covers some of the lesser-known features of formats and PROC FORMAT, so it should be useful even for quite experienced SAS users.
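A small sample of those tricks might look like the following sketch (data set and variable names are illustrative): a picture format, a multilabel format with overlapping ranges, and a format used as a lookup table:

   proc format;
      picture pctfmt low-high = '009.9%';           /* picture format         */
      value agegrp (multilabel)                     /* overlapping age labels */
         0 - 17    = 'Under 18'
         18 - 64   = '18 to 64'
         65 - high = '65 and over'
         0 - high  = 'All ages';
      value $dept '101' = 'Sales'
                  '102' = 'Finance'
                  other = 'Unknown';
   run;

   proc means data=people mean;
      class age / mlf;                 /* MLF honors the multilabel format */
      var income;
      format age agegrp.;
   run;

   data lookedup;
      set staff;
      deptname = put(deptcode, $dept.);   /* format-based table lookup */
   run;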
Christianna Williams, Self-Employed
In today's competitive job market, both recent graduates and experienced professionals are looking for ways to set themselves apart from the crowd. SAS® certification is one way to do that. SAS Institute Inc. offers a range of exams to validate your knowledge level. In writing this paper, we have drawn upon our personal experiences, remarks shared by new and longtime SAS users, and conversations with experts at SAS. We discuss what certification is and why you might want to pursue it. Then we share practical tips you can use to prepare for an exam and do your best on exam day.
Andra Northup, Advanced Analytic Designs, Inc.
Susan Slaughter, Avocet Solutions
Can you create hundreds of great-looking Microsoft Excel tables all within SAS® and make them all Section 508 compliant at the same time? This paper examines how to use the ODS TAGSETS.EXCELXP statement and other Base SAS® features to create fantastic-looking Excel worksheet tables that are all Section 508 compliant. This paper demonstrates that there is no need for any outside intervention or pre- or post-meddling with the Excel files to make them Section 508 compliant. We do it all with simple Base SAS code.
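The basic shell of the approach might look like this sketch (file and sheet names are illustrative; the 508-specific options and styling choices are the subject of the paper itself):

   ods _all_ close;
   ods tagsets.excelxp file='tables.xml' style=sasweb
       options(sheet_name='Summary' embedded_titles='yes');

   proc report data=sashelp.class;
      column name sex age height weight;
   run;

   ods tagsets.excelxp close;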
Chris Boniface, U.S. Census Bureau
The Query Builder in SAS® Enterprise Guide® is an excellent point-and-click tool that generates PROC SQL code and creates queries in SAS®. This hands-on workshop will give an overview of query options, sorting, simple and complex filtering, and joining tables. It is a great workshop for programmers and non-programmers alike.
Jennifer First-Kluge, Systems Seminar Consultants
SAS® Studio (previously known as SAS® Web Editor) was introduced in the first maintenance release of SAS® 9.4 as an alternative programming environment to SAS® Enterprise Guide® and SAS® Display Manager. SAS Studio is different in many ways from SAS Enterprise Guide and SAS Display Manager. As a programmer, I currently use SAS Enterprise Guide to help me code, test, maintain, and organize my SAS® programs. I have SAS Display Manager installed on my PC, but I still prefer to write my programs in SAS Enterprise Guide because I know it saves my log and output whenever I run a program, even if that program crashes and takes the SAS session with it! So should I now be using SAS Studio instead, and should you be using it, too?
Philip Holland, Holland Numerics Limited
The purpose behind this paper is to provide a high-level overview of how SAS® security works in a way that can be communicated to both SAS administrators and users who are not familiar with SAS. It is not uncommon to hear SAS administrators complain that their IT department and users just don't 'get' it when it comes to metadata and security. For the administrator or user not familiar with SAS, understanding how SAS interacts with the operating system, the file system, external databases, and users can be confusing. Based on a SAS® Enterprise Office Analytics installation in a Windows environment, this paper walks the reader through all of the basic metadata relationships and how they are created, thus unraveling the mystery of how the host system, external databases, and SAS work together to provide users what they need, while reliably enforcing the appropriate security.
Charyn Faenza, F.N.B. Corporation
Are you a SAS® software user hoping to convince your organization to move to the latest product release? Has your management team asked how your organization can hire new SAS users familiar with the latest and greatest procedures and techniques? SAS® Studio and SAS® University Edition might provide the answers for you. SAS University Edition was created for teaching and learning. It's a new downloadable package of selected SAS products (Base SAS®, SAS/STAT®, SAS/IML®, SAS/ACCESS® Interface to PC Files, and SAS Studio) that runs on Windows, Linux, and Mac. With the exploding demand for analytical talent, SAS launched this package to grow the next generation of SAS users. Part of the way SAS is helping grow that next generation of users is through the interface to SAS University Edition: SAS Studio. SAS Studio is a web-based development application for SAS that you access through your web browser and that--since the first maintenance release of SAS 9.4--is included in Base SAS at no additional charge. The connection between SAS University Edition and commercial SAS means that it's easier than ever to use SAS for teaching, research, and learning, from high schools to community colleges to universities and beyond. This paper describes the product, as well as the intent behind it and other programs that support it, and then talks about some successes in adopting SAS University Edition to grow the next generation of users.
Polly Mitchell-Guthrie, SAS
Amy Peters, SAS
SAS® Visual Analytics is a powerful tool for exploring, analyzing, and reporting on your data. Whether you understand your data well or are in need of additional insights, SAS Visual Analytics has the capabilities you need to discover trends, see relationships, and share the results with your information consumers. This paper presents a case study applying the capabilities of SAS Visual Analytics to NCAA Division I college football data from 2005 through 2014. It follows the process from reading raw comma-separated values (CSV) files, through processing that data into SAS data sets and doing data enrichment, to finally loading the data into in-memory SAS® LASR™ tables. The case study then demonstrates using SAS Visual Analytics to explore detailed play-by-play data to discover trends and relationships, as well as to analyze team tendencies to develop game-time strategies. Reports on player, team, conference, and game statistics can be used for fun (by fans) and for profit (by coaches, agents, and sportscasters). Finally, the paper illustrates how all of these capabilities can be delivered via the web or to a mobile device--anywhere--even in the stands at the stadium. Whether you are using SAS Visual Analytics to study college football data or to tackle a complex problem in the financial, insurance, or manufacturing industry, SAS Visual Analytics provides the power and flexibility to score a big win in your organization.
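The data path described above might be sketched as follows (file paths, the LASR™ host and port, and the variable names are all illustrative assumptions, not the paper's code):

   proc import datafile='/data/ncaa/pbp_2014.csv' out=work.pbp
        dbms=csv replace;
      guessingrows=max;
   run;

   data work.pbp_enriched;                  /* simple enrichment step */
      set work.pbp;
      yards_per_play = yards / max(plays, 1);
   run;

   libname lasr sasiola host="lasrhost.example.com" port=10010 tag="hps";

   data lasr.pbp_enriched;                  /* load into an in-memory LASR table */
      set work.pbp_enriched;
   run;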
John Davis, SAS
As a risk management unit, our team has invested countless hours in developing the processes and infrastructure necessary to produce large, multipurpose analytical base tables. These tables serve as the source for our key reporting and loss provisioning activities, the output of which (reports and GL bookings) is disseminated throughout the corporation. Invariably, questions arise and further insight is desired. Traditionally, any inquiries were returned to the original analyst for further investigation. But what if there was a way for the less technical user base to gain insights independently? Now there is with SAS® Visual Analytics. SAS Visual Analytics is often thought of as a big data tool, and while it is certainly capable in this space, its usefulness in regard to leveraging the value in your existing data sets should not be overlooked. By using these tried-and-true analytical base tables, you are guaranteed to achieve one version of the truth since traditional reports match perfectly to the data being explored. SAS Visual Analytics enables your organization to share these proven data assets with an entirely new population of data consumers--people with less 'traditional' data skills but with questions that need to be answered. Finally, all this is achieved without any additional data preparation effort and testing. This paper explores our experience with SAS Visual Analytics and the benefits realized.
Shaun Kaufmann, Farm Credit Canada
Data visualization can be like a GPS directing us to where in the sea of data we should spend our analytical efforts. In today's big data world, many businesses are still challenged to quickly and accurately distill insights and solutions from ever-expanding information streams. Wells Fargo CEO John Stumpf challenges us with the following: "We all work for the customer. Our customers say to us, 'Know me, understand me, appreciate me and reward me.' Everything we need to know about a customer must be available easily, accurately, and securely, as fast as the best Internet search engine." For the Wells Fargo Credit Risk department, we have been focused on delivering more timely, accurate, reliable, and actionable information and analytics to help answer questions posed by internal and external stakeholders. Our group has to measure, analyze, and provide proactive recommendations to support and direct credit policy and strategic business changes, and we were challenged by a high volume of information coming from disparate data sources. This session focuses on how we evaluated potential solutions and created a new go-forward vision using a world-class visual analytics platform with strong data governance to replace manually intensive processes. As a result of this work, our group is on its way to proactively anticipating problems and delivering more dynamic reports.
Ryan Marcum, Wells Fargo Home Mortgage
This workshop provides hands-on experience using SAS® Enterprise Miner. Workshop participants will learn to: open a project, create and explore a data source, build and compare models, and produce and examine score code that can be used for deployment.
Chip Wells, SAS
This workshop provides hands-on experience using SAS® Forecast Server. Workshop participants will learn to: create a project with a hierarchy, generate multiple forecasts automatically, evaluate forecast accuracy, and build a custom model.
Catherine Truxillo, SAS
George Fernandez, SAS
Terry Woodfield, SAS
This workshop provides hands-on experience with SAS® Data Loader for Hadoop. Workshop participants will configure SAS Data Loader for Hadoop and use various directives inside SAS Data Loader for Hadoop to interact with data in the Hadoop cluster.
Kari Richardson, SAS
This workshop provides hands-on experience with SAS® Visual Analytics. Workshop participants will explore data with SAS® Visual Analytics Explorer and design reports with SAS® Visual Analytics Designer.
Nicole Ball, SAS
This workshop provides hands-on experience with SAS® Visual Statistics. Workshop participants will learn to: move between the Visual Analytics Explorer interface and Visual Statistics, fit automatic statistical models, create exploratory statistical analysis, compare models using a variety of metrics, and create score code.
Catherine Truxillo, SAS
Xiangxiang Meng, SAS
Mike Jenista, SAS
This workshop provides hands-on experience performing statistical analysis with SAS University Edition and SAS Studio. Workshop participants will learn to: install and set up the software, perform basic statistical analyses using tasks, connect folders to SAS Studio for data access and results storage, invoke code snippets to import CSV data into SAS, and create a code snippet.
Danny Modlin, SAS
This workshop provides hands-on experience using SAS® Text Miner. Workshop participants will learn to: read a collection of text documents and convert them for use by SAS Text Miner using the Text Import node, use the simple query language supported by the Text Filter node to extract information from a collection of documents, use the Text Topic node to identify the dominant themes and concepts in a collection of documents, and use the Text Rule Builder node to classify documents having pre-assigned categories.
Terry Woodfield, SAS
Six Sigma is a business management strategy that seeks to improve the quality of process outputs by identifying and removing the causes of defects (errors) and minimizing variability in manufacturing and business processes. Each Six Sigma project carried out within an organization follows a defined sequence of steps and has quantified financial targets. All Six Sigma project methodologies include an extensive analysis phase in which SAS® software can be applied. JMP® software is widely used for Six Sigma projects. However, this paper demonstrates how Base SAS® (and a bit of SAS/GRAPH® and SAS/STAT® software) can be used to address a wide variety of Six Sigma analysis tasks. The reader is assumed to have a basic knowledge of Six Sigma methodology. Therefore, the focus of the paper is the use of SAS code to produce outputs for analysis.
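As one small example of the kind of analysis task the paper addresses, process capability indices can be computed with a few Base SAS steps (the data set, variable, and specification limits below are illustrative assumptions):

   proc means data=parts noprint;
      var width;
      output out=stats mean=xbar std=s;
   run;

   data capability;
      set stats;
      usl = 10.5;  lsl = 9.5;                        /* assumed spec limits  */
      cp  = (usl - lsl) / (6 * s);                   /* potential capability */
      cpk = min(usl - xbar, xbar - lsl) / (3 * s);   /* actual capability    */
   run;

   proc print data=capability noobs;
      var cp cpk;
   run;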
Dan Bretheim, Towers Watson
Before the Internet era, you might not have come across many Sankey diagrams. These diagrams, which contain nodes and links (paths) that cross, intertwine, and have different widths, were named after Captain Sankey. He first created this type of diagram to visualize steam engine efficiency. Sankey diagrams used to have very specialized applications such as mapping out energy, gas, heat, or water distribution and flow, or cost budget flow. These days, it's become very common to display the flow of web traffic or customer actions and reactions through Sankey diagrams as well. Sankey diagrams in SAS® Visual Analytics easily enable users to create meaningful visualizations that represent the flow of data from one event or value to another. In this paper, we take a look at the components that make up a Sankey diagram: nodes, links, drop-off links, and transactions. In addition, we look at a practical example of how Sankey diagrams can be used to evaluate web traffic and influence the design of a website. We use SAS Visual Analytics to demonstrate the best way to build a Sankey diagram.
Varsha Chawla, SAS
Renato Luppi, SAS
Understanding organizational trends in spending can help overseeing government agencies make appropriate modifications in spending to best serve the organization and the citizenry. However, given millions of line items for organizations annually, including free-form text, it is unrealistic for these overseeing agencies to succeed by using only a manual approach to this textual data. Using a publicly available data set, this paper explores how business users can apply text analytics using SAS® Contextual Analysis to assess trends in spending for particular agencies, apply subject matter expertise to refine these trends into a taxonomy, and ultimately, categorize the spending for organizations in a flexible, user-friendly manner. SAS® Visual Analytics enables dynamic exploration, including modeling results from SAS® Visual Statistics, in order to assess areas of potentially extraneous spending, providing actionable information to the decision makers.
Tom Sabo, SAS
The era of mass marketing is over. Welcome to the new age of relevant marketing, where whispering matters far more than shouting. At ZapFi, using the combination of sponsored free Wi-Fi and real-time consumer analytics, we help businesses better understand who their customers are. This gives businesses the opportunity to send highly relevant marketing messages based on the profile and the location of the customer. It also leads to new ways to build deeper and more intimate, one-on-one relationships between the business and the customer. During this presentation, ZapFi will use a few real-world examples to demonstrate that the future of mobile marketing is much more about data and far less about advertising.
Gery Pollet, ZapFi
In today's omni-channel world, consumers expect retailers to deliver the product they want, where they want it, when they want it, at a price they accept. A major challenge many retailers face in delighting their customers is successfully predicting consumer demand. Business decisions across the enterprise are affected by these demand estimates. Forecasts used to inform high-level strategic planning, merchandising decisions (planning assortments, buying products, pricing, and allocating and replenishing inventory) and operational execution (labor planning) are similar in many respects. However, each business process requires careful consideration of specific input data, modeling strategies, output requirements, and success metrics. In this session, learn how leading retailers are increasing sales and profitability by operationalizing forecasts that improve decisions across their enterprise.
Alex Chien, SAS
Elizabeth Cubbage, SAS
Wanda Shive, SAS
SAS® users organize their applications in a variety of ways, but some approaches are more successful than others. In particular, the need to run only some of the code in a file only some of the time can be challenging. Reproducible research methods require that SAS applications be understandable by the author and by other staff members. In this presentation, you learn how to organize and structure your SAS application to manage the process of data access, data analysis, and data presentation. The approach requires that the tasks in the data analysis process be compartmentalized, which can be done with a well-defined program structure. The author presents his structuring algorithm and discusses the characteristics of good structuring methods for SAS applications. Reproducible research methods are becoming more centrally important, and SAS users must keep up with the current developments.
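One common way to compartmentalize such a process is a driver program that includes each stage conditionally. The sketch below illustrates the general idea only and is not the author's own structuring algorithm; the file names and the RUNSTEPS convention are invented for illustration:

   %let runsteps = access analyse report;

   %macro drive;
      %if %index(&runsteps, access)  %then %include '01_access_data.sas';
      %if %index(&runsteps, analyse) %then %include '02_analyse.sas';
      %if %index(&runsteps, report)  %then %include '03_report.sas';
   %mend drive;
   %drive

Changing the RUNSTEPS value then runs only some of the code some of the time, without editing the stage programs themselves.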
Paul Thomas, ASUP Ltd
Marketers often face a cross-channel challenge in making sense of the behavior of web visitors who spend considerable time researching an item online, even putting the item in a wish list or checkout basket, but failing to follow up with an actual purchase online, instead opting to purchase the item in the store. This research shows the use of SAS® Visual Analytics to address this challenge. This research uses a large data set of simulated web transactional data, combines it with common IDs to attach the data to in-store retail data, and studies it in SAS Visual Analytics. In this presentation, we go over tips and tricks for using SAS Visual Analytics on a non-distributed server. The loaded data set is analyzed step by step to show how to draw correlations in the web browsing behavior of customers and how to link the data to their subsequent in-store behavior. It shows how we can draw inferences between web visits and in-store visits by department. You'll change your marketing strategy as a result of the research.
Tricia Aanderud, Zencos
Johann Pasion, 89 Degrees
Understanding the behavior of your customers is key to improving and maintaining revenue streams. It is a critical requirement in the crafting of successful marketing campaigns. Using SAS® Visual Analytics, you can analyze and explore user behavior, click paths, and other event-based scenarios. Flow visualizations help you to best understand hotspots, highlight common trends, and find insights in individual user paths or in aggregated paths. This paper explains the basic concepts of path analysis as well as provides detailed background information about how to use flow visualizations within SAS Visual Analytics.
Falko Schulz, SAS
Olaf Kratzsch, SAS
This presentation details the steps involved in using SAS® Enterprise Miner™ to text mine a sample of member complaints. Specifically, it describes how the Text Parsing, Text Filtering, and Text Topic nodes were used to generate topics that described the complaints. Text mining results are reviewed (slightly modified for confidentiality), as well as conclusions and lessons learned from the project.
Amanda Pasch, Kaiser Permanente
With cloud service providers such as Amazon commodifying the process to create a server instance based on desirable OS and sizing requirements for a SAS® implementation, a definite advantage is the speed and simplicity of getting started with a SAS installation. Planning horizons are nonexistent, and the initial financial outlay is economized because no server hardware procurement occurs, no data center space is reserved, and no hardware/OS engineers are assigned to participate in the initial server instance creation. The cloud infrastructure seems to make the OS irrelevant, an afterthought, even just an extension of SAS software. In addition, if the initial sizing, memory allocation, or disk space selection later results in some deficiency or errors in SAS processing, the flexibility of the virtual server instance allows the instance image to be saved and restored to a new, larger, or performance-enhanced instance at relatively low cost and minor inconvenience to production users. Once logged on with an authenticated ID, with Internet connectivity established, a SAS installer ID created, and a web browser started, it's just a matter of downloading the SAS® Download Manager to begin the creation of the SAS software depot. Many Amazon cloud instances have faster download speeds, which shortens the time needed to create the depot. Installing SAS via the SAS® Deployment Wizard is not dissimilar on a cloud instance versus a physical server, and all the same challenges (for example, SSL, authentication and single sign-on, and repository migration) apply. Overall, SAS administrators have an optimal, straightforward, and low-cost opportunity to deploy additional SAS instances running different versions or more complex configurations (for example, SAS® Grid Computing, resource-based load balancing, and SAS jobs split and run in parallel across multiple nodes). While the main advantages of using a cloud instance to deploy a new SAS implementation tend to revolve around efficiency, speed, and affordability, its pitfalls have to do with vulnerability to intrusion and external attack. The same easy, low-cost server instance launch has a negative flip side that includes a possible lack of experienced OS oversight and basic security precautions. At the moment, Linux administrators around the country are patching their physical and virtual systems to prevent the spread of the Shellshock vulnerability for web servers, which originated in cloud instances. Cloud instances have also been targeted and credentials compromised, which, in some cases, has allowed thousands of new instances to be spun up and billed to an unsuspecting AWS-licensed user. Extra steps have to be taken to prevent these attacks, and fortunately, cloud-based methods are available. By creating a Virtual Private Cloud (VPC) instance, AWS users can restrict access by originating IP address, at the cost of additional administration, including creating entries for application ports that require external access. Moreover, with each step toward a more secure cloud implementation come additional complexities, including changes or compromises to corporate firewall policy and user authentication methods.
Jeff Lehmann, Slalom Consulting
For the past two academic school years, our SAS® Programming 1 class had a classroom discussion about the Charlotte Bobcats. We wondered aloud: if the Bobcats changed their team name, would the dwindling fan base return? As a class, we created a survey that consisted of 10 questions asking people whether they liked the name Bobcats, whether they attended basketball games, and whether they bought merchandise. Within a one-hour class period, our class surveyed 981 out of 1,733 students at Phillip O. Berry Academy of Technology. After collecting the data, we performed advanced analytics using Base SAS® and concluded that 75% of students and faculty at Phillip O. Berry would prefer any name other than the Bobcats. In other results, 80% of the student body liked basketball, and the most preferred name was the Hornets, followed by the Royals, Flight, Dragons, and finally the Bobcats. The following school year, we conducted another survey to discover whether people's opinions had changed since the previous survey and whether people were happy with the Bobcats changing their name. During this time period, the Bobcats had recently reported that they had been granted the opportunity to change the team name to the Hornets. Once more, we collected and analyzed the data and concluded that 77% of people surveyed were thrilled with the name change. In addition, around 50% of respondents were interested in purchasing merchandise. Through this project, SAS® Analytics was applied in the classroom to a real-world scenario. The ability to see how SAS could be applied to a question of interest and create change inspired the students in our class. This project is also significant in showing the economic impact that sports can have on a city; it focused, in particular, on the nostalgia that the people of Charlotte felt for the name Hornets. The project opened the door for more analysis and questions and continues to spark interest, because when people have a connection to the team and the team flourishes, Charlotte benefits.
Lauren Cook, Charlotte Mecklenburg School System
The first thing that you need to know is that SAS® software stores dates and times as numbers. However, this is not the only thing that you need to know, and this presentation gives you a solid base for working with dates and times in SAS. It also introduces you to functions and features that enable you to manipulate your dates and times with surprising flexibility. This paper also shows you some of the possible pitfalls with dates (and with times and datetimes) in your SAS code, and how to avoid them. We show you how SAS handles dates and times through examples, including the ISO 8601 formats and informats, and how to use dates and times in TITLE and FOOTNOTE statements. We close with a brief discussion of Microsoft Excel conversions.
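A few of these ideas in one DATA step (the dates are arbitrary examples):

   data _null_;
      d  = '14mar2015'd;                 /* a date is days since 01JAN1960    */
      dt = '14mar2015:09:30:00'dt;       /* a datetime is seconds since then  */
      put d= '(raw)' d= date9. d= e8601da.;        /* ISO 8601 output         */
      put dt= e8601dt.;
      nextmon = intnx('month', d, 1, 'b');         /* start of next month     */
      years   = intck('year', '01jan1990'd, d);    /* year boundaries crossed */
      put nextmon= date9. years=;
   run;

Note that INTCK counts interval boundaries crossed, not elapsed years, which is a classic pitfall of the kind the paper warns about.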
Derek Morgan
The knight's tour is a sequence of moves on a chess board such that a knight visits each square exactly once. Using a heuristic method, it is possible to find a complete path beginning from any arbitrary square on the board and visiting each of the remaining squares exactly once. However, the implementation poses challenging programming problems. For example, it is necessary to discern viable knight moves, which change throughout the tour. Even worse, the heuristic approach does not guarantee a solution. This paper explains a SAS® solution that finds a knight's tour beginning from every initial square on a chess board...well, almost.
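One well-known heuristic for this problem is Warnsdorff's rule: always move to the square that has the fewest onward moves. The SAS/IML sketch below illustrates that rule (it is not the author's program); as the abstract notes, the heuristic can fail to complete a tour from some squares:

   proc iml;
      dx = { 1  2  2  1 -1 -2 -2 -1};          /* the eight knight moves */
      dy = { 2  1 -1 -2 -2 -1  1  2};

      start degree(r, c, visited) global(dx, dy);
         n = 0;
         do k = 1 to 8;                        /* count unvisited onward moves */
            rr = r + dy[k];  cc = c + dx[k];
            if rr >= 1 & rr <= 8 & cc >= 1 & cc <= 8 then
               if visited[rr, cc] = 0 then n = n + 1;
         end;
         return(n);
      finish;

      visited = j(8, 8, 0);
      r = 1;  c = 1;                           /* starting square */
      visited[r, c] = 1;
      stuck = 0;
      do step = 2 to 64 while(stuck = 0);
         best = 0;  bestdeg = 9;
         do k = 1 to 8;                        /* choose the lowest-degree move */
            rr = r + dy[k];  cc = c + dx[k];
            if rr >= 1 & rr <= 8 & cc >= 1 & cc <= 8 then
               if visited[rr, cc] = 0 then do;
                  deg = degree(rr, cc, visited);
                  if deg < bestdeg then do;  bestdeg = deg;  best = k;  end;
               end;
         end;
         if best = 0 then stuck = 1;           /* no legal move remains */
         else do;
            r = r + dy[best];  c = c + dx[best];
            visited[r, c] = step;
         end;
      end;
      if stuck then print 'No complete tour from this start';
      else print visited;                      /* move number on each square */
   quit;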
John R Gerlach, Dataceutics, Inc.
It is well-known in the world of SAS® programming that the REPORT procedure is one of the best procedures for creating dynamic reports. However, you might not realize that the compute block is where all of the action takes place! Its flexibility enables you to customize your output. This paper is a primer for using a compute block. With a compute block, you can easily change values in your output with the proper assignment statement and add text with the LINE statement. With the CALL DEFINE statement, you can adjust style attributes such as color and formatting. Through examples, you learn how to apply these techniques for use with any style of output. Understanding how to use the compute-block functionality empowers you to move from creating a simple report to creating one that is more complex and informative, yet still easy to use.
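A minimal sketch of both techniques mentioned above, using the SASHELP.CLASS sample data (the highlighting rule is invented for illustration):

   proc report data=sashelp.class nowd;
      column name sex age height;
      define name   / display;
      define sex    / display;
      define age    / display;
      define height / analysis mean;
      compute height;
         if height.mean > 65 then      /* CALL DEFINE adjusts style attributes */
            call define(_col_, 'style', 'style={background=lightyellow}');
      endcomp;
      compute after;
         line 'Heights above 65 inches are highlighted.';   /* LINE adds text */
      endcomp;
   run;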
Jane Eslinger, SAS
If you are one of the many customers who want to move your SAS® data to Hadoop, one decision you will encounter is what data storage format to use. There are many choices, and all have their pros and cons. One factor to consider is how you currently store your data. If you currently use the Base SAS® engine or the SAS® Scalable Performance Data Engine, then using the SPD Engine with Hadoop will enable you to continue accessing your data with as little change to your existing SAS programs as possible. This paper discusses the enhancements, usage, and benefits of the SPD Engine with Hadoop.
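At its simplest, using the SPD Engine with Hadoop is a LIBNAME change (a minimal sketch; the HDFS path below is illustrative, and HDFSHOST=DEFAULT picks up the Hadoop configuration from the environment):

   libname myhdfs spde '/user/sasdata' hdfshost=default;

   data myhdfs.sales;            /* stored in HDFS in SPD Engine format */
      set work.sales;
   run;

   proc means data=myhdfs.sales; /* existing code continues to work */
      class region;
      var amount;
   run;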
Lisa Brown, SAS
Universities often have student data that is difficult to represent, including information about students' home locations. Often, student data are presented in tables, and patterns are easily overlooked. This study aimed to represent recruiting and retention data at a county level using SAS® mapping software for a public, land-grant university. Three years of student data from the student records database were used to visually represent enrollment, retention, and other predictors of student success. SAS® Enterprise Guide® was used along with the GMAP procedure to make user-friendly maps. Displaying data using maps on a county level revealed patterns in enrollment, retention, and other factors of interest that might have otherwise been overlooked, which might be beneficial for recruiting purposes.
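A compact sketch of the approach (the state, data set, and variable names are illustrative, not the study's code):

   data sd;                              /* unprojected county boundaries */
      set maps.counties;
      where state = stfips('SD');
   run;

   proc gproject data=sd out=sd_proj;
      id state county;
   run;

   proc gmap map=sd_proj data=enroll;    /* ENROLL: one row per county */
      id state county;
      choro students / levels=5 coutline=gray;
   run;
   quit;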
Allison Lempola, South Dakota State University
Thomas Brandenburger, South Dakota State University
There are many gotchas when you are trying to automate a well-written program. The details differ depending on how you schedule the program and the environment you are using. This paper covers system options, error-handling logic, and best practices for logging. Save time and frustration by using these tips as you schedule programs to run.
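A few of the kinds of settings the paper covers, sketched with illustrative values: route the log to a dated file and stop the job as soon as an error occurs:

   proc printto log="/logs/nightly_%sysfunc(date(), yymmddn8.).log" new;
   run;

   options errorabend;          /* abend immediately on most errors        */
   options noovp fullstimer;    /* cleaner log lines; resource statistics  */

   /* ... scheduled program steps ... */

   proc printto;  run;          /* restore the default log destination */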
Adam Hood, Slalom Consulting
This presentation provides an in-depth analysis, with example SAS® code, of the health care use and expenditures associated with depression among individuals with heart disease using the 2012 Medical Expenditure Panel Survey (MEPS) data. A cross-sectional study design was used to identify differences in health care use and expenditures between depressed (n = 601) and nondepressed (n = 1,720) individuals among patients with heart disease in the United States. Multivariate regression analyses using the SAS survey analysis procedures were conducted to estimate the incremental health services and direct medical costs (inpatient, outpatient, emergency room, prescription drugs, and other) attributable to depression. The prevalence of depression among individuals with heart disease in 2012 was estimated at 27.1% (6.48 million persons) and their total direct medical costs were estimated at approximately $110 billion in 2012 U.S. dollars. Younger adults (< 60 years), women, unmarried, poor, and sicker individuals with heart disease were more likely to have depression. Patients with heart disease and depression had more hospital discharges (relative ratio (RR) = 1.06, 95% confidence interval (CI) [1.02 to 1.09]), office-based visits (RR = 1.27, 95% CI [1.15 to 1.41]), emergency room visits (RR = 1.08, 95% CI [1.02 to 1.14]), and prescribed medicines (RR = 1.89, 95% CI [1.70, 2.11]) than their counterparts without depression. Finally, among individuals with heart disease, overall health care expenditures for individuals with depression was 69% higher than that for individuals without depression (RR = 1.69, 95% CI [1.44, 1.99]). The conclusion is that depression in individuals with heart disease is associated with increased health care use and expenditures, even after adjusting for differences in age, gender, race/ethnicity, marital status, poverty level, and medical comorbidity.
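The design-adjusted modeling might be sketched as follows; the MEPS 2012 design variables (VARSTR, VARPSU, PERWT12F) and the covariate names are assumptions based on the public-use file layout, not code from the presentation:

   proc surveyreg data=meps2012;
      strata varstr;                   /* variance estimation strata */
      cluster varpsu;                  /* primary sampling units     */
      weight perwt12f;                 /* person-level weight        */
      class depressed;
      model totexp12 = depressed age sex race marital poverty comorbid
            / solution;
   run;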
Seungyoung Hwang, Johns Hopkins University Bloomberg School of Public Health
Athletes in sports, such as baseball and softball, commonly undergo elbow reconstruction surgeries. There is research that suggests that the rate of elbow reconstruction surgeries among professional baseball pitchers continues to rise by leaps and bounds. Given the trend found among professional pitchers, the current study reviews patterns of elbow reconstruction surgery among the privately insured population. The study examined trends (for example, cost, age, geography, and utilization) in elbow reconstruction surgeries among privately insured patients using analytic tools such as SAS® Enterprise Guide® and SAS® Visual Analytics, based on the medical and surgical claims data from the FAIR Health National Private Insurance Claims (NPIC) database. The findings of the study suggested that there are discernable patterns in the prevalence of elbow reconstruction surgeries over time and across specific geographic regions.
Jeff Dang, FAIR Health
Becoming one of the best memorizers in the world doesn't happen overnight. With hard work, dedication, a bit of obsession, and the assistance of some clever analytics metrics, Nelson Dellis was able to climb to the top of the memory rankings in under a year to become the now three-time USA Memory Champion. In this talk, he explains what it takes to become the best at memory, what is involved in such grueling memory competitions, and how analytics helped him get there.
Nelson Dellis, Climb for Memory
How do you serve 25 million video ads a day to Internet users in 25 countries, while ensuring that you target the right ads to the right people on the right websites at the right time? With a lot of help from math, that's how! Come hear how Videology, an Internet advertising company, combines mathematical programming, predictive modeling, and big data techniques to meet the expectations of advertisers and online publishers, while respecting the privacy of online users and combatting fraudulent Internet traffic.
Kaushik Sinha, Videology
Whether you have a few variables to compare or billions of rows of data to explore, seeing the data in visual format can make all the difference in the insights you glean. In this session, learn how to determine which data is best delivered through visualization, understand the myriad types of data visualizations for use with your big data, and create effective data visualizations. If you are new to data visualization, this talk will help you understand how to best communicate with your data.
Tricia Aanderud, Zencos
If you have not had a chance to explore SAS® Studio yet, or if you're anxious to see what's new, this paper gives you an introduction to this new browser-based interface for SAS® programmers and a peek at what's coming. With SAS Studio, you can access your data files, libraries, and existing programs, and you can write new programs while using SAS software behind the scenes. SAS Studio connects to a SAS server in order to process SAS programs. The SAS server can be a hosted server in a cloud environment, a server in your local environment, or a copy of SAS on your local machine. It's a comfortable environment for those used to the traditional SAS windowing environment (SAS® Display Manager) but new features like a query window, process flow diagrams, and tasks have been added to appeal to traditional SAS® Enterprise Guide® users.
Mike Porter, SAS
Michael Monaco, SAS
Amy Peters, SAS
The latest releases of SAS® Data Integration Studio and DataFlux® Data Management Platform provide an integrated environment for managing and transforming your data to meet new and increasingly complex data management challenges. The enhancements help develop efficient processes that can clean, standardize, transform, master, and manage your data. The latest features include capabilities for building complex job processes, new web-based development and job monitoring environments, enhanced ELT transformation capabilities, big data transformation capabilities for Hadoop, integration with the analytic platform provided by SAS® LASR™ Analytic Server, enhanced features for lineage tracing and impact analysis, and new features for master data and metadata management. This paper provides an overview of the latest features of the products and includes use cases and examples for leveraging product capabilities.
Nancy Rausch, SAS
Mike Frost, SAS
We demonstrate a method of using SAS® 9.4 to supplement the interpretation of dimensions of a multidimensional scaling (MDS) model, a process that could be difficult without SAS®. In our paper, we examine why people choose to drive to work over other means of travel, a question that transportation researchers need to answer in order to encourage drivers to switch to more environmentally friendly travel modes. We applied the MDS approach to a travel survey data set because MDS has the advantage of extracting drivers' motivations in multiple dimensions. To overcome the challenges of dimension interpretation with MDS, we used the logistic regression function of SAS 9.4 to identify the variables that are strongly associated with each dimension, thus greatly aiding our interpretation procedure. Our findings are important to transportation researchers, practitioners, and MDS users.
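A compact sketch of the two-stage idea, under the assumptions that respondents are the MDS objects and that the first dimension is dichotomized at zero (the data set and variable names are invented for illustration; this is not the authors' code):

   proc mds data=diss dimension=2 out=coords;
   run;

   data config;
      set coords;
      where _type_ = 'CONFIG';          /* keep the coordinate rows */
      keep _name_ dim1 dim2;
   run;

   proc sort data=config;  by _name_;  run;
   proc sort data=survey;  by _name_;  run;

   data merged;                         /* attach survey attributes */
      merge config survey;
      by _name_;
      high_dim1 = (dim1 > 0);
   run;

   proc logistic data=merged;
      model high_dim1(event='1') = comfort cost traveltime flexibility;
   run;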
Jun Neoh, University of Southampton
Many retail and consumer packaged goods (CPG) companies are now keeping track of what their customers purchased in the past, often through some form of loyalty program. This record keeping is one example of how modern corporations are building data sets that have a panel structure, a data structure that is also pervasive in insurance and finance organizations. Panel data (sometimes called longitudinal data) can be thought of as the joining of cross-sectional and time series data. Panel data enable analysts to control for factors that cannot be considered by simple cross-sectional regression models that ignore the time dimension. These factors, which are unobserved by the modeler, might bias regression coefficients if they are ignored. This paper compares several methods of working with panel data in the PANEL procedure and discusses how you might benefit from using multiple observations for each customer. Sample code is available.
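A minimal sketch of fitting panel models in PROC PANEL (the data set and variable names are illustrative):

   proc panel data=purchases;
      id custid month;                     /* cross section and time  */
      model spend = price promo income / fixone ranone;
   run;

The FIXONE option fits a one-way fixed-effects model, absorbing each customer's unobserved, time-constant factors; RANONE fits the corresponding random-effects estimator for comparison.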
Bobby Gutierrez, SAS
Kenneth Sanford, SAS
Everyone likes getting a raise, and using ARRAYs in SAS® can help you do just that! Using ARRAYs simplifies processing, allowing you to read and analyze repetitive data with minimal coding. Improving the efficiency of your coding and, in turn, your SAS productivity is easier than you think! ARRAYs simplify coding by identifying a group of related variables that can then be referred to later in a DATA step. In this quick tip, you learn how to define an ARRAY using an ARRAY statement that establishes the ARRAY name, length, and elements. You also learn the criteria and limitations for the ARRAY name, the requirements for array elements to be processed as a group, and how to call an ARRAY and specific array elements within a DATA step. This quick tip reveals useful functions and operators, such as the DIM function and the OF operator within existing SAS functions, that make using ARRAYs an efficient and productive way to process data. This paper takes you through an example of how to do the same task with and without ARRAYs in order to illustrate the ease and benefits of using them. Your coding will be more efficient and your data analysis will be more productive, meaning you will deserve ARRAYs!
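A short before-and-after in the same spirit (the variable names are illustrative): recoding twelve monthly variables in one step instead of twelve statements:

   data want;
      set have;
      array m{*} month1-month12;          /* group the related variables   */
      do i = 1 to dim(m);                 /* DIM returns the element count */
         if m{i} = . then m{i} = 0;
      end;
      total = sum(of m{*});               /* OF operator inside a function */
      drop i;
   run;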
Kate Burnett-Isaacs, Statistics Canada
We all know there are multiple ways to use SAS® language components to generate the same values in data sets and output (for example, the DATA step versus PROC SQL, IF-THEN/ELSE logic versus format-based conversions, PROC MEANS versus PROC SQL summarizations, and so on). However, do you compare those different ways to determine which is the most efficient in terms of the computer resources used? Do you ever consider the time a programmer takes to develop or maintain code? In addition to determining efficient syntax, do you validate your resulting data sets? Do you ever check data values that must remain the same after being processed by multiple steps and verify that they really don't change? We share some simple coding techniques that have proven to save computer and human resources. We also explain some data validation and comparison techniques that ensure data integrity. In our distributed computing environment, we show a quick way to transfer data from a SAS server to a local client by using PROC DOWNLOAD and then PROC EXPORT on the client to convert the SAS data set to a Microsoft Excel file.
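The transfer at the end of the paragraph might be sketched like this, assuming a SAS/CONNECT session has already been signed on and SERVLIB is a server library (names and paths are illustrative):

   rsubmit;
      proc download data=servlib.results out=work.results;
      run;
   endrsubmit;

   proc export data=work.results
        outfile='C:\temp\results.xlsx'
        dbms=xlsx replace;
   run;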
Jason Beene, Wells Fargo
Mary Katz, Wells Fargo Bank