The Kolmogorov-Smirnov (K-S) test is one of the most useful and general nonparametric methods for comparing two samples. It is sensitive to all types of differences between two populations (shift, scale, shape, and so on). In this paper, we present a thorough investigation into the K-S test, including derivation of the formal test procedure, practical demonstration of the test, large sample approximation of the test, and ease of use in SAS® using the NPAR1WAY procedure.
Tison Bolen, Cardinal Health
Dawit Mulugeta, Cardinal Health
Jason Greenfield, Cardinal Health
Lisa Conley, Cardinal Health
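As a rough illustration of the two-sample K-S statistic and the large-sample approximation mentioned in the abstract above, the following Python sketch computes the maximum gap between two empirical distribution functions and an asymptotic p-value. It is not the authors' SAS code. The function name ks_two_sample is hypothetical, and the p-value uses the standard Kolmogorov series without any small-sample or ties correction, so it should be read as an approximation only.

```python
import math
from bisect import bisect_right


def ks_two_sample(x, y):
    """Return (D, approximate p-value) for the two-sample K-S test.

    D is the largest vertical distance between the empirical CDFs of x and y.
    The p-value is the large-sample (asymptotic) approximation only.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # The maximum gap can only occur at an observed value, so it is enough
    # to evaluate both empirical CDFs at every data point.
    d = 0.0
    for v in xs + ys:
        gap = abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
        d = max(d, gap)

    # Asymptotic p-value: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 z^2),
    # with z = D * sqrt(n*m / (n+m)).
    z = d * math.sqrt(n * m / (n + m))
    if z < 0.2:
        # The series converges slowly near zero; the p-value is
        # indistinguishable from 1 in this range.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


# Two samples that do not overlap at all give the largest possible D.
d, p = ks_two_sample([1, 2, 3, 4], [5, 6, 7, 8])
```

Because the statistic depends only on the empirical distribution functions, it responds to shift, scale, and shape differences alike, which is the generality the abstract refers to.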
The creation of production reports for our organization has historically been a labor-intensive process. Each month, our team produced around 650 SAS® graphs and 30 tables which were then copied and pasted into 16 custom Microsoft PowerPoint presentations, each between 20 and 30 pages. To reduce the number of manual steps, we converted to using stored processes and the SAS® Add-In for Microsoft Office. This allowed us to simply refresh those 16 PowerPoint presentations by using SAS Add-In for Microsoft Office to run SAS® Stored Processes. SAS Stored Processes generate the graphs and tables while SAS Add-In for Microsoft Office refreshes the document with updated graphs already sized and positioned on the slides just as we need them. With this new process, we are realizing the dream of reducing the amount of time spent on a single monthly production process. This paper will discuss the steps to creating a complex PowerPoint presentation that is simply refreshed rather than created new each month. I will discuss converting the original code to stored processes using SAS® Enterprise Guide®, options and style statements that are required to continue to use a custom style sheet, and how to create the PowerPoint presentation with an assortment of output types including horizontal bar charts, control charts, and tables. I will also discuss some of the challenges and solutions specific to the stored process and PowerPoint Add-In that we encountered during this conversion process.
Julie VanBuskirk, Baylor Health Care System
Often in a clinical trial, measures are needed to describe pain, discomfort, or physical constraints that are visible but not measurable through lab tests or other vital signs. In these cases, researchers turn to questionnaires to provide documentation of improvement or statistically meaningful change in support of safety and efficacy hypotheses. For example, in studies (like Parkinson's studies) where pain or depression are serious non-motor symptoms of the disease, these questionnaires provide primary endpoints for analysis. Questionnaire data presents unique challenges in both collection and analysis in the world of CDISC standards. The questions are usually aggregated into scale scores, as the underlying questions by themselves provide little additional usefulness. SAS® is a powerful tool for extraction of the raw data from the collection databases and transposition of columns into a basic data structure in SDTM, which is vertical. The data is then processed further as per the instructions in the Statistical Analysis Plan (SAP). This involves translation of the originally collected values into sums, and the values of some questions need to be reversed. Missing values can be computed as means of the remaining questions. These scores are then saved as new rows in the ADaM (analysis-ready) data sets. This paper describes the types of questionnaires, how data collection takes place, the basic CDISC rules for storing raw data in SDTM, and how to create analysis data sets with derived records using ADaM standards, while maintaining traceability to the original question.
Karin LaPann, PRA International
Terek Peterson, PRA International
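The scoring steps described in the abstract above (reversing some items, replacing missing answers with the mean of the remaining questions, and summing to a scale score) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SAS or ADaM code: the function name scale_score, the item names, the 1-to-5 response range, and the max_missing threshold are all hypothetical, and a real study would take these rules from its Statistical Analysis Plan.

```python
def scale_score(responses, reverse_items=(), min_val=1, max_val=5, max_missing=1):
    """Sum questionnaire items into a scale score.

    responses     : dict mapping item name to a numeric answer, or None if missing
    reverse_items : item names whose answers must be reversed before summing
    min_val, max_val : assumed response range, used for reversal
    max_missing   : assumed limit on missing items; beyond it no score is derived
    """
    adjusted = {}
    for item, value in responses.items():
        if value is None:
            adjusted[item] = None
        elif item in reverse_items:
            # Reverse scoring: the lowest answer becomes the highest and vice versa.
            adjusted[item] = min_val + max_val - value
        else:
            adjusted[item] = value

    observed = [v for v in adjusted.values() if v is not None]
    n_missing = len(adjusted) - len(observed)
    if not observed or n_missing > max_missing:
        return None  # too little data to derive a score

    # Missing answers are replaced by the mean of the answered items.
    item_mean = sum(observed) / len(observed)
    return sum(item_mean if v is None else v for v in adjusted.values())


# Hypothetical four-item scale with one reversed item and one missing answer.
answers = {"Q1": 4, "Q2": 2, "Q3": None, "Q4": 5}
score = scale_score(answers, reverse_items={"Q2"})
```

In an ADaM-style data set, a value derived this way would be stored as a new row alongside the original item rows, which keeps the derived score traceable to the questions it came from.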
With the growth in size and complexity of organizations investing in SAS® platform technologies, the size and complexity of ETL subsystems and data integration (DI) jobs are growing at a rapid rate. Developers are pushed to come up with new and innovative ways to improve process efficiency in their DI jobs to meet increasingly demanding service level agreements (SLAs). The ability to conditionally execute or switch paths in a DI job is an extremely useful technique for improving process efficiency. How can a SAS® Data Integration developer design a job to best suit conditional execution? This paper discusses a technique for providing a parameterized dynamic execution custom transformation that can be easily incorporated into SAS® Data Integration Studio jobs to provide process path switching capabilities. The aim of any data integration task is to ensure that all sources of business data are integrated as efficiently as possible. It is concerned with the repurposing of data via transformation, should be a value-adding process, and also should be the product of collaboration. Modularization of common or repeatable processes is a fundamental part of the collaboration process in DI design and development. Switch Path, a custom transformation built to conditionally execute branches or nodes in SAS Data Integration Studio, provides a reusable module for solving the conditional execution limitations of standard SAS Data Integration Studio transformations and jobs. Switch Path logic in SAS Data Integration Studio can serve many purposes in day-to-day business needs for a SAS data integration developer, as it is completely reusable.
Prajwal Shetty, Tesco
The Washington D.C. aqueduct was completed in 1863, carrying desperately needed clean water to its many residents. Just as the aqueduct was vital and important to its residents, a lifeline if you will, so too is the supply of data to the business. Without the flow of vital information, many businesses would not be able to make important decisions. The task of building my company's first dashboard was brought before us by our CIO; the business had not asked for it. In this poster, I discuss how we were able to bring fresh ideas and data to our business units by converting the data they saw on a daily basis in reports to dashboards. The road to success was long with plenty of struggles from creating our own business requirements to building data marts, synching SQL to SAS®, using information maps and SAS® Enterprise Guide® projects to move data around, all while dealing with technology and other I.T. team roadblocks. Then on to designing what would become our real-time dashboards, fighting for SharePoint single sign-on, and, oh yeah, user adoption. My story of how dashboards revitalized the business is a refreshing tale for all levels.
Jennifer McBride, Virginia Credit Union
SAS® is an outstanding suite of software, but not everyone in the workplace speaks SAS. However, almost everyone speaks Excel. Often, the data you are analyzing, the data you are creating, and the report you are producing are some form of Microsoft Excel spreadsheet. Every year at SAS® Global Forum, there are SAS and Excel presentations, not just because Excel is so pervasive in the workplace, but because there's always something new to learn (or re-learn)! This paper summarizes and references (and pays homage to!) previous SAS Global Forum presentations, as well as examines some of the latest Excel capabilities with the latest versions of SAS® 9.4 and SAS® Visual Analytics.
Andrew Howell, ANJ Solutions
Business Intelligence (BI) dashboards serve as an invaluable, high-level, visual reference tool for decision-making processes in many business industries. A request was made to our department to develop some BI dashboards that could be incorporated in an academic setting. These dashboards would aim to serve various undergraduate executive and administrative staff at the university. While most business data lends itself well to dashboard development, academic data is typically modeled differently and, therefore, presents unique challenges. In this paper, the authors detail and share the design and development process of creating dashboards for decision making in an academic environment utilizing SAS® BI Dashboard 4.3 and other SAS® Enterprise Business Intelligence 9.2 tools. The authors also provide lessons learned as well as recommendations for future implementations of BI dashboards utilizing academic data.
Evangeline Collado, University of Central Florida
Michelle Parente, University of Central Florida
Portfolio segmentation is key in all forecasting projects. Not all products are equally predictable. Nestlé uses animal names for its segmentation, and the animal behavior translates well into how the planners should plan these products. Mad Bulls are those products that are tough to predict if we don't know what is causing their unpredictability. The Horses are easier to deal with. Modern time-series-based statistical forecasting methods can tame Mad Bulls, as they allow explanatory variables to be added to the models. Nestlé now complements its Demand Planning solution based on SAP with predictive analytics technology provided by SAS®, to overcome these issues in an industry that is highly promotion-driven. In this talk, we will provide an overview of the relationship Nestlé is building with SAS, and provide concrete examples of how modern statistical forecasting methods available in SAS® Demand-Driven Planning and Optimization help us to increase forecasting performance, and therefore to provide high service to our customers with optimized stock, the primary goal of Nestlé's supply chains.
Marcel Baumgartner, Nestlé SA
The role of the Data Scientist is the viral job description of the decade. And like LOLcats, there are many types of Data Scientists. What is this new role? Who is hiring them? What do they do? What skills are required to do their job? What does this mean for the SAS® programmer and the statistician? Are they obsolete? And finally, if I am a SAS user, how can I become a Data Scientist? Come learn about this job of the future and what you can do to be part of it.
Chuck Kincaid, Experis Business Analytics
By understanding the actual gambling behavior of an individual over the Internet, we develop markers that identify behavioral patterns, which in turn can be used to predict the level of gambling risk to which a subscriber is prone. The data set contains 4,056 subscribers. Using SAS® Enterprise Miner™ 12.1, a set of models is run to predict which subscriber is likely to become a high-risk Internet gambler. The data contains 114 variables such as first active date and first active product used on the website as well as the characteristics of the game such as fixed odds, poker, casino, games, etc. Other measures of a subscriber's data such as money put at stake and what odds are being bet are also included. These variables provide a comprehensive view of a subscriber's behavior while gambling over the website. The target variable is modeled as a binary variable, 0 indicating a risky gambler and 1 indicating a controlled gambler. The data is a typical example of real-world data with many missing values and hence had to be transformed, imputed, and then later considered for analysis. The model comparison algorithm of SAS Enterprise Miner 12.1 was used to determine the best model. The Stepwise Regression model performs the best among a set of 25 models, which were run using over 100 permutations of each model. The Stepwise Regression model predicts a high-risk Internet gambler with an accuracy of 69.63% using variables such as wk4frequency and wk3frequency of bets.
Sai Vijay Kishore Movva, Oklahoma State University
Vandana Reddy, Oklahoma State University
Goutam Chakraborty, Oklahoma State University
A common complaint from users working on identifying fraud and abuse in Medicare is that teams focus on operational applications, static reports, and high-level outliers. But, when faced with the need to constantly evaluate changing Medicare provider and beneficiary or enrollee dynamics, users are clamoring for more dynamic and accurate detection approaches. Providing these organizations with a data discovery and predictive analytics framework that leverages Hadoop and other big data approaches, while providing a clear path for teams to make more fact-based decisions more quickly, is very important in pre- and post-fraud and abuse analysis. Organizations that pursue a reusable, services-based data discovery and analytics framework and architecture enjoy greater success in supporting data management, reporting, and analytics demands. They can quickly turn models into prioritized alerts and avoid improper or fraudulent payments. A successful framework should enable organizations to come up with efficient fraud, waste, and abuse models to address complex schemes; identify fraud, waste, and abuse vulnerabilities; and shorten triage efforts using a variety of data sourced from big data platforms like Hadoop and other relational database management systems. This paper talks about the data management, data discovery, predictive analytics, and social network analysis capabilities that are included in the SAS fraud framework and how a unified approach can significantly reduce the lifecycle of building and deploying fraud models. We hope this paper will provide IT leaders with a clear path for resolving issues from the simple to the incredibly complex, through a measured and scalable approach for delivering value for fraud, waste, and abuse models by providing deep insights to support evidence-based investigations.
Vivek Sethunatesan, Northrop Grumman Corp
Paper SAS1393-2014:
SAS® Workshop: SAS® Office Analytics
This workshop provides hands-on experience using SAS® Office Analytics. Workshop participants will complete the following tasks: use SAS® Enterprise Guide® to access and analyze data; create a stored process that can be shared across an organization; and access and analyze data sources and stored processes using the SAS® Add-In for Microsoft Office.
Eric Rossland, SAS
The Purchasing Department is considering contracting with your team for a new SAS® Enterprise BI application. The department manager has already met with SAS® and seen the sales pitch, and he is very interested. But he is a tightwad and not sure about spending the money. Also, he wants his team to be the primary developers for this new application. Before investing his money in training, programming, and support, he would like a proof of concept (POC). This paper will walk you through the seven steps to create a SAS Enterprise BI POC project: (1) Develop a kick-off meeting, including a full demo of the SAS Enterprise BI tools. (2) Set up your UNIX file systems and security. (3) Set up your SAS metadata ACTs, users, groups, folders, and libraries. (4) Make sure the necessary SAS client tools are installed on the developers' machines. (5) Hold a SAS Enterprise BI workshop to introduce them to the basics, including SAS® Enterprise Guide®, SAS® Stored Processes, SAS® Information Maps, SAS® Web Report Studio, SAS® Information Delivery Portal, and SAS® Add-In for Microsoft Office, along with supporting documentation. (6) Work with them to develop a simple project, one that highlights the benefits of SAS Enterprise BI and shows several methods for achieving the desired results. (7) Last but not least, follow up! Remember, your goal is not to launch a full-blown application. Instead, we'll strive toward helping them see the potential in your organization for applying this methodology.
Sheryl Weise, Wells Fargo
SAS® Visual Analytics enables you to conduct ad hoc data analysis, visually explore data, develop reports, and then share insights through the web and mobile tablet apps. You can now also share your insights with colleagues using the SAS® Office Analytics integration with Microsoft Excel, Microsoft Word, Microsoft PowerPoint, Microsoft Outlook, and Microsoft SharePoint. In addition to opening and refreshing reports created using SAS Visual Analytics, a new SAS® Central view enables you to manage and comment on your favorite and recent reports from your Microsoft Office applications. You can also view your SAS Visual Analytics results in SAS® Enterprise Guide®. Learn more about this integration and what's coming in the future in this breakout session.
David Bailey, SAS
I-Kong Fu, SAS
Anand Chitale, SAS
Distributing SAS® software to a large number of machines can be challenging at best and exhausting at worst. Common areas of concern for installers are silent automation, network traffic, ease of setup, standardized configurations, maintainability, and simply the sheer amount of time it takes to make the software available to end users. We describe a variety of techniques for easing the pain of provisioning SAS software, including the new standalone SAS® Enterprise Guide® and SAS® Add-In for Microsoft Office installers, as well as the tried and true SAS® Deployment Wizard record and playback functionality. We also cover ways to shrink SAS Software Depots, like the new 'subsetting recipe' feature, in order to ease scenarios requiring depot redistribution. Finally, we touch on alternate methods for workstation access to SAS client software, including application streaming, desktop virtualization, and Java Web Start.
Mark Schneider, SAS
For decades, SAS® has been the cornerstone of many organizations for business reporting. In more recent times, the ability to quickly determine the performance of an organization through the use of dashboards has become a requirement. Different ways of providing dashboard capabilities are discussed in this paper: using out-of-the-box solutions such as SAS® Visual Analytics and SAS® BI Dashboard, through to alternative solutions using SAS® Stored Processes, batch processes, and SAS® Integration Technologies. Extending the available indicators is also discussed, using Graph Template Language and KPI indicators provided with Base SAS®, as well as alternatives such as Google Charts and Flash objects. Real-world field experience, problem areas, solutions, and tips are shared, along with live examples of some of the different methods.
Mark Bodt, The Knowledge Warehouse (Knoware)
SAS® Add-In for Microsoft Office remains a popular tool for people who are not SAS® programmers due to its easy interface with the SAS servers. In this session, you'll learn some of the many tricks that other organizations use for getting more value out of the tool.
Tricia Aanderud, And Data Inc