SAS Global Forum 2016 Proceedings

A

Session 2260-2016:

A Practical Introduction to SAS^® Data Integration Studio

A useful and often overlooked tool released with the SAS^® Business Intelligence suite to aid in ETL is SAS^® Data Integration Studio. This product gives users the ability to extract, transform, join, and load data from various database management systems (DBMSs), data marts, and other data stores by using a graphical interface and without having to code different credentials for each schema. It enables seamless promotion of code to a production system without the need to alter the code. And it is quite useful for deploying and scheduling jobs by using the schedule manager in SAS^® Management Console, because all code created by Data Integration Studio is optimized. Although this tool enables users to create code from scratch, one of its most useful capabilities is that it can take legacy SAS^® code and, with minimal alterations, have its data associations created and have all the properties of a job coded from scratch.

Read the paper (PDF)

Erik Larsen, Independent Consultant

Session SAS5642-2016:

A Ringside Seat: The ODS Excel Destination versus the ODS ExcelXP Tagset

The new and highly anticipated SAS^® Output Delivery System (ODS) destination for Microsoft Excel is finally here! Available as a production feature in the third maintenance release of SAS^® 9.4 (TS1M3), this new destination generates native Excel (XLSX) files that are compatible with Microsoft Office 2010 or later. This paper is written for anyone, from entry-level programmers to business analysts, who uses the SAS^® System and Microsoft Excel to create reports. The discussion covers features and benefits of the new Excel destination, differences between the Excel destination and the older ExcelXP tagset, and functionality that exists in the ExcelXP tagset that is not available in the Excel destination. These topics are all illustrated with meaningful examples. The paper also explains how you can bridge the gap that exists as a result of differences in the functionality between the destination and the tagset. In addition, the discussion outlines when it is beneficial for you to use the Excel destination versus the ExcelXP tagset, and vice versa. After reading this paper, you should be able to make an informed decision about which tool best meets your needs.

Read the paper (PDF) | Watch the recording

Chevell Parker, SAS

B

Session SAS6485-2016:

Best Practices for Effective Model Risk Management

Financial institutions rely heavily on quantitative models for risk management, balance-sheet stress testing, and various business analyses and decision support functions. Investment decisions and business strategies are largely driven by estimates from models. Recent financial crises and model failures at high-profile banks have emphasized the need for better modeling practices. Regulators have stepped in to assist banks with enhanced guidance and regulations for effective model risk management. Effective model risk management is more than developing a good model. SAS^® Model Risk Management provides a robust framework to capture and track model inventory. In this paper, we present best practices in model risk management learned from implementation projects and interactions with industry experts. These best practices help firms that are setting up a model risk management framework or enhancing their existing practices.

Read the paper (PDF)

Satish Garla, SAS

Sukhbir Dhillon, SAS

C

Session SAS4240-2016:

Creating a Strong Business Case for SAS^® Grid Manager: Translating Grid Computing Benefits to Business Benefits

SAS^® Grid Manager, as well as other grid computing technologies, has a set of great capabilities that we, IT professionals, love to have in our systems. This technology provides high availability, allows parallel processing, accommodates increasing demand by scaling out, and offers other features that make life better for those managing and using these environments. However, even when business users take advantage of these features, they are more concerned about the business part of the problem. Most of the time, business groups hold the budgets and are key stakeholders for any SAS Grid Manager project. Therefore, it is crucial to demonstrate to business users how they will benefit from the new technologies, how the features will improve their daily operations, help them be more efficient and productive, and help them achieve better results. This paper guides you through a process to create a strong and persuasive business plan that translates the technology features from SAS Grid Manager to business benefits.

Read the paper (PDF) | Watch the recording

Marlos Bosso, SAS

Session SAS6685-2016:

Credit Risk Modeling in a New Era

The recent advances in regulatory stress testing, including stress testing regulated by Comprehensive Capital Analysis and Review (CCAR) in the US, the Prudential Regulation Authority (PRA) in the UK, and the European Banking Authority in the EU, as well as the new international accounting requirement known as IFRS 9 (International Financial Reporting Standard), all pose new challenges to credit risk modeling. The increasing sophistication of the models that are supposed to cover all the material risks in the underlying assets in various economic scenarios makes models harder to implement. Banks are spending a lot of resources on model implementation but are still facing issues due to long deployment times and a disconnect between the model development and implementation teams. Models are also required at a more granular level, in many cases down to the trade and account levels. Efficient model execution becomes valuable for banks to get timely responses to analysis requests. At the same time, models are subject to more stringent internal and external scrutiny. This paper introduces a suite of risk modeling solutions from credit risk modeling leader SAS^® to help banks overcome these new challenges and be competent to meet the regulatory requirements.

Read the paper (PDF)

Wei Chen, SAS

Martim Rocha, SAS

Jimmy Skoglund, SAS

Session SAS5245-2016:

Custom Risk Metrics with SAS^® High-Performance Risk

There are standard risk metrics financial institutions use to assess the risk of a portfolio. These include well-known measures like value at risk and expected shortfall and related measures like contribution value at risk. While there are industry-standard approaches for calculating these measures, it is often the case that financial institutions have their own methodologies. Further, financial institutions write their own measures, in addition to the common risk measures. SAS^® High-Performance Risk comes equipped with over 20 risk measures that use standard methodology, but the product also allows customers to define their own risk measures. These user-defined statistics are treated the same way as the built-in measures, but the logic is specified by the customer. This paper leads the user through the creation of custom risk metrics using the HPRISK procedure.

Read the paper (PDF)

Katherine Taylor, SAS

Steven Miles, SAS

D

Session 10740-2016:

Developing an On-Demand Web Report Platform Using Stored Processes and SAS^® Web Application Server

As SAS^® programmers, we often develop listings, graphs, and reports that need to be delivered frequently to our customers. We might decide to manually run the program every time we get a request, or we might easily schedule an automatic task to send a report at a specific date and time. Both scenarios have some disadvantages. If the report is manual, we have to find and run the program every time someone requests an updated version of the output. It takes some time and it is not the most interesting part of the job. If we schedule an automatic task in Windows, we still sometimes get an email from the customers because they need the report immediately. That means that we have to find and run the program for them. This paper explains how we developed an on-demand report platform using SAS^® Enterprise Guide^®, SAS^® Web Application Server, and stored processes. We had developed many reports for different customer groups, and we were getting more and more emails from them asking for updated versions of their reports. We felt we were not using our time wisely and decided to create an infrastructure where users could easily run their programs through a web interface. The tool that we created enables SAS programmers to easily release on-demand web reports with minimum programming. It has web interfaces developed using stored processes for the administrative tasks, and it also automatically customizes the front end based on the user who connects to the website. One of the challenges of the project was that certain reports had to be available to a specific group of users only.

Read the paper (PDF)

Romain Miralles, Genomic Health

E

Session SAS5246-2016:

Enterprise Data Governance across SAS^® and Beyond

As Data Management professionals, you have to comply with new regulations and controls. One such regulation is Basel Committee on Banking Supervision (BCBS) 239. To respond to these new demands, you have to put processes and methods in place to automate metadata collection and analysis, and to provide rigorous documentation around your data flows. You also have to deal with many aspects of data management including data access, data manipulation (ETL and other), data quality, data usage, and data consumption, often from a variety of toolsets that are not necessarily from a single vendor. This paper shows you how to use SAS^® technologies to support data governance requirements, including third party metadata collection and data monitoring. It highlights best practices such as implementing a business glossary and establishing controls for monitoring data. Attend this session to become familiar with the SAS tools used to meet the new requirements and to implement a more managed environment.

Read the paper (PDF)

Jeff Stander, SAS

Session SAS5060-2016:

Exploring SAS^® Embedded Process Technologies on Hadoop

SAS^® Embedded Process offers a flexible, efficient way to leverage increasing amounts of data by injecting the processing power of SAS^® directly where the data lives. SAS Embedded Process can tap into the massively parallel processing (MPP) architecture of Hadoop for scalable performance. Using SAS^® In-Database Technologies for Hadoop, you can run scoring models generated by SAS^® Enterprise Miner™ or, with SAS^® In-Database Code Accelerator for Hadoop, user-written DS2 programs in parallel. With SAS Embedded Process on Hadoop you can also perform data quality operations, and extract and transform data using SAS^® Data Loader. This paper explores key SAS technologies that run inside the Hadoop parallel processing framework and prepares you to get started with them.

Read the paper (PDF)

David Ghazaleh, SAS

F

Session 9260-2016:

FASHION, STYLE "GOTTA HAVE IT" COMPUTE DEFINE BLOCK

Do you create complex reports using PROC REPORT? Are you confused by the COMPUTE BLOCK feature of PROC REPORT? Are you even aware of it? Maybe you already produce reports using PROC REPORT, but suddenly your boss needs you to modify some of the values in one or more of the columns. Maybe your boss needs to see the values of some rows in boldface and others highlighted in a stylish yellow. Perhaps one of the columns in the report needs to display a variety of fashionable formats (some with varying decimal places and some without any decimals). Maybe the customer needs to see a footnote in specific cells of the report. Well, if this sounds familiar, then come take a look at the COMPUTE BLOCK of PROC REPORT. This paper shows a few tips and tricks for using the COMPUTE DEFINE block with conditional IF/THEN logic to make your reports stylish and fashionable. The COMPUTE BLOCK allows you to use DATA step code within PROC REPORT to provide customization and style to your reports. We'll see how the Census Bureau produces a stylish demographic profile for customers of its Special Census program using PROC REPORT with the COMPUTE BLOCK. The paper focuses on how to use the COMPUTE BLOCK to create this stylish Special Census profile. The paper shows quick tips and simple code to handle multiple formats within the same column, make the values in the Total rows boldface, apply traffic lighting, and add footnotes to any cell based on the column or row. The Special Census profile report is an Excel table created with ODS tagsets.ExcelXP that is stylish and fashionable, thanks in part to the COMPUTE BLOCK.

Read the paper (PDF) | Watch the recording

Chris Boniface, Census Bureau

Session 11201-2016:

Finding the National Best Bid and Offer--Quote by Quote

U.S. stock exchanges (currently there are 12) are tracked in real time via the Consolidated Tape System (CTS) and the Consolidated Quotation System (CQS). CQS contains every updated quote (buyer's bid price and seller's offer price) from each exchange, covering some 8,500 stock tickers. This is the basis by which brokers can honor their obligation to investors, mandated by the U.S. Securities and Exchange Commission, to execute transactions at the best price, that is, at the National Best Bid and Offer (NBBO). With the advent of electronic exchanges and high-frequency trading (timestamps are published to the microsecond), data set size has become a major operational consideration for market researchers re-creating NBBO values (over 1 billion quotes requiring 80 gigabytes of storage for a normal trading day). This presentation demonstrates a straightforward use of hash tables for tracking constantly changing quotes for each ticker/exchange combination, in tandem with an efficient means of determining changes in NBBO with every new quote.
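
The hash-table idea in this abstract can be illustrated with a minimal sketch. It is written in Python rather than SAS, it is not the paper's implementation, and the function name `nbbo_stream` and the quote tuple layout `(ticker, exchange, bid, offer)` are assumptions made for illustration. A dictionary keyed by ticker and exchange holds the latest quote from each exchange, and the best bid and offer are recomputed only for the ticker that just changed.

```python
from collections import defaultdict


def nbbo_stream(quotes):
    """Yield (ticker, best_bid, best_offer) each time the NBBO changes.

    `quotes` is an iterable of (ticker, exchange, bid, offer) tuples in
    time order. The tuple layout is a hypothetical one for this sketch.
    """
    book = defaultdict(dict)  # ticker -> {exchange: (bid, offer)}
    last = {}                 # ticker -> last (best_bid, best_offer) emitted
    for ticker, exchange, bid, offer in quotes:
        # A new quote replaces the previous quote from the same exchange.
        book[ticker][exchange] = (bid, offer)
        # Rescan only this ticker's exchanges (a small, bounded set).
        best_bid = max(b for b, _ in book[ticker].values())
        best_offer = min(o for _, o in book[ticker].values())
        if last.get(ticker) != (best_bid, best_offer):
            last[ticker] = (best_bid, best_offer)
            yield ticker, best_bid, best_offer


sample = [
    ("IBM", "N", 100.0, 100.5),
    ("IBM", "Q", 100.1, 100.6),
    ("IBM", "N", 99.9, 100.4),
    ("IBM", "Q", 100.1, 100.6),  # repeats the standing quote, so no NBBO change
]
changes = list(nbbo_stream(sample))
```

Rescanning every exchange for one ticker is the simplest possible update rule. The paper describes a more efficient means of detecting NBBO changes, which this sketch does not attempt to reproduce.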

Read the paper (PDF)

Mark Keintz, Wharton Research Data Services

G

Session SAS5501-2016:

Getting There from Here: Lifting Enterprise SAS^® to the Amazon Public Cloud

If your organization already deploys one or more software solutions via Amazon Web Services (AWS), you know the value of the public cloud. AWS provides a scalable public cloud with a global footprint, allowing users access to enterprise software solutions anywhere at any time. Although SAS^® began long before AWS was even imagined, many loyal organizations driven by SAS are moving their local SAS analytics into the public AWS cloud, alongside other software hosted by AWS. SAS^® Solutions OnDemand has assisted organizations in this transition. In this paper, we describe how we extended our enterprise hosting business to AWS. We describe the open source automation framework from which SAS Solutions OnDemand built our automation stack, which simplified the process of migrating a SAS implementation. We'll provide the technical details of our automation and network footprint, a discussion of the technologies we chose along the way, and a list of lessons learned.

Read the paper (PDF)

Ethan Merrill, SAS

Bryan Harkola, SAS

Session 7300-2016:

Graphing Made Easy for Project Management

Project management is a hot topic across many industries, and there are multiple commercial software applications for managing projects available. The reality, however, is that the majority of project management software is not applicable for daily usage. SAS^® has a solution for this issue that can be used for managing projects graphically in real time. This paper introduces a new paradigm for project management using the SAS^® Graph Template Language (GTL). SAS clients, in real time, can use GTL to visualize resource assignments, task plans, delivery tracking, and project status across multiple project levels for more efficient project management.

Read the paper (PDF)

Zhouming (Victor) Sun, MedImmune

H

Session 9800-2016:

How to Visualize SAS^® Data with JavaScript Libraries like HighCharts and D3

Have you ever wondered how to get the most from Web 2.0 technologies in order to visualize SAS^® data? How to make those graphs dynamic, so that users can explore the data in a controlled way, without needing prior knowledge of SAS products or data science? Wonder no more! In this session, you learn how to turn basic sashelp.stocks data into a snazzy HighCharts stock chart in which a user can review any time period, zoom in and out, and export the graph as an image. All of these features require only two DATA steps and one SORT procedure, for a total of 57 lines of SAS code.

Download the data file (ZIP) | View the e-poster or slides (PDF)

Vasilij Nevlev, Analytium Ltd

I

Session 8680-2016:

Integrating Microsoft VBScript and SAS^®

Microsoft Visual Basic Scripting Edition (VBScript) and SAS^® software are each powerful tools in their own right. These two technologies can be combined so that SAS code can call a VBScript program or vice versa. This gives a programmer the ability to automate SAS tasks; traverse the file system; send emails programmatically via Microsoft Outlook or SMTP; manipulate Microsoft Word, Microsoft Excel, and Microsoft PowerPoint files; get web data; and more. This paper presents example code to demonstrate each of these capabilities.

Read the paper (PDF) | Download the data file (ZIP)

Christopher Johnson, BrickStreet Insurance

K

Session 7140-2016:

Key Requirements For SAS^® Grid Users

Because SAS^® Grid Manager is becoming more and more popular, it is important to meet users' needs for a successful migration to a SAS^® Grid environment. This paper focuses on key requirements and common issues for new SAS Grid users, especially if they are coming from a traditional environment. This paper describes a few common requirements like the need for a current working directory, the change of file system navigation in SAS^® Enterprise Guide^® with a user-given location, getting a job execution summary email, and so on. The GRIDWORK directory has been introduced in SAS Grid Manager, which is a bit different from the traditional SAS WORK location. This paper explains how you can use the GRIDWORK location in a more user-friendly way. Sometimes users experience data set size differences during grid migration. A few important reasons for data set size differences are demonstrated. We also demonstrate how to create new custom scripts as per business needs and how to incorporate them with the SAS Grid Manager engine.

Read the paper (PDF) | View the e-poster or slides (PDF)

Piyush Singh, TATA Consultancy Services Ltd

Tanuj Gupta, TATA Consultancy Services

Prasoon Sangwan, TATA Consultancy Services Ltd

L

Session 11221-2016:

Leads and Lags: Static and Dynamic Queues in the SAS^® DATA Step

From stock price histories to hospital stay records, analysis of time series data often requires use of lagged (and occasionally lead) values of one or more analysis variables. For the SAS^® user, the central operational task is typically getting lagged (lead) values for each time point in the data set. While SAS has long provided a LAG function, it has no analogous lead function--an especially significant problem in the case of large data series. This paper reviews the LAG function, in particular the powerful, but non-intuitive implications of its queue-oriented basis. The paper demonstrates efficient ways to generate leads with the same flexibility as the LAG function, but without the common and expensive recourse to data re-sorting. It also shows how to dynamically generate leads and lags through use of the hash object.
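
The queue-oriented behavior mentioned in this abstract can be sketched as follows. This is Python, not SAS, and the names `Lag` and `leads` are hypothetical helpers for illustration. The sketch models a lag of depth n as a fixed-length FIFO queue that is updated only when it is called, which is why calling it conditionally returns the last value passed in rather than the previous row's value. The lead is computed by a simple look-ahead on an in-memory list, which is not the paper's technique.

```python
from collections import deque


class Lag:
    """A fixed-length FIFO queue that behaves like a queue-based lag."""

    def __init__(self, n=1):
        # The queue starts out filled with missing values (None).
        self.queue = deque([None] * n, maxlen=n)

    def __call__(self, value):
        oldest = self.queue[0]    # value pushed n calls ago
        self.queue.append(value)  # push the new value, dropping the oldest
        return oldest


def leads(values, n=1):
    """Return the value n positions ahead of each element, or None."""
    size = len(values)
    return [values[i + n] if i + n < size else None for i in range(size)]


prices = [10, 20, 30, 40]

lag1 = Lag(1)
lagged = [lag1(p) for p in prices]

# Calling the queue only on some rows shows the non-intuitive part:
# the result is the previous value passed in, not the previous row.
lag_cond = Lag(1)
conditional = [lag_cond(p) if i % 2 == 1 else None for i, p in enumerate(prices)]

ahead = leads(prices, 1)
```

Here `conditional` holds `None` for the rows where the queue was not called, and on the last row it returns 20 (the previous value pushed) rather than 30 (the previous row).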

Read the paper (PDF)

Mark Keintz, Wharton Research Data Services

Session SAS5140-2016:

Leverage Your Reports in SAS^® Visual Analytics: Using SAS^® Theme Designer

Is uniqueness essential for your reports? SAS^® Visual Analytics provides the ability to customize your reports to make them unique by using the SAS^® Theme Designer. The SAS Theme Designer can be accessed from the SAS^® Visual Analytics Hub to create custom themes to meet your branding needs and to ensure a unified look across your company. The report themes affect the colors, fonts, and other elements that are used in tables and graphs. The paper explores how to access SAS Theme Designer from the SAS Visual Analytics home page, how to create and modify report themes that are used in SAS Visual Analytics, how to create report themes from imported custom themes, and how to import and export custom report themes.

Read the paper (PDF)

Meenu Jaiswal, SAS

Ipsita Samantarai, SAS Research & Development (India) Pvt Ltd

M

Session 5580-2016:

Macro Variables in SAS^® Enterprise Guide^®

For SAS^® Enterprise Guide^® users, sometimes macro variables and their values need to be brought over to the local workspace from the server, especially when multiple data sets or outputs need to be written to separate files in a local drive. Manually retyping the macro variables and their values in the local workspace after they have been created on the server workspace would be time-consuming and error-prone, especially when we have quite a number of macro variables and values to bring over. Instead, this task can be achieved in an efficient manner by using dictionary tables and the CALL SYMPUT routine, as illustrated in more detail below. The same approach can also be used to bring macro variables and their values from the local to the server workspace.

Read the paper (PDF) | Download the data file (ZIP) | Watch the recording

Khoi To, Office of Planning and Decision Support, Virginia Commonwealth University

Session SAS6364-2016:

Macroeconomic Simulation Analysis for Multi-asset Class Portfolio Returns

Macroeconomic simulation analysis provides in-depth insights into a portfolio's performance spectrum. Conventionally, portfolio and risk managers obtain macroeconomic scenarios from third parties such as the Federal Reserve and determine portfolio performance under the provided scenarios. In this paper, we propose a technique to extend scenario analysis to an unconditional simulation capturing the distribution of possible macroeconomic climates and hence the true multivariate distribution of returns. We propose a methodology that adds to the existing scenario analysis tools and can be used to determine which types of macroeconomic climates have the most adverse outcomes for the portfolio. This provides a broader perspective on value-at-risk measures, thereby allowing more robust investment decisions. We explain the use of SAS^® procedures like VARMAX and PROC COPULA in SAS/IML^® in this analysis.

Read the paper (PDF)

Srikant Jayaraman, SAS

Joe Burdis, SAS Research and Development

Lokesh Nagar, SAS Research and Development

Session SAS6344-2016:

Mass-Scale, Automated Machine Learning and Model Deployment Using SAS^® Factory Miner and SAS^® Decision Manager

Business problems have become more stratified and micro-segmentation is driving the need for mass-scale, automated machine learning solutions. Additionally, deployment environments include diverse ecosystems, requiring hundreds of models to be built and deployed quickly via web services to operational systems. The new SAS^® automated modeling tool allows you to build and test hundreds of models across all of the segments in your data, testing a wide variety of machine learning techniques. The tool is completely customizable, allowing you transparent access to all modeling results. This paper shows you how to identify hundreds of champion models using SAS^® Factory Miner, while generating scoring web services using SAS^® Decision Manager. Immediate benefits include efficient model deployments, which allow you to spend more time generating insights that might reveal new opportunities, expose hidden risks, and fuel smarter, well-timed decisions.

Read the paper (PDF)

Jonathan Wexler, SAS

Steve Sparano, SAS

Session SAS5801-2016:

Minimizing Fraud Risk through Dynamic Entity Resolution and Network Analysis

Every day, businesses have to remain vigilant of fraudulent activity, which threatens customers, partners, employees, and financials. Normally, networks of people or groups perpetrate deviant activity. Finding these connections is now made easier for analysts with SAS^® Visual Investigator, an upcoming SAS^® solution that ultimately minimizes the loss of money and preserves mutual trust among its shareholders. SAS Visual Investigator takes advantage of the capabilities of the new SAS^® In-Memory Server. Investigators can efficiently investigate suspicious cases across business lines, which has traditionally been difficult. However, the time required to collect, process and identify emerging fraud and compliance issues has been costly. Making proactive analysis accessible to analysts is now more important than ever. SAS Visual Investigator was designed with this goal in mind and a key component is the visual social network view. This paper discusses how the network analysis view of SAS Visual Investigator, with all its dynamic visual capabilities, can make the investigative process more informative and efficient.

Read the paper (PDF)

Danielle Davis, SAS

Stephen Boyd, SAS Institute

Ray Ong, SAS Institute

Session 10460-2016:

Missing Values: They Are NOT Nothing

When analyzing data with SAS^®, we often encounter missing or null values in data. Missing values can arise from the availability, collectibility, or other issues with the data. They represent the imperfect nature of real data. Under most circumstances, we need to clean, filter, separate, impute, or investigate the missing values in data. These processes can take up a lot of time, and they are annoying. For these reasons, missing values are usually unwelcome and need to be avoided in data analysis. There are two sides to every coin, however. If we can think outside the box, we can take advantage of the negative features of missing values for positive uses. Sometimes, we can create and use missing values to achieve our particular goals in data manipulation and analysis. These approaches can make data analyses convenient and improve work efficiency for SAS programming. This kind of creative and critical thinking is the most valuable quality for data analysts. This paper exploits real-world examples to demonstrate the creative uses of missing values in data analysis and SAS programming, and discusses the advantages and disadvantages of these methods and approaches. The illustrated methods and advanced programming skills can be used in a wide variety of data analysis and business analytics fields.

Read the paper (PDF)

Justin Jia, Trans Union Canada

Shan Shan Lin, CIBC

N

Session 10360-2016:

Nine Frequently Asked Questions about Getting Started with SAS^® Visual Analytics

You've heard all the talk about SAS^® Visual Analytics--but maybe you are still confused about how the product would work in your SAS^® environment. Many customers have the same points of confusion about what they need to do with their data, how to get data into the product, how SAS Visual Analytics would benefit them, and even whether they should be considering Hadoop or the cloud. In this paper, we cover the questions we are asked most often about implementation, administration, and usage of SAS Visual Analytics.

Read the paper (PDF) | Watch the recording

Tricia Aanderud, Zencos Consulting LLC

Ryan Kumpfmiller, Zencos Consulting

Nick Welke, Zencos Consulting

P

Session 7540-2016:

PROC SQL for SQL DieHards

Inspired by Christianna Williams's paper on transitioning to PROC SQL from the DATA step, this paper aims to help SQL programmers transition to SAS^® by using PROC SQL. SAS adopted the Structured Query Language (SQL) by means of PROC SQL back in SAS^® 6. PROC SQL syntax closely resembles SQL. However, there are some SQL features that are not available in SAS. Throughout this paper, we outline common SQL tasks and how they might differ in PROC SQL. We also introduce useful SAS features that are not available in SQL. Topics covered are appropriate for novice SAS users.

Read the paper (PDF)

Barbara Ross, NA

Jessica Bennett, Snap Finance

Session 2480-2016:

Performing Pattern Matching by Using Perl Regular Expressions

SAS^® software provides many DATA step functions that search and extract patterns from a character string, such as SUBSTR, SCAN, INDEX, and TRANWRD. Using these functions to perform pattern matching often requires you to use many function calls to match a character position. However, using the Perl regular expression (PRX) functions or routines in the DATA step improves pattern-matching tasks by reducing the number of function calls and making the program easier to maintain. This talk, in addition to discussing the syntax of Perl regular expressions, demonstrates many real-world applications.
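
As a rough illustration of how one regular expression can stand in for a chain of substring and index calls, here is a minimal sketch using Python's `re` module rather than the SAS PRX functions that the paper covers. The phone-number task, the pattern, and the name `normalize_phone` are assumptions chosen for illustration and are not taken from the paper.

```python
import re

# One pattern accepts "(919) 555-0100", "919-555-0100", and "919.555.0100".
# Each group of digits is captured so it can be reassembled afterwards.
PHONE = re.compile(r"\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})")


def normalize_phone(text):
    """Return the first phone number in `text` as NNN-NNN-NNNN, or None."""
    match = PHONE.search(text)
    if match is None:
        return None
    area, prefix, line = match.groups()
    return f"{area}-{prefix}-{line}"


examples = [
    normalize_phone("Call (919) 555-0100 today"),
    normalize_phone("919.555.0100"),
    normalize_phone("no number here"),
]
```

Doing the same with position-based string functions would take separate calls to locate each delimiter and extract each group of digits, which is the maintenance burden the abstract describes.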

Read the paper (PDF) | Download the data file (ZIP)

Arthur Li, City of Hope

Session 1820-2016:

Portfolio Optimization with Discontinuous Constraint

Optimization models require continuous constraints to converge. However, some real-life problems are better described by models that incorporate discontinuous constraints. A common type of such discontinuous constraints becomes apparent when a regulation-mandated diversification requirement is implemented in an investment portfolio model. Generally stated, the requirement postulates that the aggregate of investments with individual weights exceeding a certain threshold in the portfolio should not exceed some predefined total within the portfolio. This format of the diversification requirement can be defined by the rules of any specific portfolio construction methodology and is commonly imposed by the regulators. The paper discusses the impact of this type of discontinuous portfolio diversification constraint on the portfolio optimization model solution process, and develops a convergent approach. The latter includes a sequence of definite series of convergent non-linear optimization problems and is presented in the framework of the OPTMODEL procedure modeling environment. The approach discussed has been used in constructing investable equity indexes.
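
The diversification requirement stated in this abstract can be written as a short check. The following Python sketch only evaluates the constraint for a given set of weights and does not attempt the paper's optimization approach. The function names and the 5% threshold and 40% cap in the example are assumptions chosen for illustration.

```python
def concentrated_total(weights, threshold):
    """Sum of the weights that individually exceed `threshold`."""
    return sum(w for w in weights if w > threshold)


def satisfies_diversification(weights, threshold, cap):
    """True if the positions above `threshold` together stay within `cap`."""
    return concentrated_total(weights, threshold) <= cap


# Hypothetical rule: positions above 5% may total at most 40% of the portfolio.
THRESHOLD, CAP = 0.05, 0.40

compliant = satisfies_diversification([0.20, 0.15, 0.05, 0.05], THRESHOLD, CAP)
breaching = satisfies_diversification([0.30, 0.20, 0.04, 0.04], THRESHOLD, CAP)

# The discontinuity: nudging a weight just past the threshold makes the
# constrained total jump from zero to the full weight.
before = concentrated_total([0.05], THRESHOLD)
after = concentrated_total([0.0501], THRESHOLD)
```

The jump between `before` and `after` is what makes the constraint discontinuous in the weights, and is the property the abstract says standard solvers have trouble with.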

Read the paper (PDF) | Download the data file (ZIP) | View the e-poster or slides (PDF)

Taras Zlupko, University of Chicago

Robert Spatz, University of Chicago

Session 7560-2016:

Processing CDC and SCD Type 2 for Sources without CDC: A Hybrid Approach

In a data warehousing system, change data capture (CDC) plays an important part not just in making the data warehouse (DWH) aware of the change but also in providing a means of flowing the change to the DWH marts and reporting tables so that we see the current and latest version of the truth. This and slowly changing dimensions (SCD) create a cycle that runs the DWH and provides valuable insights into the history and for future decision making. What if the source has no CDC? It would be an ETL nightmare to identify the exact change and report the absolute truth. If these two processes can be combined into a single process where just one single transform does both jobs of identifying the change and applying the change to the DWH, then we can save significant processing time and valuable system resources. Hence, I came up with a hybrid SCD with CDC approach for this. My paper focuses on sources that do not have CDC and need to perform SCD Type 2 on such records without worrying about data duplication and increased processing times.
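
One common way to combine change detection with an SCD Type 2 load when the source offers no CDC is to compare a digest of each incoming record against the current dimension row. The Python sketch below shows that general idea only. It is not the author's hybrid transform, and the row layout (`key`, `attrs`, `hash`, `valid_from`, `valid_to`) and function names are hypothetical.

```python
import hashlib


def row_hash(attrs):
    """Digest of a record's attributes, independent of key order."""
    payload = "|".join(f"{name}={attrs[name]}" for name in sorted(attrs))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def apply_scd2(dimension, snapshot, load_date):
    """Detect changes in a full source snapshot and apply them as SCD Type 2.

    `dimension` is a list of row dicts; `snapshot` maps business key -> attrs.
    Unchanged records are skipped, changed records close the current row and
    add a new version, and unseen keys are inserted. Keys that disappear from
    the source are not handled in this sketch.
    """
    current = {row["key"]: row for row in dimension if row["valid_to"] is None}
    for key, attrs in snapshot.items():
        digest = row_hash(attrs)
        old = current.get(key)
        if old is not None and old["hash"] == digest:
            continue                     # no change detected
        if old is not None:
            old["valid_to"] = load_date  # close the superseded version
        dimension.append({
            "key": key,
            "attrs": dict(attrs),
            "hash": digest,
            "valid_from": load_date,
            "valid_to": None,            # None marks the current version
        })
    return dimension


dim = []
apply_scd2(dim, {1: {"city": "Perth"}}, "2016-01-01")
apply_scd2(dim, {1: {"city": "Perth"}}, "2016-01-02")   # unchanged, no new row
apply_scd2(dim, {1: {"city": "Sydney"}}, "2016-01-03")  # changed, new version
```

Because the comparison and the update happen in one pass over the snapshot, reloading an unchanged record never creates a duplicate row.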

Read the paper (PDF) | Watch the recording

Vishant Bhat, University of Newcastle

Tony Blanch, SAS Consultant

Session 10481-2016:

Product Purchase Sequence Analyses by Using a Horizontal Data Sorting Technique

Horizontal data sorting is a very useful SAS^® technique in advanced data analysis when you are using SAS programming. Two years ago (SAS^® Global Forum Paper 376-2013), we presented and illustrated various methods and approaches to perform horizontal data sorting, and we demonstrated its valuable application in strategic data reporting. However, this technique can also be used as a creative analytic method in advanced business analytics. This paper presents and discusses its innovative and insightful applications in product purchase sequence analyses such as product opening sequence analysis, product affinity analysis, next best offer analysis, time-span analysis, and so on. Compared to other analytic approaches, the horizontal data sorting technique has the distinct advantages of being straightforward, simple, and convenient to use. This technique also produces easy-to-interpret analytic results. Therefore, the technique can have a wide variety of applications in customer data analysis and business analytics fields.
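
Horizontal sorting means ordering values across the columns of a single row instead of ordering rows. A minimal Python sketch of how that yields a purchase sequence follows. The product names, the ISO date strings, and the function name `purchase_sequence` are assumptions for illustration, and the paper's SAS techniques are not reproduced here.

```python
def purchase_sequence(open_dates):
    """Order one customer's products by the date each was opened.

    `open_dates` maps product name -> ISO date string, or None if the
    customer never opened that product. ISO dates sort correctly as text.
    """
    dated = [(date, product) for product, date in open_dates.items()
             if date is not None]
    return [product for date, product in sorted(dated)]


# One row of a hypothetical customer table, with one column per product.
customer = {
    "chequing": "2014-03-01",
    "card": "2013-07-15",
    "mortgage": None,
    "savings": "2015-01-10",
}
sequence = purchase_sequence(customer)
```

Once each row has been reduced to an ordered sequence like this, counting the most frequent first product or the most common product following a given one becomes a simple tabulation.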

Read the paper (PDF) | View the e-poster or slides (PDF)

Justin Jia, Trans Union Canada

Shan Shan Lin, CIBC

T

Session SAS2560-2016:

Ten Tips to Unlock the Power of Hadoop with SAS^®

This paper discusses a set of practical recommendations for optimizing the performance and scalability of your Hadoop system using SAS^®. Topics include recommendations gleaned from actual deployments from a variety of implementations and distributions. Techniques cover tips for improving performance and working with complex Hadoop technologies such as Kerberos, techniques for improving efficiency when working with data, methods to better leverage the SAS in Hadoop components, and other recommendations. With this information, you can unlock the power of SAS in your Hadoop system.

Read the paper (PDF)

Nancy Rausch, SAS

Wilbram Hazejager, SAS

Session 7120-2016:

The Combination of SAS^® and VBA Makes Life Easier

VBA has been described as a glue language and has been widely used to exchange data between Microsoft products such as Excel, Word, and PowerPoint. How to trigger a VBA macro from SAS^® via DDE has been widely discussed in recent years. However, using SAS to send parameters to a VBA macro has seldom been reported. This paper provides a solution to this problem. Copying Excel tables to PowerPoint using the combination of SAS and VBA is illustrated as an example. The SAS program rapidly scans all Excel files that are contained in one folder, passes the file information to VBA as parameters, and triggers the VBA macro to write PowerPoint files in a loop. As a result, a batch of PowerPoint files can be generated with just one mouse click.

Read the paper (PDF) | Watch the recording

Zhu Yanrong, Medtronic

Session SAS6477-2016:

The Optimization of the Optimal Customer

For marketers who are responsible for identifying the best customer to target in a campaign, it is often daunting to determine which media channel, offer, or campaign program the customer is most apt to respond to and, therefore, which is most likely to increase revenue. This presentation examines the components of designing campaigns to identify promotable segments of customers and to target the optimal customers using SAS^® Marketing Automation integrated with SAS^® Marketing Optimization.

Read the paper (PDF)

Pamela Dixon, SAS

Session 7020-2016:

Three Methods to Dynamically Assign Colors to Plots Based on Group Value

Specifying colors based on group value is a popular practice in visualizing data, but it is not easy to do, especially when there are multiple group values. This paper explores three methods to dynamically assign colors to plots based on their group values: combining the EVAL and IFN functions in the plot statements; bringing the DISCRETEATTRMAP block into the plot statements; and using the macro from the SAS^® sample 40255.

Read the paper (PDF) | Watch the recording

Amos Shu, MedImmune

U

Session SAS5244-2016:

Unleashing High-Performance Risk Data with the Hadoop Custom File Reader

SAS^® High-Performance Risk distributes financial risk data and big data portfolios with complex analyses across a networked Hadoop Distributed File System (HDFS) grid to support rapid in-memory queries for hundreds of simultaneous users. This data is extremely complex and must be stored in a proprietary format to guarantee data affinity for rapid access. However, customers still desire the ability to view and process this data directly. This paper demonstrates how to use the HPRISK custom file reader to directly access risk data in Hadoop MapReduce jobs, using the HPDS2 procedure and the LASR procedure.

Read the paper (PDF) | Download the data file (ZIP)

Mike Whitcher, SAS

Stacey Christian, SAS

Phil Hanna, SAS Institute

Don McAlister, SAS

Session SAS6660-2016:

Using Metadata Queries To Build Row-Level Audit Reports in SAS^® Visual Analytics

Sensitive data requires elevated security and the flexibility to apply logic that subsets data based on user privileges. Following the instructions in SAS^® Visual Analytics: Administration Guide gives you the ability to apply row-level permission conditions. After you have set the permissions, you have to prove through audits who has access and which row-level security conditions apply. This paper provides you with the ability to easily apply, validate, report, and audit all tables that have row-level permissions, along with the groups, users, and conditions that will be applied. Take the hours of maintenance and lack of visibility out of row-level secure data and build confidence in the data and analytics that are provided to the enterprise.

Read the paper (PDF) | Download the data file (ZIP)

Brandon Kirk, SAS

Session 5581-2016:

Using PROC TABULATE and LAG(n) Function for Rates of Change

For SAS^® users, PROC TABULATE and PROC REPORT (and its compute blocks) are probably among the most common procedures for calculating and displaying data. It is, however, difficult to calculate and display changes from one column to another using data from other rows with just these two procedures. Compute blocks in PROC REPORT can calculate additional columns, but it is challenging to pick up values from other rows as inputs. This presentation shows how PROC TABULATE can work with the LAG(n) function to calculate rates of change from one period of time to another. This offers the flexibility of feeding data retrieved from other rows of the report into calculations. PROC REPORT is then used to produce the desired output. The same approach can also be used in a variety of scenarios to produce customized reports.
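The arithmetic behind a lag-based rate of change can be sketched in a few lines. This Python example is only an illustration of the calculation: the helper names `lag` and `rates_of_change` and the enrollment figures are hypothetical, and the simple look-back used here does not reproduce the queue semantics of the SAS LAG function.

```python
def lag(values, n=1):
    """Return values shifted down by n rows, padded with None at the start."""
    padded = [None] * n + list(values)
    return padded[:len(values)]


def rates_of_change(values, n=1):
    """Rate of change of each row relative to the row n periods earlier."""
    rates = []
    for now, before in zip(values, lag(values, n)):
        if before is None or before == 0:
            rates.append(None)  # no earlier period, or division by zero
        else:
            rates.append((now - before) / before)
    return rates


# Hypothetical figures for three consecutive periods.
enrollment = [200, 220, 209]
rates = rates_of_change(enrollment)
```

The first period has no predecessor, so its rate is left missing; each later period is compared with the value retrieved from the earlier row.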

Read the paper (PDF) | Download the data file (ZIP) | Watch the recording

Khoi To, Office of Planning and Decision Support, Virginia Commonwealth University

W

Session SAS2400-2016:

What's New in SAS^® Data Management

The latest releases of SAS^® Data Integration Studio, SAS^® Data Management Studio and SAS^® Data Integration Server, SAS^® Data Governance, and SAS/ACCESS^® software provide a comprehensive and integrated set of capabilities for collecting, transforming, and managing your data. The latest features in the product suite include capabilities for working with data from a wide variety of environments and types including Hadoop, cloud, RDBMS, files, unstructured data, streaming, and others, and the ability to perform ETL and ELT transformations in diverse run-time environments including SAS^®, database systems, Hadoop, Spark, SAS^® Analytics, cloud, and data virtualization environments. There are also new capabilities for lineage, impact analysis, clustering, and other data governance features for enhancements to master data and support metadata management. This paper provides an overview of the latest features of the SAS^® Data Management product suite and includes use cases and examples for leveraging product capabilities.

Read the paper (PDF)

Nancy Rausch, SAS

Session 7080-2016:

What's the Difference?

Each night on the news we hear the level of the Dow Jones Industrial Average along with the 'first difference,' which is today's price-weighted average minus yesterday's. It is that series of first differences that excites or depresses us each night as it reflects whether stocks made or lost money that day. Furthermore, the differences form the data series that has the most addressable statistical features. In particular, the differences satisfy the stationarity requirement, which justifies standard distributional results such as asymptotically normal distributions of parameter estimates. Differencing arises in many practical time series because they seem to have what are called 'unit roots,' which mathematically indicate the need to take differences. In 1976, Dickey and Fuller developed the first well-known tests to decide whether differencing is needed. These tests are part of the ARIMA procedure in SAS/ETS^® in addition to many other time series analysis products. I'll review a little of what it was like to do the development and the required computing back then, say a little about why this is an important issue, and focus on examples.
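A first difference is simply today's level minus yesterday's. The Python sketch below, with made-up numbers and a hypothetical helper `first_difference`, builds a series of levels by accumulating daily changes and shows that differencing recovers those changes.

```python
from itertools import accumulate


def first_difference(series):
    """Return today's value minus yesterday's for each consecutive pair."""
    return [today - yesterday for yesterday, today in zip(series, series[1:])]


# Hypothetical daily changes; the levels accumulate them from a start of 100.
shocks = [3, -1, 4, -1, -5, 9]
levels = list(accumulate(shocks, initial=100))

differences = first_difference(levels)  # recovers the daily changes
```

In a series with a unit root, the levels wander because every change is carried forward, while the differenced series contains only the changes themselves.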

Read the paper (PDF) | Watch the recording

David Dickey, NC State University

Session SAS5520-2016:

When the Answer to Public or Private Is Both: Managing a Hybrid Cloud Environment

For many organizations, the answer to whether to manage their data and analytics in a public or private cloud is going to be both. Both can be the answer for many different reasons: the common-sense logic of not replacing a system that already works just to incorporate something new; legal or corporate regulations that require some data, but not all data, to remain in place; and even a desire to provide local employees with a traditional data center experience while providing remote or international employees with cloud-based analytics easily managed through software deployed via Amazon Web Services (AWS). In this paper, we discuss some of the unique technical challenges of managing a hybrid environment, including how to monitor system performance simultaneously for two different systems that might not share the same infrastructure or even provide comparable system monitoring tools; how to manage authorization when access and permissions might be driven by two different security technologies that make implementation of a single protocol problematic; and how to ensure overall automation of two platforms that might be independently automated but were not originally designed to work together. We also share lessons learned from a decade of experience implementing hybrid cloud environments.

Read the paper (PDF)

Ethan Merrill, SAS

Bryan Harkola, SAS

Y

Session 10600-2016:

You Can Bet on It: Missing Observations Are Preserved with the PRELOADFMT and COMPLETETYPES Options

Do you write reports that sometimes have missing categories across all class variables? Some programmers write all sorts of additional DATA step code in order to show the zeros for the missing rows or columns. Did you ever wonder whether there is an easier way to accomplish this? PROC MEANS and PROC TABULATE, in conjunction with PROC FORMAT, can handle this situation with a couple of powerful options. With PROC TABULATE, we can use the PRELOADFMT and PRINTMISS options in conjunction with a user-defined format in PROC FORMAT to accomplish this task. With PROC SUMMARY, we can use the COMPLETETYPES option to get all the rows with zeros. This paper uses examples from Census Bureau tabulations to illustrate the use of these procedures and options to preserve missing rows or columns.
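The result these options aim for, a table that lists every defined category combination even when no observations fall in it, can be illustrated outside SAS. The Python sketch below is an analogy for the outcome, not an implementation of PRELOADFMT or COMPLETETYPES; the function `complete_counts`, the class variables, and the records are invented for the example and are not Census Bureau data.

```python
from collections import Counter
from itertools import product


def complete_counts(records, levels):
    """Count records for every combination of class levels, including zeros.

    records: list of dicts, one per observation
    levels:  dict mapping class variable -> list of all defined levels
    """
    names = list(levels)
    observed = Counter(tuple(rec[name] for name in names) for rec in records)
    all_combos = product(*(levels[name] for name in names))
    return {combo: observed.get(combo, 0) for combo in all_combos}


# Hypothetical class levels, as a user-defined format might list them.
levels = {"region": ["North", "South"], "tenure": ["Owner", "Renter"]}
records = [
    {"region": "North", "tenure": "Owner"},
    {"region": "North", "tenure": "Owner"},
    {"region": "South", "tenure": "Renter"},
]

table = complete_counts(records, levels)
```

The combinations with no observations appear in the table with a count of zero instead of being dropped.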

Read the paper (PDF) | Watch the recording

Chris Boniface, Census Bureau

Janet Wysocki, U.S. Census Bureau