The new and highly anticipated SAS® Output Delivery System (ODS) destination for Microsoft Excel is finally here! Available as a production feature in the third maintenance release of SAS® 9.4 (TS1M3), this new destination generates native Excel (XLSX) files that are compatible with Microsoft Office 2010 or later. This paper is written for anyone, from entry-level programmers to business analysts, who uses the SAS® System and Microsoft Excel to create reports. The discussion covers features and benefits of the new Excel destination, differences between the Excel destination and the older ExcelXP tagset, and functionality that exists in the ExcelXP tagset that is not available in the Excel destination. These topics are all illustrated with meaningful examples. The paper also explains how you can bridge the gap that exists as a result of differences in the functionality between the destination and the tagset. In addition, the discussion outlines when it is beneficial for you to use the Excel destination versus the ExcelXP tagset, and vice versa. After reading this paper, you should be able to make an informed decision about which tool best meets your needs.
Chevell Parker, SAS
The surge of data and data sources in marketing has created an analytical bottleneck in most organizations. Analytics departments have been pushed into a difficult decision: either purchase black-box analytical tools to generate efficiencies or hire more analysts, modelers, and data scientists. Knowledge gaps stemming from restrictions in black-box tools or from backlogs in the work of analytical teams have resulted in lost business opportunities. Existing big data analytics tools respond well when dealing with large record counts and small variable counts, but they fall short in bringing efficiencies when dealing with wide data. This paper discusses the importance of an agile modeling engine designed to deliver productivity, irrespective of the size of the data or the complexity of the modeling approach.
Mariam Seirafi, Cornerstone Group of Companies
Nowadays, the recommender system is a popular tool for online retailer businesses to predict customers' next-product-to-buy (NPTB). Based on statistical techniques and the information collected by the retailer, an efficient recommender system can suggest a meaningful NPTB to customers. A useful suggestion can reduce the customer's searching time for a wanted product and improve the buying experience, thus increasing the chance of cross-selling for online retailers and helping them build customer loyalty. Within a recommender system, the combination of advanced statistical techniques with available information (such as customer profiles, product attributes, and popular products) is the key element in using the retailer's database to produce a useful suggestion of an NPTB for customers. This paper illustrates how to create a recommender system with the SAS® RECOMMEND procedure for online business. Using the recommender system, we can produce predictions, compare the performance of different predictive models (such as decision trees or multinomial discrete-choice models), and make business-oriented recommendations from the analysis.
Miao Nie, ABN AMRO Bank
Shanshan Cong, SAS Institute
SAS® Grid Manager, as well as other grid computing technologies, have a set of great capabilities that we, IT professionals, love to have in our systems. This technology increases high availability, allows parallel processing, facilitates increasing demand by scale out, and offers other features that make life better for those managing and using these environments. However, even when business users take advantage of these features, they are more concerned about the business part of the problem. Most of the time business groups hold the budgets and are key stakeholders for any SAS Grid Manager project. Therefore, it is crucial to demonstrate to business users how they will benefit from the new technologies, how the features will improve their daily operations, help them be more efficient and productive, and help them achieve better results. This paper guides you through a process to create a strong and persuasive business plan that translates the technology features from SAS Grid Manager to business benefits.
Marlos Bosso, SAS
As SAS® programmers, we often develop listings, graphs, and reports that need to be delivered frequently to our customers. We might decide to manually run the program every time we get a request, or we might easily schedule an automatic task to send a report at a specific date and time. Both scenarios have some disadvantages. If the report is manual, we have to find and run the program every time someone request an updated version of the output. It takes some time and it is not the most interesting part of the job. If we schedule an automatic task in Windows, we still sometimes get an email from the customers because they need the report immediately. That means that we have to find and run the program for them. This paper explains how we developed an on-demand report platform using SAS® Enterprise Guide®, SAS® Web Application Server, and stored processes. We had developed many reports for different customer groups, and we were getting more and more emails from them asking for updated versions of their reports. We felt we were not using our time wisely and decided to create an infrastructure where users could easily run their programs through a web interface. The tool that we created enables SAS programmers to easily release on-demand web reports with minimum programming. It has web interfaces developed using stored processes for the administrative tasks, and it also automatically customizes the front end based on the user who connects to the website. One of the challenges of the project was that certain reports had to be available to a specific group of users only.
Romain Miralles, Genomic Health
As Data Management professionals, you have to comply with new regulations and controls. One such regulation is Basel Committee on Banking Supervision (BCBS) 239. To respond to these new demands, you have to put processes and methods in place to automate metadata collection and analysis, and to provide rigorous documentation around your data flows. You also have to deal with many aspects of data management including data access, data manipulation (ETL and other), data quality, data usage, and data consumption, often from a variety of toolsets that are not necessarily from a single vendor. This paper shows you how to use SAS® technologies to support data governance requirements, including third party metadata collection and data monitoring. It highlights best practices such as implementing a business glossary and establishing controls for monitoring data. Attend this session to become familiar with the SAS tools used to meet the new requirements and to implement a more managed environment.
Jeff Stander, SAS
Do you create complex reports using PROC REPORT? Are you confused by the COMPUTE BLOCK feature of PROC REPORT? Are you even aware of it? Maybe you already produce reports using PROC REPORT, but suddenly your boss needs you to modify some of the values in one or more of the columns. Maybe your boss needs to see the values of some rows in boldface and others highlighted in a stylish yellow. Perhaps one of the columns in the report needs to display a variety of fashionable formats (some with varying decimal places and some without any decimals). Maybe the customer needs to see a footnote in specific cells of the report. Well, if this sounds familiar then come take a look at the COMPUTE BLOCK of PROC REPORT. This paper shows a few tips and tricks of using the COMPUTE DEFINE block with conditional IF/THEN logic to make your reports stylish and fashionable. The COMPUTE BLOCK allows you to use data DATA step code within PROC REPORT to provide customization and style to your reports. We'll see how the Census Bureau produces a stylish demographic profile for customers of its Special Census program using PROC REPORT with the COMPUTE BLOCK. The paper focuses on how to use the COMPUTE BLOCK to create this stylish Special Census profile. The paper shows quick tips and simple code to handle multiple formats within the same column, make the values in the Total rows boldface, trafficlighting, and how to add footnotes to any cell based on the column or row. The Special Census profile report is an Excel table created with ODS tagsets.ExcelXP that is stylish and fashionable, thanks in part to the COMPUTE BLOCK.
Chris Boniface, Census Bureau
With millions of users and peak traffic of thousands of requests a second for complex user-specific data, fantasy football offers many data design challenges. Not only is there a high volume of data transfers, but the data is also dynamic and of diverse types. We need to process data originating on the stadium playing field and user devices and make it available to a variety of different services. The system must be nimble and must produce accurate and timely responses. This talk discusses the strategies employed by and lessons learned from one of the primary architects of the National Football League's fantasy football system. We explore general data design considerations with specific examples of high availability, data integrity, system performance, and some other random buzzwords. We review some of the common pitfalls facing large-scale databases and the systems using them. And we cover some of the tips and best practices to take your data-driven applications from fantasy to reality.
Clint Carpenter, Carpenter Programming
If your organization already deploys one or more software solutions via Amazon Web Services (AWS), you know the value of the public cloud. AWS provides a scalable public cloud with a global footprint, allowing users access to enterprise software solutions anywhere at any time. Although SAS® began long before AWS was even imagined, many loyal organizations driven by SAS are moving their local SAS analytics into the public AWS cloud, alongside other software hosted by AWS. SAS® Solutions OnDemand has assisted organizations in this transition. In this paper, we describe how we extended our enterprise hosting business to AWS. We describe the open source automation framework from which SAS Soultions onDemand built our automation stack, which simplified the process of migrating a SAS implementation. We'll provide the technical details of our automation and network footprint, a discussion of the technologies we chose along the way, and a list of lessons learned.
Ethan Merrill, SAS
Bryan Harkola, SAS
Project management is a hot topic across many industries, and there are multiple commercial software applications for managing projects available. The reality, however, is that the majority of project management software is not applicable for daily usage. SAS® has a solution for this issue that can be used for managing projects graphically in real time. This paper introduces a new paradigm for project management using the SAS® Graph Template Language (GTL). SAS clients, in real time, can use GTL to visualize resource assignments, task plans, delivery tracking, and project status across multiple project levels for more efficient project management.
Zhouming(Victor) Sun, Medimmune
Microsoft Visual Basic Scripting Edition (VBScript) and SAS® software are each powerful tools in their own right. These two technologies can be combined so that SAS code can call a VBScript program or vice versa. This gives a programmer the ability to automate SAS tasks; traverse the file system; send emails programmatically via Microsoft Outlook or SMTP; manipulate Microsoft Word, Microsoft Excel, and Microsoft PowerPoint files; get web data; and more. This paper presents example code to demonstrate each of these capabilities.
Christopher Johnson, BrickStreet Insurance
Considering the fact that SAS® Grid Manager is becoming more and more popular, it is important to fulfill the user's need for a successful migration to a SAS® Grid environment. This paper focuses on key requirements and common issues for new SAS Grid users, especially if they are coming from a traditional environment. This paper describes a few common requirements like the need for a current working directory, the change of file system navigation in SAS® Enterprise Guide® with user-given location, getting job execution summary email, and so on. The GRIDWORK directory has been introduced in SAS Grid Manager, which is a bit different from the traditional SAS WORK location. This paper explains how you can use the GRIDWORK location in a more user-friendly way. Sometimes users experience data set size differences during grid migration. A few important reasons for data set size difference are demonstrated. We also demonstrate how to create new custom scripts as per business needs and how to incorporate them with SAS Grid Manager engine.
Piyush Singh, TATA Consultancy Services Ltd
Tanuj Gupta, TATA Consultancy Services
Prasoon Sangwan, Tata consultancy services limited
Business Intelligence users analyze business data in a variety of ways. Seventy percent of business data contains location information. For in-depth analysis, it is essential to combine location information with mapping. New analytical capabilities are added to SAS® Visual Analytics, leveraging the new partnership with Esri, a leader in location intelligence and mapping. The new capabilities enable users to enhance the analytical insights from SAS Visual Analytics. This paper demonstrates and discusses the new partnership with Esri and the new capabilities added to SAS Visual Analytics.
Murali Nori, SAS
Himesh Patel, SAS
For SAS® Enterprise Guide® users, sometimes macro variables and their values need to be brought over to the local workspace from the server, especially when multiple data sets or outputs need to be written to separate files in a local drive. Manually retyping the macro variables and their values in the local workspace after they have been created on the server workspace would be time-consuming and error-prone, especially when we have quite a number of macro variables and values to bring over. Instead, this task can be achieved in an efficient manner by using dictionary tables and the CALL SYMPUT routine, as illustrated in more detail below. The same approach can also be used to bring macro variables and their values from the local to the server workspace.
Khoi To, Office of Planning and Decision Support, Virginia Commonwealth University
SAS® Visual Analytics Explorer puts the robust power of decision trees at your fingertips, enabling you to visualize and explore how data is structured. Decision trees help analysts better understand discrete relationships within data by visually showing how combinations of variables lead to a target indicator. This paper explores the practical use of decision trees in SAS Visual Analytics Explorer through an example of risk classification in the financial services industry. It explains various parameters and implications, explores ways the decision tree provides value, and provides alternative methods to help you the reality of imperfect data.
Stephen Overton, Zencos Consulting LLC
Ben Murphy, Zencos Consulting LLC
Wouldn't it be fantastic to develop and tune scenarios in SAS® Visual Scenario Designer and then smoothly incorporate them into your SAS® Anti-Money Laundering solution with just a few clicks of your mouse? Well, now there is a way. SAS Visual Scenario Designer is the first data-driven solution for interactive rule and scenario authoring, testing, and validation. It facilitates exploration, visualization, detection, rule writing, auditing, and parameter tuning to reduce false positives; and all of these tasks are performed using point and click. No SAS® coding skills required! Using the approach detailed in this paper, we demonstrate how you can seamlessly port these SAS Visual Scenario Designer scenarios into your SAS Anti-Money Laundering solution. Rewriting the SAS Visual Scenario Designer scenarios in Base SAS® is no longer required! Furthermore, the SAS Visual Scenario Designer scenarios are executed on the lightning-speed SAS® LASR™ Analytic Server, reducing the time of the SAS Anti-Money Laundering scenario nightly batch run. The results of both the traditional SAS Anti-Money Laundering alerts and SAS Visual Scenario Designer alerts are combined and available for display on the SAS® Enterprise Case Management interface. This paper describes the different ways that the data can be explored to detect anomalous patterns and the three mechanisms for translating these patterns into rules. It also documents how to create the scenarios in SAS Visual Scenario Designer; how to test and tune the scenarios and parameters; and how alerts are ported seamlessly into the SAS Anti-Money Laundering alert generation process and the SAS Enterprise Case Management system.
Renee Palmer, SAS
Yue Chai, SAS Institute
You've heard all the talk about SAS® Visual Analytics--but maybe you are still confused about how the product would work in your SAS® environment. Many customers have the same points of confusion about what they need to do with their data, how to get data into the product, how SAS Visual Analytics would benefit them, and even should they be considering Hadoop or the cloud. In this paper, we cover the questions we are asked most often about implementation, administration, and usage of SAS Visual Analytics.
Tricia Aanderud, Zencos Consulting LLC
Ryan Kumpfmiller, Zencos Consulting
Nick Welke, Zencos Consulting
Inspired by Christianna William's paper on transitioning to PROC SQL from the DATA step, this paper aims to help SQL programmers transition to SAS® by using PROC SQL. SAS adapted the Structured Query Language (SQL) by means of PROC SQL back with SAS®6. PROC SQL syntax closely resembles SQL. However, there are some SQL features that are not available in SAS. Throughout this paper, we outline common SQL tasks and how they might differ in PROC SQL. We also introduce useful SAS features that are not available in SQL. Topics covered are appropriate for novice SAS users.
Barbara Ross, NA
Jessica Bennett, Snap Finance
SAS® software provides many DATA step functions that search and extract patterns from a character string, such as SUBSTR, SCAN, INDEX, TRANWRD, etc. Using these functions to perform pattern matching often requires you to use many function calls to match a character position. However, using the Perl regular expression (PRX) functions or routines in the DATA step improves pattern-matching tasks by reducing the number of function calls and making the program easier to maintain. This talk, in addition to discussing the syntax of Perl regular expressions, demonstrates many real-world applications.
Arthur Li, City of Hope
In a data warehousing system, change data capture (CDC) plays an important part not just in making the data warehouse (DWH) aware of the change but also in providing a means of flowing the change to the DWH marts and reporting tables so that we see the current and latest version of the truth. This and slowly changing dimensions (SCD) create a cycle that runs the DWH and provides valuable insights in the history and for the decision-making future. What if the source has no CDC? It would be an ETL nightmare to identify the exact change and report the absolute truth. If these two processes can be combined into a single process where just one single transform does both jobs of identifying the change and applying the change to the DWH, then we can save significant processing times and value resources of the system. Hence, I came up with a hybrid SCD with CDC approach for this. My paper focuses on sources that DO NOT have CDC in their sources and need to perform SCD Type 2 on such records without worrying about data duplications and increased processing times.
Vishant Bhat, University of Newcastle
Tony Blanch, SAS Consultant
With advances in technology, the world of sports is now offering rich data sets that are of interest to statisticians. This talk concerns some research problems in various sports that are based on large data sets. In baseball, PITCHf/x data is used to help quantify the quality of pitches. From this, questions about pitcher evaluation and effectiveness are addressed. In cricket, match commentaries are parsed to yield ball-by-ball data in order to assist in building a match simulator. The simulator can then be used to investigate optimal lineups, player evaluation, and the assessment of fielding.
Our company Veikkaus is a state-owned gambling and lottery company in Finland that has a national legalized monopoly for gambling. All the profit we make goes back to Finnish society (for art, sports, science, and culture), and this is done by our government. In addition to the government's requirements of profit, the state (Finland) also requires us to handle the adverse social aspects of gaming, such as problem gambling. The challenge in our business is to balance between these two factors. For the purposes of problem gambling, we have used SAS® tools to create a responsible gaming tool, called VasA, based on a logistic regression model. The name VasA is derived from the Finnish words for 'Responsible Customership.' The model identifies problem gamblers from our customer database using the data from identified gaming, money transfers, web behavior, and customer data. The variables that were used in the model are based on the theory behind the problem gambling. Our actions for problem gambling include, for example, different CRM and personalization of a customer's website in our web service. There were several companies who provided responsible gambling tools as such for us to buy, but we wanted to create our own for two reasons. Firstly, we wanted it to include our whole customer database, meaning all our customers and not just those customers who wanted to take part in it. These other tools normally include only customers who want to take part. The other reason was that we saved a ridiculous amount of money by doing it by ourselves compared to having to buy one. During this process, SAS played a big role, from gathering the data to the construction of the tool, and from modeling to creating the VasA variables, then on to the database, and finally to the analyses and reporting.
Tero Kallioniemi, Veikkaus
Mobile devices are an integral part of a business professional's life. These mobile devices are getting increasingly powerful in terms of processor speeds and memory capabilities. Business users can benefit from a more analytical visualization of the data along with their business context. The new SAS® Mobile BI contains many enhancements that facilitate the use of SAS® Analytics in the newest version of SAS® Visual Analytics. This paper demonstrates how to use the new analytical visualization that has been added to SAS Mobile BI from SAS Visual Analytics, for a richer and more insightful experience for business professionals on the go.
Murali Nori, SAS
For marketers who are responsible for identifying the best customer to target in a campaign, it is often daunting to determine which media channel, offer, or campaign program is the one the customer is more apt to respond to, and therefore, is more likely to increase revenue. This presentation examines the components of designing campaigns to identify promotable segments of customers and to target the optimal customers using SAS® Marketing Automation integrated with SAS® Marketing Optimization.
Pamela Dixon, SAS
Specifying colors based on group value is a quite popular practice in visualizing data, but it is not so easy to do, especially when there are multiple group values. This paper explores three different methods to dynamically assign colors to plots based on their group values. They are combining EVAL and IFN functions in the plot statements; bringing the DISCRETEATTRMAP block into the plot statements; and using the macro from the SAS® sample 40255.
Amos Shu, MedImmune
Someone has aptly said, Las Vegas looks the way one would imagine heaven must look at night. What if you know the secret to run a plethora of various businesses in the entertainment capital of the world? Nothing better, right? Well, we have what you want, all the necessary ingredients for you to precisely know what business to target in a particular locality of Las Vegas. Yelp, a community portal, wants to help people finding great local businesses. They cover almost everything from dentists and hair stylists through mechanics and restaurants. Yelp's users, Yelpers, write reviews and give ratings for all types of businesses. Yelp then uses this data to make recommendations to the Yelpers about which institutions best fit their individual needs. We have the yelp academic data set comprising 1.6 million reviews and 500K tips by 366K in 61K businesses across several cities. We combine current Yelp data from all the various data sets for Las Vegas to create an interactive map that provides an overview of how a business runs in a locality and how the ratings and reviews tickers a business. We answer the following questions: Where is the most appropriate neighborhood to start a new business (such as cafes, bars, and so on)? Which category of business has the greatest total count of reviews that is the most talked about (trending) business in Las Vegas? How does a business' working hours affect the customer reviews and the corresponding rating of the business? Our findings present research for further understanding of perceptions of various users, while giving reviews and ratings for the growth of a business by encompassing a variety of topics in data mining and data visualization.
Anirban Chakraborty, Oklahoma State University
Sensitive data requires elevated security requirements and the flexibility to apply logic that subsets data based on user privileges. Following the instructions in SAS® Visual Analytics: Administration Guide gives you the ability to apply row-level permission conditions. After you have set the permissions, you have to prove through audits who has access and row-level security. This paper provides you with the ability to easily apply, validate, report, and audit all tables that have row-level permissions, along with the groups, users, and conditions that will be applied. Take the hours of maintenance and lack of visibility out of row-level secure data and build confidence in the data and analytics that are provided to the enterprise.
Brandon Kirk, SAS
For SAS® users, PROC TABULATE and PROC REPORT (and its compute blocks) are probably among the most common procedures for calculating and displaying data. It is, however, pretty difficult to calculate and display changes from one column to another using data from other rows with just these two procedures. Compute blocks in PROC REPORT can calculate additional columns, but it would be challenging to pick up values from other rows as inputs. This presentation shows how PROC TABULATE can work with the lag(n) function to calculate rates of change from one period of time to another. This offers the flexibility of feeding into calculations the data retrieved from other rows of the report. PROC REPORT is then used to produce the desired output. The same approach can also be used in a variety of scenarios to produce customized reports.
Khoi To, Office of Planning and Decision Support, Virginia Commonwealth University
For many organizations, the answer to whether to manage their data and analytics in a public or private cloud is going to be both. Both can be the answer for many different reasons: common sense logic not to replace a system that already works just to incorporate something new; legal or corporate regulations that require some data, but not all data, to remain in place; and even a desire to provide local employees with a traditional data center experience while providing remote or international employees with cloud-based analytics easily managed through software deployed via Amazon Web Services (AWS). In this paper, we discuss some of the unique technical challenges of managing a hybrid environment, including how to monitor system performance simultaneously for two different systems that might not share the same infrastructure or even provide comparable system monitoring tools; how to manage authorization when access and permissions might be driven by two different security technologies that make implementation of a singular protocol problematic; and how to ensure overall automation of two platforms that might be independently automated, but not originally designed to work together. In this paper, we share lessons learned from a decade of experience implementing hybrid cloud environments.
Ethan Merrill, SAS
Bryan Harkola, SAS
Do you write reports that sometimes have missing categories across all class variables? Some programmers write all sorts of additional DATA step code in order to show the zeros for the missing rows or columns. Did you ever wonder whether there is an easier way to accomplish this? PROC MEANS and PROC TABULATE, in conjunction with PROC FORMAT, can handle this situation with a couple of powerful options. With PROC TABULATE, we can use the PRELOADFMT and PRINTMISS options in conjunction with a user-defined format in PROC FORMAT to accomplish this task. With PROC SUMMARY, we can use the COMPLETETYPES option to get all the rows with zeros. This paper uses examples from Census Bureau tabulations to illustrate the use of these procedures and options to preserve missing rows or columns.
Chris Boniface, Census Bureau
Janet Wysocki, U.S. Census Bureau