This paper introduces a macro that can generate Keyhole Markup Language (KML) files for U.S. states and counties. The generated KML files can be used directly by Google Maps to add customized state and county layers with user-defined colors and transparencies. When someone clicks on a state or county layer in Google Maps, customized information is shown. To use the macro, the user needs to prepare only a simple SAS® input data set. The paper includes all the SAS code for the macro and provides examples that show you how to use the macro and how to display the KML files in Google Maps.
Ting Sa, Cincinnati Children's Hospital Medical Center
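A KML file is plain XML, so a SAS data step can emit one directly with PUT statements. The following is a minimal sketch of that idea, not the paper's macro; the file name, placemark name, coordinates, and color value are illustrative assumptions (KML colors are hexadecimal aabbggrr, so 7dff0000 is roughly 50% transparent blue).

```sas
/* A minimal sketch (not the paper's macro): write a bare-bones KML file
   with a data step. Names, coordinates, and colors are illustrative. */
data _null_;
   file 'counties.kml';
   put '<?xml version="1.0" encoding="UTF-8"?>';
   put '<kml xmlns="http://www.opengis.net/kml/2.2">';
   put '<Document>';
   put '  <Placemark>';
   put '    <name>Example County</name>';
   /* aabbggrr hex: alpha 7d (~50% transparent), blue ff */
   put '    <Style><PolyStyle><color>7dff0000</color></PolyStyle></Style>';
   put '    <Polygon><outerBoundaryIs><LinearRing>';
   put '      <coordinates>-84.5,39.1 -84.4,39.1 -84.4,39.2 -84.5,39.1</coordinates>';
   put '    </LinearRing></outerBoundaryIs></Polygon>';
   put '  </Placemark>';
   put '</Document>';
   put '</kml>';
run;
```

The paper's macro builds the placemark entries from an input data set of state and county boundaries rather than hard-coding them as here.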
We have many occasions to use time-to-event (survival) analysis, especially in the biomedical and pharmaceutical fields. SAS® provides the LIFETEST procedure to calculate Kaplan-Meier estimates of the survival function and to draw a survival plot, and the PHREG procedure to fit Cox regression models that estimate the effects of predictors on hazard rates. Capturing the ODS tables defined by PROC LIFETEST and PROC PHREG makes additional statistics available in output data sets. This paper provides a macro that uses PROC LIFETEST and PROC PHREG with ODS. It produces a survival plot annotated with estimates that include the subjects at risk; the numbers of events and total subjects; the survival rate with its median and 95% confidence interval; and hazard ratio estimates with 95% confidence intervals. Some of these estimates are optional in the macro, so users can select what to display in the output. (The subjects at risk and the event and subject counts are always shown.) Users can also specify the tick marks on the X axis and in the subjects-at-risk table, for example, every 10 or 20 units. The macro dynamically calculates the maximum of the X axis and applies the interval that the user specified. Finally, because the macro uses ODS, its output can be written to many file formats, including JPG, PDF, and RTF.
Chia-Ling Wu, University of Southern California
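The macro wraps standard building blocks that can be sketched directly. The following is a minimal illustration under assumed data set and variable names (adtte, aval, cnsr, trt), not the macro itself: a Kaplan-Meier plot with an at-risk table at user-chosen tick marks, and ODS OUTPUT statements that capture the estimate tables as data sets.

```sas
/* A minimal sketch of the pieces the macro combines; data set and
   variable names are assumptions. */
ods output Quartiles=km_quartiles CensoredSummary=km_events;
proc lifetest data=adtte plots=survival(atrisk=0 to 60 by 10);
   time aval*cnsr(1);     /* cnsr=1 flags censored observations */
   strata trt;
run;

/* Hazard ratio estimates with 95% confidence limits from a Cox model */
ods output ParameterEstimates=cox_est;
proc phreg data=adtte;
   class trt;
   model aval*cnsr(1) = trt / risklimits;
run;
```

The captured data sets (km_quartiles, km_events, cox_est) are what a macro like this one would merge into the annotations on the final plot.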
Duplicates in a clinical trial or survey database can jeopardize data quality and integrity and induce biased analysis results. These complications often arise in clinical trials, meta-analyses, and registry and observational studies. Common practice for identifying possible duplicates involves sensitive personal information, such as name, Social Security number (SSN), date of birth, address, and telephone number. However, access to this sensitive information is limited and sometimes entirely restricted. As a measure of data quality control, a SAS® program was developed to identify duplicated individuals using non-sensitive information, such as age, gender, race, medical history, vital signs, and laboratory measurements. A probabilistic approach was used, calculating weights for the data elements used to identify duplicates based on two probabilities: the probability of agreement for an element among matched pairs and the probability of agreement purely by chance among non-matched pairs. For elements with categorical values, agreement was defined as matching pairs sharing the same value. For elements with interval values, agreement was defined as values matching within 1% of the measurement precision range. The probabilities used to compute the matching-element weights were estimated using an expectation-maximization (EM) algorithm. The method was then tested on survey and clinical trial data from hypertension studies.
Xiaoli Lu, VA CSPCC
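The two probabilities described above correspond to the m and u probabilities of the standard Fellegi-Sunter formulation of probabilistic record linkage; the paper's exact weights may differ, but in that standard form, with m_i the probability of agreement on element i among matched pairs and u_i the probability of agreement by chance among non-matched pairs, the agreement and disagreement weights are:

```latex
w_i^{\mathrm{agree}} = \log_2 \frac{m_i}{u_i},
\qquad
w_i^{\mathrm{disagree}} = \log_2 \frac{1 - m_i}{1 - u_i}
```

Summing these weights over all compared elements gives a composite score for each candidate pair, with high totals flagging likely duplicates.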
SAS® is perfect for building enterprise apps. Think about it: SAS speaks to almost any database you can think of and is probably already hooked into most of the data sources in your organization. A full-fledged metadata security layer is already integrated with your single sign-on authentication provider, and every time a user interacts with the system, their permissions are checked and the data their app asks for is automatically encrypted. SAS ticks all the boxes required by IT, and the skills required to start developing apps already sit within your department. Your team most likely already knows what your app needs to do, so instead of writing lists of requirements, give them an HTML5 resource, and together they can write and deploy the back-end code themselves. The apps run in the browser, the server-side logic is deployed using SAS .spk packages, and permissions are managed via SAS® Management Console. Best of all, the infrastructure that would normally take months to integrate is already there, eliminating barriers to entry and letting you demonstrate the value of your solution to internal customers with zero up-front investment. This paper shows how SAS integrates with open-source tools like H54S, AngularJS, and PostGIS, together with next-generation developer-centric analytical platforms like SAS® Viya, to build secure, enterprise-class apps that can support thousands of users. The presentation includes many app demos. This presentation was included at SAS® Forum UK 2016.
Nik Markovic, Boemska
Cascading Style Sheets (CSS) frameworks like Bootstrap, and JavaScript libraries such as jQuery and h54s, have made it faster than ever before to develop enterprise-grade web apps on top of the SAS® platform. Hailing the benefits of using SAS as a back end (authentication, security, ease of data access), this paper navigates the configuration issues to consider for maximum responsiveness to client web requests (pooled sessions, load balancing, multibridge connections). Cherry-picking from the whirlwind of front-end technologies and approaches, the author presents a framework that enables the novice programmer to build a simple web app in minutes. The exact steps necessary to achieve this are described, alongside a hurricane of practical tips, such as dealing with CORS, logging in SAS, debugging AJAX calls, and SAS HTTP responses. Beware: this approach is likely to cause a storm of demand in your area! Server requirements: SAS® Business Intelligence Platform (SAS® 9.2 or later); SAS® Stored Process Web Application (SAS® Integration Technologies). Client requirements: HTML5 browser (Microsoft Internet Explorer 8 or later); access to open-source libraries (which can be hosted on-premises if Internet access is an issue).
Allan Bowe, BOWE IO
Standard SAS® Studio tasks already include many advanced analytic procedures for data mining and other high-performance models, enabling point-and-click generation and execution of SAS® code. However, you can extend the power of tasks by creating tasks of your own to enable point-and-click access to the latest SAS statistical procedures, to your own default model definitions, or to your previously developed SAS/STAT® or SAS macro code. Best of all, these point-and-click tasks can be developed directly in SAS Studio without the need to compile binaries or build DLL files using third-party software. In this paper, we demonstrate three approaches to developing custom tasks. First, we build a custom task to provide point-and-click access to PROC IRT, including recently added functionality to PROC IRT used to analyze educational test and opinion survey data. Second, we build a custom task that calls a macro for previously developed SAS code, and we show how point-and-click options can be set up to allow users to guide the execution of complex macro code. Third, we demonstrate just enough of the underlying Apache Velocity Template Language code to enable developers to take advantage of the benefits of that language to support their SAS process. Finally, we show how these tasks can easily be shared with a user community, increasing the efficiency of analytic modeling across the organization.
Elliot Inman, SAS
Olivia Wright, SAS
The DOCUMENT procedure is a little-known procedure that can save you vast amounts of time and effort when managing the output of your SAS® programming efforts. This procedure is deeply tied to the mechanism by which SAS controls output in the Output Delivery System (ODS). Have you ever wished you didn't have to modify and rerun the report-generating program every time there was some tweak in the desired report? PROC DOCUMENT enables you to store one version of the report as an ODS document object and then render it in many different output forms, such as PDF, HTML, listing, RTF, and so on, without rerunning the code. Have you ever wished you could extract just the pages of the output that apply to certain BY variables, such as State, StudentName, or CarModel? PROC DOCUMENT gives you WHERE capabilities to extract them. Do you want to customize the table of contents that assorted SAS procedures produce, whether you make frames for the table of contents with HTML or use the facilities available for PDF? PROC DOCUMENT enables you to get at the inner workings of ODS and manipulate them. This paper addresses PROC DOCUMENT from the viewpoint of end results rather than providing a complete technical review of how to accomplish each task. The emphasis is on the benefits of using the procedure, not on detailed mechanics.
Roger Muller, Data-to-Events
Finding the daylight saving time (DST) change date is a common task in manipulating time series data. The dates on which daylight saving time begins and ends change every year, so if SAS® programmers depend on manually entering those dates in their programs, maintaining the programs becomes tedious. Using a SAS function makes finding the dates easy. This paper discusses several ways to capture and use daylight saving time.
Chao-Ying Hsieh, Southern Company Services, Inc.
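One such function-based approach (a sketch, not necessarily the paper's method, and assuming the SAS 9.4 time zone functions are available) is to use TZONEOFF, which returns the UTC offset of a time zone at a given datetime: the spring DST transition is the first day on which that offset changes. The zone name and year below are illustrative.

```sas
/* A minimal sketch: find the spring DST transition date by scanning for
   the first day on which the UTC offset of America/New_York changes.
   Requires SAS 9.4 time zone support. */
data dst_start;
   prev = tzoneoff('America/New_York', dhms('01JAN2017'd, 12, 0, 0));
   do d = '02JAN2017'd to '30JUN2017'd;
      off = tzoneoff('America/New_York', dhms(d, 12, 0, 0));
      if off ne prev then do;
         dst_date = d;          /* first day on the new offset */
         format dst_date date9.;
         output;
         leave;
      end;
      prev = off;
   end;
   keep dst_date;
run;
```

Scanning at noon avoids ambiguity around the 2 a.m. changeover itself; the fall transition can be found the same way over the second half of the year.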
Session 1281-2017: Finding National Best Bid and Best Offer: Quote by Quote
U.S. stock exchanges (currently there are 12) are tracked in real time via the Consolidated Trade System (CTS) and the Consolidated Quote System (CQS). CQS contains every updated quote from each of these exchanges, covering some 8,500 stock tickers. It provides the basis by which brokers can honor their fiduciary obligation to investors to execute transactions at the best price, that is, at the National Best Bid or Best Offer (NBBO). With the advent of electronic exchanges and high-frequency trading (timestamps are published to the nanosecond), data set size (approaching 1 billion quotes requiring 80 gigabytes of storage for a normal trading day) has become a major operational consideration for market behavior researchers re-creating NBBO values from quotes. This presentation demonstrates a straightforward use of hash tables for tracking constantly changing quotes for each ticker/exchange combination to provide the NBBO for each ticker at each time point in the trading day.
Mark Keintz, Wharton Research Data Services
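The hash-table technique can be sketched as follows; this is an illustrative outline, not the presentation's code, and the input variable names (ticker, exchange, bid, ask, timestamp) are assumptions. The latest quote for each ticker/exchange pair is held in a hash object, and after each update the entries for that ticker are scanned to recompute the NBBO.

```sas
/* A minimal sketch of the technique; variable names are assumptions. */
data nbbo;
   if _n_ = 1 then do;
      declare hash q(ordered:'a');           /* latest quote per venue */
      q.defineKey('ticker', 'exchange');
      q.defineData('ticker', 'exchange', 'bid', 'ask');
      q.defineDone();
      declare hiter iq('q');
   end;
   set quotes;            /* ticker, exchange, bid, ask, timestamp */
   rc = q.replace();      /* overwrite the stale quote for this venue */

   /* Recompute the NBBO for the ticker just updated. SAS MIN/MAX
      ignore missing values, so no special-casing is needed. */
   tick = ticker;
   best_bid = .;  best_ask = .;
   rc = iq.first();
   do while (rc = 0);
      if ticker = tick then do;
         best_bid = max(best_bid, bid);
         best_ask = min(best_ask, ask);
      end;
      rc = iq.next();
   end;
   ticker = tick;         /* iteration overwrote the PDV ticker */
   keep ticker timestamp best_bid best_ask;
run;
```

A production version at billion-quote scale would track each ticker's best bid and offer incrementally rather than rescanning on every quote, which is part of what makes the hash design interesting.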
In the past 10 years, SAS® Enterprise Guide® has developed into the go-to application for accessing the power of SAS®. With each new release, SAS continues to add functionality that makes the SAS user's life easier. We take a closer look at some of the built-in features of SAS Enterprise Guide and how they can simplify your work. One of the most exciting and powerful features we explore is parallel execution on the same server, which gives you the ability to run multiple SAS processes at the same time regardless of whether you have a SAS® Grid Computing environment. Other topics we cover include conditional processing within SAS Enterprise Guide, how to securely store database login and password information, setting up autoexec files in SAS Enterprise Guide, exploiting process flows, and much more.
Steve First, Systems Seminar Consultants
Benjamin First, US Bank Corp
As technology expands, we need to create programs that can be handed off to clients, regulatory agencies, parent companies, or other projects with little or no modification by the recipient. Minimizing modification by the recipient often requires the program itself to self-modify: to some extent, the program must be aware of its own operating environment and what it needs to do to adapt to it. A great many tools are available to the SAS® programmer that allow the program to adjust to its own surroundings. These include location-detection routines, batch files based on folder contents, the ability to detect the version and location of SAS, programs that discern and adjust to the current operating system and the corresponding folder structure, the use of automatic and user-defined environment variables, and macro functions that use and modify system information. Need to create a portable program? We can hand you the tools.
Art Carpenter, California Occidental Consultants
Mary Rosenbloom, Alcon, a Novartis Division
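As a taste of the self-adjusting techniques described above, the automatic macro variables SYSSCP, SYSSCPL, and SYSVLONG report the host operating system and SAS version at run time. The sketch below is illustrative only; the paths and macro names are assumptions, not the authors' code.

```sas
/* A minimal sketch (paths and names are illustrative): branch on the
   operating system reported by automatic macro variables and build
   platform-appropriate locations. */
%macro set_env;
   %put NOTE: Running SAS &sysvlong on &sysscp (&sysscpl);
   %if &sysscp = WIN %then %do;
      %let projroot = C:\projects\study1;
      %let slash = \;
   %end;
   %else %do;
      %let projroot = /projects/study1;
      %let slash = /;
   %end;
   libname proj "&projroot&slash.data";
%mend set_env;
%set_env
```

The same pattern extends to choosing batch commands, autoexec contents, or folder conventions per platform.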
From stock price histories to hospital stay records, analysis of time series data often requires the use of lagged (and occasionally lead) values of one or more analysis variables. For the SAS® user, the central operational task is typically getting lagged (lead) values for each time point in the data set. Although SAS has long provided a LAG function, it has no analogous lead function, an especially significant gap in the case of large data series. This paper 1) reviews the LAG function, in particular the powerful but non-intuitive implications of its queue-oriented basis; 2) demonstrates efficient ways to generate leads with the same flexibility as the LAG function, but without the common and expensive recourse to re-sorting the data; and 3) shows how to dynamically generate leads and lags through use of the hash object.
Mark Keintz, Wharton Research Data Services
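One well-known sort-free lead technique, of the kind the paper covers, is to merge the series with itself offset by one observation using the FIRSTOBS= option. This sketch assumes hypothetical data set and variable names (series, price):

```sas
/* A minimal sketch: a one-period lead without re-sorting, by merging
   the data set with itself offset by one observation. Names are
   assumptions. */
data with_lead;
   merge series
         series(firstobs=2 keep=price rename=(price=lead_price));
   /* the last observation gets a missing lead_price */
run;
```

Larger offsets follow the same pattern with FIRSTOBS=n+1, and a BY statement is needed to keep leads from crossing group boundaries.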
String externalization is the key to making your SAS® applications speak multiple languages, even if you can't. Using the new internationalization features in SAS® 9.3, your SAS applications can be written to adapt to whatever environment they find themselves in. String externalization is the process of identifying and separating translatable strings from your SAS program. This paper outlines the four steps of string externalization: create a Microsoft Excel spreadsheet for messages (optional), create SMD files, convert the SMD files, and create the final SAS data set. It also briefly describes a real-world project that applies the concept. With the Excel spreadsheet approach to message text, professional translators can work more efficiently, translating text in a friendlier and more comfortable environment. In turn, a programmer can concentrate fully on developing and maintaining SAS code when your application travels to a new country.
Lihsin Hwang, Statistics Canada
In order to display data visually, our audience preferred charts and graphs generated by Microsoft Excel over those generated by SAS®. However, making the necessary 30 graphs in Excel took 2 to 3 hours of manual work, even though the chart templates had already been created, and led to mistakes due to human error. SAS graphs took much less time to create but lacked key functionality that the audience preferred and that was available in Excel graphs. Thanks to SAS, the answer came in Excel 4 Macro Language (X4ML) programming. SAS can actually submit commands to Excel to create customized data reporting, create graphs or update templates' data series, and even populate Microsoft Word documents for finalized reports. This paper explores how SAS can be used to create presentation-ready graphs in a proven process that takes less than one minute, compared to the earlier process that took hours. The following elements are used and discussed: macros and macro variables, FILENAME statements, RC (return code) commands, the Output Delivery System (ODS), X4ML, and Microsoft Visual Basic for Applications (VBA).
William Zupko II, U.S. Department of Homeland Security/FLETC
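The SAS-to-Excel channel behind this approach is Dynamic Data Exchange (DDE): a FILENAME statement addresses either a worksheet range or Excel's "system" channel, which accepts X4ML commands. The sketch below is illustrative, not the paper's code; it assumes Excel is already running on the same Windows machine with a workbook open, and the sheet, range, and variable names are made up.

```sas
/* A minimal sketch (assumes Excel is running locally with Sheet1 open;
   range and variable names are illustrative). */

/* 1) Push two columns of data into a worksheet range */
filename xlout dde 'excel|Sheet1!r2c1:r31c2';
data _null_;
   set monthly_sales;     /* hypothetical data set: month, sales */
   file xlout;
   put month sales;
run;
filename xlout clear;

/* 2) Send an X4ML command through Excel's "system" channel */
filename cmds dde 'excel|system';
data _null_;
   file cmds;
   put '[SELECT("R1C1")]';   /* X4ML commands are bracketed like this */
run;
filename cmds clear;
```

Chaining such commands is how a SAS program can open template workbooks, refresh chart data series, and save finished files without manual clicks.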
Making optimal use of SAS® Grid Computing relies on the ability to spread the workload effectively across all of the available nodes. With SAS® Scalable Performance Data Server (SPD Server), it is possible to partition your data and spread the processing across the SAS Grid Computing environment. In an ideal world it would be possible to adjust the size and number of partitions according to the data volumes being processed on any given day. This paper discusses a technique that enables the processing performed in the SAS Grid Computing environment to be dynamically reconfigured, automatically at run time, to optimize the use of SAS Grid Computing, and to provide significant performance benefits.
Andy Knight, SAS
The DATASETS procedure provides the most diverse selection of capabilities and features of any SAS® procedure. It is the prime tool that programmers can use to manage SAS data sets, indexes, catalogs, and so on. Many SAS programmers are familiar with only a few of the many capabilities of PROC DATASETS; most often, they use only its data set updating, deleting, and renaming capabilities. However, there are many more features and uses that belong in a SAS programmer's toolkit. This paper highlights many of the major capabilities of PROC DATASETS. It discusses how the procedure can be used to update variable information in a SAS data set; provide information about data set and catalog contents; delete data sets, catalogs, and indexes; repair damaged SAS data sets; rename files; create and manage audit trails; add, delete, and modify passwords; add and delete integrity constraints; and more. The paper contains examples of the various uses of PROC DATASETS that programmers can cut and paste into their own programs as a starting point. After reading this paper, a SAS programmer will have practical knowledge of the many different facets of this important SAS procedure.
Michael Raithel, Westat
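A few of the capabilities listed above can be sketched in one step; the library and member names below are assumptions, not the paper's examples.

```sas
/* A minimal sketch of common PROC DATASETS tasks; library and member
   names are illustrative. */
proc datasets library=mylib nolist;
   modify sales;
      rename amt = amount;                 /* update variable information */
      label amount = 'Sale amount (USD)';
   run;
   change sales_tmp = sales_2017;          /* rename a data set */
   delete old_sales;                       /* delete a data set */
quit;
```

Because PROC DATASETS operates on metadata rather than reading every observation, these operations are far cheaper than the equivalent data steps.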
SAS® offers a generation data set structure as a language feature that many users are familiar with; they use it in their organizations and manage it with options such as GENMAX and GENNUM. When SAS operates in a mainframe environment, users can also tap into the GDG (generation data group) feature available on z/OS, OS/390, OS/370, IBM 3070, or IBM 3090 machines. With cost-saving initiatives across businesses, and because of some scaling factors, many organizations are migrating to cheaper mid-tier platforms such as UNIX and AIX. Because Linux is open source and a cheaper alternative, several organizations have opted for the UNIX distribution of SAS, which works in UNIX and AIX environments. While this might be a viable alternative, the migration effort brings certain nuances to the technical conversion teams. On UNIX, the concept of GDGs does not exist. While SAS offers generation data sets, they work only for SAS data sets; if a business organization needs to house and operate a GDG-like structure for text files, none is available. When my organization undertook a similar initiative to migrate the programs used to run subprime mortgage analytic, incentive, and regulatory reporting, we identified a paucity of literature and research on this topic. Hence, I developed a utility that addresses this need: a simple macro that closely simulates a GDG/GDS.
Dr. Kannan Deivasigamani, HSBC
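For context, the native SAS generation data sets that the abstract contrasts with z/OS GDGs look like this; the library and member names are illustrative assumptions.

```sas
/* A minimal sketch of native SAS generation data sets (SAS data sets
   only, which is the gap the author's macro fills for text files). */
data mylib.daily (genmax=3);   /* keep up to 3 generations in total */
   set work.todays_extract;
run;

/* Reference a historical generation by relative number */
proc print data=mylib.daily(gennum=-1);   /* the previous generation */
run;
```

GENMAX= caps how many versions the library retains, and GENNUM= addresses a specific version, either absolutely or relative to the current one; the author's macro simulates this numbering scheme for external text files.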
JSON is quickly becoming the industry standard for data interchanges, especially in supporting REST APIs. But until now, importing JSON content into SAS® software and leveraging it in SAS has required significant custom code. Developing that code can be laborious, requiring transcoding, manual text parsing, and creating handlers for unexpected structure changes. Fortunately, the new JSON LIBNAME engine (in the fourth maintenance release for SAS® 9.4 and later) delivers a robust, efficient method for importing JSON content into SAS data structures. This paper demonstrates several real-world examples of the JSON LIBNAME using open data APIs. The first example contrasts the traditional custom code and JSON LIBNAME approach using big data from the United Nations Comtrade Database. The two approaches are compared in terms of complexity of code, time to execute, and the resulting data structures. The same method is applied to data from Google and the US Census Bureau's APIs. Finally, to demonstrate the ability of the JSON LIBNAME to handle unexpected changes to a JSON data structure, we use the SAS JSON procedure to write a JSON file and then simulate changes to that structure to show how one JSON LIBNAME process can easily adjust the import to handle those changes.
Michael Drutar, SAS
Eric Thies, SAS
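The JSON LIBNAME pattern the paper demonstrates can be sketched in a few lines; the URL below is an illustrative placeholder, not one of the paper's APIs.

```sas
/* A minimal sketch of the JSON LIBNAME engine (fourth maintenance
   release for SAS 9.4 or later); the URL is a placeholder. */
filename resp temp;

proc http
   url="https://api.example.com/data.json"
   method="GET"
   out=resp;
run;

libname js json fileref=resp;

proc datasets library=js; run; quit;   /* list data sets the engine built */
proc print data=js.root; run;          /* top-level JSON members */
```

The engine flattens nested JSON into related data sets (including an ALLDATA member) automatically, which is what replaces the manual parsing code the paper contrasts it with.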
This presentation explores the steps taken by a large public research institution to develop a five-year enrollment forecasting model to support the critical enrollment management process at the institution. A key component of the process is providing university stakeholders with a self-service, secure, and flexible tool that enables them to quickly generate different enrollment projections in Microsoft Excel using the most up-to-date information possible. The presentation shows how we integrated SAS® Enterprise Guide® and the SAS® Add-In for Microsoft Office to support this critical process, which had very specific stakeholder requirements and expectations.
Andre Watts, University of Central Florida
Lisa Sklar, University of Central Florida
JMP® integrates very nicely with SAS® software, so you can do some pretty amazing things by combining the power of JMP and SAS. You can submit code to run something on a SAS server and bring the results back as a JMP table, and then use JMP in many ways to analyze the data returned. This workshop shows you how to access data via SAS servers, run SAS code and bring data back to JMP, and use JMP to do many things quickly and easily. Explore the synergies between these tools; having both is far more powerful than having just one, or not using them together.
Philip Mason, Wood Street Consultants Ltd.