SAS Global Forum 2015 Proceedings

The Centers for Medicare & Medicaid Services (CMS) uses the Proportion of Days Covered (PDC) to measure medication adherence, and some PDC-related research is based on Medicare Part D Prescription Drug Event (PDE) data. However, under Medicare rules, beneficiaries who receive care at an inpatient (IP) facility may receive Medicare-covered medications directly from the facility rather than by filling prescriptions through their Part D contracts; thus, their medication fills during an IP stay are not included in the PDE claims used to calculate the Patient Safety adherence measures (Medicare 2014 Part C & D Star Rating Technical Notes). Therefore, the previous PDC calculation method underestimated the true PDC value. Starting with the 2013 Star Ratings, the PDC calculation was adjusted for IP stays. That is, when a patient has an inpatient admission during the measurement period, the days of the inpatient stay are censored from the PDC calculation. If the patient also has coverage of the measured drug during the inpatient stay, the supply that overlaps the stay is shifted to after the stay. This shift can in turn push later fills forward, causing a chain of shifts. This paper presents a SAS® macro that uses the SAS hash object to match inpatient stays, censor them, shift the drug start and end dates, and calculate the adjusted PDC.
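
The censoring and shifting logic described above can be sketched at the day level. The following Python sketch is not the paper's SAS macro: the function name `adjusted_pdc`, the input layout, and the treatment of both admission and discharge dates as inpatient days are assumptions made for illustration. It also assumes that all fills are for the same drug, so a fill that overlaps an earlier fill is pushed forward, which is what produces the chain of shifts. A Python set stands in for the hash-object lookup of inpatient days.

```python
from datetime import date, timedelta


def adjusted_pdc(fills, stays, period_start, period_end):
    """Return an inpatient-adjusted proportion of days covered.

    fills: list of (fill_date, days_supply) for one drug (assumed layout).
    stays: list of (admit_date, discharge_date); both ends are treated
           as inpatient days (an assumption of this sketch).
    """
    one_day = timedelta(days=1)

    # Collect every inpatient day so membership tests are fast.
    inpatient = set()
    for admit, discharge in stays:
        day = admit
        while day <= discharge:
            inpatient.add(day)
            day += one_day

    # Lay each fill down one day at a time. A day that is inpatient or
    # already covered by an earlier fill is skipped, so the remaining
    # supply shifts forward (and may push later fills forward too).
    covered = set()
    for fill_date, days_supply in sorted(fills):
        day = fill_date
        remaining = days_supply
        while remaining > 0 and day <= period_end:
            if day not in inpatient and day not in covered:
                covered.add(day)
                remaining -= 1
            day += one_day

    # Numerator: covered days inside the measurement period.
    numerator = sum(1 for day in covered if day >= period_start)

    # Denominator: days in the period that are not censored inpatient days.
    total_days = (period_end - period_start).days + 1
    eligible = sum(
        1
        for offset in range(total_days)
        if period_start + timedelta(days=offset) not in inpatient
    )
    return numerator / eligible if eligible else 0.0


pdc = adjusted_pdc(
    fills=[(date(2014, 1, 1), 10), (date(2014, 1, 11), 10)],
    stays=[(date(2014, 1, 6), date(2014, 1, 10))],
    period_start=date(2014, 1, 1),
    period_end=date(2014, 1, 30),
)
print(pdc)  # 0.8
```

In this example the 5-day stay is removed from the 30-day period, leaving 25 eligible days. The first fill covers January 1 to 5 and then January 11 to 15; the second fill is pushed forward to January 16 to 25, for 20 covered days in total.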

Read the paper (PDF). | Download the data file (ZIP).

Managing the large-scale displacement of people and communities caused by a natural disaster has historically been reactive rather than proactive. Following a disaster, data is collected to inform and prompt operational responses. In many countries prone to frequent natural disasters, such as the Philippines, large amounts of longitudinal data are collected and available to apply to new disaster scenarios. However, because of the nature of natural disasters, it is difficult to analyze all of the data until long after the emergency has passed. For this reason, little research and analysis have been conducted to derive deeper analytical insight for proactive responses. This paper demonstrates the application of SAS® analytics to this data and establishes predictive alternatives that can improve conventional storm responses. Humanitarian organizations can use this data to understand displacement patterns and trends and to optimize evacuation routing and planning. Identifying the main contributing factors and leading indicators for the displacement of communities in a timely and efficient manner can help prevent detrimental incidents at disaster evacuation sites. Using quantitative and qualitative methods, responding organizations can make data-driven decisions that innovate and improve approaches to managing disaster response on a global basis. A data-driven analytical model can help reduce response time, improve the health and safety of displaced individuals, and use scarce resources more effectively. The International Organization for Migration (IOM), an intergovernmental organization, is one of the first-response organizations on the ground that responds to most emergencies. IOM is the global co-lead for the Camp Coordination and Camp Management (CCCM) cluster in natural disasters.
This paper shows how to use SAS® Visual Analytics and SAS® Visual Statistics for the Philippines in response to Super Typhoon Haiyan in November 2013 to develop increasingly accurate models for better emergency preparedness. Using data collected from IOM's Displacement Tracking Matrix (DTM), the final analysis shows how to better coordinate service delivery to evacuation centers sheltering large numbers of displaced individuals, applying accurate hindsight to develop foresight on how to better respond to emergencies and disasters. Predictive models build on patterns found in historical and transactional data to identify risks and opportunities. The capacity to predict trends and behavior patterns related to displacement and mobility has the potential to enable the IOM to respond in a more timely and targeted manner. By predicting the locations of displacement, numbers of persons displaced, number of vulnerable groups, and sites most at risk of security incidents, humanitarians can respond quickly and more effectively with the appropriate resources (material and human) from the outset. The end analysis uses the SAS® Storm Optimization model combined with human mobility algorithms to predict population movement.

In observational data analyses, it is often helpful to use patients as their own controls by comparing their outcomes before and after some signal event, such as the initiation of a new therapy. It might be useful to have a control group that does not have the event but that is instead evaluated before and after some arbitrary point in time, such as their birthday. In this context, the change over time is a continuous outcome that can be modeled as a (possibly discontinuous) line, with the same or different slope before and after the event. Mixed models can be used to estimate random slopes and intercepts and compare patients between groups. A specific example published in a peer-reviewed journal is presented.
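
One way to write the model described above is shown here as a sketch under assumed notation; it is not the specification used in the published example. Let $y_{ij}$ be the outcome for patient $i$ at observation $j$, $t_{ij}$ the time relative to the signal event (or to the arbitrary reference point for controls), $P_{ij} = 1$ when $t_{ij} \ge 0$ and $0$ otherwise, and $G_i = 1$ for patients who have the event and $0$ for controls.

```latex
\begin{aligned}
y_{ij} ={}& \beta_0 + \beta_1 t_{ij} + \beta_2 P_{ij} + \beta_3 t_{ij} P_{ij} \\
          & + G_i \left( \gamma_0 + \gamma_1 t_{ij} + \gamma_2 P_{ij} + \gamma_3 t_{ij} P_{ij} \right) \\
          & + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij},
\qquad
(b_{0i}, b_{1i})^{\top} \sim N(\mathbf{0}, \boldsymbol{\Sigma}),
\quad
\varepsilon_{ij} \sim N(0, \sigma^2)
\end{aligned}
```

In this parameterization, $\beta_2$ is the jump at the reference point (the possible discontinuity) and $\beta_3$ is the change in slope afterward for controls, while $\gamma_2$ and $\gamma_3$ are the additional jump and slope change in the event group. The random intercept $b_{0i}$ and random slope $b_{1i}$ let each patient have an individual line.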

Read the paper (PDF).

Kaiser Permanente Northwest is contractually obligated for regulatory submissions to Oregon Health Authority, Health Share of Oregon, and Molina Healthcare in Washington. The submissions consist of Medicaid Encounter data for medical and pharmacy claims. SAS® programs are used to extract claims data from Kaiser's claims data warehouse, process the data, and produce output files in HIPAA ASC X12 and NCPDP format. Prior to April 2014, programs were written in SAS® 8.2 running on a VAX server. Several key drivers resulted in the conversion of the existing system to SAS® Enterprise Guide® 5.1 running on UNIX. These drivers were: the need to have a scalable system in preparation for the Affordable Care Act (ACA); performance issues with the existing system; incomplete process reporting and notification to business owners; and a highly manual, labor-intensive process of running individual programs. The upgraded system addressed these drivers. The estimated cost reduction was from $1.30 per reported encounter to $0.13 per encounter. The converted system provides for better preparedness for the ACA. One expected result of ACA is significant Medicaid membership growth. The program has already increased in size by 50% in the preceding 12 months. The updated system allows for the expected growth in membership.

Read the paper (PDF).

Quality measurement is increasingly important in the health-care sphere for both performance optimization and reimbursement. Treatment of chronic conditions is a key area of quality measurement. However, medication compendiums change frequently, and health-care providers often enter medications into a patient's record as free text. Manually reviewing a complete medications database is time consuming. In order to build a robust medications list, we matched a pharmacist-generated list of categorized medications to a raw medications database that contained names, name-dose combinations, and misspellings. The matching was done with the COMPGED function, which computes a generalized edit distance between two strings. We combined a truncation function and the UPCASE function to optimize the COMPGED output. Using these combinations and adjusting the COMPGED scoring metric enabled us to narrow the database list to medications that were relevant to our categories. This process transformed a task that would be tedious with PROC COMPARE or an Excel macro into a quick and efficient matching method. The task of sorting through relevant matches was still conducted manually, but the time required to do so was significantly decreased by the fuzzy match in our application of COMPGED.
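
The matching approach described above (uppercase, truncate, then score by edit distance) can be sketched in Python. This is an illustration, not the authors' code: it uses the plain Levenshtein distance as a stand-in for the generalized edit distance that COMPGED computes, and the function names, the truncation width of 8, and the distance cutoff of 2 are all assumptions.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalize(name, width=8):
    """Uppercase and truncate so doses and suffixes do not affect the score."""
    return name.strip().upper()[:width]


def fuzzy_match(raw_names, reference, max_distance=2, width=8):
    """Map each raw name to its closest reference medication, if close enough.

    reference: dict of medication name -> category (assumed layout).
    Returns a dict of raw name -> (reference name, category, distance).
    """
    matches = {}
    for raw in raw_names:
        key = normalize(raw, width)
        best = None
        for ref_name, category in reference.items():
            dist = edit_distance(key, normalize(ref_name, width))
            if dist <= max_distance and (best is None or dist < best[2]):
                best = (ref_name, category, dist)
        if best is not None:
            matches[raw] = best
    return matches


reference = {"metformin": "diabetes", "lisinopril": "hypertension"}
raw_names = ["Metformin 500mg", "metfromin", "LISINOPRIL 10 MG TAB", "aspirin"]
for raw, hit in fuzzy_match(raw_names, reference).items():
    print(raw, "->", hit)
```

Truncating before scoring keeps a name-dose combination such as "Metformin 500mg" from being penalized for its trailing dose, while the cutoff still admits a misspelling such as "metfromin". Candidates that pass the cutoff would still need the manual review the abstract mentions.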

Read the paper (PDF).

Graduate students encounter many challenges when conducting health services research using real-world data obtained from electronic health records (EHRs). These challenges include cleaning and sorting data, summarizing and identifying present-on-admission diagnosis codes, identifying appropriate metrics for risk adjustment, and determining the effectiveness and cost-effectiveness of treatments. In addition, outcome variables commonly used in health services research are not normally distributed. This necessitates the use of nonparametric methods in statistical analyses. This paper provides graduate students with the basic tools for conducting health services research with EHR data. We will examine SAS® tools and step-by-step approaches used in an analysis of the effectiveness and cost-effectiveness of the ABCDE (Awakening and Breathing Coordination, Delirium monitoring/management, and Early exercise/mobility) bundle in improving outcomes for intensive care unit (ICU) patients. These tools include the following: (1) arrays; (2) lookup tables; (3) LAG functions; (4) PROC TABULATE; (5) recycled predictions; and (6) bootstrapping. We will discuss challenges and lessons learned in working with data obtained from the EHR. This content is appropriate for beginning SAS users.

Read the paper (PDF).

With the constant need to inform researchers about neighborhood health data, the Santa Clara County Health Department created socio-demographic and health profiles for 109 neighborhoods in the county. Data was pulled from many public and county data sets, compiled, analyzed, and automated using SAS®. With over 60 indicators and 109 profiles, an efficient set of macros was used to automate the calculation of percentages, rates, and mean statistics for all of the indicators. Macros were also used to automate the assignment of individual census tracts to predefined neighborhoods to avoid data entry errors. Simple SQL procedures were used to calculate and format percentages within the macros, and output was produced using Output Delivery System (ODS) Graphics. This output was exported to Microsoft Excel, which was used to create a sortable database for end users to compare cities and/or neighborhoods. Finally, the automated SAS output was used to map the demographic data using geographic information system (GIS) software at three geographies: city, neighborhood, and census tract. This presentation describes the use of simple macros and SAS procedures to reduce resources and time spent on checking data for quality assurance purposes. It also highlights the simple use of ODS Graphics to export data to an Excel file, which was used to mail merge the data into 109 unique profiles. The presentation is aimed at intermediate SAS users at local and state health departments who might be interested in finding an efficient way to run and present health statistics given limited staff and resources.

Read the paper (PDF).

Several U.S. Federal agencies conduct national surveys to monitor health status of residents. Many of these agencies release their survey data to the public. Investigators might be able to address their research objectives by conducting secondary statistical analyses with these available data sources. This paper describes the steps in using the SAS SURVEY procedures to analyze publicly released data from surveys that use probability sampling to make statistical inference to a carefully defined population of elements (the target population).

Read the paper (PDF). | Watch the recording.

PROC MIXED is one of the most popular SAS procedures to perform longitudinal analysis or multilevel models in epidemiology. Model selection is one of the fundamental questions in model building. One of the most popular and widely used strategies is model selection based on information criteria, such as Akaike Information Criterion (AIC) and Sawa Bayesian Information Criterion (BIC). This strategy considers both fit and complexity, and enables multiple models to be compared simultaneously. However, there is no existing SAS procedure to perform model selection automatically based on information criteria for PROC MIXED, given a set of covariates. This paper provides information about using the SAS %ic_mixed macro to select a final model with the smallest AIC and BIC values. Specifically, the %ic_mixed macro will do the following: 1) produce a complete list of all possible model specifications given a set of covariates, 2) use a DO loop to read in one model specification at a time and save it in a macro variable, 3) execute PROC MIXED and use the Output Delivery System (ODS) to output AICs and BICs, 4) append all outputs and use the DATA step to create a sorted list of information criteria with model specifications, and 5) run PROC REPORT to produce the final summary table. Based on the sorted list of information criteria, researchers can easily identify the best model. This paper includes the macro code, as well as examples of the macro calls and outputs.
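
The enumerate, fit, and sort steps listed above can be sketched in Python. This is not the %ic_mixed macro: the function names are hypothetical, the model fit is left to a caller-supplied `fit` function that returns an (AIC, BIC) pair for a covariate subset, and including the intercept-only model in the list is an assumption.

```python
from itertools import combinations


def all_specifications(covariates):
    """List every subset of the covariates, from none to all of them."""
    specs = []
    for size in range(len(covariates) + 1):
        specs.extend(combinations(covariates, size))
    return specs


def rank_models(covariates, fit):
    """Fit every specification and return rows sorted by AIC, then BIC.

    fit: caller-supplied function taking a tuple of covariate names and
         returning (aic, bic); a stand-in for running the mixed model and
         capturing its fit statistics.
    """
    rows = []
    for spec in all_specifications(covariates):
        aic, bic = fit(spec)
        label = " ".join(spec) if spec else "(intercept only)"
        rows.append({"model": label, "aic": aic, "bic": bic})
    return sorted(rows, key=lambda row: (row["aic"], row["bic"]))


print(len(all_specifications(["age", "sex", "bmi"])))  # 8
```

Because the number of specifications doubles with each added covariate, an exhaustive search of this kind is practical only for a modest number of candidates.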

Read the paper (PDF).

This presentation provides an in-depth analysis, with example SAS® code, of the health care use and expenditures associated with depression among individuals with heart disease using the 2012 Medical Expenditure Panel Survey (MEPS) data. A cross-sectional study design was used to identify differences in health care use and expenditures between depressed (n = 601) and nondepressed (n = 1,720) individuals among patients with heart disease in the United States. Multivariate regression analyses using the SAS survey analysis procedures were conducted to estimate the incremental health services and direct medical costs (inpatient, outpatient, emergency room, prescription drugs, and other) attributable to depression. The prevalence of depression among individuals with heart disease in 2012 was estimated at 27.1% (6.48 million persons) and their total direct medical costs were estimated at approximately $110 billion in 2012 U.S. dollars. Younger adults (< 60 years), women, unmarried, poor, and sicker individuals with heart disease were more likely to have depression. Patients with heart disease and depression had more hospital discharges (relative ratio (RR) = 1.06, 95% confidence interval (CI) [1.02, 1.09]), office-based visits (RR = 1.27, 95% CI [1.15, 1.41]), emergency room visits (RR = 1.08, 95% CI [1.02, 1.14]), and prescribed medicines (RR = 1.89, 95% CI [1.70, 2.11]) than their counterparts without depression. Finally, among individuals with heart disease, overall health care expenditures for individuals with depression were 69% higher than those for individuals without depression (RR = 1.69, 95% CI [1.44, 1.99]). The conclusion is that depression in individuals with heart disease is associated with increased health care use and expenditures, even after adjusting for differences in age, gender, race/ethnicity, marital status, poverty level, and medical comorbidity.

Read the paper (PDF).

Many epidemiological studies use medical claims to identify and describe a population. But finding out who was diagnosed, and who received treatment, isn't always simple. Each claim can have dozens of medical codes, with different types of codes for procedures, drugs, and diagnoses. Even a basic definition of treatment could require a search for any one of 100 different codes. A SAS® macro may come to mind, but generalizing the macro to work with different codes and types allows it to be reused in a variety of different scenarios. We look at a number of examples, starting with a single code type and variable. Then we consider multiple code variables, multiple code types, and multiple flag variables. We show how these macros can be combined and customized for different data with minimal rework. Macro flexibility and reusability are also discussed, along with ways to keep our list of medical codes separate from our program. Finally, we discuss time-dependent medical codes, codes requiring database lookup, and macro performance.
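
The idea of one reusable routine that handles multiple code variables, code types, and flag variables, with the code lists kept apart from the program logic, can be sketched in Python. This is an illustration rather than the paper's macros: the function name, the variable names (`dx1`, `dx2`, `proc1`), and the codes themselves are hypothetical placeholders, not real medical codes.

```python
def flag_claims(claims, code_lists, code_vars):
    """Add one 0/1 flag per definition to each claim.

    claims:     list of dicts, one per claim (assumed layout).
    code_lists: flag name -> {code type -> set of codes}; kept as plain
                data so it can live outside the program.
    code_vars:  code type -> list of claim variables holding that type.
    """
    flagged = []
    for claim in claims:
        row = dict(claim)  # copy so the input claims are left unchanged
        for flag, codes_by_type in code_lists.items():
            # The flag is set if any variable of a listed type holds a listed code.
            row[flag] = int(any(
                claim.get(var) in codes
                for code_type, codes in codes_by_type.items()
                for var in code_vars.get(code_type, [])
            ))
        flagged.append(row)
    return flagged


# Hypothetical placeholder codes and variable names.
code_vars = {"dx": ["dx1", "dx2"], "proc": ["proc1"]}
code_lists = {
    "diagnosed": {"dx": {"DX001", "DX002"}},
    "treated": {"proc": {"PX100"}},
}
claims = [
    {"claim_id": 1, "dx1": "DX001", "dx2": None, "proc1": "PX900"},
    {"claim_id": 2, "dx1": "DX777", "dx2": "DX002", "proc1": "PX100"},
]
for row in flag_claims(claims, code_lists, code_vars):
    print(row["claim_id"], row["diagnosed"], row["treated"])
```

Adding a new flag or a new code type then means editing `code_lists` or `code_vars` only; the routine itself does not change.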

Read the paper (PDF). | Download the data file (ZIP).

An essential part of health services research is describing the use and sequencing of a variety of health services. One of the most frequently examined health services is hospitalization. A common problem in describing hospitalizations is that a patient might have multiple hospitalizations to treat the same health problem. Specifically, a hospitalized patient might be (1) sent to and returned from another facility in a single day for testing, (2) transferred from one hospital to another, and/or (3) discharged home and re-admitted within 24 hours. In all cases, these hospitalizations are treating the same underlying health problem and should be considered as a single episode. If examined without regard for the episode, a patient would be identified as having 4 hospitalizations (the initial hospitalization, the testing hospitalization, the transfer hospitalization, and the readmission hospitalization). In reality, they had one hospitalization episode spanning multiple facilities. IMPORTANCE: Failing to account for multiple hospitalizations in the same episode has implications for many disciplines including health services research, health services planning, and quality improvement for patient safety. HEALTH SERVICES RESEARCH: Hospitalizations will be counted multiple times, leading to an overestimate of the number of hospitalizations a person had. For example, a person can be identified as having 4 hospitalizations when in reality they had one episode of hospitalization. This will result in a person appearing to be a higher user of health care than is true. RESOURCE PLANNING FOR HEALTH SERVICES: The average time and resources needed to treat a specific health problem may be underestimated. To illustrate, if a patient spends 10 days each in 3 different hospitals in the same episode, the total number of days needed to treat the health problem is 30 days, but each hospital will believe it is only 10, and planned resourcing may be inadequate.
QUALITY IMPROVEMENT FOR PATIENT SAFETY: Hospital-acquired infections are a serious concern and a major cause of extended hospital stays, morbidity, and death. As a result, many hospitals have quality improvement programs that monitor the occurrence of infections in order to identify ways to reduce them. If episodes of hospitalizations are not considered, an infection acquired in a hospital that does not manifest until a patient is transferred to a different hospital will incorrectly be attributed to the receiving hospital. PROPOSAL: We have developed SAS® code to identify episodes of hospitalizations, the sequence of hospitalizations within each episode, and the overall duration of the episode. The output clearly displays the data in an intuitive and easy-to-understand format. APPLICATION: The method we will describe and the associated SAS code will be useful to not only health services researchers, but also anyone who works with temporal data that includes nested, overlapping, and subsequent events.
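
Grouping nested, overlapping, and closely following hospitalizations into episodes can be sketched in Python. This is an illustration, not the authors' SAS code: the function name, the record layout, and the rule that a stay joins the current episode when its admission falls within 24 hours of the latest discharge seen so far are assumptions based on the three cases listed above.

```python
from datetime import datetime, timedelta


def build_episodes(stays, max_gap=timedelta(hours=24)):
    """Group each patient's stays into episodes of care.

    stays: list of dicts with keys "patient", "facility", "admit", and
           "discharge" (datetimes); an assumed layout.
    Returns a list of episodes, each with its stays in admission order.
    """
    episodes = []
    ordered = sorted(stays, key=lambda s: (s["patient"], s["admit"], s["discharge"]))
    for stay in ordered:
        current = episodes[-1] if episodes else None
        if (current is not None
                and current["patient"] == stay["patient"]
                and stay["admit"] <= current["end"] + max_gap):
            # Nested, overlapping, or admitted within the allowed gap:
            # same episode. Keep the latest discharge as the episode end.
            current["stays"].append(stay)
            current["end"] = max(current["end"], stay["discharge"])
        else:
            episodes.append({
                "patient": stay["patient"],
                "start": stay["admit"],
                "end": stay["discharge"],
                "stays": [stay],
            })
    return episodes


def describe(episodes):
    """Print episode number, sequence within episode, and episode duration."""
    for number, episode in enumerate(episodes, start=1):
        duration = episode["end"] - episode["start"]
        for sequence, stay in enumerate(episode["stays"], start=1):
            print(episode["patient"], number, sequence, stay["facility"], duration)
```

Tracking the latest discharge rather than the most recent stay's discharge is what keeps a short testing visit nested inside a long stay from ending the episode early.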