SAS/OR^{®} Software and SAS^{®} Simulation Studio Papers A-Z

The Challenge: assigning outbound calling agents in a telemarketing campaign to geographic districts. The districts have a variable number of leads, and each agent needs to be assigned entire districts with the total number of leads being as close as possible to a specified number for each of the agents (usually, but not always, an equal number). In addition, there are constraints concerning the distribution of assigned districts across time zones in order to maximize productivity and availability. Our Solution: use the SAS/OR^{®} procedure PROC CLP to formulate the challenge as a constraint satisfaction problem (CSP) since the objective is not necessarily to minimize a cost function, but rather to find a feasible solution to the constraint set. The input consists of the number of agents, the number of districts, the number of leads in each district, the desired number of leads per agent, the amount by which the actual number of leads can differ from the desired number, and the time zone for each district.
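The constraint satisfaction view described above can be illustrated with a small backtracking search. This is only a sketch in Python, not the PROC CLP model from the paper. The function name `assign_districts`, the sample data, and the time zone rule (each agent covers at most `max_zones` distinct time zones) are assumptions made for illustration; the abstract does not spell out the actual time zone constraints.

```python
from typing import Dict, List, Optional


def assign_districts(
    leads: List[int],
    zones: List[str],
    targets: List[int],
    tolerance: int,
    max_zones: int,
) -> Optional[List[int]]:
    """Return one agent index per district, or None if no feasible assignment exists.

    Feasibility (assumed rules, for illustration only):
      * each agent's total leads is within `tolerance` of that agent's target;
      * each agent's districts span at most `max_zones` distinct time zones.
    """
    n_agents = len(targets)
    # Place large districts first so infeasible branches are cut early.
    order = sorted(range(len(leads)), key=lambda d: -leads[d])
    assignment: List[Optional[int]] = [None] * len(leads)
    totals = [0] * n_agents
    zone_counts: List[Dict[str, int]] = [dict() for _ in range(n_agents)]

    def backtrack(k: int) -> bool:
        if k == len(order):
            # All districts placed: check the lower bound on every agent.
            return all(abs(totals[a] - targets[a]) <= tolerance for a in range(n_agents))
        d = order[k]
        z = zones[d]
        for a in range(n_agents):
            if totals[a] + leads[d] > targets[a] + tolerance:
                continue  # would exceed the upper bound for this agent
            if z not in zone_counts[a] and len(zone_counts[a]) >= max_zones:
                continue  # would add one time zone too many
            assignment[d] = a
            totals[a] += leads[d]
            zone_counts[a][z] = zone_counts[a].get(z, 0) + 1
            if backtrack(k + 1):
                return True
            # Undo the tentative assignment and try the next agent.
            totals[a] -= leads[d]
            zone_counts[a][z] -= 1
            if zone_counts[a][z] == 0:
                del zone_counts[a][z]
            assignment[d] = None
        return False

    return [a for a in assignment if a is not None] if backtrack(0) else None


# Hypothetical data: six districts, two agents, 100 leads desired per agent.
leads = [40, 35, 25, 30, 20, 50]
zones = ["E", "E", "C", "C", "P", "P"]
result = assign_districts(leads, zones, targets=[100, 100], tolerance=5, max_zones=2)
print(result)
```

As in the abstract, the search stops at the first feasible assignment and does not optimize a cost function.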

Dataset Matching and Clustering with PROC OPTNET

We used OPTNET to link hedge fund datasets from four vendors, covering overlapping populations, but with no universal identifier. This quick tip shows how to treat data records as nodes, use pairwise identifiers to generate distance measures, and get PROC OPTNET to assign clusters of records from all sources to each hedge fund. This proved to be far faster, and easier, than doing the same task in PROC SQL.
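The idea of treating records as nodes and grouping them through shared identifiers can be sketched with a union-find structure. This Python example is a simplification and not the paper's PROC OPTNET code: it links two records whenever they share an exact identifier value, rather than computing distance measures, and the record IDs and identifier fields (`isin`, `ticker`, `name`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def cluster_records(records: Dict[str, Dict[str, Optional[str]]]) -> List[List[str]]:
    """Group record IDs into clusters of records linked by any shared identifier value."""
    parent = {rid: rid for rid in records}

    def find(x: str) -> str:
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Remember the first record seen for each (field, value) pair;
    # any later record with the same pair is linked to it.
    first_seen: Dict[Tuple[str, str], str] = {}
    for rid, identifiers in records.items():
        for field, value in identifiers.items():
            if value is None:
                continue  # missing identifiers create no link
            key = (field, value)
            if key in first_seen:
                union(rid, first_seen[key])
            else:
                first_seen[key] = rid

    clusters: Dict[str, List[str]] = defaultdict(list)
    for rid in records:
        clusters[find(rid)].append(rid)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical records from different vendors; no single identifier covers all of them.
records = {
    "A1": {"isin": "X1", "name": "alpha fund"},
    "B7": {"isin": "X1", "ticker": "ALP"},
    "C3": {"ticker": "ALP"},
    "D9": {"name": "beta fund"},
}
print(cluster_records(records))  # [['A1', 'B7', 'C3'], ['D9']]
```

Records A1 and C3 share no identifier directly but land in the same cluster through B7, which is the transitive linking that connected components provide.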

The literature suggests two main approaches, parametric and non-parametric, for constructing efficiency frontiers on which the efficiency scores of other units can be based. Parametric functions can be either deterministic or stochastic in nature. However, when multiple inputs and outputs are encountered, Data Envelopment Analysis (DEA), a non-parametric approach, is a powerful tool that has been used for decades to measure productivity and efficiency in a wide range of applications. Both approaches have advantages and limitations. This paper attempts to further explore and validate a hybrid approach that takes the best of both the DEA and the parametric approach in order to improve efficiency estimates for Decision Making Units (DMUs).

The role of the Data Scientist is the viral job description of the decade. And like LOLcats, there are many types of Data Scientists. What is this new role? Who is hiring them? What do they do? What skills are required to do their job? What does this mean for the SAS^{®} programmer and the statistician? Are they obsolete? And finally, if I am a SAS user, how can I become a Data Scientist? Come learn about this job of the future and what you can do to be part of it.

Over the past decade, sports analytics has seen an explosion in research and model development to calculate wins, reaching cult popularity with the release of the film 'Moneyball.' The purpose of this paper is to explore the methodology of solving a real-life Moneyball problem in basketball. An optimal basketball lineup will be selected in an attempt to maximize the total points per game while maximizing court coverage. We will briefly review some of the literature that has explored this type of problem, traditionally called the maximum coverage problem (MCP) in operations research. An exploratory data analysis will be performed, including visualizations and clustering, in order to prepare the modeling dataset for optimization. Finally, SAS^{®} will be used to formulate an MCP, and additional constraints will be added to run different business scenarios.
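For readers unfamiliar with the maximum coverage problem, the following Python sketch shows the classic greedy heuristic: repeatedly pick the player whose court zones add the most uncovered zones. It is not the paper's formulation, which is an optimization model in SAS and also accounts for points per game; the player names and zone numbers are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def greedy_max_coverage(zones_by_player: Dict[str, Set[int]], k: int) -> Tuple[List[str], Set[int]]:
    """Pick up to k players greedily so that the union of their zones is large."""
    remaining = dict(zones_by_player)
    chosen: List[str] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(remaining))):
        # Iterate in sorted name order so ties are broken deterministically.
        best = max(sorted(remaining), key=lambda p: len(remaining[p] - covered))
        if not remaining[best] - covered:
            break  # no player adds a new zone
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical court zones (numbered 1-6) where each player shoots effectively.
players = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {4, 5, 6},
    "D": {1, 6},
}
lineup, zones = greedy_max_coverage(players, k=2)
print(lineup, sorted(zones))  # ['A', 'C'] [1, 2, 3, 4, 5, 6]
```

The greedy heuristic is fast but not guaranteed to be optimal; an integer programming model of the kind the abstract describes can find the true optimum and accept side constraints.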

SAS/OR^{®} software for operations research includes mathematical optimization, discrete-event simulation, and project and resource scheduling capabilities. This paper surveys a number of its new features that better equip you to address decision-making challenges such as planning, resource management, and asset allocation. Optimization performance improvements help you solve larger, more detailed problems more quickly. Improvements encompass linear, mixed integer linear, and nonlinear optimization, and include multithreading of the mixed integer linear solver and major improvements in the performance and functionality of the decomposition algorithm for linear and mixed integer linear optimization. The OPTMODEL procedure for optimization modeling adds direct access to the same set of efficient network optimization algorithms available via the OPTNET procedure in SAS/OR, enabling you to embed network optimization as a component of larger solution processes. Other new features enable you to execute multiple optimizations in parallel and use the FCMP procedure to define functions. The OPTLSO procedure for global and local search optimization adds the ability to work with multiple objective functions and produce a set of Pareto-optimal solutions. This approach enables you to manage the trade-offs that arise between competing objectives and adds to the range of optimization problems that you can solve using PROC OPTLSO. Another new feature is support for the READ_ARRAY function in PROC FCMP, with which you can much more easily input array-structured data to be used in function definitions. Finally, SAS^{®} Simulation Studio for discrete-event simulation enhances its graphical interface to better support customization and increase ease of use.
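To make the notion of a Pareto-optimal set concrete, here is a minimal Python sketch that filters a list of candidate solutions down to the non-dominated ones. It illustrates only the definition and says nothing about how PROC OPTLSO searches for such solutions; the two objectives (cost and risk, both minimized) and the sample points are assumptions.

```python
from typing import List, Sequence, Tuple


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p is at least as good as q in every objective and strictly better in one.

    All objectives are assumed to be minimized.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, risk) pairs for six candidate solutions.
candidates = [(1, 9), (2, 7), (3, 8), (4, 3), (5, 5), (6, 1)]
print(pareto_front(candidates))  # [(1, 9), (2, 7), (4, 3), (6, 1)]
```

Each point on the front represents a different trade-off: moving along it lowers one objective only by raising the other.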

This paper describes human mobility behavior in the metropolitan area of Rio de Janeiro, Brazil. The study is based on mobile phone data provided by one of the largest mobile carriers in Brazil. Mobile phone data comprises a reasonable variety of information, including the time and location of call activity throughout urban areas. This information can be used to build users' trajectories over time, describing the major characteristics of urban mobility within the city. This paper presents a variety of distribution analyses that aim to clearly describe the most relevant characteristics of overall mobility in the metropolitan area of Rio de Janeiro. In addition, methods from physics that describe trends in trips, such as gravity and radiation models, were computed and compared across granularities of geographic scale and against traditional data mining approaches such as linear regression. A brief comparison of how well each approach predicts the number of trips between pairs of locations is presented at the end.
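The two physics-inspired trip models named above have simple closed forms in their standard textbook versions, sketched below in Python. The exact variants and the fitted parameters used in the paper are not given in the abstract, so the function names, the constants `k` and `beta`, and the sample numbers are assumptions.

```python
def gravity_trips(pop_origin: float, pop_dest: float, distance: float,
                  k: float = 1.0, beta: float = 2.0) -> float:
    """Gravity model: trips grow with both populations and decay with distance.

    T_ij = k * m_i * m_j / d_ij ** beta
    """
    return k * pop_origin * pop_dest / distance ** beta


def radiation_trips(trips_from_origin: float, pop_origin: float, pop_dest: float,
                    pop_between: float) -> float:
    """Radiation model: parameter-free share of the trips leaving the origin.

    T_ij = T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))
    where s_ij is the population living closer to the origin than the
    destination is (excluding the origin and destination themselves).
    """
    m, n, s = pop_origin, pop_dest, pop_between
    return trips_from_origin * m * n / ((m + s) * (m + n + s))


# Hypothetical pair of locations.
print(gravity_trips(1000, 500, distance=10, k=0.01))   # about 50 trips
print(radiation_trips(100, 1000, 500, pop_between=500))  # about 16.7 trips
```

The gravity model needs `k` and `beta` to be calibrated against observed trips, whereas the radiation model depends only on population counts, which is one reason the two are often compared.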

In the traveling salesman problem, a salesman must minimize travel distance while visiting each of a given set of cities exactly once. This paper uses the SAS/OR^{®} OPTMODEL procedure to formulate and solve the traveling baseball fan problem, which complicates the traveling salesman problem by incorporating scheduling constraints: a baseball fan must visit each of the 30 Major League ballparks exactly once, and each visit must include watching a scheduled Major League game. The objective is to minimize the time between the start of the first attended game and the end of the last attended game. One natural integer programming formulation involves a binary decision variable for each scheduled game, indicating whether the fan attends. But a reformulation as a side-constrained network flow problem yields much better solver performance.
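The structure of the traveling baseball fan problem can be seen on a toy instance. The Python sketch below enumerates every way of choosing one game per ballpark, which mirrors the binary "attend this game or not" view but is only workable for a handful of parks; it is neither of the formulations solved with PROC OPTMODEL in the paper. The three parks, the game times (in hours), and the travel times are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Game = Tuple[str, int, int]  # (park, start hour, end hour)


def best_tour(games: List[Game],
              travel: Dict[str, Dict[str, int]]) -> Optional[Tuple[int, List[Game]]]:
    """Return (elapsed time, ordered games) for the shortest feasible tour, or None."""
    by_park: Dict[str, List[Game]] = {}
    for game in games:
        by_park.setdefault(game[0], []).append(game)

    best: Optional[Tuple[int, List[Game]]] = None
    # Try every combination that picks exactly one game at each park.
    for combo in product(*by_park.values()):
        tour = sorted(combo, key=lambda g: g[1])
        # The fan must be able to travel from each game to the next in time.
        feasible = all(a[2] + travel[a[0]][b[0]] <= b[1] for a, b in zip(tour, tour[1:]))
        if not feasible:
            continue
        elapsed = tour[-1][2] - tour[0][1]  # first pitch to final out
        if best is None or elapsed < best[0]:
            best = (elapsed, tour)
    return best


# Hypothetical schedule: two candidate games at each of three parks.
games = [("A", 0, 3), ("A", 24, 27), ("B", 5, 8), ("B", 30, 33), ("C", 12, 15), ("C", 20, 23)]
travel = {
    "A": {"B": 4, "C": 6},
    "B": {"A": 4, "C": 3},
    "C": {"A": 6, "B": 3},
}
print(best_tour(games, travel))
# (22, [('B', 5, 8), ('C', 12, 15), ('A', 24, 27)])
```

With 30 parks and a full season schedule the number of combinations is astronomically large, which is why the paper turns to integer programming and a network flow reformulation.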

A European utility company has several thousand service engineers who provide its customers with services that range from performing routine maintenance to handling emergency breakdowns. Each service engineer is assigned to a work area that consists of a set of postal sectors. The company wants to understand how it should configure its work areas to improve customer satisfaction, minimize travel time for its full-time service engineers, and minimize the costs of overtime and subcontractor hours. This paper describes the use of SAS/OR^{®} optimization procedures to model this problem and configure optimal work areas, and the use of SAS^{®} Simulation Studio to simulate how the optimal configurations might satisfy the customer service requirements. The experimental results show that the proposed solution can satisfy customer demand within the desired service-time window, with significantly less travel time for the engineers, and with lower overtime and subcontractor costs.