A man with one watch always knows what time it is...but a man with two watches is never sure. Contrary to this adage, load forecasters at electric utilities would gladly wear an armful of watches. With only one model to choose from, it is certain that some forecasts will be wrong. But with multiple models, forecasters can have confidence about periods when the forecasts agree and can focus their attention on periods when the predictions diverge. Getting a second opinion is one of the six classic rules for forecasters laid out by Dr. Tao Hong of the University of North Carolina at Charlotte, a premier thought leader and practitioner in the field of energy forecasting. This presentation discusses Dr. Hong's six rules, how they relate to the increasingly complex problem of forecasting electricity consumption, and the role that predictive analytics plays.
Tim Fairchild, SAS
The singular spectrum analysis (SSA) method of time series analysis applies nonparametric techniques to decompose time series into principal components. SSA is particularly valuable for long time series, in which patterns (such as trends and cycles) are difficult to visualize and analyze. An important step in SSA is determining the spectral groupings; this step can be automated by analyzing the w-correlations of the spectral components. This paper provides an introduction to singular spectrum analysis and demonstrates how to use SAS/ETS® software to perform it. To illustrate, monthly data on temperatures in the United States over the last century are analyzed to discover significant patterns.
Michael Leonard, SAS
Bruce Elsheimer, SAS
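The decomposition the abstract describes can be sketched in a few dozen lines. The following pure-Python illustration (not the paper's SAS/ETS code) embeds a series into a trajectory matrix, extracts the leading eigenvector of its lag-covariance matrix by power iteration, and reconstructs the dominant component by diagonal averaging; the window length and the toy trend-plus-noise series are illustrative assumptions.

```python
# Minimal SSA sketch: trajectory (Hankel) embedding, leading component via
# power iteration, reconstruction via diagonal averaging. Illustrative only.
import random

def ssa_leading_component(y, L):
    K = len(y) - L + 1
    X = [[y[i + j] for j in range(K)] for i in range(L)]   # L x K trajectory matrix
    # S = X X^T (L x L); power iteration finds its leading eigenvector u
    S = [[sum(X[a][k] * X[b][k] for k in range(K)) for b in range(L)] for a in range(L)]
    u = [1.0] * L
    for _ in range(200):
        v = [sum(S[a][b] * u[b] for b in range(L)) for a in range(L)]
        norm = sum(x * x for x in v) ** 0.5
        u = [x / norm for x in v]
    # Rank-1 piece of X is u (u^T X); diagonal averaging maps it back to a series
    w = [sum(u[a] * X[a][k] for a in range(L)) for k in range(K)]  # u^T X
    n = len(y)
    rec, cnt = [0.0] * n, [0] * n
    for a in range(L):
        for k in range(K):
            rec[a + k] += u[a] * w[k]
            cnt[a + k] += 1
    return [r / c for r, c in zip(rec, cnt)]

# Smooth trend plus small noise: the leading component should track the trend
random.seed(1)
y = [0.1 * t + random.gauss(0, 0.05) for t in range(60)]
trend = ssa_leading_component(y, L=10)
```

Grouping several components (and inspecting their w-correlations, as the paper discusses) extends this same machinery to trends, cycles, and noise separation.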
Data that are gathered in modern data collection processes are often large and contain geographic information that enables you to examine how spatial proximity affects the outcome of interest. For example, in real estate economics, the price of a housing unit is likely to depend on the prices of housing units in the same neighborhood or nearby neighborhoods, either because of their locations or because of some unobserved characteristics that these neighborhoods share. Understanding spatial relationships and being able to represent them in a compact form are vital to extracting value from big data. This paper describes how to glean analytical insights from big data and discover their big value by using spatial econometric methods in SAS/ETS® software.
Guohui Wu, SAS
Jan Chvosta, SAS
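The neighborhood effect described above is usually encoded in a row-standardized spatial weights matrix W, and the spatially lagged outcome Wy (each unit's neighbor average) is the basic building block of spatial econometric models such as the spatial autoregressive model y = rho*W*y + X*beta + e. A minimal pure-Python sketch, with an invented four-unit neighborhood and housing prices (this is not the SAS/ETS implementation):

```python
# Build a row-standardized spatial weights matrix from a neighbor list and
# compute the spatial lag Wy. Neighborhood structure and prices are made up.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # four units on a line
prices = [100.0, 120.0, 140.0, 160.0]

n = len(prices)
W = [[0.0] * n for _ in range(n)]
for i, nbrs in neighbors.items():
    for j in nbrs:
        W[i][j] = 1.0 / len(nbrs)          # row standardization: each row sums to 1

# Spatial lag: each unit's value replaced by the average of its neighbors
spatial_lag = [sum(W[i][j] * prices[j] for j in range(n)) for i in range(n)]
```

In a real analysis W is built from coordinates or shared borders over many thousands of units, which is where the compact-representation point in the abstract matters.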
Detection and adjustment of structural breaks are an important step in modeling time series and panel data. In some cases, such as studying the impact of a new policy or an advertising campaign, structural break analysis might even be the main goal of a data analysis project. In other cases, the adjustment of structural breaks is a necessary step to achieve other analysis objectives, such as obtaining accurate forecasts and effective seasonal adjustment. Structural breaks can occur in a variety of ways during the course of a time series. For example, a series can have an abrupt change in its trend, its seasonal pattern, or its response to a regressor. The SSM procedure in SAS/ETS® software provides a comprehensive set of tools for modeling different types of sequential data, including univariate and multivariate time series data and panel data. These tools include options for easy detection and adjustment of a wide variety of structural breaks. This paper shows how you can use the SSM procedure to detect and adjust structural breaks in many different modeling scenarios. Several real-world data sets are used in the examples. The paper also includes a brief review of the structural break detection facilities of other SAS/ETS procedures, such as the ARIMA, AUTOREG, and UCM procedures.
Rajesh Selukar, SAS
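The simplest structural break, an abrupt shift in mean, can be located by scanning every candidate break point and choosing the split that minimizes the within-segment sum of squared errors. The sketch below illustrates that idea in pure Python on simulated data; it is a toy stand-in for the SSM procedure's facilities, and the series and shift size are invented.

```python
# Locate a single mean shift by exhaustive search over candidate break points.
import random

def detect_mean_break(y):
    def sse(seg):
        m = sum(seg) / len(seg)
        return sum((v - m) ** 2 for v in seg)
    best_k, best_cost = None, float("inf")
    for k in range(2, len(y) - 1):             # keep at least 2 points per segment
        cost = sse(y[:k]) + sse(y[k:])         # total within-segment squared error
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

random.seed(7)
y = [random.gauss(0, 1) for _ in range(50)] + [random.gauss(5, 1) for _ in range(50)]
k = detect_mean_break(y)    # expected near the true break at t = 50
```

Breaks in trend, seasonality, or regression response require richer models, which is exactly the gap the state space tools in PROC SSM fill.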
Session 1528-2017:
Getting Started with ARIMA Models
Getting Started with ARIMA Models introduces the basic features of time series variation and the model components used to accommodate them: stationary (ARMA), trend and seasonal (the 'I' in ARIMA), and exogenous (related to input variables). The Identify, Estimate, and Forecast framework for building ARIMA models is illustrated with two demonstrations.
Chip Wells, SAS
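The Identify-Estimate-Forecast cycle can be illustrated end to end on the simplest member of the ARIMA family, an AR(1). The pure-Python sketch below (an illustration, not PROC ARIMA) identifies the AR term from the lag-1 autocorrelation, estimates the coefficient by conditional least squares, and produces geometrically decaying forecasts; the simulated series and its parameters are assumptions.

```python
# Identify, estimate, and forecast an AR(1) from scratch on simulated data.
import random

random.seed(42)
phi_true, n = 0.7, 500
y = [0.0]
for _ in range(n - 1):
    y.append(phi_true * y[-1] + random.gauss(0, 1))

# Identify: a large lag-1 sample autocorrelation suggests an AR term
mean = sum(y) / n
acf1 = (sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, n))
        / sum((v - mean) ** 2 for v in y))

# Estimate: conditional least squares (regress y_t on y_{t-1})
phi_hat = (sum(y[t] * y[t - 1] for t in range(1, n))
           / sum(y[t - 1] ** 2 for t in range(1, n)))

# Forecast: the h-step-ahead forecast decays geometrically toward the mean
forecasts = [phi_hat ** h * y[-1] for h in range(1, 13)]
```

Differencing (the 'I') and exogenous inputs extend this same recursion, which is what the full IDENTIFY/ESTIMATE/FORECAST statements automate.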
The traditional view is that a utility's long-term forecast must have a standard against which it is judged. Weather normalization is one of the industry-standard practices that utilities use to assess the efficacy of a forecasting solution. Although recent advances in probabilistic load forecasting bring many benefits to a forecast, many utilities still require a benchmarking process to determine the accuracy of their long-term forecasts. Because of climatological volatility and the potentially large annual variances in temperature, humidity, and other relevant weather variables, most utilities create normalized weather profiles through various processes in order to estimate what is traditionally called a weather-normalized load profile. However, new research shows that because of the nonlinear response of electric demand to weather variations, a simple normal weather profile in many cases might not equate to a normal load. In this paper, we introduce a probabilistic approach to deriving normalized load profiles and monthly peak and energy through a process we label "load normalization against the effects of weather." We compare it with the traditional weather normalization process to quantify the costs and benefits of using such a process. The proposed method has been successfully deployed at utilities for long-term operation, planning, and risk management purposes.
Kyle Wood, Seminole Electric Cooperative Inc
Jason Wilson, SAS
Bradley Lawson, SAS
Rain Xie
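The nonlinearity argument in the abstract can be seen with a toy model: if load responds quadratically to temperature, the average load over many weather scenarios differs from the load at the single average temperature (Jensen's inequality). All numbers in this sketch are invented for illustration.

```python
# Probabilistic normalization vs. the "plug in normal weather" shortcut.
import random

def load_mw(temp_f):
    # Convex U-shape: demand rises for both heating and cooling (illustrative)
    return 1000.0 + 2.0 * (temp_f - 65.0) ** 2

random.seed(0)
# Many plausible weather scenarios around the "normal" 65 F temperature
scenarios = [random.gauss(65.0, 10.0) for _ in range(20000)]

load_at_normal_temp = load_mw(65.0)          # traditional single-profile answer
normalized_load = sum(load_mw(t) for t in scenarios) / len(scenarios)
```

Because the response is convex, the scenario-averaged load sits well above the load at normal temperature, which is the core motivation for the probabilistic approach.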
Interrupted time series analysis (ITS) is a tool that can help Learning Healthcare Systems evaluate programs in settings where randomization is not feasible. Interrupted time series is a statistical method that assesses repeated snapshots over regular intervals of time before and after a system-level intervention or program is implemented. This method can be used by Learning Healthcare Systems to evaluate programs aimed at improving patient outcomes in real-world clinical settings. In practice, the number of patients and the timing of observations are restricted. This presentation describes a program that helps statisticians identify optimal segments of time within a fixed population size for an interrupted time series analysis. A macro creates simulations based on DO loops to calculate power to detect changes over time due to system-level interventions. Parameters used in the macro are sample size, number of subjects in each time frame in each year, number of intervals in a year, and the probability of the event before and after the intervention. The macro gives the user the ability to specify different assumptions, resulting in design options that yield varying power based on the number of patients in each time interval given the fixed parameters. The output from the macro can help stakeholders understand the parameters necessary to determine the optimal evaluation design.
Nigel Rozario, UNCC
Andrew McWilliams, CHS
Charity Moore, CHS
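The power-calculation idea can be sketched in Python as well (the original is a SAS macro): simulate event counts before and after the intervention, apply a two-proportion z-test, and estimate power as the rejection rate across simulations. The sample sizes, event probabilities, and 1.96 cutoff below are illustrative assumptions.

```python
# Simulation-based power for a pre/post intervention comparison of event rates.
import random

def power(n_pre, n_post, p_pre, p_post, sims=500):
    hits = 0
    for _ in range(sims):
        x1 = sum(random.random() < p_pre for _ in range(n_pre))    # pre events
        x2 = sum(random.random() < p_post for _ in range(n_post))  # post events
        p1, p2 = x1 / n_pre, x2 / n_post
        p = (x1 + x2) / (n_pre + n_post)                 # pooled proportion
        se = (p * (1 - p) * (1 / n_pre + 1 / n_post)) ** 0.5
        if se > 0 and abs(p1 - p2) / se > 1.96:          # two-sided z-test at 5%
            hits += 1
    return hits / sims

random.seed(123)
big_effect = power(200, 200, 0.30, 0.50)   # large intervention effect: high power
no_effect = power(200, 200, 0.30, 0.30)    # null case: rejection rate near alpha
```

Sweeping the sample sizes and interval counts through such a function is exactly the design-exploration role the macro plays for stakeholders.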
Niccolò Machiavelli, author of The Prince, said things on the order of, "The promise given was a necessity of the past: the word broken is a necessity of the present." His utilitarian philosophy can be summed up by the phrase "the ends justify the means." As a personality trait, Machiavellianism is characterized by the drive to pursue one's own goals at the cost of others. In 1970, Richard Christie and Florence L. Geis created the MACH-IV test to assign a MACH score to an individual, using 20 Likert-scaled questions. The purpose of this study was to build a regression model that can be used to predict the MACH score of an individual using fewer factors. Such a model could be useful in screening processes where personality is considered, such as in job screening, offender profiling, or online dating. The research was conducted on a data set from an online personality test similar to the MACH-IV test. It was hypothesized that a statistically significant model exists that can predict an average MACH score for individuals with similar factors. This hypothesis was accepted.
Patrick Schambach, Kennesaw State University
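The kind of reduced-factor model the study describes can be illustrated with ordinary least squares fit from scratch. The single Likert-style predictor and the MACH-like scores below are fabricated for illustration; the study's actual factors and data are not reproduced here.

```python
# Simple linear regression by least squares: predict a MACH-style total score
# from one Likert item. All data are fabricated for illustration.
xs = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]            # responses to one Likert item
ys = [42, 48, 50, 55, 57, 63, 65, 70, 72, 74]  # overall MACH-style scores

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))      # Sxy / Sxx
intercept = my - slope * mx

def predict(x):
    return intercept + slope * x
```

The fitted line passes through the mean point (mx, my), a property of least squares that holds regardless of the data.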
Recent advances in computing technology, monitoring systems, and data collection mechanisms have prompted renewed interest in multivariate time series analysis. In contrast to univariate time series models, which focus on temporal dependencies of individual variables, multivariate time series models also exploit the interrelationships between different series, thus often yielding improved forecasts. This paper focuses on cointegration and long memory, two phenomena that require careful consideration and are observed in time series data sets from several application areas, such as finance, economics, and computer networks. Cointegration of time series implies a long-run equilibrium between the underlying variables, and long memory is a special type of dependence in which the impact of a series' past values on its future values dies out slowly with the increasing lag. Two examples illustrate how you can use the new features of the VARMAX procedure in SAS/ETS® 14.1 and 14.2 to glean important insights and obtain improved forecasts for multivariate time series. One example examines cointegration by using the Granger causality tests and the vector error correction models, which are the techniques frequently applied in the Federal Reserve Board's Comprehensive Capital Analysis and Review (CCAR), and the other example analyzes the long-memory behavior of US inflation rates.
Xilong Chen, SAS
Stefanos Kechagias, SAS
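The cointegration idea can be sketched with the classic Engle-Granger two-step logic (PROC VARMAX uses more sophisticated machinery, such as Johansen-style vector error correction models): regress one series on the other, then check whether the residual reverts to zero by fitting an AR(1) to it. The simulated random-walk data and parameters below are invented.

```python
# Engle-Granger-style sketch: two random walks sharing a common trend are
# cointegrated; their long-run regression residual is mean-reverting.
import random

random.seed(11)
n = 500
x = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0, 1))               # common random-walk trend
y = [2.0 * xv + random.gauss(0, 0.5) for xv in x]      # cointegrated with x

# Step 1: long-run regression y = b*x (through the origin for simplicity)
b = sum(yi * xi for yi, xi in zip(y, x)) / sum(xi * xi for xi in x)
resid = [yi - b * xi for yi, xi in zip(y, x)]

# Step 2: AR(1) coefficient of the residual; a value far below 1 indicates
# mean reversion, the signature of a cointegrating relationship
rho = (sum(resid[t] * resid[t - 1] for t in range(1, n))
       / sum(resid[t - 1] ** 2 for t in range(1, n)))
```

Long memory sits between these extremes: autocorrelations decay, but much more slowly than the geometric decay of a stationary AR process.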
People typically invest in more than one stock to help diversify their risk. These stock portfolios are collections of assets that each carry their own inherent risk. If you know the future risk of each asset, you can optimize how much of each asset to keep in the portfolio. The real challenge is trying to evaluate the potential future risk of these assets. Different techniques provide different forecasts, which can drastically change the optimal allocation of assets. This talk presents a case study of portfolio optimization in three different scenarios: historical standard deviation estimation, the capital asset pricing model (CAPM), and GARCH-based volatility modeling. The structure and results of these three approaches are discussed.
Aric LaBarr, Institute for Advanced Analytics at NC State University
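The forecast-risk-then-allocate workflow can be sketched with an exponentially weighted moving-average (EWMA) volatility estimate, a simple stand-in for a full GARCH model, feeding two-asset minimum-variance weights that ignore correlation. The returns, the 0.94 decay factor, and the assets are illustrative assumptions.

```python
# EWMA variance forecast (a GARCH-like recursion with fixed decay) and a
# two-asset minimum-variance allocation. All inputs are simulated.
import random

def ewma_variance(returns, lam=0.94):
    var = returns[0] ** 2
    for r in returns[1:]:
        var = lam * var + (1 - lam) * r ** 2   # recent squared returns dominate
    return var

random.seed(3)
calm = [random.gauss(0, 0.01) for _ in range(250)]    # low-volatility asset
jumpy = [random.gauss(0, 0.03) for _ in range(250)]   # high-volatility asset

v1, v2 = ewma_variance(calm), ewma_variance(jumpy)
# Minimum-variance weights for two uncorrelated assets: w_i proportional to 1/var_i
w1 = (1 / v1) / (1 / v1 + 1 / v2)
w2 = 1 - w1
```

A full GARCH(1,1) replaces the fixed decay with estimated parameters and a long-run variance term, which is why different risk forecasts can shift the allocation so sharply.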
Many organizations need to analyze large numbers of time series that have time-varying or frequency-varying properties (or both). The time-varying properties can include time-varying trends, and the frequency-varying properties can include time-varying periodic cycles. Time-frequency analysis simultaneously analyzes both time and frequency; it is particularly useful for monitoring time series that contain several signals of differing frequency. These signals are commonplace in data that are associated with the internet of things. This paper introduces techniques for large-scale time-frequency analysis and uses SAS® Forecast Server and SAS/ETS® software to demonstrate these techniques.
Michael Leonard, SAS
Wei Xiao, SAS
Arin Chaudhuri, SAS
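The core time-frequency operation, analyzing a signal window by window so that a change in dominant frequency becomes visible, can be sketched with a naive short-time Fourier transform in pure Python. The window length and the two test frequencies below are illustrative choices, not the paper's data.

```python
# Naive short-time Fourier analysis: find the dominant frequency bin of each
# window of a signal whose frequency changes halfway through.
import cmath
import math

def dominant_bin(window):
    n = len(window)
    mags = [abs(sum(window[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(1, n // 2)]          # skip the DC (k = 0) bin
    return 1 + mags.index(max(mags))

N = 64
early = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]    # 4 cycles/window
late = [math.sin(2 * math.pi * 12 * t / N) for t in range(N)]    # 12 cycles/window
signal = early + late

first = dominant_bin(signal[:N])    # dominant frequency of the first window
second = dominant_bin(signal[N:])   # dominant frequency of the second window
```

Production tools use the FFT and overlapping tapered windows, but the per-window spectrum computed here is the same quantity a spectrogram displays.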
Panel data, which are collected on a set (panel) of individuals over several time points, are ubiquitous in economics and other analytic fields because their structure allows for individuals to act as their own control groups. The PANEL procedure in SAS/ETS® software models panel data that have a continuous response, and it provides many options for estimating regression coefficients and their standard errors. Some of the available estimation methods enable you to estimate a dynamic model by using a lagged dependent variable as a regressor, thus capturing the autoregressive nature of the underlying process. Including lagged dependent variables introduces correlation between the regressors and the residual error, which necessitates using instrumental variables. This paper guides you through the process of using the typical estimation method for this situation, the generalized method of moments (GMM), and the process of selecting the optimal set of instrumental variables for your model. Your goal is to achieve unbiased, consistent, and efficient parameter estimates that best represent the dynamic nature of the model.
Roberto Gutierrez, SAS
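The reason instruments are needed can be demonstrated from scratch: first-differencing a dynamic panel removes the fixed effect but leaves the lagged differenced response correlated with the differenced error, so ordinary least squares is biased, while the twice-lagged level is a valid instrument (the Anderson-Hsiao idea that GMM generalizes). In the sketch below, the panel, the true coefficient, and all sizes are invented.

```python
# Dynamic panel: OLS on first differences vs. an Anderson-Hsiao-style IV
# estimator, on simulated data with individual fixed effects.
import random

random.seed(5)
rho, N, T = 0.5, 2000, 7
num_ols = den_ols = num_iv = den_iv = 0.0
for _ in range(N):
    alpha = random.gauss(0, 1)                     # individual fixed effect
    y = [alpha + random.gauss(0, 1)]
    for _ in range(T - 1):
        y.append(rho * y[-1] + alpha + random.gauss(0, 1))
    for t in range(2, T):
        dy = y[t] - y[t - 1]                       # differencing removes alpha
        dy1 = y[t - 1] - y[t - 2]
        num_ols += dy1 * dy                        # OLS on differences: biased,
        den_ols += dy1 * dy1                       # since dy1 shares e_{t-1} with dy
        z = y[t - 2]                               # instrument: twice-lagged level
        num_iv += z * dy
        den_iv += z * dy1

rho_ols = num_ols / den_ols    # biased well below the true value
rho_iv = num_iv / den_iv       # consistent for the true rho
```

GMM improves on this single-instrument estimator by stacking all available lagged levels as moment conditions and weighting them optimally, which is the instrument-selection problem the paper walks through.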