Tourism demand modeling and forecasting are very important for tourism-related business decision making. This example illustrates modeling tourism demand using the VARMAX procedure.
Tourism demand can be measured in terms of
Generally, the explanatory variables for tourism demand include origin country income, destination country tourism prices, substitute destination country tourism prices, tastes, etc. Empirical studies usually use living costs for tourists in the destination as the tourism price. Various demand models can be used to estimate and forecast tourism demand.
This example considers modeling tourism demand in a vector autoregressive (VAR) framework. The goal is to forecast the number of holidays in Spain taken by United Kingdom (U.K.) residents using annual tourism data from 1966 to 1994 obtained from Song and Witt (2000). The variables in the data set are described in the following table:
SAS Variable Name | Description |
VSP | The number of holidays in Spain taken by U.K. residents |
PDI | U.K. real personal disposable income |
PUK | The implicit deflator of U.K. consumer expenditure |
EXSP | Exchange rate index of Spanish pesetas against the U.S. dollar |
EXUK | Exchange rate index of the U.K. pound against the U.S. dollar |
POP | U.K. population |
CPISP | The consumer price index in Spain |
The log of the number of per capita holiday visits to Spain by U.K. residents is plotted in Figure 1 using the following code:
/* Build the log-transformed series used in the model */
data tourism;
   set sashelp.tourism;
   lpvsp = log(vsp);
   lppdi = log(pdi);
   lrcsp = log( (cpisp/exsp)/(puk/exuk) );
   label lpvsp='log(per capita holiday visits to Spain)'
         lppdi='log(per capita disposable income)'
         lrcsp='log(living costs in Spain relative to UK living costs adjusted by the exchange rate)';
run;

/* Plot the log visits series against year */
proc gplot data=tourism;
   plot lpvsp*year / cframe=ligr vaxis=axis1 haxis=axis2;
   title 'Log of Per Capita Holiday Visits to Spain';
   symbol c=blue i=join v=star;
   axis1 label=(angle=90 'log(per capita visits to Spain)');
   axis2 label=('Year');
run;
quit;
Figure 1: Log of Per Capita Holiday Visits to Spain
The VAR model of order one can be expressed as follows:
Y_t = C + Φ Y_{t-1} + ε_t
where Y_t is a k by 1 observation vector, ε_t is a k by 1 white noise vector, C is a k by 1 vector of parameters, and Φ is a k by k matrix of first-order autoregressive parameters.
In the tourism model, the vector Y_t is (LPVSP_t, LPPDI_t, LRCSP_t), where LPVSP denotes the log of the number of holidays in Spain taken by U.K. residents, LPPDI denotes the log of U.K. real personal disposable income, and LRCSP denotes the log of living costs in Spain relative to the U.K., adjusted by the exchange rate. LRCSP is obtained by the relation
LRCSP = ln([(cpisp/exsp)/(puk/exuk)])
where cpisp is the consumer price index in Spain, exsp is the exchange rate index of Spanish pesetas against the U.S. dollar, puk is the implicit deflator of U.K. consumer expenditure, and exuk is the exchange rate index of the U.K. pound against the U.S. dollar.
The tourism model can be written as
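Substituting the three series into the VAR(1) equation gives the system sketched below. The element names c_i, φ_ij, and ε_it are labels chosen here for illustration; they are assumptions, not notation taken from the source.

```latex
\begin{pmatrix} LPVSP_t \\ LPPDI_t \\ LRCSP_t \end{pmatrix}
=
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
+
\begin{pmatrix}
\phi_{11} & \phi_{12} & \phi_{13} \\
\phi_{21} & \phi_{22} & \phi_{23} \\
\phi_{31} & \phi_{32} & \phi_{33}
\end{pmatrix}
\begin{pmatrix} LPVSP_{t-1} \\ LPPDI_{t-1} \\ LRCSP_{t-1} \end{pmatrix}
+
\begin{pmatrix} \epsilon_{1t} \\ \epsilon_{2t} \\ \epsilon_{3t} \end{pmatrix}
```

Each row is one equation: the first row, for example, explains this year's log visits by last year's log visits, log income, and log relative living costs.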
You can use the VARMAX procedure to fit this VAR(1) model. First, state the variable names in the MODEL statement. Then use the P= option in the MODEL statement to specify the order of the VAR processes.
To determine the appropriate order of the VAR processes to be used, you can look at the partial autoregressive matrices, partial correlation matrices, or partial canonical matrices. These matrices have the cutoff property for a VAR(p) model at lag = p. Using the PRINT=(PARCOEF PCORR PCANCORR) option enables you to print out the necessary matrices, as shown in the following code:
/* Fit a VAR(1) and print the matrices used to choose the AR order */
proc varmax data=tourism;
   id year interval=year;
   model lpvsp lppdi lrcsp / p=1 print=(parcoef pcorr pcancorr) lagmax=6;
run;
Figure 2: Partial Autoregressive Coefficients
Figure 2 shows that an AR order of p=1 is adequate, since the partial autoregressive matrices are insignificant after lag 1 with respect to two standard errors.
Figure 3: Partial Correlations
The partial cross-correlation matrices in Figure 3 are insignificant after lag 1 with respect to two standard errors. This indicates that an AR order of p=1 can be an appropriate choice.
Figure 4: Partial Canonical Correlations
Figure 4 shows that the partial canonical correlations between y_t and y_{t-p} are {0.91597, 0.79709}, {0.36957, 0.22090}, {0.37744, 0.28023} for lags p=1 to 3. After lag p=1, the partial canonical correlations are insignificant at the 0.05 significance level, indicating that an AR order of p=1 would be an appropriate choice.
Since the results show that the partial autoregressive matrices, partial cross correlation matrices, and partial canonical correlation matrices are all insignificant after lag 1 with respect to two standard errors, a VAR(1) model is appropriate. The parameter estimates of the VAR(1) model are shown in Figure 5.
Figure 5: Parameter Estimates of the VAR(1) Model
Figure 6 plots the residuals obtained from the equation for LPVSP.
Figure 6: Residual Plot of the VAR(1) Model
The preceding residual plot shows that prediction errors from the VAR(1) model are all within two standard errors. This also provides some support for choosing the first order VAR model.
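A residual plot of this kind can be sketched as follows. This is not necessarily how Figure 6 was produced. It assumes that the OUTPUT statement's OUT= data set stores the residuals of the first dependent variable as RES1, and the data set name var1_out is hypothetical. The DIAGNOSE print option requests additional residual diagnostics.

```sas
/* Sketch: refit the VAR(1) model and save residuals (var1_out is a hypothetical name) */
proc varmax data=tourism;
   id year interval=year;
   model lpvsp lppdi lrcsp / p=1 print=(diagnose);
   output out=var1_out;
run;

/* Plot the residuals of the LPVSP equation; rows without a residual are dropped */
proc gplot data=var1_out(where=(res1 ne .));
   plot res1*year / vref=0;
   symbol c=blue i=needle v=none;
   title 'Residuals of the LPVSP Equation, VAR(1) Model';
run;
quit;
```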
You can also use the CAUSAL statement in the VARMAX procedure to perform a Granger-Causality test on the variables of interest. The GROUP1=(variables) and GROUP2=(variables) options specify the variables to be tested. The null hypothesis of the Granger-Causality test is that GROUP1 is influenced only by itself, and not by GROUP2. If the test fails to reject the null hypothesis, the variables in GROUP1 may be treated as independent (exogenous) variables. The Granger-Causality test results are shown in Figure 7.
proc varmax data=tourism;
   id year interval=year;
   model lpvsp lppdi lrcsp / p=1;
   /* Test for Causality */
   causal group1=(lrcsp) group2=(lpvsp lppdi);
run;
Figure 7: Granger-Causality Test Results
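For this VAR(1) model, the null hypothesis tested by the CAUSAL statement above can be written as a restriction on the LRCSP equation. The coefficient names φ_31 and φ_32 (the effects of lagged LPVSP and lagged LPPDI on LRCSP) are illustrative labels assumed here, not notation from the source.

```latex
LRCSP_t = c_3 + \phi_{31}\,LPVSP_{t-1} + \phi_{32}\,LPPDI_{t-1}
          + \phi_{33}\,LRCSP_{t-1} + \epsilon_{3t},
\qquad
H_0:\ \phi_{31} = \phi_{32} = 0
```

Under the null hypothesis, lagged visits and lagged income carry no information about relative living costs beyond what lagged living costs already provide.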
The Granger-Causality test statistic indicates that LRCSP is not influenced by LPVSP and LPPDI. Hence, LRCSP should be treated as an exogenous variable, and the following VARX(1,1) model is appropriate.
where Y_t=(LPVSP_t, LPPDI_t) and X_t=LRCSP_t. The VARX(1,1) model can then be expressed as
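A sketch of the VARX(1,1) system in matrix form and written out equation by equation is given below. The symbols B_0, B_1, c_i, and φ_ij are assumptions made for illustration. The β subscripts are assumed to index the lag first and the equation second, which matches the XL(0,1,1) and XL(1,2,1) parameters restricted later in this example.

```latex
Y_t = C + \Phi\,Y_{t-1} + B_0\,X_t + B_1\,X_{t-1} + \epsilon_t

\begin{aligned}
LPVSP_t &= c_1 + \phi_{11}\,LPVSP_{t-1} + \phi_{12}\,LPPDI_{t-1}
           + \beta_{01}\,LRCSP_t + \beta_{11}\,LRCSP_{t-1} + \epsilon_{1t} \\
LPPDI_t &= c_2 + \phi_{21}\,LPVSP_{t-1} + \phi_{22}\,LPPDI_{t-1}
           + \beta_{02}\,LRCSP_t + \beta_{12}\,LRCSP_{t-1} + \epsilon_{2t}
\end{aligned}
```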
You can use the MODEL statement in the VARMAX procedure to estimate the VARX(1,1) model. Specify the dependent variables first, then an equal sign, and then the independent variables. Use the P= option to specify the order of the VAR process and the XLAG= option to specify the lag order of the exogenous variables. The VARMAX parameter estimates are shown in Figure 8.
/* Fit the VARX(1,1) model */
proc varmax data=tourism;
   id year interval=year;
   model lpvsp lppdi = lrcsp / p=1 xlag=1;
run;
Figure 8: Parameter Estimates of the VARX(1,1) Model
The parameter estimates show that 𝛽01 and 𝛽12 are not significant at the 10% significance level; hence, these two parameters can be restricted to zero. The residual plot in Figure 9 shows that prediction errors from the VARX(1,1) model are all within two standard errors.
Figure 9: Residual Diagnostic for the VARX(1,1) Model
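Before restricting the two coefficients to zero, it can be useful to test them jointly rather than one at a time. The sketch below uses the TEST statement of the VARMAX procedure for a joint Wald test. It is an addition for illustration, not part of the original analysis, and it assumes the XL(lag, equation, variable) parameter naming used elsewhere in this example.

```sas
/* Sketch: joint Wald test that both exogenous coefficients are zero */
proc varmax data=tourism;
   id year interval=year;
   model lpvsp lppdi = lrcsp / p=1 xlag=1;
   test XL(0,1,1)=0, XL(1,2,1)=0;
run;
```

A large p-value would be consistent with imposing both zero restrictions at once.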
You can use the RESTRICT statement in the VARMAX procedure to restrict parameters in the model. To produce forecasts from the estimated model, use the OUTPUT statement and specify the number of periods to forecast with the LEAD= option. The restricted VARX(1,1) model parameter estimates are shown in Figure 10.
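/* Fit the VARX(1,1) model with parameter restriction */
proc varmax data=tourism;
   id year interval=year;
   model lpvsp lppdi = lrcsp / p=1 xlag=1;
   /* Multiple restrictions are separated by commas */
   restrict XL(0,1,1)=0, XL(1,2,1)=0;
   /* Save forecasts for six years beyond the end of the data */
   output out=forecasts lead=6;
run;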
Figure 10: Restricted Parameter Estimates
The output also shows the six-year-ahead forecasts of the log of the number of holidays in Spain taken by U.K. residents (LPVSP) and the log of U.K. real personal disposable income (LPPDI), together with their corresponding 95% confidence intervals. These results suggest that on average, U.K. residents are expected to spend about seven days in Spain in 1995. Note that the numbers in the Forecast column in Figure 11 are in logarithm form.
Figure 11: Six-Year Forecast
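Because the forecasts are on the log scale, they can be exponentiated to return to the original units. The sketch below assumes the OUT= data set named forecasts stores the forecast and its confidence limits for the first dependent variable as FOR1, LCI1, and UCI1. The data set name forecast_levels and the new variable names are hypothetical. Exponentiating a log-scale forecast is a simple back-transformation without any bias adjustment.

```sas
/* Sketch: convert log-scale forecasts of LPVSP back to the original scale */
data forecast_levels;
   set forecasts;
   vsp_hat = exp(for1);   /* point forecast */
   vsp_lo  = exp(lci1);   /* lower 95% limit */
   vsp_hi  = exp(uci1);   /* upper 95% limit */
run;

proc print data=forecast_levels;
   var year lpvsp for1 vsp_hat vsp_lo vsp_hi;
run;
```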
Figure 12 plots the forecasts of the log of the number of holidays in Spain taken by U.K. residents and their 95% confidence intervals. The plot shows that future holiday visits to Spain by U.K. residents are expected to increase over the next several years.
Figure 12: Forecasted Log Number of Holidays in Spain
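A plot along these lines can be sketched from the forecasts data set. The code below is not necessarily how Figure 12 was produced. It assumes the OUT= data set contains the actual series LPVSP together with FOR1, LCI1, and UCI1 for the first dependent variable.

```sas
/* Sketch: overlay actual values, forecasts, and 95% limits for LPVSP */
proc gplot data=forecasts;
   plot lpvsp*year=1 for1*year=2 lci1*year=3 uci1*year=3 / overlay;
   symbol1 c=black i=none v=star;       /* actual values */
   symbol2 c=blue  i=join v=none;       /* forecasts */
   symbol3 c=red   i=join v=none l=2;   /* confidence limits, dashed */
   title 'Forecasted Log Number of Holidays in Spain';
run;
quit;
```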
SAS Institute Inc. (1999), SAS/ETS User's Guide, Version 8, Cary, NC: SAS Institute Inc.
SAS Institute Inc. (2004), SAS/ETS User's Guide, Version 9, Cary, NC: SAS Institute Inc.
Song, H. and Witt, S. F. (2000), Tourism Demand Modeling and Forecasting: Modern Econometric Approaches, Oxford, U.K.: Elsevier Science Ltd.