Getting Started: COPULA Procedure

The following example illustrates the use of PROC COPULA. The data used are daily returns on several major stocks. The main purpose of this example is to estimate the joint distribution of stock returns and then simulate from this distribution a new sample of specified size.

Figure 10.1 shows the first 10 observations of the daily stock return data set.

Figure 10.1 First 10 Observations of Daily Returns
Obs date ret_msft ret_ko ret_ibm ret_duk ret_bp
1 01/03/2008 0.004182 0.010367 0.002002 0.003503 0.019114
2 01/04/2008 -0.027960 0.001913 -0.035861 -0.000582 -0.014536
3 01/07/2008 0.006732 0.023607 -0.010671 0.025611 0.017922
4 01/08/2008 -0.033435 0.004239 -0.024610 -0.002838 -0.016049
5 01/09/2008 0.029560 0.026680 0.007301 0.010814 -0.027078
6 01/10/2008 -0.003054 0.004441 0.016414 -0.001689 -0.004395
7 01/11/2008 -0.012255 -0.027346 -0.022546 -0.012408 -0.018473
8 01/14/2008 0.013958 0.008418 0.053857 0.003427 0.001166
9 01/15/2008 -0.011318 -0.010851 -0.010689 -0.017075 -0.040925
10 01/16/2008 -0.022587 -0.015021 -0.001955 0.002316 -0.021336
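
A listing like the one in Figure 10.1 can be produced with PROC PRINT. The following is a minimal sketch, assuming the returns are stored in a data set named Returns in the WORK library (the name used in the later PROC COPULA statements):

```sas
/* Sketch: list the first 10 observations of the returns data.     */
/* Assumes WORK.RETURNS already exists with the columns shown in   */
/* Figure 10.1; OBS=10 limits the listing to the first 10 rows.    */
proc print data = returns(obs = 10);
run;
```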

The following statements fit a normal copula to the returns data (with the FIT statement) and, through the OUTCOPULA= option, create a new SAS data set named Estimates that contains the parameter estimates of the model. The VAR statement specifies the list of variables, which in this case are the daily returns of five large company stocks.

/* Copula estimation */
proc copula data = returns; 
   var ret_ibm ret_msft ret_bp ret_ko ret_duk;
   fit normal / outcopula=estimates;
run;

The first table in Figure 10.2 shows some general information about the copula fitting procedure: the number of observations, the name of the input data set, and the type of model. The second table shows the estimated correlation matrix.

Figure 10.2 Copula Estimation: Fit Summary and Correlation Matrix
The COPULA Procedure

Model Fit Summary
Number of Observations 603
Data Set WORK.RETURNS
Model Normal

Correlation Matrix
  ret_ibm ret_msft ret_bp ret_ko ret_duk
ret_ibm 1.000000000 0.623161741 0.529368606 0.472457515 0.490165022
ret_msft 0.623161741 1.000000000 0.522909718 0.501507922 0.456725646
ret_bp 0.529368606 0.522909718 1.000000000 0.398018155 0.437770169
ret_ko 0.472457515 0.501507922 0.398018155 1.000000000 0.528328599
ret_duk 0.490165022 0.456725646 0.437770169 0.528328599 1.000000000

Next, the following statements restrict the data set to only those columns that contain correlation parameter estimates.

/* keep only correlation estimates */
data estimates;
   set estimates;
   keep ret_ibm ret_msft ret_bp ret_ko ret_duk;
run;
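
To confirm that only the correlation columns remain after the KEEP statement, the data set can be printed. This is a sketch rather than part of the original example; the expectation that the values match the correlation matrix in Figure 10.2 is an assumption to check against your own output:

```sas
/* Sketch: inspect the subsetted estimates before passing them to COR=. */
/* NOOBS suppresses the observation-number column in the listing.      */
proc print data = estimates noobs;
run;
```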

Then, in the following statements, the DEFINE statement specifies a normal copula named COP, and the COR= option specifies that the data set Estimates be used as the source for the model parameters. The NDRAWS=500 option in the SIMULATE statement generates 500 observations from the normal copula. The OUTUNIFORM= option specifies the name of the SAS data set to contain the simulated sample with uniform marginal distributions. Note that this syntax does not require the DATA= option.

/* Copula simulation of uniforms */
proc copula;
   var ret_ibm ret_msft ret_bp ret_ko ret_duk;
   define cop normal (cor = estimates);
   simulate cop / ndraws     = 500
                  outuniform = simulated_uniforms 
                  plots=(data=uniform matrix);
run;

The simulated data are contained in the new SAS data set, Simulated_Uniforms. A scatter plot matrix of the uniform marginals contained in the data set is shown in Figure 10.3.

Figure 10.3 Simulated Data, Uniform Marginals
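
In addition to the scatter plot matrix, a quick numerical check of the simulated uniforms is possible with PROC MEANS. The following is a sketch, assuming the variables in Simulated_Uniforms carry the names given in the VAR statement. For a uniform distribution on (0, 1), each variable should lie between 0 and 1, with a mean near 0.5 and a standard deviation near 0.289, up to sampling noise:

```sas
/* Sketch: summary statistics for the simulated uniform marginals.   */
/* Variable names are assumed to match the VAR statement above.      */
proc means data = simulated_uniforms n mean std min max;
   var ret_ibm ret_msft ret_bp ret_ko ret_duk;
run;
```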

The preceding sequence of PROC COPULA usage (first fit, then simulate given the estimated parameters) is legitimate, but it has a limitation: the second PROC COPULA call does not generate the sample according to the empirical distribution of the raw data. It generates only series whose marginal distributions are uniform.

In the following statements, the FIT statement fits a t copula to the returns data, and in the same PROC COPULA call the SIMULATE statement simulates a sample according to the empirical marginal distributions:

/* Copula estimation and simulation of returns */
proc copula data = returns; 
   var ret_ibm ret_msft ret_bp ret_ko ret_duk;
   fit T;
   simulate / ndraws = 1000
              out    = simulated_returns;
run;

The output of these statements is similar in structure to the output displayed in Figure 10.2, with the addition of parameter estimates and inference statistics that are specific to the copula model, as shown in Figure 10.4. For a t copula, the degrees of freedom are displayed (as in Figure 10.4); for Archimedean copulas, the parameter "theta" is displayed; and for a normal copula, this table is not printed.

Figure 10.4 Copula Estimation: Specific Parameter Estimates
The COPULA Procedure

Parameter Estimates
Parameter Estimate Standard Error t Value Approx Pr > |t|
DF 3.659320 0.320729 11.41 <.0001

The simulated data are contained in the new SAS data set, Simulated_Returns, which holds the 1,000 draws requested by the NDRAWS= option.
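
One way to judge whether the simulated sample reproduces the dependence in the original data is to compare rank correlations, which depend only on the copula and not on the marginal distributions. The following is a sketch, assuming that Simulated_Returns uses the same variable names as the VAR statement. The two Spearman correlation matrices should be broadly similar, although they will not match exactly because of simulation noise and because the t copula is only a model of the data:

```sas
/* Sketch: Spearman rank correlations of the original returns.        */
/* NOSIMPLE suppresses the table of simple descriptive statistics.    */
proc corr data = returns spearman nosimple;
   var ret_ibm ret_msft ret_bp ret_ko ret_duk;
run;

/* Sketch: the same statistics for the simulated returns.             */
/* Variable names in SIMULATED_RETURNS are assumed, not confirmed.    */
proc corr data = simulated_returns spearman nosimple;
   var ret_ibm ret_msft ret_bp ret_ko ret_duk;
run;
```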