The COPULA Procedure

Canonical Maximum Likelihood Estimation (CMLE)

The canonical maximum likelihood estimation (CMLE) method assumes that the sample data $\pmb x_ i = ( x_{i1}, x_{i2}, \ldots , x_{im} )^\top $, $i=1,\ldots ,n$, have been transformed into uniform variates $\boldsymbol {\hat{u}}_ i = (\hat{u}_{i1}, \ldots , \hat{u}_{im})^\top $, $i=1,\ldots ,n$. One commonly used transformation is a nonparametric estimate of the CDF of each marginal distribution, which is closely related to the empirical CDF:

\[  \hat{u}_{i,j}= \hat{F}_{j,n}(x_{i,j})  \]

where

\[  \hat{F}_{j,n}(x) = \frac{1}{n+1}\sum _{i=1}^ n I_{[x_{i,j}\le x]}  \]

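The transformed data $ \hat{u}_{i,j}$ are used as if they had uniform marginal distributions; hence, they are called pseudo-samples. The function $\hat{F}_{j,n}$ differs from the standard empirical CDF only in its scaling factor, which is $1/(n+1)$ rather than $1/n$. This scaling ensures that the transformed data lie strictly inside the unit interval and never on the boundary of $[0,1]$, where many copula densities are unbounded or undefined. Because the sum in $\hat{F}_{j,n}(x_{i,j})$ counts the observations of the $j$th variable that do not exceed $x_{i,j}$, it follows that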

\[  \hat{u}_{i,j} = \frac{1}{n+1} \textrm{rank}(x_{i,j})  \]

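where $\textrm{rank}(x_{i,j})$ is the rank of $x_{i,j}$ among $x_{1,j},\ldots ,x_{n,j}$ in increasing order. If ties occur, the sum in $\hat{F}_{j,n}$ assigns each tied value the largest rank in its group.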
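The rank-based transformation above can be sketched in a few lines of Python. This is only an illustration of the formula, not the implementation used by the procedure; the function name `pseudo_observations` and the list-of-rows data layout are assumptions made for this example.

```python
def pseudo_observations(x):
    """Transform an n-by-m sample (list of rows) into pseudo-samples.

    Hypothetical helper for illustration: each value is replaced by
    (number of observations of the same variable that are <= it) / (n + 1).
    """
    n = len(x)
    m = len(x[0])
    u = [[0.0] * m for _ in range(n)]
    for j in range(m):
        column = [row[j] for row in x]
        for i in range(n):
            # Counting "<=" reproduces the indicator sum in F-hat, so tied
            # values all receive the largest rank in their group.
            count = sum(1 for value in column if value <= column[i])
            u[i][j] = count / (n + 1)
    return u


sample = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]
print(pseudo_observations(sample))
```

Because the divisor is $n+1$, the largest observation of each variable maps to $n/(n+1)$ rather than 1, so every pseudo-sample value stays strictly inside the unit interval. The double loop costs $O(n^2)$ per variable; sorting each column first would be the usual choice for large samples.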

Let $c(u_1, u_2, {\ldots }, u_ m; {\theta })$ be the density function of a copula $C(u_1, u_2,{\ldots }, u_ m; {\theta })$, and let $\theta $ be the parameter vector to be estimated. The estimate $\hat{\theta }$ maximizes the log likelihood of the pseudo-samples over the parameter space $\Theta $:

\[  \hat{\theta } = \arg \max _{{\theta }\in {\Theta }} \sum _{i=1}^{n} \log c(\hat{u}_{i1}, {\ldots }, \hat{u}_{im}; {\theta })  \]
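As a concrete instance of this maximization, the following Python sketch estimates the single parameter of a bivariate Clayton copula from pseudo-samples. Several things here are assumptions chosen for illustration rather than anything the procedure prescribes: the Clayton family itself, the function names, the search interval $[0.001, 20]$, and the use of a golden-section search, which presumes that the log likelihood has a single maximum on that interval.

```python
import math


def clayton_log_density(u, v, theta):
    """Log density of the bivariate Clayton copula for theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (math.log1p(theta)
            - (1.0 + theta) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(s))


def log_likelihood(theta, pseudo):
    """Sum of log copula densities over pseudo-sample pairs (u, v)."""
    return sum(clayton_log_density(u, v, theta) for u, v in pseudo)


def cmle_clayton(pseudo, lower=1e-3, upper=20.0, tol=1e-6):
    """Maximize the log likelihood over [lower, upper] by golden-section search.

    Hypothetical helper: assumes the objective is unimodal on the interval.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc = log_likelihood(c, pseudo)
    fd = log_likelihood(d, pseudo)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = log_likelihood(c, pseudo)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = log_likelihood(d, pseudo)
    return (a + b) / 2.0
```

The input `pseudo` is a list of pairs $(\hat{u}_{i1}, \hat{u}_{i2})$ that lie strictly inside the unit square, which the $1/(n+1)$ scaling guarantees. For copulas with several parameters, a general-purpose multivariate optimizer would replace the one-dimensional search, but the objective function keeps the same form.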