In fitting a Cox model, the phenomenon of monotone likelihood is observed if the likelihood converges to a finite value while at least one parameter diverges (Heinze and Schemper, 2001).
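Let $x_l(t)$ denote the vector of explanatory variables for the $l$th individual at time $t$. Let $t_1 < t_2 < \cdots < t_k$ denote the $k$ distinct, ordered event times. Let $d_j$ denote the multiplicity of failures at $t_j$; that is, $d_j$ is the size of the set $D_j$ of individuals that fail at $t_j$. Let $R_j$ denote the risk set just before $t_j$. Let $\beta = (\beta_1, \ldots, \beta_p)'$ be the vector of regression parameters. The Breslow log partial likelihood is given by
$$l(\beta) = \log L(\beta) = \sum_{j=1}^{k} \left\{ \beta' \sum_{l \in D_j} x_l(t_j) - d_j \log \left[ \sum_{h \in R_j} e^{\beta' x_h(t_j)} \right] \right\}$$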
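Denote
$$S_j^{(a)}(\beta) = \sum_{h \in R_j} e^{\beta' x_h(t_j)} \left[ x_h(t_j) \right]^{\otimes a}, \quad a = 0, 1, 2$$
where $x^{\otimes 0} = 1$, $x^{\otimes 1} = x$, and $x^{\otimes 2} = x x'$.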
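Then the score function is given by
$$
\begin{aligned}
U(\beta) &\equiv (U(\beta_1), \ldots, U(\beta_p))' \\
&= \frac{\partial l(\beta)}{\partial \beta} \\
&= \sum_{j=1}^{k} \left\{ \sum_{l \in D_j} x_l(t_j) - d_j \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right\}
\end{aligned}
$$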
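and the Fisher information matrix is given by
$$
\begin{aligned}
I(\beta) &= -\frac{\partial^2 l(\beta)}{\partial \beta^2} \\
&= \sum_{j=1}^{k} d_j \left\{ \frac{S_j^{(2)}(\beta)}{S_j^{(0)}(\beta)} - \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right] \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right]' \right\}
\end{aligned}
$$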
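
The three quantities above can be computed directly from their definitions. The following is a minimal NumPy sketch, not a reference implementation: the function name `breslow_terms`, the toy data, and the restriction to time-fixed covariates are all assumptions made for illustration. It accumulates the log partial likelihood, the score, and the information over the distinct event times, then compares the analytic score with a central finite difference of the log partial likelihood.

```python
import numpy as np

def breslow_terms(beta, time, event, X):
    """Breslow log partial likelihood, score U(beta), and information I(beta).

    Hypothetical helper; assumes time-fixed covariates, X of shape (n, p),
    and event coded 1 for a failure and 0 for censoring.
    """
    p = X.shape[1]
    loglik, score, info = 0.0, np.zeros(p), np.zeros((p, p))
    for t in np.unique(time[event == 1]):        # distinct event times t_j
        died = (time == t) & (event == 1)        # D_j
        Xr = X[time >= t]                        # covariates in the risk set R_j
        d = died.sum()                           # d_j
        w = np.exp(Xr @ beta)                    # e^{beta' x_h}
        s0 = w.sum()                             # S_j^(0)
        s1 = w @ Xr                              # S_j^(1), a p-vector
        s2 = (Xr * w[:, None]).T @ Xr            # S_j^(2), a p x p matrix
        xbar = s1 / s0
        x_sum = X[died].sum(axis=0)              # sum of x_l over D_j
        loglik += x_sum @ beta - d * np.log(s0)
        score += x_sum - d * xbar
        info += d * (s2 / s0 - np.outer(xbar, xbar))
    return loglik, score, info

# Toy data (made up for illustration), with tied failures at time 2.
time = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 4.0, 5.0, 6.0])
event = np.array([1, 1, 1, 0, 1, 0, 1, 0])
X = np.array([[0.5, 1.0], [1.2, 0.0], [-0.3, 1.0], [0.8, 0.0],
              [-1.0, 1.0], [0.1, 0.0], [0.4, 1.0], [-0.6, 0.0]])
beta = np.array([0.3, -0.2])

loglik, score, info = breslow_terms(beta, time, event, X)

# Central finite difference of the log partial likelihood, one coordinate at a time.
eps = 1e-6
num_score = np.array([
    (breslow_terms(beta + eps * e, time, event, X)[0]
     - breslow_terms(beta - eps * e, time, event, X)[0]) / (2 * eps)
    for e in np.eye(2)
])
print("max |analytic - numeric| score difference:", np.max(np.abs(score - num_score)))
```

The loop is written for readability rather than speed; each pass recomputes the risk-set sums from scratch.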
Heinze (1999) and Heinze and Schemper (2001) applied the idea of Firth (1993) by maximizing the penalized partial likelihood
$$l^*(\beta) = l(\beta) + 0.5 \log \left( |I(\beta)| \right)$$
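The score function $U(\beta)$ is replaced by the modified score function $U^*(\beta) \equiv (U^*(\beta_1), \ldots, U^*(\beta_p))'$, where
$$U^*(\beta_r) = U(\beta_r) + 0.5 \, \mathrm{tr} \left\{ I^{-1}(\beta) \frac{\partial I(\beta)}{\partial \beta_r} \right\}, \quad r = 1, \ldots, p$$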
The Firth estimate is obtained iteratively as
$$\beta^{(s+1)} = \beta^{(s)} + I^{-1}(\beta^{(s)}) \, U^*(\beta^{(s)})$$
The covariance matrix $\hat{V}$ is computed as $I^{-1}(\hat{\beta})$, where $\hat{\beta}$ is the maximum penalized partial likelihood estimate.
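
To see why the penalty yields a finite estimate, it can help to evaluate both criteria on a small separated data set. The sketch below is illustrative only: the data are invented, there is a single binary covariate (so the determinant of the information is the information itself), and the helper name `loglik_and_info` is hypothetical. Every failure occurs in the group with covariate value 1, so the unpenalized Breslow log partial likelihood keeps increasing in the coefficient, while adding half the log of the information produces an interior maximum.

```python
import numpy as np

# Invented data: all failures have x = 1, all censored subjects have x = 0.
time = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
event = np.array([1, 1, 1, 0, 0, 0])
x = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

def loglik_and_info(beta):
    """Breslow log partial likelihood and scalar information for one covariate."""
    loglik, info = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        died = (time == t) & (event == 1)
        xr = x[time >= t]                       # covariate values in the risk set
        w = np.exp(beta * xr)
        s0, s1, s2 = w.sum(), (w * xr).sum(), (w * xr ** 2).sum()
        d = died.sum()
        loglik += beta * x[died].sum() - d * np.log(s0)
        info += d * (s2 / s0 - (s1 / s0) ** 2)
    return loglik, info

grid = np.linspace(0.0, 10.0, 101)
values = [loglik_and_info(b) for b in grid]
plain = np.array([l for l, _ in values])                         # l(beta)
penalized = np.array([l + 0.5 * np.log(i) for l, i in values])   # l*(beta)

print("l(beta) strictly increasing on the grid:", bool(np.all(np.diff(plain) > 0)))
print("grid point maximizing l*(beta):", grid[np.argmax(penalized)])
```

On this toy data the unpenalized curve never turns down over the grid, whereas the penalized curve peaks at an interior grid point, which is the behavior the Firth modification is meant to deliver.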
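The derivatives $\partial I(\beta) / \partial \beta_r$, $r = 1, \ldots, p$, that appear in the modified score function have an explicit form. Denote
$$
\begin{aligned}
x_h(t) &= (x_{h1}(t), \ldots, x_{hp}(t))' \\
Q_{jr}^{(a)}(\beta) &= \sum_{h \in R_j} e^{\beta' x_h(t_j)} \, x_{hr}(t_j) \left[ x_h(t_j) \right]^{\otimes a}, \quad a = 0, 1, 2; \; r = 1, \ldots, p
\end{aligned}
$$
Then
$$
\begin{aligned}
\frac{\partial I(\beta)}{\partial \beta_r} = \sum_{j=1}^{k} d_j \Bigg\{ & \left[ \frac{Q_{jr}^{(2)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(2)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2} \right] \\
& - \left[ \frac{Q_{jr}^{(1)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(1)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2} \right] \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right]' \\
& - \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right] \left[ \frac{Q_{jr}^{(1)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(1)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2} \right]' \Bigg\}, \quad r = 1, \ldots, p
\end{aligned}
$$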
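
Putting the pieces together, the sketch below assembles the modified score from the explicit derivative of the information matrix and iterates the Firth update until the step is negligible. It is a rough illustration under stated assumptions, not the procedure's actual implementation: the names `firth_pieces` and `firth_fit`, the zero starting value, the convergence tolerance, the absence of step-halving, and the toy data are all choices made here, and covariates are taken to be time-fixed.

```python
import numpy as np

def firth_pieces(beta, time, event, X):
    """Return U(beta), I(beta), and dI with dI[r] = dI(beta)/dbeta_r (Breslow ties)."""
    p = X.shape[1]
    score = np.zeros(p)
    info = np.zeros((p, p))
    dinfo = np.zeros((p, p, p))
    for t in np.unique(time[event == 1]):
        died = (time == t) & (event == 1)
        Xr = X[time >= t]                         # risk set R_j
        d = died.sum()
        w = np.exp(Xr @ beta)
        s0 = w.sum()                              # S_j^(0)
        s1 = w @ Xr                               # S_j^(1)
        s2 = (Xr * w[:, None]).T @ Xr             # S_j^(2)
        e = s1 / s0
        score += X[died].sum(axis=0) - d * e
        info += d * (s2 / s0 - np.outer(e, e))
        for r in range(p):
            wr = w * Xr[:, r]                     # e^{beta' x_h} x_hr
            q0 = wr.sum()                         # Q_jr^(0)
            q1 = wr @ Xr                          # Q_jr^(1)
            q2 = (Xr * wr[:, None]).T @ Xr        # Q_jr^(2)
            de = q1 / s0 - q0 * s1 / s0 ** 2      # derivative of S^(1)/S^(0)
            dinfo[r] += d * (q2 / s0 - q0 * s2 / s0 ** 2
                             - np.outer(de, e) - np.outer(e, de))
    return score, info, dinfo

def firth_fit(time, event, X, tol=1e-8, max_iter=100):
    """Iterate beta <- beta + I^{-1}(beta) U*(beta); return estimate and covariance."""
    beta = np.zeros(X.shape[1])                   # assumed starting value
    for _ in range(max_iter):
        score, info, dinfo = firth_pieces(beta, time, event, X)
        info_inv = np.linalg.inv(info)
        u_star = score + 0.5 * np.array([np.trace(info_inv @ d_r) for d_r in dinfo])
        step = info_inv @ u_star
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    cov = np.linalg.inv(firth_pieces(beta, time, event, X)[1])
    return beta, cov

# Invented separated data: every failure has x = 1, so the ordinary
# maximum partial likelihood estimate does not exist.
time = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
event = np.array([1, 1, 1, 0, 0, 0])
X = np.array([[1.0], [1.0], [1.0], [0.0], [0.0], [0.0]])

beta_hat, cov = firth_fit(time, event, X)
print("Firth estimate:", beta_hat, "standard error:", np.sqrt(np.diag(cov)))
```

A production routine would typically add safeguards such as step-halving when the penalized likelihood fails to increase; those are omitted here to keep the correspondence with the formulas visible.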