/****************************************************************/ /* S A S S A M P L E L I B R A R Y */ /* */ /* NAME: LOGIEX14 */ /* TITLE: Example 14 for PROC LOGISTIC */ /* PRODUCT: STAT */ /* SYSTEM: ALL */ /* KEYS: logistic regression analysis, */ /* binomial response data, */ /* CLOGLOG link */ /* PROCS: LOGISTIC */ /* DATA: */ /* */ /* SUPPORT: Bob Derr */ /* REF: SAS/STAT User's Guide, PROC LOGISTIC chapter */ /* MISC: */ /* */ /****************************************************************/ /***************************************************************** Example 14. Complementary Log-Log Model for Interval-Censored Survival Times *****************************************************************/ /* Often survival times are not observed more precisely than the interval (for instance, a day) within which the event occurred. Survival data of this form are known as grouped or interval- censored data. A discrete analogue of the continuous proportional hazards model (Prentice and Gloeckler 1978; Allison 1982) is used to investigate the relationship between these survival times and a set of explanatory variables. As a example of this method. consider a study of the effect of insecticide on flour beetles. Four different concentrations of an insecticide were sprayed on separate groups of flour beetles and the numbers of male and female flour beetles dying in successive intervals are recorded. The data are saved in data set BEETLES below. This data set contains four variables: TIME, SEX, CONC, and FREQ. TIME represents the interval death time; for example, TIME=2 is the interval between day 1 and day 2. Insects surviving the duration (13 days) of the experiment are given a TIME value of 14. The variable SEX represents the sex of the insect (1=male, 2=female), CONC represents the concentration of the insecticide (mg/cm^2), and FREQ represents the frequency of the observations. To use PROC LOGISTIC with the grouped survival data, you must expand the data so that each beetle has a separate record for each day of survival. A beetle that died in the third day (time=3) would contribute three observations to the analysis, one for each day it was alive at the beginning of the day. A beetle that survives the 13-day duration of the experiment (time=14) would contribute 13 observations. A new data set DAYS that contains the beetle-day observations is created from the data set BEETLES. In addition to the variables SEX, CONC and FREQ, the data set contains an outcome variable Y and the variable DAY taking the values 1,...,13. Y has a value of 1 if the observation corresponds to the day that the beetle died and has a value of 0 otherwise. PROC LOGISTIC is invoked to fit a complementary log-log model for binary data with response variable Y and explanatory variables DAY, SEX, and CONC. Since the values of Y are coded 0 and 1, specifying the DESCENDING option ensures that the event (y=1) probability is modeled. The DAY variable is specified with the GLM coding, which adds a parameter to the model for each of DAY=1,...,13. The coefficients of these 13 DAY parameters can be used to estimate the baseline survival function. The NOINT option is specified to prevent any redundancy in estimating the coefficients of the DAY effects. The Newton-Raphson algorithm is used for the maximum likelihood estimation of the parameters. Finally, DATA step code is used to compute the survivor curves for male and female flour beetles exposed to the insecticide of concentrations .20 mg/cm2 and .80 mg/cm2. The SGPLOT procedure is used to plot the survival curves. Instead of plotting them as step functions, the PBSPLINE statement is used to smooth the curves. */ title 'Example 14: CLOGLOG Model for Interval-Censored Survival Times'; data Beetles(keep=time sex conc freq); input time m20 f20 m32 f32 m50 f50 m80 f80; conc=.20; freq= m20; sex=1; output; freq= f20; sex=2; output; conc=.32; freq= m32; sex=1; output; freq= f32; sex=2; output; conc=.50; freq= m50; sex=1; output; freq= f50; sex=2; output; conc=.80; freq= m80; sex=1; output; freq= f80; sex=2; output; datalines; 1 3 0 7 1 5 0 4 2 2 11 2 10 5 8 4 10 7 3 10 4 11 11 11 6 8 15 4 7 8 16 10 15 6 14 9 5 4 9 3 5 4 3 8 3 6 3 3 2 1 2 1 2 4 7 2 0 1 0 1 1 1 1 8 1 0 0 1 1 4 0 1 9 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 1 1 11 0 0 0 0 1 1 0 0 12 1 0 0 0 0 1 0 0 13 1 0 0 0 0 1 0 0 14 101 126 19 47 7 17 2 4 ; data Days; set Beetles; do day=1 to time; if (day < 14) then do; y= (day=time); output; end; end; run; proc logistic data=Days outest=est1; class day / param=glm; model y(event='1')= day sex conc / noint link=cloglog technique=newton; freq freq; run; data one (keep=day survival element s_m20 s_f20 s_m80 s_f80); array dd day1-day13; array sc[4] m20 f20 m80 f80; array s_sc[4] s_m20 s_f20 s_m80 s_f80 (1 1 1 1); set est1; m20= exp(sex + .20 * conc); f20= exp(2 * sex + .20 * conc); m80= exp(sex + .80 * conc); f80= exp(2 * sex + .80 * conc); survival=1; day=0; output; do over dd; element= exp(-exp(dd)); survival= survival * element; do i=1 to 4; s_sc[i] = survival ** sc[i]; end; day + 1; output; end; run; %modstyle(name=LogiStyle,parent=htmlblue,markers=circlefilled); ods listing style=LogiStyle; proc sgplot data=one; title 'Flour Beetles Sprayed with Insecticide'; xaxis grid integer; yaxis grid label='Survival Function'; pbspline y=s_m20 x=day / legendlabel = "Male at 0.20 conc." name="pred1"; pbspline y=s_m80 x=day / legendlabel = "Male at 0.80 conc." name="pred2"; pbspline y=s_f20 x=day / legendlabel = "Female at 0.20 conc." name="pred3"; pbspline y=s_f80 x=day / legendlabel = "Female at 0.80 conc." name="pred4"; discretelegend "pred1" "pred2" "pred3" "pred4" / across=2; run; ods listing close; ods listing;