bcg {mgcv} R Documentation

Censored Box-Cox Gaussian family

Description

Family for use with gam or bam, implementing regression for censored data that can be modelled as Gaussian after Box-Cox transformation. If y>0 is the response and

z= \left \{ \begin{array}{ll} (y^\lambda-1)/\lambda & \lambda \ne 0 \\ \log(y) & \lambda =0 \end{array} \right .

with mean \mu and standard deviation w^{-1/2}\exp(\theta), then w^{1/2}(z-\mu)\exp(-\theta) follows an N(0,1) distribution. That is

z \sim N(\mu,e^{2\theta}w^{-1}).

\theta is a single scalar for all observations, and similarly the Box-Cox parameter \lambda is a single scalar. Observations may be left, interval or right censored, or uncensored.
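The transformation and standardisation described above can be sketched in base R as follows. The helper names box_cox, box_cox_inv and std_resid are hypothetical and not part of mgcv, and the values of y and \lambda are made up for illustration.

```r
## Box-Cox transform of a positive response (hypothetical helper, not in mgcv)
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

## Inverse transform, mapping z back to the response scale
box_cox_inv <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}

## Standardised value w^{1/2} (z - mu) exp(-theta), which is N(0,1) under the model
std_resid <- function(z, mu, theta, w = 1) sqrt(w) * (z - mu) * exp(-theta)

y <- c(0.5, 1, 2, 4)                      ## made-up positive responses
z <- box_cox(y, lambda = 0.5)             ## transformed scale
y_back <- box_cox_inv(z, lambda = 0.5)    ## recovers y up to rounding error
```

Note that y = 1 always maps to z = 0, and values of y below 1 map to negative z, whatever the value of \lambda.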

Note that the regression model here specifies the mean of the Box-Cox transformed response, not the mean of the response itself: this is rather different to the usual GLM approach.
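To make the distinction concrete, consider the \lambda = 0 case, where z = \log(y). The base-R sketch below uses assumed values of \mu and the standard deviation (chosen only for illustration) and shows that back-transforming the modelled mean gives exp(\mu), which is the median of y rather than its mean.

```r
set.seed(1)
mu <- 1; sigma <- 0.8            ## assumed values, for illustration only
z <- rnorm(1e5, mu, sigma)       ## transformed response when lambda = 0
y <- exp(z)                      ## response on the original scale

back_mean <- exp(mu)                 ## back-transformed mean of z
true_mean <- exp(mu + sigma^2 / 2)   ## log-normal mean of y

## Compare the back-transformed mean with the sample median and mean of y
c(back_transformed = back_mean, median_y = median(y),
  mean_y = mean(y), lognormal_mean = true_mean)
```

The sample median of y sits close to exp(\mu), while the sample mean of y is noticeably larger.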

Usage

bcg(theta=NULL,link="identity")

Arguments

theta

A 2-vector containing the Box-Cox parameter \lambda and the log standard deviation parameter \theta. If supplied and the second element is positive, the elements are taken as fixed values of \lambda and \exp(\theta) respectively. If supplied and the second element is negative, the elements are taken as the initial value of \lambda and the negative of the initial value of \exp(\theta) respectively.

link

The link function: "identity", "log" or "sqrt".
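As a rough illustration of what the three link choices imply, the base-R sketch below uses stats::make.link to map a linear predictor value to \mu, the mean of the transformed response. Whether bcg constructs its links this way internally is an assumption not confirmed here; make.link is used only to show the mappings, and the value of eta is made up.

```r
eta <- 2  ## an assumed linear predictor value

## Inverse links: linear predictor -> mu (mean of the Box-Cox transformed response)
mu_identity <- make.link("identity")$linkinv(eta)  ## mu = eta
mu_log      <- make.link("log")$linkinv(eta)       ## mu = exp(eta)
mu_sqrt     <- make.link("sqrt")$linkinv(eta)      ## mu = eta^2

c(identity = mu_identity, log = mu_log, sqrt = mu_sqrt)
```

The "log" and "sqrt" links keep \mu non-negative, whereas the transformed response is negative whenever y < 1, so these links are likely to suit only data for which a non-negative mean on the transformed scale is reasonable.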

Details

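If the family is used with a vector response, then it is assumed that there is no censoring. If there is censoring then the response should be supplied as a two column matrix. The first column is always numeric. Entries in the second column are as follows.

If an entry is identical to the corresponding first column entry, then the observation is uncensored.

If an entry is finite, positive and different from the first column entry, then the observation is interval censored, with the first column entry as the lower limit and the second column entry as the upper limit.

If an entry is 0, then the observation is left censored at the first column value.

If an entry is Inf, then the observation is right censored at the first column value.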

Any mixture of censored and uncensored data is allowed, but be aware that data consisting only of right and/or left censored data contain very little information.
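A small base-R sketch of assembling such a two column response, using the coding that appears in the Examples (a copy of the first column for uncensored values, a larger finite value for interval censoring, 0 for left censoring and Inf for right censoring). The helper censor_type and the data are hypothetical, for illustration only.

```r
y <- c(1.2, 0.7, 3.1, 2.4, 5.0)   ## made-up positive observations

yb <- cbind(y, y)      ## second column equal to first: uncensored
yb[2, 2] <- 1.5        ## interval censored: y known to lie between 0.7 and 1.5
yb[3, 2] <- 0          ## left censored at 3.1
yb[4, 2] <- Inf        ## right censored at 2.4

## Hypothetical helper labelling each row by its censoring type
censor_type <- function(yb) {
  ifelse(yb[, 2] == yb[, 1], "uncensored",
    ifelse(yb[, 2] == 0, "left",
      ifelse(is.infinite(yb[, 2]), "right", "interval")))
}

table(censor_type(yb))
```

Tabulating the censoring types before fitting is a quick check that the data are not made up solely of left and right censored observations. The matrix yb would then appear on the left hand side of the model formula, as in the Examples.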

Value

An object of class extended.family.

Author(s)

Simon N. Wood simon.wood@r-project.org

References

Wood, S.N., N. Pya and B. Saefken (2016), Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association 111, 1548-1575 doi:10.1080/01621459.2016.1180986

See Also

cnorm, cpois, clog

Examples

library(mgcv)

set.seed(3) ## Simulate some gamma data
## gamSim supplies the covariates and true function f; its y is overwritten below
dat <- gamSim(1,n=400,dist="normal",scale=1)
dat$f <- dat$f/4 ## true linear predictor 
Ey <- exp(dat$f);scale <- .5 ## mean and GLM scale parameter
dat$y <- rgamma(length(Ey),shape=1/scale,scale=Ey*scale)

## Just Box-Cox no censoring...
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=bcg,data=dat)
summary(b)
plot(b,pages=1,scheme=1)

## try various censoring...
yb <- cbind(dat$y,dat$y) ## second column equal to first: uncensored
ind <- 1:100
yb[ind,2] <- yb[ind,1] + runif(100)*3 ## interval censored (rows 51:100 reset below)
yb[51:100,2] <- 0 ## left censored
yb[101:140,2] <- Inf ## right censored

b <- gam(yb~s(x0)+s(x1)+s(x2)+s(x3),family=bcg,data=dat)
summary(b)
plot(b,pages=1,scheme=1)


[Package mgcv version 1.9-4 Index]