GAMLSS: Generalized Additive Models
URL [Link]
NeedsCompilation yes
Author Mikis Stasinopoulos [aut, cre, cph],
Bob Rigby [aut],
Vlasios Voudouris [ctb],
Calliope Akantziliotou [ctb],
Marco Enea [ctb],
Daniil Kiose [ctb]
Repository CRAN
Date/Publication 2017-05-25 [Link] UTC
R topics documented:
gamlss-package
acfResid
[Link]
bfp
calibration
centiles
[Link]
[Link]
[Link]
[Link]
cs
[Link]
dtop
edf
[Link]
fitDist
[Link]
fittedPlot
[Link]
gamlss
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
gamlssML
gamlssVGD
[Link]
getPEF
getSmo
[Link]
histDist
histSmo
IC
lms
lo
loglogSurv
lpred
[Link]
[Link]
[Link]
pcat
[Link]
[Link]
[Link]
plot2way
polyS
[Link]
[Link]
[Link]
[Link]
ps
[Link]
quantSheets
random
refit
[Link]
ri
[Link]
Rsq
rvcov
stepGAIC
[Link]
[Link]
[Link]
[Link]
wp
[Link]
Index
Description
This is a collection of functions to fit Generalized Additive Models for Location Scale and Shape (GAMLSS)
and to handle gamlss objects.
GAMLSS were introduced by Rigby and Stasinopoulos (2005). GAMLSS is a general framework
for univariate regression type statistical problems using new ways of dealing with overdispersion,
skewness and kurtosis in the response variable. In GAMLSS the exponential family distribution
assumption used in the Generalized Linear Model (GLM) and the Generalized Additive Model (GAM) (see
Nelder and Wedderburn, 1972 and Hastie and Tibshirani, 1990, respectively) is relaxed and replaced
by a very general distribution family including highly skewed and kurtotic discrete and continuous
distributions. The systematic part of the model is expanded to allow modelling not only the mean
(or location) but also other parameters of the distribution of the response variable as linear parametric,
nonlinear parametric or additive non-parametric functions of explanatory variables and/or random
effects terms. Maximum (penalized) likelihood estimation is used to fit the models.
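The idea of modelling more than the mean can be illustrated without the package. The following is a minimal base R sketch, not the gamlss fitting algorithm: it assumes a normal response, an identity link for the mean and a log link for the standard deviation, and fits both linear predictors by maximum likelihood with optim(). The simulated data and the parameter names are illustrative assumptions.

```r
# Sketch: model both the mean and the (log) standard deviation of a normal
# response as linear functions of x, fitted by maximum likelihood.
set.seed(1)
n <- 500
x <- runif(n)
y <- rnorm(n, mean = 1 + 2 * x, sd = exp(-1 + 1.5 * x))  # spread grows with x

# Negative log-likelihood; theta = (mu intercept, mu slope,
#                                   log-sigma intercept, log-sigma slope)
negloglik <- function(theta) {
  mu    <- theta[1] + theta[2] * x       # identity link for the location
  sigma <- exp(theta[3] + theta[4] * x)  # log link keeps the scale positive
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

start <- c(mean(y), 0, log(sd(y)), 0)
fit <- optim(start, negloglik, method = "BFGS")
est <- setNames(fit$par, c("mu0", "mu1", "sigma0", "sigma1"))
print(round(est, 2))
```

In the package itself the same kind of model is specified through separate formulae for each distribution parameter, with additive smoothing terms available in each.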
Details
Package: gamlss
Type: Package
Version: 1.5-0
Date: 2006-12-13
License: GPL (version 2 or later) See file LICENSE
This package allows the user to model the distribution of the response variable using a variety of one,
two, three and four parameter families of distributions. The distributions currently implemented can
be found in [Link]. Other distributions can be easily added. In the current implementation
of GAMLSS several additive terms have been implemented including regression splines, smoothing
splines, penalized splines, varying coefficients, fractional polynomials and random effects. Other
additive terms can be easily added.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby, with contributions from
Calliope Akantziliotou.
Maintainer: Mikis Stasinopoulos <[Link]@[Link]>
References
Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. J. R. Statist. Soc. A.,
135 370-384.
Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
Examples
data(abdom)
mod<-gamlss(y~pb(x),[Link]=~pb(x),family=BCT, data=abdom, method=mixed(1,20))
plot(mod)
rm(mod)
Description
This plot displays the ACF and PACF of the residuals of a gamlss or other fitted model (provided
that they have been standardised appropriately). It is appropriate for time series data.
Usage
acfResid(obj = NULL, resid = NULL)
Arguments
obj a gamlss model or other fitted model to which the resid() function applies
resid if obj is not supplied, the residuals given in this argument are used
Details
The ACF and PACF of the residuals r and of their powers r^2, r^3 and r^4 are plotted.
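What this amounts to can be sketched in base R with acf() and pacf() from the stats package. This is an illustration only, not the code of acfResid(); the simulated AR(1) series stands in for standardised residuals and the object names are assumptions.

```r
# Sketch: ACF and PACF of residuals and of their powers up to the fourth.
set.seed(42)
r <- as.numeric(arima.sim(list(ar = 0.6), n = 300))  # stand-in residuals
r <- (r - mean(r)) / sd(r)                           # standardise

powers <- 1:4
# plot = FALSE returns the estimates; drop it to draw the correlograms
acfs  <- lapply(powers, function(k) acf(r^k,  plot = FALSE))
pacfs <- lapply(powers, function(k) pacf(r^k, plot = FALSE))

# Lag-1 autocorrelation for each power (element 1 of $acf is lag 0)
lag1 <- sapply(acfs, function(a) a$acf[2])
names(lag1) <- paste0("r^", powers)
print(round(lag1, 2))
```

Remaining autocorrelation in r points to a misspecified mean, while autocorrelation in r^2 points to a misspecified variance; the higher powers relate in the same way to skewness and kurtosis.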
Value
The relevant plots are displayed.
Author(s)
Mikis Stasinopoulos, Bob Rigby, Vlasios Voudouris and Majid Djennad
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M., Rigby R.A. and Akantziliotou C. (2006) Instructions on how to use the
GAMLSS package in R. Accompanying documentation in the current GAMLSS help files, (see
also [Link]
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
See Also
acf
Examples
library(datasets)
data(co2)
m1<- gamlss(co2~pb([Link](time(co2)))+factor(cycle(co2)))
acfResid(m1)
Description
This function is not to be used on its own. It is used for backfitting in the GAMLSS fitting
algorithms and it is based on the equivalent function written by Trevor Hastie in the gam() S-plus
implementation (Chambers and Hastie, 1991).
Usage
[Link](x, y, w, s, who, [Link], maxit = 30, tol = 0.001,
trace = FALSE, se = TRUE, ...)
Arguments
x the linear part of the explanatory variables
y the response variable
w the weights
s the matrix containing the smoothers
who the current smoothers
[Link] the data frame used for the smoothers
maxit maximum number of iterations in the backfitting
tol the tolerance level for the backfitting
trace whether to trace the backfitting algorithm
se whether standard errors are required
... for extra arguments
Details
This function should not be used on its own
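For readers unfamiliar with backfitting, the following base R sketch shows the general idea in the sense of Hastie and Tibshirani (1990). It is not the internal GAMLSS function: the function name backfit, the use of smooth.spline() with 5 degrees of freedom and the stopping rule are all assumptions made for illustration. Only the roles of w, maxit and tol mirror the arguments described above.

```r
# Sketch of a generic weighted backfitting loop for an additive model
# y = alpha + f_1(x_1) + ... + f_p(x_p) + error.
backfit <- function(X, y, w = rep(1, length(y)), maxit = 30, tol = 0.001) {
  alpha <- weighted.mean(y, w)
  f <- matrix(0, nrow(X), ncol(X))          # one column per smooth term
  for (it in seq_len(maxit)) {
    f_old <- f
    for (j in seq_len(ncol(X))) {
      # partial residuals: remove everything except the j-th smooth
      partial <- y - alpha - rowSums(f[, -j, drop = FALSE])
      sm <- smooth.spline(X[, j], partial, w = w, df = 5)
      fj <- predict(sm, X[, j])$y
      f[, j] <- fj - weighted.mean(fj, w)   # centre for identifiability
    }
    if (max(abs(f - f_old)) < tol) break    # stop when the smooths settle
  }
  list(alpha = alpha, smooth = f, fitted = alpha + rowSums(f), iterations = it)
}

set.seed(2)
n <- 300
X <- cbind(x1 = runif(n), x2 = runif(n))
truth <- sin(2 * pi * X[, "x1"]) + 4 * (X[, "x2"] - 0.5)^2
y <- truth + rnorm(n, sd = 0.2)
res <- backfit(X, y)
print(res$iterations)
```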
Value
Returns a list with the linear fit plus the smoothers
Author(s)
Mikis Stasinopoulos
References
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Description
The function bfp generates a power polynomial basis matrix which (for given powers) can be used
to fit power polynomials in one x-variable. The function fp takes a vector and returns it with several
attributes. The vector is used in the construction of the model matrix. The function fp() is not used
for fitting the fractional polynomial curves but assigns the attributes to the vector to aid gamlss in
the fitting process. The function doing the fitting is [Link](), which is used in the backfitting
function [Link] (but never used on its own). The (experimental) function pp can be used
to fit power polynomials of the form $a + b_1 x^{p_1} + b_2 x^{p_2}$, where $p_1$ and $p_2$ take arbitrary values rather
than being restricted as in the fp function.
Usage
bfp(x, powers = c(1, 2), shift = NULL, scale = NULL)
fp(x, npoly = 2, shift = NULL, scale = NULL)
pp(x, start = list(), shift = NULL, scale = NULL)
Arguments
x the explanatory variable to be used in functions bfp() or fp(). Note that this is
different from the argument x used in [Link] (a function used in the backfitting
but not directly by the user)
powers a vector containing as elements the powers to which x has to be raised
shift a number for shifting the x-variable. The default value is zero, if x is positive,
or the minimum of the positive difference in x minus the minimum of x
scale a positive number for scaling the x-variable. The default value is
10^(sign(log10(range)))*trunc(abs(log10(range)))
npoly a positive integer indicating how many fractional polynomials should be considered in
the fit. Can take the values 1, 2 or 3, with 2 as the default
start a list containing the starting values for the non-linear maximization to find the
powers. The results from fitting the equivalent fractional polynomials can be
used here
Details
The above functions are an implementation of the fractional polynomials introduced by Royston
and Altman (1994). The three functions involved in the fitting are loosely based on the fractional
polynomials implementation in S-plus written by Gareth Amber in 1999 (unfortunately
the URL link for his work no longer exists). The function bfp generates the right design matrix
for fitting a power polynomial of the type $a + b_1 x^{p_1} + b_2 x^{p_2} + \ldots + b_k x^{p_k}$. For given powers
$p_1, p_2, \ldots, p_k$, supplied as the argument powers in bfp(), the function can be used to fit power
polynomials in the same way as the functions poly() or bs() (of package splines) are used to
fit orthogonal or piecewise polynomials respectively. The function fp(), which works as a
smoother in gamlss, is used to fit the best fractional polynomials within a set of power values. Its
argument npoly determines whether one, two or three fractional polynomials should be used in the
fitting. For a fixed number npoly the algorithm looks for the best fitting fractional polynomials
in the list c(-2, -1, -0.5, 0, 0.5, 1, 2, 3). Note that npoly=3 is rather slow since
it fits all possible 3-way combinations at each backfitting iteration. The function
[Link]() is an internal function of GAMLSS allowing the fractional polynomials to be fitted
in the backfitting cycle of gamlss, and should not be used on its own.
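The search over the candidate powers can be sketched in base R as follows. This is a simplified illustration rather than the package's algorithm: it uses ordinary least squares via lm(), adopts the usual fractional polynomial convention that power 0 stands for log(x), and for two terms considers only pairs of distinct powers. The helper power_basis and the simulated data are assumptions.

```r
# Power basis: one column per power, with power 0 meaning log(x).
# x is assumed to be strictly positive.
power_basis <- function(x, powers) {
  sapply(powers, function(p) if (p == 0) log(x) else x^p)
}

set.seed(7)
x <- runif(200, 1, 10)
y <- 2 + 3 * sqrt(x) + rnorm(200, sd = 0.1)

candidates <- c(-2, -1, -0.5, 0, 0.5, 1, 2, 3)

# One term: try every candidate power, keep the smallest residual sum of squares
rss1  <- sapply(candidates, function(p) deviance(lm(y ~ power_basis(x, p))))
best1 <- candidates[which.min(rss1)]

# Two terms: try every pair of distinct candidate powers
pairs <- t(combn(candidates, 2))
rss2  <- apply(pairs, 1, function(p) deviance(lm(y ~ power_basis(x, p))))
best2 <- pairs[which.min(rss2), ]

print(best1)
print(best2)
```

With eight candidate powers there are 28 distinct pairs and 56 distinct triples, which gives a sense of why three terms cost noticeably more when the search is repeated at every backfitting iteration.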
Value
The function bfp returns a matrix to be used as part of the design matrix in the fitting.
The function fp returns a vector with values zero to be included in the design matrix but with
attributes useful in the fitting of the fractional polynomials algorithm in [Link].
Warning
Since the model constant is included both in the design matrix X and in the backfitting part of the
fractional polynomials, its value is wrongly given in the summary. Its true value is the model
constant minus the constant from the fractional polynomial fitting. What happens if more than
one fractional polynomial is fitted?
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Royston, P. and Altman, D. G., (1994). Regression using fractional polynomials of continuous
covariates: parsimonious parametric modelling (with discussion), Appl. Statist., 43, 429-467.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link]
Examples
data(abdom)
#fits polynomials with power 1 and .5
mod1<-gamlss(y~bfp(x,c(1,0.5)),data=abdom)
# fit the best of one fractional polynomial
m1<-gamlss(y~fp(x,1),data=abdom)
Description
This function can be used when the fitted model centiles do not coincide with the sample centiles.
Usage
calibration(object, xvar, cent = 100 * pnorm((-4:4) * 2/3),
legend = FALSE, fan = FALSE, ...)
Arguments
object a gamlss fitted object
xvar The explanatory variable
cent a vector with elements the % centile values for which the centile curves have to
be evaluated
legend whether legend is required
fan whether to use the fan version of centiles
... other arguments passed on to the centiles() function
Details
The function finds the sample quantiles of the residuals of the fitted model (the z-scores) and uses
them as sample quantiles in the argument cent of the centiles() function. This procedure is
appropriate if the fitted model centiles do not coincide with the sample centiles and when this
failure is the same for all values of the explanatory variable xvar.
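One plausible reading of this procedure is sketched below in base R. It is an assumption about the mechanics, not the package's code: the nominal centiles are mapped to sample quantiles of the z-scores, and those quantiles are converted back to percentages on the standard normal scale. The simulated z-scores, deliberately too spread out, stand in for the residuals of a poorly calibrated model.

```r
# Default nominal centiles of calibration(): equally spaced z-values
cent <- 100 * pnorm((-4:4) * 2 / 3)
print(round(cent, 1))

set.seed(3)
z <- rnorm(2000, sd = 1.3)      # stand-in z-scores with too much spread

# Sample quantiles of the z-scores at the nominal probabilities ...
zq <- quantile(z, probs = cent / 100)
# ... converted to the percentages that would be passed on as cent
cent_adj <- 100 * pnorm(zq)
print(round(cent_adj, 2))
```

Because the stand-in z-scores are over-dispersed, the adjusted percentages are pushed towards 0 and 100 in the tails, so the plotted curves move outwards to cover the intended share of the data.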
Value
A centile plot is produced and the sample centiles below each centile curve are printed (or saved)
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby <[Link]@[Link]>
and Vlasios Voudouris <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
centiles, [Link]
Examples
data(abdom)
m1<-gamlss(y~pb(x), [Link]=~pb(x), family=LO, data=abdom)
calibration(m1, xvar=abdom$x, fan=TRUE)
Description
The function centiles() plots centile curves for distributions belonging to the GAMLSS family
of distributions. The function also tabulates the sample percentages below each centile curve (for
comparison with the model percentages given by the argument cent). The function [Link]()
plots a fan-chart of the centile curves. A restriction of the functions is that they apply to models with
one explanatory variable only.
Usage
centiles(obj, xvar = NULL, cent = c(0.4, 2, 10, 25, 50, 75, 90, 98, 99.6),
legend = TRUE, ylab = "y", xlab = "x", main = NULL,
[Link] = "@", xleg = min(xvar), yleg = max(obj$y),
xlim = range(xvar), ylim = range(obj$y), save = FALSE,
plot = TRUE, points = TRUE, pch = 15, cex = 0.5, col = gray(0.7),
[Link] = 1:length(cent) + 2, [Link] = 1, [Link] = 1, ...)
[Link](obj, xvar = NULL, cent = c(0.4, 2, 10, 25, 50, 75, 90, 98, 99.6),
ylab = "y", xlab = "x", main = NULL, [Link] = "@",
Arguments
obj a fitted gamlss object from fitting a gamlss distribution
xvar the unique explanatory variable
cent a vector with elements the % centile values for which the centile curves have to
be evaluated
legend whether a legend is required in the plot or not, the default is legend=TRUE
ylab the y-variable label
xlab the x-variable label
main the main title here as character. If NULL the default title "centile curves using
NO" (or the relevant distributions name) is shown
[Link] if the [Link] (with default "@") appears in the main title then it is substi-
tuted with the default title.
xleg position of the legend in the x-axis
yleg position of the legend in the y-axis
xlim the limits of the x-axis
ylim the limits of the y-axis
save whether to save the sample percentages or not with default equal to FALSE. In
this case the sample percentages are printed but are not saved
plot whether to plot the centiles. This option is useful for [Link]
pch the character to be used as the default in plotting points see par
cex size of character see par
col plotting colour see par
[Link] Plotting colours for the centile curves
[Link] line type for the centile curves
[Link] The line width for the centile curves
colors the different colour schemes to be used for the fan-chart. The following are
available c("cm","gray", "rainbow", "heat", "terrain", "topo"),
points whether the data points should be plotted, default is TRUE for centiles() and
FALSE for [Link]()
median whether the median should be plotted (only in [Link]())
... for extra arguments
Details
Centiles are calculated using the fitted values in obj and xvar must correspond exactly to the pre-
dictor in obj to plot correctly.
[Link], [Link] and [Link] may be vector arguments and are recycled to
the length cent if necessary.
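The computation behind a centile plot can be sketched in base R. The example assumes a normal response with known mu and sigma at each x, standing in for the fitted values of a model; the package works with whichever distribution was fitted. All object names are illustrative.

```r
set.seed(11)
n <- 1000
x <- sort(runif(n, 10, 40))
mu    <- 5 + 2 * x          # stand-in for fitted mu
sigma <- 1 + 0.1 * x        # stand-in for fitted sigma
y <- rnorm(n, mu, sigma)

cent <- c(0.4, 2, 10, 25, 50, 75, 90, 98, 99.6)

# One column per centile curve, evaluated at every observed x
curves <- sapply(cent, function(p) qnorm(p / 100, mean = mu, sd = sigma))
colnames(curves) <- cent

# Sample percentage of observations lying below each curve
below <- 100 * colMeans(y < curves)
print(round(below, 1))

# Graphical settings shorter than cent are recycled to its length
col_centiles <- rep_len(1:3, length(cent))
```

For a well-fitting model the sample percentages should be close to the nominal values in cent.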
Value
A centile plot is produced and the sample centiles below each centile curve are printed (or saved)
Warning
This function is appropriate only when one continuous explanatory variable is fitted in the model
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
Examples
data(abdom)
h<-gamlss(y~pb(x), [Link]=~pb(x), family=BCT, data=abdom)
# default plot
centiles(h,xvar=abdom$x)
# control of colours and lines
centiles(h, xvar=abdom$x, [Link]=c(2,3,4,5,1,5,4,3,2,1),
[Link]=c(1,1,1,1,2,1,1,1,1))
#Control line types
centiles(h, xvar=abdom$x, [Link]=1, cent=c(.5,2.5,50,97.5,99.5),
[Link]=c(3,2,1,2,3),[Link]=c(1,1,2,1,1))
# control of the main title
centiles(h, xvar=abdom$x, main="Abdominal data \n @")
# the fan-chart
[Link](h,xvar=abdom$x, colors="rainbow")
rm(h)
Description
This function compares centile curves for more than one fitted GAMLSS model. It is based on the
centiles function. The function also tabulates the sample percentages below each centile curve
(for comparison with the model percentages given by the argument cent). A restriction of the
function is that it applies to models with one explanatory variable only.
Usage
[Link](obj, ..., xvar = NULL, cent = c(0.4, 10, 50, 90, 99.6),
legend = TRUE, ylab = "y", xlab = "x", xleg = min(xvar),
yleg = max(obj$y), xlim = range(xvar), ylim = NULL,
[Link] = FALSE, color = TRUE, main = NULL, plot = TRUE)
Arguments
obj a fitted gamlss object from fitting a gamlss continuous distribution
... optionally more fitted GAMLSS model objects
xvar the unique explanatory variable
cent a vector with elements the % centile values for which the centile curves have to
be evaluated
legend whether a legend is required in the plot or not, the default is legend=TRUE
ylab the y-variable label
xlab the x-variable label
xleg position of the legend in the x-axis
yleg position of the legend in the y-axis
xlim the limits of the x-axis
ylim the limits of the y-axis
[Link] whether the data should be plotted, [Link]=FALSE (the default), or not, [Link]=TRUE
color whether the fitted centiles are shown in colour, color=TRUE (the default) or not
color=FALSE
main the main title
plot whether to plot the centiles
Value
Centile plots are produced for the different fitted models and the sample centiles below each centile
curve are printed
Warning
This function is appropriate only when one continuous explanatory variable is fitted in the model
Author(s)
Mikis Stasinopoulos <[Link]@[Link]> and Bob Rigby <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape, (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, centiles, [Link]
Examples
data(abdom)
h1<-gamlss(y~cs(x,df=3), [Link]=~cs(x,1),family=BCT, data=abdom)
h2<-gamlss(y~pb(x), [Link]=~pb(x), family=BCT, data=abdom )
[Link](h1,h2,xvar=abdom$x)
rm(h1,h2)
Description
This function creates predictive centile curves for new x-values given a GAMLSS fitted model.
The function has three options: i) for given new x-values and given percentage centiles it calculates
a matrix containing the centile values for y, ii) for given new x-values and standard normalized
centile values it calculates a matrix containing the centile values for y, iii) for given new x-values
and new y-values it calculates the z-scores. A restriction of the function is that it applies to models
with only one explanatory variable.
Usage
[Link](obj, type = c("centiles", "z-scores", "standard-centiles"),
xname = NULL, xvalues = NULL, power = NULL, yval = NULL,
cent = c(0.4, 2, 10, 25, 50, 75, 90, 98, 99.6),
dev = c(-4, -3, -2, -1, 0, 1, 2, 3, 4),
plot = FALSE, legend = TRUE,
...)
Arguments
obj a fitted gamlss object from fitting a gamlss continuous distribution
type the default, "centiles", gets the centile values given in the option cent.
type="standard-centiles" gets the standard centiles given in the option dev.
type="z-scores" gets the z-scores for given new y and x values
xname the name of the unique explanatory variable (it has to be the same as in the
original fitted model)
xvalues the new values for the explanatory variable where the prediction will take place
power if power transformation is needed (but read the note below)
yval the response values for a given x required for the calculation of "z-scores"
cent a vector with elements the % centile values for which the centile curves have to
be evaluated
dev a vector with elements the standard normalized values for which the centile
curves have to be evaluated in the option type="standard-centiles"
plot whether to plot the "centiles" or the "standard-centiles", the default is plot=FALSE
legend whether a legend is required in the plot or not, the default is legend=TRUE
... for extra arguments
Value
a vector (for option type="z-scores") or a matrix for options type="centiles" or type="standard-centiles"
containing the appropriate values
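The three kinds of output can be illustrated in base R under a simplifying assumption: the fitted model is replaced by a normal distribution whose mu and sigma are known functions of x (mu_fun and sigma_fun below are hypothetical stand-ins for predictions from a fitted model). The package applies the same logic with the fitted distribution.

```r
mu_fun    <- function(x) 5 + 2 * x      # stand-in for predicted mu
sigma_fun <- function(x) 1 + 0.1 * x    # stand-in for predicted sigma
newx <- seq(12, 40, 2)

# i) centiles: quantiles of y at given percentages
cent <- c(0.4, 2, 10, 25, 50, 75, 90, 98, 99.6)
cent_mat <- sapply(cent, function(p)
  qnorm(p / 100, mu_fun(newx), sigma_fun(newx)))
colnames(cent_mat) <- cent

# ii) standard centiles: quantiles of y at given standard normal deviates
dev <- -4:4
std_mat <- sapply(dev, function(d)
  qnorm(pnorm(d), mu_fun(newx), sigma_fun(newx)))
colnames(std_mat) <- dev

# iii) z-scores: new (x, y) pairs mapped to the standard normal scale
obs_x <- c(20, 21.2, 23)
obs_y <- c(47, 45, 55)
z <- qnorm(pnorm(obs_y, mu_fun(obs_x), sigma_fun(obs_x)))
print(round(z, 2))
```

Options i) and ii) return a matrix with one row per new x-value and one column per centile, while option iii) returns a vector with one z-score per new observation.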
Warning
See the example below of how to use the function when a power transformation is used for the x-variable
Note
The power option should be only used if the model
Author(s)
Mikis Stasinopoulos, <[Link]@[Link]>, based on ideas of Elaine Borghie
from the World Health Organization
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, centiles, [Link]
Examples
## bring the data and fit the model
data(abdom)
a<-gamlss(y~pb(x),[Link]=~pb(x), data=abdom, family=BCT)
## plot the centiles
centiles(a,xvar=abdom$x)
##-----------------------------------------------------------------------------
## first use of [Link]()
## to calculate the centiles at new x values
##-----------------------------------------------------------------------------
newx<-seq(12,40,2)
mat <- [Link](a, xname="x", xvalues=newx )
mat
## now plot the centile curves
mat <- [Link](a, xname="x",xvalues=newx, plot=TRUE )
##-----------------------------------------------------------------------------
## second use of [Link]()
## to calculate (normalised) standard-centiles for new x
## values using the fitted model
##-----------------------------------------------------------------------------
newx <- seq(12,40,2)
mat <- [Link](a, xname="x",xvalues=newx, type="standard-centiles" )
mat
## now plot the standard centiles
mat <- [Link](a, xname="x",xvalues=newx, type="standard-centiles",
plot = TRUE )
##-----------------------------------------------------------------------------
## third use of [Link]()
## if we have new x and y values what are their z-scores?
##-----------------------------------------------------------------------------
# create new y and x values and plot them in the previous plot
newx <- c(20,21.2,23,20.9,24.2,24.1,25)
newy <- c(130,121,123,125,140,145,150)
for(i in 1:7) points(newx[i],newy[i],col="blue")
## now calculate their z-scores
znewx <- [Link](a, xname="x",xvalues=newx,yval=newy, type="z-scores" )
znewx
## Not run:
##-----------------------------------------------------------------------------
## What do we do if the x variable is transformed?
##----------------------------------------------------------------------------
## case 1 : transformed x-variable within the formula
##----------------------------------------------------------------------------
## fit model
aa <- gamlss(y~pb(x^0.5),[Link]=~pb(x^0.5), data=abdom, family=BCT)
## centiles works
centiles(aa,xvar=abdom$x, legend = FALSE)
newx<-seq(12,40,2)
Description
This function plots centile curves for separate ranges of the unique explanatory variable x. It is
similar to the centiles function but the range of x is split at user-defined values, [Link],
into r separate ranges. The function also tabulates the sample percentages below each centile curve
for each of the r ranges of x (for comparison with the model percentages given by cent). The model
should have only one explanatory variable.
Usage
[Link](obj, xvar = NULL, [Link] = NULL, [Link] = 4,
cent = c(0.4, 2, 10, 25, 50, 75, 90, 98, 99.6),
legend = FALSE, main = NULL, [Link] = "@",
ylab = "y", xlab = "x", ylim = NULL, overlap = 0,
save = TRUE, plot = TRUE, ...)
Arguments
obj a fitted gamlss object from fitting a gamlss continuous distribution
xvar the unique explanatory variable
[Link] the x-axis cut-off points, e.g. c(20,30). If [Link]=NULL then the [Link]
argument is activated
[Link] if [Link]=NULL this argument gives the number of intervals into which the
x-variable will be split, with default 4
cent a vector with elements the % centile values for which the centile curves are to
be evaluated
legend whether a legend is required in the plots or not, the default is legend=FALSE
main the main title as character. If NULL the default title (showing the intervals) is
shown
[Link] if the [Link] (with default "@") appears in the main title then it is substi-
tuted with the default title.
ylab the y-variable label
xlab the x-variable label
ylim the range of the y-variable axis
overlap how much overlapping in the xvar intervals. Default value is overlap=0 for
non overlapping intervals
save whether to save the sample percentages or not with default equal to TRUE. In
this case the functions produce a matrix giving the sample percentages for each
interval
plot whether to plot the centiles. This option is useful if only the sample statistics
are to be used
... for extra arguments
Value
Centile plots are produced and the sample centiles below each centile curve for each of the r ranges
of x can be saved into a matrix.
Warning
This function is appropriate when only one continuous explanatory variable is fitted in the model
Author(s)
Mikis Stasinopoulos, <[Link]@[Link]>, Bob Rigby <[Link]@[Link]>,
with contributions from Elaine Borghie
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, centiles, [Link]
Examples
data(abdom)
h<-gamlss(y~pb(x), [Link]=~pb(x), family=BCT, data=abdom)
mout <- [Link](h,xvar=abdom$x)
mout
rm(h,mout)
Description
[Link] is the GAMLSS specific method for the generic function coef which extracts model
coefficients from objects returned by modelling functions. ‘coefficients’ is an alias for coef.
Usage
## S3 method for class 'gamlss'
coef(object, what = c("mu", "sigma", "nu", "tau"),
parameter = NULL, ... )
Arguments
object a GAMLSS fitted model
what which parameter coefficient is required, default what="mu"
parameter equivalent to what (more obvious name)
... for extra arguments
Value
Coefficients extracted from the GAMLSS model object.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link], [Link]
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
coef(h)
rm(h)
Description
The functions cs() and scs() use the cubic smoothing splines function [Link]()
to do smoothing. They take a vector and return it with several attributes. The vector is used in the
construction of the model matrix. The functions do not do the smoothing themselves, but assign the
attributes to the vector to aid gamlss in the smoothing. The function doing the smoothing is [Link]().
This function uses the R function [Link]() which is then used by the backfitting function
[Link]() which is based on the original GAM implementation described in Chambers and
Hastie (1992). The function [Link]() differs from the function cs() in that it allows cross
validation of the smoothing parameters, unlike cs() which fixes the effective degrees of freedom,
df. Note that the recommended smoothing function is now the function pb() which allows the
estimation of the smoothing parameters using a local maximum likelihood. The function pb() is
based on the penalised B-splines (P-splines) of Eilers and Marx (1996).
The (experimental) function vc is now defunct. For fitting varying coefficient models (Hastie and
Tibshirani, 1993), use the function pvc().
Usage
cs(x, df = 3, spar = NULL, [Link] = NULL, control = [Link](...), ...)
scs(x, df = NULL, spar = NULL, control = [Link](...), ...)
[Link](cv = FALSE, [Link] = TRUE, nknots = NULL, [Link] = TRUE,
[Link] = 0, penalty = 1.4, [Link] = list(), ...)
Arguments
x the univariate predictor, (or expression, that evaluates to a numeric vector). For
the function vc the x argument is the vector which has its (linear) coefficient
change with r
df the desired equivalent number of degrees of freedom (trace of the smoother matrix
minus two for the constant and linear fit). The real smoothing parameter
(spar below) is found such that df=tr(S)-2, where S is the implicit smoother
matrix. Values for df should be greater than 0, with 0 implying a linear fit.
spar smoothing parameter, typically (but not necessarily) in (0,1]. The coefficient
lambda of the integral of the squared second derivative in the fit (penalised
log likelihood) criterion is a monotone function of ‘spar’, see the details in
[Link].
[Link] This is an option to be used when the degrees of freedom of the fitted gamlss
object are different from the ones given as input in the option df. The default
values used are the ones given by the option [Link] in the R function
[Link]() and they are [Link]=c(-1.5, 2). For very large data
sets, e.g. 10000 observations, the upper limit may have to be increased, for example
to [Link]=c(-1.5, 2.5). Use this option if you have received the warning
’The output df are different from the input, change the [Link]’. [Link]
can take either a vector or a list of length 2, for example [Link]=c(-1.5, 2.5) or
[Link]=list(-1.5, 2.5) would have the same effect.
control control for the function [Link](), see below
cv see the R function [Link]()
[Link] see the R function [Link]()
nknots see the R function [Link]()
[Link] see the R function [Link]()
[Link] see the R function [Link]()
penalty see the R function [Link](), here the default value is 1.4
[Link] see above [Link] or the equivalent argument in the function [Link]
... for extra arguments
Details
Note that cs itself does no smoothing; it simply sets things up for the function gamlss() which in
turn uses the function [Link]() for backfitting which in turn uses [Link]().
Note that the cs() and scs() functions behave differently at their default values, that is, if df and lambda
are not specified. cs(x) by default will use 3 extra degrees of freedom for smoothing for x. scs(x)
by default will estimate lambda (and the degrees of freedom) automatically using generalised cross
validation (GCV). Note that if GCV is used the convergence of the gamlss model can be less stable
compared to a model where the degrees of freedom are fixed. This is especially true for small data sets.
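The df convention described above can be illustrated with a minimal base R sketch. It assumes that the underlying spline smoother behaves like stats::smooth.spline(), and it uses simulated data; the names x, y, extra_df and fit are illustrative only and are not part of gamlss. Asking for 3 extra degrees of freedom in the cs() sense corresponds to a total smoother trace of about 3 + 2 = 5.

```r
# Base R sketch (assumption: smooth.spline() stands in for the spline smoother).
set.seed(1)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)    # simulated data, illustration only

extra_df <- 3                                  # df on top of constant + linear, as in cs(x, df = 3)
fit <- smooth.spline(x, y, df = extra_df + 2)  # total trace tr(S) requested: 5

fit$df        # achieved trace of the smoother matrix, approximately 5
fit$df - 2    # the "extra" df in the cs() convention, approximately 3
fit$spar      # smoothing parameter found to match the requested df
```

The smoothing parameter is searched for within a bounded interval, which is why the requested and achieved df can differ when the interval is too narrow.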
Value
The vector x is returned, endowed with a number of attributes. The vector itself is used in the
construction of the model matrix, while the attributes are needed for the backfitting algorithm
[Link](). Since smoothing splines include linear fits, the linear part will be efficiently
computed with the other parametric linear parts of the model.
Warning
For a user who wishes to compare the gamlss() results with the equivalent gam() results in S-plus:
make sure when using S-plus that the convergence criteria epsilon and [Link] in [Link]()
are decreased sufficiently to ensure proper convergence in S-plus. Also note that the degrees of
freedom are defined on top of the linear term in gamlss, but on top of the constant term in S-plus
(so use an extra degree of freedom in S-plus in order to obtain comparable results to those in
gamlss).
Change the upper limit of spar if you received the warning ’The output df are different from the
input, change the [Link]’.
For large data sets do not use expressions, e.g. cs(x^0.5), inside the gamlss function command
but evaluate the expression first, e.g. nx=x^0.5, and then use cs(nx).
Note
The degrees of freedom df are defined differently from that of the gam() function in S-plus. Here df
are the additional degrees of freedom excluding the constant and the linear part of x. For example
df=4 in gamlss() is equivalent to df=5 in gam() in S-plus
Author(s)
Mikis Stasinopoulos and Bob Rigby (see also the documentation of the [Link]()
for the original authors of the cubic spline function.)
References
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S, Wadsworth & Brooks/Cole.
Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with
comments and rejoinder). Statist. Sci, 11, 89-121.
Hastie, T. J. and Tibshirani, R. J. (1993), Varying coefficient models (with discussion),J. R. Statist.
Soc. B., 55, 757-796.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link], pb, pvc
Examples
# cubic splines example
data(aids)
# fitting a smoothing cubic spline with 7 degrees of freedom
# plus a quarterly effect
aids1<-gamlss(y~cs(x,df=7)+qrt,data=aids,family=PO) #
aids2<-gamlss(y~scs(x,df=5)+qrt,data=aids,family=PO) #
aids3<-gamlss(y~scs(x)+qrt,data=aids,family=PO) # using GCV
with(aids, plot(x,y))
lines(aids$x,fitted(aids1), col="red")
lines(aids$x,fitted(aids3), col="green")
rm(aids1, aids2, aids3)
Description
Returns the global, -2*log(likelihood), or the penalized, -2*log(likelihood)+ penalties, deviance of
a fitted GAMLSS model object.
Usage
## S3 method for class 'gamlss'
deviance(object, what = c("G", "P"), ...)
Arguments
object a GAMLSS fitted model
what put "G" for Global or "P" for Penalized deviance
... for extra arguments
Details
deviance is a generic function which can be used to extract deviances for fitted models. [Link]
is the method for a GAMLSS object.
Value
The value of the global or the penalized deviance extracted from a GAMLSS object.
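As a hedged illustration of the definition above, the global deviance can be recomputed by hand as -2 times the log-likelihood evaluated at the fitted values. The sketch below reuses the Poisson model from the Examples and assumes that the density function dPO() is available once gamlss is loaded; gd_manual is an illustrative name.

```r
library(gamlss)   # assumed to make dPO() and the aids data available
data(aids)
h <- gamlss(y ~ poly(x, 3) + qrt, family = PO, data = aids)

# global deviance by hand: -2 * log-likelihood at the fitted mu
gd_manual <- -2 * sum(dPO(aids$y, mu = fitted(h, "mu"), log = TRUE))

deviance(h, "G")                         # value extracted from the fitted object
all.equal(deviance(h, "G"), gd_manual)   # expected to agree up to numerical tolerance
```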
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link], [Link], [Link]
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
deviance(h)
rm(h)
Description
Provides single or multiple detrended transformed Owen’s plots, Owen (1995), for a GAMLSS fitted
object or any other fitted object which has the method resid(). This is a diagnostic tool for checking
whether the normalised quantile residuals come from a normal distribution or not. This could
be true if the horizontal line is within the confidence intervals.
Usage
dtop(object = NULL, xvar = NULL, resid = NULL,
type = c("Owen", "JW"),
[Link] = c("95", "99"), [Link] = 4,
[Link] = NULL, overlap = 0,
[Link] = TRUE, cex = 1, pch = 21,
line = TRUE, ...)
Arguments
object a GAMLSS fitted object or any other fitted object which has the method resid().
xvar the explanatory variable against which the detrended Owen’s plots will be plotted
resid if the object is not specified the residual vector can be given here
type whether to use Owen (1995) or Jager and Wellner (2004) approximate formula
[Link] 95 (default) or 99 percent confidence interval for the plots
[Link] the number of intervals into which the explanatory variable xvar will be cut
[Link] the x-axis cut off points e.g. c(20,30). If [Link]=NULL then the [Link]
argument is activated
overlap how much overlapping in the xvar intervals. Default value is overlap=0 for non
overlapping intervals
[Link] whether to show the x-variable intervals in the top of the graph, default is
[Link]=TRUE
Details
If the xvar argument is not specified then a single detrended Owen’s plot is used, see Owen (1995).
In this case the plot is a detrended nonparametric likelihood confidence band for a distribution
function. That is, if the horizontal line lies within the confidence band then the normalised residuals
could have come from a Normal distribution and consequently the assumed response variable
distribution is reasonable. If the xvar is specified then we have as many plots as [Link]. In this case the
x-variable is cut into [Link] intervals with an equal number of observations and detrended Owen’s plots
for each interval are plotted. This is a way of highlighting failures of the model within different
ranges of the explanatory variable.
Value
A plot is returned.
Author(s)
References
Jager, L. and Wellner, J. A (2004) A new goodness of fit test: the reversed Berk-Jones statistic,
[Link]
Owen A. B. (1995) Nonparametric Confidence Bands for a Distribution Function. Journal of the
American Statistical Association Vol. 90, No 430, pp. 516-521.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
wp
Examples
data(abdom)
a<-gamlss(y~pb(x),[Link]=~pb(x,1),family=LO,data=abdom)
dtop(a)
dtop(a, xvar=abdom$x)
rm(a)
Description
The functions edf() and edfAll() can be used to obtain the effective degrees of freedom for
different additive terms for the distribution parameters in a gamlss model.
Usage
Arguments
Value
The function edfAll() returns a list of edf for all the fitted parameters. The function edf() returns a
vector of edf.
Note
The edf given are the ones fitted in the backfitting so they usually contain (depending on the
additive term) the constant and the linear part.
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
library([Link])
data(usair)
m1<- gamlss(y~pb(x1)+pb(x2)+pb(x6), data=usair)
edfAll(m1)
edf(m1)
Description
This function selects the values of hyper parameters and/or non-linear parameters in a GAMLSS
model. It uses the R function optim which then minimises the generalised Akaike information
criterion (GAIC) with a user defined penalty.
Usage
[Link](model = NULL, parameters = NULL, other = NULL, k = 2,
steps = c(0.1), lower = -Inf, upper = Inf, method = "L-BFGS-B",
...)
Arguments
model this is a GAMLSS model in quote(). e.g.
quote(gamlss(y~cs(x,df=p[1]),[Link]=~cs(x,df=p[2]),data=abdom))
where p[1] and p[2] denote the parameters to be estimated
parameters the starting values in the search of the optimum hyper-parameters and/or non-
linear parameters e.g. parameters=c(3,3)
Details
This historically was an experimental function which worked well for the search of the optimum
degrees of freedom and non-linear parameters (e.g. power parameter λ used to transform x to x^λ).
With the introduction of the P-spline smoothing function pb() the function [Link]() became
almost redundant. [Link]() takes a lot longer than pb() to find the hyper-parameters
automatically, while both methods produce similar results. See the examples below for a small demonstration.
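The idea of minimising a GAIC over a non-linear parameter with optim() can be sketched in base R alone. In this illustration lm() is only a stand-in for the gamlss() fit, the data are simulated, and gaic_for_power is a hypothetical helper name; it is not how the package implements the search.

```r
# Base R sketch: minimise a GAIC over a power parameter with optim().
# lm() replaces the gamlss() fit; data are simulated for illustration.
set.seed(123)
x <- runif(200, 1, 10)
y <- 2 + 3 * sqrt(x) + rnorm(200, sd = 0.2)   # true power is 0.5

k <- 2                                         # GAIC penalty (k = 2 gives the AIC)
gaic_for_power <- function(p) {
  nx  <- x^p[1]                                # transformed x, like nx <- x^p[2] in the Examples
  fit <- lm(y ~ nx)
  ll  <- logLik(fit)
  -2 * as.numeric(ll) + k * attr(ll, "df")     # GAIC = -2 log-likelihood + k * df
}

op <- optim(par = 1, fn = gaic_for_power, method = "L-BFGS-B",
            lower = 0.001, upper = 3)
op$par     # estimated power parameter
op$value   # GAIC at the optimum
```

Every evaluation of the objective refits the model, which is why this kind of search tends to be slow when the fit itself is expensive.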
Value
The function returns the same output as the function optim()
Warning
It may be slow to find the optimum
Author(s)
Mikis Stasinopoulos
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link], optim
Examples
## Not run:
data(abdom)
# Example estimating the smoothing parameters for mu and
# the transformation parameters for x
# declare the model
mod1<-quote(gamlss(y~cs(nx,df=p[1]),family=BCT,data=abdom,
control=[Link](trace=FALSE)))
# since we want also to find the transformation for x
# we use the "other" option
op <- [Link](model=mod1, other=quote(nx<-x^p[2]), parameters=c(3,0.5),
lower=c(1,0.001), steps=c(0.1,0.001))
op
# the optimum parameters found are
# p = (p[1],p[2]) = (3.113218 0.001000) = (df for mu, lambda)
# so it needs df = 3 on top of the constant and linear
# in the cubic spline model for mu since p[1] is approximately 3
# and log transformation for x since p[2] is approximately 0
# here is an example with no data declaration in the model definition
# we have to attach the data
attach(abdom)
mod2 <- quote(gamlss(y~cs(nx,df=p[1]),family=BCT,
control=[Link](trace=FALSE)))
op2<-[Link](model=mod2, other=quote(nx<-x^p[2]), parameters=c(3,0.5),
lower=c(1,0.001), steps=c(0.1,0.001))
op2
detach(abdom)
#--------------------------------------------------------------
# showing different ways of estimating the smoothing parameter
# get the df using local ML (PQL)
m0 <- gamlss(y~pb(x), data=abdom)
# get the df using local GAIC
m1<-gamlss(y~pb(x, method="GAIC", k=2), data=abdom)
# fitting cubic splines with fixed df's at 3
m2<-gamlss(y~cs(x, df=3), data=abdom)
## End(Not run)
Description
This function uses the function gamlssML() to fit all relevant parametric [Link]
distributions to a data vector. The final model is the one selected by the generalised Akaike
information criterion with penalty k.
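The selection rule can be sketched in base R: fit each candidate by maximum likelihood, compute GAIC = -2 log-likelihood + k × (number of parameters), and keep the smallest. In this illustration closed-form maximum likelihood estimates stand in for gamlssML(), the data are simulated, and only three candidates are tried; the candidate labels are illustrative and are not gamlss family names.

```r
# Base R sketch: choose among candidate distributions by minimum GAIC.
# Closed-form ML fits replace gamlssML(); data are simulated for illustration.
set.seed(42)
y <- rlnorm(1000, meanlog = 0, sdlog = 0.5)
k <- 2   # GAIC penalty; k = 2 is the AIC

loglik <- c(
  normal      = sum(dnorm(y, mean(y), sqrt(mean((y - mean(y))^2)), log = TRUE)),
  exponential = sum(dexp(y, rate = 1 / mean(y), log = TRUE)),
  lognormal   = sum(dlnorm(y, mean(log(y)),
                           sqrt(mean((log(y) - mean(log(y)))^2)), log = TRUE))
)
npar <- c(normal = 2, exponential = 1, lognormal = 2)

gaic <- sort(-2 * loglik + k * npar)   # candidates in increasing GAIC order
gaic
best <- names(gaic)[1]                 # the selected distribution
best
```

A larger k penalises extra parameters more heavily, so it favours simpler distributions when the fits are close.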
Usage
fitDist(y, k = 2,
type = c("realAll", "realline", "realplus", "real0to1", "counts", "binom"),
[Link] = FALSE, extra = NULL, data = NULL, ...)
Arguments
y the data vector
k the penalty for the GAIC, with default value k=2 (the standard AIC)
type the type of distribution to be tried, see details
[Link] whether gamlss should be tried instead if gamlssML() fails. This will slow
things down for big data.
extra whether extra distributions should be tried which are not in the type list
data the data frame where y can be found
... for extra arguments to be passed to gamlssML() or gamlss()
Details
• realAll: all the [Link] continuous distributions defined on the real line, i.e. realline
plus realplus
• realline: the [Link] continuous distributions: "GU", "RG", "LO", "NET", "TF",
"PE", "SN1", "SN2", "SHASH", "EGB2", "JSU", "SEP1", "SEP2", "SEP3", "SEP4", "ST1",
"ST2", "ST3", "ST4", "ST5", "GT"
• realplus: the [Link] continuous distributions on the positive real line: "EXP", "GA", "IG", "LNO",
"WEI3", "BCCGo", "exGAUS", "GG", "GIG", "BCTo", "BCPEo"
• real0to1: the [Link] continuous distributions from 0 to 1: "BE", "BEINF",
"BEINF0", "BEINF1", "BEOI", "BEZI", "GB1"
• counts: the [Link] distributions for counts: "PO", "LG", "NBI", "NBII", "PIG",
"DEL", "SI", "ZIP", "ZAP", "ZALG", "ZANBI", "ZIP2", "ZIPIG"
• binom: the [Link] distributions for binomial type data: "BI", "BB", "ZIBI", "ZIBB",
"ZABI", "ZABB"
Value
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss,gamlssML
Examples
y <- rt(100, df=1)
m1<-fitDist(y, type="realline")
m1$fits
m1$failed
# an example of using extra
## Not run:
library([Link])
data(tensile)
[Link](par=1,family="GA", type="right")
[Link](par=1,"LOGNO", type="right")
[Link](par=c(0,1),"TF", type="both")
ma<-fitDist(str, type="real0to1", extra=c("GAtr", "LOGNOtr", "TFtr"), data=tensile)
## End(Not run)
Description
[Link] is the GAMLSS specific method for the generic function fitted which extracts
fitted values for a specified parameter from a GAMLSS object. [Link] is an alias for it.
The function fv() is similar to [Link]() but allows the argument what not to be a character string.
Usage
## S3 method for class 'gamlss'
fitted(object, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, ...)
fv(obj, what = c("mu", "sigma", "nu", "tau"), parameter= NULL, ... )
Arguments
object a GAMLSS fitted model
obj a GAMLSS fitted model
what which parameter fitted values are required, default what="mu"
parameter equivalent to what
... for extra arguments
Value
Fitted values extracted from the GAMLSS object for the given parameter.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link], [Link], [Link], [Link], [Link], [Link],
[Link], [Link], [Link]
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
fitted(h)
rm(h)
Description
This function, applicable only to models with a single explanatory variable, plots the fitted values
for all the parameters of a GAMLSS model against the (one) explanatory variable. It is also useful
for comparing the fits for more than one model.
Usage
fittedPlot(object, ..., x = NULL, color = TRUE, [Link] = FALSE, xlab = NULL)
Arguments
object a fitted GAMLSS model object(with only one explanatory variable)
... optionally more fitted GAMLSS model objects
x The unique explanatory variable
color whether the fitted lines plots are shown in colour, color=TRUE (the default) or
not color=FALSE
[Link] whether the line type should be different or not. The default is FALSE
xlab the x-label
Value
A plot of the fitted values against the explanatory variable
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Calliope Akantziliotou
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, centiles, [Link]
Examples
data(abdom)
h1<-gamlss(y~pb(x), [Link]=~x, family=BCT, data=abdom)
h2<-gamlss(y~pb(x), [Link]=~pb(x), family=BCT, data=abdom)
fittedPlot(h1,h2,x=abdom$x)
rm(h1,h2)
Description
[Link] is the GAMLSS specific method for the generic function formula which extracts
the model formula from objects returned by modelling functions.
Usage
## S3 method for class 'gamlss'
formula(x, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, ... )
Arguments
Value
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
formula(h,"mu")
rm(h)
Description
Returns an object of class "gamlss", which is a generalized additive model for location scale and
shape (GAMLSS). The function gamlss() is very similar to the gam() function in S-plus (now also
in R in package gam), but can fit more distributions (not only the ones belonging to the exponential
family) and can model all the parameters of the distribution as functions of the explanatory variables
(e.g. using linear, non-linear, smoothing, loess and random effects terms).
This implementation of gamlss() allows modelling of up to four parameters in a distribution family,
which are conventionally called mu, sigma, nu and tau.
The function gamlssNews() shows what is new in the current implementation.
Usage
gamlss(formula = formula(data), [Link] = ~1,
[Link] = ~1, [Link] = ~1, family = NO(),
data = [Link](), weights = NULL,
contrasts = NULL, method = RS(), [Link] = NULL,
[Link] = NULL, [Link] = NULL,
[Link] = NULL, [Link] = NULL,
[Link] = FALSE, [Link] = FALSE, [Link] = FALSE,
[Link] = FALSE, control = [Link](...),
[Link] = [Link](...), ...)
[Link](x)
gamlssNews()
Arguments
formula a formula object, with the response on the left of an ~ operator, and the terms,
separated by + operators, on the right. Nonparametric smoothing terms are
indicated by pb() for penalised B-splines, cs for smoothing splines, lo for loess
smooth terms and random or ra for random terms, e.g. y~cs(x,df=5)+x1+x2*x3.
Additional smoothers can be added by creating the appropriate interface.
Interactions with nonparametric smooth terms are not fully supported, but will not
produce errors; they will simply produce the usual parametric interaction
[Link] a formula object for fitting a model to the sigma parameter, as in the formula
above, e.g. [Link]=~cs(x,df=5). It can be abbreviated to [Link]=~cs(x,df=5).
[Link] a formula object for fitting a model to the nu parameter, e.g. [Link]=~x
[Link] a formula object for fitting a model to the tau parameter, e.g. [Link]=~cs(x,df=2)
family a [Link] object, which is used to define the distribution and the link
functions of the various parameters. The distribution families supported by
gamlss() can be found in [Link]. Functions such as BI() (binomial)
produce a family object. It can also be given without the parentheses, i.e. BI.
Family functions can take arguments, as in BI([Link]=probit)
data a data frame containing the variables occurring in the formula. If this is missing,
the variables should be on the search list. e.g. data=aids
weights a vector of weights. Note that this is not the same as in the glm() or gam()
function. Here weights can be used to weight out observations (like in subset)
Details
The Generalized Additive Model for Location, Scale and Shape is a general class of statistical
models for a univariate response variable. The model assumes independent observations of the response
variable y given the parameters, the explanatory variables and the values of the random effects. The
distribution for the response variable in the GAMLSS can be selected from a very general
family of distributions including highly skew and/or kurtotic continuous and discrete distributions, see
[Link]. The systematic part of the model is expanded to allow modelling not only of the
mean (or location) parameter, but also of the other parameters of the distribution of y, as linear
parametric and/or additive nonparametric (smooth) functions of explanatory variables and/or
random effects terms. Maximum (penalized) likelihood estimation is used to fit the (non)parametric
models. A Newton-Raphson/Fisher scoring algorithm is used to maximize the (penalized)
likelihood. The additive terms in the model are fitted using a backfitting algorithm.
[Link] is a short version of is(object,"gamlss")
Value
Returns a gamlss object with components
[Link]
the linear coefficients of the mu model, also [Link], [Link],
[Link] for the other parameters if present
[Link] the formula for the mu model, also [Link], [Link], [Link] for
the other parameters if present
[Link] the mu degrees of freedom also [Link], [Link], [Link] for the other parameters
if present
[Link] the non linear degrees of freedom, also [Link], [Link], [Link] for the
other parameters if present
[Link] the total degrees of freedom used by the model
[Link] the residual degrees of freedom left after the model is fitted
aic the Akaike information criterion
sbc the Bayesian information criterion
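As a hedged illustration of the aic and sbc components, the sketch below compares them with the generalised AIC computed by GAIC() for penalties k = 2 and k = log(n). It assumes an unweighted fit, so that n is the number of rows in the data, and it reuses the Poisson model for the aids data from other Examples in this manual.

```r
library(gamlss)
data(aids)
h <- gamlss(y ~ poly(x, 3) + qrt, family = PO, data = aids)

h$aic                          # stored Akaike information criterion
GAIC(h, k = 2)                 # GAIC with penalty 2
h$sbc                          # stored Bayesian information criterion
GAIC(h, k = log(nrow(aids)))   # GAIC with penalty log(n), assuming unit weights
```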
Warning
Respect the parameter hierarchy when you are fitting a model. For example a good model for mu
should be fitted before a model for sigma is fitted
Note
The following generic functions can be used with a GAMLSS object: print, summary, fitted,
coef, residuals, update, plot, deviance, formula
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
Examples
data(abdom)
mod<-gamlss(y~pb(x),[Link]=~pb(x),family=BCT, data=abdom, method=mixed(1,20))
plot(mod)
rm(mod)
Description
Auxiliary function as user interface for gamlss fitting. Typically only used when calling gamlss
function with the option control.
Usage
[Link]([Link] = 0.001, [Link] = 20, [Link] = 1, [Link] = 1, [Link] = 1,
[Link] = 1, [Link] = Inf, iter = 0, trace = TRUE, autostep = TRUE,
save = TRUE, ...)
Arguments
[Link] the convergence criterion for the algorithm
[Link] the number of cycles of the algorithm
[Link] the step length for the parameter mu
[Link] the step length for the parameter sigma
[Link] the step length for the parameter nu
[Link] the step length for the parameter tau
[Link] global deviance tolerance level (set more recently to Inf to allow the algorithm to
converge even if the global deviance changes dramatically during the iterations)
iter starting value for the number of iterations, typically set to 0 unless the function
refit is used
trace whether to print at each iteration (TRUE) or not (FALSE)
autostep whether the steps should be halved automatically if the new global deviance is
greater than the old one, the default is autostep=TRUE
save save=TRUE (the default) saves all the information on exit. save=FALSE saves
only limited information such as the global deviance and AIC. For example fitted
values, design matrices and additive terms are not saved. The latter is useful
when gamlss() is called several times within a procedure.
... for extra arguments
Details
The step length for each of the parameters mu, sigma, nu or tau is very useful to aid convergence
if the parameter has a fully parametric model. However using a step length is not theoretically
justified if the model for the parameter includes one or more smoothing terms (even though it may
give a very approximate result).
The [Link] can be increased to speed up the convergence especially for a large set of data which
takes longer to fit. When ‘trace’ is TRUE, calls to the function cat produce the output for each
outer iteration.
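The idea behind a step length with automatic halving can be sketched in base R. This is only an illustration under stated assumptions, not the package's algorithm: the toy Poisson deviance and the helper name halving_update are hypothetical, and a plain Newton proposal stands in for a fitting cycle.

```r
# Toy deviance: minus twice the Poisson log-likelihood for a single log-mean b.
y <- c(2, 4, 3, 7, 5)
deviance_fn <- function(b) -2 * sum(dpois(y, lambda = exp(b), log = TRUE))

# Hypothetical helper: take the full proposed step, then halve the step
# length while the new deviance is greater than the old one.
halving_update <- function(b_old, b_proposed, max_halvings = 10) {
  d_old <- deviance_fn(b_old)
  step <- 1
  b_new <- b_proposed
  halvings <- 0
  while (deviance_fn(b_new) > d_old && halvings < max_halvings) {
    step <- step / 2
    halvings <- halvings + 1
    b_new <- b_old + step * (b_proposed - b_old)
  }
  list(b = b_new, step = step)
}

# Newton proposals for the log-mean, each one damped by halving if needed.
b <- -3  # deliberately poor starting value, so the first full step overshoots
for (cycle in 1:20) {
  mu <- exp(b)
  proposal <- b + (sum(y) - length(y) * mu) / (length(y) * mu)
  upd <- halving_update(b, proposal)
  converged <- abs(upd$b - b) < 1e-8
  b <- upd$b
  if (converged) break
}
b_hat <- b
```

In this toy problem the first full step increases the deviance, so it is shortened before being accepted; later cycles take the full step.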
Value
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
con<-[Link]([Link]=0.1)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids, control=con) #
rm(h,con)
Description
This is support for the functions cs() and scs(). It is not intended to be called directly by users. The
function [Link] uses the R function [Link].
Usage
[Link](x, y, w, df = NULL, spar = NULL, xeval = NULL, ...)
Arguments
x the design matrix
y the response variable
w prior weights
df effective degrees of freedom
spar the smoothing parameter
xeval used in prediction
... for extra arguments
Value
Returns a class "[Link]" object with
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby
See Also
gamlss, cs
Description
These functions support fp() and pp(). They are not intended to be called directly by users.
Usage
Arguments
x the x for function [Link] refers to the design matrix of the specific
parameter model (not to be used by the user)
y the y for function [Link] refers to the working variable of the specific
parameter model (not to be used by the user)
w the w for function [Link] refers to the iterative weight variable of the
specific parameter model (not to be used by the user)
npoly a positive integer indicating how many fractional polynomials should be considered in
the fit. Can take the values 1, 2 or 3, with 2 as default
xeval used in prediction
Value
[Link] fitted
residuals residuals
var
[Link] the trace of the smoothing matrix
lambda the value of the smoothing parameter
coefSmo the coefficients from the smoothing fit
varcoeff the variance of the coefficients
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, fp
Description
This is support for the loess function lo(). It is not intended to be called directly by users. The
function [Link] calls the R function loess.
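For orientation, a weighted loess fit with prediction at new points can be sketched directly in base R. This is a minimal sketch, not the support function itself: the simulated data, the unit weights and the span value are assumptions made for illustration.

```r
# Simulated data standing in for the response and the weights.
set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- sin(x) + rnorm(100, sd = 0.2)
w <- rep(1, 100)

# Weighted local quadratic regression of y on x.
fit <- loess(y ~ x, weights = w, span = 0.3, degree = 2)

# Fitted values at the observed x, and predictions at new points
# (the role played by an argument such as xeval).
fv <- fitted(fit)
xeval <- c(2.5, 5, 7.5)
pred <- predict(fit, newdata = data.frame(x = xeval))
```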
Usage
[Link](x, y, w, xeval = NULL, ...)
Arguments
x the design matrix
y the response variable
w prior weights
xeval used in prediction
... further arguments passed to or from other methods.
Value
Returns an object
Author(s)
Mikis Stasinopoulos, based on Brian Ripley's implementation of the loess function in R
See Also
gamlss, lo
Description
These functions support pb(), pbo(), ps(), ridge(), ri(), cy(), pvc(), and
pbm(). They are not intended to be called directly by users.
Usage
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
[Link](x, y, w, xeval = NULL, ...)
Arguments
x the x for function [Link] refers to the design matrix of the specific
parameter model (not to be used by the user)
y the y for function [Link] refers to the working variable of the specific
parameter model (not to be used by the user)
w the w for function [Link] refers to the iterative weight variable of the
specific parameter model (not to be used by the user)
xeval used in prediction
... further arguments passed to or from other methods.
Value
All functions return fitted smoothers.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, pb, ps, ri, ridge, cy, pvc, pbm
Description
This is support for the functions random() and re() respectively. It is not intended to be called
directly by users.
Usage
[Link](x, y, w)
[Link](x, y, w, xeval = NULL, ...)
Arguments
x the explanatory design matrix
y the response variable
w iterative weights
xeval used internally for prediction
... for extra arguments
Value
Returns a list with
Author(s)
Mikis Stasinopoulos, based on Trevor Hastie function [Link]
References
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, random
Description
Generates a scope argument for a stepwise GAMLSS.
Usage
[Link](frame, response = 1, smoother = "cs", arg = NULL, form = TRUE)
Arguments
frame a data or model frame
response which variable is the response; the default is the first
smoother what smoother to use; default is cs
arg any additional arguments required by the smoother
form should a formula be returned (default), or else a character version of the formula
Details
Each formula describes an ordered regimen of terms, each of which is eligible on its own for
inclusion in the gam model. One of the terms is selected from each formula by [Link]. If a 1 is
selected, that term is omitted.
Value
a list of formulas is returned, one for each column in frame (excluding the response). For a numeric
variable, say x1, the formula is
~ 1 + x1 + cs(x1)
If x1 is a factor, the last smooth term is omitted.
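The shape of the returned list can be illustrated with a few lines of base R. The helper make_scope below is hypothetical and only mimics the description above; it builds the formulas as text, so the smoother (here cs) does not need to be available.

```r
# Hypothetical helper: build one scope formula per explanatory column.
make_scope <- function(frame, response = 1, smoother = "cs") {
  vars <- names(frame)[-response]
  scope <- lapply(vars, function(v) {
    terms <- c("1", v)
    # Only numeric variables receive a smooth term; factors stop at the linear term.
    if (is.numeric(frame[[v]])) terms <- c(terms, sprintf("%s(%s)", smoother, v))
    as.formula(paste("~", paste(terms, collapse = " + ")))
  })
  names(scope) <- vars
  scope
}

d <- data.frame(y = rnorm(6), x1 = runif(6), f1 = factor(rep(c("a", "b"), 3)))
sc <- make_scope(d)
deparse(sc$x1)  # "~1 + x1 + cs(x1)"
deparse(sc$f1)  # "~1 + f1"
```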
Author(s)
Mikis Stasinopoulos: a modified function from Statistical Models in S
References
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
stepGAIC
Examples
data(usair)
gs1<-[Link]([Link](y~x1+x2+x3+x4+x5+x6, data=usair))
gs2<-[Link]([Link](usair))
gs1
gs2
gs3<-[Link]([Link](usair), smooth="fp", arg="3")
gs3
Description
This function fits a [Link] distribution to a single data set using a non-linear
maximisation algorithm in R. It is relevant only when there are no explanatory variables.
Usage
gamlssML(formula, family = NO, weights = NULL, [Link] = NULL,
[Link] = NULL, [Link] = NULL, [Link] = NULL,
[Link] = FALSE, [Link] = FALSE, [Link] = FALSE,
[Link] = FALSE, data = [Link](), [Link] = NULL, ...)
Arguments
formula a vector of data requiring the fit of a [Link] distribution or a formula,
for example, y~1 (explanatory variables are ignored).
family [Link] object, which is used to define the distribution and the link
functions of the various parameters. The distribution families supported by
gamlssML() can be found in [Link]
weights a vector of weights. Here weights can be used to weight out observations (like in
subset) or for a weighted likelihood analysis where the contribution of the observations
to the likelihood differs according to weights. The length of weights
must be the same as the number of observations in the data. By default, the
weight is set to one. To set weights to vector w use weights=w
[Link] a scalar of initial values for the location parameter mu e.g. [Link]=4
[Link] a scalar of initial values for the scale parameter sigma e.g. [Link]=1
[Link] scalar of initial values for the parameter nu e.g. [Link]=3
[Link] scalar of initial values for the parameter tau e.g. [Link]=3
[Link] whether the mu parameter should be kept fixed in the fitting processes e.g.
[Link]=FALSE
[Link] whether the sigma parameter should be kept fixed in the fitting processes e.g.
[Link]=FALSE
[Link] whether the nu parameter should be kept fixed in the fitting processes e.g. [Link]=FALSE
[Link] whether the tau parameter should be kept fixed in the fitting processes e.g.
[Link]=FALSE
data a data frame containing the variable y. If this is missing, the variable should be
on the search list. e.g. data=aids
[Link] a gamlss object from which to start the fitting, or a vector with as many elements as there are
parameters in the distribution
... for extra arguments
Details
This function, which fits a [Link] distribution to a single data set, uses non-linear
maximisation. In fact, it uses the internal function MLE(), which is a copy of the mle() function
of package stats4. For large data, gamlssML() can be faster than the equivalent
gamlss() function, which is designed for regression-type models.
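Direct maximisation of a likelihood with no explanatory variables can be sketched in base R as follows. This is an illustration under assumptions, not the internal MLE() routine: it uses optim() with a normal distribution, an identity link for mu and a log link for sigma, and simulated data.

```r
set.seed(42)
y <- rnorm(500, mean = 10, sd = 2)

# Negative log-likelihood of a normal distribution.
# theta[1] is mu (identity link); theta[2] is log(sigma) (log link).
negloglik <- function(theta) {
  -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

# Numerical maximisation of the likelihood from moment-based starting values.
fit <- optim(c(mean(y), log(sd(y))), negloglik, method = "BFGS", hessian = TRUE)

mu_hat <- fit$par[1]
sigma_hat <- exp(fit$par[2])
global_deviance <- 2 * fit$value      # minus twice the maximised log-likelihood
se <- sqrt(diag(solve(fit$hessian)))  # standard errors on the predictor scale
```

Because only two constants are estimated, no regression machinery is needed, which is the reason a direct maximiser can be quicker on large data.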
Value
Returns a gamlssML object which behaves like a gamlss fitted object
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby, Vlasis Voudouris and
Majid Djennad
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link], gamlss
Examples
#-------- negative binomial 1000 observations
y<- rNBI(1000)
[Link](m1<-gamlss(y~1, family=NBI))
[Link](m1a<-gamlss(y~1, family=NBI, trace=FALSE))
[Link](m11<-gamlssML(y, family=NBI))
AIC(m1,m1a,m11, k=0)
# neg. binomial n=10000
y<- rNBI(10000)
[Link](m1<-gamlss(y~1, family=NBI))
[Link](m1a<-gamlss(y~1, family=NBI, trace=FALSE))
[Link](m11<-gamlssML(y, family=NBI))
AIC(m1,m1a,m11, k=0)
# binomial type data
data(aep)
m1 <- gamlssML(aep$y, family=BB) # ok
m2 <- gamlssML(y, data=aep, family=BB) # ok
m3 <- gamlssML(y~1, data=aep, family=BB) # ok
m4 <- gamlssML(aep$y~1, family=BB) # ok
AIC(m1,m2,m3,m4)
gamlssVGD A Set of Functions for selecting Models using Validation or Test Data
Sets and Cross-Validation
Description
This is a set of functions useful for selecting appropriate models.
The functions gamlssVGD, VGD, getTGD, TGD can be used when a subset of the data is used for
validation or testing.
The function stepVGD() is a stepwise procedure for selecting an appropriate model for any of the
parameters of the model minimising the test global deviance. The function stepVGDAll.A() can
select a model using strategy A for all the parameters.
The functions gamlssCV, CV can be used for a k-fold cross validation.
Usage
gamlssVGD(formula = NULL, [Link] = ~1, [Link] = ~1,
[Link] = ~1, data = NULL, family = NO,
control = [Link](trace = FALSE),
rand = NULL, newdata = NULL, ...)
VGD(object, ...)
TGD(object, ...)
CV(object, ...)
Arguments
formula A gamlss mu formula.
[Link] Formula for sigma.
[Link] Formula for nu.
[Link] Formula for tau.
data The data frame required for the fit.
family The [Link] distribution.
control The control for fitting the gamlss model.
rand For gamlssVGD a variable with values 1 (for fitting) and 2 (for predicting). For
gamlssCV a variable with k values indicating the cross validation sets.
newdata The new data set (validation or test) for prediction.
object A relevant R object.
scope defines the range of models examined in the stepwise selection similar to stepGAIC()
where you can see examples
[Link] defines the range of models examined in the stepwise selection for sigma
[Link] defines the range of models examined in the stepwise selection for nu
[Link] defines the range of models examined in the stepwise selection for tau
[Link] whether to try fitting models for mu
[Link] whether to try fitting models for sigma
[Link] whether to try fitting models for nu
[Link] whether to try fitting models for tau
parameter which distribution parameter is required, default what="mu"
sorted should the results be sorted on the value of TGD
trace if TRUE, additional information may be given on the fits as they are tried.
direction The mode of stepwise search, can be one of both, backward, or forward, with
a default of both. If the scope argument is missing the default for direction is
backward
keep see stepGAIC() for explanation
steps the maximum number of steps to be considered. The default is 1000.
[Link] the number of subsets of the data used
[Link] the seed to be used in creating rand
parallel The type of parallel operation to be used (if any). If missing, the default is "no".
ncpus integer: number of processes to be used in parallel operation: typically one
would choose this to be the number of available CPUs.
cl An optional parallel or snow cluster for use if parallel = "snow". If not
supplied, a cluster on the local machine is created for the duration of the call.
... further arguments to be passed to the gamlss fit
Details
The function gamlssVGD() fits a gamlss model to the training data set determined by the arguments
rand or newdata. The result is a gamlssVGD object, which contains the gamlss fit to the training
data plus three extra components: i) VGD, the global deviance applied to the validation data set; ii)
predictError, which is VGD divided by the number of observations in the validation data set; and
iii) residVal, the residuals for the validation data set.
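The validation global deviance and the prediction error can be illustrated in base R. This is a sketch under assumptions rather than the package code: a normal linear model fitted with lm() stands in for a gamlss fit, and the data are simulated.

```r
set.seed(7)
n <- 300
d <- data.frame(x = runif(n))
d$y <- 1 + 2 * d$x + rnorm(n, sd = 0.5)

# rand = 1 marks training rows, rand = 2 marks validation rows.
rand <- sample(2, n, replace = TRUE, prob = c(0.6, 0.4))
train <- d[rand == 1, ]
valid <- d[rand == 2, ]

# Fit a normal model on the training data only.
fit <- lm(y ~ x, data = train)
sigma_hat <- sqrt(mean(residuals(fit)^2))  # ML estimate of sigma

# Validation global deviance: minus twice the log-likelihood of the validation
# data, evaluated at the parameters fitted on the training data.
mu_valid <- predict(fit, newdata = valid)
VGD <- -2 * sum(dnorm(valid$y, mean = mu_valid, sd = sigma_hat, log = TRUE))
predictError <- VGD / nrow(valid)
```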
The function VGD() extracts the validated global deviance from one or more fitted gamlssVGD objects
and can be used for model comparison.
The function getTGD() operates differently from the function gamlssVGD(). It assumes that the user
has already fitted models using gamlss() and now wants to evaluate the global deviance on
a new (validation or test) data set.
The function TGD() extracts the validated/test global deviance from one or more fitted gamlssTGD
objects and can be used to compare models.
The function gamlssCV() performs a k-fold cross validation on a gamlss model.
The function CV() extracts the cross validated global deviance from one or more fitted gamlssCV
objects and can be used to compare models.
The functions add1TGD(), drop1TGD() and stepTGD() behave similarly to the add1(), drop1() and stepGAIC()
functions respectively, but they use the validation or test deviance as the selection criterion rather than
the GAIC.
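A k-fold cross validated global deviance can be sketched in base R in the same spirit. The helpers fold_deviance and cv_deviance are hypothetical, and normal models fitted with lm() are assumed in place of gamlss fits.

```r
set.seed(123)
n <- 200
d <- data.frame(x = runif(n))
d$y <- 1 + 2 * d$x + rnorm(n, sd = 0.5)

# rand assigns every observation to one of K folds.
K <- 10
rand <- sample(K, n, replace = TRUE)

# Deviance of the held-out fold under a normal model fitted to the other folds.
fold_deviance <- function(k, formula) {
  fit <- lm(formula, data = d[rand != k, ])
  sigma_hat <- sqrt(mean(residuals(fit)^2))
  held_out <- d[rand == k, ]
  -2 * sum(dnorm(held_out$y, predict(fit, newdata = held_out), sigma_hat, log = TRUE))
}

# Cross validated global deviance: the sum over the K folds.
cv_deviance <- function(formula) {
  sum(sapply(seq_len(K), fold_deviance, formula = formula))
}

cv_null <- cv_deviance(y ~ 1)
cv_linear <- cv_deviance(y ~ x)
```

The model with the smaller cross validated deviance is preferred; here the linear model should win because the data were simulated with a clear slope.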
Value
A fitted model or a set of global deviances.
Author(s)
Mikis Stasinopoulos
References
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC. (see also [Link]
[Link]/).
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
stepGAIC
Examples
data(abdom)
# generate the random split of the data
rand <- sample(2, 610, replace=TRUE, prob=c(0.6,0.4))
# the proportions in the sample
table(rand)/610
olddata<-abdom[rand==1,] # training data
newdata<-abdom[rand==2,] # validation data
#------------------------------------------------------------------------------
# gamlssVGD
#-------------------------------------------------------------------------------
# Using rand
v1 <- gamlssVGD(y~pb(x,df=2),[Link]=~pb(x,df=1), data=abdom, family=NO,
rand=rand)
v2 <- gamlssVGD(y~pb(x,df=2),[Link]=~pb(x,df=1), data=abdom, family=LO,
rand=rand)
v3 <- gamlssVGD(y~pb(x,df=2),[Link]=~pb(x,df=1), data=abdom, family=TF,
rand=rand)
VGD(v1,v2,v3)
#-------------------------------------------------------------------------------
## Not run:
#-------------------------------------------------------------------------------
# using two data set
v11 <- gamlssVGD(y~pb(x,df=2),[Link]=~pb(x,df=1), data=olddata,
family=NO, newdata=newdata)
v12 <- gamlssVGD(y~pb(x,df=2),[Link]=~pb(x,df=1), data=olddata,
family=LO, newdata=newdata)
v13 <- gamlssVGD(y~pb(x,df=2),[Link]=~pb(x,df=1), data=olddata,
family=TF, newdata=newdata)
VGD(v11,v12,v13)
#-------------------------------------------------------------------------------
# function getTGD
#-------------------------------------------------------------------------------
# fit gamlss models first
g1 <- gamlss(y~pb(x,df=2),[Link]=~pb(x,df=1), data=olddata, family=NO)
g2 <- gamlss(y~pb(x,df=2),[Link]=~pb(x,df=1), data=olddata, family=LO)
g3 <- gamlss(y~pb(x,df=2),[Link]=~pb(x,df=1), data=olddata, family=TF)
# and then use
gg1 <-getTGD(g1, newdata=newdata)
gg2 <-getTGD(g2, newdata=newdata)
gg3 <-getTGD(g3, newdata=newdata)
TGD(gg1,gg2,gg3)
#-------------------------------------------------------------------------------
#-------------------------------------------------------------------------------
# function gamlssCV
#-------------------------------------------------------------------------------
[Link](123)
rand1 <- sample (10 , 610, replace=TRUE)
g1 <- gamlssCV(y~pb(x,df=2),[Link]=~pb(x,df=1), data=abdom, family=NO,
rand=rand1)
g2 <- gamlssCV(y~pb(x,df=2),[Link]=~pb(x,df=1), data=abdom, family=LO,
rand=rand1)
g3 <- gamlssCV(y~pb(x,df=2),[Link]=~pb(x,df=1), data=abdom, family=TF,
rand=rand1)
CV(g1,g2,g3)
CV(g1)
# using parallel
[Link](123)
rand1 <- sample (10 , 610, replace=TRUE)
nC <- detectCores()
CV(g21,g22,g23)
#-------------------------------------------------------------------------------
# functions add1TGD() drop1TGD() and stepTGD()
#-------------------------------------------------------------------------------
# the data
data(rent)
rand <- sample(2, dim(rent)[1], replace=TRUE, prob=c(0.6,0.4))
# the proportions in the sample
table(rand)/dim(rent)[1]
oldrent<-rent[rand==1,] # training set
newrent<-rent[rand==2,] # validation set
# null model
v0 <- gamlss(R~1, data=oldrent, family=GA)
# complete model
v1 <- gamlss(R~pb(Fl)+pb(A)+H+loc, [Link]=~pb(Fl)+pb(A)+H+loc,
data=oldrent, family=GA)
# drop1TGDP
[Link](v3<- drop1TGD(v1, newdata=newrent, parallel="no"))
[Link](v4<- drop1TGD(v1, newdata=newrent, parallel="multicore",
ncpus=nC) )
[Link](v5<- drop1TGD(v1, newdata=newrent, parallel="snow", ncpus=nC))
cbind(v3,v4,v5)
# add1TGDP
[Link](d3<- add1TGD(v0,scope=~pb(Fl)+pb(A)+H+loc, newdata=newrent,
parallel="no"))
[Link](d4<- add1TGD(v0,scope=~pb(Fl)+pb(A)+H+loc, newdata=newrent,
parallel="multicore", ncpus=nC) )
[Link](d5<- add1TGD(v0, scope=~pb(Fl)+pb(A)+H+loc,newdata=newrent,
parallel="snow", ncpus=nC))
# stepTGD
[Link](d6<- stepTGD(v0, scope=~pb(Fl)+pb(A)+H+loc,newdata=newrent))
[Link](d7<- stepTGD(v0, scope=~pb(Fl)+pb(A)+H+loc,newdata=newrent,
parallel="multicore", ncpus=nC))
[Link](d8<- stepTGD(v0, scope=~pb(Fl)+pb(A)+H+loc,newdata=newrent,
parallel="snow", ncpus=nC))
## End(Not run)
Description
This function generates a function, with the parameters of the GAMLSS model as its argument, which
evaluates the log-likelihood function.
Usage
[Link](object)
Arguments
object A gamlss fitted model
Details
The purpose of this function is to help the function vcov() to get the right Hessian matrix after a
model has been fitted. Note that at the moment smoothing terms are considered as fixed.
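How a generated log-likelihood function can lead to a Hessian, and from there to a variance-covariance matrix, can be sketched in base R. This is an illustration under assumptions, not the package's implementation: make_loglik is a hypothetical generator, and a Poisson glm() fit stands in for a gamlss model.

```r
set.seed(3)
n <- 200
x <- runif(n)
y <- rpois(n, lambda = exp(0.5 + 1.2 * x))
fit <- glm(y ~ x, family = poisson)

# Hypothetical generator: returns the log-likelihood as a function of the
# coefficient vector, defaulting to the fitted coefficients.
make_loglik <- function(object) {
  X <- model.matrix(object)
  resp <- object$y
  beta_hat <- coef(object)
  function(beta = beta_hat) {
    sum(dpois(resp, lambda = exp(drop(X %*% beta)), log = TRUE))
  }
}

logL <- make_loglik(fit)
logL()  # agrees with logLik(fit)

# Numerical Hessian of minus the log-likelihood at the fitted coefficients;
# its inverse approximates the variance-covariance matrix.
H <- optimHess(coef(fit), function(b) -logL(b))
V <- solve(H)
```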
Value
A function of the log-likelihood
Author(s)
Mikis Stasinopoulos <[Link]@[Link]> Bob Rigby and Vlasios Voudouris
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
vcov
Examples
data(aids)
m1 <- gamlss(y~x+qrt, data=aids, family=NBI)
logL<-[Link](m1)
logL()
logLik(m1)
Description
This function can be used to calculate the partial effect and the elasticity of a continuous explanatory
variable x.
By ‘partial effect’ function we mean how x influences the parameter of interest given that the rest
of the explanatory terms for this parameter are held at (specified) fixed values.
The function takes a GAMLSS object and, for the range of the continuous variable x (fixing
the rest of the explanatory terms at specified values), calculates the effect that x has on the specific
distribution parameter (or its predictor). The resulting function shows the effect that x has on
the distribution parameter. The partial effect function, which is calculated on a finite grid, is then
approximated using splinefun() in R and saved.
The saved function can be used to calculate the elasticity of x. The elasticity is the first derivative
of the partial effect function and shows the change in the parameter of interest for a small change
in x, with the rest of the explanatory variables fixed at specified values.
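The grid-and-spline step can be illustrated with base R alone. In this minimal sketch the partial effect is an assumed known function (2 + log(x)) rather than one extracted from a fitted model, so the spline approximation and its derivative can be checked by hand.

```r
# Assumed partial effect of x on a parameter, with other terms held fixed.
effect <- function(x) 2 + log(x)

# Evaluate on a finite grid and approximate with an interpolating spline.
grid <- seq(1, 10, length.out = 100)
pef <- splinefun(grid, effect(grid))

pef(5)             # partial effect at x = 5, close to 2 + log(5)
pef(5, deriv = 1)  # first derivative at x = 5, close to 1/5
```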
Usage
getPEF(obj = NULL, term = NULL, data = NULL, [Link] = 100,
parameter = c("mu", "sigma", "nu", "tau"),
type = c("response", "link"), how = c("median", "last"),
[Link] = list(), plot = FALSE)
Arguments
obj A gamlss object
term the continuous explanatory variable
data the [Link] (not needed if it is declared in obj)
[Link] the number of points at which the influence function for x needs to be evaluated
parameter which distribution parameter
type whether against the parameter, "response", or the predictor "link"
how whether for continuous variables should use the median or the last observation
in the data
[Link] a list indicating at which values the rest of the explanatory terms should be fixed
plot whether to plot the influence function and its first derivatives
Value
A function is created which can be used to evaluate the partial effect function at different values of
x.
Author(s)
Mikis Stasinopoulos, Vlasios Voudouris, Daniil Kiose
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Rigby, R. and Stasinopoulos, D. M (2013) Automatic smoothing parameter selection in GAMLSS
with an application to centile estimation, Statistical methods in medical research.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
m1 <- gamlss(R~pb(Fl)+pb(A), data=rent, family=GA)
# getting the Partial Effect function
pef <- getPEF(obj=m1,term="A", plot=TRUE)
# the value at 1980
pef(1980)
# the first derivative at 1980
pef(1980, deriv=1)
# the second derivative at 1980
pef(1980, deriv=2)
# plotting the first derivative
curve(pef(x, deriv=1), 1900,2000)
Description
The function getSmo() extracts information from a fitted smoothing additive term.
Usage
getSmo(object, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, which = 1)
Arguments
object a GAMLSS fitted model
what which distribution parameter is required, default what="mu"
parameter equivalent to what
which which smoothing term
Details
This function facilitates the extraction of information from fitted additive terms. For example,
getSmo(m1,"sigma",2) is equivalent to m1$[Link][[2]]. To get the actual fitted values
type m1$sigma.s[[2]]
Value
A list containing information about a fitted smoother, or a fitted object
Author(s)
Mikis Stasinopoulos and Bob Rigby
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
Examples
data(usair)
t1<-gamlss(y~x1+pb(x5)+pb(x6), data=usair, family=GA)
# get the value for lambda for the second fitted term in mu
getSmo(t1, parameter="mu", 2)$lambda
Description
Auxiliary function used for the inner iteration of gamlss algorithm. Typically only used when
calling gamlss function through the option [Link].
Usage
Arguments
Value
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape, (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
data(aids)
con<-[Link]([Link]=TRUE)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids, [Link]=con) #
rm(h,con)
histDist This function plots the histogram and a fitted (GAMLSS family) distribution
to a variable
Description
This function fits constants to the parameters of a GAMLSS family distribution and then plots the
histogram and the fitted distribution.
Usage
histDist(y, family = NO, freq = NULL,
density = FALSE, nbins = 10, xlim = NULL,
ylim = NULL, main = NULL, xlab = NULL,
ylab = NULL, data = NULL, [Link] = 2,[Link] = 1,
[Link] = "red",...)
Arguments
y a vector for the response variable
family a [Link] distribution
freq the frequencies of the data in y, if they exist. freq is used as weights in the gamlss
fit
density default value is FALSE. Change to TRUE if you would like a non-parametric
density plot together with the parametric fitted distribution plot (for continuous
variable only)
nbins The suggested number of bins (argument passed to truehist() of package
MASS). Either a positive integer, or a character string naming a rule: "Scott"
or "Freedman-Diaconis" or "FD". (Case is ignored.)
xlim the minimum and the maximum x-axis value (if the default values are out of
range)
ylim the minimum and the maximum y-axis value (if the default values are out of
range)
main the main title for the plot
xlab the label in the x-axis
ylab the label in the y-axis
data the [Link]
[Link] the line width of the fitted distribution
[Link] the line type of the fitted distribution
[Link] the line color of the fitted distribution
... for extra arguments to be passed to the gamlss function
Details
This function first fits constants for each parameter of a GAMLSS distribution family using the
gamlss function and then plots the fitted distribution together with the appropriate plot according
to whether the y variable is of a continuous or discrete type. A histogram is plotted for continuous
variables and a barplot for discrete variables. The function truehist of Venables and Ripley’s MASS package
is used for the histogram plotting.
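For discrete data with frequencies, the role of the weights can be illustrated in base R. This sketch assumes a Poisson distribution, whose constant mean has a closed-form estimate, in place of a GAMLSS family fitted with gamlss; the counts are those used in the Examples of this topic.

```r
# Counts and their frequencies, as in a frequency table.
y <- c(0, 1, 2, 3, 4, 6, 7, 8, 10, 11)
freq <- c(33, 12, 5, 6, 5, 2, 2, 2, 1, 2)

# Fitting a constant Poisson mean with freq as weights ...
mu_weighted <- weighted.mean(y, freq)
# ... gives the same estimate as fitting to the expanded data.
mu_expanded <- mean(rep(y, freq))

# Observed proportions and fitted probabilities over the observed range.
support <- 0:max(y)
observed <- tabulate(rep(y, freq) + 1, nbins = length(support)) / sum(freq)
fitted_prob <- dpois(support, lambda = mu_weighted)
```

The two vectors observed and fitted_prob are what a bar plot with an overlaid fitted distribution would display, for instance with barplot() and lines().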
Value
returns a plot
Author(s)
Mikis Stasinopoulos
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link]
Examples
data(abdom)
histDist(y,family="NO", data=abdom)
# use the ylim
histDist(y,family="NO", ylim=c(0,0.005), data=abdom)
# bad fit use PE
histDist(y,family="PE",ymax=0.005, data=abdom, [Link]="blue")
# discrete data counts
# Hand et al. p150 Leptinotarsa decemlineata
y <- c(0,1,2,3,4,6,7,8,10,11)
freq <- c(33,12,5,6,5,2,2,2,1,2)
histDist(y, "NBI", freq=freq)
# the same as
histDist(rep(y,freq), "NBI")
Description
This set of functions uses the old Poisson trick of discretising the data and then fitting a Poisson error
model to the resulting frequencies (Lindsey, 1997). Here the model fitted is a smooth cubic spline
curve. The result is a density estimator for the data.
Usage
Arguments
y the variable of interest
lambda the smoothing parameter
df the degrees of freedom
order the order of the P-spline
lower the lower limit of the y-variable
upper the upper limit of the y-variable
type the type of histogram
plot whether to plot the resulting density estimator
breaks the number of break points to be used in the histogram and consequently the
number of observations in the Poisson fit
discrete whether to treat the fitting density as a discrete distribution or not
... further arguments passed to or from other methods.
Details
The following methods are used:
i) The function histSmoO() uses penalised discrete splines (Eilers, 2003). This function is appropriate
when the smoothing parameter is fixed.
ii) The function histSmoC() uses smooth cubic splines and fits a Poisson error model to the frequencies
using the cs() additive function of GAMLSS. This function is appropriate if the effective
degrees of freedom are fixed in the model.
iii) The function histSmoP() uses penalised cubic splines (Eilers and Marx, 1996). It fits a
Poisson model to the frequencies using the pb() additive function of GAMLSS. This function is
appropriate if automatic selection of the smoothing parameter is required.
iv) The function histSmo() combines all the above functions: if lambda is fixed
it uses histSmoO(), if the degrees of freedom are fixed it uses histSmoC(), and if neither is specified it
uses histSmoP().
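The Poisson trick behind these functions can be sketched in base R. The block below is only an illustration under stated assumptions, not the code used by histSmo(): it bins simulated data, fits a Poisson model to the bin frequencies with a natural cubic spline from the splines package (a stand-in for cs() or pb()), and rescales the fitted frequencies into a density. The data, the number of bins and df = 6 are arbitrary choices.

```r
# Sketch of the "Poisson trick" for density estimation (not the histSmo() code)
library(splines)            # ships with R; ns() gives a natural cubic spline basis

set.seed(1)
y <- rgamma(1000, shape = 2, rate = 1)   # illustrative positive data

# 1. Discretise the data into equal-width bins
h      <- hist(y, breaks = 50, plot = FALSE)
counts <- h$counts
mids   <- h$mids
width  <- diff(h$breaks)[1]

# 2. Fit a Poisson error model with a smooth term to the bin frequencies
fit <- glm(counts ~ ns(mids, df = 6), family = poisson)

# 3. Rescale the fitted frequencies so that the estimate integrates to one
dens <- fitted(fit) / (sum(fitted(fit)) * width)
```

Plotting dens against mids gives the smooth density estimate; more bins give a finer grid at the cost of a larger Poisson fit.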
Value
Returns a histSmo S3 object. The object has the following components:
Author(s)
Mikis Stasinopoulos, Paul Eilers, Bob Rigby and Vlasios Voudouris
References
Eilers, P. (2003). A perfect smoother. Analytical Chemistry, 75: 3631-3636.
Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with
comments and rejoinder). Statist. Sci, 11, 89-121.
Lindsey, J.K. (1997) Applying Generalized Linear Models. New York: Springer-Verlag. ISBN
0-387-98218-3
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
pb, cs
Examples
# creating data from Pareto 2 distribution
[Link](153)
Y <- rPARETO2(1000)
## Not run:
# getting the density
histSmo(Y, lower=0, plot=TRUE)
# more breaks a bit slower
histSmo(Y, breaks=200, lower=0, plot=TRUE)
# quick fit using lambda
histSmoO(Y, lambda=1, breaks=200, lower=0, plot=TRUE)
# or
histSmo(Y, lambda=1, breaks=200, lower=0, plot=TRUE)
# quick fit using df
histSmoC(Y, df=15, breaks=200, lower=0,plot=TRUE)
# or
histSmo(Y, df=15, breaks=200, lower=0)
# saving results
m1<- histSmo(Y, lower=0, plot=T)
plot(m1)
plot(m1, "cdf")
plot(m1, "invcdf")
# using with a histogram
library(MASS)
truehist(Y)
lines(m1, col="red")
#---------------------------
# now generate from SHASH distribution
YY <- rSHASH(1000)
m1<- histSmo(YY)
# calculate Pr(YY>10)
1-m1$cdf(10)
# calculate Pr(-10<YY<10)
1-(1-m1$cdf(10))-m1$cdf(-10)
#---------------------------
# from discrete distribution
YYY <- rNBI(1000, mu=5, sigma=4)
histSmo(YYY, discrete=TRUE, plot=T)
#
YYY <- rPO(1000, mu=5)
histSmo(YYY, discrete=TRUE, plot=T)
#
YYY <- rNBI(1000, mu=5, sigma=.1)
histSmo(YYY, discrete=TRUE, plot=T)
# generating from beta distribution
YYY <- rBE(1000, mu=.1, sigma=.3)
histSmo(YYY, lower=0, upper=1, plot=T)
# from truncated data
Y <- with(stylo, rep(word,freq))
histSmo(Y, lower=1, discrete=TRUE, plot=T)
histSmo(Y, lower=1, discrete=TRUE, plot=T, type="prob")
## End(Not run)
Description
IC is a function to calculate the Generalised Akaike information criterion (GAIC) for a given penalty
k for a fitted GAMLSS object. The function [Link] is the method associated with a GAMLSS
object of the generic function AIC. The function GAIC is a synonym for the function [Link].
The function extractAIC is the method associated with a GAMLSS object of the generic function
extractAIC and it is mainly used in the stepAIC function. The function Rsq computes a generalisation
of the R-squared for non-normal models.
Usage
IC(object, k = 2)
## S3 method for class 'gamlss'
AIC(object, ..., k = 2, c = FALSE)
GAIC(object, ..., k = 2, c = FALSE )
## S3 method for class 'gamlss'
extractAIC(fit, scale, k = 2, c = FALSE, ...)
Arguments
object a gamlss fitted model
fit a gamlss fitted model
... allows several GAMLSS objects to be compared using a GAIC
k the penalty with default k=2
c whether the corrected AIC, i.e. AICc, should be used, note that it applies only
when k=2
scale this argument is not used in gamlss
Value
The function IC returns the GAIC for a given penalty k of the GAMLSS object. The function AIC
returns a matrix containing the degrees of freedom and the GAIC values for a given penalty k. The function GAIC returns
identical results to AIC. The function extractAIC returns a vector of length two with the degrees of
freedom and the AIC criterion.
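As a rough illustration of what a generalised AIC with penalty k computes, the following base R sketch assumes the usual definition GAIC(k) = -2 logLik + k df. The helper gaic_sketch() is hypothetical and is not part of gamlss; it is applied to an lm fit on the built-in cars data so that it runs without the package.

```r
# Hypothetical helper, for illustration only; not part of gamlss
gaic_sketch <- function(model, k = 2) {
  ll <- logLik(model)
  -2 * as.numeric(ll) + k * attr(ll, "df")   # deviance plus penalty k per df
}

fit <- lm(dist ~ speed, data = cars)
gaic_sketch(fit, k = 2)                 # agrees with AIC(fit)
gaic_sketch(fit, k = log(nrow(cars)))   # agrees with BIC(fit)
```

Setting k = 2 gives the ordinary AIC and k = log(n) the Schwarz Bayesian criterion; larger penalties favour simpler models.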
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
data(abdom)
mod1<-gamlss(y~pb(x),[Link]=~pb(x),family=BCT, data=abdom)
IC(mod1)
mod2<-gamlss(y~pb(x),[Link]=~x,family=BCT, data=abdom)
AIC(mod1,mod2,k=3)
GAIC(mod1,mod2,k=3)
extractAIC(mod1,k=3)
rm(mod1,mod2)
Description
This function is designed to help the user construct growth curve centile estimates easily. It is
applicable when only one explanatory variable is available (usually age).
Usage
lms(y, x, families = LMS, data = NULL, k = 2,
cent = 100 * pnorm((-4:4) * 2/3),
calibration = TRUE, trans.x = FALSE,
[Link] = NULL, [Link] = c(0, 1.5),
prof = FALSE, step = 0.1, legend = FALSE,
[Link] = NULL, [Link] = NULL, [Link] = NULL,
[Link] = NULL, [Link] = 0.01,
[Link] = c("ML", "GAIC"), ...)
Arguments
y The response variable
x The unique explanatory variable
families a list of [Link] with default LMS=c("BCCGo", "BCPEo", "BCTo")
data the data frame
k the penalty to be used in the GAIC
cent a vector with elements the % centile values for which the centile curves have to
be evaluated
calibration whether calibration is required with default TRUE
trans.x whether to check for transformation in x with default FALSE
[Link] if set, it fixes the power of the transformation for x
[Link] the limits for the search of the power parameter for x
prof whether to use the profile GAIC of the power transformation
step if prof=TRUE is used, this determines the step for the profile GAIC
legend whether a legend is required in the plot with default FALSE
[Link] effective degrees of freedom for mu if required, otherwise they are estimated
[Link] effective degrees of freedom for sigma if required, otherwise they are estimated
[Link] effective degrees of freedom for nu if required, otherwise they are estimated
[Link] effective degrees of freedom for tau if required, otherwise they are estimated
[Link] the convergence criterion to be passed to gamlss()
[Link] the method used in pb() for estimating the smoothing parameters. The default
is local maximum likelihood "ML". "GAIC" is also permitted, where k is
taken from the k argument of the function.
... extra arguments which can be passed to gamlss()
Details
This function should be used if the construction of the centile curves involves only one explanatory
variable.
The model assumes that the response variable has a flexible distribution, i.e. y ~ D(µ, σ, ν, τ), where
the parameters of the distribution are smooth functions of the explanatory variable, i.e. g(µ) = s(x),
where g() is a link function and s() is a smooth function. Occasionally a power transformation of
the x-axis helps the construction of the centile curves. In this case the parameters are
modelled by x^p rather than just x, i.e. g(µ) = s(x^p). The function lms() uses P-splines (pb()) as a
smoother.
If a transformation is needed for x, the function lms() starts by finding an optimum value for p
using the simple model NO(µ = s(x^p)). (Note that this value of p is not the optimum for the final
chosen model, but it works well in practice.)
After fitting a Normal error model for starting values, the function proceeds by fitting several "appropriate"
distributions for the response variable. The set of [Link] distributions to fit is specified
by the argument families. The default families argument is LMS=c("BCCGo", "BCPEo", "BCTo"),
that is, the LMS class of distributions, Cole and Green (1992). Note that this class is only appropriate
when y is positive (with no zeros). If the response variable contains negative values or zeros,
then use the argument families=theSHASH where theSHASH <- c("NO", "SHASHo"), or add
any other list of distributions which you think is appropriate. Justification for using the specific
centiles (0.38, 2.27, 9.12, 25.25, 50, 74.75, 90.88, 97.72, 99.62) is given in Cole (1994).
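The default centiles come directly from the cent argument shown in Usage: they are the normal probabilities, in percent, of z-scores spaced 2/3 apart from -8/3 to 8/3. A short base R check of this (nothing here is specific to gamlss):

```r
# z-scores spaced 2/3 apart, from -8/3 to 8/3
z    <- (-4:4) * 2/3
# corresponding percentages under the standard normal distribution
cent <- 100 * pnorm(z)
round(cent, 2)
```

The centiles are symmetric around the median (the fifth value is 50), so the first and last, second and eighth, and so on, each sum to 100.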
Value
It returns a gamlss fitted object
Note
The function is fitting several models and for large data can be slow
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Vlasios Voudouris
<[Link]@[Link]>
References
Cole, T. J. (1994) Do growth chart centiles need a face lift? BMJ, 308–641.
Cole, T. J. and Green, P. J. (1992) Smoothing reference centile curves: the LMS method and penal-
ized likelihood, Statist. Med. 11, 1305–1319
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, centiles, calibration
Examples
## Not run:
data(abdom)
m1 <- lms(y,x , data=abdom, [Link]=30)
m2 <- lms(y,x ,data=abdom, [Link]="GAIC", k=log(610))
# this example takes time
data(db)
m1 <- lms(y=head, x=age, data=db, trans.x=TRUE)
## End(Not run)
Description
Allows the user to specify a loess fit within a GAMLSS model. This function is similar to the lo
function in package gam; see Chambers and Hastie (1991).
The function [Link]() allows plotting the results.
Usage
lo(formula, control = [Link](...), ...)
[Link](span = 0.75, [Link] = NULL,
degree = 2, parametric = FALSE, [Link] = FALSE,
normalize = TRUE, family = c("gaussian", "symmetric"),
method = c("loess", "[Link]"),
surface = c("interpolate", "direct"),
statistics = c("approximate", "exact", "none"),
[Link] = c("exact", "approximate"),
cell = 0.2, iterations = 4,iterTrace = FALSE, ...)
[Link](obj, se=-1, rug = FALSE, [Link] = FALSE,
[Link] = "darkred", [Link] = "gray",
[Link] = "lightblue", [Link] = "gray", [Link] = 1.5,
[Link] = 1, [Link] = par("pch"),
type = c("persp", "contour"), [Link] = "gray",
nlevels = 30, [Link] = 30, image = TRUE, ...)
Arguments
formula a formula specifying the explanatory variables
control a control to be passed to the loess function
... extra arguments
Details
Note that lo itself does no smoothing; it simply sets things up for the function [Link]() which
is used by the backfitting function [Link]().
Value
a loess object is returned.
Warning
In this version the first argument is a formula NOT a list as in the previous one
Note
Note that lo itself does no smoothing; it simply sets things up for [Link]() to do the backfitting.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby. (The original lo() function
was based on Trevor Hastie's S-plus lo() function. See also the documentation of the
loess function for the authorship of that function.)
References
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
cs, random
Examples
# fitting a loess curve with span=0.4 plus a quarterly effect
aids1<-gamlss(y~lo(~x,span=0.4)+qrt,data=aids,family=PO) #
[Link](aids1, page=1)
## Not run:
r1 <- gamlss(R~lo(~Fl)+lo(~A), data=rent, family=GA)
[Link](r1, pages=1)
[Link](getSmo(r1, which=1), partial=T)
r2 <- gamlss(R~lo(~Fl+A), data=rent, family=GA)
[Link](r2, pages=1)
[Link](getSmo(r2, which=1))
[Link](getSmo(r2, which=1), se=1.97)
[Link](getSmo(r2, which=1), [Link]=T)
## End(Not run)
loglogSurv Log-Log Survival function plots for checking the tail behaviour of the
data
Description
The log-log survival functions are designed for checking the tails of a single response variable (no
explanatory variables should be involved). There are three different functions:
a) the function loglogSurv1(), which plots the (left or right) tails of the empirical log-log survival
function against loglog(y), where y is the variable of interest. The coefficient of a linear fit to the
plot can be used as an estimate for Type I tails.
b) the function loglogSurv2(), which plots the (left or right) tails of the empirical log-log survival
function against log(y). The coefficient of a linear fit to the plot can be used as an estimate for Type
II tails.
c) the function loglogSurv3(), which plots the (left or right) tails of the empirical log-log survival
function against y. The coefficient of a linear fit to the plot can be used as an estimate for Type III
tails.
The function loglogSurv() combines all the above functions.
The function logSurv() is also designed for exploring the tails of a single response variable. It plots
the empirical log-survival function against log(y) for a specified percentage of the tail and fits
linear, quadratic and exponential curves to the points of the plot. For distributions defined on the
positive real line, a good linear fit would indicate a Pareto-type tail, a good quadratic fit a log-normal-type
tail, and a good exponential fit a Weibull-type tail. Note that this function is only appropriate for
investigating rather heavy tails, and it is not as good as loglogSurv() at discriminating between
different types of tails.
Usage
loglogSurv(y, percentage = 10, howmany = NULL, type = c("right", "left"),
plot = TRUE, print = TRUE, save = FALSE)
loglogSurv1(y, percentage = 10, howmany = NULL, type = c("right", "left"),
plot = TRUE, print = TRUE, save = FALSE)
loglogSurv2(y, percentage = 10, howmany = NULL, type = c("right", "left"),
plot = TRUE, print = TRUE, save = FALSE)
loglogSurv3(y, percentage = 10, howmany = NULL, type = c("right", "left"),
plot = TRUE, print = TRUE, save = FALSE)
Arguments
y a vector, the variable of interest
percentage what percentage of the tail needs to be modelled, default is 10%
howmany how many observations in the tail are needed. This is an alternative to percentage.
If it is specified it takes over from the percentage argument, otherwise percentage
is used.
type which tail needs checking: the right (default) or the left
plot whether to plot with default equal TRUE
print whether to print the coefficients with default equal TRUE
save whether to save the fitted linear model with default equal FALSE
Details
The functions loglogSurv1(), loglogSurv2() and loglogSurv3() take the upper (or lower) part
of an ordered variable, create its empirical survival function and plot the log-log of this function
against log(log(y)), log(y) and y respectively. They then fit a line to the plot. The coefficients
of the line can be interpreted as parameters determining the behaviour of the tail. More details can
be found in Chapter 6 of "The Distribution Toolbox of GAMLSS" book, which can be found at
[Link]
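The steps described above can be sketched in base R for the right tail, in the spirit of loglogSurv2() (log-log survival against log(y)). This is an illustration only: the empirical survival function is assumed here to be 1 - i/(n + 1) for the i-th ordered value, which may differ from what the package uses, and the simulated Weibull data are arbitrary.

```r
set.seed(42)
y <- rweibull(2000, shape = 1.5, scale = 2)   # illustrative positive data

n    <- length(y)
ys   <- sort(y)
surv <- 1 - (1:n) / (n + 1)          # assumed empirical survival; never exactly 0

tail_idx <- ceiling(0.9 * n):n       # upper 10% of the ordered values
ly  <- log(ys[tail_idx])
lls <- log(-log(surv[tail_idx]))     # log-log of the survival function

fit <- lm(lls ~ ly)                  # line fitted to the tail points
coef(fit)
```

Calling plot(ly, lls) followed by abline(fit) shows the tail points with the fitted line. For Weibull data the log-log survival is exactly linear in log(y) with slope equal to the shape parameter, so the fitted slope gives a rough estimate of it.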
Value
A plot
Author(s)
Bob Rigby, Mikis Stasinopoulos and Vlasios Voudouris
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
Examples
data(film90)
F90 <- film90$lborev1
op<-par(mfrow=c(3,1))
loglogSurv1(F90)
loglogSurv2(F90)
loglogSurv3(F90)
par(op)
loglogSurv(F90)
logSurv(F90)
lpred Extract Linear Predictor Values and Standard Errors For A GAMLSS
Model
Description
lpred is the GAMLSS-specific method which extracts the linear predictor and its (approximate)
standard errors for a specified parameter from a GAMLSS object. lpred can also be used
to extract the fitted values (with their approximate standard errors) or specific terms in the model
(with their approximate standard errors) in the same way that the [Link]() and [Link]()
functions can be used for lm or glm objects. The function lp extracts only the linear predictor. If
prediction is required for new data values then use the function [Link]().
Usage
lpred(obj, what = c("mu", "sigma", "nu", "tau"), parameter= NULL,
type = c("link", "response", "terms"),
terms = NULL, [Link] = FALSE, ...)
lp(obj, what = c("mu", "sigma", "nu", "tau"), parameter= NULL, ... )
Arguments
obj a GAMLSS fitted model
what which distribution parameter is required, default what="mu"
parameter equivalent to what
type type="link" (the default) gets the linear predictor for the specified distribu-
tion parameter. type="response" gets the fitted values for the parameter while
type="terms" gets the fitted terms contribution
terms if type="terms", which terms to be selected (default is all terms)
[Link] if TRUE the approximate standard errors of the appropriate type are extracted
... for extra arguments
Value
If [Link]=FALSE a vector (or a matrix) of the appropriate type is extracted from the GAMLSS
object for the given parameter in what. If [Link]=TRUE a list is returned containing the values of the appropriate type, fit,
and their (approximate) standard errors, [Link].
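The three values of type mirror those of predict() for glm objects, which the Description uses as its point of comparison. The sketch below uses a base R Poisson glm on the built-in warpbreaks data rather than a gamlss fit, purely to show how the link, response and terms scales relate; it makes no claim about the internals of lpred().

```r
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# linear predictor with approximate standard errors (a list with fit and se.fit)
eta <- predict(fit, type = "link", se.fit = TRUE)
# fitted values on the scale of the parameter
mu  <- predict(fit, type = "response")
# contribution of each term to the linear predictor (one column per term)
trm <- predict(fit, type = "terms")

# with a log link the fitted values are exp() of the linear predictor
all.equal(unname(exp(eta$fit)), unname(mu))
```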
Author(s)
Mikis Stasinopoulos
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link]
Examples
data(aids)
mod<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
mod.t <- lpred(mod, type = "terms", terms= "qrt")
mod.t
[Link] <- lp(mod)
[Link]
rm(mod, mod.t,[Link])
Description
The function performs a likelihood ratio test for two nested fitted models.
Usage
[Link](null, alternative, print = TRUE)
Arguments
null The null hypothesis (simpler) fitted model
alternative The alternative hypothesis (more complex) fitted model
print whether to print or save the result
Details
Warning: no checking whether the models are nested is performed.
Value
If print=FALSE a list with chi, df and [Link] is produced.
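The three returned quantities follow the standard likelihood ratio test: twice the difference in log-likelihood, the difference in degrees of freedom, and the upper-tail chi-squared probability. The helper lr_sketch() below is hypothetical and written for base R model objects; it is a sketch of the calculation, not the package's code, and like the function documented here it does not check that the models are nested.

```r
# Hypothetical helper, for illustration only; no nesting check is performed
lr_sketch <- function(null, alternative) {
  ll0 <- logLik(null)
  ll1 <- logLik(alternative)
  chi <- 2 * (as.numeric(ll1) - as.numeric(ll0))   # test statistic
  df  <- attr(ll1, "df") - attr(ll0, "df")         # extra parameters
  list(chi = chi, df = df,
       p.val = pchisq(chi, df, lower.tail = FALSE))
}

m0  <- glm(breaks ~ wool, family = poisson, data = warpbreaks)
m1  <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
res <- lr_sketch(m0, m1)
res
```

For these two Poisson models the statistic equals the drop in deviance from m0 to m1.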
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, dropterm
Examples
data(usair)
m0<-gamlss(y~x1+x2, data=usair)
m1<-gamlss(y~x1+x2+x3+x4, data=usair)
[Link](m0,m1)
Description
[Link], [Link] and [Link] are the gamlss versions of the
generic functions [Link], [Link] and terms respectively.
Usage
## S3 method for class 'gamlss'
[Link](formula, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, ...)
## S3 method for class 'gamlss'
terms(x, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, ...)
## S3 method for class 'gamlss'
Arguments
formula a gamlss object
x a gamlss object
object a gamlss object
what for which parameter to extract the [Link], terms or [Link]
parameter equivalent to what
... for extra arguments
Value
a [Link], a [Link] or terms
Author(s)
Mikis Stasinopoulos
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
data(aids)
mod<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
[Link](mod)
[Link](mod)
terms(mod, "mu")
rm(mod)
Description
This function can be used to plot parallel plots for each individual in a repeated measurement study.
It is based on the coplot() function of R.
Usage
Arguments
formula a formula describing the form of conditioning plot. A formula of the form
y ~ x | a indicates that plots of y versus x should be produced conditional on
the variable a. A formula of the form y ~ x| a * b indicates that plots of y
versus x should be produced conditional on the two variables a and b.
data a data frame containing values for any variables in the formula. This argument
is compulsory.
subjects a factor which distinguishes between the individual participants
color whether the parallel plots are shown in colour, color=TRUE (the default), or not,
color=FALSE
[Link] logical (possibly of length 2 for 2 conditioning variables): should conditioning
plots be shown for the corresponding conditioning variables (default ’TRUE’)
... for extra arguments
Value
It returns a plot.
Note
Note that a similar plot can be found in the package nlme by Pinheiro and Bates
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
library(nlme)
data(Orthodont)
[Link](distance~age,data=Orthodont,subject=Subject)
[Link](distance~age|Sex,data=Orthodont,subject=Subject)
[Link](distance~age|Subject,data=Orthodont,subject=Subject,[Link]=FALSE)
Description
The function tries to merge similar levels of a given factor. It is based on ideas given by Tutz
(2013).
Usage
pcat(fac, df = NULL, lambda = NULL, method = c("ML", "GAIC"), start = 0.001,
Lp = 0, kappa = 1e-05, iter = 100, [Link] = 1e-04, k = 2)
Arguments
fac, factor a factor to reduce its levels
df the effective degrees of freedom df
lambda the smoothing parameter
method which method is used for the estimation of the smoothing parameter, "ML" or
"GAIC" are allowed.
start starting value for lambda if it is estimated using "ML" or "GAIC"
Lp The type of penalty required, Lp=0 is the default. Use Lp=1 for lasso type and
different values for different required penalties.
kappa a regularization parameter used for the weights in the penalties.
iter the number of internal iterations allowed
[Link] the convergence criterion
k the penalty if "GAIC" method is used.
x explanatory factor
y the response or iterative response variable
w iterative weights
xeval indicator whether to predict
formula A formula
data A data frame
along a sequence of values
... for extra variables
Details
The function pcat() is used for the fitting of the factor. It shrinks the levels of the categorical
factor towards each other (not towards the overall mean, as the function random() does).
This results in a reduction of the number of levels of the factor. Different norms can be used for the
shrinkage by specifying the argument Lp.
Value
The function pcat returns a vector endowed with a number of attributes. The vector itself is used in
the construction of the model matrix, while the attributes are needed for the backfitting algorithm
[Link](). The backfitting is done in [Link].
Note
Note that pcat itself does no smoothing; it simply sets things up for [Link]() to do the
smoothing within the backfitting.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Paul Eilers and Marco Enea
References
Tutz G. (2013) Regularization and Sparsity in Discrete Structures in the Proceedings of the 29th
International Workshop on Statistical Modelling, Volume 1, p 29-42, Gottingen, Germany
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
random
Examples
# Simulate data 1
n <- 10 # number of levels
m <- 200 # number of observations
[Link](2016)
level <- [Link](floor(runif(m) * n) + 1)
a0 <- rnorm(n)
sigma <- 0.4
mu <- a0[level]
y <- mu + sigma * rnorm(m)
plot(y~level)
points(1:10,a0, col="red")
da1 <- [Link](y, level)
#------------------
mn <- gamlss(y~1,data=da1 ) # null model
ms <- gamlss(y~level-1, data=da1) # saturated model
m1 <- gamlss(y~pcat(level), data=da1) # calculating lambda ML
AIC(mn, ms, m1)
## Not run:
m11 <- gamlss(y~pcat(level, method="GAIC", k=log(200)), data=da1) # GAIC
AIC(mn, ms, m1, m11)
# getting the fitted object ----------------------------------------------------
getSmo(m1)
coef(getSmo(m1))
fitted(getSmo(m1))[1:10]
plot(getSmo(m1)) #
# After the fit a new factor is created this factor has the reduced levels
levels(getSmo(m1)$factor)
# -----------------------------------------------------------------------------
## End(Not run)
Description
A function to plot probability distribution functions (pdf) belonging to the gamlss family of distri-
butions. This function allows either plotting of the fitted distributions for up to eight observations
or plotting specified distributions belonging in the gamlss family
Usage
[Link](obj = NULL, obs = c(1), family = NO(), mu = NULL,
sigma = NULL, nu = NULL, tau = NULL, min = NULL,
max = NULL, step = NULL, allinone = FALSE,
[Link] = FALSE, ...)
Arguments
obj A gamlss object e.g. obj=model1 where model1 is a fitted gamlss object
obs A number or vector of up to length eight indicating the case numbers of the observations
for which fitted distributions are to be displayed, e.g. obs=c(23,58)
will display the fitted distribution for the 23rd and 58th observations
family This must be a gamlss family i.e. family=NO
mu The value(s) of the location parameter mu for which the distribution has to be
evaluated e.g. mu=c(3,7)
sigma The value(s) of the scale parameter sigma for which the distribution has to be evaluated
e.g. sigma=c(3,7)
nu The value(s) of the parameter nu for which the distribution has to be evaluated e.g.
nu=3
tau The value(s) of the parameter tau for which the distribution has to be evaluated e.g.
tau=5
min Minimum value of the random variable y e.g. min=0
max Maximum value of y e.g. max=10
step Steps for the evaluation of y e.g. step=0.5
allinone This will go
[Link] Whether a title is needed in the plot, default is [Link]=FALSE
... for extra arguments
Details
This function can be used to plot distributions of the GAMLSS family. If the first argument obj
is specified and it is a GAMLSS fitted object, then the fitted distribution of this model at specified
observation values (given by the second argument obs) is plotted for a specified y-variable range
(arguments min, max, and step).
If the first argument is not given then the family argument has to be specified and the pdf is plotted
at specified values of the parameters mu, sigma, nu, tau. Again the range of the y-variable has to be
given.
Value
Warning
Note
The range of the y values given by min, max and step are very important in the plot
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
Description
This function provides four plots for checking the normalized (randomized for a discrete response
distribution) quantile residuals of a fitted GAMLSS object, referred to as residuals below : a plot
of residuals against fitted values, a plot of the residuals against an index or a specific explanatory
variable, a density plot of the residuals and a normal Q-Q plot of the residuals. If argument ts=TRUE
then the first two plots are replaced by the autocorrelation function (ACF) and partial autocorrelation
function (PACF) of the residuals
Usage
## S3 method for class 'gamlss'
plot(x, xvar = NULL, parameters = NULL, ts = FALSE,
summaries = TRUE, ...)
Arguments
x a GAMLSS fitted object
xvar an explanatory variable to plot the residuals against
parameters plotting parameters can be specified here
ts set this to TRUE if ACF and PACF plots of the residuals are required
summaries set this to FALSE if no summary statistics of the residuals are required
... further arguments passed to or from other methods.
Details
This function provides four plots for checking the normalized (randomized) quantile residuals
(called residuals) of a fitted GAMLSS object. Randomization is only performed for discrete
response variables. The four plots are
• residuals against the fitted values (or ACF of the residuals if ts=TRUE)
• residuals against an index or specified x-variable (or PACF of the residuals if ts=TRUE)
• kernel density estimate of the residuals
• QQ-normal plot of the residuals
For time series response variables option ts=TRUE can be used to plot the ACF and PACF functions
of the residuals.
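To make the idea of normalized randomized quantile residuals concrete, here is a base R sketch for a discrete (Poisson) response. It assumes the usual construction: draw a uniform value between the fitted cdf at y - 1 and at y, then map it to the normal scale. The simulated data and the glm fit are illustrative and stand in for a fitted GAMLSS object.

```r
set.seed(123)
x   <- runif(300)
y   <- rpois(300, lambda = exp(1 + x))    # illustrative count response
fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)

# For a discrete response the cdf jumps at y, so draw u uniformly
# between F(y - 1) and F(y) (the randomization step)
p_lower <- ppois(y - 1, mu)
p_upper <- ppois(y, mu)
u   <- runif(length(y), min = p_lower, max = p_upper)

# Map to the normal scale: if the model is adequate these are roughly N(0, 1)
res <- qnorm(u)
```

The four diagnostic plots then amount to plotting res against mu, res against an index or x, a density estimate of res, and qqnorm(res). For a continuous response no randomization is needed and the residuals are simply qnorm of the fitted cdf evaluated at y.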
Value
Returns four plots related to the residuals of the fitted GAMLSS model and prints summary statistics
for the residuals if summaries=TRUE
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Kalliope Akantziliotou
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
data(aids)
a<-gamlss(y~pb(x)+qrt,family=PO,data=aids)
plot(a)
rm(a)
Description
Plots the estimated density, its cdf or its inverse cdf
Usage
## S3 method for class 'histSmo'
plot(x, type = c("hist", "cdf", "invcdf"), ...)
Arguments
x A histSmo object
type Different plots: a histogram and density estimator, a cdf function or an inverse
cdf function.
... for further arguments
Value
returns the relevant plot
Author(s)
Mikis Stasinopoulos, Paul Eilers, Bob Rigby, Vlasios Voudouris and Majid Djennad
References
Eilers, P. (2003). A perfect smoother. Analytical Chemistry, 75: 3631-3636.
Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with
comments and rejoinder). Statist. Sci, 11, 89-121.
Lindsey, J.K. (1997) Applying Generalized Linear Models. New York: Springer-Verlag. ISBN
0-387-98218-3
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
histSmo
Examples
Y <- rPARETO2(1000)
m1<- histSmo(Y, lower=0, save=TRUE)
plot(m1)
plot(m1, "cdf")
plot(m1, "invcdf")
Description
This function is designed to plot a factor to factor interaction in a GAMLSS model.
Usage
plot2way(obj, terms = list(), what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, [Link] = TRUE, ...)
Arguments
obj A gamlss model
terms this should be a character vector with the names of the two factors to be plotted
what which parameters? mu, sigma, nu, or tau
parameter equivalent to what
[Link] whether to show the legend in the two way plot
... Further arguments
Details
This is an experimental function which should be used with prudence, since no check is done
on whether this interaction interferes with other terms in the model.
Value
The function creates a 2 way interaction plot
Author(s)
Mikis Stasinopoulos
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link]
Examples
data(aids)
ti <- factor(c(rep(1,18),rep(2,27)))
m1 <- gamlss(y~x+qrt*ti, data=aids, family=NBI)
m2 <- gamlss(y~x+qrt*ti, data=aids, family=NO)
plot2way(m1, c("qrt","ti"))
plot2way(m1, c("ti", "qrt"))
polyS
Description
These two functions are similar to poly and polym in R. They are needed for the [Link] function
of GAMLSS and should not be used on their own.
Usage
polyS(x, ...)
[Link](m, degree = 1)
Arguments
x a variable
m a variable
degree the degree of the polynomial
... for extra arguments
Value
Returns a matrix of orthogonal polynomials
Warning
Not to be used directly by the user.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
See Also
gamlss, [Link]
[Link] Extract Predictor Values and Standard Errors For New Data In a
GAMLSS Model
Description
[Link] is the GAMLSS-specific method which produces predictors for a new data set for
a specified parameter from a GAMLSS object. [Link] can be used to extract the
linear predictors, fitted values and specific terms in the model at new data values in the same way
that the [Link]() and [Link]() functions can be used for lm or glm objects. Note that
linear predictors, fitted values and specific terms in the model at the current data values can also be
extracted using the function lpred() (which is called from predict if newdata is NULL).
Usage
## S3 method for class 'gamlss'
predict(object, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL,
newdata = NULL, type = c("link", "response", "terms"),
terms = NULL, [Link] = FALSE, data = NULL, ...)
predictAll(object, newdata = NULL, type = c("response", "link", "terms"),
terms = NULL, [Link] = FALSE, [Link] = FALSE,
data = NULL, [Link] = "median",
[Link] = .Machine$[Link],
output = c("list", "matrix"), ...)
Arguments
object a GAMLSS fitted model
what which distribution parameter is required, default what="mu"
parameter equivalent to what
newdata a data frame containing new values for the explanatory variables used in the
model
type the default for predict, type="link", gets the linear predictor for the specified distribution
parameter. type="response" gets the fitted values for the parameter while type="terms"
gets the fitted terms contribution
terms if type="terms", which terms to be selected (default is all terms)
[Link] if TRUE the approximate standard errors of the appropriate type are extracted, if
they exist
[Link] if [Link]=TRUE the old data and the newdata are merged and the model is
refitted with weights equal to the prior weights for the old data observations and
equal to a very small value (see option [Link]) for the newdata values. This
trick makes it possible to obtain standard errors for all parameters
data the data frame used in the original fit, if it is not defined in the call
[Link] how to get the response values for the newdata if they do not exist. The default
is to take the median, [Link]="median". Other functions such as "max" and "min" are
allowed, as are numerical values.
[Link] what values the weights for the newdata should take
output whether the output should be a "list" (default) or a "matrix"
... for extra arguments
Details
The predict function assumes that the object given in newdata is a data frame containing the right
x-variables used in the model. This could possibly cause problems if transformed variables are used in
the fitting of the original model. For example, let us assume that a transformation of age is needed in
the model, i.e. nage<-age^.5. This could be fitted as mod<-gamlss(y~cs(age^.5),data=mydata)
or as nage<-age^.5; mod<-gamlss(y~cs(nage), data=mydata). The latter could be more efficient
if the data are in thousands rather than in hundreds. In the first case, the code
predict(mod,newdata=[Link](age=c(34,56))) would produce the right results. In the second case a
new data frame has to be created containing the old data plus any newly transformed data. This data
frame has to be declared in the data option. The option newdata should contain a [Link] with the
new names and the transformed values at which prediction is required (see the last example).
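The relationship between the "link" and "response" types can be checked directly on the current data. The following is a minimal sketch, assuming the gamlss package and its aids data are available; it relies on the Poisson family PO using a log link for mu, so that exponentiating the linear predictor should reproduce the fitted values.

```r
library(gamlss)
data(aids)

a <- gamlss(y ~ poly(x, 3) + qrt, family = PO, data = aids, trace = FALSE)

# linear predictor for mu (on the log scale for the PO family)
eta <- predict(a, what = "mu", type = "link")
# fitted values for mu (on the scale of the response)
mu <- predict(a, what = "mu", type = "response")

# with a log link, exp(eta) and mu should agree up to numerical tolerance
all.equal(as.numeric(exp(eta)), as.numeric(mu))
```

The same check with a different family would need that family's inverse link in place of exp().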
Value
A vector or a matrix depending on the options.
Note
This function is under development
Author(s)
Mikis Stasinopoulos
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
lp, lpred
Examples
data(aids)
a<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
newaids<-[Link](x=c(45,46,47), qrt=c(2,3,4))
ap <- predict(a, newdata=newaids, type = "response")
ap
# now getting all the parameters
predictAll(a, newdata=newaids)
rm(a, ap)
data(abdom)
# transform x
aa<-gamlss(y~cs(x^.5),data=abdom)
# predict at old values
predict(aa)[610]
# predict at new values
predict(aa,newdata=[Link](x=42.43))
# now transform x first
nx<-abdom$x^.5
aaa<-gamlss(y~cs(nx),data=abdom)
# create a new data frame
newd<-[Link]( abdom, nx=abdom$x^0.5)
# predict at old values
predict(aaa)[610]
# predict at new values
predict(aaa,newdata=[Link](nx=42.43^.5), data=newd)
Description
[Link] is the GAMLSS specific method for the generic function print which prints objects
returned by modelling functions.
Usage
## S3 method for class 'gamlss'
print(x, digits = max(3, getOption("digits") - 3), ...)
Arguments
x a GAMLSS fitted model
digits the number of significant digits to use when printing
... for extra arguments
Value
Prints a gamlss object
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids)
print(h) # or just h
rm(h)
[Link] Plotting the Profile Deviance for one of the Parameters in a GAMLSS
model
Description
This function plots the profile deviance of one of the (four) parameters in a GAMLSS model. It can
be used if one of the parameters mu, sigma, nu or tau is a constant (not a function of explanatory
variables) to obtain a profile confidence interval.
Usage
Arguments
object A fitted GAMLSS model
which which parameter to get the profile deviance e.g. which="tau"
min the minimum value for the parameter e.g. min=1
max the maximum value for the parameter e.g. max=20
step how often to evaluate the global deviance (defines the step length of the grid for
the parameter) e.g. step=1
length the number of grid points if step is not set, default equal to 7
startlastfit whether to start fitting from the last fit or not, default value is startlastfit=TRUE
plot whether to plot, plot=TRUE or save the results, plot=FALSE
perc what % confidence interval is required
... for extra arguments
Details
This function can be used to provide likelihood-based confidence intervals for a parameter for which
a constant model (i.e. no explanatory model) is fitted, and consequently for checking the adequacy of
a particular value of the parameter. This can be used to check the adequacy of one distribution (e.g.
Box-Cox Cole and Green) nested within another (e.g. Box-Cox power exponential). For example,
one can test whether a Box-Cox Cole and Green (Box-Cox normal) distribution or a Box-Cox
power exponential is appropriate by plotting the profile of the parameter tau. A profile deviance
showing support for tau=2 indicates adequacy of the Box-Cox Cole and Green (i.e. Box-Cox
normal) distribution.
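What a profile deviance computes can be sketched by hand: fix the constant parameter at each value of a grid, refit the model, and record the global deviance. The code below is an illustrative sketch rather than the package's own implementation. It assumes the gamlss package and the abdom data, uses the BCT family with a constant nu, and the names nu_grid, gd and cutoff are hypothetical. The grid limits are arbitrary and may need adjusting.

```r
library(gamlss)
data(abdom)

# grid of fixed values for the constant parameter nu (hypothetical choice)
nu_grid <- seq(-1, 1, by = 0.5)
gd <- numeric(length(nu_grid))

for (i in seq_along(nu_grid)) {
  # refit with nu held fixed at the current grid value
  m <- gamlss(y ~ pb(x), family = BCT, data = abdom,
              nu.start = nu_grid[i], nu.fix = TRUE, trace = FALSE)
  gd[i] <- deviance(m)   # global deviance of the constrained fit
}

# profile deviance curve; grid values below the dashed line lie inside
# an approximate 95% likelihood-based confidence interval
cutoff <- min(gd) + qchisq(0.95, df = 1)
plot(nu_grid, gd, type = "b", xlab = "nu", ylab = "global deviance")
abline(h = cutoff, lty = 2)
```

A finer grid gives a smoother curve at the cost of more refits, which is the trade-off described in the Warning for this function.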
Value
Returns a profile plot (if the argument plot=TRUE) and an [Link] object if saved.
The object contains:
values the values at the grid where the parameter was evaluated
fun the function which approximates the points using splines
min the minimum value in the grid
max the maximum value in the grid
[Link] the value of the parameter minimising the profile deviance (or GAIC)
CI the profile confidence interval (if global deviance is used)
criterion which criterion was used
Warning
A dense grid (i.e. small step) evaluation of the global deviance can take a long time, so start with a
sparse grid (i.e. large step) and decrease gradually the step length for more accuracy.
Author(s)
Calliope Akantziliotou, Mikis Stasinopoulos <[Link]@[Link]> and Bob Rigby
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link]
Examples
## Not run:
data(abdom)
h<-gamlss(y~pb(x), [Link]=~pb(x), family=BCT, data=abdom)
[Link](h,"nu",min=-2.000,max=2)
rm(h)
## End(Not run)
[Link] Plotting the Profile: deviance or information criterion for one of the
terms (or hyper-parameters) in a GAMLSS model
Description
This function plots the profile deviance for a chosen parameter included in the linear predictor of
any of the mu, sigma, nu or tau models so that profile confidence intervals can be obtained. It can also
be used to plot the profile of a specified information criterion for any hyper-parameter when smooth
additive terms are used.
Usage
Arguments
model this is a GAMLSS model, e.g.
model=gamlss(y~cs(x,df=this),[Link]=~cs(x,df=3),data=abdom), where
this indicates the (hyper)parameter to be profiled
criterion whether global deviance ("GD") or information criterion ("GAIC") is profiled.
The default is global deviance criterion="GD"
penalty The penalty value if information criterion is used in criterion, default penalty=2.5
other this can be used to evaluate an expression before the actual fitting of the model
(make sure that those expressions are well defined in the global environment)
min the minimum value for the parameter e.g. min=1
max the maximum value for the parameter e.g. max=20
step how often to evaluate the global deviance (defines the step length of the grid for
the parameter) e.g. step=1
length if the step is left NULL then length is considered for evaluating the grid for the
parameter. It has a default value of 11
xlabel if a label for the axis is required
plot whether to plot, plot=TRUE the resulting profile deviance (or GAIC)
perc what % confidence interval is required
[Link] whether to start from the previous fitted model parameters values or not (default
is TRUE)
... for extra arguments
Details
This function can be used to provide likelihood-based confidence intervals for a parameter involved in
terms in the linear predictor(s). These confidence intervals are more accurate than the ones obtained
from the parameters' standard errors. The function can also be used to plot a profile information
criterion (with a given penalty) against a hyper-parameter. This can be used to check the uniqueness
in hyper-parameter determination using for example [Link].
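The idea of profiling an information criterion against a hyper-parameter can be sketched with an explicit loop. This is an illustration only, not the function's internal code. It assumes the gamlss package and the aids data, profiles the degrees of freedom of a cubic smoothing spline, and uses the hypothetical names df_grid and crit.

```r
library(gamlss)
data(aids)

# candidate values of the hyper-parameter (extra df of the smoother)
df_grid <- 1:10
crit <- numeric(length(df_grid))

for (i in seq_along(df_grid)) {
  d <- df_grid[i]
  m <- gamlss(y ~ cs(x, df = d) + qrt, family = NBI, data = aids,
              trace = FALSE)
  # GAIC with penalty log(n), i.e. a BIC-type criterion
  crit[i] <- GAIC(m, k = log(nrow(aids)))
}

# value of df with the smallest criterion on this grid
df_grid[which.min(crit)]
plot(df_grid, crit, type = "b", xlab = "df", ylab = "GAIC")
```

A curve with a single clear minimum suggests the hyper-parameter is well determined; a flat curve or one with several local minima suggests it is not.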
Value
Returns a profile plot (if the argument plot=TRUE) and an [Link] object if saved.
The object contains:
values the values at the grid where the parameter was evaluated
fun the function which approximates the points using splines
min the minimum value in the grid
max the maximum value in the grid
[Link] the value of the parameter minimising the profile deviance (or GAIC)
CI the profile confidence interval (if global deviance is used)
criterion which criterion was used
Warning
A dense grid (i.e. small step) evaluation of the global deviance can take a long time, so start with a
sparse grid (i.e. large step) and decrease gradually the step length for more accuracy.
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link]
Examples
data(aids)
# fitting a linear model
gamlss(y~x+qrt,family=NBI,data=aids)
# testing the linear beta parameter
mod<-quote(gamlss(y ~ offset(this * x) + qrt, data = aids, family = NBI))
[Link](mod, min=0.06, max=0.11)
# find the hyper parameter using cubic splines smoothing
mod1<-quote(gamlss(y ~ cs(x,df=this) + qrt, data = aids, family = NBI))
[Link](mod1, min=1, max=15, step=1, criterion="GAIC", penalty=log(45))
# find a break point in x
mod2 <- quote(gamlss(y ~ x+I((x>this)*(x-this))+qrt,family=NBI,data=aids))
[Link](mod2, min=1, max=45, step=1, criterion="GD")
rm(mod,mod1,mod2)
Description
There are several functions which use P-spline methodology:
a) pb(), the current version of P-splines, which uses SVD in the fitting and is therefore the most
reliable
b) pbo(), an older version of P-splines which uses simple matrix algebra in the fits
c) pbc(), the new version of cyclic P-splines (using SVD)
d) cy(), the older version of cyclic P-splines
e) pbm(), for fitting monotonic P-splines (using SVD)
f) pbz(), for fitting P-splines which allow the fitted curve to shrink to zero degrees of freedom
g) ps(), the original P-splines with no facility for estimating the smoothing parameters, and
h) pvc(), penalised varying coefficient models.
A theoretical explanation of the above P-splines can be found in Eilers et al. (2016).
The functions take a vector and return it with several attributes. The vector is used in the construction
of the design matrix X used in the fitting. The functions do not do the smoothing, but assign
the attributes to the vector to aid gamlss in the smoothing. The functions doing the smoothing
are [Link](), [Link](), [Link](), [Link](), [Link](), [Link](),
[Link] and [Link](), which are used in the backfitting function [Link].
The function pb() is more efficient and faster than the original penalised smoothing function ps().
After December 2014 pb() changed radically to improve performance. The older version
of the pb() function is now called pbo(). pb() allows the estimation of the smoothing parameters
using different local (performance iterations) methods. The methods are "ML", "ML-1", "EM",
"GAIC" and "GCV".
The function pbm() fits monotonic smooth functions, that is, functions which increase or decrease
monotonically depending on the value of the argument mono, which takes the values "up" or "down".
The function pbz() is similar to pb() with the extra property that when lambda becomes very large
the resulting smooth function goes to a constant rather than to a linear function. This is very useful
for model selection. The function is based on Maria Durban's idea of using a double penalty, one of
order 2 and one of order 1. The second penalty only applies if the effective df are close to 2 (that is,
if a linear fit is already selected).
The function pbc() fits a cyclic penalised B-spline smoother such that the last fitted value of the
smoother is equal to the first fitted value. cy() is the older version.
The function pvc() fits varying coefficient models, see Hastie and Tibshirani (1993), and it is more
general and flexible than the old vc() function, which was based on cubic splines.
The function getZmatrix() creates a (random effect) design matrix Z which can be used to fit a
P-spline smoother using the re() function. (re() is an interface to the random effect
function lme of the package nlme.)
Usage
pb(x, df = NULL, lambda = NULL, control = [Link](...), ...)
pbo(x, df = NULL, lambda = NULL, control = [Link](...), ...)
[Link](inter = 20, degree = 3, order = 2, start = 10, quantiles = FALSE,
method = c("ML", "GAIC", "GCV", "EM", "ML-1"), k = 2, ...)
Arguments
x the univariate predictor
df the desired equivalent number of degrees of freedom (trace of the smoother matrix
minus two for the constant and linear fit)
lambda the smoothing parameter
control setting the control parameters
by a factor, for fitting different smoothing curves to each level of the factor or a
continuous explanatory variable in which case the coefficients of the by variable
change smoothly according to x i.e. beta(x)*z where z is the by variable.
... for extra arguments
inter the number of break points (knots) in the x-axis
degree the degree of the piecewise polynomial
order the required difference in the vector of coefficients
start the lambda starting value if the local methods are used, see below
quantiles if TRUE the quantile values of x are used to determine the knots
ts if TRUE assumes that it is a seasonal factor
method The method used in the (local) performance iterations. Available methods are
"ML", "ML-1", "EM", "GAIC" and "GCV"
k the penalty used in "GAIC" and "GCV"
mono for monotonic P-splines whether going "up" or "down"
Details
The ps() function is based on Brian Marx's function, which can be found on his website. The pb(),
cy(), pvc() and pbm() functions are based on Paul Eilers's original R functions. Note that the ps()
and pb() functions behave differently at their default values if df and lambda are not specified.
ps(x) by default uses 3 extra degrees of freedom for smoothing x. pb(x) by default estimates
lambda (and therefore the degrees of freedom) automatically using a "local" method. The local
(or performance iterations) methods available are: (i) local maximum likelihood, "ML", (ii) local
generalized Akaike information criterion, "GAIC", (iii) local generalized cross validation, "GCV",
(iv) local EM-algorithm, "EM" (which is very slow), and (v) a modified version of ML, "ML-1",
which produces results identical to "EM" but faster.
The function pb() fits a P-spline smoother.
The function pbm() fits a monotonic (going up or down) P-spline smoother.
The function pbc() fits a P-spline smoother where the beginning and end are the same.
The pvc() fits a varying coefficient model.
Note that the local (or performance iterations) methods can occasionally make the convergence of
gamlss less stable compared to models where the degrees of freedom are fixed.
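The effect of the local method on the amount of smoothing can be inspected by fitting the same model under several methods and comparing the degrees of freedom used for mu. The following is a sketch under the assumption that the gamlss package and the aids data are available; the object names m_ml, m_gaic and m_gcv are illustrative only.

```r
library(gamlss)
data(aids)

# same model, three different local methods for estimating lambda
m_ml   <- gamlss(y ~ pb(x, method = "ML") + qrt,
                 family = PO, data = aids, trace = FALSE)
m_gaic <- gamlss(y ~ pb(x, method = "GAIC", k = 2) + qrt,
                 family = PO, data = aids, trace = FALSE)
m_gcv  <- gamlss(y ~ pb(x, method = "GCV") + qrt,
                 family = PO, data = aids, trace = FALSE)

# total degrees of freedom used in the mu model under each method
mu_df <- c(ML = m_ml$mu.df, GAIC = m_gaic$mu.df, GCV = m_gcv$mu.df)
mu_df

# compare the fits on a common criterion
GAIC(m_ml, m_gaic, m_gcv)
```

If one method gives markedly different degrees of freedom, or the fit fails to converge, fixing df or lambda in pb() is a reasonable fallback.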
Value
the vector x is returned, endowed with a number of attributes. The vector itself is used in the
construction of the model matrix, while the attributes are needed for the backfitting algorithms
[Link]().
Warning
There are occasions where the automatic local methods do not work. One occasion which came
to our attention is when the range of the response variable values is very large. Scaling the response
variable will solve the problem.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Paul Eilers
References
Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with
comments and rejoinder). Statist. Sci, 11, 89-121.
Eilers, Paul HC, Marx, Brian D and Durban, Maria, (2016) Twenty years of P-splines. SORT-
Statistics and Operations Research Transactions, 39, 149–186.
Hastie, T. J. and Tibshirani, R. J. (1993), Varying coefficient models (with discussion),J. R. Statist.
Soc. B., 55, 757-796.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link], cs
Examples
#==============================
# pb() and ps() functions
data(aids)
# fitting a P-spline smoother with 7 degrees of freedom
# plus a quarterly effect
aids1<-gamlss(y~ps(x,df=7)+qrt,data=aids,family=PO) # fix df's
aids2<-gamlss(y~pb(x,df=7)+qrt,data=aids,family=PO) # fix df's
aids3<-gamlss(y~pb(x)+qrt,data=aids,family=PO) # estimate lambda
with(aids, plot(x,y))
with(aids, lines(x,fitted(aids1),col="red"))
with(aids, lines(x,fitted(aids2),col="green"))
with(aids, lines(x,fitted(aids3),col="yellow"))
rm(aids1, aids2, aids3)
#=============================
## Not run:
# pbc()
# simulate data
[Link](555)
x = seq(0, 1, length = 100)
y = sign(cos(1 * x * 2 * pi + pi / 4)) + rnorm(length(x)) * 0.2
plot(y~x)
m1<-gamlss(y~pbc(x))
lines(fitted(m1)~x)
rm(y,x,m1)
#=============================
# the pvc() function
# function to generate data
genData <- function(n=200)
{
f1 <- function(x)-60+15*x-0.10*x^2
f2 <- function(x)-120+10*x+0.08*x^2
[Link](1441)
x1 <- runif(n/2, min=0, max=55)
x2 <- runif(n/2, min=0, max=55)
y1 <- f1(x1)+rNO(n=n/2,mu=0,sigma=20)
y2 <- f2(x2)+rNO(n=n/2,mu=0,sigma=30)
y <- c(y1,y2)
x <- c(x1,x2)
f <- gl(2,n/2)
da<-[Link](y,x,f)
da
}
da<-genData(500)
plot(y~x, data=da, pch=21,bg=c("gray","yellow3")[unclass(f)])
# fitting models
# smoothing x
m1 <- gamlss(y~pb(x), data=da)
# parallel smoothing lines
m2 <- gamlss(y~pb(x)+f, data=da)
# linear interaction
m3 <- gamlss(y~pb(x)+f*x, data=da)
# varying coefficient model
m4 <- gamlss(y~pvc(x, by=f), data=da)
GAIC(m1,m2,m3,m4)
# plotting the fit
lines(fitted(m4)[da$f==1][order(da$x[da$f==1])]~da$x[da$f==1]
[order(da$x[da$f==1])], col="blue", lwd=2)
lines(fitted(m4)[da$f==2][order(da$x[da$f==2])]~da$x[da$f==2]
[order(da$x[da$f==2])], col="red", lwd=2)
rm(da,m1,m2,m3,m4)
#=================================
# the rent data
# first with a factor
data(rent)
plot(R~Fl, data=rent, pch=21,bg=c("gray","blue")[unclass(rent$B)])
r1 <- gamlss(R~pb(Fl), data=rent)
# identical to model
r11 <- gamlss(R~pvc(Fl), data=rent)
# now with the factor
r2 <- gamlss(R~pvc(Fl, by=B), data=rent)
lines(fitted(r2)[rent$B==1][order(rent$Fl[rent$B==1])]~rent$Fl[rent$B==1]
[order(rent$Fl[rent$B==1])], col="blue", lwd=2)
lines(fitted(r2)[rent$B==0][order(rent$Fl[rent$B==0])]~rent$Fl[rent$B==0]
[order(rent$Fl[rent$B==0])], col="red", lwd=2)
# probably not very sensible model
rm(r1,r11,r2)
#-----------
# now with a continuous variable
# additive model
h1 <-gamlss(R~pb(Fl)+pb(A), data=rent)
# varying-coefficient model
h2 <-gamlss(R~pb(Fl)+pb(A)+pvc(A,by=Fl), data=rent)
AIC(h1,h2)
rm(h1,h2)
#-----------
# monotone function
[Link](1334)
x = seq(0, 1, length = 100)
p = 0.4
y = sin(2 * pi * p * x) + rnorm(100) * 0.1
plot(y~x)
m1 <- gamlss(y~pbm(x))
points(fitted(m1)~x, col="red")
yy <- -y
plot(yy~x)
m2 <- gamlss(yy~pbm(x, mono="down"))
points(fitted(m2)~x, col="red")
#==========================================
# the pbz() function
# creating uncorrelated data
[Link](123)
y<-rNO(100)
x<-1:100
plot(y~x)
#----------------------
# ML estimation
m1<-gamlss(y~pbz(x))
m2 <-gamlss(y~pb(x))
AIC(m1,m2)
op <- par( mfrow=c(1,2))
[Link](m1, partial=T)
[Link](m2, partial=T)
par(op)
# GAIC estimation
m11<-gamlss(y~pbz(x, method="GAIC", k=2))
m21 <-gamlss(y~pb(x, method="GAIC", k=2))
AIC(m11,m21)
op <- par( mfrow=c(1,2))
[Link](m11, partial=T)
[Link](m21, partial=T)
par(op)
# GCV estimation
m12<-gamlss(y~pbz(x, method="GCV"))
m22 <-gamlss(y~pb(x, method="GCV"))
AIC(m12,m22)
op <- par( mfrow=c(1,2))
[Link](m12, partial=T)
[Link](m22, partial=T)
par(op)
# fixing df is more tricky since df are the extra df
m13<-gamlss(y~pbz(x, df=0))
m23 <-gamlss(y~pb(x, df=0))
AIC(m13,m23)
# here the second penalty does not take effect, therefore identical results
m14<-gamlss(y~pbz(x, df=1))
m24 <-gamlss(y~pb(x, df=1))
AIC(m14,m24)
# fixing lambda
m15<-gamlss(y~pbz(x, lambda=1000))
m25 <-gamlss(y~pb(x, lambda=1000))
AIC(m15,m25)
#--------------------------------------------------
# prediction
m1<-gamlss(y~pbz(x), data=[Link](y,x))
m2 <-gamlss(y~pb(x), data=[Link](y,x))
AIC(m1,m2)
predict(m1, newdata=[Link](x=c(80, 90, 100, 110)))
predict(m2, newdata=[Link](x=c(80, 90, 100, 110)))
#---------------------------------------------------
## End(Not run)
Description
This function calculates and prints the Q-statistics (or Z-statistics) which are useful to test normality
of the residuals within a range of an independent variable, for example age in centile estimation,
see Royston and Wright (2000).
Usage
[Link](obj = NULL, xvar = NULL, resid = NULL, [Link] = NULL, [Link] = 10,
zvals = TRUE, save = TRUE, plot = TRUE, [Link] = getOption("digits"),
...)
Arguments
obj a GAMLSS object
xvar a unique explanatory variable
resid quantile or standardised residuals can be given here instead of a GAMLSS object
in obj. In this case the function behaves differently (see details below)
[Link] the x-axis cut off points e.g. c(20,30). If [Link]=NULL then the [Link]
argument is activated
[Link] if [Link]=NULL this argument gives the number of intervals in which the
x-variable will be split, with default 10
zvals if TRUE the output matrix contains the individual Z-statistics rather than the Q
statistics
save whether to save the Q-statistics or not with default equal to TRUE. In this case
the functions produce a matrix giving individual Q (or z) statistics and the final
aggregate Q’s
plot whether to plot a visual version of the Q statistics (default is TRUE)
[Link] to control the number of digits of the xvar in the plot
... for extra arguments
Details
Note that the function [Link] behaves differently depending on whether the obj or the resid argument
is set. The obj argument produces the Q-statistics (or Z-statistics) table appropriate for centile
estimation (it therefore expects a reasonably large number of observations). The argument resid allows
any model residuals (not necessarily GAMLSS), suitably standardised, and is appropriate for
any size of data. The resulting table contains only the individual Z-statistics.
Value
A table containing the Q-statistics or Z-statistics. If plot=TRUE it also produces a graphical
representation of the table.
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Royston P. and Wright E. M. (2000) Goodness of fit statistics for the age-specific reference intervals.
Statistics in Medicine, 19, pp 2943-2962.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link], wp
Examples
data(abdom)
h<-gamlss(y~pb(x), [Link]=~pb(x), family=BCT, data=abdom)
[Link](h,xvar=abdom$x,[Link]=8)
[Link](h,xvar=abdom$x,[Link]=8,zvals=FALSE)
[Link](resid=resid(h), xvar=abdom$x, [Link]=5)
rm(h)
quantSheets
Description
The quantile sheets function quantSheets() is based on the work of Sabine Schnabel and Paul
Eilers (see references below). The estimation of the quantile curves is done simultaneously by
smoothing in the direction of y as well as x. This reduces (but does not completely eliminate) the
problem of crossing quantiles.
Usage
quantSheets(y, x, [Link] = 1, [Link] = 1, data = NULL,
cent = 100 * pnorm((-4:4) * 2/3),
control = [Link](...), print = TRUE, ...)
Arguments
y the y variable
x the x variable
[Link] smoothing parameter in the direction of x
[Link] smoothing parameter in the direction of y (probabilities)
data the data frame
cent the centile values at which the quantile sheets are evaluated
control for the parameters controlling the algorithm
print whether to print the sample percentages
[Link] number of intervals in the x direction for the B-splines
[Link] number of intervals in the probabilities (y-direction) for the B-splines
degree the degree for the B-splines
logit whether to use logit(p) instead of p (probabilities) for the y-axis
order the order of the penalty
kappa is a ridge parameter set to zero (for no ridge effect)
[Link] number of cycles of the algorithm
Details
The advantage of quantile sheets is that they estimate all the quantiles simultaneously. This almost
eliminates the problem of crossing quantiles. The method is very fast and useful as an exploratory
tool. The function needs two smoothing parameters. Those two parameters have to be specified by the
user. They are not estimated automatically. They can be selected by visual inspection.
The disadvantage of quantile sheets comes from the fact that, like all non-parametric techniques,
they do not have a goodness-of-fit measure to check how good the model is, and residual-based
diagnostics do not exist since it is difficult to define residuals in this set-up.
In this implementation we do provide residuals by using the flexDist() function from package
[Link]. This is based on the idea that by knowing the quantiles of the distribution we can
reconstruct non parametrically the distribution itself and this is what flexDist() is doing. As a
word of caution, such a construct is based on several assumptions and depends on several smoothing
parameters. Treat those residuals with caution. The same caution should apply to the function
[Link]().
Value
Using the function quantSheets() a quantSheets object is returned having the following methods:
print(), fitted(), predict() and resid().
Using findPower() a single value of the power parameter is returned.
Using [Link] a vector of z-scores is returned.
Author(s)
Mikis Stasinopoulos, based on functions provided by Paul Eilers and Sabine Schnabel
References
Schnabel, S.K. (2011) Expectile smoothing: new perspectives on asymmetric least squares. An
application to life expectancy, Utrecht University.
Schnabel, S. K and Eilers, P. H. C.(2013) Simultaneous estimation of quantile curves using quantile
sheets, AStA Advances in Statistical Analysis, 97, 1, pp 77-87, Springer.
Schnabel, S. K and Eilers, P. H. (2013) A location-scale model for non-crossing expectile curves,
Stat, 2, 1, pp 171-183.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
lms: for equivalent parametric results.
Examples
data(abdom)
m1 <- quantSheets(y,x, data=abdom)
head(fitted(m1))
p1 <- predict(m1, newdata=c(20,30,40))
matpoints(c(20,30,40), p1)
[Link](m1,y=c(150, 300),x=c(20, 30) )
# if a power transformation were needed (it is not appropriate for these data)
findPower(y,x, data=abdom)
Description
There are two functions for fitting random effects within a GAMLSS model, random() and re().
The function random() is based on the original random() function of Trevor Hastie in the package
gam. In our version the function has been modified to allow a "local" maximum likelihood estimation
of the smoothing parameter lambda. This method is equivalent to the PQL method of Breslow
and Clayton (1993) applied at the local iterations of the algorithm. In fact, for a GLM model and a
simple random effect it is equivalent to the glmmPQL() function in the package MASS; see Venables and
Ripley (2002). Venables and Ripley (2002) claimed that this iterative method was first introduced
by Schall (1991). Note that in order for the "local" maximum likelihood estimation procedure to
operate, both arguments df and lambda have to be NULL.
The function re() is an interface for calling the lme() function of the package nlme. This gives
the user the ability to fit complicated random effect models while the assumption of the normal
distribution for the response variable is relaxed. The theoretical justification comes again from the fact
that this is a PQL method, Breslow and Clayton (1993).
Usage
random(x, df = NULL, lambda = NULL, start=10)
Arguments
x a factor
df the target degrees of freedom
lambda the smoothing parameter lambda which can be viewed as a shrinkage parameter.
start starting value for lambda if local maximum likelihood is used.
fixed a formula specifying the fixed effects of the lme() model. In most cases this can
also be included in the gamlss parameter formula
random a formula or list specifying the random effect part of the model as in the lme()
function
correlation the correlation structure of the lme() model
method which method, "ML" (the default), or "REML"
... this can be used to pass arguments for lmeControl()
Details
The function random() can be seen as a smoother for use with factors in gamlss(). It allows the
fitted values for a factor predictor to be shrunk towards the overall mean, where the amount of
shrinking depends either on lambda, or on the equivalent degrees of freedom or on the estimated
sigma parameter (default). Similar in spirit to smoothing splines, this fitting method can be justified
on Bayesian grounds or by a random effects model. Note that the behaviour of the function is different
from the original Hastie function. Here the function behaves as follows: i) if both df and lambda
are NULL then the PQL method is used; ii) if lambda is not NULL, lambda is used for fitting; iii) if
lambda is NULL and df is not NULL then df is used for fitting.
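The three cases can be tried side by side. The sketch below assumes the gamlss and nlme packages are installed and reuses the ergoStool data from the Examples; the values lambda = 1 and df = 4 are arbitrary choices made only for illustration:

```r
library(gamlss)
library(nlme)      # used here only for the ergoStool data
data(ergoStool)

# i) df and lambda both NULL: lambda is estimated (local ML / PQL)
r1 <- gamlss(effort ~ Type + random(Subject), data = ergoStool, trace = FALSE)
# ii) lambda supplied: the given value is used for fitting
r2 <- gamlss(effort ~ Type + random(Subject, lambda = 1), data = ergoStool,
             trace = FALSE)
# iii) lambda NULL and df supplied: df is used for fitting
r3 <- gamlss(effort ~ Type + random(Subject, df = 4), data = ergoStool,
             trace = FALSE)

# total degrees of freedom used for mu under each setting
c(estimated = r1$mu.df, fixed.lambda = r2$mu.df, fixed.df = r3$mu.df)
```

Comparing the three values gives a feel for how much shrinkage each setting implies.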
Since factors are coded by [Link]() into a set of contrasts, care has been taken to add an
appropriate "contrast" attribute to the output of random(). This zero contrast results in a column of
zeros in the model matrix, which is aliased with any column and is hence ignored.
The use of the function re() requires knowledge of the use of the function lme() of the package
nlme for the specification of the appropriate random effect model. Some care should be taken
whether the data set is
Value
x is returned with class "smooth", with an attribute named "call" which is to be evaluated in the
backfitting [Link]() called by gamlss()
Author(s)
For re(), Mikis Stasinopoulos and Marco Enea; for random(), Trevor Hastie (amended by Mikis
Stasinopoulos).
References
Breslow, N. E. and Clayton, D. G. (1993) Approximate inference in generalized linear mixed models.
Journal of the American Statistical Association 88, 9-25.
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Pinheiro, Jose C and Bates, Douglas M (2000) Mixed effects models in S and S-PLUS Springer.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Schall, R. (1991) Estimation in generalized linear models with random effects. Biometrika 78,
719-727.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC. (see also [Link]
[Link]/).
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
gamlss, [Link]
Examples
#------------- Example 1 from Pinheiro and Bates (2000) page 15-----------------
# bring nlme
library(nlme)
data(ergoStool)
# lme model
l1<-lme(effort~Type, data=ergoStool, random=~1|Subject, method="ML")
# use random()
t1<-gamlss(effort~Type+random(Subject), data=ergoStool )
# use re() with fixed effect within re()
t2<-gamlss(effort~re(fixed=~Type, random=~1|Subject), data=ergoStool )
# use re() with fixed effect in gamlss formula
t3<-gamlss(effort~Type+re(random=~1|Subject), data=ergoStool )
# compare lme fitted values with random
plot(fitted(l1), fitted(t1))
# compare lme fitted values with re()
plot(fitted(l1), fitted(t2))
lines(fitted(l1), fitted(t3), col=2)
# getting the fitted coefficients
getSmo(t2)
#-------------------------------------------------------------------------------
## Not run:
#-------------Example 2 Hodges data---------------------------------------------
data(hodges)
plot(prind~state, data=hodges)
m1<- gamlss(prind~random(state), [Link]=~random(state), [Link]=~random(state),
[Link]=~random(state), family=BCT, data=hodges)
m2<- gamlss(prind~re(random=~1|state), [Link]=~re(random=~1|state),
[Link]=~re(random=~1|state), [Link]=~re(random=~1|state), family=BCT,
data=hodges)
# comparing the fitted effective degrees of freedom
m1$[Link]
m2$[Link]
m1$[Link]
m2$[Link]
m1$[Link]
m2$[Link]
m1$[Link]
m2$[Link]
# random effect for tau is not needed
m3<- gamlss(prind~random(state), [Link]=~random(state), [Link]=~random(state),
family=BCT, data=hodges, [Link]=m1)
plot(m3)
# term plots work for random but not at the moment for re()
op <- par(mfrow=c(2,2))
[Link](m3, se=TRUE)
[Link](m3, se=TRUE, what="sigma")
[Link](m3, se=TRUE, what="nu")
par(op)
# getting information from a fitted lme object
coef(getSmo(m2))
ranef(getSmo(m2))
VarCorr(getSmo(m2))
summary(getSmo(m2))
intervals(getSmo(m2))
fitted(getSmo(m2))
fixef(getSmo(m2))
# plotting
plot(getSmo(m2))
qqnorm(getSmo(m2))
#----------------Example 3 from Pinheiro and Bates (2000) page 42---------------
data(Pixel)
l1 <- lme(pixel~ day+I(day^2), data=Pixel, random=list(Dog=~day, Side=~1),
method="ML")
# this will fail
#t1<-gamlss(pixel~re(fixed=~day+I(day^2), random=list(Dog=~day, Side=~1)),
# data=Pixel)
# but this is working
t1<-gamlss(pixel~re(fixed=~day+I(day^2), random=list(Dog=~day, Side=~1),
opt="optim"), data=Pixel)
plot(fitted(l1)~fitted(t1))
#---------------Example 4 from Pinheiro and Bates (2000)page 146----------------
data(Orthodont)
l1 <- lme(distance~ I(age-11), data=Orthodont, random=~I(age-11)|Subject,
method="ML")
t1<-gamlss(distance~I(age-11)+re(random=~I(age-11)|Subject), data=Orthodont)
plot(fitted(l1)~fitted(t1))
# checking the model
plot(t1)
wp(t1, [Link]=2)
# two observations suggest fat tails, so try the LO family
t2<-gamlss(distance~I(age-11)+re(random=~I(age-11)|Subject, opt="optim",
numIter=100), data=Orthodont, family=LO)
plot(t2)
wp(t2,[Link]=2)
# a bit better but not satisfactory. Note that 3-parameter distributions fail
#------------Example 5 from Venables and Ripley (2002)-------------------------
library(MASS)
data(bacteria)
summary(glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
family = binomial, data = bacteria))
s1 <- gamlss(y ~ trt + I(week > 2)+random(ID), family = BI, data = bacteria)
s2 <- gamlss(y ~ trt + I(week > 2)+re(random=~1|ID), family = BI,
data = bacteria)
s3 <- gamlss(y ~ trt + I(week > 2)+re(random=~1|ID, method="REML"), family = BI,
data = bacteria)
# the estimate of the random effect sd sigma_b
sqrt(getSmo(s1)$tau2)
getSmo(s2)
getSmo(s3)
#-------------Example 6 from Pinheiro and Bates (2000) page 239-244-------------
# using corAR1()
data(Ovary)
# AR1
l1 <- lme(follicles~sin(2*pi*Time)+cos(2*pi*Time), data=Ovary,
random=pdDiag(~sin(2*pi*Time)), correlation=corAR1())
# ARMA
l2 <- lme(follicles~sin(2*pi*Time)+cos(2*pi*Time), data=Ovary,
random=pdDiag(~sin(2*pi*Time)), correlation=corARMA(q=2))
# now gamlss
# AR1
t1 <- gamlss(follicles~re(fixed=~sin(2*pi*Time)+cos(2*pi*Time),
random=pdDiag(~sin(2*pi*Time)),
correlation=corAR1()), data=Ovary)
plot(fitted(l1)~fitted(t1))
# ARMA
t2 <- gamlss(follicles~re(fixed=~sin(2*pi*Time)+cos(2*pi*Time),
random=pdDiag(~sin(2*pi*Time)),
correlation=corARMA(q=2)), data=Ovary)
plot(fitted(l2)~fitted(t2))
AIC(t1,t2)
wp(t2, [Link]=1)
#-------------------------------------------------------------------------------
## End(Not run)
Description
This function refits a GAMLSS model. It is useful when the algorithm has not converged after 20
outer iterations (the default value).
Usage
refit(object, ...)
Arguments
object a GAMLSS fitted model which has not converged
... for extra arguments
Details
This function is useful when the iterations have reached the maximum value set by the argument [Link] of
the [Link] function and the model has not converged yet.
Value
Returns a GAMLSS fitted model
Note
The function update does a very similar job
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link]
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids) #
refit(h)
rm(h)
114 [Link]
Description
[Link] is the GAMLSS specific method for the generic function residuals which
extracts the residuals for a fitted model. The abbreviated form resid is an alias for residuals.
Usage
## S3 method for class 'gamlss'
residuals(object, what = c("z-scores", "mu", "sigma", "nu", "tau"),
type = c("simple", "weighted", "partial"),
terms=NULL, ...)
Arguments
object a GAMLSS fitted model
what specify whether the standardized residuals are required, called here the "z-scores",
or residuals for a specific parameter
type the type of residual if residuals for a parameter are required
terms if type is "partial" this specifies which term is required
... for extra arguments
Details
The "z-scores" residuals saved in a GAMLSS object are the normalized (randomized) quantile residuals
(see Dunn and Smyth, 1996). Randomization is only needed for the discrete family distributions,
see also [Link]. Residuals for a specific parameter can be "simple" = (working variable
- linear predictor), "weighted" = sqrt(working weights)*(working variable - linear predictor) or
"partial" = (working variable - linear predictor) + contribution of specific terms.
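As a rough illustration of the definition (a base R sketch, not the package code), for a discrete response the randomized quantile residual draws u uniformly between F(y-1) and F(y) and then maps it to the normal scale. The Poisson model with known mean 3 and the simulated data are assumptions made purely for the example:

```r
set.seed(1)
lambda <- 3
y <- rpois(500, lambda)

# the cdf jumps at each observed count: pick a point at random inside the jump
lower <- ppois(y - 1, lambda)   # F(y - 1); equals 0 when y = 0
upper <- ppois(y, lambda)       # F(y)
u <- runif(length(y), min = lower, max = upper)

# normalized randomized quantile residuals ("z-scores")
r <- qnorm(u)
c(mean = mean(r), sd = sd(r))
```

When the model is correct, as it is here by construction, the residuals should behave like a standard normal sample.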
Value
A vector or a matrix of the appropriate residuals of a GAMLSS model. Note that when weights are
used in the fitting, the length of the residuals can differ from N, the length of the fitted values.
Observations with weights equal to zero do not appear in the residuals. Also, observations with
frequencies as weights will appear more than once according to their frequencies.
Note
The "weighted" residuals of a specified parameter can be zero and one if the square of the first derivative
has been used in the fitting of this parameter.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link], [Link], [Link], [Link], [Link], [Link],
[Link], [Link], [Link]
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=NBI, data=aids) #
plot(aids$x,resid(h))
plot(aids$x,resid(h,"sigma") )
rm(h)
Description
The function ri() allows the user to fit a ridge regression within GAMLSS. It allows the coefficients
of a set of explanatory variables to be shrunk towards zero. The amount of shrinking depends either
on lambda, or on the equivalent degrees of freedom (df). The type of shrinking depends on the
argument Lp (see the examples).
Usage
ri(X, df = NULL, lambda = NULL, method = c("ML", "GAIC"),
order = 0, start = 10, Lp = 2, kappa = 1e-05,
iter = 100, [Link] = 1e-06, k = 2)
Arguments
X a matrix of explanatory variables X, which is standardised (mean=0, sd=1) automatically
df the effective degrees of freedom df
lambda the smoothing parameter lambda
method which method is used for the estimation of the smoothing parameter; ‘ML’ or
‘GAIC’ are allowed.
order the order of the difference applied to the coefficients, with default zero. (Do not
change this unless there is some ordering in the explanatory variables.)
start starting value for lambda if it is estimated using ‘ML’ or ‘GAIC’
Lp the type of penalty required; Lp=2, a proper ridge regression, is the default. Use
Lp=1 for lasso and different values for different penalties.
kappa a regularization parameter used for the weights in the penalties.
iter the number of internal iterations allowed (see Details).
[Link] the convergence criterion
k the penalty if the ‘GAIC’ method is used.
Details
This implementation of ridge and related regressions is based on an idea of Paul Eilers, which uses
weights in the penalty matrix. The type of weights is defined by the argument Lp. Lp=2 is the
standard ridge regression, Lp=1 fits a lasso regression, while Lp=0 allows a "best subset" regression;
see Hastie et al (2009) page 71.
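The idea of putting weights in the penalty matrix can be sketched in base R as an iteratively reweighted ridge regression. This is a generic illustration, not the code used inside ri(): the helper name lp_ridge, the exact form of the weights, and the simulated data are all assumptions made for the example.

```r
# Iteratively reweighted ridge: the penalty lambda * sum(w * beta^2) uses
# weights w close to |beta|^(Lp - 2), so Lp = 2 gives w = 1 (plain ridge).
# Hypothetical helper, NOT the implementation inside ri().
lp_ridge <- function(X, y, lambda, Lp = 2, kappa = 1e-5, iter = 100, tol = 1e-6) {
  XtX  <- crossprod(X)
  Xty  <- crossprod(X, y)
  beta <- drop(solve(XtX + lambda * diag(ncol(X)), Xty))   # plain ridge start
  for (i in seq_len(iter)) {
    w    <- (beta^2 + kappa^2)^((Lp - 2) / 2)   # kappa keeps the weights finite
    new  <- drop(solve(XtX + lambda * diag(w, ncol(X)), Xty))
    done <- max(abs(new - beta)) < tol
    beta <- new
    if (done) break
  }
  beta
}

set.seed(42)
X <- scale(matrix(rnorm(100 * 5), 100, 5))            # standardised columns
y <- drop(X %*% c(3, 0, 0, 1.5, 0) + rnorm(100))
y <- y - mean(y)

round(rbind(ridge  = lp_ridge(X, y, lambda = 10, Lp = 2),
            lasso  = lp_ridge(X, y, lambda = 10, Lp = 1),
            subset = lp_ridge(X, y, lambda = 10, Lp = 0)), 3)
```

Smaller values of Lp penalise small coefficients more heavily relative to large ones, which is what pushes the fit towards a sparse solution.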
Value
x is returned with class "smooth", with an attribute named "call" which is to be evaluated in the
backfitting [Link]() called by gamlss()
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Paul Eilers
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Rigby, R. and Stasinopoulos, D. M (2013) Automatic smoothing parameter selection in GAMLSS
with an application to centile estimation, Statistical methods in medical research.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss
Examples
# USAIR DATA
X<-with(usair, cbind(x1,x2,x3,x4,x5,x6))
# standardise data -------------------------------------------------------------
sX<-scale(X)
# ridge
m1<- gamlss(y~ri(sX), data=usair)
# lasso
m2<- gamlss(y~ri(sX, Lp=1), data=usair)
# best subset
m3<- gamlss(y~ri(sX, Lp=0), data=usair)
#-------- plotting the coefficients
op <- par(mfrow=c(3,1))
plot(getSmo(m1)) #
plot(getSmo(m2))
plot(getSmo(m3))
par(op)
Description
This function plots worm plots, van Buuren and Fredriks M. (2001), or QQ-plots of the normalized
randomized quantile residuals (Dunn and Smyth, 1996) for a model using a discrete GAMLSS
family distribution.
Usage
[Link](obj = NULL, howmany = 6, plot = c("all", "average"),
type = c("wp", "QQ"), ...)
Arguments
obj a fitted GAMLSS model object from a "discrete" type of family
howmany the number of QQ-plots required, up to ten, e.g. howmany=6
plot whether to plot all the residual realisations "all" or just the mean
"average"
type whether to plot worm plots "wp" or QQ plots "QQ", with default worm plots
... for extra arguments to be passed to wp()
Details
For discrete family distributions, the gamlss() function saves on exit one realization of randomized
quantile residuals which can be plotted using the generic function plot which calls the [Link].
Looking at only one realization can be misleading, so the current function creates QQ-plots for several
realizations. The function allows up to 10 QQ-plots to be plotted. Occasionally one wishes
to create a lot of realizations and then take a median of them (separately for each ordered value)
to create a single median realization. The option all in combination with the option howmany
creates a QQ-plot of the medians of the normalized randomized quantile residuals. These 'median'
randomized quantile residuals can be saved using the option (save=TRUE).
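The median realization described above can be mimicked in base R. This is only a sketch of the idea, assuming a Poisson model with known mean; the helper one_realization and the choice of 40 realizations are hypothetical and are not taken from the package:

```r
set.seed(123)
lambda <- 3
y <- rpois(200, lambda)

# one realization of the normalized randomized quantile residuals
one_realization <- function(y, lambda) {
  u <- runif(length(y), ppois(y - 1, lambda), ppois(y, lambda))
  qnorm(u)
}

howmany <- 40
# each column holds one realization, sorted so that rows are ordered values
R <- replicate(howmany, sort(one_realization(y, lambda)))
# median taken separately for each ordered value
med <- apply(R, 1, median)

qqnorm(med)
qqline(med)
```

Because the median is taken row by row over sorted columns, the resulting vector is itself ordered and is less affected by the randomization than any single realization.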
Value
Author(s)
References
Dunn, P. K. and Smyth, G. K. (1996) Randomised quantile residuals, J. Comput. Graph. Statist., 5,
236–244
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC. (see also [Link]
[Link]/).
van Buuren and Fredriks M. (2001) Worm plot: simple diagnostic device for modelling growth
reference curves. Statistics in Medicine, 20, 1259–1277
See Also
[Link], gamlss
Examples
Description
This function gives the generalised R-squared of Nagelkerke (1991) for a GAMLSS model.
Usage
Arguments
Details
The Rsq() function uses the definition
R^2 = 1 - \left( \frac{L(0)}{L(\hat{\theta})} \right)^{2/n}
where L(0) is the likelihood of the null model (only a constant is fitted to all parameters) and L(θ̂) is
the likelihood of the current fitted model. This definition is sometimes referred to as the Cox & Snell
R-squared. The Nagelkerke / Cragg & Uhler definition divides the above by
1 - L(0)^{2/n}
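Both definitions can be computed by hand from two log-likelihoods. The sketch below uses a Poisson glm() on the built-in warpbreaks data purely as a stand-in for a fitted model and its null model; it illustrates the formulas and is not how Rsq() itself is implemented:

```r
fit1 <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
fit0 <- update(fit1, . ~ 1)            # null model: constant only

n  <- nobs(fit1)
l1 <- as.numeric(logLik(fit1))         # log L(theta-hat)
l0 <- as.numeric(logLik(fit0))         # log L(0)

# work on the log scale to avoid underflow of the likelihoods
cox_snell   <- 1 - exp((2 / n) * (l0 - l1))
cragg_uhler <- cox_snell / (1 - exp((2 / n) * l0))

c("Cox Snell" = cox_snell, "Cragg Uhler" = cragg_uhler)
```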
Value
The Rsq() produces a single value if type="Cox Snell" or "Cragg Uhler" and a list if type="both".
Note
The null model is fitted using the function gamlssML() which can create warning messages
Author(s)
References
Nagelkerke, N. J. (1991). A note on a general definition of the coefficient of determination.
Biometrika, 78(3), 691-692.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
GAIC
Examples
data(aids)
m1 <- gamlss(y~x+qrt, data=aids, family=NBI)
Rsq(m1)
Rsq(m1, type="both")
rm(m1)
Description
The function rvcov() is designed to provide robust standard errors for the parameter estimates of
a GAMLSS fitted model. The same result can be achieved by using vcov(fitted_model,robust=TRUE).
The function get.K() gets the K matrix (see details below).
Usage
rvcov(object, type = c("vcov", "cor", "se", "coef", "all"),
[Link] = c("R", "PB") )
get.K(object, what = c("K", "Deriv"))
Arguments
object a GAMLSS fitted object
type this argument of the rvcov() function specifies whether the variance-covariance matrix, the
correlation matrix, the standard errors, the coefficients or all of them are returned
what this is an argument of the function get.K() which selects either K or the
first derivative of the likelihood with respect to the parameters (the β’s in the
GAMLSS notation).
[Link] How to obtain numerically the Hessian i) using optimHess(), option "R" ii)
using a function by Pinheiro and Bates taken from package nlme, option "PB".
Details
The robust standard errors are calculated from the robust sandwich estimator of the variance-covariance
matrix, given by S = V KV, where V is the standard variance-covariance matrix (the inverse of the
information matrix) and K is an estimate of the variance of the first derivatives of the likelihood. The
function get.K() is used to get the required K matrix.
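For a one-parameter model the sandwich formula can be worked out by hand. The base R sketch below fits an exponential model with a log link (mu = exp(beta)) to gamma data, so the model is deliberately misspecified; it assumes K is estimated by the sum of squared per-observation first derivatives and is meant only to illustrate S = V KV, not to reproduce the internals of rvcov():

```r
set.seed(10)
n <- 200
y <- rgamma(n, shape = 0.25, scale = 4)   # mean 1, coefficient of variation 2

# log-likelihood per observation: -beta - y * exp(-beta)
beta_hat <- log(mean(y))                  # maximum likelihood estimate
score_i  <- y * exp(-beta_hat) - 1        # first derivatives, one per observation
info     <- sum(y * exp(-beta_hat))       # minus the second derivative (equals n)

V <- 1 / info                             # standard variance (inverse information)
K <- sum(score_i^2)                       # estimated variance of the first derivatives
S <- V * K * V                            # sandwich estimator

c(conventional = sqrt(V), robust = sqrt(S))
```

Because the data are more dispersed than the exponential model allows, the robust standard error should come out larger than the conventional one.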
Value
A variance covariance matrix or other relevant output
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Vlasios Voudouris
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
vcov
Examples
# generate from a gamma distribution
Y <- rGA(200, mu=1, sigma=2)
hist(Y)
# fitting the wrong model i.e. sigma=1
m1 <- gamlss(Y~1, family=EXP)
# the conventional se is too precise
vcov(m1, type="se")
# the sandwich se is wider
rvcov(m1, type="se")
# fitting the correct model
m2 <- gamlss(Y~1, family=GA)
vcov(m2, type="se")
rvcov(m2, type="se")
# similar standard errors
# also obtained using
vcov(m2, type="se", robust=TRUE)
122 stepGAIC
Description
The function stepGAIC() performs stepwise model selection using a Generalized Akaike Information
Criterion (GAIC). It is based on the function stepAIC() given in the library MASS of Venables
and Ripley (2002). The function has been changed recently to allow parallel computation. The parallel
computations are similar to the ones performed in the function boot() of the boot package.
Note that since version 4.3-5 of gamlss, stepGAIC() does not have the option of using the function
[Link] through the argument additive.
Note that stepGAIC() relies on the dropterm() and addterm() methods applied to gamlss objects.
drop1() and add1() are equivalent methods to dropterm() and addterm() respectively,
but with different default arguments (see the examples).
The function [Link]() is the old version of stepGAIC() with no parallel computations.
The function [Link] is based on the S function [Link]() (see Chambers and Hastie
(1991)) and it is more suited for models with smoothing additive terms when the degrees of freedom
for smoothing are fixed in advance. This is something which is rarely used these days, as most of the
smoothing functions allow the calculation of the smoothing parameter, see for example the additive
function pb().
The functions [Link]() and [Link]() have been adapted to work with gamlss objects
and the main difference is the scope argument, see below.
While the function stepGAIC() is used to build models for individual parameters of the distribution
of the response variable, the functions stepGAICAll.A() and stepGAICAll.B() build
models for all the parameters.
The functions stepGAICAll.A() and stepGAICAll.B() are based on the stepGAIC() function but
use different strategies for selecting an appropriate final model.
stepGAICAll.A() has the following strategy:
Strategy A:
i) build a model for mu using a forward approach.
ii) given the model for mu build a model for sigma (forward)
iii) given the models for mu and sigma build a model for nu (forward)
iv) given the models for mu, sigma and nu build a model for tau (forward)
v) given the models for mu, sigma, nu and tau check whether the terms for nu are needed using
backward elimination.
vi) given the models for mu, sigma, nu and tau check whether the terms for sigma are needed
(backward).
vii) given the models for mu, sigma, nu and tau check whether the terms for mu are needed (backward).
Note that for this strategy to work the scope argument should be set appropriately.
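A call following Strategy A might look like the sketch below. It assumes the gamlss package is installed and uses the usair data from the Examples; the lower and upper scope formulas are choices made for illustration. With a two-parameter family such as GA, only the steps for mu and sigma apply:

```r
library(gamlss)
data(usair)

# start from the null model and let Strategy A build models for mu and sigma
m0 <- gamlss(y ~ 1, data = usair, family = GA, trace = FALSE)
mA <- stepGAICAll.A(m0,
                    scope = list(lower = ~1,
                                 upper = ~x1 + x2 + x3 + x4 + x5 + x6))

# the selected formulas for each distribution parameter
formula(mA, what = "mu")
formula(mA, what = "sigma")
```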
stepGAICAll.B() uses the same procedure as the function stepGAIC() but each term in the scope
is fitted to all the parameters of the distribution, rather than the one specified by the argument what
of stepGAIC(). The stepGAICAll.B() relies on the add1All() and drop1All() functions for the
selection of variables.
Usage
Arguments
object a gamlss object. This is used as the initial model in the stepwise search.
scope defines the range of models examined in the stepwise search. For the function
stepAIC() this should be either a single formula, or a list containing components
upper and lower, both formulae. See the details for how to specify the
formulae and how they are used. For the function stepGAIC the scope defines
the range of models examined in the step-wise search. It is a list of formulas,
with each formula corresponding to a term in the model. A 1 in the formula
allows the additional option of leaving the term out of the model entirely.
direction the mode of stepwise search, can be one of both, backward, or forward, with a
default of both. If the scope argument is missing the default for direction is
backward.
trace if positive, information is printed during the running of stepAIC. Larger values
may give more information on the fitting process.
keep a filter function whose input is a fitted model object and the associated ’AIC’
statistic, and whose output is arbitrary. Typically ’keep’ will select a subset
of the components of the object and return them. The default is not to keep
anything.
steps the maximum number of steps to be considered. The default is 1000 (essentially
as many as required). It is typically used to stop the process early.
scale scale is not used in gamlss
what which distribution parameter is required, default what="mu"
parameter equivalent to what
k the multiple of the number of degrees of freedom used for the penalty. Only ’k =
2’ gives the genuine AIC: ’k = log(n)’ is sometimes referred to as BIC or SBC.
parallel The type of parallel operation to be used (if any). If missing, the default is "no".
ncpus integer: number of processes to be used in parallel operation: typically one
would choose this to be the number of available CPUs.
cl An optional parallel or snow cluster for use if parallel = "snow". If not
supplied, a cluster on the local machine is created for the duration of the call.
[Link] scope for sigma if different to scope in stepGAICAll.A()
[Link] scope for nu if different to scope in stepGAICAll.A()
[Link] scope for tau if different to scope in stepGAICAll.A()
[Link] The default value is TRUE, set to FALSE if no model for mu is needed
[Link] The default value is TRUE, set to FALSE if no model for sigma is needed
[Link] The default value is TRUE, set to FALSE if no model for nu is needed
[Link] The default value is TRUE, set to FALSE if no model for tau is needed
test whether to print the chi-square test or not
sorted whether to sort the results
... any additional arguments to ’extractAIC’. (None are currently used.)
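The role of the penalty argument k can be checked outside gamlss. The base R sketch below uses lm() on the built-in mtcars data as a stand-in model; the helper gaic is hypothetical and simply evaluates minus twice the log-likelihood plus k times the degrees of freedom:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
n   <- nobs(fit)

# generalized AIC: -2 * logLik + k * df  (hypothetical helper for illustration)
gaic <- function(object, k = 2) {
  ll <- logLik(object)
  -2 * as.numeric(ll) + k * attr(ll, "df")
}

c(AIC = gaic(fit, k = 2), SBC = gaic(fit, k = log(n)))
```

With k = 2 the value agrees with AIC(fit), and with k = log(n) it agrees with BIC(fit).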
Details
The set of models searched is determined by the scope argument.
For the function [Link]() the right-hand-side of its lower component is always included
in the model, and right-hand-side of the model is included in the upper component. If scope is
a single formula, it specifies the upper component, and the lower model is empty. If scope is
missing, the initial model is used as the upper model.
Models specified by scope can be templates to update object as used by [Link].
For the function [Link]() each of the formulas in scope specifies a "regimen" of candidate
forms in which the particular term may enter the model. For example, a term formula might be
~ x1 + log(x1) + cs(x1, df=3)
This means that x1 could either appear linearly, linearly in its logarithm, or as a smooth function
estimated non-parametrically. Every term in the model is described by such a term formula, and the
final model is built up by selecting a component from each formula.
The function [Link], similar to the S [Link]() in Chambers and Hastie (1991), can be
used to create term formulae automatically from specified data or model frames.
The supplied model object is used as the starting model, and hence there is the requirement that one
term from each of the term formulas of the parameters be present in the formula of the distribution
parameter. This also implies that any terms in formula of the distribution parameter not contained
in any of the term formulas will be forced to be present in every model considered.
When the smoother used in gamlss modelling belongs to the new generation of smoothers allowing
the determination of the smoothing parameters automatically (i.e. pb(), cy()) then the function
[Link]() can be used for model selection (see example below).
Value
The stepwise-selected model is returned, with up to two additional components. There is an "anova"
component corresponding to the steps taken in the search, as well as a "keep" component if the
keep= argument was supplied in the call. The "Resid. Dev" column of the analysis of deviance
table refers to a constant minus twice the maximized log likelihood.
The function stepGAICAll.A() returns a model with a component "anovaAll" containing all the different
anova tables used in the process.
Author(s)
Mikis Stasinopoulos based on functions in MASS library and in Statistical Models in S
References
Chambers, J. M. and Hastie, T. J. (1991). Statistical Models in S, Chapman and Hall, London.
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC. (see also [Link]
[Link]/).
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
[Link]
Examples
## Not run:
data(usair)
# -----------------------------------------------------------------------------
# null model
mod0<-gamlss(y~1, data=usair, family=GA)
# all the explanatory variables x1:x6 fitted linearly
mod1<-gamlss(y~., data=usair, family=GA)
#-------------------------------------------------------------------------------
# dropping terms
dropterm(mod1)
# with chi-square information
drop1(mod1)
# for parallel computations use something like
nC <- detectCores()
drop1(mod1, parallel="snow", ncpus=nC)
drop1(mod1, parallel="multicore", ncpus=nC)
#------------------------------------------------------------------------------
# adding terms
addterm(mod0, scope=[Link](paste("~", paste(names(usair[-1]),
collapse="+"),sep="")))
# with chi-square information
add1(mod0, scope=[Link](paste("~", paste(names(usair[-1]),
collapse="+"),sep="")))
# for parallel computations
nC <- detectCores()
add1(mod0, scope=[Link](paste("~", paste(names(usair[-1]),
collapse="+"),sep="")), parallel="snow", ncpus=nC)
#------------------------------------------------------------------------------
#------------------------------------------------------------------------------
# stepGAIC
# find the best subset for the mu
mod2 <- stepGAIC(mod1)
mod2$anova
#--------------------------------------------------------------
# for parallel computations
mod21 <- stepGAIC(mod1, parallel="snow", ncpus=nC)
#--------------------------------------------------------------
# find the best subset for sigma
mod3<-stepGAIC(mod2, what="sigma", scope=~x1+x2+x3+x4+x5+x6)
mod3$anova
#--------------------------------------------------------------
## End(Not run)
Description
[Link] is the GAMLSS-specific method for the generic function summary, which summarizes
objects returned by modelling functions.
Usage
## S3 method for class 'gamlss'
summary(object, type = c("vcov", "qr"),
robust=FALSE, save = FALSE,
[Link] = c("R", "PB"),
digits = max(3, getOption("digits") - 3),...)
Arguments
object a GAMLSS fitted model
type the default value vcov uses the vcov() method for gamlss to get the variance-covariance
matrix of the estimated beta coefficients; see details below. The alternative
qr is the original method used in gamlss to estimate the standard errors,
but it is not reliable since it does not take into account the inter-correlation
between the distributional parameters mu, sigma, nu and tau.
robust whether robust (sandwich) standard errors are required
save whether to save the environment of the function so as to have access to its values
[Link] whether the Hessian should be calculated using the "R" function optimHess()
or a function based on the Pinheiro and Bates nlme package, "PB".
digits the number of digits in the output
... for extra arguments
Details
Using the default value type="vcov", the vcov() method for gamlss is used to get the variance-covariance
matrix (and consequently the standard errors) of the beta parameters. The variance-covariance
matrix is calculated as the inverse of the observed information matrix, which is obtained from
numerical second derivatives. This is a more reliable method since it takes into account the
inter-correlation between all the parameters. The type="qr" method assumes that the parameters are
fixed at their estimated values. Note that neither method is appropriate when smoothing terms are
used in the fitting, so both should be used with caution in that case.
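The idea behind type="vcov" can be sketched in base R: maximise a likelihood, take numerical second derivatives at the optimum with optimHess(), and invert the resulting matrix. The example below is only an illustration under stated assumptions (simulated data, a normal model with log-link for sigma, and the hypothetical function name negloglik); it is not the code used inside the vcov() method for gamlss.

```r
set.seed(1)
n <- 200
x <- runif(n)
y <- rnorm(n, mean = 1 + 2 * x, sd = 0.5)   # simulated data (assumption)

# Negative log-likelihood of a normal model; sigma is modelled on the log scale
negloglik <- function(par) {
  mu    <- par[1] + par[2] * x
  sigma <- exp(par[3])
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# Maximise the likelihood, then take numerical second derivatives at the optimum
fit <- optim(c(0, 0, 0), negloglik, method = "BFGS")
H   <- optimHess(fit$par, negloglik)        # observed information matrix

# Inverting the full matrix accounts for correlation between all parameters
V  <- solve(H)
se <- sqrt(diag(V))
names(se) <- c("mu.intercept", "mu.slope", "log.sigma")
se
```

Because the whole matrix is inverted at once, the standard errors of the mu coefficients reflect uncertainty in sigma as well, which is the point made above about inter-correlation.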
Value
Prints a summary of a GAMLSS object.
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby and Calliope Akantziliotou
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
gamlss, [Link], [Link]
Examples
data(aids)
h<-gamlss(y~poly(x,3)+qrt, family=PO, data=aids)
summary(h)
rm(h)
Description
Plots regression terms against their predictors, optionally with standard errors and partial residuals
added. It is based on the R function termplot but is suitably changed to apply to GAMLSS objects.
Usage
[Link](object, what = c("mu", "sigma", "nu", "tau"),
parameter= NULL, data = NULL,
envir = environment(formula(object)), [Link] = FALSE,
rug = FALSE, terms = NULL, se = TRUE, ylim = c("common", "free"),
scheme = c("shaded", "lines"), xlabs = NULL, ylabs = NULL,
main = NULL, pages = 0, [Link] = "darkred",
[Link] = "orange", [Link] = "gray", [Link] = "lightblue",
[Link] = "gray", [Link] = 1.5, [Link] = 2, [Link] = 1,
[Link] = 1, [Link] = par("pch"),
ask = interactive() && [Link] < [Link] && .Device != "postscript",
[Link] = TRUE, [Link] = FALSE,
polys = NULL, [Link] = "topo",...)
Arguments
object a fitted GAMLSS object
what the required parameter of the GAMLSS distribution, e.g. "mu"
parameter equivalent to what
data data frame in which variables in object can be found
envir environment in which variables in object can be found
[Link] logical; should partial residuals be plotted or not
rug add rug plots (jittered 1-d histograms) to the axes?
terms which terms to be plotted (default ’NULL’ means all terms)
se plot point-wise standard errors?
ylim there are two options here: a) "common" and b) "free". The "common" option
plots all figures with the same ylim range and therefore allows the viewer to
check the relative contribution of each term compared to the rest. In the "free"
option the limits are computed for each plot separately.
scheme whether the se’s should appear shaded or as lines
xlabs vector of labels for the x axes
ylabs vector of labels for the y axes
main logical, or vector of main titles; if ’TRUE’, the model’s call is taken as main
title, ’NULL’ or ’FALSE’ mean no titles.
pages in how many pages the plot should appear. The default is 0, which uses a different
page for each plot
[Link] the colour of the term line
[Link] the colour of the se’s lines
[Link] the colour of the shaded area
[Link] the colour of the partial residuals
[Link] the colour of the rug
[Link] line width of the fitted terms
[Link] line type for the standard errors
[Link] line width for the standard errors
[Link] plotting character expansion for the partial residuals
[Link] characters for points in the partial residuals
ask logical; if ’TRUE’, the user is asked before each plot, see ’par(ask=.)’.
[Link]
Should x-axis ticks use factor levels or numbers for factor terms?
[Link] whether to use a surface plot if a ga() term is fitted
polys the polygon information file for MRF models
[Link] colour scheme for polygons for MRF models
... other graphical parameters
Details
The function uses the lpred function of GAMLSS. The ’data’ argument should rarely be needed,
but in some cases ’termplot’ may be unable to reconstruct the original data frame. Using ’[Link]=[Link]’
makes these problems less likely. Nothing sensible happens for interaction terms.
Value
a plot of fitted terms.
Author(s)
Mikis Stasinopoulos based on the existing termplot() function
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
termplot
Examples
data(aids)
a<-gamlss(y~pb(x)+qrt,data=aids,family=NBI)
[Link](a, pages=1)
rm(a)
Description
[Link] is the GAMLSS specific method for the generic function update which updates
and (by default) refits a GAMLSS model.
Usage
## S3 method for class 'gamlss'
update(object, formula., ...,
what = c("mu", "sigma", "nu", "tau", "All"),
parameter= NULL, evaluate = TRUE)
Arguments
object a GAMLSS fitted model
formula. the formula to update
... for updating arguments in gamlss()
what the parameter whose formula needs updating, for example "mu", "sigma",
"nu", "tau" or "All". If "All", all the formulae are updated. Note that the what
argument has an effect only if the argument formula. is set
parameter equivalent to what
evaluate whether to evaluate the call or not
Value
Returns a GAMLSS call or fitted object.
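The formula. argument follows the usual R convention in which a dot stands for the corresponding part of the old formula. The sketch below shows that convention with base R's update() applied to plain formulas only; the object names f0 to f3 are hypothetical, and how the method applies the updated formula to each distribution parameter is not shown here.

```r
# Formula updating with base R; pb() and cs() are not evaluated here,
# they only appear as symbols inside the formulas
f0 <- y ~ pb(x) + qrt

f1 <- update(f0, ~ cs(x, 8) + qrt)   # replace the right-hand side, keep the response
f2 <- update(f1, ~ . - qrt)          # "." on the right means the old right-hand side
f3 <- update(f2, log(.) ~ . + qrt)   # "." on the left means the old response

f1
f2
f3
```

With evaluate = FALSE a call is returned instead of a refitted model, which can be useful for inspecting the updated formula before fitting.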
Author(s)
Mikis Stasinopoulos <[Link]@[Link]>, Bob Rigby
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link], [Link], [Link], [Link], [Link], [Link],
[Link], [Link]
Examples
data(aids)
# fit a poisson model
[Link] <-gamlss(y~pb(x)+qrt, family=PO, data=aids)
# update with a negative binomial
[Link] <-update([Link], family=NBI)
# update the smoothing
h.nb1 <-update([Link],~cs(x,8)+qrt)
# remove qrt
h.nb2 <-update(h.nb1,~.-qrt)
# put back qrt, take the log of y and fit a normal distribution
h.nb3 <-update(h.nb2,log(.)~.+qrt, family=NO)
# verify that it is the same
[Link]<-gamlss(log(y)~cs(x,8)+qrt,data=aids )
Description
The Vuong and Clarke tests for GAMLSS fitted models.
Usage
[Link](obj1, obj2, [Link] = 0.05)
Arguments
obj1 The first fitted gamlss object
obj2 The second fitted gamlss object
[Link] Significance level used for testing.
Details
The Vuong (1989) and Clarke (2007) tests are likelihood-ratio-based tests for model selection that
use the Kullback-Leibler information criterion. The implemented tests can be used for choosing
between two models which are not necessarily nested.
In the Vuong test, the null hypothesis is that the two models are equally close to the actual model,
whereas the alternative is that one model is closer. The test statistic asymptotically follows a standard
normal distribution under the null. Let c be the critical value, typically set to 1.96 (a two-sided
test at the 5% level). If the value of the test statistic is greater than c then we reject the null hypothesis that the models
are equivalent in favour of the model in obj1. Conversely, if the value is smaller than -c we reject
the null hypothesis that the models are equivalent in favour of the model in obj2. If the value falls
within (-c,c) then we cannot discriminate between the two competing models given the data.
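The decision rule can be sketched in base R from the pointwise log-likelihood contributions of two fitted models. The example is an illustration under assumptions: the data are simulated, two non-nested Poisson glm() fits stand in for gamlss objects, and the statistic is the basic, uncorrected form (sum of the log-likelihood ratios divided by sqrt(n) times their standard deviation), which may differ from the exact formula used by the function documented here.

```r
set.seed(123)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rpois(n, lambda = exp(0.5 + 0.8 * x1))   # x1 drives the response, x2 is noise

m1 <- glm(y ~ x1, family = poisson)   # plays the role of obj1
m2 <- glm(y ~ x2, family = poisson)   # plays the role of obj2 (not nested in m1)

# Pointwise log-likelihood contributions and their differences
l1 <- dpois(y, lambda = fitted(m1), log = TRUE)
l2 <- dpois(y, lambda = fitted(m2), log = TRUE)
d  <- l1 - l2

# Basic Vuong statistic, compared with the normal critical value c
z    <- sum(d) / (sqrt(n) * sd(d))
crit <- qnorm(1 - 0.05 / 2)           # about 1.96

decision <- if (z > crit) {
  "prefer model 1"
} else if (z < -crit) {
  "prefer model 2"
} else {
  "cannot discriminate"
}
c(statistic = z, critical = crit)
decision
```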
In the Clarke test, if the two models are statistically equivalent then the log-likelihood ratios of
the observations should be evenly distributed around zero, and around half of the ratios should be
larger than zero. The test statistic asymptotically follows a binomial distribution with parameters n and
0.5. Critical values can be obtained as shown in Clarke (2007). Intuitively, the model in obj1 is
preferred over that in obj2 if the value of the test statistic is significantly larger than its expected value under
the null hypothesis (n/2), and vice versa. If the value is not significantly different from n/2
then obj1 can be thought of as equivalent to obj2.
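A minimal base R sketch of this sign-type test counts the positive pointwise log-likelihood ratios and compares the count with a Binomial(n, 0.5) distribution using binom.test(). As before, the simulated data and the two Poisson glm() fits are assumptions standing in for gamlss objects, and no correction for the number of parameters is applied.

```r
set.seed(123)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rpois(n, lambda = exp(0.5 + 0.8 * x1))

m1 <- glm(y ~ x1, family = poisson)   # plays the role of obj1
m2 <- glm(y ~ x2, family = poisson)   # plays the role of obj2

# Pointwise log-likelihood ratios
d <- dpois(y, fitted(m1), log = TRUE) - dpois(y, fitted(m2), log = TRUE)

# Test statistic: number of observations favouring model 1
b    <- sum(d > 0)
test <- binom.test(b, n, p = 0.5)     # two-sided test against n/2

decision <- if (test$p.value >= 0.05) {
  "models equivalent"
} else if (b > n / 2) {
  "prefer model 1"
} else {
  "prefer model 2"
}
c(statistic = b, expected = n / 2, p.value = test$p.value)
decision
```

Unlike the Vuong statistic, this count depends only on the signs of the ratios, so a few very large ratios cannot dominate the result.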
Value
For the Vuong test, the function returns the value of the test statistic and the decision; for the Clarke test, it returns the value of the
test statistic, the p-value and the decision. The decision criteria are as discussed above.
Author(s)
Mikis Stasinopoulos and Giampiero Marra
References
Clarke K. (2007), A Simple Distribution-Free Test for Non-Nested Model Selection. Political
Analysis, 15, 347-363.
Vuong Q.H. (1989), Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.
Econometrica, 57(2), 307-333.
See Also
[Link]
Examples
library(gamlss)
# fitting different models
m0 <- gamlss(y~x+qrt, data=aids, family=PO)
m1 <- gamlss(y~pb(x)+qrt, data=aids, family=PO)
m2 <- gamlss(y~pb(x)+qrt, data=aids, family=NBI)
# comparison of the models
[Link](m0,m2)
[Link](m0,m1)
[Link](m1,m2)
wp Worm plot
Description
Provides a single worm plot or multiple worm plots for a fitted GAMLSS model or, more generally, for any fitted
model where the method resid() exists and the residuals are defined sensibly. The worm plot (a
de-trended QQ-plot), van Buuren and Fredriks M. (2001), is a diagnostic tool for checking the residuals
within different ranges (by default not overlapping) of the explanatory variable(s).
Usage
wp(object=NULL, xvar = NULL, resid = NULL, [Link] = 4,
[Link] = NULL,
overlap = 0, [Link] = 4, [Link] = 3.5,
[Link] = TRUE, line = TRUE,
[Link] = 12 * sqrt(1/length(resid)),
[Link] = 12 * sqrt([Link]/length(resid)),
cex = 1, pch = 21, ...)
Arguments
object a GAMLSS fitted object or any other fitted model where the resid() method
works (preferably giving standardised or quantile residuals)
xvar the explanatory variable(s) against which the worm plots will be plotted. If
only one variable is involved use xvar=x1; if two variables are involved use
xvar=~x1*x2. See also the note below on the use of a formula if the data argument is
not found in the fitted model
resid if object is missing this argument can be used to specify the residual vector
(again these should be quantile residuals, or be assumed to come from a normal
distribution)
[Link] the number of intervals into which the explanatory variable xvar will be cut
[Link] the x-axis cut-off points, e.g. c(20,30). If [Link]=NULL then the [Link]
argument is activated
overlap how much overlapping in the xvar intervals. Default value is overlap=0 for
non overlapping intervals
[Link] for the single plot, this value is the x-variable limit, default is [Link]=4
[Link] for multiple plots, this value is the x-variable limit, default is [Link]=3.5
[Link] whether to show the x-variable intervals in the top of the graph, default is
[Link]=TRUE
line whether to plot the polynomial line in the worm plot, default value is line=TRUE
[Link] for the single plot, this value is the y-variable limit, default value is [Link]=12*sqrt(1/length(fitted
[Link] for multiple plots, this value is the y-variable limit, default value is [Link]=12*sqrt([Link]/lengt
cex the cex plotting parameter with default cex=1
pch the pch plotting parameter with default pch=21
... for extra arguments
Details
If the xvar argument is not specified then a single worm plot is used. In this case a worm plot is a
de-trended normal QQ-plot so departure from normality is highlighted.
If a single xvar is specified (with or without the use of a formula, i.e. xvar=x1 or xvar=~x1) then
we have as many worm plots as [Link]. In this case the x-variable is cut into [Link] intervals with
an equal number of observations, and de-trended normal QQ (i.e. worm) plots for each interval are
plotted. This is a way of highlighting failures of the model within different ranges of the single
explanatory variable. The fitted coefficients from fitting cubic polynomials to the residuals (within
each x-variable interval) can be obtained by e.g. coeffs<-wp(model1,xvar=x,[Link]=9). van Buuren
and Fredriks M. (2001) used these residuals to identify regions (intervals) of the explanatory
variable within which the model does not fit the data adequately (called "model violation").
Two variables can be displayed with the use of a formula, i.e. xvar=~x1*x2. In this case the
[Link] can be a vector with two values.
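The construction described above can be sketched in base R: de-trend a normal QQ-plot by subtracting the theoretical quantiles, fit a cubic polynomial to the result, and repeat within intervals of the explanatory variable that hold equal numbers of observations. The residuals and explanatory variable below are simulated, and the helper names worm_coords() and worm_coef() are hypothetical; this is an illustration of the idea, not the code of wp().

```r
set.seed(42)
n     <- 400
xvar  <- runif(n, 0, 10)   # hypothetical explanatory variable
resid <- rnorm(n)          # stand-in for normalised quantile residuals

# De-trended QQ coordinates: empirical minus theoretical normal quantiles
worm_coords <- function(r) {
  qq <- qqnorm(r, plot.it = FALSE)
  data.frame(theo = qq$x, dev = qq$y - qq$x)
}

# Coefficients of a cubic polynomial fitted to the worm
worm_coef <- function(r) {
  w <- worm_coords(r)
  coef(lm(dev ~ theo + I(theo^2) + I(theo^3), data = w))
}

# Cut xvar into 4 intervals with equal numbers of observations
breaks <- quantile(xvar, probs = seq(0, 1, length.out = 5))
groups <- cut(xvar, breaks = breaks, include.lowest = TRUE)

# One row of cubic coefficients per interval
coefs <- t(sapply(split(resid, groups), worm_coef))
coefs

# Single worm plot for all residuals
w <- worm_coords(resid)
plot(w$theo, w$dev, xlab = "Unit normal quantile", ylab = "Deviation")
abline(h = 0, lty = 2)
```

For residuals that really are standard normal, the points scatter around the horizontal zero line and the fitted coefficients in every interval are close to zero.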
Value
For multiple plots the xvar intervals and the coefficients of the fitted cubic polynomials to the
residuals (within each xvar interval) are returned.
Note
Note that the wp() function, if the argument object is used, looks for the data argument of the
object. If the argument data exists, it uses its environment to find xvar (whether it is a formula or
not). As a result, if data exists within object, xvar=~x*f can be used (assuming that x and f are
in the data); otherwise the variables should be explicitly defined, i.e. xvar=~data$x*data$f.
Author(s)
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible
Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC. (see also [Link]
[Link]/).
van Buuren and Fredriks M. (2001) Worm plot: simple diagnostic device for modelling growth
reference curves. Statistics in Medicine, 20, 1259–1277
See Also
gamlss, [Link]
Examples
data(abdom)
# with data
a<-gamlss(y~pb(x),[Link]=~pb(x,1),family=LO,data=abdom)
wp(a)
coeff1<-wp(a,xvar=x)
coeff1
## Not run:
# no data argument
b <- gamlss(abdom$y~pb(abdom$x),[Link]=~pb(abdom$x),family=LO)
wp(b)
wp(b, xvar=abdom$x)# not wp(b, xvar=x)
# using the argument resid
# this will work
wp(resid=resid(a), xvar=abdom$x)
# not this
# wp(resid=resid(a), xvar=x)
# this example uses the rent data
m1 <- gamlss(R~pb(Fl)+pb(A)+loc, [Link]=~pb(Fl)+pb(A), data=rent, family=GA)
# a single worm plot
wp(m1, [Link]=0.5)
# a single continuous x variable
wp(m1, xvar=Fl, [Link]=.8)
# a single x variable changing the default number of intervals
wp(m1, xvar=Fl, [Link]=1.5, [Link]=9)
# different x variable changing the default number of intervals
B1<-wp(m1, xvar=A, [Link]=1.2, [Link]=9)
B1
# the number five plot has intervals
# [5,] 1957.5 1957.5
# rather disappointing
# try formula for xvar
wp(m1, xvar=~A, [Link]=1.2, [Link]=9)
## End(Not run)
Description
This creates z-scores for new values of y and x given a fitted lms object.
Usage
[Link](object, y, x)
Arguments
Details
Value
Author(s)
Mikis Stasinopoulos
References
Cole, T. J. (1994) Do growth chart centiles need a face lift? BMJ, 308–641.
Cole, T. J. and Green, P. J. (1992) Smoothing reference centile curves: the LMS method and penal-
ized likelihood, Statist. Med. 11, 1305–1319
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and
shape (with discussion), Appl. Statist., 54, part 3, pp 507-554.
Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape
(GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, [Link]
org/v23/i07.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017)
Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also [Link]
See Also
[Link]
Examples
## Not run:
IND<-[Link](7040, 1000, replace=FALSE)
db1 <- db[IND,]
plot(head~age, data=db1)
m0 <- lms(head, age, data=db1,trans.x=TRUE )
[Link](m0, x=c(2,15,30,40),y=c(45,50,56,63))
## End(Not run)
Index
[Link], 104
quantSheets, 106