Title: | Robust Beta Regression |
---|---|
Description: | Robust estimators for the beta regression, useful for modeling bounded continuous data. Currently, four types of robust estimators are supported. They depend on a tuning constant which may be fixed or selected by a data-driven algorithm also implemented in the package. Diagnostic tools associated with the fitted model, such as the residuals and goodness-of-fit statistics, are implemented. Robust Wald-type tests are available. More details about robust beta regression are described in Maluf et al. (2022) <arXiv:2209.11315>. |
Authors: | Felipe Queiroz [aut, cre], Yuri Maluf [aut], Silvia Ferrari [ctb] |
Maintainer: | Felipe Queiroz <[email protected]> |
License: | GPL-3 |
Version: | 0.3.0 |
Built: | 2025-02-06 04:26:06 UTC |
Source: | https://github.com/cran/robustbetareg |
Density, distribution function, quantile function and random generation for exponential generalized beta of the second type distribution.
dEGB(y_star, mu, phi, log = FALSE) pEGB(q, mu, phi) qEGB(p, mu, phi) rEGB(n, mu, phi)
dEGB(y_star, mu, phi, log = FALSE) pEGB(q, mu, phi) qEGB(p, mu, phi) rEGB(n, mu, phi)
y_star , q
|
vector of quantiles. |
mu |
mu parameter. |
phi |
phi parameter. |
log |
logical; if |
p |
vector of probabilities. |
n |
number of observations. If |
The EGB distribution with parameters mu =
and
phi =
has density
with and
. For this
distribution,
and
, where
is the digamma function. See Kerman and McDonald (2015) for additional
details. If
, with
and
representing the mean and precision of
, then
with the density
given above.
dEGB
gives the density, pEGB
gives the distribution function,
qEGB
gives the quantile function, and rEGB
generates random
variables.
Yuri S. Maluf ([email protected]), Francisco F. Queiroz ([email protected]) and Silvia L. P. Ferrari.
Maluf, Y.S., Ferrari, S.L.P., and Queiroz, F.F. (2022). Robust
beta regression through the logit transformation. arXiv:2209.11315.
Kerman, S. and McDonald, J.B. (2015). Skewness-kurtosis bounds for EGB1, EGB2,
and special cases. Communications in Statistics - Theory and Methods,
44:3857-3864.
dEGB(0.2, mu = 0.3, phi = 1) mu = 0.2; phi = 2; set.seed(1) EGBsample = rEGB(1000, mu, phi) hist(EGBsample, prob = TRUE, breaks = 15, main = "", las = 1, ylim = c(0, 0.2), xlim = c(-20, 10)) curve(dEGB(x, mu, phi), from = -20, to = 8, add = TRUE, col = "red") # Showing the P(Y* < -5) = 0.17, where Y* ~ EGB(0.2, 2). x = seq(-20, 10,0.01) y = dEGB(x, mu, phi) plot(x, y, type = "l", lwd = 2, las = 1) x1 = seq(-20, -5, 0.01) y1 = dEGB(x1, mu, phi) polygon(c(x1, -5, -5), c(y1, 0, 0), col = "lightblue") plot(x, pEGB(x, mu, phi), type = "l", las = 1, lwd = 2, ylab = expression(P("Y*"<y)), xlab = "y") p = pEGB(0, mu, phi) q = qEGB(p, mu, phi) points(q, p, pch = 16, col = 2, cex = 1.5) text(2, 0.83, paste("(", 0, ",", round(p, 2), ")"), font = 2, cex = 0.8, col = "red")
dEGB(0.2, mu = 0.3, phi = 1) mu = 0.2; phi = 2; set.seed(1) EGBsample = rEGB(1000, mu, phi) hist(EGBsample, prob = TRUE, breaks = 15, main = "", las = 1, ylim = c(0, 0.2), xlim = c(-20, 10)) curve(dEGB(x, mu, phi), from = -20, to = 8, add = TRUE, col = "red") # Showing the P(Y* < -5) = 0.17, where Y* ~ EGB(0.2, 2). x = seq(-20, 10,0.01) y = dEGB(x, mu, phi) plot(x, y, type = "l", lwd = 2, las = 1) x1 = seq(-20, -5, 0.01) y1 = dEGB(x1, mu, phi) polygon(c(x1, -5, -5), c(y1, 0, 0), col = "lightblue") plot(x, pEGB(x, mu, phi), type = "l", las = 1, lwd = 2, ylab = expression(P("Y*"<y)), xlab = "y") p = pEGB(0, mu, phi) q = qEGB(p, mu, phi) points(q, p, pch = 16, col = 2, cex = 1.5) text(2, 0.83, paste("(", 0, ",", round(p, 2), ")"), font = 2, cex = 0.8, col = "red")
A dataset on risk management practices of 73 firms.
data("Firm", package = "robustbetareg")
data("Firm", package = "robustbetareg")
A data frame with 73 rows and 7 variables:
total property and casualty premiums and uninsured losses as a percentage of total assets.
per occurrence retention amount as a percentage of total assets.
indicates that the firm owns a captive insurance company; 1 if the firm uses a captive, 0 otherwise.
logarithm of total assets.
a measure of the firm's industry risk.
a measure of the importance of the local managers in choosing the amount of risk to be retained.
a measure of the degree of importance in using analytical tools.
The dataset was introduced and analyzed by Schmit and Roth (1990) and
is available in the personal web page of Professor E. Frees (Wisconsin School
of Business Research). The response variable is FIRMCOST
, smaller
values of firm cost are attributed to firms that have a good risk management
performance.
Schmit, J.T. and Roth, K. (1990). Cost effectiveness of risk management practices. Journal of Risk and Insurance. 57:455-470.
A dataset containing the proportion of the population from several cities of the state of São Paulo, Brazil, in 2010.
data("HIC", package = "robustbetareg")
data("HIC", package = "robustbetareg")
A data frame with 80 rows and 4 variables:
the corresponding city.
proportion of the total population living in the city’s urban zone.
per capita gross domestic product.
the health insurance coverage index.
This dataset was collected by the Institute of Applied Economic Research (Instituto de Pesquisa Econômica Aplicada, IPEA, Brazil). It includes information on 80 cities in the state of São Paulo, Brazil, in 2010.
http://www.ipeadata.gov.br/Default.aspx
Some S3 methods for objects of class "robustbetareg
".
## S3 method for class 'robustbetareg' summary(object, type = "sweighted2", ...) ## S3 method for class 'robustbetareg' coef(object, model = c("full", "mean", "precision"), ...) ## S3 method for class 'summary.robustbetareg' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'robustbetareg' summary(object, type = "sweighted2", ...) ## S3 method for class 'robustbetareg' coef(object, model = c("full", "mean", "precision"), ...) ## S3 method for class 'summary.robustbetareg' print(x, digits = max(3, getOption("digits") - 3), ...)
object , x
|
fitted model of class |
type |
character specifying the type of residuals to be included in the
summary output, see |
... |
currently not used. |
model |
character specifying for which component of the model the coefficients should be extracted. |
digits |
the number of significant digits to use when printing. |
A set of methods for fitted model objects of class robustbetareg
,
including methods to the generic functions print
and
summary
, which print the estimated coefficients along with
some further information.
methodsrobustbetareg
returns different outputs for objects of
class robustbetareg
, depending on the used method.
data("HIC", package="robustbetareg") fit=robustbetareg(HIC~URB+GDP|1,data=HIC,alpha=0.06) summary(fit) coef(fit)
data("HIC", package="robustbetareg") fit=robustbetareg(HIC~URB+GDP|1,data=HIC,alpha=0.06) summary(fit) coef(fit)
Several types of standard diagnostic plots can be produced interactively, involving different types of residuals.
## S3 method for class 'robustbetareg' plot(x, ask = TRUE, ...)
## S3 method for class 'robustbetareg' plot(x, ask = TRUE, ...)
x |
fitted model object of class " |
ask |
logical. If " |
... |
plot
method for robustbetareg
objects returns
several diagnostic plots.
robustbetareg
, residuals.robustbetareg
,
plotenvelope
get(data("HIC", package = "robustbetareg")) hic <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, alpha = 0.06)
get(data("HIC", package = "robustbetareg")) hic <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, alpha = 0.06)
plotenvelope
is used to display normal probability plots of residuals
with simulated envelope for the robust beta regression. Currently, eight types of
residuals are supported: sweighted2, pearson, weighted, sweighted,
sweighted.gamma, sweighted2.gamma, combined, and combined.projection residuals.
plotenvelope( object, type = c("sweighted2", "pearson", "weighted", "sweighted", "sweighted.gamma", "sweighted2.gamma", "combined", "combined.projection"), conf = 0.95, n.sim = 100, PrgBar = TRUE, control = robustbetareg.control(...), ... )
plotenvelope( object, type = c("sweighted2", "pearson", "weighted", "sweighted", "sweighted.gamma", "sweighted2.gamma", "combined", "combined.projection"), conf = 0.95, n.sim = 100, PrgBar = TRUE, control = robustbetareg.control(...), ... )
object |
fitted model object of class |
type |
character indicating the type of residuals to be used, see
|
conf |
numeric specifying the confidence level of the simulated
envelopes. Default is |
n.sim |
a positive integer representing the number of iterations
to generate the simulated envelopes. Default is |
PrgBar |
logical. If |
control |
a list of control arguments specified via
|
... |
arguments passed to |
The plotenvelope
creates normal probability plots with simulated
envelope (see Atkinson (1985) for details). Under the correct model,
approximately 100*conf of the residuals are expected to be inside the
envelope.
plotenvelope
returns normal probability plot of residuals with simulated
envelope.
Yuri S. Maluf ([email protected]), Francisco F. Queiroz ([email protected]) and Silvia L. P. Ferrari.
Maluf, Y.S., Ferrari, S.L.P., and Queiroz, F.F. (2022). Robust
beta regression through the logit transformation. arXiv:2209.11315.
Atkinson, A.C. (1985) Plots, transformations and regression: an
introduction to graphical methods of diagnostic regression analysis.
Oxford Science Publications, Oxford.
robustbetareg
, robustbetareg.control
,
residuals.robustbetareg
get(data("HIC", package = "robustbetareg")) hic <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, alpha = 0.06) plotenvelope(hic, n.sim = 50) get(data("Firm", package = "robustbetareg")) rmc <- robustbetareg(FIRMCOST ~ INDCOST + SIZELOG | INDCOST + SIZELOG, data = Firm) plotenvelope(rmc, conf = 0.90)
get(data("HIC", package = "robustbetareg")) hic <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, alpha = 0.06) plotenvelope(hic, n.sim = 50) get(data("Firm", package = "robustbetareg")) rmc <- robustbetareg(FIRMCOST ~ INDCOST + SIZELOG | INDCOST + SIZELOG, data = Firm) plotenvelope(rmc, conf = 0.90)
Extract various types of predictions from beta regression models: either on
the scale of responses in (0, 1) or the scale of the linear predictor,
from robustbetareg
objects.
predict( object, newdata = NULL, type = c("response", "link", "precision", "variance", "quantile"), at = 0.5, ... )
predict( object, newdata = NULL, type = c("response", "link", "precision", "variance", "quantile"), at = 0.5, ... )
object |
fitted model object of class " |
newdata |
optional, a data frame with new predictor values. If omitted, the original predictors are used. |
type |
character indicating type of predictions: fitted means of response
(" |
at |
numeric vector indicating the level(s) at which quantiles should be
predicted (only if |
... |
currently not used. |
Return a vector with the predicted values.
get(data("HIC", package = "robustbetareg")) hic <- robustbetareg(HIC ~ URB + GDP | 1, data = HIC, alpha = 0.04) cbind(predict(hic, type = "response"), predict(hic, type = "quantile", at = c(0.25, 0.5, 0.75)))
get(data("HIC", package = "robustbetareg")) hic <- robustbetareg(HIC ~ URB + GDP | 1, data = HIC, alpha = 0.04) cbind(predict(hic, type = "response"), predict(hic, type = "quantile", at = c(0.25, 0.5, 0.75)))
The function provides several types of residuals for the robust beta regression models: Pearson residuals (raw residuals scaled by square root of variance function) and different kinds of weighted residuals proposed by Espinheira et al. (2008) and Espinheira et al. (2017).
## S3 method for class 'robustbetareg' residuals( object, type = c("sweighted2", "pearson", "weighted", "sweighted", "sweighted.gamma", "sweighted2.gamma", "combined", "combined.projection"), ... )
## S3 method for class 'robustbetareg' residuals( object, type = c("sweighted2", "pearson", "weighted", "sweighted", "sweighted.gamma", "sweighted2.gamma", "combined", "combined.projection"), ... )
object |
fitted model object of class |
type |
character indicating type of residuals to be used. |
... |
currently not used. |
The definitions of the first four residuals are provided in
Espinheira et al. (2008): equation (2) for "pearson
",
equation (6) for "weighted
", equation (7) for "sweighted
",
and equation (8) for "sweighted2
". For the last four residuals
the definitions are described in Espinheira et al. (2017): equations (7)
and (10) for the "sweighted.gamma
" and "sweighted2.gamma
",
respectively, equation (9) for "combined
", and equation (11)
for "combined.projection
".
residuals
returns a vector with the residuals of the type
specified in the type
argument.
Maluf, Y. S., Ferrari, S. L. P., and Queiroz, F. F. (2022). Robust
beta regression through the logit transformation. arXiv:2209.11315.
Espinheira, P.L., Ferrari, S.L.P., and Cribari-Neto, F. (2008). On Beta
Regression Residuals. Journal of Applied Statistics, 35:407–419.
Espinheira, P.L., Santos, E.G.and Cribari-Neto, F. (2017). On nonlinear
beta regression residuals. Biometrical Journal, 59:445-461.
get(data("HIC", package = "robustbetareg")) fit.hic <- robustbetareg(HIC ~ URB + GDP | 1, data = HIC, alpha = 0.04) res <- residuals(fit.hic, type = "sweighted2") #plot(res) #abline(h = 0)
get(data("HIC", package = "robustbetareg")) fit.hic <- robustbetareg(HIC ~ URB + GDP | 1, data = HIC, alpha = 0.04) res <- residuals(fit.hic, type = "sweighted2") #plot(res) #abline(h = 0)
Fit robust beta regression models for rates and proportions via LSMLE, LMDPDE, SMLE and MDPDE. Both mean and precision of the response variable are modeled through parametric functions.
robustbetareg( formula, data, alpha, type = c("LSMLE", "LMDPDE", "SMLE", "MDPDE"), link = c("logit", "probit", "cloglog", "cauchit", "loglog"), link.phi = NULL, control = robustbetareg.control(...), model = TRUE, ... ) LMDPDE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...) LSMLE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...) MDPDE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...) SMLE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...)
robustbetareg( formula, data, alpha, type = c("LSMLE", "LMDPDE", "SMLE", "MDPDE"), link = c("logit", "probit", "cloglog", "cauchit", "loglog"), link.phi = NULL, control = robustbetareg.control(...), model = TRUE, ... ) LMDPDE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...) LSMLE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...) MDPDE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...) SMLE.fit(y, x, z, alpha = NULL, link = "logit", link.phi = "log", control = robustbetareg.control(...), ...)
formula |
symbolic description of the model. See Details for further information. |
data |
dataset to be used. |
alpha |
numeric in |
type |
character specifying the type of robust estimator to be used in the
estimation process. Supported estimators are " |
link |
an optional character that specifies the link function of the
mean submodel (mu). The " |
link.phi |
an optional character that specifies the link function of the
precision submodel (phi). The " |
control |
a list of control arguments specified via
|
model |
logical. If |
... |
argument to be passed to |
y , x , z
|
|
Beta regression models are employed to model continuous response
variables in the unit interval, like rates and proportions. The maximum
likelihood-based inference suffers from
the lack of robustness in the presence of outliers. Based on
the density power divergence, Ghosh (2019) proposed the minimum density
power divergence estimator (MDPDE). Ribeiro and Ferrari (2022) proposed an
estimator based on the maximization of a reparameterized Lq-likelihood;
it is called SMLE. These estimators require suitable restrictions in the
parameter space. Maluf et al. (2022) proposed robust estimators based on
the MDPDE and the SMLE which have the advantage of overcoming this drawback.
These estimators are called LMDPDE and LSMLE. For details, see the
cited works. The four estimators are implemented in the robustbetareg
function. They depend on a tuning constant (called ).
When the tuning constant is fixed and equal to 0, all of the estimators
coincide with the maximum likelihood estimator. Ribeiro and Ferrari (2022)
and Maluf et al. (2022) suggest using a data-driven algorithm to select the
optimum value of
. This algorithm is implemented in
robustbetareg
by default when the argument "alpha
" is
suppressed.
The formulation of the model has the same structure as in the usual functions
glm
and betareg
. The argument
formula
can comprise of three parts (separated by the symbols
"" and "
"), namely: observed response variable in the unit
interval, predictor of the mean submodel, with link function
link
and predictor of the precision submodel, with link.phi
link function. If the model has constant precision, the third part may be
omitted and the link function for phi is "identity
" by default.
The tuning constant alpha
may be treated as fixed or not (chosen
by the data-driven algorithm). If alpha
is fixed, its value
must be specified in the alpha
argument.
Some methods are available for objects of class "robustbetareg
",
see plot.robustbetareg
, summary.robustbetareg
,
coef.robustbetareg
, and residuals.robustbetareg
,
for details and other methods.
robustbetareg
returns an object of class "robustbetareg
" with a list of the following components:
coefficients |
a list with the "mean " and "precision "
coefficients. |
vcov |
covariance matrix. |
converged |
logical indicating successful convergence of the iterative process. |
fitted.values |
a vector with the fitted values of the mean submodel. |
start |
a vector with the starting values used in the iterative process. |
weights |
the weights of each observation in the estimation process. |
Tuning |
value of the tuning constant (automatically chosen or fixed) used in the estimation process. |
residuals |
a vector of standardized weighted residual 2 (see Espinheira et al. (2008)). |
n |
number of observations. |
link |
link function used in the mean submodel. |
link.phi |
link function used in the precision submodel. |
Optimal.Tuning |
logical indicating whether the data-driven algorithm was used. |
pseudo.r.squared |
pseudo R-squared value. |
control |
the control arguments passed to the data-driven algorithm and
optim call. |
std.error |
the standard errors. |
method |
type of estimator used. |
call |
the original function call. |
formula |
the formula used. |
model |
the full model frame. |
terms |
a list with elements "mean ", "precision " and "full "
containing the term objects for the respective models. |
y |
the response variable. |
data |
the dataset used. |
Yuri S. Maluf ([email protected]), Francisco F. Queiroz ([email protected]) and Silvia L. P. Ferrari.
Maluf, Y.S., Ferrari, S.L.P., and Queiroz, F.F. (2022). Robust
beta regression through the logit transformation. arXiv:2209.11315.
Ribeiro, T.K.A. and Ferrari, S.L.P. (2022). Robust estimation in beta regression
via maximum Lq-likelihood. Statistical Papers. DOI: 10.1007/s00362-022-01320-0.
Ghosh, A. (2019). Robust inference under the beta regression model with
application to health care studies. Statistical Methods in Medical
Research, 28:271-888.
Espinheira, P.L., Ferrari, S.L.P., and Cribari-Neto, F. (2008). On beta regression residuals. Journal of Applied Statistics, 35:407–419.
robustbetareg.control
, summary.robustbetareg
, residuals.robustbetareg
#### Risk Manager Cost data data("Firm") # MLE fit (fixed alpha equal to zero) fit_MLE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "LMDPDE", alpha = 0) summary(fit_MLE) # MDPDE with alpha = 0.04 fit_MDPDE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "MDPDE", alpha = 0.04) summary(fit_MDPDE) # Choosing alpha via data-driven algorithm fit_MDPDE2 <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "MDPDE") summary(fit_MDPDE2) # Similar result for the LMDPDE fit: fit_LMDPDE2 <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "LMDPDE") summary(fit_LMDPDE2) # Diagnostic plots #### HIC data data("HIC") # MLE (fixed alpha equal to zero) fit_MLE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "LMDPDE", alpha = 0) summary(fit_MLE) # SMLE and MDPDE with alpha selected via data-driven algorithm fit_SMLE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "SMLE") summary(fit_SMLE) fit_MDPDE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "MDPDE") summary(fit_MDPDE) # SMLE and MDPDE return MLE because of the lack of stability # LSMLE and LMDPDE with alpha selected via data-driven algorithm fit_LSMLE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "LSMLE") summary(fit_LSMLE) fit_LMDPDE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "LMDPDE") summary(fit_LMDPDE) # LSMLE and LMDPDE return robust estimates with alpha = 0.06 # Plotting the weights against the residuals - LSMLE fit. plot(fit_LSMLE$residuals, fit_LSMLE$weights, pch = "+", xlab = "Residuals", ylab = "Weights") # Excluding outlier observation. fit_LSMLEwo1 <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC[-1,], type = "LSMLE") summary(fit_LSMLEwo1) # Normal probability plot with simulated envelope plotenvelope(fit_LSMLE)
#### Risk Manager Cost data data("Firm") # MLE fit (fixed alpha equal to zero) fit_MLE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "LMDPDE", alpha = 0) summary(fit_MLE) # MDPDE with alpha = 0.04 fit_MDPDE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "MDPDE", alpha = 0.04) summary(fit_MDPDE) # Choosing alpha via data-driven algorithm fit_MDPDE2 <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "MDPDE") summary(fit_MDPDE2) # Similar result for the LMDPDE fit: fit_LMDPDE2 <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "LMDPDE") summary(fit_LMDPDE2) # Diagnostic plots #### HIC data data("HIC") # MLE (fixed alpha equal to zero) fit_MLE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "LMDPDE", alpha = 0) summary(fit_MLE) # SMLE and MDPDE with alpha selected via data-driven algorithm fit_SMLE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "SMLE") summary(fit_SMLE) fit_MDPDE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "MDPDE") summary(fit_MDPDE) # SMLE and MDPDE return MLE because of the lack of stability # LSMLE and LMDPDE with alpha selected via data-driven algorithm fit_LSMLE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "LSMLE") summary(fit_LSMLE) fit_LMDPDE <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC, type = "LMDPDE") summary(fit_LMDPDE) # LSMLE and LMDPDE return robust estimates with alpha = 0.06 # Plotting the weights against the residuals - LSMLE fit. plot(fit_LSMLE$residuals, fit_LSMLE$weights, pch = "+", xlab = "Residuals", ylab = "Weights") # Excluding outlier observation. fit_LSMLEwo1 <- robustbetareg(HIC ~ URB + GDP | GDP, data = HIC[-1,], type = "LSMLE") summary(fit_LSMLEwo1) # Normal probability plot with simulated envelope plotenvelope(fit_LSMLE)
Several parameters that control fitting of robust beta regression models using
robustbetareg.
robustbetareg.control( start = NULL, alpha.optimal = TRUE, tolerance = 0.001, maxit = 5000, L = 0.02, M = 3, ... )
robustbetareg.control( start = NULL, alpha.optimal = TRUE, tolerance = 0.001, maxit = 5000, L = 0.02, M = 3, ... )
start |
an optional vector with starting values for the parameter estimates. |
alpha.optimal |
a logical value. If |
tolerance |
numeric tolerance for convergence. |
maxit |
argument passed to |
L |
numeric specifying the threshold for the data-driven algorithm.
Default is |
M |
integer specifying the number of grid spacing for the data-driven
algorithm. Default is |
... |
currently not used. |
The robustbetareg.control
controls the fitting process of
the robust estimation in beta regression via the LMDPDE, LSMLE, MDPDE, and
SMLE. The arguments L
and M
are passed to the data-driven
algorithm for selecting the optimum alpha value; details can be found in
Ribeiro and Ferrari (2022). Starting values for the parameters associated
to the mean and precision submodels may be supplied via start
.
We do not recommend changing the arguments passed to the data-driven algorithm.
A list with components named as the arguments.
Yuri S. Maluf ([email protected]), Francisco F. Queiroz ([email protected]) and Silvia L. P. Ferrari.
Maluf, Y.S., Ferrari, S.L.P., and Queiroz, F.F. (2022). Robust
beta regression through the logit transformation. arXiv:2209.11315.
Ribeiro, K.A.T. and Ferrari, S.L.P. (2022). Robust estimation in beta regression
via maximum Lq-likelihood. Statistical Papers. DOI: 10.1007/s00362-022-01320-0.
data("Firm") # Using a robust start value for the parameters associated with the # mean submodel # using the robustbase package # robust regression to obtain a starting value for beta fit_lm_Rob <- robustbase::lmrob(FIRMCOST ~ SIZELOG + INDCOST, data = Firm) initials_beta_rob <- as.numeric(coef(fit_lm_Rob)) etarob <- model.matrix(fit_lm_Rob)%*%initials_beta_rob muhat_Rob <- set.link(link.mu = "logit", link.phi = "log")$linkfun.mu$inv.link(etarob) T_1_Rob <- 1/set.link(link.mu = "logit", link.phi = "log")$linkfun.mu$d.linkfun(muhat_Rob) #estimate of variance of y based on robustbase package sigma2hat_Rob <- ((fit_lm_Rob$scale^2)*(T_1_Rob^2)) #phi estimate from robust method phihat_Rob <- mean((muhat_Rob*(1-muhat_Rob))/sigma2hat_Rob) gama1hat_rob <- log(phihat_Rob) #gamma estimates from robustbase package initials_gama_rob <- as.numeric(gama1hat_rob) #robust starting values for beta and gamma thetaStart <- c(initials_beta_rob, initials_gama_rob) fit_LSMLE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "LSMLE", link.phi = "log", control = robustbetareg.control(start = thetaStart))
data("Firm") # Using a robust start value for the parameters associated with the # mean submodel # using the robustbase package # robust regression to obtain a starting value for beta fit_lm_Rob <- robustbase::lmrob(FIRMCOST ~ SIZELOG + INDCOST, data = Firm) initials_beta_rob <- as.numeric(coef(fit_lm_Rob)) etarob <- model.matrix(fit_lm_Rob)%*%initials_beta_rob muhat_Rob <- set.link(link.mu = "logit", link.phi = "log")$linkfun.mu$inv.link(etarob) T_1_Rob <- 1/set.link(link.mu = "logit", link.phi = "log")$linkfun.mu$d.linkfun(muhat_Rob) #estimate of variance of y based on robustbase package sigma2hat_Rob <- ((fit_lm_Rob$scale^2)*(T_1_Rob^2)) #phi estimate from robust method phihat_Rob <- mean((muhat_Rob*(1-muhat_Rob))/sigma2hat_Rob) gama1hat_rob <- log(phihat_Rob) #gamma estimates from robustbase package initials_gama_rob <- as.numeric(gama1hat_rob) #robust starting values for beta and gamma thetaStart <- c(initials_beta_rob, initials_gama_rob) fit_LSMLE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST, data = Firm, type = "LSMLE", link.phi = "log", control = robustbetareg.control(start = thetaStart))
This function provides several link functions for robust beta regression.
set.link(link.mu = "logit", link.phi = "log")
set.link(link.mu = "logit", link.phi = "log")
link.mu |
character specifying the mean link function. Currently, the functions
" |
link.phi |
character specifying the precision link function. Currently,
the functions " |
set.link
provides the link function, inverse link function, first and
second derivatives for both mean and precision submodels.
links = set.link(link.mu = "cauchit", link.phi = "sqrt") attributes(links)
links = set.link(link.mu = "cauchit", link.phi = "sqrt") attributes(links)
waldtypetest
provides Wald-type tests for both simple and composite
hypotheses for beta regression based on the robust estimators
(LSMLE, LMDPDE, SMLE, and MDPDE).
waldtypetest(object, FUN, ...)
waldtypetest(object, FUN, ...)
object |
fitted model object of class |
FUN |
function representing the null hypothesis to be tested. |
... |
further arguments to be passed to the |
The function provides a robust Wald-type test for a general hypothesis
, for a fixed
, against
a two sided alternative; see Maluf et al. (2022) for details.
The argument
FUN
specifies the function ,
which defines the null hypothesis. For instance, consider a model with
and let the
null hypothesis be
. The
FUN
argument can be
FUN = function(theta) {theta[2] + theta[3] - 4}
. It is also possible to
define the function as FUN = function(theta, B) {theta[2] + theta[3] - B}
,
and pass the B
argument through the waldtypetest
function.
If no function is specified, that is, FUN=NULL
, the waldtypetest
returns a test in which the null hypothesis is that all the coefficients are
zero.
waldtypetest
returns an output for the Wald-type test containing
the value of the test statistic, degrees-of-freedom and p-value.
Yuri S. Maluf ([email protected]), Francisco F. Queiroz ([email protected]) and Silvia L. P. Ferrari.
Maluf, Y. S., Ferrari, S. L. P., and Queiroz, F. F. (2022). Robust
beta regression through the logit transformation. arXiv:2209.11315.
Basu, A., Ghosh, A., Martin, N., and Pardo, L. (2018). Robust Wald-type
tests for non-homogeneous observations based on the minimum density
power divergence estimator. Metrika, 81:493–522.
Ribeiro, K. A. T. and Ferrari, S. L. P. (2022). Robust estimation in beta
regression via maximum Lq-likelihood. Statistical Papers.
# generating a dataset set.seed(2022) n <- 40 beta.coef <- c(-1, -2) gamma.coef <- c(5) X <- cbind(rep(1, n), x <- runif(n)) mu <- exp(X%*%beta.coef)/(1 + exp(X%*%beta.coef)) phi <- exp(gamma.coef) #Inverse Log Link Function y <- rbeta(n, mu*phi, (1 - mu)*phi) y[26] <- rbeta(1, ((1 + mu[26])/2)*phi, (1 - ((1 + mu[26])/2))*phi) SimData <- as.data.frame(cbind(y, x)) colnames(SimData) <- c("y", "x") # Fitting the MLE and the LSMLE fit.mle <- robustbetareg(y ~ x | 1, data = SimData, alpha = 0) fit.lsmle <- robustbetareg(y ~ x | 1, data = SimData) # Hypothesis to be tested: (beta_1, beta_2) = c(-1, -2) against a two # sided alternative h0 <- function(theta){theta[1:2] - c(-1, -2)} waldtypetest(fit.mle, h0) waldtypetest(fit.lsmle, h0) # Alternative way: h0 <- function(theta, B){theta[1:2] - B} waldtypetest(fit.mle, h0, B = c(-1, -2)) waldtypetest(fit.lsmle, h0, B = c(-1, -2))
# generating a dataset set.seed(2022) n <- 40 beta.coef <- c(-1, -2) gamma.coef <- c(5) X <- cbind(rep(1, n), x <- runif(n)) mu <- exp(X%*%beta.coef)/(1 + exp(X%*%beta.coef)) phi <- exp(gamma.coef) #Inverse Log Link Function y <- rbeta(n, mu*phi, (1 - mu)*phi) y[26] <- rbeta(1, ((1 + mu[26])/2)*phi, (1 - ((1 + mu[26])/2))*phi) SimData <- as.data.frame(cbind(y, x)) colnames(SimData) <- c("y", "x") # Fitting the MLE and the LSMLE fit.mle <- robustbetareg(y ~ x | 1, data = SimData, alpha = 0) fit.lsmle <- robustbetareg(y ~ x | 1, data = SimData) # Hypothesis to be tested: (beta_1, beta_2) = c(-1, -2) against a two # sided alternative h0 <- function(theta){theta[1:2] - c(-1, -2)} waldtypetest(fit.mle, h0) waldtypetest(fit.lsmle, h0) # Alternative way: h0 <- function(theta, B){theta[1:2] - B} waldtypetest(fit.mle, h0, B = c(-1, -2)) waldtypetest(fit.lsmle, h0, B = c(-1, -2))