| Type: | Package | 
| Title: | Two-Part Estimation of Treatment Rules for Semi-Continuous Data | 
| Version: | 0.0.1 | 
| Description: | Implements the methodology of Huling, Smith, and Chen (2020) <doi:10.1080/01621459.2020.1801449>, which allows for subgroup identification for semi-continuous outcomes by estimating individualized treatment rules. It uses a two-part modeling framework to handle semi-continuous data by separately modeling the positive part of the outcome and an indicator of whether each outcome is positive, but still results in a single treatment rule. High dimensional data is handled with a cooperative lasso penalty, which encourages the coefficients in the two models to have the same sign. | 
| URL: | https://github.com/jaredhuling/personalized2part | 
| BugReports: | https://github.com/jaredhuling/personalized2part/issues | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| Depends: | personalized, HDtweedie | 
| LinkingTo: | Rcpp, RcppEigen | 
| Imports: | Rcpp, foreach, methods | 
| RoxygenNote: | 7.1.1 | 
| NeedsCompilation: | yes | 
| Packaged: | 2020-09-02 20:55:06 UTC; jared | 
| Author: | Jared Huling  | 
| Maintainer: | Jared Huling <jaredhuling@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2020-09-10 10:00:03 UTC | 
Fit a penalized gamma augmentation model via cross fitting
Description
Fits a penalized gamma augmentation model via cross fitting and returns vector of length n of out of sample predictions on the link scale from cross fitting
Usage
HDtweedie_kfold_aug(
  x,
  y,
  trt,
  wts = NULL,
  K = 10,
  p = 1.5,
  interactions = FALSE
)
Arguments
x | 
 an n x p matrix of covariates for the zero part data, where each row is an observation and each column is a predictor. MUST be ordered such that the first n_s rows align with the observations in x_s and s  | 
y | 
 a length n vector of responses taking positive values  | 
trt | 
 a length n vector of treatment variables with 1 indicating treatment and -1 indicating control  | 
wts | 
 a length n vector of sample weights  | 
K | 
 number of folds for cross fitting  | 
p | 
 tweedie mixing parameter. See   | 
interactions | 
 boolean variable of whether or not to fit model with interactions. For predictions, interactions will be integrated out  | 
Cross validation for hd2part models
Description
Cross validation for hd2part models
Usage
cv.hd2part(
  x,
  z,
  x_s,
  s,
  weights = rep(1, NROW(x)),
  weights_s = rep(1, NROW(x_s)),
  offset = NULL,
  offset_s = NULL,
  lambda = NULL,
  type.measure = c("mae", "mse", "sep-auc-mse", "sep-auc-mae"),
  nfolds = 10,
  foldid = NULL,
  grouped = TRUE,
  keep = FALSE,
  parallel = FALSE,
  ...
)
Arguments
x | 
 an n x p matrix of covariates for the zero part data, where each row is an observation and each column is a predictor. MUST be ordered such that the first n_s rows align with the observations in x_s and s  | 
z | 
 a length n vector of responses taking values 1 and 0, where 1 indicates the response is positive and zero indicates the response has value 0. MUST be ordered such that the first n_s values align with the observations in x_s and s  | 
x_s | 
 an n_s x p matrix of covariates (which is a submatrix of x) for the positive part data, where each row is an observation and each column is a predictor  | 
s | 
 a length n_s vector of responses taking strictly positive values  | 
weights | 
 a length n vector of observation weights for the zero part data  | 
weights_s | 
 a length n_s vector of observation weights for the positive part data  | 
offset | 
 a length n vector of offset terms for the zero part data  | 
offset_s | 
 a length n_s vector of offset terms for the positive part data  | 
lambda | 
 A user supplied lambda sequence. By default, the program computes its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this.  | 
type.measure | 
 measure to evaluate for cross-validation. Will add more description later  | 
nfolds | 
 number of folds for cross-validation. default is 10. 3 is smallest value allowed.  | 
foldid | 
 an optional vector of values between 1 and nfold specifying which fold each observation belongs to.  | 
grouped | 
 Like in glmnet, this is an experimental argument, with default   | 
keep | 
 If   | 
parallel | 
 If TRUE, use parallel foreach to fit each fold. Must register parallel before hand, such as doMC.  | 
... | 
 other parameters to be passed to   | 
Examples
set.seed(1)
Fitting subgroup identification models for semicontinuous positive outcomes
Description
Fits subgroup identification models
Usage
fit_subgroup_2part(
  x,
  y,
  trt,
  propensity.func = NULL,
  propensity.func.positive = NULL,
  match.id = NULL,
  augment.func.zero = NULL,
  augment.func.positive = NULL,
  cutpoint = 1,
  larger.outcome.better = TRUE,
  penalize.ate = TRUE,
  y_eps = 1e-06,
  ...
)
Arguments
x | 
 The design matrix (not including intercept term)  | 
y | 
 The nonnegative response vector  | 
trt | 
 treatment vector with each element equal to a 0 or a 1, with 1 indicating treatment status is active.  | 
propensity.func | 
 function that inputs the design matrix x and the treatment vector trt and outputs
the propensity score, ie Pr(trt = 1 | X = x). Function should take two arguments 1) x and 2) trt. See example below.
For a randomized controlled trial this can simply be a function that returns a constant equal to the proportion
of patients assigned to the treatment group, i.e.:
  | 
propensity.func.positive | 
 function that inputs the design matrix x and the treatment vector trt and outputs
the propensity score for units with positive outcome values, ie Pr(trt = 1 | X = x, Z = 1). Function should take
two arguments 1) x and 2) trt. See example below.
For a randomized controlled trial this can simply be a function that returns a constant equal to the proportion
of patients assigned to the treatment group, i.e.:
  | 
match.id | 
 a (character, factor, or integer) vector with length equal to the number of observations in  Example 1:  Example 2: 
augment.func <- function(x, y, trt) {
    data <- data.frame(x, y, trt)
    lmod <- glm(y ~ x * trt, family = binomial())
    ## get predictions when trt = 1
    data$trt <- 1
    preds_1  <- predict(lmod, data, type = "response")
    ## get predictions when trt = -1
    data$trt <- -1
    preds_n1 <- predict(lmod, data, type = "response")
    ## return predictions averaged over trt
    return(0.5 * (preds_1 + preds_n1))
}
 | 
augment.func.zero | 
 (similar to augment.func.positive) function which inputs the
indicators of whether each response is positive (  | 
augment.func.positive | 
 (similar to augment.func.zero) function which inputs the positive part response
(ie all observations in   | 
cutpoint | 
 numeric value for patients with benefit scores above which
(or below which if   | 
larger.outcome.better | 
 boolean value of whether a larger outcome is better/preferable. Set to   | 
penalize.ate | 
 should the treatment main effect (ATE) be penalized too?  | 
y_eps | 
 positive value above which observations in   | 
... | 
 options to be passed to   | 
Examples
set.seed(42)
dat <- sim_semicontinuous_data(250, n.vars = 15)
x <- dat$x
y <- dat$y
trt <- dat$trt
prop_func <- function(x, trt)
{
    propensmod <- glm(trt ~ x, family = binomial())
    propens <- unname(fitted(propensmod))
    propens
}
fitted_model <- fit_subgroup_2part(x, y, trt, prop_func, prop_func)
fitted_model
## correlation of estimated covariate-conditional risk ratio and truth
cor(fitted_model$benefit.scores, dat$treatment_risk_ratio, method = "spearman")
Main fitting function for group lasso and cooperative lasso penalized two part models
Description
This function fits penalized two part models with a logistic regression model for the zero part and a gamma regression model for the positive part. Each covariate's effect has either a group lasso or cooperative lasso penalty for its effects for the two consituent models
Usage
hd2part(
  x,
  z,
  x_s,
  s,
  weights = rep(1, NROW(x)),
  weights_s = rep(1, NROW(x_s)),
  offset = NULL,
  offset_s = NULL,
  penalty = c("grp.lasso", "coop.lasso"),
  penalty_factor = NULL,
  nlambda = 100L,
  lambda_min_ratio = ifelse(n_s < p, 0.05, 0.005),
  lambda = NULL,
  tau = 0,
  opposite_signs = FALSE,
  flip_beta_zero = FALSE,
  intercept_z = FALSE,
  intercept_s = FALSE,
  strongrule = TRUE,
  maxit_irls = 50,
  tol_irls = 1e-05,
  maxit_mm = 500,
  tol_mm = 1e-05,
  balance_likelihoods = TRUE
)
Arguments
x | 
 an n x p matrix of covariates for the zero part data, where each row is an observation and each column is a predictor  | 
z | 
 a length n vector of responses taking values 1 and 0, where 1 indicates the response is positive and zero indicates the response has value 0.  | 
x_s | 
 an n_s x p matrix of covariates (which is a submatrix of x) for the positive part data, where each row is an observation and each column is a predictor  | 
s | 
 a length n_s vector of responses taking strictly positive values  | 
weights | 
 a length n vector of observation weights for the zero part data  | 
weights_s | 
 a length n_s vector of observation weights for the positive part data  | 
offset | 
 a length n vector of offset terms for the zero part data  | 
offset_s | 
 a length n_s vector of offset terms for the positive part data  | 
penalty | 
 either   | 
penalty_factor | 
 a length p vector of penalty adjustment factors corresponding to each covariate. A value of 0 in the jth location indicates no penalization on the jth variable, and any positive value will indicate a multiplicative factor on top of the common penalization amount. The default value is 1 for all variables  | 
nlambda | 
 the number of lambda values. The default is 100.  | 
lambda_min_ratio | 
 Smallest value for   | 
lambda | 
 a user supplied sequence of penalization tuning parameters. By default, the program automatically
chooses a sequence of lambda values based on   | 
tau | 
 value between 0 and 1 for sparse group mixing penalty. 0 implies either group lasso or coop lasso and 1 implies lasso  | 
opposite_signs | 
 a boolean variable indicating whether the signs of coefficients across models should be encouraged to have
opposite signs instead of the same signs. Default is   | 
flip_beta_zero | 
 should we flip the signs of the parameters for the zero part model? Defaults to   | 
intercept_z | 
 whether or not to include an intercept in the zero part model. Default is   | 
intercept_s | 
 whether or not to include an intercept in the positive part model. Default is   | 
strongrule | 
 should a strong rule be used? Defaults to   | 
maxit_irls | 
 maximum number of IRLS iterations  | 
tol_irls | 
 convergence tolerance for IRLS iterations  | 
maxit_mm | 
 maximum number of MM iterations. Note that for   | 
tol_mm | 
 convergence tolerance for MM iterations. Note that for   | 
balance_likelihoods | 
 should the likelihoods be balanced so variables would enter both models at the same value of lambda
if the penalty were a lasso penalty? Recommended to keep at the default,   | 
Examples
library(personalized2part)
Fitting function for lasso penalized GLMs
Description
This function fits penalized gamma GLMs
Usage
hdgamma(
  x,
  y,
  weights = rep(1, NROW(x)),
  offset = NULL,
  penalty_factor = NULL,
  nlambda = 100L,
  lambda_min_ratio = ifelse(n < p, 0.05, 0.005),
  lambda = NULL,
  tau = 0,
  intercept = TRUE,
  strongrule = TRUE,
  maxit_irls = 50,
  tol_irls = 1e-05,
  maxit_mm = 500,
  tol_mm = 1e-05
)
Arguments
x | 
 an n x p matrix of covariates for the zero part data, where each row is an observation and each column is a predictor  | 
y | 
 a length n vector of responses taking strictly positive values.  | 
weights | 
 a length n vector of observation weights  | 
offset | 
 a length n vector of offset terms  | 
penalty_factor | 
 a length p vector of penalty adjustment factors corresponding to each covariate. A value of 0 in the jth location indicates no penalization on the jth variable, and any positive value will indicate a multiplicative factor on top of the common penalization amount. The default value is 1 for all variables  | 
nlambda | 
 the number of lambda values. The default is 100.  | 
lambda_min_ratio | 
 Smallest value for   | 
lambda | 
 a user supplied sequence of penalization tuning parameters. By default, the program automatically
chooses a sequence of lambda values based on   | 
tau | 
 a scalar numeric value between 0 and 1 (included) which is a mixing parameter for sparse group lasso penalty. 0 indicates group lasso and 1 indicates lasso, values in between reflect different emphasis on group and lasso penalties  | 
intercept | 
 whether or not to include an intercept. Default is   | 
strongrule | 
 should a strong rule be used?  | 
maxit_irls | 
 maximum number of IRLS iterations  | 
tol_irls | 
 convergence tolerance for IRLS iterations  | 
maxit_mm | 
 maximum number of MM iterations. Note that for   | 
tol_mm | 
 convergence tolerance for MM iterations. Note that for   | 
Examples
library(personalized2part)
Plot method for hd2part fitted objects
Description
Plot method for hd2part fitted objects
Usage
## S3 method for class 'hd2part'
plot(
  x,
  model = c("zero", "positive"),
  xvar = c("loglambda", "norm", "lambda"),
  labsize = 0.6,
  xlab = iname,
  ylab = NULL,
  main = paste(model, "model"),
  ...
)
## S3 method for class 'cv.hd2part'
plot(x, sign.lambda = 1, ...)
Arguments
x | 
 fitted "hd2part" model object  | 
model | 
 either   | 
xvar | 
 What is on the X-axis.   | 
labsize | 
 size of labels for variable names. If labsize = 0, then no variable names will be plotted  | 
xlab | 
 label for x-axis  | 
ylab | 
 label for y-axis  | 
main | 
 main title for plot  | 
... | 
 other graphical parameters for the plot  | 
sign.lambda | 
 Either plot against log(lambda) (default) or its negative if   | 
Examples
set.seed(123)
set.seed(123)
Prediction function for fitted cross validation hd2part objects
Description
Prediction function for fitted cross validation hd2part objects
Usage
## S3 method for class 'cv.hd2part'
predict(
  object,
  newx,
  model = c("zero", "positive"),
  s = c("lambda.min", "lambda.1se"),
  type = c("link", "model_response", "response", "coefficients", "nonzero"),
  ...
)
Arguments
object | 
 fitted   | 
newx | 
 Matrix of new values for   | 
model | 
 either   | 
s | 
 Value(s) of the penalty parameter lambda at which predictions are required. Default is the entire sequence used to create
the model. For   | 
type | 
 Type of prediction required.   | 
... | 
 arguments to be passed to   | 
Examples
set.seed(123)
Prediction method for two part fitted objects
Description
Prediction method for two part fitted objects
Usage
## S3 method for class 'hd2part'
predict(
  object,
  newx,
  s = NULL,
  model = c("zero", "positive"),
  type = c("link", "model_response", "response", "coefficients", "nonzero"),
  newoffset = NULL,
  ...
)
Arguments
object | 
 fitted "hd2part" model object  | 
newx | 
 Matrix of new values for   | 
s | 
 Value(s) of the penalty parameter lambda for the zero part at which predictions are required. Default is the entire sequence used to create the model.  | 
model | 
 either   | 
type | 
 Type of prediction required.   | 
newoffset | 
 f an offset is used in the fit, then one must be supplied for making predictions  | 
... | 
 not used  | 
Value
An object depending on the type argument
Examples
set.seed(1)
Generates data from a two part distribution with a point mass at zero and heterogeneous treatment effects
Description
Generates semicontinuous data with heterogeneity of treatment effect
Usage
sim_semicontinuous_data(n.obs = 1000, n.vars = 25)
Arguments
n.obs | 
 number of observations  | 
n.vars | 
 number of variables. Must be at least 10  | 
Value
returns list with values y for outcome, x for design matrix, trt for
treatment assignments, betanonzero for true coefficients for treatment-covariate interactions for model for
whether or not a response is nonzero, betapos for true coefficients for treatment-covariate interactions
for positive model, treatment_risk_ratio for the true covariate-conditional treatment effect risk ratio for
each observation, pi.x for the true underlying propensity score