Type: | Package |
Title: | Time-Dependent Precision-Recall Curve Estimation for Right-Censored Data |
Version: | 1.0.0 |
Description: | Provides functions to estimate the time-dependent precision-recall curve (PRC) and the corresponding area under the PRC for right-censored survival data. It also computes the time-dependent ROC curve and its corresponding area under the ROC curve (AUC). See Beyene, Chen and Kifle (2024) <doi:10.1002/bimj.202300135>. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Depends: | R(≥ 4.0) |
Imports: | survidm, graphics, stats |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-09-10 14:19:59 UTC; m2kas |
Author: | Kassu Mehari Beyene [aut, cre], Ding-Geng Chen [ctb], Yehenew Getachew Kifle [ctb] |
Maintainer: | Kassu Mehari Beyene <m2kassu@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-09-15 09:00:02 UTC |
tdPRC
Description
Tools for estimating and visualizing the time-dependent precision-recall curve and the receiver operating characteristic (ROC) curve. The area under the precision-recall curve (AUPRC) and the area under the ROC curve (AUC) can be compared with statistical tests based on bootstrap standard errors. Confidence intervals can be computed for AUPRC and AUC.
Abbreviations
In this package, the following abbreviations are commonly used:
- TPR
True Positive Rate.
- FPR
False Positive Rate.
- PPV
Positive Predictive Value.
- AUPRC
Area under the precision-recall curve.
- ROC
The Receiver Operating Characteristic curve.
- AUC
Area under the ROC curve at a given time horizon t.
Dataset
This package comes with a right-censored data set with 312 observations and 4 variables. For details, see mayo.
Installing and using
Ensure that your system has an active internet connection, then execute the following command in the R console to install the package:
install.packages("tdPRC")
To load the package after installation, use the following command:
library(tdPRC)
Author(s)
Kassu Mehari Beyene, Ding-Geng Chen and Yehenew Getachew Kifle
Maintainer: Kassu Mehari Beyene <m2kassu@gmail.com>
References
Beran, R. (1981). Nonparametric regression with randomly censored survival data. Technical report, University of California, Berkeley.
Beyene, K.M., Chen, D.G., and Kifle, Y.G. (2024). A novel nonparametric time‐dependent precision–recall curve estimator for right‐censored survival data. Biometrical Journal, 66(3), 2300135.
Beyene, K.M. and El Ghouch, A. (2020). Smoothed time-dependent receiver operating characteristic curve for right censored survival data. Statistics in Medicine, 39: 3373-3396.
Heagerty, P.J., and Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics, 61(1), 92-105.
Li, L., Greene, T. and Hu, B. (2016). A simple method to estimate the time-dependent receiver operating characteristic curve and the area under the curve with right censored data, Statistical Methods in Medical Research, 27(8): 2264-2278.
Sheather, S.J. and Jones, M.C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society. Series B (Methodological), 53(3): 683-690.
Conditional probability given the observed data estimate
Description
This is the base function of the package. It uses the Beran nonparametric conditional survival function estimator for right-censored data to estimate the conditional probability of the event T<t (i.e., the event occurring before time t) or T>t (i.e., the event occurring after time t), given the observed data.
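As a rough illustration of the idea, the base-R sketch below computes a Beran-type conditional survival estimate with Gaussian kernel weights and turns it into a per-subject probability that the event occurred by a time horizon. It is not the package's implementation: the helper names `beran_surv` and `cond_prob_event`, the simulated data, the fixed bandwidth, and the coding `censor == 1` for an observed event are all assumptions made for this example.

```r
# Hypothetical helper: Beran estimate of S(t0 | m0) with Gaussian kernel weights.
# Assumes censor == 1 marks an observed event and 0 a censored time.
beran_surv <- function(t0, m0, Y, M, censor, h) {
  w <- dnorm((m0 - M) / h)
  w <- w / sum(w)                                   # Nadaraya-Watson weights
  at_risk <- sapply(Y, function(y) sum(w[Y >= y]))  # weight still at risk at each Y_i
  hazard <- ifelse(at_risk > 0, w / at_risk, 0)
  prod(ifelse(censor == 1 & Y <= t0, 1 - hazard, 1))
}

# Hypothetical helper: P(T <= t0 | observed data) for every subject.
cond_prob_event <- function(Y, M, censor, t0, h) {
  vapply(seq_along(Y), function(i) {
    if (Y[i] > t0) return(0)        # still event-free at t0
    if (censor[i] == 1) return(1)   # event observed before t0
    # Censored before t0: status unknown, so use 1 - S(t0 | M_i) / S(Y_i | M_i).
    s_t <- beran_surv(t0, M[i], Y, M, censor, h)
    s_y <- beran_surv(Y[i], M[i], Y, M, censor, h)
    if (s_y > 0) 1 - s_t / s_y else 1
  }, numeric(1))
}

# Simulated right-censored data (illustrative only).
set.seed(1)
n <- 60
M <- rnorm(n)
event_time <- rexp(n, rate = exp(M))
cens_time <- rexp(n, rate = 0.5)
Y <- pmin(event_time, cens_time)
censor <- as.integer(event_time <= cens_time)

p_event <- cond_prob_event(Y, M, censor, t0 = 1, h = 0.5)
range(p_event)
```

Only subjects censored before the horizon receive a fractional probability; everyone else is classified with certainty, which is why the estimator needs smoothing only for that subgroup.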
Usage
Csurv(Y, M, censor, t, h=NULL, ktype="gaussian")
Arguments
Y |
The numeric vector of event-time or observed time. |
M |
The numeric vector of marker values. |
censor |
The censoring indicator, |
t |
A scalar time point used to calculate the ROC curve. |
h |
A scalar bandwidth value for kernel weights estimation. The default is the value obtained using the method of Sheather and Jones (1991). |
ktype |
A character string specifying the desired kernel needed for Beran weight calculation. The possible options are " |
Value
Return a list containing:
positive |
estimate of |
negative |
estimate of |
References
Beyene, K.M., Chen, D.G., and Kifle, Y.G. (2024). A novel nonparametric time‐dependent precision-recall curve estimator for right‐censored survival data. Biometrical Journal, 66(3), 2300135.
Beyene, K.M. and El Ghouch, A. (2020). Smoothed time-dependent receiver operating characteristic curve for right censored survival data. Statistics in Medicine, 39: 3373-3396.
Examples
library(tdPRC)
data(mayo)
data <- mayo[, c("time", "censor", "mayoscore5")]
# Time horizon at which the conditional probabilities are evaluated
t <- 365 * 6
est <- Csurv(Y = data$time, M = data$mayoscore5, censor = data$censor, t = t, ktype = "gaussian")
summary(est$positive)
Mayo Marker Data
Description
Two marker values, together with the event time and censoring status, for the subjects in the Mayo PBC data.
Usage
data(mayo)
Format
A data frame with 312 observations and 4 variables: time (event time/censoring time), censor (censoring indicator), mayoscore4, mayoscore5. The two scores are derived from 4 and 5 covariates respectively.
References
Heagerty, P. J., & Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics, 61(1), 92-105.
Time-dependent precision-recall curve (PRC) estimation from right-censored survival data
Description
This function empirically estimates the time-dependent precision-recall curve and ROC curve for right-censored survival data using the cumulative sensitivity and dynamic specificity definitions. It also calculates the time-dependent area under the precision-recall curve (AUPRC) and the area under the ROC curve (AUC). The function computes the standard error and confidence interval of AUPRC and AUC using a nonparametric bootstrap approach.
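To make the quantities concrete, the base-R sketch below builds TPR, FPR and PPV over a grid of cutoffs from a vector of per-subject event probabilities and integrates the two curves with the trapezoidal rule. It is only an illustration: the helper names `pr_roc_from_probs` and `trapz`, the toy inputs, the rule "marker above the cutoff is test-positive", and the trapezoidal integration are assumptions, and the package may compute these quantities differently.

```r
# Hypothetical helper: p[i] is the probability that subject i had the event by
# time t; a subject is called test-positive when its marker M[i] exceeds the cutoff.
pr_roc_from_probs <- function(p, M, cut = NULL) {
  # Decreasing cutoffs make TPR and FPR non-decreasing along the grid.
  if (is.null(cut)) cut <- c(sort(unique(M), decreasing = TRUE), -Inf)
  pos <- sum(p)
  neg <- sum(1 - p)
  out <- t(sapply(cut, function(cc) {
    sel <- M > cc
    c(TPR = sum(p[sel]) / pos,
      FPR = sum(1 - p[sel]) / neg,
      PPV = if (any(sel)) sum(p[sel]) / sum(sel) else NA_real_)
  }))
  data.frame(cut = cut, out)
}

# Trapezoidal area along the points in the order given, skipping undefined values.
trapz <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)
  x <- x[ok]
  y <- y[ok]
  n <- length(x)
  if (n < 2) return(NA_real_)
  sum(diff(x) * (y[-1] + y[-n]) / 2)
}

# Toy input: the five largest marker values belong to the events.
M_toy <- 1:10
p_toy <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)

curves <- pr_roc_from_probs(p_toy, M_toy)
auc_toy <- trapz(curves$FPR, curves$TPR)     # area under the ROC curve
auprc_toy <- trapz(curves$TPR, curves$PPV)   # area under the PR curve
```

PPV is undefined when no subject exceeds the cutoff, so that grid point is dropped before integrating the PR curve; this means the sketch integrates only over the range of recall where precision is defined.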
Usage
tdPRC(
Y,
M,
censor,
t,
cut = NULL,
len = 1000,
h = 0.1,
ktype = "gaussian",
B = 0,
alpha = 0.05,
plot = FALSE
)
Arguments
Y |
The numeric vector of event-time or observed time. |
M |
The numeric vector of marker values. |
censor |
The censoring indicator, |
t |
A scalar time point used to calculate the PRC curve. |
cut |
A grid of cutoff values for estimation is computed. Default is sequence of |
len |
The length of the grid points. Default is |
h |
A scalar value for Beran's weight calculations. The default is the value obtained by using the method of Sheather and Jones (1991). |
ktype |
A character string specifying the desired kernel needed for Beran weight calculation. The possible options are " |
B |
The number of bootstrap samples to be used for variance estimation. The default is |
alpha |
The significance level. The default is |
plot |
The logical parameter to see the ROC curve plot. Default is |
Value
Returns the following items:
TPR |
vector of estimated TPR. |
FPR |
vector of estimated FPR. |
PPV |
vector of estimated PPV. |
AUPRC |
estimated area under the PR curve at a given time horizon |
AUC |
estimated area under the ROC curve at a given time horizon |
APbot |
estimated area under the PR curve for each bootstrap sample at a given time horizon |
dat |
a data frame with two columns: po = positive and M = marker. |
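The sketch below shows, in base R, how a vector of per-resample estimates such as APbot can be turned into a standard error and a confidence interval. The toy data frame, the statistic `stat_fun`, and both interval constructions (percentile and normal approximation) are assumptions made for this example; the page above does not state which interval the package reports.

```r
# Toy right-censored-style data; rows are resampled jointly, as in a
# nonparametric bootstrap of (Y, M, censor) triples.
set.seed(42)
toy <- data.frame(Y = rexp(200), M = rnorm(200), censor = rbinom(200, 1, 0.7))

# Hypothetical statistic standing in for AUPRC or AUC.
stat_fun <- function(d) mean(d$censor)

B <- 500        # number of bootstrap samples
alpha <- 0.05   # significance level

boot_est <- replicate(B, {
  idx <- sample(nrow(toy), replace = TRUE)
  stat_fun(toy[idx, ])
})

est <- stat_fun(toy)
se_boot <- sd(boot_est)                                  # bootstrap standard error
ci_pct <- quantile(boot_est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
ci_norm <- est + c(-1, 1) * qnorm(1 - alpha / 2) * se_boot
```

The percentile interval uses the bootstrap distribution directly, while the normal-approximation interval relies only on its standard deviation; with a skewed statistic the two can differ noticeably.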
References
Beyene, K.M., Chen, D.G., and Kifle, Y.G. (2024). A novel nonparametric time‐dependent precision-recall curve estimator for right‐censored survival data. Biometrical Journal, 66(3), 2300135.
Beyene, K.M. and El Ghouch, A. (2020). Smoothed time-dependent receiver operating characteristic curve for right censored survival data. Statistics in Medicine, 39: 3373-3396.
Li, L., Greene, T. and Hu, B. (2016). A simple method to estimate the time-dependent receiver operating characteristic curve and the area under the curve with right censored data, Statistical Methods in Medical Research, 27(8): 2264-2278.
Examples
library(tdPRC)
data(mayo)
data <- mayo[, c("time", "censor", "mayoscore5")]
# Time horizon at which the curves are estimated
t <- 365 * 6
resu <- tdPRC(Y = data$time, M = data$mayoscore5, censor = data$censor, t = t, cut = NULL,
              len = 1000, h = 0.1, plot = TRUE)
resu$AUPRC