| Type: | Package | 
| Title: | High-Dimensional Spatial Covariate-Augmented Overdispersed Poisson Factor Model | 
| Version: | 1.3 | 
| Date: | 2025-03-27 | 
| Author: | Wei Liu [aut, cre], Qingzhi Zhong [aut] | 
| Maintainer: | Wei Liu <liuwei8@scu.edu.cn> | 
| Description: | A spatial covariate-augmented overdispersed Poisson factor model for efficient latent representation learning on high-dimensional, large-scale spatial count data with additional covariates. | 
| License: | GPL-3 | 
| URL: | https://github.com/feiyoung/SpaCOAP | 
| BugReports: | https://github.com/feiyoung/SpaCOAP/issues | 
| Imports: | LaplacesDemon, stats, methods, Matrix, MASS, Rcpp (≥ 1.0.10) | 
| Depends: | irlba, R (≥ 3.5.0) | 
| Suggests: | knitr, rmarkdown | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| VignetteBuilder: | knitr | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | yes | 
| Packaged: | 2025-03-27 13:55:40 UTC; 10297 | 
| Repository: | CRAN | 
| Date/Publication: | 2025-03-27 14:30:01 UTC | 
Fit the SpaCOAP model
Description
Fit the spatial covariate-augmented overdispersed Poisson factor model
Usage
SpaCOAP(
  X_count,
  Adj_sp,
  H,
  Z = matrix(1, nrow(X_count), 1),
  offset = rep(0, nrow(X_count)),
  rank_use = 5,
  q = 15,
  epsELBO = 1e-08,
  maxIter = 30,
  verbose = TRUE,
  add_IC_inter = FALSE,
  seed = 1,
  algo = 1
)
Arguments
X_count | 
 a count matrix, the observed count matrix with shape n-by-p.  | 
Adj_sp | 
 a sparse matrix, the weighted adjacency matrix.  | 
H | 
 an n-by-d matrix, the covariate matrix associated with the low-rank regression coefficient matrix.  | 
Z | 
 an optional matrix, the fixed-dimensional covariate matrix of control variables; defaults to a column of ones when there are no additional covariates.  | 
offset | 
 an optional vector, the offset for each unit; defaults to a vector of zeros.  | 
rank_use | 
 an optional integer, the rank of the regression coefficient matrix; default is 5.  | 
q | 
 an optional integer, the number of factors; default is 15.  | 
epsELBO | 
 an optional positive value, the tolerance on the relative change of the evidence lower bound (ELBO); default is 1e-8.  | 
maxIter | 
 the maximum number of iterations of the VEM algorithm; default is 30.  | 
verbose | 
 a logical value, whether to print progress information during the iterations; default is TRUE.  | 
add_IC_inter | 
 a logical value, whether to impose the identifiability condition within the iterative algorithm (TRUE) or only after the algorithm converges (FALSE); default is FALSE.  | 
seed | 
 an integer, the random seed used in initialization; default is 1.  | 
algo | 
 an optional integer taking value 1 or 2, selecting the algorithm used; default is 1, the variational EM algorithm.  | 
Details
None
Value
Returns a list with the following components:
- F - the predicted factor matrix;
- B - the estimated loading matrix;
- bbeta - the estimated low-rank large coefficient matrix;
- alpha0 - the estimated regression coefficient matrix corresponding to Z;
- invLambda - the inverse of the estimated error variances;
- eta - the estimated spatial autocorrelation parameter;
- S - the approximated posterior covariance for each row of F;
- ELBO - the ELBO value when the algorithm stops;
- ELBO_seq - the sequence of ELBO values;
- time_use - the running time of the SpaCOAP model fit.
References
Liu W, Zhong Q. High-dimensional covariate-augmented overdispersed Poisson factor model. Biometrics. 2024 Jun;80(2):ujae031.
See Also
None
Examples
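library(SpaCOAP)
# Simulate counts on a 20 x 15 spatial grid with p = 100 count variables
width <- 20; height <- 15; p <- 100
d <- 20; k <- 3; q <- 6; r <- 3
datlist <- gendata_spacoap(width=width, height=height, p=p, d=d, k=k, q=q, rank0=r)
# Fit the model using the number of factors and rank used in the simulation
fitlist <- SpaCOAP(X_count=datlist$X, Adj_sp=datlist$Adj_sp,
  H=datlist$H, Z=datlist$Z, q=q, rank_use=r)
str(fitlist)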
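The epsELBO argument and the ELBO_seq component can be related through a relative-change stopping rule. The following base-R sketch illustrates that idea on a made-up ELBO sequence. It assumes the rule is |ELBO_t - ELBO_(t-1)| / |ELBO_(t-1)| < eps, which may differ in detail from what SpaCOAP does internally; first_converged_iter is a hypothetical helper and is not part of the package.

```r
# Hypothetical helper (not part of SpaCOAP): return the first iteration at
# which the relative change of the ELBO falls below eps, or NA if it never does.
first_converged_iter <- function(elbo_seq, eps = 1e-8) {
  # Relative change between consecutive ELBO values (assumed stopping rule)
  rel_change <- abs(diff(elbo_seq)) / abs(head(elbo_seq, -1))
  hit <- which(rel_change < eps)
  if (length(hit) == 0) NA_integer_ else hit[1] + 1L
}

# Made-up ELBO values for illustration only
elbo_seq <- c(-1000, -900, -899, -899)
first_converged_iter(elbo_seq, eps = 1e-8)
```

Applied to a fitted object, the same helper could be called on fitlist$ELBO_seq to check whether the fit stopped because of the tolerance or because maxIter was reached.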
Select the parameters in COAP models
Description
Select the number of factors and the rank of coefficient matrix in the covariate-augmented overdispersed Poisson factor model
Usage
chooseParams(
  X_count,
  Adj_sp,
  H,
  Z = matrix(1, nrow(X_count), 1),
  offset = rep(0, nrow(X_count)),
  q_max = 15,
  r_max = 24,
  threshold = c(0.1, 0.01),
  verbose = TRUE,
  ...
)
Arguments
X_count | 
 a count matrix, the observed count matrix with shape n-by-p.  | 
Adj_sp | 
 a sparse matrix, the weighted adjacency matrix.  | 
H | 
 an n-by-d matrix, the covariate matrix associated with the low-rank regression coefficient matrix.  | 
Z | 
 an optional matrix, the fixed-dimensional covariate matrix of control variables; defaults to a column of ones when there are no additional covariates.  | 
offset | 
 an optional vector, the offset for each unit; defaults to a vector of zeros.  | 
q_max | 
 an optional integer, the upper bound for the number of factors; default is 15.  | 
r_max | 
 an optional integer, the upper bound for the rank of the regression coefficient matrix; default is 24.  | 
threshold | 
 an optional positive vector of length 2, the thresholds used to filter the singular values of beta and B, respectively; default is c(0.1, 0.01).  | 
verbose | 
 a logical value, whether to print progress information during the iterations; default is TRUE.  | 
... | 
 other arguments passed to the function   | 
Details
The thresholds filter out singular values that carry little signal, which helps identify the underlying model structure (the rank of the coefficient matrix and the number of factors).
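The idea of filtering singular values by a threshold can be sketched in base R. The example below counts the singular values of a noisy low-rank matrix whose ratio to the largest singular value exceeds a threshold. This is only one simple reading of the description above; the exact rule used by chooseParams is not stated here, and count_strong_sv is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper (not part of SpaCOAP): count singular values whose
# ratio to the largest singular value exceeds the given threshold.
count_strong_sv <- function(M, threshold) {
  sv <- svd(M, nu = 0, nv = 0)$d   # singular values, in decreasing order
  sum(sv / sv[1] > threshold)
}

set.seed(1)
n <- 50; p <- 40; r <- 3
# Rank-3 signal plus very small noise
signal <- matrix(rnorm(n * r), n, r) %*% matrix(rnorm(r * p), r, p)
M <- signal + matrix(rnorm(n * p, sd = 1e-3), n, p)

# The noise singular values fall far below the threshold, so the count
# recovers the rank of the signal.
count_strong_sv(M, threshold = 0.01)
```

A larger threshold is more conservative: it discards more singular values and therefore tends to return a smaller estimated rank or number of factors.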
Value
Returns a named vector with elements 'hr' and 'hq': the estimated rank of the coefficient matrix and the estimated number of factors, respectively.
References
None
See Also
Examples
width <- 20; height <- 15; p <- 300
d <- 20; k <- 3; q <- 6; r <- 3
datlist <- gendata_spacoap(width=width, height=height, p=p, d=d, k=k, q=q, rank0=r)
set.seed(1)
para_vec <- chooseParams(X_count=datlist$X, Adj_sp=datlist$Adj_sp,
 H= datlist$H, Z = datlist$Z, r_max=6)
print(para_vec)
Generate simulated data
Description
Generate simulated data from spatial covariate-augmented Poisson factor models
Usage
gendata_spacoap(
  seed = 1,
  width = 20,
  height = 30,
  p = 500,
  d = 40,
  k = 3,
  q = 5,
  rank0 = 3,
  eta0 = 0.5,
  bandwidth = 1,
  rho = c(10, 1),
  sigma2_eps = 1,
  seed.beta = 1
)
Arguments
seed | 
 a positive integer, the random seed for reproducibility of the data generation process.  | 
width | 
 a positive integer, the width of the spatial grid.  | 
height | 
 a positive integer, the height of the spatial grid.  | 
p | 
 a positive integer, the dimension of the count variables.  | 
d | 
 a positive integer, the dimension of the covariate matrix associated with the low-rank regression coefficient matrix.  | 
k | 
 a positive integer, the dimension of the covariate matrix of control variables.  | 
q | 
 a positive integer, the number of factors.  | 
rank0 | 
 a positive integer, the rank of the coefficient matrix.  | 
eta0 | 
 a real number between 0 and 1, the spatial autocorrelation parameter.  | 
bandwidth | 
 a positive real value, the bandwidth used in calculating the weighted adjacency matrix.  | 
rho | 
 a numeric vector of length 2 with positive elements, the signal strength of the loading matrix and of the regression coefficient matrix, respectively.  | 
sigma2_eps | 
 a positive real value, the variance of the overdispersion error.  | 
seed.beta | 
 a positive integer, the random seed used to fix the regression coefficient matrix beta, for reproducibility of the data generation process.  | 
Details
None
Value
Returns a list with the following components:
- X - the high-dimensional count matrix;
- Z - the low-dimensional covariate matrix of control variables;
- H - the high-dimensional covariate matrix;
- Adj_sp - the weighted adjacency matrix;
- alpha0 - the regression coefficient matrix corresponding to Z;
- bbeta0 - the low-rank large regression coefficient matrix corresponding to H;
- B0 - the loading matrix;
- F0 - the latent factor matrix;
- rank0 - the true rank of bbeta0;
- q - the true number of factors;
- eta0 - the spatial autocorrelation parameter;
- pos - the spatial coordinates of each observation.
References
None
See Also
Examples
width <- 20; height <- 15; p <- 100
d <- 20; k <- 3; q <- 6; r <- 3
datlist <- gendata_spacoap(width=width, height=height, p=p, d=d, k=k, q=q, rank0=r)
str(datlist)
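The Adj_sp component and the bandwidth argument both refer to a weighted adjacency matrix on the spatial grid. The sketch below shows one way such a matrix could be built with the Matrix package, assuming Gaussian kernel weights between horizontally and vertically adjacent grid points. The weighting scheme and neighbor definition actually used by gendata_spacoap are not documented here, so the kernel, the neighbor rule, and the names pos, D, nb, w and Adj are all illustrative assumptions.

```r
library(Matrix)

# Small illustrative grid (assumption: points lie on an integer lattice)
width <- 4; height <- 3
pos <- as.matrix(expand.grid(x = seq_len(width), y = seq_len(height)))
n <- nrow(pos)

# Pairwise Euclidean distances between grid points
D <- as.matrix(dist(pos))

# Assumed neighbor rule: points at distance exactly 1 (rook neighbors)
nb <- which(D > 0 & D <= 1, arr.ind = TRUE)

# Assumed weighting: Gaussian kernel with the given bandwidth
bandwidth <- 1
w <- exp(-D[nb]^2 / (2 * bandwidth^2))

# Sparse, symmetric n-by-n weighted adjacency matrix
Adj <- sparseMatrix(i = nb[, 1], j = nb[, 2], x = w, dims = c(n, n))
dim(Adj)
```

Storing the matrix in sparse form keeps memory use proportional to the number of neighbor pairs rather than to n squared, which matters for large grids.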