| Title: | Screen and clean variable selection procedures | 
| Version: | 1.0.1 | 
| Date: | 2012-10-30 | 
| Author: | Pengsheng Ji, Jiashun Jin, Qi Zhang | 
| Maintainer: | Qi Zhang <karlmzhang@gmail.com> | 
| Description: | Routines for a collection of screen-and-clean type variable selection procedures, including UPS and GS. | 
| Imports: | MASS, Matrix, quadprog | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| Packaged: | 2012-10-30 14:23:17 UTC; Zhang-Qi | 
| Repository: | CRAN | 
| Date/Publication: | 2012-10-30 17:34:49 | 
| NeedsCompilation: | no | 
Screen and clean variable selection procedures, including UPS and GS.
Description
Routines for a collection of screen-and-clean type variable selection procedures.
Details
| Package: | ScreenClean | 
| Type: | Package | 
| Version: | 1.0.1 | 
| Date: | 2012-10-30 | 
| License: | GPL (>= 2) | 
Note
In order to use ScreenClean, the data need to be normalized, to make the standard deviation of the noise to be 1, and the l_2 norm of each length n predictor vector to be 1.
Author(s)
Pengsheng Ji, Jiashun Jin, Qi Zhang
Maintainer: Qi Zhangqiz19@pitt.edu
References
Ji, P. and Jin, J. (2012). UPS delivers optimal phase diagram in high dimensional variable selection. Ann. Statist., 40(1), 73-103.
Jin, J., Zhang, C.-H. and Zhang, Q. (2012). Optimality of Graphlet Screening in High Dimensional Variable Selection. arXiv:1204.6452
GC-step of the graphlet screening
Description
CleaningStep performs the cleaning step of the graphlet screening
Usage
CleaningStep(survivor, y.tilde, gram, lambda, uu)
Arguments
survivor | 
 the result of the screening step, a logical vector.  | 
y.tilde | 
 
  | 
gram | 
 the thresholded sparse gram matrix  | 
lambda | 
 the tuning parameters of the cleaning step, whose optimal choice is tied to the sparse level.  | 
uu | 
 the tuning parameter of the cleaning step; its optimal choice has the intuition of the detected minimal signal strength.  | 
Value
beta.gs | 
 the estimated regression coefficient of the graphlet screening, a numeric vector  | 
See Also
Examples
##See the demoGs.r
Find all the connected subgraphs whose size <= lc
Description
FindAllCG uses FindCG iteratively, and lists all the connected subgraphs with no more than lc nodes
Usage
FindAllCG(adjacency.matrix, lc)
Arguments
adjacency.matrix | 
 p by p adjacency matrix of an undirected graph; it must be symmetric.  | 
lc | 
 the maximal size of the connected subgraphs to be listed  | 
Value
cg.all | 
 A list, whose kth component is a matrix with k columns that lists all the connected subgraphs with k nodes.  | 
See Also
Examples
require(MASS)
require(Matrix)
p <- 10
Omega <- sparseMatrix(c(1:(p-1),2:p),c(2:p,1:(p-1)),x=1)
cg.all <- FindAllCG(Omega,3)
Find the connected subgraphs with a certain number of nodes
Description
FindCG is used to find all the connected subgraphs with a certain number of nodes.
Usage
FindCG(adjacency.matrix, cg.initial)
Arguments
adjacency.matrix | 
 p by p adjacency matrix of an undirected graph. It must be symmetric.  | 
cg.initial | 
 It could be 1:p or a matrix, whose elements are positive integers from 1 to p. If it is a length p vector, FindCG converts it into a matrix with one column. For a matrix with k columns, FindCG reads its rows as th indices of a collection of connected subgraphs with k nodes.  | 
Value
cg.new | 
 If the input is a matrix with k columns and stores the indices of all the size k connected subgraphs, the output is a matrix with k+1 columns storing the indices of all the connected subgraphs with k+1 nodes.  | 
See Also
Examples
require(MASS)
require(Matrix)
p <- 10
Omega <- sparseMatrix(c(1:(p-1),2:p),c(2:p,1:(p-1)),x=1)
cg.2 <- FindCG(Omega,c(1:p))
cg.3 <- FindCG(Omega,cg.2)
Iterative graphlet screening procedure
Description
The iterative graphlet screening procedure, main function of the package.
Usage
IterGS(y.tilde, gram, gram.bias, cg.all, sp, tau, nm, q0=0.1, scale = 1, max.iter = 3, 
std.thresh = 1.05, beta.initial = NULL)
Arguments
y.tilde | 
 
  | 
gram | 
 the threholded gram matrix  | 
gram.bias | 
 
 the bias of the threholded gram matrix  | 
cg.all | 
 all the connected cg.alls of gram with size no more than nm.  | 
sp | 
 the expected sparse level  | 
tau | 
 the minimal signal strength to be detected  | 
nm | 
 the maximal size of the connected subgaphs considered in the screening step.  | 
q0 | 
 the minimal screening parameter  | 
scale | 
 optional numerical parameter of the screening step. The default is 1  | 
max.iter | 
 the maximal number of iterations. The default is 3.  | 
std.thresh | 
 the threshold of the std change that stop the loop. The default is 1.05.  | 
beta.initial | 
 the initial estimate of beta in reducing the bias. The default is uu*sign(y.tilde)*(abs(y.tilde)>uu).  | 
Value
IterGS returns a list with two elements
estimate | 
 The iterative GS estimate of beta  | 
n.iter | 
 The number of iterations it takes  | 
Examples
##See demoIterGs.r
Penalized MLE procedure used in the cleaning step
Description
Penalized MLE procedure used in the cleaning step, an inner function.
Usage
PMLE(gram, y, lambda, uu)
Arguments
gram | 
 the sub gram matrix of the small scale quadratic problem.  | 
y | 
 the sub-vector of y.tilde  | 
lambda | 
 the tuning parameter of the cleaning step, tied to the sparse level.  | 
uu | 
 the tuning parameters of the cleaning step. It has the intuitive interpretation of the minimal signal strength to be detected.  | 
Value
b | 
 the estimate of the subvector of beta  | 
See Also
GS-step of the graphlet screening
Description
ScreeningStep performs the cleaning step of the graphlet screening
Usage
ScreeningStep(y.tilde, gram, cg.all, nm, v, r, q0 = 0.1, scale = 1)
Arguments
y.tilde | 
 
  | 
gram | 
 the regularized gram matrix  | 
cg.all | 
 a list whose kth element is a matrix of k columns. Its rows contain all the connected subgraph with k nodes.  | 
nm | 
 the maximal subgraph invesgated in the screening step  | 
v | 
 an essential tuning parameter of graphlet screening, tied to the sparse level  | 
r | 
 an essential tuning parameter of graphlet screening, tied to the signal strength  | 
q0 | 
 the minimal screening parameter  | 
scale | 
 
  | 
Value
survivor | 
 A logical vector, where TRUE means retained as a protential signal.  | 
Note
When nm=1, it is just univariate threholding, and thurs the screening step of UPS.
See Also
Examples
##See the demoGS.r
Thresholds the gram matrix
Description
Thresholds the gram matrix
Usage
ThresholdGram(gram.full, delta = 1/log(dim(gram.full)[1]))
Arguments
gram.full | 
 the gram matrix before the elementwise thresholding, a p by p symmetric matrix  | 
delta | 
 the threshold, the default is 1/log(p)  | 
Value
A list with two elements
gram.sd | 
 the threhsolded gram matrix, a sparse matrix  | 
gram.bias | 
 the difference of the orginal matrix and the threholded matrix  | 
Examples
p <-10
off.diag<-matrix(runif(p^2),p,p)
omega <- (off.diag+t(off.diag))*0.3
diag(omega) <- 1
omega.omega<-ThresholdGram(omega,0.3)
omega.omega$gram
omega.omega$gram.bias
expresses the number i on the base as a vector
Description
expresses the number i on the base as a vector, an inner function.
Usage
VectorizeBase(i, base, length)
Arguments
i | 
 the non-negative number to be converted  | 
base | 
 the base to be converted on  | 
length | 
 the length of the converted vector  | 
Value
vector | 
 A vector with the given length, whose elements can be read as the number i with the given base.  |