| ce | cell embeddings for pbmc3k data |
| compute.cell.label | 4.2. binarize the label propagation probability across the cell population; returns a binarized vector of cells with 'negative' and 'positive' labels, where 'positive' means that the cell is relevant to the gene set |
| compute.cell.label.df | similar to compute.cell.label; used when working with multiple gene sets |
| compute.db | this function is called by 'compute.kld' to aggregate the density contribution of each gene to each grid point, and then normalize the grid-point densities so that they sum to 1 |
| compute.grid.coords | 2. compute the density of gene sets of interest; 2.1. compute grid point coordinates |
| compute.jsd | 5. compute the specificity of a gene set when cell partition information is available; the partition could be clustering, sample origin, or other conditions; inspired by https://github.com/FloWuenne/scFunctions/blob/0d9ea609fa72210a151f7270e61bdee008e8fc88/R/calculate_rrs.R |
| compute.kld | 2.2. compute KL-divergence (parts are adapted from https://github.com/alexisvdb/singleCellHaystack/) |
| compute.mca | 1. compute MCA embeddings |
| compute.nn.edges | 3. compute the nearest neighbor graph for genes and cells; this graph is used to fetch the cells most relevant to a gene set |
| compute.spatial.kld | 6. find gene sets with spatial relevance |
| compute.spatial.kld.df | calculates how likely it is that the cells relevant to multiple gene sets are randomly distributed spatially |
| compute.spec | calculates the similarity between (1) the label propagation probability of cells for gene sets and (2) the identity of cells in partitions |
| compute.spec.single | calculates the similarity between (1) the label propagation probability of cells for gene sets and (2) the identity of cells in a certain partition; called by 'compute.spec', but can also be run by itself |
| coords.df | mouse brain coords |
| el_nn_search | this function is called by 'compute.nn.edges' to convert a nearest neighbor identity matrix to an edge list |
| gene.set.list | A gene set list containing multiple human GO gene sets |
| kde2d.weighted | based on https://stat.ethz.ch/pipermail/r-help/2006-June/107405.html; called by 'compute.spatial.kld' to calculate a kernel density estimate in 2D space with each data point weighted |
| pbmc.meta | pbmc3k meta |
| pbmc.mtx | pbmc3k matrix |
| run.rwr | 4.1. calculate the label propagation probability for a gene set among cells; returns a vector (length = number of cells) giving the probability that each cell is labeled during the propagation (its relevance to the gene set) |
| run.rwr.list | same idea as run.rwr but for multiple gene sets; returns a matrix (number of rows = number of cells; number of columns = number of gene sets) giving the probability that each cell is labeled during the propagation (its relevance to each gene set) |
| sample.kld | this function is called by 'compute.kld' to calculate the KL-divergence between a sampled (background) gene set and the reference (all genes) |
| sample.spatial.kld | this function is called by 'compute.spatial.kld' to calculate the KL-divergence between the cells weighted by a shuffled weight vector and the reference (all cells, unweighted) |
| seed.mat | 4. compute label propagation from gene set to cells; this function forms the 'seed matrix' used by the dRWR function (dnet R package); the seed matrix specifies which nodes are the sources for label propagation |
| seed.mat.list | this function is used when more than one 'seed sets' will be used (when there are multiple gene sets of interest) |
| vectorized_pdist | from an excellent post: https://www.r-bloggers.com/2013/05/pairwise-distances-in-r/; modified for speed; called by 'compute.kld' to quickly compute the distances between genes and grid points |
| weight_df | mouse brain gene set activities |
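
The density and KL-divergence steps described for `vectorized_pdist`, `compute.db`, and `compute.kld` can be illustrated with a short NumPy sketch. This is not the package's R implementation: the function names `pairwise_dist`, `grid_density`, and `kl_divergence`, the Gaussian kernel, the `bandwidth` parameter, and the pseudocount `eps` are all assumptions made for illustration.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (m x d) and rows of b (n x d)."""
    an = np.sum(a * a, axis=1)[:, None]   # squared norms of rows of a, shape (m, 1)
    bn = np.sum(b * b, axis=1)[None, :]   # squared norms of rows of b, shape (1, n)
    sq = an + bn - 2.0 * (a @ b.T)        # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    return np.sqrt(np.maximum(sq, 0.0))   # clip tiny negatives caused by round-off


def grid_density(points, grid, bandwidth=1.0):
    """Sum each point's kernel contribution at every grid point, then normalize.

    The Gaussian kernel and bandwidth are assumptions for this sketch.
    """
    d = pairwise_dist(points, grid)
    contrib = np.exp(-(d ** 2) / (2.0 * bandwidth ** 2))
    dens = contrib.sum(axis=0)            # aggregate over points, one value per grid point
    return dens / dens.sum()              # densities over the grid sum to 1


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete densities; eps is an assumed pseudocount."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))


# Toy data: 200 "genes" in a 2D embedding and a 5 x 5 grid (both hypothetical).
rng = np.random.default_rng(0)
genes = rng.normal(size=(200, 2))
axis = np.linspace(-2.0, 2.0, 5)
grid = np.array([[x, y] for x in axis for y in axis])

ref_density = grid_density(genes, grid)        # reference: all genes
set_density = grid_density(genes[:20], grid)   # a gene set of interest
kld = kl_divergence(set_density, ref_density)
```

A larger `kld` means the gene set is distributed less like the reference over the grid. Comparing it against values obtained from randomly sampled gene sets of the same size, as the `sample.kld` entry describes, would indicate whether the difference is larger than expected by chance.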