dpcR is an R package designed to perform analysis of digital PCR (dPCR) experiments. This vignette covers important features of the package and should be used as an addendum to the manual.
Our dpcR package employs the nomenclature of the MIQE guidelines for dPCR (Jim F. Huggett et al. (2013), Jim F. Huggett, Cowen, and Foy (2014)). \(\lambda\) is the mean number of molecules per partition. Total number of partitions is given by \(n\). \(k\) is the number of positive partitions.
\[ \lambda = - \ln \left(1 - \frac{k}{n} \right) \]
Firstly, we describe dpcr
class, a parent class for all
classes designed to contain dPCR data. Further, we closer inspect
adpcr
which is a class responsible for array-based dPCR
experiments (all experiments, where output data has precise localization
in two dimensions). Often these types of dPCR experiments are called
champer digital PCR (cdPCR).
Real-time quantitative qPCR (qPCR) experiments do not follow the
fundamental assumption of dPCR reactions (the mean number of template
molecule per partition smaller than 1). However, it is possible to
‘convert’ the qPCR into a dPCR. We present a tool to analyze results
from high-throughput qPCR systems by using the dPCR analysis
methodology as implemented in the dpcR package. The
qpcr2pp
function converts qPCR amplification curve data to
a qdpcr
object (see section about qdpcr
). The
calculation of the Cq values from the amplification curves is internally
done via functions from the qpcR package by Ritz and Spiess (2008).
dpcr
objectThe key class of the dpcR package is
dpcr
. It has the following slots:
* .Data - matrix containing data from dPCR runs (see
‘.Data slot’ subsection). It is further specified by the
type slot.
* n - number of partitions read in each run.
* exper - name of the experiment.
* replicate - name (or more conveniently ID) of a
replicate.
* assay - name of the assay.
* type - name of the data (see ‘type slot’
subsection)
Although, this class is designed to contain results from all dPCR
experiments, the user will interact mostly with its inheritors as
adpcr
or dpcr
.
dpcr
is a S4 object. Below is shown how to extract
elements from the slots of a S4 object:
# Below we have S4 object
s4 <- sim_adpcr(m = 100, n = 496, times = 100, pos_sums = FALSE, n_panels = 3)
## The assumed volume of partitions in each run is equal to 1.
## The assumed volume uncertainty in each run is equal to 0.
## [1] "adpcr"
## [1] "nm"
## [1] "nm"
dpcr
objects management (bind_dpcr
,
extract_run
)All dpcr
objects should be managed using special
functions provided by this package: bind_dpcr
and
extract_run
. The former binds dpcr
objects,
the latter extracts parts of the dpcr
object. It is
important to use this functions, because they preserve other attributes
important for dpcr
objects as number of partitions, names
of experiments, assays and technical replicates.
# Create single adpcr object. The following code is also true for
# other objects inhering from dpcr, as dpcr or qdpcr
single_run <- sim_adpcr(m = 100, n = 765, times = 100, pos_sums = FALSE, n_panels = 1)
two_runs <- bind_dpcr(single_run, single_run)
three_runs <- bind_dpcr(single_run, single_run, single_run)
# It is also possible to bind a list of dpcr objects...
three_runs_list <- bind_dpcr(list(single_run, single_run, single_run))
# ... which may be useful in do.call statements
dpcr_list <- do.call(bind_dpcr, lapply(5L:10*10, function(n_template)
sim_adpcr(m = n_template, n = 765, times = 100, pos_sums = FALSE, n_panels = 1)))
bind_dpcr
may be seen as the analogue of the R function
cbind.
The main difference is the lack of recycling. If two objects with uneven
number of data points are bound together, the shorter is completed with
missing values (NA).
longer_run <- sim_adpcr(m = 10, n = 15, times = 100, pos_sums = FALSE, n_panels = 1)
shorter_run <- sim_adpcr(m = 10, n = 10, times = 100, pos_sums = FALSE, n_panels = 1)
shortest_run <- sim_adpcr(m = 10, n = 5, times = 100, pos_sums = FALSE, n_panels = 1)
# Expect informative message after binding
res <- bind_dpcr(longer_run, shorter_run, shortest_run)
## Different number of partitions. Shorter objects completed with NA values.
## Experiment11.1 Experiment12.1 Experiment12.1
## [1,] 0 2 5
## [2,] 1 0 0
## [3,] 0 0 5
## [4,] 1 0 1
## [5,] 1 0 0
## [6,] 0 1 NA
## [7,] 0 3 NA
## [8,] 0 2 NA
## [9,] 1 0 NA
## [10,] 0 1 NA
## [11,] 2 NA NA
## [12,] 2 NA NA
## [13,] 1 NA NA
## [14,] 0 NA NA
## [15,] 0 NA NA
extract_run
is an equivalent of Extract.
It extracts one or more runs from the dpcr
objects
preserving other properties (as an appropriate replicate ID and so
on).
## The assumed volume of partitions in each run is equal to 1.
## The assumed volume uncertainty in each run is equal to 0.
## Experiment1.1 Experiment1.2 Experiment1.3 Experiment1.4 Experiment1.5
## [1,] 0 1 0 0 0
## [2,] 0 0 0 0 1
## [3,] 0 1 0 0 0
## [4,] 0 0 1 1 0
## [5,] 0 0 0 0 0
##
## 5 data points ommited.
## Data type: 'nm'
# Extract runs by the index
only_first_run <- extract_run(five_runs, 1)
only_first_and_second_run <- extract_run(five_runs, c(1, 2))
# See if proper replicated were extracted
slot(only_first_and_second_run, "replicate")
## [1] 1 2
## Levels: 1 2
## [1] 2 3 4 5
## Levels: 2 3 4 5
# Extract runs by the name
run_Experiment1.3 <- extract_run(five_runs, "Experiment1.3")
slot(run_Experiment1.3, "replicate")
## [1] 3
## Levels: 3
run_Experiment1.3and5 <- extract_run(five_runs, c("Experiment1.3", "Experiment1.5"))
slot(run_Experiment1.3and5, "replicate")
## [1] 3 5
## Levels: 3 5
Since the .Data slot inherits from the matrix class, it is possible to also use normal Extract operator (‘[’). For more information about this topic, see ‘.Data slot’ subsection.
Digital PCR data is always a matrix (see ?matrix
in R).
Columns and rows represent respectively individual runs and their data
points. Since data points are not always equivalent to partitions, the
true number of partitions for each run is always defined in the slot
n.
# Create two array dPCR experiments. Mind the difference in the n parameter.
sample_adpcr <- bind_dpcr(sim_adpcr(m = 100, n = 765, times = 100, pos_sums = FALSE, n_panels = 1),
rename_dpcr(sim_adpcr(m = 100, n = 763, times = 100, pos_sums = FALSE,
n_panels = 1),
exper = "Experiment2"))
## Different number of partitions. Shorter objects completed with NA values.
In the code chunk above, we created two array dPCR experiments with respectively 765 and 763 partitions. Inspect the last five data points:
# It's possible to manipulate data points from dpcr object using all functions that work for matrices
tail(sample_adpcr)
## Experiment1.1 Experiment2.1
## [760,] 3 0
## [761,] 0 1
## [762,] 0 0
## [763,] 0 0
## [764,] 0 NA
## [765,] 0 NA
Both experiments have 765 data points. We see that in case of a shorter experiment, two data points have the value NA. If we analyze the n slot:
## [1] 765 763
We see the expected number of partitions. It is especially important in case of fluorescence dPCR data, where one droplet may have assignments of few data points.
The important feature of .Data is inheritance from
matrix
class, which opens numerous possibilities for data
manipulation.
## Experiment1.1 Experiment2.1
## 90 NA
## Experiment1.1
## [1,] -0.05000000
## [2,] 0.04983342
## [3,] 0.14866933
## [4,] 0.24552021
## [5,] 0.33941834
##
## 59 data points ommited.
## Data type: 'fluo'
Data from dPCR experiments may have several types:
* ct (cycle
threshold): cycle threshold of each partition.
* fluo: fluorescence intensity of each partition.
* nm (number of
molecules): number of molecules in each partition
(usually such precise data come only from simulations).
* np (number of
positives): status (positive (1) or negative(0)) of
each partition.
* tnp (total number
of positives): total number of positive partitions in
the run (.Data in this case is matrix with single row and
number of columns equal to the number of runs).
In case of fluo and tnp types, the
number of data points in .Data slot is hardly ever equal to the
real number of partitions dpcr
.
# Inspect all types of data
# Cq
# Load package with qPCR data
library(chipPCR)
qpcr2pp(data = C127EGHP[, 1L:6], type = "ct")
## qPCR1.1
## [1,] 34.09434
## [2,] 13.00000
## [3,] 12.26415
## [4,] 12.75472
## [5,] 12.26415
##
## Data type: 'ct'
## Experiment1.1
## [1,] -0.0500000
## [2,] 0.1496668
## [3,] 0.3473387
## [4,] 0.5410404
## [5,] 0.7288367
##
## 59 data points ommited.
## Data type: 'fluo'
## The assumed volume of partitions in each run is equal to 1.
## The assumed volume uncertainty in each run is equal to 0.
## Experiment1.1 Experiment1.2 Experiment1.3
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 1 0 1
## [4,] 2 0 0
## [5,] 0 1 0
##
## 760 data points ommited.
## Data type: 'nm'
## The assumed volume of partitions in each run is equal to 1.
## The assumed volume uncertainty in each run is equal to 0.
## Experiment1.1 Experiment1.2 Experiment1.3
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 1 0 1
## [4,] 0 0 0
## [5,] 0 0 0
##
## 760 data points ommited.
## Data type: 'np'
## The assumed volume of partitions in each run is equal to 1.
## The assumed volume uncertainty in each run is equal to 0.
## Experiment1.1 Experiment1.2 Experiment1.3
## [1,] 208 203 193
##
## Data type: 'tnp'
The read_dpcr
function is responsible for importing data
from external file sources into R working space. The additional
parameters allow specification of the format type and other details.
Additionally, we advise to use other packages belonging to the pcRuniversum such as RDML or dedicated packages such as ReadqPCR (available from bioconductor.org). Further information can be found in Pabinger et al. (2014).
# Generate some data from 15x16 array. Let's presume, that we have results from two plates
sample_runs <- matrix(rpois(480, lambda = 1.5), ncol = 2)
# Check its class - it's a typical R structure
class(sample_runs)
## [1] "matrix" "array"
# Save it to adpcr object
adpcr_experiments <- create_dpcr(sample_runs, n = c(240L, 240L), type = "nm", adpcr = TRUE)
## The assumed volume of partitions in each run is equal to 1.
## The assumed volume uncertainty in each run is equal to 0.
## [1] "adpcr"
The Summary
method produces a tabular summary of any
dpcr
object. It calculates λ and its confidence intervals
using both Dube’s and Bhat’s method. The estimated number of template
molecules (m) is computed using the following relationship:
\[ m = n \lambda \]
##
## Number of positive partitions: 11, 9, 39, 42, ...
## Total number of partitions: 765, 765, 765, 765, ...
##
## Number of runs: 6
## Number of experiments: 3
##
## experiment replicate assay method lambda lambda.low lambda.up m
## Experiment1 1 Chr4 dube 0.01448347 0.005960788 0.02307940 11.07985
## Experiment1 1 Chr4 bhat 0.01448347 0.010116498 0.01885043 11.07985
## Experiment1 2 MYC dube 0.01183446 0.004132446 0.01959625 9.05336
## Experiment1 2 MYC bhat 0.01183446 0.007889615 0.01577930 9.05336
## Experiment2 1 Chr4 dube 0.05232582 0.036035144 0.06888628 40.02925
## Experiment2 1 Chr4 bhat 0.05232582 0.043946026 0.06070561 40.02925
## Experiment2 2 MYC dube 0.05646661 0.039531415 0.07369356 43.19696
## Experiment2 2 MYC bhat 0.05646661 0.047752467 0.06518076 43.19696
## Experiment3 1 Chr4 dube 0.12813050 0.102267700 0.15467999 98.01984
## Experiment3 1 Chr4 bhat 0.12813050 0.114762836 0.14149817 98.01984
## Experiment3 2 MYC dube 0.11778304 0.093038006 0.14315595 90.10402
## Experiment3 2 MYC bhat 0.11778304 0.105000286 0.13056579 90.10402
## m.low m.up c c.low c.up k n
## 4.560003 17.65574 0.01448347 0.005960788 0.02307940 11 765
## 7.739121 14.42058 0.01448347 0.010116498 0.01885043 11 765
## 3.161321 14.99113 0.01183446 0.004132446 0.01959625 9 765
## 6.035556 12.07116 0.01183446 0.007889615 0.01577930 9 765
## 27.566885 52.69801 0.05232582 0.036035144 0.06888628 39 765
## 33.618710 46.43979 0.05232582 0.043946026 0.06070561 39 765
## 30.241532 56.37557 0.05646661 0.039531415 0.07369356 42 765
## 36.530637 49.86328 0.05646661 0.047752467 0.06518076 42 765
## 78.234790 118.33019 0.12813050 0.102267700 0.15467999 92 765
## 87.793569 108.24610 0.12813050 0.114762836 0.14149817 92 765
## 71.174074 109.51430 0.11778304 0.093038006 0.14315595 85 765
## 80.325219 99.88283 0.11778304 0.105000286 0.13056579 85 765
# Save summary data without printing it
summ <- summary(six_panels, print = FALSE)
# Print only the summary table
summ[["summary"]]
## experiment replicate assay method lambda lambda.low lambda.up
## 1 Experiment1 1 Chr4 dube 0.01448347 0.005960788 0.02307940
## 7 Experiment1 1 Chr4 bhat 0.01448347 0.010116498 0.01885043
## 2 Experiment1 2 MYC dube 0.01183446 0.004132446 0.01959625
## 8 Experiment1 2 MYC bhat 0.01183446 0.007889615 0.01577930
## 3 Experiment2 1 Chr4 dube 0.05232582 0.036035144 0.06888628
## 9 Experiment2 1 Chr4 bhat 0.05232582 0.043946026 0.06070561
## 4 Experiment2 2 MYC dube 0.05646661 0.039531415 0.07369356
## 10 Experiment2 2 MYC bhat 0.05646661 0.047752467 0.06518076
## 5 Experiment3 1 Chr4 dube 0.12813050 0.102267700 0.15467999
## 11 Experiment3 1 Chr4 bhat 0.12813050 0.114762836 0.14149817
## 6 Experiment3 2 MYC dube 0.11778304 0.093038006 0.14315595
## 12 Experiment3 2 MYC bhat 0.11778304 0.105000286 0.13056579
## m m.low m.up c c.low c.up k n
## 1 11.07985 4.560003 17.65574 0.01448347 0.005960788 0.02307940 11 765
## 7 11.07985 7.739121 14.42058 0.01448347 0.010116498 0.01885043 11 765
## 2 9.05336 3.161321 14.99113 0.01183446 0.004132446 0.01959625 9 765
## 8 9.05336 6.035556 12.07116 0.01183446 0.007889615 0.01577930 9 765
## 3 40.02925 27.566885 52.69801 0.05232582 0.036035144 0.06888628 39 765
## 9 40.02925 33.618710 46.43979 0.05232582 0.043946026 0.06070561 39 765
## 4 43.19696 30.241532 56.37557 0.05646661 0.039531415 0.07369356 42 765
## 10 43.19696 36.530637 49.86328 0.05646661 0.047752467 0.06518076 42 765
## 5 98.01984 78.234790 118.33019 0.12813050 0.102267700 0.15467999 92 765
## 11 98.01984 87.793569 108.24610 0.12813050 0.114762836 0.14149817 92 765
## 6 90.10402 71.174074 109.51430 0.11778304 0.093038006 0.14315595 85 765
## 12 90.10402 80.325219 99.88283 0.11778304 0.105000286 0.13056579 85 765
## experiment replicate assay method lambda lambda.low lambda.up m
## 1 Experiment1 1 Chr4 dube 0.01448347 0.005960788 0.02307940 11.07985
## 2 Experiment1 2 MYC dube 0.01183446 0.004132446 0.01959625 9.05336
## 3 Experiment2 1 Chr4 dube 0.05232582 0.036035144 0.06888628 40.02925
## 4 Experiment2 2 MYC dube 0.05646661 0.039531415 0.07369356 43.19696
## 5 Experiment3 1 Chr4 dube 0.12813050 0.102267700 0.15467999 98.01984
## 6 Experiment3 2 MYC dube 0.11778304 0.093038006 0.14315595 90.10402
## m.low m.up c c.low c.up k n
## 1 4.560003 17.65574 0.01448347 0.005960788 0.02307940 11 765
## 2 3.161321 14.99113 0.01183446 0.004132446 0.01959625 9 765
## 3 27.566885 52.69801 0.05232582 0.036035144 0.06888628 39 765
## 4 30.241532 56.37557 0.05646661 0.039531415 0.07369356 42 765
## 5 78.234790 118.33019 0.12813050 0.102267700 0.15467999 92 765
## 6 71.174074 109.51430 0.11778304 0.093038006 0.14315595 85 765
Since the dpcr
objects tends to have many data points,
which would quickly clutter the console, the show
function
prints only first 5 rows of .Data, number of omitted rows and
the type of the object.
## Experiment1.1
## [1,] 0
## [2,] 0
## [3,] 1
## [4,] 0
## [5,] 0
##
## 5 data points ommited.
## Data type: 'nm'
## Experiment1.1
## [1,] 0
## [2,] 0
## [3,] 1
## [4,] 0
## [5,] 0
##
## 5 data points ommited.
## Data type: 'nm'
## Experiment1.1
## [1,] 0
## [2,] 0
## [3,] 1
## [4,] 0
## [5,] 0
## [6,] 1
## [7,] 0
## [8,] 0
## [9,] 0
## [10,] 1
adpcr
classIf the data output of an dPCR system has exact locations in
two-dimensional space, it belongs to the adpcr
class. It is
the case for all dPCR experiments conducted on panels, arrays and so on.
The adpcr
object inherits from dpcr
objects,
but has special slots specifying the dimensions and their names.
The planar representation of adpcr
objects is created by
the adpcr2panel
function.
## 1 2 3 4 5 6
## 1 0 0 0 1 0 0
## 2 0 0 0 1 0 1
## 3 0 0 0 0 0 0
## 4 0 0 0 0 0 0
## 5 0 0 0 0 0 0
## 6 0 1 1 0 0 0
Data from dPCR arrays can be easily visualized using the
plot_panel
function.
# Remember, you can plot only single panel at once
plot_panel(extract_run(adpcr_experiments, 1), main = "Experiment 1")
The same data can be visualized easily in binarized form (positive/negative partitions).
The plot_panel
function returns invisibly coordinates,
that are compatible with graphics and
ggplot2 packages.
# Extract graphical coordinates
panel_data <- plot_panel(extract_run(adpcr_experiments, 1), plot = FALSE)
ggplot_coords <- cbind(panel_data[["ggplot_coords"]], value = as.vector(extract_run(adpcr_experiments, 1)))
# Plot panel using different graphics package
library(ggplot2)
ggplot(ggplot_coords[, -5], aes(x = x, y = y, fill = value)) +
geom_tile()
The test_panel
function is useful for testing the
randomness of the spatial distribution of template molecules over the
array. This function was implemented to test if technical flaws may
corrupt the result of an dPCR experiment. This may occur during
incorrect filling of the chamber, defects of the chamber or
contaminations.
# The test_panel function performs a test for each experiment in apdr object.
test_panel(six_panels)
## $Experiment1.1
##
## Chi-squared test of CSR using quadrat counts
##
## data: single_panel
## X2 = 16.756, df = 24, p-value = 0.2822
## alternative hypothesis: two.sided
##
## Quadrats: 5 by 5 grid of tiles
##
## $Experiment1.2
##
## Chi-squared test of CSR using quadrat counts
##
## data: single_panel
## X2 = 15.495, df = 24, p-value = 0.1891
## alternative hypothesis: two.sided
##
## Quadrats: 5 by 5 grid of tiles
##
## $Experiment2.1
##
## Chi-squared test of CSR using quadrat counts
##
## data: single_panel
## X2 = 11.603, df = 24, p-value = 0.03196
## alternative hypothesis: two.sided
##
## Quadrats: 5 by 5 grid of tiles
##
## $Experiment2.2
##
## Chi-squared test of CSR using quadrat counts
##
## data: single_panel
## X2 = 18.183, df = 24, p-value = 0.4119
## alternative hypothesis: two.sided
##
## Quadrats: 5 by 5 grid of tiles
##
## $Experiment3.1
##
## Chi-squared test of CSR using quadrat counts
##
## data: single_panel
## X2 = 19.831, df = 24, p-value = 0.5873
## alternative hypothesis: two.sided
##
## Quadrats: 5 by 5 grid of tiles
##
## $Experiment3.2
##
## Chi-squared test of CSR using quadrat counts
##
## data: single_panel
## X2 = 17.868, df = 24, p-value = 0.3812
## alternative hypothesis: two.sided
##
## Quadrats: 5 by 5 grid of tiles
Further tests are available in the spatstat package,
which is also utilizing S4 object system. The data exchange between the
dpcR and the spatstat packages is
streamlined by the adpcr2ppp
function, which converts
adpcr
object to ppp
class taking into account spatial coordinates of positive
partitions.
qdpcr
classThe unique feature of dpcR package is conversion of qPCR data to dPCR-like experiments. The qPCR data should be in a format as used by the qpcR package, where columns represents particular experiments and one column contains cycle number. For pre-processing of raw amplification curve data we recommend the chipPCR package (Rödiger, Burdukiewicz, and Schierack (2015)).
# Load chiPCR package to access C317.amp data
library(chipPCR)
# Convert data to qdpcr object
qdat <- qpcr2pp(data = C317.amp, type = "np", Cq_range = c(10, 30))
qdpcr
inherits from dpcr
objects and may be
analyzed using above mentioned methods. Moreover, the converted data may
visualized using the plot
method.
The import functions accessible under read_dpcr
cover
common dPCR systems. To extend the scope of our software, we introduced
a universal dPCR data exchange format REDF (raw exchange digital PCR
format).
REDF (Raw Exchange Digital PCR Format) has a tabular structure. Each dPCR run (represented by a row) is described using following properties:
Generalized Linear Models (GLM) are linear models for data, where the response variables may have non-normal distributions (as for example binomial distributed positive partitions in dPCR experiments). Using GLM we can describe relationships between results of dPCR:
\[\log{Y} = \beta^T X\]
where \(Y\) are counts, \(X\) are experiments names (categorical data) and \(\beta\) are linear model coefficients for every experiment. Moreover, \(\exp{\beta} = \lambda\).
Estimated means copies per partitions obtained from the model are compared each other using multiple t-test.
# Compare experiments using GLM
# 1. Perform test
comp <- test_counts(six_panels)
# 2. See summary of the test
summary(comp)
## group lambda lambda.low lambda.up experiment replicate k n
## 1 1 0.01315896 0.00585827 0.02946780 1 1.5 10.0 765
## 2 2 0.05439622 0.03601619 0.08178439 2 1.5 40.5 765
## 3 3 0.12295677 0.09284418 0.16207750 3 1.5 88.5 765
## group lambda lambda.low lambda.up experiment replicate k n
## 1 1 0.01315896 0.00585827 0.02946780 1 1.5 10.0 765
## 2 2 0.05439622 0.03601619 0.08178439 2 1.5 40.5 765
## 3 3 0.12295677 0.09284418 0.16207750 3 1.5 88.5 765
## group lambda lambda.low lambda.up experiment replicate k
## Experiment1.1 1 0.01448347 0.006670729 0.03130435 Experiment1 1 11
## Experiment1.2 1 0.01183446 0.005045810 0.02763125 Experiment1 2 9
## Experiment2.1 2 0.05232582 0.034376033 0.07928375 Experiment2 1 39
## Experiment2.2 2 0.05646661 0.037656342 0.08428503 Experiment2 2 42
## Experiment3.1 3 0.12813050 0.097256819 0.16801233 Experiment3 1 92
## Experiment3.2 3 0.11778304 0.088431546 0.15614267 Experiment3 2 85
## n
## Experiment1.1 765
## Experiment1.2 765
## Experiment2.1 765
## Experiment2.2 765
## Experiment3.1 765
## Experiment3.2 765
The Poisson regression on binary data (positive/negative partition) can be used only when the concentration of template molecules in samples is small (positive partitions contain rarely more than 1 template particle). Higher concentrations requires binomial regression.
The dPCR experiments may also be compared pairwise using the uniformly most powerful (UMP) ratio test (Fay 2010). Furthermore, computed p-values are adjusted using Benjamini & Hochberg correction (Benjamini and Hochberg 1995) to control family-wise error rate.
The UMP ratio test has following null-hypothesis:
\[ H_0: \frac{\lambda_1}{\lambda_2} = 1 \]
The generally advised Wilson’s confidence intervals (Brown, Cai, and DasGupta 2001) are computed independently for every dPCR experiment. The confidence intervals are adjusted using Dunn – Šidák correction to ensure that they simultaneously contain true value of \(lambda\):
\[ \alpha_{\text{adj}} = 1 - (1 - \alpha)^\frac{1}{T} \]
For example, the 0.95 significance levels means, that probability of the all real values being in the range of its respective confidence intervals is 0.95.
#1. Perform multiple test comparison using data from the previous example
comp_ratio <- test_counts(six_panels, model = "ratio")
#2. See summary of the test
summary(comp_ratio)
## group lambda lambda.low lambda.up experiment replicate k n
## 1 1 0.01315896 0.00585827 0.02946780 1 1.5 10.0 765
## 2 2 0.05439622 0.03601619 0.08178439 2 1.5 40.5 765
## 3 3 0.12295677 0.09284418 0.16207750 3 1.5 88.5 765
## group lambda lambda.low lambda.up experiment replicate k n
## 1 1 0.01315896 0.00585827 0.02946780 1 1.5 10.0 765
## 2 2 0.05439622 0.03601619 0.08178439 2 1.5 40.5 765
## 3 3 0.12295677 0.09284418 0.16207750 3 1.5 88.5 765
## group lambda lambda.low lambda.up experiment replicate k
## Experiment1.1 1 0.01448347 0.006670729 0.03130435 Experiment1 1 11
## Experiment1.2 1 0.01183446 0.005045810 0.02763125 Experiment1 2 9
## Experiment2.1 2 0.05232582 0.034376033 0.07928375 Experiment2 1 39
## Experiment2.2 2 0.05646661 0.037656342 0.08428503 Experiment2 2 42
## Experiment3.1 3 0.12813050 0.097256819 0.16801233 Experiment3 1 92
## Experiment3.2 3 0.11778304 0.088431546 0.15614267 Experiment3 2 85
## n
## Experiment1.1 765
## Experiment1.2 765
## Experiment2.1 765
## Experiment2.2 765
## Experiment3.1 765
## Experiment3.2 765
# Compare results of two methods
par(mfrow=c(2,1))
plot(comp, aggregate = FALSE)
title("GLM")
plot(comp_ratio, aggregate = FALSE)
title("Ratio")
Two approaches presented above were compared in a simulation approach over 150.000 simulated array dPCR experiments. Each simulation contained six reactions. Three of them had roughly the same amount of molecules per plate and other three had experiments with 10 to 50 molecules more. Experiments were compared using GLM and MT frameworks.
On average, 2.03 and 1.98 reactions were assessed to a wrong group by respectively GLM and MT.
A single GLM comparison took roughly 183 times longer than MT (on average 1.10 seconds versus 0.006 seconds on the Intel i7-2600 processor). The difference grows with the number of experiments and number of partitions (data not shown).
Average coverage probability is the proportion of the time that the interval contains the true value of \(\lambda\).
In the example below, we simulated 1 droplet dPCR experiments (2 droplets each) for each level of \(\lambda\) (1.2 experiments total). We computed the average probability coverage of CI obtained by three methods: Dube’s(Dube, Qin, and Ramakrishnan 2008), Bhat’s(Bhat et al. 2009) and by MT (\(\alpha = 0.95\)).
To assess simultaneous coverage probability, we randomly divided experiments into 2000 groups (500 experiments each) for each possible value of \(\lambda\). We counted frequency of groups in which all confidence intervals contain the true value of \(\lambda\).
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
The dashed black line marks 0.95 border.
Method name | Type of coverage | Value |
---|---|---|
Adjusted | Average probability coverage | 1.00 |
Bhat | Average probability coverage | 0.69 |
Dube | Average probability coverage | 0.95 |
Adjusted | Simultaneous probability coverage | 0.95 |
Bhat | Simultaneous probability coverage | 0.00 |
Dube | Simultaneous probability coverage | 0.01 |