Cell culture labs routinely run cellular assays in multiwell plates. The “bioassays” package helps to analyse the results of these experiments, and this article describes how to use its functions.
The functions in this package can be used to summarise data from any multiwell plate, and by calling them in a loop several plates can be analysed automatically. Two examples are also provided in the article.
The output reading from the instrument (e.g. a spectrophotometer) should be in matrix format. Example data (csv format) are shown below. If the data are in .xls/.xlsx format, the read_excel function from the ‘readxl’ package can be used to read them.
data(rawdata96)
head(rawdata96)
#> X X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
#> 1 A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> 2 B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> 3 C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> 4 D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> 5 E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> 6 F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
Metadata is needed for the whole experiment. The metafile must contain “row” and “col” columns to indicate the location of each well. An example is given below.
extract_filename helps to extract information from a file name. The syntax is extract_filename(filename, split = " ", end = ".csv", remove = " ", sep = "-"). filename is the file name. split is the string at which the name is split (default is a space, " "). end is the file extension to be removed (default is ".csv"). remove is the portion of the file name to be omitted after splitting (default is a space, " "). sep is the symbol placed between the separate sections (default is "-").
This function is useful for extracting specific information from file names, such as the compound name or plate number, so that the appropriate analysis can be applied.
For example:
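The splitting described above can be approximated in base R. This is only a sketch and not the package's implementation: split_filename is a hypothetical helper, the file name is made up for illustration, and the remove argument is left out for brevity.

```r
# Hypothetical helper (not part of bioassays): split a file name into fields.
split_filename <- function(filename, split = " ", end = ".csv", sep = "-") {
  # Strip the extension only when the name actually ends with it.
  stem <- filename
  if (endsWith(filename, end)) {
    stem <- substr(filename, 1, nchar(filename) - nchar(end))
  }
  # Split the remaining name at every occurrence of `split`.
  parts <- strsplit(stem, split, fixed = TRUE)[[1]]
  # Discard empty pieces, e.g. those produced by doubled separators.
  parts <- parts[nzchar(parts)]
  # Return the re-joined name followed by the individual sections.
  c(paste(parts, collapse = sep), parts)
}

# Made-up file name: label, cell line, plate number, incubation time.
info <- split_filename("L HEPG2 P3 72HRS.csv")
plate_number <- info[4]
```

Picking fields by position, as in the last line, is what makes it possible to choose plate-specific settings inside a loop over files.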
rmodd_summary helps to remove outliers and summarise the values from a given set of values. The syntax is rmodd_summary(x, rm = "FALSE", strict = "FALSE", cutoff = 80, n = 3). x is a numeric vector. Set rm = "TRUE" to remove outliers. If strict = "FALSE", values lying more than 1.5 IQR above or below are omitted (outliers omitted). If strict = "TRUE", more aggressive outlier removal is used to bring the %cv below cutoff. n is the minimum number of samples needed per group when the more aggressive outlier removal is used.
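As a rough illustration of the non-strict mode, the sketch below computes the same five statistics in base R and applies a 1.5 IQR rule. It assumes the usual Tukey fences (quartiles plus or minus 1.5 times the IQR), which may differ in detail from what the package does. The helper names are hypothetical, and x holds the values used in the package example.

```r
# Statistics reported by the summary: mean, median, n, sd and cv (in percent).
summarise_values <- function(x) {
  c(mean = mean(x), median = median(x), n = length(x),
    sd = sd(x), cv = 100 * sd(x) / mean(x))
}

# Assumed rule: keep values within 1.5 * IQR of the lower and upper quartiles.
drop_iqr_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x[x >= q[1] - fence & x <= q[2] + fence]
}

x <- c(1.01, 0.98, 0.6, 0.54, 0.6, 0.6, 0.4, 3)
all_values <- summarise_values(x)                 # no outlier removal
trimmed <- summarise_values(drop_iqr_outliers(x)) # the value 3 is dropped
```

Under this assumed rule only the value 3 falls outside the fences, so the trimmed summary is based on seven values.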
For example:
x<- c(1.01,0.98,0.6,0.54,0.6,0.6,0.4,3)
rmodd_summary(x, rm = "FALSE", strict= "FALSE", cutoff=80,n=3)
#> mean median n sd cv
#> 0.9662500 0.6000000 8.0000000 0.8487796 87.8426480
data2plateformat converts the data (e.g. readings from a 96-well plate) to the appropriate matrix format. The syntax is data2plateformat(data, platetype = 96). data is the data to be formatted. platetype is the type of plate the data come from; it can be 6, 12, 24, 96 or 384, representing the corresponding multiwell plate.
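The renaming this step performs can be pictured with a few lines of base R. The sketch below is an assumption about the effect, not the package's code: it rebuilds the first two rows of the raw readings shown earlier, drops the letter column, and labels rows and columns in plate style.

```r
# First two rows of the raw readings, laid out as the instrument exports them:
# a letter column X followed by unnamed reading columns (X1 ... X12).
raw <- data.frame(
  X = c("A", "B"),
  rbind(
    c(0.659, 0.649, 0.598, 0.601, 0.541, 0.553, 0.568, 0.519, 0.576, 0.575, 0.583, 0.504),
    c(0.442, 0.455, 0.586, 0.563, 0.525, 0.548, 0.511, 0.503, 0.533, 0.559, 0.529, 0.535)
  )
)

# Keep only the readings, then use row letters and column numbers as labels.
plate <- as.matrix(raw[, -1])
rownames(plate) <- raw$X
colnames(plate) <- seq_len(ncol(plate))

well_b3 <- plate["B", "3"]  # wells can now be addressed by row letter and column
```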
For example, to rename the columns and rows of ‘rawdata96’ to the right format:
rawdata<-data2plateformat(rawdata96,platetype = 96)
head(rawdata)
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
plate2df formats 2D matrix data from a multiwell plate as a dataframe. The function uses the column names and row names of ‘datamatrix’ (the 2D data of a multiwell plate) to generate a dataframe with row, col (column) and position indices. The ‘value’ column holds the corresponding value in ‘datamatrix’.
The syntax is plate2df(datamatrix). datamatrix is the data in matrix format.
For example:
OD_df<- plate2df(rawdata)
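To make the reshaping concrete, here is a base-R sketch of the same idea on a 2 x 3 corner of the plate. plate_to_long is a hypothetical helper, and the "A01" position format is taken from the position matrix shown later in this article. The package's own output may order or type the columns differently.

```r
# A 2 x 3 corner of the plate (rows A-B, columns 1-3), filled column by column.
plate <- matrix(c(0.659, 0.442, 0.649, 0.455, 0.598, 0.586), nrow = 2,
                dimnames = list(c("A", "B"), 1:3))

# Hypothetical helper: one output row per well, with row, col, position, value.
plate_to_long <- function(plate) {
  # expand.grid varies `row` fastest, matching the column-major matrix storage.
  long <- expand.grid(row = rownames(plate),
                      col = as.integer(colnames(plate)),
                      stringsAsFactors = FALSE)
  long$position <- sprintf("%s%02d", long$row, long$col)  # e.g. "A01"
  long$value <- as.vector(plate)
  long[order(long$row, long$col), ]  # list wells row by row
}

long <- plate_to_long(plate)
```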
matrix96 helps to convert a dataframe into matrix format. The syntax is matrix96(dataframe, column, rm = "FALSE"). dataframe is the dataframe to be formatted; it should have “row” and “col” columns. column is the name of the column to be converted into a matrix. If rm = "TRUE", negative values and NA are set to 0.
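The reverse reshaping can be sketched in base R as well. long_to_matrix and its zero_invalid argument are hypothetical stand-ins for the behaviour described above, and the NA in the toy data is inserted purely to show the zeroing step.

```r
# Toy long-format data: four wells, one reading deliberately set to NA.
long <- data.frame(row = c("A", "A", "B", "B"),
                   col = c(1, 2, 1, 2),
                   value = c(0.659, 0.649, 0.442, NA))

# Hypothetical helper: place each entry of `column` at its (row, col) cell.
long_to_matrix <- function(df, column, zero_invalid = FALSE) {
  rows <- sort(unique(df$row))
  cols <- sort(unique(df$col))
  out <- matrix(NA, length(rows), length(cols), dimnames = list(rows, cols))
  out[cbind(match(df$row, rows), match(df$col, cols))] <- df[[column]]
  # Optionally replace NA and negative readings by 0 (numeric data only).
  if (zero_invalid && is.numeric(out)) out[is.na(out) | out < 0] <- 0
  out
}

m <- long_to_matrix(long, "value")
m0 <- long_to_matrix(long, "value", zero_invalid = TRUE)
```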
For example:
matrix96(OD_df,"value")
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
#> G 0.122 0.118 0.300 0.288 0.293 0.251 0.245 0.270 0.261 0.259 0.271 0.271
#> H 0.107 0.102 0.320 0.340 0.319 0.270 0.262 0.277 0.294 0.278 0.307 0.316
matrix96(OD_df,"position")
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A "A01" "A02" "A03" "A04" "A05" "A06" "A07" "A08" "A09" "A10" "A11" "A12"
#> B "B01" "B02" "B03" "B04" "B05" "B06" "B07" "B08" "B09" "B10" "B11" "B12"
#> C "C01" "C02" "C03" "C04" "C05" "C06" "C07" "C08" "C09" "C10" "C11" "C12"
#> D "D01" "D02" "D03" "D04" "D05" "D06" "D07" "D08" "D09" "D10" "D11" "D12"
#> E "E01" "E02" "E03" "E04" "E05" "E06" "E07" "E08" "E09" "E10" "E11" "E12"
#> F "F01" "F02" "F03" "F04" "F05" "F06" "F07" "F08" "F09" "F10" "F11" "F12"
#> G "G01" "G02" "G03" "G04" "G05" "G06" "G07" "G08" "G09" "G10" "G11" "G12"
#> H "H01" "H02" "H03" "H04" "H05" "H06" "H07" "H08" "H09" "H10" "H11" "H12"
plate_metadata combines the plate-specific information (such as compound used, standard concentration and dilution of samples) with the metadata to produce unique plate metadata. The syntax is plate_metadata(plate_details, metadata, mergeby = "type"). plate_details is the plate-specific information to be added to the metadata. metadata is the metadata for the whole experiment. mergeby is the column common to both metadata and plate_details (this column is used for merging the information).
For example, consider this incomplete metadata:
head(metafile96)
#> row col position type id concentration dilution
#> 1 A 1 A01 STD1 STD 25 NA
#> 2 A 2 A02 STD1 STD 25 NA
#> 3 A 3 A03 S1 Sample NA NA
#> 4 A 4 A04 S1 Sample NA NA
#> 5 A 5 A05 S1 Sample NA NA
#> 6 A 6 A06 S1 Sample NA NA
The plate-specific details are:
plate_details <- list("compound" = "Taxol",
                      "concentration" = c(0.00,0.01,0.02,0.05,0.10,1.00,5.00,10.00),
                      "type" = c("S1","S2","S3","S4","S5","S6","S7","S8"),
                      "dilution" = 1)
Using the plate-specific information, the metadata can be completed by calling the plate_metadata function.
plate_meta<-plate_metadata(plate_details,metafile96,mergeby="type")
head(plate_meta)
#> row col type position id dilution concentration compound
#> 1 A 1 STD1 A01 STD NA 25 <NA>
#> 2 A 2 STD1 A02 STD NA 25 <NA>
#> 3 A 3 S1 A03 Sample 1 0 Taxol
#> 4 A 4 S1 A04 Sample 1 0 Taxol
#> 5 A 5 S1 A05 Sample 1 0 Taxol
#> 6 A 6 S1 A06 Sample 1 0 TaxolTo join both plate_meta and OD_df, innerjoin (is a dplyr function) can be used.
data_DF<- dplyr::inner_join(OD_df,plate_meta,by=c("row","col","position"))
head(data_DF)
#> row col position value type id dilution concentration compound
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA>
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA>
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol
heatplate helps to create a heatmap of a multiwell plate. The syntax is heatplate(datamatrix, name, size = 7.5). datamatrix is the data in matrix format; an easy way to create it is to call ‘matrix96’ as explained above. name is the name to be given to the heatmap. size is the size of each well in the heatmap (default is 7.5).
This function produces a heatmap of normalized values if the values in datamatrix are numeric. If they are categorical, it simply produces a coloured categorical plot.
Example 1. Categorical plot
datamatrix<-matrix96(metafile96,"id")
datamatrix
#> 1 2 3 4 5 6 7 8
#> A "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> B "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> C "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> D "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> E "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> F "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> G "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> H "Blank" "Blank" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> 9 10 11 12
#> A "Sample" "Sample" "Sample" "Sample"
#> B "Sample" "Sample" "Sample" "Sample"
#> C "Sample" "Sample" "Sample" "Sample"
#> D "Sample" "Sample" "Sample" "Sample"
#> E "Sample" "Sample" "Sample" "Sample"
#> F "Sample" "Sample" "Sample" "Sample"
#> G "Sample" "Sample" "Sample" "Sample"
#> H "Sample" "Sample" "Sample" "Sample"
Example 2. Heatmap
rawdata<-data2plateformat(rawdata96,platetype = 96)
OD_df<- plate2df(rawdata)
data<-matrix96(OD_df,"value")
data
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
#> G 0.122 0.118 0.300 0.288 0.293 0.251 0.245 0.270 0.261 0.259 0.271 0.271
#> H 0.107 0.102 0.320 0.340 0.319 0.270 0.262 0.277 0.294 0.278 0.307 0.316
reduceblank helps to subtract blank values from the readings.
The syntax is reduceblank(dataframe, x_vector, blank_vector, y). dataframe is the data. x_vector lists the entries from which the blank has to be subtracted; to process all entries, use "All". x_vector should be a vector, e.g. c("drug1", "drug2", "drug3"). blank_vector is the vector of blank names whose values are to be subtracted, also as a vector, e.g. c("blank1", "blank2", "blank3", "blank4"). The function subtracts the first blank_vector element from the first x_vector element, and so on. y is the name of the column on which the subtraction is performed; it should be numeric. The results appear in a new column named ‘blankminus’.
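For the simplest case (one blank applied to all entries), the numbers in this article are consistent with subtracting the mean of the blank wells from every reading: the blanks H01 and H02 read 0.107 and 0.102, and 0.659 - 0.1045 = 0.5545. The base-R sketch below reproduces that arithmetic on a handful of wells. It assumes mean-of-blanks subtraction and is not the package's code.

```r
# A few wells from the example plate: two standards, one sample, two blanks.
wells <- data.frame(
  position = c("A01", "A02", "A03", "H01", "H02"),
  id = c("STD", "STD", "Sample", "Blank", "Blank"),
  value = c(0.659, 0.649, 0.598, 0.107, 0.102)
)

# Assumed rule: average the blank wells and subtract that from every reading.
blank_mean <- mean(wells$value[wells$id == "Blank"])
wells$blankminus <- wells$value - blank_mean
```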
For example:
data_DF<-reduceblank(data_DF, x_vector =c("All"),blank_vector = c("Blank"), "value")
head(data_DF)
#> row col position value type id dilution concentration compound blankminus
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA> 0.5545
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA> 0.5445
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol 0.4935
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol 0.4965
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol 0.4365
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol 0.4485
estimate helps to estimate an unknown variable (e.g. concentration) based on the standard curve. The syntax is estimate(data = dataframe, colname = "blankminus", fitformula = fit, method = "linear/nplr"). data is the dataframe to be evaluated. colname is the name of the column from whose values the estimate is calculated. fitformula is the fitted model used. method specifies whether a linear model or an n-parameter logistic regression (nplr) curve was used for fitformula.
For example, data_DF is a dataframe for which the concentration has to be estimated based on the value of blankminus.
To filter the ‘standards’:
std<- dplyr::filter(data_DF, data_DF$id=="STD")
std<- aggregate(std$blankminus ~ std$concentration, FUN = mean )
colnames (std) <-c("con", "OD")
head(std)
#> con OD
#> 1 0.39 0.0155
#> 2 0.78 0.0345
#> 3 1.56 0.0710
#> 4 3.13 0.0935
#> 5 6.25 0.1675
#> 6 12.50 0.3440
To fit a standard curve:
fit1 is the 3-parameter logistic curve model and fit2 is the linear regression model. Use whichever is appropriate for your experiment.
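What the linear option amounts to can be sketched with base R alone: regress concentration on OD, then predict concentration from the blank-corrected readings. The standards below are the averaged values shown above plus the 25 standard, whose mean OD (0.5495) is recovered here from its two blank-corrected wells (0.5545 and 0.5445). Using predict() this way is an assumption about what the linear method does, not a copy of the package code.

```r
# Averaged standards: known concentration and mean blank-corrected OD.
std <- data.frame(
  con = c(0.39, 0.78, 1.56, 3.13, 6.25, 12.5, 25),
  OD  = c(0.0155, 0.0345, 0.0710, 0.0935, 0.1675, 0.3440, 0.5495)
)

# Linear standard curve with concentration as the response.
fit <- lm(con ~ OD, data = std)

# Blank-corrected readings of wells A01, A02 and A03.
blankminus <- c(0.5545, 0.5445, 0.4935)

# Read concentrations off the fitted line.
estimated <- predict(fit, newdata = data.frame(OD = blankminus))
```

The package calls that follow fit the same two kinds of model directly.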
fit2<-stats::lm(formula = con ~ OD,data = std) # linear model
fit1<-nplr::nplr(std$con,std$OD,npars=3,useLog = FALSE) # nplr, 3 parameter model
To estimate the concentration using the linear model:
estimated<-estimate(data_DF,colname="blankminus",fitformula=fit2,method="linear")
head(estimated)
#> row col position value type id dilution concentration compound blankminus
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA> 0.5545
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA> 0.5445
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol 0.4935
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol 0.4965
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol 0.4365
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol 0.4485
#> estimated
#> 1 23.96838
#> 2 23.51493
#> 3 21.20234
#> 4 21.33838
#> 5 18.61769
#> 6 19.16183
To estimate the concentration using the nplr method:
estimated2<-estimate(data_DF,colname="blankminus",fitformula=fit1,method="nplr")
head(estimated2)
#> row col position value type id dilution concentration compound blankminus
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA> 0.5545
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA> 0.5445
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol 0.4935
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol 0.4965
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol 0.4365
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol 0.4485
#> estimated
#> 1 26.39687
#> 2 24.01751
#> 3 18.68524
#> 4 18.88785
#> 5 15.70869
#> 6 16.23867
dfsummary helps to summarise the dataframe (based on a column). It has additional controls to group samples and to omit variables that are not needed. The syntax is dfsummary(dataframe, y, grp_vector, rm_vector, nickname, rm = "FALSE", param). dataframe is the data. y is the numeric variable (column name) to be summarised. grp_vector is a vector of column names by which samples are grouped; the order of its elements determines the order of grouping. rm_vector is the vector of items to be omitted before summarising. nickname is the name to be given to the output dataframe. Use rm = "FALSE" to keep outliers and rm = "TRUE" to remove them. For more stringent outlier removal, the parameters are supplied in the vector param, in the format c(strict = "TRUE", cutoff = 40, n = 12). For details, please refer to the rmodd_summary function.
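The core of such a grouped summary can be sketched in base R. The example below uses the raw OD readings of rows A and B (columns 3 to 12, i.e. samples S1 and S2) from the plate shown earlier and computes N, Mean, SD and CV per group. It summarises raw readings rather than estimated concentrations, and it leaves out the omission and outlier options, so it is an illustration of the grouping step only.

```r
# Raw OD readings of samples S1 (row A) and S2 (row B), columns 3 to 12.
wells <- data.frame(
  type = rep(c("S1", "S2"), each = 10),
  value = c(0.598, 0.601, 0.541, 0.553, 0.568, 0.519, 0.576, 0.575, 0.583, 0.504,
            0.586, 0.563, 0.525, 0.548, 0.511, 0.503, 0.533, 0.559, 0.529, 0.535)
)

# Split the readings by group and compute one statistic per group.
groups <- split(wells$value, wells$type)
result <- data.frame(
  type = names(groups),
  N = vapply(groups, length, integer(1)),
  Mean = vapply(groups, mean, numeric(1)),
  SD = vapply(groups, sd, numeric(1)),
  row.names = NULL
)
result$CV <- 100 * result$SD / result$Mean  # coefficient of variation in percent
```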
For example, the “estimated” values are summarised, with samples grouped by “id” and then “type”. “STD” and “Blank” values are omitted. Outliers are not removed (rm = "FALSE"). The nickname for the plate is “plate1”.
result<-dfsummary(estimated,"estimated",c("id","type"),
c("STD","Blank"),"plate1", rm="FALSE",
param=c(strict="FALSE",cutoff=40,n=12))
#> F1
#> F2
result
#> id type label N Mean SD CV
#> 1 Sample S1 plate1 10 19.561 1.465 7.49
#> 2 Sample S2 plate1 10 18.536 1.141 6.15
#> 3 Sample S3 plate1 10 15.670 2.194 14.00
#> 4 Sample S4 plate1 10 13.362 1.026 7.68
#> 5 Sample S5 plate1 10 12.043 1.359 11.29
#> 6 Sample S6 plate1 10 6.969 1.066 15.30
#> 7 Sample S7 plate1 10 6.370 0.819 12.85
#> 8 Sample S8 plate1 10 7.612 1.174 15.42
pvalue helps to calculate significance by t-test on the result dataframe. The syntax is pvalue(dataframe, control, sigval). dataframe is the result of dfsummary. control is the group considered as the control. sigval is the p-value cutoff (a value below this is considered significant). For example:
pval<-pvalue(result, control="S8", sigval=0.05)
head(pval)
#> id type label N Mean SD CV pvalue significance
#> 1 Sample S8 plate1 10 7.612 1.174 15.42 control
#> 2 Sample S1 plate1 10 19.561 1.465 7.49 < 0.001 Yes
#> 3 Sample S2 plate1 10 18.536 1.141 6.15 < 0.001 Yes
#> 4 Sample S3 plate1 10 15.670 2.194 14.00 < 0.001 Yes
#> 5 Sample S4 plate1 10 13.362 1.026 7.68 < 0.001 Yes
#> 6 Sample S5 plate1 10 12.043 1.359 11.29 < 0.001 Yes
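Because the input is a summary table, the test has to work from N, Mean and SD alone. The sketch below shows how a two-sided t-test can be computed from those three numbers. It assumes Welch's unequal-variance form, since the article does not say which variant the package uses, and welch_p is a hypothetical helper. The inputs are the S1 and S8 (control) rows of the table above.

```r
# Hypothetical helper: two-sided Welch t-test from summary statistics.
welch_p <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1  # squared standard error of group 1
  v2 <- s2^2 / n2  # squared standard error of group 2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)
  # Welch-Satterthwaite approximation to the degrees of freedom.
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  2 * pt(-abs(t_stat), df)
}

# S1 versus the control S8, using Mean, SD and N from the summary table.
p_s1 <- welch_p(19.561, 1.465, 10, 7.612, 1.174, 10)
significant <- p_s1 < 0.05
```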