Our package’s main purpose is to read raw MBA data, perform quality control, and normalize it. Unfortunately, different devices and labs use different data formats, so we gathered a few datasets on which the package can be tested. This document describes the datasets and their sources.
Most of our datasets that are available to the public are stored in the extdata folder of the package. The remaining ones, both private datasets and the larger number of publicly available datasets, are stored in the OneDrive folder, which is accessible to the package developers.
The simplest way to access the files is to download them from our GitHub repository.
Another way is to locate the files with the system.file function, which returns the path to a file installed with the package. This path can then be used to read the data. For example:
dataset_name <- "CovidOISExPONTENT.csv"
dataset_filepath <- system.file("extdata", dataset_name, package = "PvSTATEM", mustWork = TRUE)
The variable dataset_filepath now contains the path to the specified dataset on your computer. Since we know the filepath to the desired dataset, we can call the read_luminex_data function to read the data:
library(PvSTATEM)
plate <- read_luminex_data(dataset_filepath)
#> Reading Luminex data from: C:/Users/tymot/AppData/Local/Temp/Rtmpk1YQco/Rinst523856934c41/PvSTATEM/extdata/CovidOISExPONTENT.csv
#> using format xPONENT
#> (WARNING)
#> Layout file not provided. Setting `use_layout_sample_names`,
#> `use_layout_types` and `use_layout_dilutions` to FALSE.
#> (WARNING)
#> All dilutions in the plate are set to NA. Please check the dilutions in the layout file or sample names.
#> New plate object has been created with name: CovidOISExPONTENT!
plate
#> Plate with 96 samples and 30 analytes
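To see which example files are bundled with an installed copy of the package, the extdata directory can be listed directly. The following is a minimal sketch using only base R. The helper name list_bundled_datasets is hypothetical and not part of PvSTATEM, and the result is empty if the package is not installed.

```r
# Hypothetical helper (not part of PvSTATEM): list the data files
# shipped in a package's extdata directory.
list_bundled_datasets <- function(package = "PvSTATEM",
                                  pattern = "\\.(csv|xlsx)$") {
  # system.file() returns "" when the package or the directory is missing
  extdata_dir <- system.file("extdata", package = package)
  if (!nzchar(extdata_dir)) {
    return(character(0))
  }
  # recursive = TRUE also picks up files stored in subfolders of extdata
  list.files(extdata_dir, pattern = pattern, recursive = TRUE)
}

bundled_files <- list_bundled_datasets()
print(bundled_files)
```

Any of the returned names can then be passed to system.file, as in the example above, to obtain the full path.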
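Our datasets are divided into three main categories: artificial datasets, datasets from the PvSTATEM project, and external datasets from public sources.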
To perform simple unit tests and validate the most basic reading functionalities of the package, we created a few artificial datasets. They are stored in the extdata folder of the package:
random.csv - a simple dataset with random values, used to test the basic functionalities of the package
random2.csv - another simple dataset with random values, used to test the basic functionalities of the package. This file has a corresponding artificial layout, random_layout.csv
random_broken_colB.csv - a dataset with a broken column, which should be detected by the package and reported as a warning

The datasets from the PvSTATEM project are the most important for package development, since the main purpose of the package is to make data preprocessing easier within the PvSTATEM project.
Most of them are stored in the package’s OneDrive folder. The extdata folder contains two files from the CovidOise examination:
CovidOISExPONTENT.csv - the IG4DC2~1.csv plate from the examination IgG_CovidOise4_30plex. It comes with a corresponding layout file, CovidOISExPONTENT_layout.xlsx
CovidOISExPONTENT_CO.csv - the IGG_CO~1.csv plate from the examination IgG_CovidOise2_30plex, with a corresponding layout file

Most of the examples and vignettes in the package are based on these datasets.
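Because CovidOISExPONTENT.csv has a matching layout file, the two can be read together. The sketch below is hedged: it assumes that read_luminex_data accepts the layout through an argument named layout_filepath and that both files are present in the installed extdata folder. It skips the read if either assumption about the installation does not hold.

```r
# File names are taken from the list above.
plate_file <- "CovidOISExPONTENT.csv"
layout_file <- "CovidOISExPONTENT_layout.xlsx"

if (requireNamespace("PvSTATEM", quietly = TRUE)) {
  # mustWork = FALSE so that a missing file yields "" instead of an error
  plate_path <- system.file("extdata", plate_file, package = "PvSTATEM")
  layout_path <- system.file("extdata", layout_file, package = "PvSTATEM")

  if (nzchar(plate_path) && nzchar(layout_path)) {
    # Assumption: the layout is passed via the `layout_filepath` argument
    plate <- PvSTATEM::read_luminex_data(plate_path, layout_filepath = layout_path)
    print(plate)
  } else {
    message("Example files not found in the installed package.")
  }
} else {
  message("PvSTATEM is not installed; skipping the example.")
}
```

Supplying the layout should avoid the "Layout file not provided" warning shown in the earlier output, since sample names, types, and dilutions can then be taken from the layout.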
To check the package functionalities on data from different sources, we gathered a few datasets from the public domain. They are stored in the OneDrive folder of the package and in the external subfolder of the extdata directory:
Chul_IgG3_1.csv - from the GitHub repository RTSS_Kisumu_Schisto
Chul_TotalIgG_2.csv - from the GitHub repository RTSS_Kisumu_Schisto
pone.0187901.s001.csv - data shipped with the drLumi package
New_Batch_6_20160309_174224.csv - dataset included in the paper "A single-nucleotide-polymorphism-based genotyping assay for simultaneous detection of different carbendazim-resistant genotypes in the Fusarium graminearum species complex", H. Zhang et al.
New_Batch_14_20140513_082522.csv - dataset included in the paper "A single-nucleotide-polymorphism-based genotyping assay for simultaneous detection of different carbendazim-resistant genotypes in the Fusarium graminearum species complex", H. Zhang et al.