Type: Package
Title: Building and Managing Local Databases from 'Google Earth Engine'
Version: 1.0.2
Description: Simplifies the creation, management, and updating of local databases using data extracted from 'Google Earth Engine' ('GEE'). It integrates with 'GEE' to store, aggregate, and process spatio-temporal data, leveraging 'SQLite' for efficient, serverless storage. The 'geeLite' package provides utilities for data transformation and supports real-time monitoring and analysis of geospatial features, making it suitable for researchers and practitioners in geospatial science. For details, see Kurbucz and Andrée (2025) "Building and Managing Local Databases from Google Earth Engine with the geeLite R Package" https://hdl.handle.net/10986/43165.
License: MPL-2.0
Encoding: UTF-8
RoxygenNote: 7.3.2
VignetteBuilder: knitr
Imports: rnaturalearthdata, rnaturalearth, googledrive, data.table, reticulate, rstudioapi, geojsonio, lubridate, jsonlite, magrittr, progress, reshape2, tidyrgee, RSQLite, stringr, crayon, dplyr, h3jsr, knitr, utils, purrr, stats, tidyr, rgee, cli, sf
Suggests: testthat (>= 3.0.0), rmarkdown, leaflet, withr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-07-19 00:22:45 UTC; Marcell
Author: Marcell T. Kurbucz [aut, cre], Bo Pieter Johannes Andrée [aut]
Maintainer: Marcell T. Kurbucz <m.kurbucz@ucl.ac.uk>
Repository: CRAN
Date/Publication: 2025-07-21 09:51:25 UTC
Aggregate Data by Frequency
Description
Aggregates data from a wide-format data frame according to a specified frequency and applies aggregation and post-processing functions.
Usage
aggr_by_freq(
table,
freq,
prep_fun,
aggr_funs,
postp_funs,
variable_name,
preprocess_body
)
Arguments
table |
[mandatory] (data.frame) A wide-format data frame. |
freq |
[mandatory] (character) Specifies the frequency to aggregate the data. |
prep_fun |
[mandatory] (function) Function used for pre-processing. |
aggr_funs |
[mandatory] (function or list) Aggregation function(s). |
postp_funs |
[mandatory] (function or list) Post-processing function(s). |
variable_name |
[mandatory] (character) Name of the current variable. |
preprocess_body |
[mandatory] (character) Body of the pre-processing function (prep_fun), deparsed as a character string. |
Value
A data frame in wide format with aggregated values.
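As an illustrative sketch of the kind of operation this performs — not the package's internal implementation — monthly means over a wide table can be computed in base R (all names below are hypothetical):

```r
# Hypothetical wide table: one row per bin, one column per observed date
wide <- data.frame(id = c("a", "b"),
                   "2020-01-01" = c(1, 2),
                   "2020-01-15" = c(3, 4),
                   "2020-02-01" = c(5, 6),
                   check.names = FALSE)
dates  <- as.Date(names(wide)[-1])
months <- format(dates, "%Y-%m")
# Average all date columns that fall into the same month
agg <- sapply(unique(months), function(m) {
  rowMeans(wide[-1][months == m], na.rm = TRUE)
})
agg <- data.frame(id = wide$id, agg, check.names = FALSE)
```

After aggregation, agg holds one column per month ("2020-01", "2020-02") instead of one per date.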
Perform a Single Drive Export for Multiple Geometry Chunks
Description
Exports multiple geometry chunks to Google Drive in a single batch task. The function processes spatial data using Google Earth Engine (GEE) and exports results in CSV format.
Usage
batch_drive_export(
sf_list,
imgs,
stat_fun,
band,
stat,
scale,
folder = ".geelite_tmp_drive",
user = NULL,
description = "geelite_export",
verbose = FALSE
)
Arguments
sf_list |
[mandatory] (list) A list of sf data.frames representing geometry chunks to be processed. |
imgs |
[mandatory] (ee$ImageCollection) The Earth Engine image collection to extract statistics from. |
stat_fun |
[mandatory] (ee$Reducer) The reducer function to apply to extract statistics. |
band |
[mandatory] (character) The band name from the image collection (e.g., "NDVI"). |
stat |
[mandatory] (character) The statistical function name (e.g., "mean"). |
scale |
[mandatory] (numeric) The spatial resolution in meters for 'reduceRegions'. |
folder |
[optional] (character) Name of the Google Drive folder where the export will be stored. Default is ".geelite_tmp_drive". |
user |
[optional] (character) If multiple rgee user profiles exist, specify the user profile directory. |
description |
[optional] (character) A custom description for the export task. Default is "geelite_export". |
verbose |
[optional] (logical) If TRUE, displays progress messages (default: FALSE). |
Value
(data.frame) A data frame containing extracted statistics with columns id, band, zonal_stat, and date-based values.
Check Google Earth Engine Connection
Description
Returns TRUE if the user is authenticated with GEE via 'rgee', without triggering interactive prompts. Useful in non-interactive contexts such as CRAN checks. Prints a message and returns FALSE if not.
Usage
check_rgee_ready()
Value
A logical value: TRUE if authenticated with GEE, FALSE otherwise (invisibly).
Clean Contents or Entire Google Drive Folders by Name
Description
Searches for all Google Drive folders with the specified name and optionally removes their contents and/or the folders themselves. Useful for cleaning up scratch or export folders used by Earth Engine batch processes.
Usage
clean_drive_folders_by_name(
folder_name,
delete_folders = FALSE,
verbose = TRUE
)
Arguments
folder_name |
[mandatory] (character) Name of the folder(s) to search for in Google Drive. |
delete_folders |
[optional] (logical) If TRUE, deletes the matching folders themselves in addition to their contents (default: FALSE). |
verbose |
[optional] (logical) If TRUE, displays messages (default: TRUE). |
Compare Lists and Highlight Differences
Description
Compares two lists and marks new values with '+' and removed values with '-'.
Usage
compare_lists(list_1, list_2)
Arguments
list_1 |
[mandatory] (list) First list to compare. |
list_2 |
[mandatory] (list) Second list to compare. |
Value
A list showing added and removed values marked with '+' and '-'.
Compare Vectors and Highlight Differences
Description
Compares two vectors and indicates added ('+') and removed ('-') values.
Usage
compare_vectors(vector_1, vector_2)
Arguments
vector_1 |
[mandatory] (character or integer) First vector to compare. |
vector_2 |
[mandatory] (character or integer) Second vector to compare. |
Value
A vector showing added and removed values marked with '+' and '-'.
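To illustrate the marking convention described above, here is a small hypothetical re-implementation (not the package's internal code) of the '+'/'-' diffing behavior:

```r
# Mark values added in `new` with '+' and values removed from `old` with '-'
mark_diff <- function(old, new) {
  c(intersect(old, new),
    paste0("+", setdiff(new, old)),
    paste0("-", setdiff(old, new)))
}
mark_diff(c("NDVI", "EVI"), c("NDVI", "LST"))
# "NDVI" is kept, "LST" is new ("+LST"), "EVI" was removed ("-EVI")
```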
Collect and Process Grid Statistics
Description
This function retrieves and processes grid statistics from Google Earth
Engine (GEE) based on the specified session task. The collected data is
stored in SQLite format (data/geelite.db), along with supplementary files such as CLI files (cli/...), the state file (state/state.json), and the log file (log/log.txt).
Usage
compile_db(task, grid, mode, verbose)
Arguments
task |
[mandatory] (list) Session task that specifies the parameters for data collection. |
grid |
[mandatory] (sf) Simple features object containing the geometries of the regions of interest. |
mode |
[optional] (character) Mode of data extraction. Currently supports "local" and "drive". |
verbose |
[mandatory] (logical) Display messages and progress status. |
Create or Open the Database Connection
Description
Tries to connect to an SQLite database using dbConnect(). If the initial connection fails, it retries up to max_retries times, waiting wait_time seconds between each attempt. If the connection cannot be established after the maximum number of retries, the function stops with an error.
Usage
db_connect(db_path, max_retries = 3, wait_time = 5)
Arguments
db_path |
[mandatory] (character) A string specifying the file path to the SQLite database. |
max_retries |
[mandatory] (integer) The maximum number of retries if the connection fails (default: 3). |
wait_time |
[mandatory] (numeric) The number of seconds to wait between retries (default: 5). |
Value
A database connection object if the connection is successful.
Internal Dummy Function for Declared Imports
Description
Ensures CRAN recognizes packages listed in 'Imports:' that are indirectly required but not explicitly used. Never called at runtime and has no side effects.
Usage
dummy_use_for_cran()
Value
NULL (invisible)
Expand Data to Daily Frequency
Description
Expands the input data frame to a daily frequency, filling in any missing dates within the observed range.
Usage
expand_to_daily(df_long, prep_fun)
Arguments
df_long |
[mandatory] (data.frame) A long-format data frame with at least the columns date and value. |
prep_fun |
[mandatory] (function) Function used for pre-processing. |
Value
A data frame with daily dates and a preprocessed value column.
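The expansion can be sketched in base R as follows (an illustrative approximation assuming date/value columns, not the internal code):

```r
# Hypothetical long-format input with gaps between observed dates
df_long <- data.frame(date  = as.Date(c("2020-01-01", "2020-01-04")),
                      value = c(10, 40))
# Build the full daily sequence and left-join the observations onto it
all_days <- data.frame(date = seq(min(df_long$date), max(df_long$date),
                                  by = "day"))
daily <- merge(all_days, df_long, by = "date", all.x = TRUE)
# Missing days (Jan 2 and Jan 3) now appear with NA values,
# ready for a pre-processing function such as linear interpolation
```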
Extract Large-Scale Statistics in Drive Mode with Fewer Tasks
Description
Batches multiple geometry chunks into fewer ee_table_to_drive tasks, reducing overhead and leveraging Google Earth Engine's parallel processing.
Usage
extract_drive_stats(
sf_chunks,
imgs,
band,
stat,
stat_fun,
scale,
folder = ".geelite_tmp_drive",
user = NULL,
pb,
pb_step
)
Arguments
sf_chunks |
[mandatory] (list) A list of sf data frames representing geometry chunks. |
imgs |
[mandatory] (ee$ImageCollection) The Earth Engine image collection to extract statistics from. |
band |
[mandatory] (character) The band name (e.g., "NDVI"). |
stat |
[mandatory] (character) The statistical function to apply (e.g., "mean"). |
stat_fun |
[mandatory] (ee$Reducer) The Earth Engine reducer function. |
scale |
[mandatory] (numeric) The spatial resolution in meters for reduceRegions. |
folder |
[optional] (character) Name of the Google Drive folder where exports will be stored. Defaults to ".geelite_tmp_drive". |
user |
[optional] (character) GEE user profile name, if applicable. |
pb |
[mandatory] (progress_bar) A progress bar instance from the progress package. |
pb_step |
[mandatory] (numeric) The step size for updating the progress bar. |
Value
(data.frame) A merged data frame containing extracted statistics from all Drive exports.
Fetch ISO 3166-1 Country Codes
Description
Retrieves country-level regions using ISO 3166-1 alpha-2 codes.
Usage
fetch_country_regions()
Value
A data frame with country names, ISO 3166-1 codes, and admin level 0.
Fetch ISO 3166 Country and Subdivision Codes
Description
Returns a data frame containing ISO 3166-1 country codes and ISO 3166-2 subdivision codes for the specified administrative level.
Usage
fetch_regions(admin_lvl = 0)
Arguments
admin_lvl |
[optional] (integer) Administrative level to retrieve: 0 for countries (ISO 3166-1) or 1 for first-level subdivisions (ISO 3166-2). Default is 0. |
Value
A data frame containing region names, ISO 3166-2 codes, and the corresponding administrative levels.
Examples
# Example: Fetch ISO 3166-1 country codes
## Not run:
fetch_regions()
## End(Not run)
Fetch ISO 3166-2 Subdivision Codes
Description
Retrieves first-level administrative subdivisions (e.g., states, provinces) using ISO 3166-2 codes.
Usage
fetch_state_regions()
Value
A data frame with subdivision names, ISO 3166-2 codes, and admin level 1.
Fetch Variable Information from an SQLite Database
Description
Displays information on the available variables in the SQLite database (data/geelite.db).
Usage
fetch_vars(
path,
format = c("data.frame", "markdown", "latex", "html", "pipe", "simple", "rst")
)
Arguments
path |
[mandatory] (character) Path to the root directory of the generated database. |
format |
[optional] (character) Output format. Possible values are "data.frame", "markdown", "latex", "html", "pipe", "simple", and "rst" (default: "data.frame"). |
Value
Returns the variable information in the selected format. If format = "data.frame", a data.frame is returned. For other formats, the output is printed in the specified format and NULL is returned invisibly.
Examples
# Example: Printing the available variables
## Not run:
fetch_vars(path = "path/to/db")
## End(Not run)
Install and Configure a Conda Environment for 'rgee'
Description
Sets up a Conda environment with all required Python and R dependencies for using the rgee package, including a specific version of the earthengine-api. If Conda is not available, the user will be prompted to install Miniconda. The created environment is automatically registered for use with rgee.
Usage
gee_install(conda = "rgee", python_version = "3.10", force_recreate = FALSE)
Arguments
conda |
[optional] (character) Name of the Conda environment to create or use. Defaults to "rgee". |
python_version |
[optional] (character) Python version to use when creating the Conda environment. Defaults to "3.10". |
force_recreate |
[optional] (logical) If TRUE, removes and recreates the Conda environment even if it already exists (default: FALSE). |
Value
Invisibly returns the name of the Conda environment used or created.
Note
Even after installation, users must manually accept the Conda Terms of Service (ToS) using the 'conda tos accept' command before package installation can proceed. Clear instructions will be provided if ToS acceptance is needed.
Examples
# Example: Creating a Conda environment with 'rgee' dependencies
## Not run:
gee_install()
## End(Not run)
Print Google Earth Engine and Python Environment Information
Description
Prints information about the Google Earth Engine (GEE) and Python environment.
Usage
gee_message(user)
Arguments
user |
[mandatory] (character) Specifies the Google account directory for which information is displayed. |
Define Output Messages
Description
Defines output messages based on whether the database is new or updated.
Usage
gen_messages(database_new)
Arguments
database_new |
[mandatory] (logical) A logical value indicating whether the database is new. |
Value
A list of output messages.
Create Batches from an sf Object
Description
Divides an sf object (grid) into a list of chunks, either based on a specified number of batches (batch_num) or a maximum chunk size (batch_size).
Usage
get_batch(grid, batch_size = NULL, batch_num = NULL)
Arguments
grid |
[mandatory] (sf) The sf object to be split into chunks. |
batch_size |
[optional] (integer) Maximum rows per chunk. Must be set if batch_num is NULL. |
batch_num |
[optional] (integer) Total number of chunks to create. Must be set if batch_size is NULL. |
Value
(list) A list of sf objects (chunks).
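The chunking logic can be illustrated on plain row indices (a simplified sketch; the package splits sf rows analogously):

```r
# Split n row indices into chunks of at most batch_size rows
chunk_rows <- function(n, batch_size) {
  split(seq_len(n), ceiling(seq_len(n) / batch_size))
}
lengths(chunk_rows(10, 4))  # chunk sizes: 4, 4, 2
```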
Produce Batches for Build/Update Mixed Cases
Description
Divides the grid into one or two lists of chunked sf objects, depending on the data-collection case (1,2,3).
Usage
get_batches(cases, grid, batch_size)
Arguments
cases |
[mandatory] (integer) 1=All build, 2=All update, 3=Mixed. |
grid |
[mandatory] (sf) The sf object (grid) containing column 'add' to distinguish existing vs. new rows. |
batch_size |
[mandatory] (integer) Maximum number of rows per chunk. |
Value
(list) A list of two elements, b1 and b2, each a list of sf subsets (chunks). b2 might be NULL if not needed.
Get H3 Bins for Shapes
Description
Generates H3 bins for the provided shapes at the specified resolution.
Usage
get_bins(shapes, resol)
Arguments
shapes |
[mandatory] (sf) A simple features object containing geometries used for generating H3 bins. |
resol |
[mandatory] (integer) An integer specifying the resolution of the H3 grid. |
Value
A data frame containing the H3 bins with columns for region ISO 3166 codes, bin IDs, and geometry.
Determine the Cases of Data Collection Requests
Description
Determines the cases of data collection requests based on the markers of 'datasets', 'bands', and 'stats'.
Usage
get_cases(database_new, dataset_new, band_new, stats_new, regions_new)
Arguments
database_new |
[mandatory] (logical) A logical value indicating whether the database is new. |
dataset_new |
[mandatory] (logical) A logical value indicating whether the dataset is new. |
band_new |
[mandatory] (logical) A logical value indicating whether the band is new. |
stats_new |
[mandatory] (logical) A logical vector indicating which statistics are new. |
regions_new |
[mandatory] (logical) A logical vector indicating which regions are new. |
Value
An integer indicating the processing case:
- 1: All build
- 2: All update
- 3: Mixed
Print the Configuration File
Description
Reads and prints the configuration file from the database's root directory in a human-readable format.
Usage
get_config(path)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. |
Value
A character string representing the formatted JSON content of the configuration file.
Examples
# Example: Printing the configuration file
## Not run:
get_config(path = "path/to/db")
## End(Not run)
Obtain H3 Hexagonal Grid
Description
Retrieves or creates the grid for the task based on the specified regions and resolution.
Usage
get_grid(task)
Arguments
task |
[mandatory] (list) Session task that specifies the parameters for data collection. |
Value
A simple features (sf) object containing grid data.
Retrieve Images and Related Information
Description
Retrieves images and related information from Google Earth Engine (GEE) based on the specified session task.
Usage
get_images(task, mode, cases, dataset, band, regions_new, latest_date)
Arguments
task |
[mandatory] (list) Session task specifying parameters for data collection. |
mode |
[mandatory] (character) Mode of data extraction. Currently supports "local" and "drive". |
cases |
[mandatory] (integer) Type of data collection request (1 = all build, 2 = all update, 3 = mixed). |
dataset |
[mandatory] (character) Name of the GEE dataset. |
band |
[mandatory] (character) Name of the band. |
regions_new |
[mandatory] (logical) A logical vector indicating which regions are new. |
latest_date |
[mandatory] (date) The most recent data available in the related SQLite table. Set to NULL if the table does not yet exist. |
Value
List containing retrieved images and related information:
- $build: Images for the building procedure
- $update: Images for the updating procedure
- $batch_size: Batch size
- $skip_band: TRUE if 'band' is up to date and can be skipped
- $skip_update: TRUE if 'band' is up to date but cannot be skipped
Print JSON File
Description
Reads and prints a specified JSON file from the provided root directory in a human-readable format.
Usage
get_json(path, file_path)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. |
file_path |
[mandatory] (character) The relative path to the JSON file from the root directory. |
Value
A character string representing the formatted JSON content of the specified file.
Get Reducers
Description
Initializes a list of reducers for grid statistics calculation.
Usage
get_reducers()
Value
A list of available reducers.
Get Shapes for Specified Regions
Description
Retrieves the shapes of specified regions, which can be at the country or state level.
Usage
get_shapes(regions)
Arguments
regions |
[mandatory] (character) A vector containing ISO 3166-2 region codes. Country codes are two characters long, while state codes contain additional characters. |
Value
A simple features (sf) object containing the shapes of the specified regions.
Print the State File
Description
Reads and prints the state file from the database's root directory in a human-readable format.
Usage
get_state(path)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. |
Value
A character string representing the formatted JSON content of the state file.
Examples
# Example: Printing the state file
## Not run:
get_state(path = "path/to/db")
## End(Not run)
Generate Session Task
Description
Generates a session task based on the configuration and state files.
Usage
get_task()
Value
A list representing the session task.
Initialize Post-Processing Folder and Files
Description
Creates a postp folder at the specified path and adds two empty files: structure.json and functions.R.
Usage
init_postp(path, verbose = TRUE)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. |
verbose |
[optional] (logical) Display messages (default: TRUE). |
Details
The structure.json file is initialized with a default JSON structure: "default": null. This file is intended for mapping variables to post-processing functions. The functions.R file is created with a placeholder comment indicating where to define the R functions for post-processing. If the postp folder already exists, an error will be thrown to prevent overwriting existing files.
Value
No return value, called for side effects.
Examples
# Example: Initialize post-processing files in the database directory
## Not run:
init_postp("path/to/db")
## End(Not run)
Simple Linear Interpolation
Description
Replaces NA values with linear interpolation.
Usage
linear_interp(x)
Arguments
x |
[mandatory] (numeric) A numeric vector possibly containing NA values. |
Value
A numeric vector of the same length as x, with NA values replaced by linear interpolation.
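A minimal sketch of this behavior using stats::approx() (assumed to be equivalent in spirit; not necessarily the package's exact implementation):

```r
# Replace interior NAs by linear interpolation; rule = 2 carries the
# nearest observed value to any leading/trailing NAs
interp_sketch <- function(x) {
  obs <- which(!is.na(x))
  if (length(obs) < 2) return(x)
  stats::approx(obs, x[obs], xout = seq_along(x), rule = 2)$y
}
interp_sketch(c(1, NA, 3, NA, NA, 6))  # 1 2 3 4 5 6
```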
Load External Post-Processing Functions
Description
Loads post-processing functions and their configuration from an external folder named postp, located in the root directory of the database. The folder must contain two files: structure.json (which defines the post-processing configuration) and functions.R (which contains the R function definitions to be used for post-processing). The function checks for these files, loads the JSON configuration, and sources the R script. If the required files are missing, it stops execution and notifies the user with instructions on how to set up the files correctly.
Usage
load_external_postp(path)
Arguments
path |
[mandatory] (character) The path to the root directory where the database is located. |
Value
Returns a list of post-processing functions loaded from the structure.json file. The functions defined in functions.R are sourced and made available in the returned environment.
Note
The postp folder must contain two files: structure.json and functions.R. The structure.json file contains mappings of variables to the post-processing functions, while functions.R contains the actual function definitions that will be used for post-processing.
Extract Statistics Locally for a Single Geometry Chunk
Description
Computes statistical summaries for a given spatial feature (sf_chunk) from an Earth Engine ee$ImageCollection over a specified date range. This function extracts values for a specific band and applies a chosen reducer.
Usage
local_chunk_extract(sf_chunk, imgs, dates, band, stat, stat_fun, scale)
Arguments
sf_chunk |
[mandatory] (sf) An sf data frame containing geometry. |
imgs |
[mandatory] (ee$ImageCollection) The Earth Engine image collection to extract statistics from. |
dates |
[mandatory] (character) A vector of date strings corresponding to images in the collection. |
band |
[mandatory] (character) The name of the band to extract. |
stat |
[mandatory] (character) The statistical function to apply (e.g., "mean"). |
stat_fun |
[mandatory] (ee$Reducer) The Earth Engine reducer function. |
scale |
[mandatory] (numeric) The spatial resolution in meters for reduce operations. |
Value
(data.frame) A data frame containing extracted statistics with columns id, band, zonal_stat, and date-based values.
Modify Configuration File
Description
Modifies the configuration file located in the specified root directory of the generated database (config/config.json) by updating values corresponding to the specified keys.
Usage
modify_config(path, keys, new_values, verbose = TRUE)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. |
keys |
[mandatory] (list) A list specifying the path to the values in the configuration file that need updating. Each path should correspond to a specific element in the configuration. |
new_values |
[mandatory] (list) A list of new values to replace the original values at the locations specified by 'keys'. The length of new_values must match the length of keys. |
verbose |
[optional] (logical) If TRUE, displays messages (default: TRUE). |
Value
No return value, called for side effects.
Examples
# Example: Modifying the configuration file
## Not run:
modify_config(
path = "path/to/db",
keys = list("limit", c("source", "MODIS/061/MOD13A2", "NDVI")),
new_values = list(1000, "mean")
)
## End(Not run)
Output Message
Description
Outputs a message if verbose mode is TRUE.
Usage
output_message(message, verbose)
Arguments
message |
[mandatory] (list) The message to display. |
verbose |
[mandatory] (logical) A flag indicating whether to display the message. |
Display geeLite Package Version
Description
Displays the version of the geeLite package with formatted headers.
Usage
print_version(verbose)
Arguments
verbose |
[mandatory] (logical) If TRUE, prints the package version. |
Process a Single Source File
Description
Processes an individual source file by updating the file with the specified
'path' and writing the updated file to the cli/ directory of the database.
Usage
process_single_file(src_file_path, path)
Arguments
src_file_path |
[mandatory] (character) The path of the source file to process. |
path |
[mandatory] (character) The path to the root directory of the generated database. |
Process Source Files
Description
Processes multiple source files by iterating through them.
Usage
process_source_files(src_files_path, path)
Arguments
src_files_path |
[mandatory] (character) A vector of source file paths. |
path |
[mandatory] (character) The path to the root directory of the generated database. |
Process Marked Vector
Description
Generates a list categorizing items based on their marks: items to be added ('+'), items to be dropped ('-'), items to be used (unmarked or marked with '+'), and indices of '+' items within the used category.
Usage
process_vector(vector)
Arguments
vector |
[mandatory] (character) A character vector containing elements marked with '+' and '-' prefixes. |
Value
A list with the following components:
- $add: Items marked with '+'
- $drop: Items marked with '-'
- $use: Items that are unmarked or marked with '+'
- $use_add: TRUE for items marked with '+' within the $use category
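A hypothetical sketch of this categorization (illustrative, not the package's internal code):

```r
# Categorize items by their '+'/'-' marks
split_marked <- function(v) {
  keep <- !startsWith(v, "-")
  list(add     = sub("^\\+", "", v[startsWith(v, "+")]),
       drop    = sub("^-", "", v[startsWith(v, "-")]),
       use     = sub("^\\+", "", v[keep]),
       use_add = startsWith(v[keep], "+"))
}
split_marked(c("mean", "+sd", "-min"))
```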
Reading, Aggregating, and Processing the SQLite Database
Description
Reads, aggregates, and processes the SQLite database (data/geelite.db).
Usage
read_db(
path,
variables = "all",
freq = c("month", "day", "week", "bimonth", "quarter", "season", "halfyear", "year"),
prep_fun = NULL,
aggr_funs = function(x) mean(x, na.rm = TRUE),
postp_funs = NULL
)
Arguments
path |
[mandatory] (character) Path to the root directory of the generated database. |
variables |
[optional] (character or integer) Names or IDs of the variables to be read. Use "all" to read all variables (default: "all"). |
freq |
[optional] (character) The frequency for data aggregation. Options include "day", "week", "month", "bimonth", "quarter", "season", "halfyear", and "year" (default: "month"). |
prep_fun |
[optional] (function or NULL) Function used for pre-processing before aggregation (default: NULL). |
aggr_funs |
[optional] (function or list) A function or a list of functions for aggregating data to the specified frequency (default: function(x) mean(x, na.rm = TRUE)). |
postp_funs |
[optional] (function or list) A function or list of functions applied to the time series data of a single bin after aggregation. Users can directly refer to variable names or IDs. The default is NULL. |
Value
A list where the first element (grid) is a simple features (sf) object, and subsequent elements are data frame objects corresponding to the variables.
Examples
# Example: Reading variables by IDs
## Not run:
db_list <- read_db(path = "path/to/db",
variables = c(1, 3))
## End(Not run)
Read Grid from Database
Description
Reads the H3 grid from the specified SQLite database (data/geelite.db).
Usage
read_grid()
Value
A simple features (sf) object containing the grid data.
Read Variables from Database
Description
Reads the specified variables from the SQLite database.
Usage
read_variables(path, variables, freq, prep_fun, aggr_funs, postp_funs)
Arguments
path |
[mandatory] (character) Path to the root directory of the generated database. |
variables |
[mandatory] (character) A vector of variable names to read. |
freq |
[mandatory] (character) Specifies the frequency to aggregate the data. |
prep_fun |
[mandatory] (function) Function used for pre-processing. |
aggr_funs |
[mandatory] (function or list) Aggregation function(s). |
postp_funs |
[mandatory] (function or list) Post-processing function(s). |
Value
A list of variables read from the database.
Remove Tables from the Database
Description
Removes tables from the database if their corresponding dataset is initially marked for deletion ('-').
Usage
remove_tables(tables_drop)
Arguments
tables_drop |
[mandatory] (character) A character vector of tables to be deleted. |
Build and Update the Grid Statistics Database
Description
Collects and stores grid statistics from Google Earth Engine (GEE) data in SQLite format (data/geelite.db), initializes CLI files (cli/...), and initializes or updates the state (state/state.json) and log (log/log.txt) files.
Usage
run_geelite(
path,
conda = "rgee",
user = NULL,
rebuild = FALSE,
mode = "local",
verbose = TRUE
)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. This must be a writable, non-temporary directory. Avoid using the home directory (~), the current working directory, or the package directory. |
conda |
[optional] (character) Name of the virtual Conda environment used by the rgee package (default: "rgee"). |
user |
[optional] (character) Specifies the Google account directory within the rgee credentials directory; useful when multiple accounts are registered (default: NULL). |
rebuild |
[optional] (logical) If TRUE, rebuilds the database from scratch (default: FALSE). |
mode |
[optional] (character) Mode of data extraction. Currently supports "local" (default) and "drive". |
verbose |
[optional] (logical) Display computation status and messages (default: TRUE). |
Value
Invisibly returns NULL, called for side effects.
Examples
# Example: Build a Grid Statistics Database
## Not run:
run_geelite(path = tempdir())
## End(Not run)
Initialize CLI Files
Description
Creates R scripts to enable the main functions to be called through the
Command Line Interface (CLI). These scripts are stored in the cli/ directory of the generated database.
Usage
set_cli(path, verbose = TRUE)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. This must be a writable, non-temporary directory. Avoid using the home directory (~), the current working directory, or the package directory. |
verbose |
[optional] (logical) Whether to display messages (default: TRUE). |
Value
No return value, called for side effects.
Examples
## Not run:
set_cli(path = tempdir())
## End(Not run)
Initialize the Configuration File
Description
Creates a configuration file in the specified directory of the generated database (config/config.json). If the specified directory does not exist but its parent directory does, it will be created.
Usage
set_config(
path,
regions,
source,
start = "2020-01-01",
resol,
scale = NULL,
limit = 10000,
verbose = TRUE
)
Arguments
path |
[mandatory] (character) The path to the root directory of the generated database. This must be a writable, non-temporary directory. Avoid using the home directory (~), the current working directory, or the package directory. |
regions |
[mandatory] (character) ISO 3166-1 alpha-2 country codes or ISO 3166-2 subdivision codes. |
source |
[mandatory] (list) Description of Google Earth Engine (GEE) datasets of interest (the complete data catalog of GEE is accessible at: https://developers.google.com/earth-engine/datasets/catalog). It is a nested list with three levels: dataset names, band names, and the statistics to compute for each band. |
start |
[optional] (date) First date of the data collection (default: "2020-01-01"). |
resol |
[mandatory] (integer) Resolution of the H3 bin. |
scale |
[optional] (integer) Specifies the nominal resolution (in meters) for image processing. If left as NULL, the native resolution of the dataset is used. |
limit |
[optional] (integer) Upper limit on the number of features processed per extraction request (default: 10000). |
verbose |
[optional] (logical) Display messages (default: TRUE). |
Value
No return value, called for side effects.
Examples
## Not run:
set_config(path = tempdir(),
regions = c("SO", "YM"),
source = list(
"MODIS/061/MOD13A1" = list(
"NDVI" = c("mean", "sd")
)
),
resol = 3)
## End(Not run)
Set Dependencies
Description
Authenticates the Google Earth Engine (GEE) account and activates the specified Conda environment.
Usage
set_depend(conda = "rgee", user = NULL, drive = TRUE, verbose = TRUE)
Arguments
conda |
[optional] (character) Name of the virtual Conda environment used by the rgee package (default: "rgee"). |
user |
[optional] (character) Specifies the Google account directory within the rgee credentials directory (default: NULL). |
drive |
[optional] (logical) If TRUE, Google Drive access is also authenticated (default: TRUE). |
verbose |
[optional] (logical) Display messages (default: TRUE). |
Generate Necessary Directories
Description
Generates "data", "log", "cli", and "state" subdirectories at the specified path.
Usage
set_dirs(rebuild)
Arguments
rebuild |
[optional] (logical) If TRUE, existing subdirectories are removed and recreated. |
Set Progress Bar
Description
Initializes a progress bar if 'verbose' is TRUE.
Usage
set_progress_bar(verbose)
Arguments
verbose |
[mandatory] (logical) If TRUE, a progress bar is initialized. |
Value
A progress bar (environment) if 'verbose' is TRUE, or NULL if FALSE.
Source an R Script with Notifications About Functions Loaded
Description
Sources an R script into a dedicated environment and lists the functions that have been loaded.
Usage
source_with_notification(file)
Arguments
file |
[mandatory] (character) A character string specifying the path to the R script to be sourced. |
Value
An environment containing the functions loaded from the sourced file.
Update Grid Statistics
Description
Updates existing grid statistics with newly calculated statistics.
Usage
update_grid_stats(grid_stat, batch_stat)
Arguments
grid_stat |
[optional] (data.frame) Existing data frame of grid statistics to append the newly calculated statistics to. |
batch_stat |
[mandatory] (data.frame) New data frame of grid statistics to append to the existing statistics. |
Value
(data.frame) A combined data frame containing the updated grid statistics, with missing columns filled as NA.
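The column-filling behavior can be sketched in base R (dplyr::bind_rows() provides the same semantics; this is an illustration, not the internal code):

```r
# Append new rows, creating any missing date columns as NA on either side
bind_fill <- function(a, b) {
  for (col in setdiff(names(b), names(a))) a[[col]] <- NA
  for (col in setdiff(names(a), names(b))) b[[col]] <- NA
  rbind(a, b[names(a)])
}
old <- data.frame(id = 1, "2020-01-01" = 5, check.names = FALSE)
new <- data.frame(id = 2, "2020-02-01" = 7, check.names = FALSE)
bind_fill(old, new)
```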
Validate Parameters
Description
Validates multiple parameters.
Usage
validate_params(params)
Arguments
params |
[mandatory] (list) A list of parameters to be validated. |
Details
The following validations are performed:
- 'admin_lvl': Ensures it is NULL, 0, or 1.
- 'conda': Verifies that it is an available Conda environment.
- 'file_path': Constructs a file path and checks whether the file exists.
- 'keys': Ensures it is a non-empty list with valid entries.
- 'limit': Ensures it is a positive numeric value.
- 'mode': Ensures it is 'local' or 'drive'.
- 'new_values': Ensures it is a list of the same length as 'keys'.
- 'user': Verifies that it is NULL or a character value.
- 'path': Verifies that the directory exists.
- 'rebuild': Verifies that it is a logical value.
- 'regions': Ensures the first two characters are letters.
- 'start': Ensures it is a valid date.
- 'verbose': Verifies that it is a logical value.
Value
Returns NULL invisibly if all validations pass.
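An illustrative parameter list covering several of the checks listed in Details (the values are examples only, not required settings):
## Not run:
params <- list(
  admin_lvl = 0,                # NULL, 0, or 1
  mode      = "local",          # 'local' or 'drive'
  regions   = c("US", "US-CA"), # first two characters must be letters
  start     = "2020-01-01",     # a valid date
  rebuild   = FALSE,            # logical
  verbose   = TRUE              # logical
)
validate_params(params)
## End(Not run)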
Validate Source Parameter
Description
Checks the validity of the 'source' parameter.
Usage
validate_source_param(source)
Arguments
source |
[mandatory] (list) A list containing datasets, each with its associated bands and statistics. The structure should follow a nested format where each dataset is a named list, each band within a dataset is also a named list, and each statistic within a band is a non-empty character string. |
Value
Returns TRUE if the 'source' parameter is valid. Throws an error if the parameter is invalid.
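A valid 'source' list following the nested dataset/band/statistic structure described above:
## Not run:
source <- list(
  "MODIS/061/MOD13A1" = list(  # dataset: a named list
    "NDVI" = c("mean", "sd")   # band: non-empty character statistics
  )
)
validate_source_param(source)
## End(Not run)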
Validate and Process Parameters for Variable Selection and Data Processing
Description
Validates and processes input parameters related to variable selection and data processing in the read_db function. It ensures that the variables, frequency, and functions provided are valid, correctly formatted, and compatible with the available data.
Usage
validate_variables_param(
variables,
variables_all,
prep_fun,
aggr_funs,
postp_funs
)
Arguments
variables |
[mandatory] (character or integer) Variable IDs or names to be processed. |
variables_all |
[mandatory] (data.frame) A data frame containing all available variables. |
prep_fun |
[mandatory] (function) A function used for pre-processing. |
aggr_funs |
[mandatory] (function or list) Aggregation function(s). |
postp_funs |
[mandatory] (function or list) Post-processing function(s). |
Value
A character vector of variable names to process.
Write Grid to Database
Description
Writes the H3 grid to the specified SQLite database (data/geelite.db).
Usage
write_grid(grid)
Arguments
grid |
[mandatory] (sf) Simple features object containing the grid data to be written into the database. |
Write Grid Statistics to Database
Description
Writes grid statistics to the SQLite database.
Usage
write_grid_stats(database_new, dataset_new, dataset, db_table, grid_stats)
Arguments
database_new |
[mandatory] (logical) A logical value indicating whether the database is new. |
dataset_new |
[mandatory] (logical) A logical value indicating whether the dataset is new. |
dataset |
[mandatory] (character) Name of the dataset to initialize or update in the SQLite database. |
db_table |
[mandatory] (data.frame) The table to be updated or retrieved from the SQLite database. |
grid_stats |
[mandatory] (list) List containing grid statistics separately for (re)building and updating procedures. |
Write Log File
Description
Writes the log file to the specified directory within the generated database (log/log.txt).
Usage
write_log_file(database_new)
Arguments
database_new |
[mandatory] (logical) A logical value indicating whether the database is new. |
Write State File
Description
Writes the state file to the specified directory within the generated database (state/state.json).
Usage
write_state_file(task, regions, source_for_state)
Arguments
task |
[mandatory] (list) Session task specifying parameters for data collection. |
regions |
[mandatory] (character) A vector containing ISO 3166-2 region codes. Country codes are two characters long, while state codes contain additional characters. |
source_for_state |
[mandatory] (list) A list containing information regarding the collected data. |
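The 'regions' format can be illustrated as follows (example codes only):
## Not run:
# ISO 3166-2 codes: two characters for a country ("US"),
# additional characters for a state ("US-CA")
regions <- c("US", "US-CA")
## End(Not run)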