geeLite simplifies the process of building, managing, and updating
local SQLite databases containing geospatial features extracted from
Google Earth Engine (GEE). This vignette covers the installation,
configuration, data collection, and analysis workflows using the
geeLite package.
For more detailed information and updates, visit the geeLite GitHub repository.
To install geeLite from GitHub, use the following commands:
The workflow for setting up and using geeLite includes configuration,
data collection, modification, and data analysis.
The first step in using geeLite is setting up a configuration file.
This file specifies the regions of interest, the datasets to collect
from GEE, and other parameters for data collection.
# Define the path for the SQLite database
path <- path_to_db
# Create a configuration for Somalia (SO) and Yemen (YE) to collect NDVI data
set_config(
  path = path,
  regions = c("SO", "YE"),
  source = list(
    "MODIS/061/MOD13A2" = list(
      "NDVI" = c("mean", "sd")
    )
  ),
  resol = 3,
  start = "2020-01-01"
)
This call creates a JSON configuration file under the specified path (config/config.json in the example output further down) that defines the parameters for collecting data from GEE.
Once the configuration is set, you can collect data from GEE using the
run_geelite() function. This function retrieves data based on the
configuration file and stores it in a local SQLite database.
The function will store the collected geospatial data in the SQLite database and log the progress.
If you want to modify the configuration (e.g., to add new statistics
or datasets), you can use the modify_config() function to make updates
without rebuilding the entire database.
# Add more statistics to the NDVI band and include EVI data
modify_config(
  path = path,
  keys = list(
    c("source", "MODIS/061/MOD13A2", "NDVI"),
    c("source", "MODIS/061/MOD13A2", "EVI")
  ),
  new_values = list(
    c("mean", "min", "max"),
    c("mean", "sd")
  )
)
After modifying the configuration, call run_geelite() again to update
the database with the new settings.
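To make the pairing of keys and new_values in the modify_config() call above concrete, here is a minimal base R sketch. It assumes the configuration behaves like a nested list in which each key is a path of names; the cfg object and the set_in() helper are hypothetical illustrations, not geeLite's internal implementation or its actual file layout.

```r
# Toy stand-in for a parsed configuration (assumed structure, for illustration).
cfg <- list(
  source = list(
    "MODIS/061/MOD13A2" = list(NDVI = c("mean", "sd"))
  )
)

# Hypothetical helper: walk down `key` and set the final element to `value`.
set_in <- function(x, key, value) {
  if (length(key) == 1) {
    x[[key]] <- value
  } else {
    child <- x[[key[1]]]
    if (is.null(child)) child <- list()  # create missing intermediate levels
    x[[key[1]]] <- set_in(child, key[-1], value)
  }
  x
}

keys <- list(
  c("source", "MODIS/061/MOD13A2", "NDVI"),
  c("source", "MODIS/061/MOD13A2", "EVI")
)
new_values <- list(
  c("mean", "min", "max"),
  c("mean", "sd")
)

# Apply the updates pairwise: keys[[i]] receives new_values[[i]].
for (i in seq_along(keys)) {
  cfg <- set_in(cfg, keys[[i]], new_values[[i]])
}

str(cfg)
```

In this sketch the first key overwrites an existing entry (the NDVI statistics), while the second creates a new one (EVI), which mirrors the two kinds of change described in the comment of the modify_config() example.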
Once the data has been collected and stored in the database, you can
read and analyze it using the read_db() function. This function allows
you to aggregate data at different frequencies (e.g., daily, monthly)
and apply preprocessing functions.
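As a rough illustration of what aggregating to a monthly frequency means, the following base R sketch averages a toy series by calendar month. The dates and values are made up, and the code only illustrates the idea; it does not reproduce how read_db() performs aggregation internally.

```r
# Toy series with made-up values (not real NDVI data).
obs <- data.frame(
  date  = as.Date(c("2020-01-01", "2020-01-17", "2020-02-02", "2020-02-18")),
  value = c(0.25, 0.75, 0.50, 0.25)
)

# Label each observation with its calendar month, then average within months.
obs$month <- format(obs$date, "%Y-%m")
monthly <- aggregate(value ~ month, data = obs, FUN = mean)

print(monthly)
```

Swapping FUN = mean for another summary function (for example median or max) changes how observations within a month are combined.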
This example demonstrates how to use geeLite to gather NDVI data for
Somalia and Yemen, aggregate it monthly, and visualize the results
using the leaflet package.
# Define the path for the SQLite database
path <- path_to_db
# Set the configuration file for NDVI data collection
set_config(
  path = path,
  regions = c("SO", "YE"),
  source = list(
    "MODIS/061/MOD13A2" = list(
      "NDVI" = c("mean", "sd")
    )
  ),
  resol = 3,
  start = "2020-01-01"
)
#> ℹ Config file generated: 'config/config.json'.
# Collect the data
run_geelite(path = path)
#>
#> ────────────────────────────────────────────────────────────────────────────────
#> geeLite R Package - Version: 1.0.2
#>
#> ────────────────────────────────────────────────────────────────────────────────
#>
#> ── rgee 1.1.7 ─────────────────────────────────────── earthengine-api 0.1.370 ──
#> ✔ User: not defined
#> ✔ Initializing Google Earth Engine: DONE!
#> ✔ Earth Engine account: users/testgeelite
#> ✔ Python path: C:/Users/Marcell/AppData/Local/r-miniconda/envs/rgee/python.exe
#> ────────────────────────────────────────────────────────────────────────────────
#>
#> > Extracting data from Earth Engine...
#> ℹ Database successfully updated: 'data/geelite.db'.
#> ℹ State file updated: 'state/state.json'.
#> ℹ CLI scripts updated: 'cli/R functions'.
# Read the data from the database
db <- read_db(path = path, freq = "month")
Once the data is collected, you can visualize it using the leaflet
package. The following code shows how to plot the mean NDVI values for
each region.
# Load necessary packages
library(leaflet)
#> Warning: package 'leaflet' was built under R version 4.3.3
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
# Read database and merge grid with MODIS data
sf <- merge(db$grid, db$`MODIS/061/MOD13A2/NDVI/mean`, by = "id")
# Select the date to visualize
ndvi <- sf$`2020-01-01`
# Create a color palette function based on the values
color_pal <- colorNumeric(palette = "viridis", domain = ndvi)
# Create the leaflet map
leaflet(data = sf) %>%
  addTiles() %>%                              # Add base tiles
  addPolygons(
    fillColor = color_pal(ndvi),              # Fill color
    color = "#BDBDC3",                        # Border color
    weight = 1,                               # Border weight
    opacity = 1,                              # Border opacity
    fillOpacity = 0.9                         # Fill opacity
  ) %>%
  addScaleBar(position = "bottomleft") %>%    # Add scale bar
  addLegend(
    pal = color_pal,                          # Color palette
    values = ndvi,                            # Data values to map
    title = "Mean NDVI",                      # Legend title
    position = "bottomright"                  # Legend position
  )
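The map above picks a single date column (`2020-01-01`) from the merged table. For time-series work it can be handy to reshape such a wide table, with one column per date, into long format. The sketch below does this in base R on a toy data frame; the ids, values, and the assumption that every non-id column is a date are illustrative only, and a real merged table may carry additional columns (such as geometry) that would need to be excluded first.

```r
# Toy wide table: one row per grid cell, one column per date (made-up values).
wide <- data.frame(
  id = c("a", "b"),
  `2020-01-01` = c(0.21, 0.35),
  `2020-02-01` = c(0.25, 0.31),
  check.names = FALSE
)

# Assumption: every column other than `id` is a date column.
date_cols <- setdiff(names(wide), "id")

# Stack the date columns: ids repeat per date, dates repeat per row.
long <- data.frame(
  id    = rep(wide$id, times = length(date_cols)),
  date  = as.Date(rep(date_cols, each = nrow(wide))),
  value = unlist(wide[date_cols], use.names = FALSE)
)

# Example use: average over all cells for each date.
by_date <- aggregate(value ~ date, data = long, FUN = mean)

print(long)
print(by_date)
```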
Additional features of the geeLite package include a drive mode for
efficiently handling large data requests, as well as command-line
interface (CLI) support for automation and integration with job
scheduling systems like cron.
To efficiently handle large data requests, drive mode exports data in
parallel batches to Google Drive before importing it into your local
SQLite database. Ensure that adequate Google Drive storage is available
before using drive mode.
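The idea of splitting a large request into batches can be sketched in a few lines of base R. This is only a conceptual illustration under assumed names (cell_ids, batch_size); it does not show how geeLite forms or schedules its batches, and it performs no export.

```r
# Hypothetical grid-cell identifiers and batch size (illustrative values).
cell_ids <- sprintf("cell_%02d", 1:10)
batch_size <- 4

# Assign consecutive ids to batches of at most `batch_size` elements.
batches <- split(cell_ids, ceiling(seq_along(cell_ids) / batch_size))

# Number of ids in each batch.
print(lengths(batches))
```

Each element of batches could then be handed to a separate export task, with the last batch holding whatever remains after the full-size batches are filled.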
geeLite provides a CLI that allows you to run the main functions of the
package directly from the command line. This is useful for automating
workflows or integrating geeLite into larger systems.
# Setting the CLI files
Rscript /path/to/geeLite/cli/set_cli.R --path "path/to/db"
# Change directory to where the database will be generated
cd "path/to/db"
# Set up the configuration via CLI
Rscript cli/set_config.R --regions "SO YE" --source "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))" --resol 3 --start "2020-01-01"
# Collecting GEE data via CLI
Rscript cli/run_geelite.R
# Modifying the configuration via CLI
Rscript cli/modify_config.R --keys "list(c('source', 'MODIS/061/MOD13A2', 'NDVI'), c('source', 'MODIS/061/MOD13A2', 'EVI'))" --new_values "list(c('mean', 'min', 'max'), c('mean', 'sd'))"
# Updating the database via CLI
Rscript cli/run_geelite.R
You can automate database updates using the Linux cron job scheduler.
Here’s an example cron job that updates the database monthly: