Type: Package
Title: R Interface for Apache Sedona
Version: 1.8.0
Maintainer: Apache Sedona <private@sedona.apache.org>
Description: R interface for 'Apache Sedona' based on 'sparklyr' (https://sedona.apache.org).
License: Apache License 2.0
URL: https://github.com/apache/sedona/, https://sedona.apache.org/
BugReports: https://github.com/apache/sedona/issues
Depends: R (≥ 3.2)
Imports: rlang, sparklyr (≥ 1.3), dbplyr (≥ 1.1.0), cli, lifecycle
Suggests: dplyr (≥ 0.7.2), knitr, rmarkdown
Encoding: UTF-8
RoxygenNote: 7.3.2
SystemRequirements: 'Apache Spark' 3.x
NeedsCompilation: no
Packaged: 2025-09-13 01:30:04 UTC; jiayu
Author: Apache Sedona [aut, cre], Jia Yu [ctb, cph], Yitao Li [aut, cph]
Repository: CRAN
Date/Publication: 2025-09-13 03:10:02 UTC
apache.sedona: R Interface for Apache Sedona
Description
R interface for 'Apache Sedona' based on 'sparklyr' (https://sedona.apache.org).
Author(s)
Maintainer: Apache Sedona private@sedona.apache.org
Authors:
Yitao Li yitao@rstudio.com [copyright holder]
Other contributors:
Jia Yu jiayu@apache.org [contributor, copyright holder]
The Apache Software Foundation [copyright holder]
RStudio [copyright holder]
See Also
Useful links:
https://github.com/apache/sedona/
https://sedona.apache.org/
Report bugs at https://github.com/apache/sedona/issues
Find the approximate total number of records within a Spatial RDD.
Description
Given a Sedona spatial RDD, find the (possibly approximated) number of total records within it.
Usage
approx_count(x)
Arguments
x: A Sedona spatial RDD.
Value
Approximate number of records within the SpatialRDD.
See Also
Other Spatial RDD aggregation routine: 
minimum_bounding_box()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = input_location, type = "polygon"
  )
  approx_cnt <- approx_count(rdd)
}
Perform a CRS transformation.
Description
Transform data within a spatial RDD from one coordinate reference system to another. Since v1.5.0 this function uses the lon/lat coordinate order; earlier versions used lat/lon.
Usage
crs_transform(x, src_epsg_crs_code, dst_epsg_crs_code, strict = FALSE)
Arguments
x: The spatial RDD to be processed.
src_epsg_crs_code: Coordinate reference system to transform from (e.g., "epsg:4326", "epsg:3857").
dst_epsg_crs_code: Coordinate reference system to transform to (e.g., "epsg:4326", "epsg:3857").
strict: If FALSE (the default), ignore the "Bursa-Wolf Parameters Required" error.
Value
The transformed SpatialRDD.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location, type = "polygon"
  )
  crs_transform(
    rdd,
    src_epsg_crs_code = "epsg:4326", dst_epsg_crs_code = "epsg:3857"
  )
}
Find the minimal bounding box of a geometry.
Description
Given a Sedona spatial RDD, find the axis-aligned minimal bounding box of the geometry represented by the RDD.
Usage
minimum_bounding_box(x)
Arguments
x: A Sedona spatial RDD.
Value
A minimum bounding box object.
See Also
Other Spatial RDD aggregation routine: 
approx_count()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = input_location, type = "polygon"
  )
  boundary <- minimum_bounding_box(rdd)
}
Construct a bounding box object.
Description
Construct an axis-aligned rectangular bounding box object.
Usage
new_bounding_box(sc, min_x = -Inf, max_x = Inf, min_y = -Inf, max_y = Inf)
Arguments
sc: The Spark connection.
min_x: Minimum x-value of the bounding box; can be +/- Inf.
max_x: Maximum x-value of the bounding box; can be +/- Inf.
min_y: Minimum y-value of the bounding box; can be +/- Inf.
max_y: Maximum y-value of the bounding box; can be +/- Inf.
Value
A bounding box object.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
bb <- new_bounding_box(sc, -1, 1, -1, 1)
Import data from a spatial RDD into a Spark Dataframe.
Description
Import data from a spatial RDD (possibly with non-spatial attributes) into a Spark Dataframe.
sdf_register: method for sparklyr's sdf_register() to handle spatial RDDs
as.spark.dataframe: lower-level function with more fine-grained control over non-spatial columns
Usage
## S3 method for class 'spatial_rdd'
sdf_register(x, name = NULL)
as.spark.dataframe(x, non_spatial_cols = NULL, name = NULL)
Arguments
x: A spatial RDD.
name: Name to assign to the resulting Spark temporary view. If unspecified, a random name is assigned.
non_spatial_cols: Column names for non-spatial attributes in the resulting Spark Dataframe. By default (NULL) all field names are imported if that property exists, in particular for shapefiles.
Value
A Spark Dataframe containing the imported spatial data.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  sdf <- sdf_register(rdd)
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L,
    repartition = 5
  )
  sdf <- as.spark.dataframe(rdd, non_spatial_cols = c("attr1", "attr2"))
}
Apply a spatial partitioner to a Sedona spatial RDD.
Description
Given a Sedona spatial RDD, partition its content using a spatial partitioner.
Usage
sedona_apply_spatial_partitioner(
  rdd,
  partitioner = c("quadtree", "kdbtree"),
  max_levels = NULL
)
Arguments
rdd: The spatial RDD to be partitioned.
partitioner: The name of a grid type to use (currently "quadtree" and "kdbtree" are supported) or an existing spatial partitioner object.
max_levels: Maximum number of levels in the partitioning tree data structure. If NULL (default), use the current number of partitions within rdd as the maximum number of levels.
Value
A spatially partitioned SpatialRDD.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L
  )
  sedona_apply_spatial_partitioner(rdd, partitioner = "kdbtree")
}
Build an index on a Sedona spatial RDD.
Description
Given a Sedona spatial RDD, build the type of index specified on each of its partition(s).
Usage
sedona_build_index(
  rdd,
  type = c("quadtree", "rtree"),
  index_spatial_partitions = TRUE
)
Arguments
rdd: The spatial RDD to be indexed.
type: The type of index to build. Currently "quadtree" and "rtree" are supported.
index_spatial_partitions: If the RDD is already partitioned using a spatial partitioner, then index each spatial partition within the RDD instead of partitions within the raw RDD associated with the underlying spatial data source. Default: TRUE. Note this option is irrelevant if the input RDD has not been partitioned with a spatial partitioner yet.
Value
A spatial index object.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  sedona_build_index(rdd, type = "rtree")
}
Query the k nearest spatial objects.
Description
Given a spatial RDD, a query object x, and an integer k, find the k
nearest spatial objects within the RDD from x (distance between
x and another geometrical object will be measured by the minimum
possible length of any line segment connecting those 2 objects).
Usage
sedona_knn_query(
  rdd,
  x,
  k,
  index_type = c("quadtree", "rtree"),
  result_type = c("rdd", "sdf", "raw")
)
Arguments
rdd: A Sedona spatial RDD.
x: The query object.
k: Number of nearest spatial objects to return.
index_type: Index to use to facilitate the KNN query. If NULL, then do not build any additional spatial index on top of rdd.
result_type: Type of result to return. If "rdd" (default), the k nearest objects are returned as a Sedona spatial RDD. If "sdf", a Spark dataframe containing the k nearest objects is returned. If "raw", a list of the k nearest objects is returned, with each element being a reference to a geometry object on the JVM.
Value
The KNN query result.
See Also
Other Sedona spatial query: 
sedona_range_query()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  knn_query_pt_x <- -84.01
  knn_query_pt_y <- 34.01
  knn_query_pt_tbl <- sdf_sql(
    sc,
    sprintf(
      "SELECT ST_GeomFromText(\"POINT(%f %f)\") AS `pt`",
      knn_query_pt_x,
      knn_query_pt_y
    )
  ) %>%
      collect()
  knn_query_pt <- knn_query_pt_tbl$pt[[1]]
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  knn_result_sdf <- sedona_knn_query(
    rdd,
    x = knn_query_pt, k = 3, index_type = "rtree", result_type = "sdf"
  )
}
Execute a range query.
Description
Given a spatial RDD and a query object x, find all spatial objects
within the RDD that are covered by x or intersect x.
Usage
sedona_range_query(
  rdd,
  x,
  query_type = c("cover", "intersect"),
  index_type = c("quadtree", "rtree"),
  result_type = c("rdd", "sdf", "raw")
)
Arguments
rdd: A Sedona spatial RDD.
x: The query object.
query_type: Type of spatial relationship involved in the query. Currently "cover" and "intersect" are supported.
index_type: Index to use to facilitate the range query. If NULL, then do not build any additional spatial index on top of rdd.
result_type: Type of result to return. If "rdd" (default), the matching objects are returned as a Sedona spatial RDD. If "sdf", a Spark dataframe containing the matching objects is returned. If "raw", a list of matching objects is returned, with each element being a reference to a geometry object on the JVM.
Value
The range query result.
See Also
Other Sedona spatial query: 
sedona_knn_query()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  range_query_min_x <- -87
  range_query_max_x <- -50
  range_query_min_y <- 34
  range_query_max_y <- 54
  geom_factory <- invoke_new(
    sc,
    "org.locationtech.jts.geom.GeometryFactory"
  )
  range_query_polygon <- invoke_new(
    sc,
    "org.locationtech.jts.geom.Envelope",
    range_query_min_x,
    range_query_max_x,
    range_query_min_y,
    range_query_max_y
  ) %>%
    invoke(geom_factory, "toGeometry", .)
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = input_location,
    type = "polygon"
  )
  range_query_result_sdf <- sedona_range_query(
    rdd,
    x = range_query_polygon,
    query_type = "intersect",
    index_type = "rtree",
    result_type = "sdf"
  )
}
Create a typed SpatialRDD from a delimiter-separated values data source.
Description
Create a typed SpatialRDD (namely, a PointRDD, a PolygonRDD, or a LineStringRDD) from a data source containing delimiter-separated values. The data source can contain spatial attributes (e.g., longitude and latitude) as well as other attributes. Currently only inputs whose spatial attributes occupy a contiguous range of columns (i.e., [first_spatial_col_index, last_spatial_col_index]) are supported.
Usage
sedona_read_dsv_to_typed_rdd(
  sc,
  location,
  delimiter = c(",", "\t", "?", "'", "\"", "_", "-", "%", "~", "|", ";"),
  type = c("point", "polygon", "linestring"),
  first_spatial_col_index = 0L,
  last_spatial_col_index = NULL,
  has_non_spatial_attrs = TRUE,
  storage_level = "MEMORY_ONLY",
  repartition = 1L
)
Arguments
sc: A spark_connection.
location: Location of the data source.
delimiter: Delimiter within each record. Must be one of ',', '\t', '?', "'", '"', '_', '-', '%', '~', '|', ';'.
type: Type of the SpatialRDD (must be one of "point", "polygon", or "linestring").
first_spatial_col_index: Zero-based index of the left-most column containing spatial attributes (default: 0).
last_spatial_col_index: Zero-based index of the right-most column containing spatial attributes (default: NULL). Note last_spatial_col_index does not need to be specified when creating a PointRDD, because it automatically takes the implied value of (first_spatial_col_index + 1). For all other types of RDDs, if last_spatial_col_index is unspecified, it assumes the value of -1 (i.e., the last of all input columns).
has_non_spatial_attrs: Whether the input contains non-spatial attributes.
storage_level: Storage level of the RDD (default: MEMORY_ONLY).
repartition: The minimum number of partitions to have in the resulting RDD (default: 1).
Value
A typed SpatialRDD.
See Also
Other Sedona RDD data interface functions: 
sedona_read_geojson(),
sedona_read_shapefile_to_typed_rdd(),
sedona_save_spatial_rdd(),
sedona_write_wkb()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your csv file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L
  )
}
Read geospatial data into a Spatial RDD
Description
Import spatial object from an external data source into a Sedona SpatialRDD.
sedona_read_shapefile: from a shapefile
sedona_read_geojson: from a GeoJSON file
sedona_read_wkt: from a file containing WKT strings
sedona_read_wkb: from a file containing WKB data
Usage
sedona_read_geojson(
  sc,
  location,
  allow_invalid_geometries = TRUE,
  skip_syntactically_invalid_geometries = TRUE,
  storage_level = "MEMORY_ONLY",
  repartition = 1L
)
sedona_read_wkb(
  sc,
  location,
  wkb_col_idx = 0L,
  allow_invalid_geometries = TRUE,
  skip_syntactically_invalid_geometries = TRUE,
  storage_level = "MEMORY_ONLY",
  repartition = 1L
)
sedona_read_wkt(
  sc,
  location,
  wkt_col_idx = 0L,
  allow_invalid_geometries = TRUE,
  skip_syntactically_invalid_geometries = TRUE,
  storage_level = "MEMORY_ONLY",
  repartition = 1L
)
sedona_read_shapefile(sc, location, storage_level = "MEMORY_ONLY")
Arguments
sc: A spark_connection.
location: Location of the data source.
allow_invalid_geometries: Whether to allow topology-invalid geometries to exist in the resulting RDD.
skip_syntactically_invalid_geometries: Whether to allow Sedona to automatically skip syntax-invalid geometries rather than throwing errors.
storage_level: Storage level of the RDD (default: MEMORY_ONLY).
repartition: The minimum number of partitions to have in the resulting RDD (default: 1).
wkb_col_idx: Zero-based index of the column containing hex-encoded WKB data (default: 0).
wkt_col_idx: Zero-based index of the column containing WKT data (default: 0).
Value
A SpatialRDD.
See Also
Other Sedona RDD data interface functions: 
sedona_read_dsv_to_typed_rdd(),
sedona_read_shapefile_to_typed_rdd(),
sedona_save_spatial_rdd(),
sedona_write_wkb()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_geojson(sc, location = input_location)
}
(Deprecated) Create a typed SpatialRDD from a shapefile or geojson data source.
Description
Constructors of typed RDDs (PointRDD, PolygonRDD, LineStringRDD) are soft-deprecated; use the non-typed versions instead.
Create a typed SpatialRDD (namely, a PointRDD, a PolygonRDD, or a LineStringRDD):
sedona_read_shapefile_to_typed_rdd: from a shapefile data source
sedona_read_geojson_to_typed_rdd: from a GeoJSON data source
Usage
sedona_read_shapefile_to_typed_rdd(
  sc,
  location,
  type = c("point", "polygon", "linestring"),
  storage_level = "MEMORY_ONLY"
)
sedona_read_geojson_to_typed_rdd(
  sc,
  location,
  type = c("point", "polygon", "linestring"),
  has_non_spatial_attrs = TRUE,
  storage_level = "MEMORY_ONLY",
  repartition = 1L
)
Arguments
sc: A spark_connection.
location: Location of the data source.
type: Type of the SpatialRDD (must be one of "point", "polygon", or "linestring").
storage_level: Storage level of the RDD (default: MEMORY_ONLY).
has_non_spatial_attrs: Whether the input contains non-spatial attributes.
repartition: The minimum number of partitions to have in the resulting RDD (default: 1).
Value
A typed SpatialRDD.
See Also
Other Sedona RDD data interface functions: 
sedona_read_dsv_to_typed_rdd(),
sedona_read_geojson(),
sedona_save_spatial_rdd(),
sedona_write_wkb()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your shapefile
  rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = input_location, type = "polygon"
  )
}
Visualize a Sedona spatial RDD using a choropleth map.
Description
Generate a choropleth map of a pair RDD assigning integral values to polygons.
Usage
sedona_render_choropleth_map(
  pair_rdd,
  resolution_x,
  resolution_y,
  output_location,
  output_format = c("png", "gif", "svg"),
  boundary = NULL,
  color_of_variation = c("red", "green", "blue"),
  base_color = c(0, 0, 0),
  shade = TRUE,
  reverse_coords = FALSE,
  overlay = NULL,
  browse = interactive()
)
Arguments
pair_rdd: A pair RDD with Sedona Polygon objects as keys and java.lang.Long values.
resolution_x: Resolution on the x-axis.
resolution_y: Resolution on the y-axis.
output_location: Location of the output image. This should be the desired path of the image file excluding the extension in its file name.
output_format: File format of the output image. Currently "png", "gif", and "svg" formats are supported (default: "png").
boundary: Only render data within the given rectangular boundary. The boundary can be specified as a numeric vector c(min_x, max_x, min_y, max_y) or as a bounding box object (default: NULL).
color_of_variation: Which color channel will vary depending on values of data points. Must be one of "red", "green", or "blue". Default: "red".
base_color: Color of any data point with value 0. Must be a numeric vector of length 3 specifying values for the red, green, and blue channels. Default: c(0, 0, 0).
shade: Whether data points with larger magnitudes are displayed with darker colors. Default: TRUE.
reverse_coords: Whether to reverse spatial coordinates in the plot (default: FALSE).
overlay: An optional overlay image to be displayed on top of the rendered output (default: NULL).
browse: Whether to open the rendered image in a browser (default: interactive()).
Value
No return value.
See Also
Other Sedona visualization routines: 
sedona_render_heatmap(),
sedona_render_scatter_plot()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  pt_input_location <- "/dev/null" # replace it with the path to your input file
  pt_rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = pt_input_location,
    type = "point",
    first_spatial_col_index = 1
  )
  polygon_input_location <- "/dev/null" # replace it with the path to your input file
  polygon_rdd <- sedona_read_geojson_to_typed_rdd(
    sc,
    location = polygon_input_location,
    type = "polygon"
  )
  join_result_rdd <- sedona_spatial_join_count_by_key(
    pt_rdd,
    polygon_rdd,
    join_type = "intersect",
    partitioner = "quadtree"
  )
  sedona_render_choropleth_map(
    join_result_rdd,
    400,
    200,
    output_location = tempfile("choropleth-map-"),
    boundary = c(-86.8, -86.6, 33.4, 33.6),
    base_color = c(255, 255, 255)
  )
}
Visualize a Sedona spatial RDD using a heatmap.
Description
Generate a heatmap of geometrical object(s) within a Sedona spatial RDD.
Usage
sedona_render_heatmap(
  rdd,
  resolution_x,
  resolution_y,
  output_location,
  output_format = c("png", "gif", "svg"),
  boundary = NULL,
  blur_radius = 10L,
  overlay = NULL,
  browse = interactive()
)
Arguments
rdd: A Sedona spatial RDD.
resolution_x: Resolution on the x-axis.
resolution_y: Resolution on the y-axis.
output_location: Location of the output image. This should be the desired path of the image file excluding the extension in its file name.
output_format: File format of the output image. Currently "png", "gif", and "svg" formats are supported (default: "png").
boundary: Only render data within the given rectangular boundary. The boundary can be specified as a numeric vector c(min_x, max_x, min_y, max_y) or as a bounding box object (default: NULL).
blur_radius: Controls the radius of the Gaussian blur in the resulting heatmap.
overlay: An optional overlay image to be displayed on top of the rendered output (default: NULL).
browse: Whether to open the rendered image in a browser (default: interactive()).
Value
No return value.
See Also
Other Sedona visualization routines: 
sedona_render_choropleth_map(),
sedona_render_scatter_plot()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    type = "point"
  )
  sedona_render_heatmap(
    rdd,
    resolution_x = 800,
    resolution_y = 600,
    output_location = tempfile("points-"),
    output_format = "png",
    boundary = c(-91, -84, 30, 35),
    blur_radius = 10
  )
}
Visualize a Sedona spatial RDD using a scatter plot.
Description
Generate a scatter plot of geometrical object(s) within a Sedona spatial RDD.
Usage
sedona_render_scatter_plot(
  rdd,
  resolution_x,
  resolution_y,
  output_location,
  output_format = c("png", "gif", "svg"),
  boundary = NULL,
  color_of_variation = c("red", "green", "blue"),
  base_color = c(0, 0, 0),
  shade = TRUE,
  reverse_coords = FALSE,
  overlay = NULL,
  browse = interactive()
)
Arguments
rdd: A Sedona spatial RDD.
resolution_x: Resolution on the x-axis.
resolution_y: Resolution on the y-axis.
output_location: Location of the output image. This should be the desired path of the image file excluding the extension in its file name.
output_format: File format of the output image. Currently "png", "gif", and "svg" formats are supported (default: "png").
boundary: Only render data within the given rectangular boundary. The boundary can be specified as a numeric vector c(min_x, max_x, min_y, max_y) or as a bounding box object (default: NULL).
color_of_variation: Which color channel will vary depending on values of data points. Must be one of "red", "green", or "blue". Default: "red".
base_color: Color of any data point with value 0. Must be a numeric vector of length 3 specifying values for the red, green, and blue channels. Default: c(0, 0, 0).
shade: Whether data points with larger magnitudes are displayed with darker colors. Default: TRUE.
reverse_coords: Whether to reverse spatial coordinates in the plot (default: FALSE).
overlay: An optional overlay image to be displayed on top of the rendered output (default: NULL).
browse: Whether to open the rendered image in a browser (default: interactive()).
Value
No return value.
See Also
Other Sedona visualization routines: 
sedona_render_choropleth_map(),
sedona_render_heatmap()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    type = "point"
  )
  sedona_render_scatter_plot(
    rdd,
    resolution_x = 800,
    resolution_y = 600,
    output_location = tempfile("points-"),
    output_format = "png",
    boundary = c(-91, -84, 30, 35)
  )
}
Save a Spark dataframe containing exactly 1 spatial column into a file.
Description
Export serialized data from a Spark dataframe containing exactly 1 spatial column into a file.
Usage
sedona_save_spatial_rdd(
  x,
  spatial_col,
  output_location,
  output_format = c("wkb", "wkt", "geojson")
)
Arguments
x: A Spark dataframe object in sparklyr or a dplyr expression representing a Spark SQL query.
spatial_col: The name of the spatial column.
output_location: Location of the output file.
output_format: Format of the output.
Value
No return value.
See Also
Other Sedona RDD data interface functions: 
sedona_read_dsv_to_typed_rdd(),
sedona_read_geojson(),
sedona_read_shapefile_to_typed_rdd(),
sedona_write_wkb()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  tbl <- dplyr::tbl(
    sc,
    dplyr::sql("SELECT ST_GeomFromText('POINT(-71.064544 42.28787)') AS `pt`")
  )
  sedona_save_spatial_rdd(
    tbl %>% dplyr::mutate(id = 1),
    spatial_col = "pt",
    output_location = "/tmp/pts.wkb",
    output_format = "wkb"
  )
}
Perform a spatial join operation on two Sedona spatial RDDs.
Description
Given spatial_rdd and query_window_rdd, return a pair RDD containing all
pairs of geometrical elements (p, q) such that p is an element of
spatial_rdd, q is an element of query_window_rdd, and (p, q) satisfies
the spatial relation specified by join_type.
Usage
sedona_spatial_join(
  spatial_rdd,
  query_window_rdd,
  join_type = c("contain", "intersect"),
  partitioner = c("quadtree", "kdbtree"),
  index_type = c("quadtree", "rtree")
)
Arguments
spatial_rdd: Spatial RDD containing geometries to be queried.
query_window_rdd: Spatial RDD containing the query window(s).
join_type: Type of the join query (must be either "contain" or "intersect"). If "contain", a pair (p, q) matches only when q fully contains p; if "intersect", it matches when p and q intersect.
partitioner: Spatial partitioning to apply to both spatial_rdd and query_window_rdd (must be either "quadtree" or "kdbtree").
index_type: Controls how the partitioned RDDs are indexed to accelerate the join (currently "quadtree" and "rtree" are supported).
Value
A spatial RDD containing the join result.
See Also
Other Sedona spatial join operator: 
sedona_spatial_join_count_by_key()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L
  )
  query_rdd_input_location <- "/dev/null" # replace it with the path to your input file
  query_rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = query_rdd_input_location,
    type = "polygon"
  )
  join_result_rdd <- sedona_spatial_join(
    rdd,
    query_rdd,
    join_type = "intersect",
    partitioner = "quadtree"
  )
}
Perform a spatial count-by-key operation based on two Sedona spatial RDDs.
Description
For each element p from spatial_rdd, count the number of unique elements q
from query_window_rdd such that (p, q) satisfies the spatial relation
specified by join_type.
Usage
sedona_spatial_join_count_by_key(
  spatial_rdd,
  query_window_rdd,
  join_type = c("contain", "intersect"),
  partitioner = c("quadtree", "kdbtree"),
  index_type = c("quadtree", "rtree")
)
Arguments
spatial_rdd: Spatial RDD containing geometries to be queried.
query_window_rdd: Spatial RDD containing the query window(s).
join_type: Type of the join query (must be either "contain" or "intersect"). If "contain", a pair (p, q) matches only when q fully contains p; if "intersect", it matches when p and q intersect.
partitioner: Spatial partitioning to apply to both spatial_rdd and query_window_rdd (must be either "quadtree" or "kdbtree").
index_type: Controls how the partitioned RDDs are indexed to accelerate the join (currently "quadtree" and "rtree" are supported).
Value
A spatial RDD containing the join-count-by-key results.
See Also
Other Sedona spatial join operator: 
sedona_spatial_join()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L
  )
  query_rdd_input_location <- "/dev/null" # replace it with the path to your input file
  query_rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = query_rdd_input_location,
    type = "polygon"
  )
  join_result_rdd <- sedona_spatial_join_count_by_key(
    rdd,
    query_rdd,
    join_type = "intersect",
    partitioner = "quadtree"
  )
}
Spatial RDD aggregation routine
Description
Function extracting aggregate statistics from a Sedona spatial RDD.
Arguments
x: A Sedona spatial RDD.
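Examples
Both aggregation routines in this family can be applied to the same spatial RDD once it is loaded; a minimal sketch following the connection pattern used throughout these examples (the master URL and input path are placeholders to replace with your own):

```r
library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = input_location, type = "polygon"
  )
  # Both aggregation routines operate on the same spatial RDD
  approx_cnt <- approx_count(rdd)
  boundary <- minimum_bounding_box(rdd)
}
```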
Create a SpatialRDD from an external data source.
Description
Import spatial object from an external data source into a Sedona SpatialRDD.
Arguments
sc: A spark_connection.
location: Location of the data source.
type: Type of the SpatialRDD (must be one of "point", "polygon", or "linestring").
has_non_spatial_attrs: Whether the input contains non-spatial attributes.
storage_level: Storage level of the RDD (default: MEMORY_ONLY).
repartition: The minimum number of partitions to have in the resulting RDD (default: 1).
Visualization routine for Sedona spatial RDD.
Description
Generate a visual representation of geometrical object(s) within a Sedona spatial RDD.
Arguments
rdd: A Sedona spatial RDD.
resolution_x: Resolution on the x-axis.
resolution_y: Resolution on the y-axis.
output_location: Location of the output image. This should be the desired path of the image file excluding the extension in its file name.
output_format: File format of the output image. Currently "png", "gif", and "svg" formats are supported (default: "png").
boundary: Only render data within the given rectangular boundary. The boundary can be specified as a numeric vector c(min_x, max_x, min_y, max_y) or as a bounding box object (default: NULL).
color_of_variation: Which color channel will vary depending on values of data points. Must be one of "red", "green", or "blue". Default: "red".
base_color: Color of any data point with value 0. Must be a numeric vector of length 3 specifying values for the red, green, and blue channels. Default: c(0, 0, 0).
shade: Whether data points with larger magnitudes are displayed with darker colors. Default: TRUE.
overlay: An optional overlay image to be displayed on top of the rendered output (default: NULL).
browse: Whether to open the rendered image in a browser (default: interactive()).
Write SpatialRDD into a file.
Description
Export serialized data from a Sedona SpatialRDD into a file.
sedona_write_wkb: export data in WKB format
sedona_write_wkt: export data in WKT format
sedona_write_geojson: export data in GeoJSON format
Usage
sedona_write_wkb(x, output_location)
sedona_write_wkt(x, output_location)
sedona_write_geojson(x, output_location)
Arguments
x | 
 The SpatialRDD object.  | 
output_location | 
 Location of the output file.  | 
Value
No return value.
See Also
Other Sedona RDD data interface functions: 
sedona_read_dsv_to_typed_rdd(),
sedona_read_geojson(),
sedona_read_shapefile_to_typed_rdd(),
sedona_save_spatial_rdd()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_wkb(
    sc,
    location = input_location,
    wkb_col_idx = 0L
  )
  sedona_write_wkb(rdd, "/tmp/wkb_output.tsv")
}
Read geospatial data into a Spark DataFrame.
Description
These functions are deprecated and will be removed in a future release. Sedona now
implements readers as Spark DataFrame sources, so you can use spark_read_source
with the appropriate source ("shapefile", "geojson", "geoparquet") to read geospatial data.
Functions to read geospatial data from a variety of formats into Spark DataFrames.
- spark_read_shapefile: from a shapefile
- spark_read_geojson: from a GeoJSON file
- spark_read_geoparquet: from a GeoParquet file
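The recommended replacement can be sketched with sparklyr's spark_read_source; the table name and input path below are placeholders, and the options accepted by each Sedona source should be checked against the Sedona documentation:

```r
library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  sdf <- spark_read_source(
    sc,
    name = "my_geodata",
    path = input_location,
    source = "geoparquet" # or "shapefile", "geojson"
  )
}
```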
Usage
spark_read_shapefile(sc, name = NULL, path = name, options = list(), ...)
spark_read_geojson(
  sc,
  name = NULL,
  path = name,
  options = list(),
  repartition = 0,
  memory = TRUE,
  overwrite = TRUE
)
spark_read_geoparquet(
  sc,
  name = NULL,
  path = name,
  options = list(),
  repartition = 0,
  memory = TRUE,
  overwrite = TRUE
)
Arguments
sc | 
 A spark_connection.  | 
name | 
 The name to assign to the newly generated table.  | 
path | 
 The path to the file. Needs to be accessible from the cluster. Supports the ‘"hdfs://"’, ‘"s3a://"’ and ‘"file://"’ protocols.  | 
options | 
 A list of strings with additional options. See https://spark.apache.org/docs/latest/sql-programming-guide.html#configuration.  | 
... | 
 Optional arguments; currently unused.  | 
repartition | 
 The number of partitions used to distribute the generated table. Use 0 (the default) to avoid partitioning.  | 
memory | 
 Boolean; should the data be loaded eagerly into memory? (That is, should the table be cached?)  | 
overwrite | 
 Boolean; overwrite the table with the given name if it already exists?  | 
Value
A tbl
See Also
Other Sedona DF data interface functions: 
spark_write_geojson()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  sdf <- spark_read_shapefile(sc, path = input_location)
}
Write geospatial data from a Spark DataFrame.
Description
These functions are deprecated and will be removed in a future release. Sedona now
implements writers as Spark DataFrame sources, so you can use spark_write_source
with the appropriate source ("shapefile", "geojson", "geoparquet") to write geospatial data.
Functions to write geospatial data into a variety of formats from Spark DataFrames.
- spark_write_geojson: to GeoJSON
- spark_write_geoparquet: to GeoParquet
- spark_write_raster: to raster tiles after using RS output functions (RS_AsXXX)
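The recommended replacement can be sketched with sparklyr's spark_write_source, where the output path is passed through options; the path below is a placeholder:

```r
library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  tbl <- dplyr::tbl(
    sc,
    dplyr::sql("SELECT ST_GeomFromText('POINT(-71.064544 42.28787)') AS `pt`")
  )
  spark_write_source(
    tbl,
    source = "geoparquet", # or "geojson"
    mode = "overwrite",
    options = list(path = "/tmp/pts_geoparquet")
  )
}
```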
Usage
spark_write_geojson(
  x,
  path,
  mode = NULL,
  options = list(),
  partition_by = NULL,
  ...
)
spark_write_geoparquet(
  x,
  path,
  mode = NULL,
  options = list(),
  partition_by = NULL,
  ...
)
spark_write_raster(
  x,
  path,
  mode = NULL,
  options = list(),
  partition_by = NULL,
  ...
)
Arguments
x | 
 A Spark DataFrame or dplyr operation  | 
path | 
 The path to the file. Needs to be accessible from the cluster. Supports the ‘"hdfs://"’, ‘"s3a://"’ and ‘"file://"’ protocols.  | 
mode | 
 A character element. Specifies the behavior when data or a table already exists. Supported values include: 'error', 'append', 'overwrite' and 'ignore'. Notice that 'overwrite' will also change the column structure. For more details see also https://spark.apache.org/docs/latest/sql-programming-guide.html#save-modes for your version of Spark.  | 
options | 
 A list of strings with additional options.  | 
partition_by | 
 A character vector. Partitions the output by the given columns on the file system.  | 
... | 
 Optional arguments; currently unused.  | 
See Also
Other Sedona DF data interface functions: 
spark_read_shapefile()
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  tbl <- dplyr::tbl(
    sc,
    dplyr::sql("SELECT ST_GeomFromText('POINT(-71.064544 42.28787)') AS `pt`")
  )
  spark_write_geojson(
    tbl %>% dplyr::mutate(id = 1),
    path = "/tmp/pts.geojson"
  )
}
Spatial join operator
Description
R interface for a Sedona spatial join operator
Arguments
spatial_rdd | 
 Spatial RDD containing geometries to be queried.  | 
query_window_rdd | 
 Spatial RDD containing the query window(s).  | 
join_type | 
 Type of the join query (must be either "contain" or
"intersect").
If "contain", then a geometry from spatial_rdd matches a geometry from query_window_rdd if and only if the former is fully contained in the latter; if "intersect", the two match if and only if they intersect.  | 
partitioner | 
 The spatial partitioning to apply to both spatial_rdd and query_window_rdd to facilitate the join query (must be either "quadtree" or "kdbtree").  | 
index_type | 
 Controls how spatial_rdd will be indexed. If NULL, then no spatial index is built; otherwise an index of the specified type ("quadtree" or "rtree") is built on spatial_rdd to facilitate the join.  | 
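A join of a point RDD against a polygon RDD might be sketched as follows (the sedona_spatial_join function name and the reader signatures are assumptions to check against your installed version):

```r
library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  # Replace both locations with paths to your input files.
  point_rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = "/dev/null",
    delimiter = ",",
    type = "point"
  )
  polygon_rdd <- sedona_read_shapefile_to_typed_rdd(
    sc,
    location = "/dev/null",
    type = "polygon"
  )
  joined_rdd <- sedona_spatial_join(
    point_rdd,
    polygon_rdd,
    join_type = "intersect",
    partitioner = "kdbtree"
  )
}
```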
Execute a spatial query
Description
Given a spatial RDD, run a spatial query parameterized by a spatial object
x.
Arguments
rdd | 
 A Sedona spatial RDD.  | 
x | 
 The query object.  | 
index_type | 
 Index to use to facilitate the KNN query. If NULL, then
do not build any additional spatial index on top of rdd.  | 
result_type | 
 Type of result to return.
If "rdd" (default), then the k nearest objects will be returned in a Sedona
spatial RDD.
If "sdf", then a Spark dataframe containing the k nearest objects will be
returned.
If "raw", then a list of k nearest objects will be returned. Each element
within this list will be a JVM object of type
org.locationtech.jts.geom.Geometry.  | 
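A k-nearest-neighbours query might be sketched as follows (the sedona_knn_query name and the way the query point is materialized with sdf_sql are assumptions to verify against your installed version):

```r
library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point"
  )
  # Materialize a JTS point on the R side to serve as the query object.
  query_pt <- dplyr::collect(
    sdf_sql(sc, "SELECT ST_GeomFromText('POINT(-84.01 34.01)') AS `pt`")
  )$pt[[1]]
  knn_result <- sedona_knn_query(
    rdd,
    query_pt,
    k = 3,
    result_type = "sdf"
  )
}
```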
Export a Spark SQL query with a spatial column into a Sedona spatial RDD.
Description
Given a Spark dataframe object or a dplyr expression encapsulating a Spark SQL query, build a Sedona spatial RDD that will encapsulate the same query or data source. The input should contain exactly one spatial column and all other non-spatial columns will be treated as custom user-defined attributes in the resulting spatial RDD.
Usage
to_spatial_rdd(x, spatial_col)
Arguments
x | 
 A Spark dataframe object in sparklyr or a dplyr expression representing a Spark SQL query.  | 
spatial_col | 
 The name of the spatial column.  | 
Value
A SpatialRDD encapsulating the query.
Examples
library(sparklyr)
library(apache.sedona)
sc <- spark_connect(master = "spark://HOST:PORT")
if (!inherits(sc, "test_connection")) {
  tbl <- dplyr::tbl(
    sc,
    dplyr::sql("SELECT ST_GeomFromText('POINT(-71.064544 42.28787)') AS `pt`")
  )
  rdd <- to_spatial_rdd(tbl, "pt")
}