edr4r

R-CMD-check | Codecov test coverage | Lifecycle: experimental | License: MIT

An R client for any service that speaks OGC API - Environmental Data Retrieval (EDR). The spec is general, but in practice this package gets the most use against in-situ monitoring networks — stream gauges, weather stations, snow telemetry, reservoir telemetry — that expose their stations and time series through EDR.

Two known-good places to point it:

- WWDH: https://api.wwdh.internetofwater.app
- USGS: https://api.waterdata.usgs.gov/ogcapi/beta

The goal is to take the tedious parts of EDR off your hands (URL construction, comma-separated parameter lists, WKT coordinate encoding, retries, content negotiation) and hand back something you can actually analyze: tidy tibbles from CoverageJSON, sf objects from GeoJSON, and ggplot2 plots or leaflet maps for a quick look.

Installation

# from GitHub (recommended)
# install.packages("pak")
pak::pak("ksonda/edr4r")

# or
# install.packages("remotes")
remotes::install_github("ksonda/edr4r")

For local development:

git clone https://github.com/ksonda/edr4r.git
cd edr4r
R -e 'devtools::install()'

Requires R >= 4.1. The sf package is optional but recommended (used to turn location lists and GeoJSON into spatial objects).

The package vignettes that demonstrate live USGS and WWDH services are precomputed and packaged as offline snapshots. This keeps installation and R CMD check deterministic when a remote provider is unavailable or returns an intermittent server error.

Quick start

Start by pointing a client at a server. The base URL is the only thing it really needs:

library(edr4r)

client <- edr_client("https://api.wwdh.internetofwater.app")
# or "https://api.waterdata.usgs.gov/ogcapi/beta"
# or "http://localhost:5005" if you're running pygeoapi locally

edr_collections(client)[, c("id", "title", "data_queries")]
#> # A tibble: N × 3
#>   id         title                                             data_queries
#>   <chr>      <chr>                                             <list>
#> 1 rise-edr   USBR Reclamation Information Sharing Environment  <chr [4]>
#> 2 snotel-edr USDA Snowpack Telemetry Network (SNOTEL)           <chr [4]>
#> ...

Every server advertises its own collection IDs. The first thing to do against a new service is run edr_collections() and read the data_queries column to see which EDR endpoints each collection supports. Use edr_parameters() for requestable variables and, on services that advertise cross-collection concept mappings, edr_parameter_groups().

Find stations

edr_locations() returns the service’s locations response as GeoJSON. Some providers paginate large responses, so inspect the provider’s links and use limit = where appropriate. If you have sf installed, GeoJSON gets promoted to an sf object automatically:

locs <- edr_locations(client, "rise-edr")
locs                            # sf POINTs with station attributes
plot(sf::st_geometry(locs))

Pull a time series for one station

Once you know a station ID, ask for its values. The server returns CoverageJSON; covjson_to_tibble() flattens its axis/range arrays into tidy rows:

resp <- edr_location(
  client, "rise-edr",
  location_id    = "3514",
  datetime       = "2023-01-01/2023-01-31",
  parameter_name = "3"
)

df <- covjson_to_tibble(resp)
df
#> # A tibble: 31 × 9
#>   coverage_id parameter   parameter_label  unit  datetime                x     y     z value
#>   <chr>       <chr>       <chr>            <chr> <dttm>              <dbl> <dbl> <dbl> <dbl>
#> 1 1           3           Daily Lake/Res…   af    2023-01-01 07:00:00 -115.  36.0    NA   ...
#> ...
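
To make "flattening" concrete, here is a minimal base-R sketch of the idea on a hand-written PointSeries. It is not the package's implementation: the cov list and the flatten_point_series() helper are made up for illustration, and real CoverageJSON has more cases to handle (grids, several coverages, units, labels).

```r
# A hand-written, minimal CoverageJSON-like PointSeries (not real server output).
cov <- list(
  domain = list(
    domainType = "PointSeries",
    axes = list(
      x = list(values = list(-114.74)),
      y = list(values = list(36.02)),
      t = list(values = list("2023-01-01T07:00:00Z", "2023-01-02T07:00:00Z"))
    )
  ),
  ranges = list(
    "3" = list(values = list(100.5, NULL))  # NULL stands in for a JSON null
  )
)

# Turn NULL entries into NA so every time step keeps its slot.
null_to_na <- function(v) {
  vapply(v, function(e) if (is.null(e)) NA_real_ else as.numeric(e), numeric(1))
}

# Hypothetical helper: one row per (parameter, time step).
flatten_point_series <- function(cov) {
  axes  <- cov$domain$axes
  times <- as.POSIXct(unlist(axes$t$values),
                      format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
  rows <- lapply(names(cov$ranges), function(p) {
    data.frame(
      parameter = p,
      datetime  = times,
      x         = axes$x$values[[1]],
      y         = axes$y$values[[1]],
      value     = null_to_na(cov$ranges[[p]]$values),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

flat <- flatten_point_series(cov)
```

The point is only that each range value lines up with one position on the t axis, so the arrays can be zipped into rows.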

Spatial filters — bbox and polygon

To grab everything inside a rectangle, use edr_cube():

cube <- edr_cube(
  client, "rise-edr",
  bbox           = c(-116, 35.5, -114, 36.5),
  datetime       = "2023-01-01/2023-01-31",
  parameter_name = "3"
)
covjson_to_tibble(cube)
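
For a sense of what you are being spared, a cube query boils down to a URL with a comma-separated bbox and a few other query parameters. The sketch below builds such a URL by hand in base R. The build_cube_url() helper and the example.org host are hypothetical, and the package may assemble its requests differently; the parameter names (bbox, datetime, parameter-name) follow the EDR spec.

```r
# Hypothetical helper: assemble an EDR cube URL by hand (no request is sent).
build_cube_url <- function(base_url, collection_id, bbox, datetime, parameter_name) {
  values <- c(
    bbox             = paste(bbox, collapse = ","),           # xmin,ymin,xmax,ymax
    datetime         = datetime,
    `parameter-name` = paste(parameter_name, collapse = ",")  # comma-separated list
  )
  # Percent-encode each value, including reserved characters such as "," and "/".
  encoded <- vapply(values, utils::URLencode, character(1), reserved = TRUE)
  query   <- paste(names(encoded), encoded, sep = "=", collapse = "&")
  paste0(base_url, "/collections/", collection_id, "/cube?", query)
}

url <- build_cube_url(
  "https://example.org", "rise-edr",
  bbox           = c(-116, 35.5, -114, 36.5),
  datetime       = "2023-01-01/2023-01-31",
  parameter_name = "3"
)
```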

For an arbitrary polygon, edr_area() takes WKT, an sf polygon, or a matrix of (lon, lat) rows (it’ll close the ring for you):

ring <- matrix(
  c(-115.5, 35.5, -114, 35.5, -114, 36.5, -115.5, 36.5),
  ncol = 2, byrow = TRUE
)
area <- edr_area(client, "rise-edr", coords = ring,
                 datetime = "2023-01-01/2023-01-31",
                 parameter_name = "3")
covjson_to_tibble(area)
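
If you are curious what "closing the ring" and WKT encoding amount to, here is a rough base-R sketch. ring_to_wkt() is a made-up helper for illustration, not a function exported by the package, and it ignores holes and multi-polygons.

```r
# Hypothetical helper: close a (lon, lat) ring if needed and encode it as WKT.
ring_to_wkt <- function(ring) {
  stopifnot(is.matrix(ring), ncol(ring) == 2, nrow(ring) >= 3)
  first <- ring[1, ]
  last  <- ring[nrow(ring), ]
  if (any(first != last)) {
    ring <- rbind(ring, first)  # repeat the first vertex to close the ring
  }
  pts <- paste(ring[, 1], ring[, 2], collapse = ",")  # "lon lat,lon lat,..."
  paste0("POLYGON((", pts, "))")
}

ring <- matrix(
  c(-115.5, 35.5, -114, 35.5, -114, 36.5, -115.5, 36.5),
  ncol = 2, byrow = TRUE
)
wkt <- ring_to_wkt(ring)
wkt
#> [1] "POLYGON((-115.5 35.5,-114 35.5,-114 36.5,-115.5 36.5,-115.5 35.5))"
```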

Plot a time series

edr_plot() is a small ggplot2 wrapper over the tidy tibble:

edr_plot(resp)            # accepts an edr_response directly

The plot facets by parameter (so different units don’t share a y-axis) and colours by station. Add layers or themes as you would with any other ggplot.

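It also auto-detects common non-station shapes. Here cube is the response from the bbox query above, and profile stands for any response or tibble whose z varies: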

edr_plot(cube)            # x/y grid -> tile map
edr_plot(profile)         # varying z -> vertical profile

# or force the layout
edr_plot(profile, view = "profile")
edr_plot(cube, view = "grid")

Map stations with per-station popups

edr_map() puts the stations on a leaflet basemap. Pass data = as a named list keyed by station id (the shape edr_explore() produces) and each marker gets a popup with an inline plot and a “Download CSV” link for that station’s data. The CSV is embedded as a data: URI, so the saved HTML is self-contained:

stations <- edr_locations(client, "rise-edr",
                          bbox = c(-116, 35.5, -114, 36.5))
data_list <- list("3514" = covjson_to_tibble(resp))
m <- edr_map(stations, data = data_list, popup = "plot+csv")
edr_save_html(m, "stations.html")
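
The data: URI idea is easy to try on its own. The sketch below turns a small data frame into a download link using base R only. It assumes percent-encoding rather than base64, and the station_data values and file name are invented; how edr_map() builds its links internally may differ.

```r
# Invented example data standing in for one station's tidy tibble.
station_data <- data.frame(
  datetime = c("2023-01-01", "2023-01-02"),
  value    = c(1.5, 2.5)
)

# Serialise to CSV text without touching the file system.
csv_text <- paste(
  utils::capture.output(utils::write.csv(station_data, row.names = FALSE)),
  collapse = "\n"
)

# Percent-encode the CSV and prefix it with a data: URI header.
csv_uri <- paste0(
  "data:text/csv;charset=utf-8,",
  utils::URLencode(csv_text, reserved = TRUE, repeated = TRUE)
)

# An anchor tag like this can sit inside a popup; the data travels with the HTML.
link_html <- sprintf(
  '<a href="%s" download="station-3514.csv">Download CSV</a>',
  csv_uri
)
```

Because the payload lives in the href, the link keeps working when the HTML file is moved or emailed, at the cost of a larger file.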

For a quick exploratory pass over a whole collection, edr_explore() does the fetch + plot + map in one call:

edr_explore(
  client, "rise-edr",
  bbox           = c(-116, 35.5, -114, 36.5),
  datetime       = "2023-01-01/2023-01-31",
  parameter_name = "3",
  limit          = 25,
  file           = "snapshot.html"
)

Gridded coverages and vertical profiles can be mapped too. edr_map() detects tidy CoverageJSON grids/profiles and puts slice selectors inside the leaflet widget when there are multiple parameters or datetimes; grids also get a z selector when multiple vertical levels are present:

grid <- covjson_to_tibble(cube)
edr_map(grid)

# profile_resp is a placeholder for a response with varying z (not created above)
profile <- covjson_to_tibble(profile_resp)
edr_map(profile)

edr_explore() uses the same behavior for bulk coverage queries. Use output = "plot" when you want a ggplot instead of the interactive map:

edr_explore(client, "gridded-collection",
            bbox = c(-120, 39, -118, 41),
            method = "cube")

edr_explore(client, "profile-collection",
            coords = c(-119, 40),
            method = "position")

edr_explore(client, "profile-collection",
            coords = c(-119, 40),
            method = "position", output = "plot")

Weird IDs, CSV, and an escape hatch

Some monitoring networks use compound station IDs — colon-separated triplets are a common pattern. The client URL-encodes reserved characters for you:

edr_location(client, "snotel-edr", "1185:CO:SNTL",
             datetime = "2024-01-01/..", parameter_name = "WTEQ")
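
To see what that encoding looks like, you can reproduce it with base R. This is only an illustration using utils::URLencode(); the package may rely on a different encoder internally.

```r
station_id <- "1185:CO:SNTL"

# reserved = TRUE also encodes reserved characters such as ":".
encoded_id <- utils::URLencode(station_id, reserved = TRUE)
encoded_id
#> [1] "1185%3ACO%3ASNTL"

# The encoded ID can then be dropped into a path segment safely.
path <- paste("collections", "snotel-edr", "locations", encoded_id, sep = "/")
```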

If the server advertises CSV, you can ask for it instead of CovJSON:

edr_location(client, "rise-edr", "3514",
             datetime = "2023-01-01/2023-01-31", format = "csv")

And if you need to hit an endpoint the package doesn’t wrap (instances, custom queryables, anything weird), edr_request() is the raw escape hatch:

edr_request(client, "openapi", format = "json")

API at a glance

| Function | EDR endpoint |
| --- | --- |
| edr_client() | construct a client |
| edr_landing() / edr_conformance() | /, /conformance |
| edr_collections() / edr_collection() | /collections |
| edr_parameter_groups() | optional top-level parameterGroups extension |
| edr_queryables() | /collections/{id}/queryables |
| edr_locations() / edr_location() | /collections/{id}/locations[/{loc}] |
| edr_items() / edr_item() | /collections/{id}/items[/{item}] |
| edr_position() | /collections/{id}/position |
| edr_area() | /collections/{id}/area |
| edr_cube() | /collections/{id}/cube |
| edr_radius() | /collections/{id}/radius |
| edr_trajectory() | /collections/{id}/trajectory |
| edr_corridor() | /collections/{id}/corridor |
| edr_request() | low-level escape hatch |
| covjson_to_tibble() / geojson_to_sf() | response parsers |

What a server actually supports varies. Every query verb above is in the EDR spec and supported by the client, but most servers implement only a subset. On in-situ monitoring deployments, locations, position, cube, and area are common; radius, trajectory, and corridor less so. Hitting a verb the server doesn’t implement gives you an HTTP error. Check the data_queries column from edr_collections() before you assume a query will work.
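
One way to make that check a habit is a small guard around the data_queries list column. The sketch below runs on a mock data frame so it works offline. The collection ids, their verbs, and the supports_query() helper are all invented for illustration; with a live client you would pass the result of edr_collections() instead.

```r
# Mock of the columns used above; ids and verbs are made up.
collections <- data.frame(id = c("demo-a", "demo-b"), stringsAsFactors = FALSE)
collections$data_queries <- list(
  c("locations", "position", "cube", "area"),
  c("locations", "items")
)

# Hypothetical helper: does this collection advertise the given query verb?
supports_query <- function(collections, id, verb) {
  row <- match(id, collections$id)
  if (is.na(row)) stop("unknown collection: ", id)
  verb %in% collections$data_queries[[row]]
}

supports_query(collections, "demo-a", "cube")
#> [1] TRUE
supports_query(collections, "demo-b", "cube")
#> [1] FALSE
```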

Common parameters

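Query verbs accept the applicable subset of these common EDR filters (the list covers the ones used in the examples above):

- datetime: an ISO 8601 interval such as "2023-01-01/2023-01-31"; ".." leaves one end open, as in "2024-01-01/.."
- parameter_name: the variable(s) to return, e.g. "3" or "WTEQ"
- bbox: c(xmin, ymin, xmax, ymax), e.g. c(-116, 35.5, -114, 36.5)
- coords: the query geometry, e.g. WKT, an sf polygon, or a matrix of (lon, lat) rows for edr_area()
- limit: a cap on the number of items returned
- format: the response format, e.g. "csv" or "json"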

License

MIT