Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up st_crs for SpatVector objects #2392

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

atsyplenkov
Copy link

Hey, guys.

Consider updating the st_crs() function for the SpatVector objects. The trick is terra::crs() always returns a character vector, therefore we can skip some checks implemented in st_crs() and we can gain a 2x speed increase in CRS extraction. When there is no CRS in SpatVector object, then the transition happens 1.2 times faster. When it comes to st_as_sf(), then it works 10% faster.

See examples below:

library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(terra)
#> terra 1.7.71

f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)


st_crs2 = function(x) {
  if (!requireNamespace("terra", quietly = TRUE))
    stop("package terra required, please install it first") # nocov
  ctr <- terra::crs(x)
  if (ctr == "") {
    NA_crs_
  } else {
    crs <- sf:::CPL_crs_from_input(ctr)
    crs$input <- crs$Name
    crs
  }
}

bench::mark(
  original = st_crs(v),
  PR = st_crs2(v),
  check = T,
  relative = T,
  max_iterations = 1000L
)
#> # A tibble: 2 × 6
#>   expression   min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <dbl>  <dbl>     <dbl>     <dbl>    <dbl>
#> 1 original    1.93   1.83      1         296.     1.00
#> 2 PR          1      1         1.86        1      1

crs(v) <- NA_character_

bench::mark(
  original = st_crs(v),
  PR = st_crs2(v),
  check = T,
  relative = T,
  max_iterations = 1000L
)
#> # A tibble: 2 × 6
#>   expression   min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <dbl>  <dbl>     <dbl>     <dbl>    <dbl>
#> 1 original    1.24   1.20      1          Inf      NaN
#> 2 PR          1      1         1.11       NaN      NaN

Created on 2024-05-09 with reprex v2.1.0

@rsbivand
Copy link
Member

rsbivand commented May 10, 2024

All geographical objects should always declare their CRS. Missing CRS are objects with missing essential data, with no information on their coordinate measurement units or relative shape in relation to other data objects. So helping users abuse their data is not a good idea at all. See

sf/R/read.R

Lines 546 to 549 in 97a179a

if (is.na(st_crs(geom))) {
message('writing: substituting ENGCRS["Undefined Cartesian SRS with unknown unit"] for missing CRS')
st_crs(geom) = st_crs("ENGCRS[\"Undefined Cartesian SRS with unknown unit\",EDATUM[\"Unknown engineering datum\"],CS[Cartesian,2],AXIS[\"X\",unspecified,ORDER[1],LENGTHUNIT[\"unknown\",0]],AXIS[\"Y\",unspecified,ORDER[2],LENGTHUNIT[\"unknown\",0]]]")
}
for a specific case.

@atsyplenkov
Copy link
Author

Sure thing! However, I don't understand how my PR differs from the current behavior and philosophy of st_crs.

sf/R/crs.R

Lines 73 to 76 in 97a179a

st_crs.character = function(x, ...) {
if (is.na(x))
NA_crs_
else {

I am suggesting improving st_crs only for SpatVector objects, which rely on the terra::crs() function, which always returns a character vector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants