Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non valid geometries? #78

Open
MatthieuStigler opened this issue May 25, 2023 · 14 comments
Open

Non valid geometries? #78

MatthieuStigler opened this issue May 25, 2023 · 14 comments

Comments

@MatthieuStigler
Copy link

It seems that there are some "invalid" geometries contained in ne_countries (both small and medium scales) detected by sf, both with the old R2 system and new S2?

Also, even on a subset of countries where sf::st_is_valid() does not detect issues, an operation like st_crop can lead to some issues?

library(rnaturalearth)
library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE

world_S <- ne_countries(scale = "small", returnclass = "sf")
world_M <- ne_countries(scale = "medium", returnclass = "sf")

## check with S2
sf_use_s2(TRUE)
all(st_is_valid(world_S))
#> [1] FALSE
all(st_is_valid(world_M))
#> [1] FALSE

## check with R2
sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
all(st_is_valid(world_S))
#> [1] TRUE
all(st_is_valid(world_M))
#> [1] FALSE

## which ones?
sf_use_s2(TRUE)
#> Spherical geometry (s2) switched on
world_S$name[!st_is_valid(world_S)]
#> [1] "Sudan"  "Russia"
world_M$name[!st_is_valid(world_M)]
#> [1] "Antarctica" "Fiji"       "India"      "Russia"     "Sudan"     
#> [6] "S. Sudan"

sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
world_S$name[!st_is_valid(world_S)]
#> character(0)
world_M$name[!st_is_valid(world_M)]
#> [1] "India"


### issues even when valid
some_countries <- world_M |> 
  subset(continent == "Europe"| name %in% c("United States")) |> 
  subset(name !="Russia") |> 
  st_geometry() 

sf_use_s2(TRUE);all(st_is_valid(some_countries))
#> Spherical geometry (s2) switched on
#> [1] TRUE
sf_use_s2(FALSE);all(st_is_valid(some_countries))
#> Spherical geometry (s2) switched off
#> [1] TRUE

sf_use_s2(TRUE)
#> Spherical geometry (s2) switched on
some_countries |> 
  st_crop( c(xmin=-120, ymin=30, xmax=30, ymax= 55)) |>
  plot(border="blue", lwd=2)
plot(some_countries, add=TRUE)

Created on 2023-05-25 with reprex v2.0.2

@PMassicotte
Copy link
Contributor

We are just providing an interface for the data provided by naturalearth. Maybe you could contact them? Meanwhile, you could try to fix issues with st_make_valid().

@PMassicotte
Copy link
Contributor

Excellent!

@MatthieuStigler
Copy link
Author

The problem is that the issue of polygons is quite complicated and I won't be able to follow-up much there, but anyway, it's also good just to mention the issue if other face it.

Unfortunately, the st_make_valid() approach does not work in the second case, possible because st_is_valid() does not flag those as invalid :-(

@PMassicotte
Copy link
Contributor

Just tried to make it valid with terra but no luck:

terra::vect(world_S) |>
  terra::makeValid() |>
  st_as_sf() |>
  st_is_valid()

@PMassicotte
Copy link
Contributor

I even tried to repair it directly with gdal:

ogr2ogr -f "ESRI Shapefile" output.shp world_s.shp -dialect sqlite -sql "SELECT ST_MakeValid(geometry) AS geometry FROM world_S" 

@MatthieuStigler
Copy link
Author

thanks for trying!

I am wondering whether the problem comes from this (https://r-spatial.github.io/sf/articles/sf7.html)

Simple feature geometries should obey a ring direction too: exterior rings should be counter clockwise, interior (hole) rings should be clockwise, ...... many (legacy) datasets have wrong ring directions. With wrong ring directions, many things still work.

and the discussion on LINESTRING(-179 0,179 0), given that most datasets in rnaturalearth will contain polygons/lines close to 180? When reading the original datasets, did you check the point about check_ring_dir? Thanks!

@mps9506
Copy link
Contributor

mps9506 commented May 25, 2023

Not really a fix but with s2 I think this helps out:

library(rnaturalearth)
#> Warning: package 'rnaturalearth' was built under R version 4.2.3
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE

worldS <- ne_countries(scale = "small", returnclass = "sf")


sf_use_s2(TRUE)

all(st_is_valid(worldS))
#> [1] FALSE
worldS$name[!st_is_valid(worldS)]
#> [1] "Sudan"  "Russia"

worldS_2 <- worldS$geometry |> 
  s2::as_s2_geography(check = F) |> 
  s2::s2_union() |> 
  s2::s2_rebuild() |> 
  st_as_sfc()


all(st_is_valid(worldS_2))
#> [1] TRUE
worldS$name[!st_is_valid(worldS_2)]
#> character(0)

Created on 2023-05-25 with reprex v2.0.2

@MatthieuStigler
Copy link
Author

Thanks @mps9506 !!

Unfortunately, though the transformed polygons pass the st_is_valid test, the visual tests is suggesting there's a big problem with Russia spreading beyond its borders (political considerations aside):

library(rnaturalearth)
library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE

worldS <- ne_countries(scale = "small", returnclass = "sf")

worldS_2 <- worldS$geometry |> 
  s2::as_s2_geography(check = F) |> 
  s2::s2_union() |> 
  s2::s2_rebuild() |> 
  st_as_sfc()

all(st_is_valid(worldS_2))
#> [1] TRUE

plot(worldS_2[which(worldS$name=="Russia")])

Created on 2023-05-26 with reprex v2.0.2

@MatthieuStigler
Copy link
Author

@PMassicotte it seems package s2 does also contain a version of naturalerth data, cf. s2::s2_data_countries(). It might be insightful to see how they addressed the invalid polygons part, especially given the future retirement of rgeos (which would make use of s2 the standard I believe?). Thanks!

@PMassicotte
Copy link
Contributor

One option could be to do it on the cron job used in rnaturalearthdata.

@mps9506
Copy link
Contributor

mps9506 commented May 26, 2023

@MatthieuStigler I should know to plot the data by now!

library(rnaturalearth)
#> Warning: package 'rnaturalearth' was built under R version 4.2.3
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE


sf_use_s2()
#> [1] TRUE

worldS <- ne_countries(scale = "small", returnclass = "sf") 

worldS_2 <- worldS$geometry |>
  st_transform(crs = 3857) |> 
  st_make_valid() |>
  st_as_s2(check = F) |> 
  s2::s2_union() |> 
  s2::s2_rebuild(options = s2::s2_options(validate = TRUE)) |>
  st_as_sfc() |> 
  st_wrap_dateline()

all(st_is_valid(worldS_2))
#> [1] TRUE


plot(worldS_2[which(worldS$name=="Russia")])

plot(worldS_2)

Created on 2023-05-26 with reprex v2.0.2

@MatthieuStigler
Copy link
Author

great, thanks @mps9506 !

@PMassicotte
Copy link
Contributor

Just noticed with newer gdal and proj, it works for me:

library(rnaturalearth)
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.7.0, PROJ 9.2.0; sf_use_s2() is TRUE

worldS <- ne_countries(scale = "small", returnclass = "sf")

worldS_2 <- worldS$geometry
all(st_is_valid(worldS_2))
#> [1] FALSE

plot(worldS_2[which(worldS$name == "Russia")])

Created on 2023-05-28 with reprex v2.0.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants