RGeo 3.0 Roadmap #247

BuonOmo · 2021-02-04T14:38:45Z

Summary

We want to simplify the overall interface, being more ruby-stylish (a.k.a simple). Refactor code to improve maintainability. Here are the main axis on which we want to focus for the next major:

Loose Invalidity handling for geometries
CAPI Refactoring
Better GeoJSON integration
Consistency and clarification of interface

Loose Invalidity handling for geometries

This is currently the most critical item. We want to always have loose invalidity handling and never raise on creation, which will allow us to get rid of uses_lenient_assertions. This will have the benefit of always raise when a method cannot be use while not making usage cumbersome. For instance, one may want to parse a polygon even if it is not simple, however, we should not give him a bad area computation because it was not verified.

Work on that topic has already been started in the https://github.com/rgeo/rgeo/tree/fix/correct-area-when-invalid-polygon branch. However, there are still performance issues that should be investigated.

Always allow loose geometry creation, raise on operation that need more strict geometries #256.
- Add Segment Intersection Method and Sweepline Intersector Class #262
- Add ImplHelper::ValidOp Module to Provide Uniform Validity Checks #271

CAPI Refactoring

C Extension Update. Change data wrappers at least.
- Change CAPI data wrappers #273
- change notice and error handlers (Add error handler for geos #286)
- see if we really need the multithreaded api (or if ruby GIL is enough protection)
Ship GEOS with the project for MRI users #272

Better GeoJSON integration

GeoJSON is one of the most used way to parse data into RGeo::Geometry object, yet it is cumbersome to use.

Consistency and clarification of interface

Update methods to match the current OGC SFS version (1.2) #267
is_empty? should be replaced with empty? (Or at least alias it for a smoother transition). This goes for every is_*? method. #264
We should have a simple way to iterate through a collection rather than geometry_n (Add a geometry_collection#geometries method #231)
Easier object instantiation by allowing an array of coordinates for creation? (a from_coordinates?) #265

The last item here is to be discussed: we do not want to add something that may make the interface more complex. If you read until here, Please mark your interest for this feature. It may be the case that this is only useful for us maintainers when creating examples and tests.

Proj Defs

Regression in projection definition rgeo-proj4#16

Modus operandi

The main direction may be discussed here. If one want to tackle on of the roadmap items, please open an issue first to indicate that you are in charge and to discuss further how you want to do it if need be.

We will also reference issues with breaking changes with the 3.0 label and merge pull requests in the https://github.com/rgeo/rgeo/tree/v3-dev branch.

The text was updated successfully, but these errors were encountered:

keithdoggett · 2021-02-05T20:12:21Z

Another thing we should work on is an update to the spherical_factory, specifically the SphericalMath module. We had previously discussed this but did not make a list of options/tasks.

The main issue is that small features may be improperly considered invalid (#212) and is due to floating-point issues. I think the first step for this would be to research some techniques or examples of this working well elsewhere.

BuonOmo · 2021-04-19T10:47:33Z

Working on loose geometry validation. I’m really trying to figure out what path we should take. Either:

raise on invalid at geometry creation
raise on invalid at method usage
allow some methods to give unexpected results.

In rgeo actually, we have inconsistent behavior (either validation early, either nothing).

To explain further the differences, here’s a polygon i’ve been working on and testing with both turf (js) and shapely (python) my polygon area is 0.

In our case, it would be 0 as well if computed with those coordinates.

If we had only the case of polygons and areas, we could check if the polygon is simple on creation and store it in the data pointer. Whenever the area is to be computed we would give the correct area if polygon is simple or raise otherwise.

However, there are plenty of reasons a geometry can be invalid, and for each geometry plenty of methods that need different validity.

Hence we need a more canonical approach if we want to maintain correct performances (timewise and spacewise).

If we check for validity, every geos method should work. Therefore we would solely need to add a validity flag to the geometry (int size) and check once (either on method call or geometry loading). This way we keep perfs and make sure a user never gets either nil on creation or a bizarre result for a geometry computation.

The complete way, much more complex, would be to have flags for each different kinds of validity (there are 11 variations in geos). Then we could have a finer, per method check. This allows user to do much more (for instance, the bowtie polygon could give a correct area). But there are a few issues:

this is done under the hood, and a user may expect the 0 value for the area (since most known repositories do so)
maintainability may go wild. We’ll have a lot of combinaison, all to be maintained correctly. And this may change per geos version…
we will need to either recompute the valid polygon every time (timewise complexity would be huge) or store it in the data, which would double space taken by each geometry (pointer to geos is most of the space taken by the geometry)

Considering both options, I think I would go for the first one, with error only on method call. Allowing users to use the lenient assertion flag (maybe renamed to check_validity, or something more scary to make sure user knows what he is doing). And giving this API for users to check if they still want the result with the valid geometry:

polygon = RGeo::Geos.factory.polygon(...)
polygon.area #(would raise an error with `polygon.invalid_reason` as its message if geometry is not valid)
polygon.area! # does not check for validity, gives 0 for a bowtie
polygon.valid?
polygon.invalid_reason # as it is today, maybe separating the location from the message to allow user switching on the result.
polygon.make_valid! # change the current geometry to a valid one (and return it for chaining)
polygon.make_valid # `dup.make_valid!` to avoid mutation.

Most of this api could be done in ruby to keep our c extension simple. We would just need to make sure we do still have correct perfs.

EDIT: There are two points I've not covered with this API, (1.) methods calling other geometries and (2.) methods ending with a ?

If we do not append a !, this checks ALL geometries, self and every others given in parameters
contains?! is not possible in ruby, I'd say maybe unsafe_contains? ?

keithdoggett · 2021-04-21T12:25:14Z

@BuonOmo thanks for the great write-up here.

I think the conclusion you came to about this is correct in that we should:

Add a validity flag to the geometry classes
Compute that validity flag once needed
Raise an error on methods that require validity if any geometry used is invalid
Add a set of unsafe methods that will not check for validity (indicated by ! or unsafe_).

I think we have 2 more things to consider:

If we allow unsafe operations we need to make sure that errors are properly handled in GEOS. Per the example in Strange polygon buffer behavior on RGeo 2.1.1 #252 we see that attempting intersections with an invalid geometry does not work and GEOS reports a TopologyError. We need to make sure that this will still be properly handled and add docs making sure the users know that just because they use unsafe_ operations doesn't mean they'll always get an answer.
I think an important part of this update will be having consistency across all factories. Is there a way that we could move a lot of this logic to the abstract factory/geometry implementations so that this is handled a little more seamlessly across all implementations?

BuonOmo · 2021-05-01T11:07:14Z

@keithdoggett I've made an example of what it could look like:

class Feature
  OGC_METHODS = %w(contains? intersection)

  def check_validity!
    @invalid_reason ||= invalid_reason
    return unless @invalid_reason

    raise RGeo::GeometryError, @invalid_reason
  end

  OGC_METHODS.each do |method_sym|
    copy = alias_method "unsafe_#{method_sym}".to_sym, method_sym
    undef_method method_sym
    define_method(method_sym) do |*args|
      check_validity!
      copy.call(*args)
    end
  end
end

So this would be handled across all factories. There is still one pitfall with this flag, if the geometry changes, it needs to be reset for each method that updates a geometry. I think this is OK, but I'll need to have a look at the codebase to be sure. Any opinion on that ?

Also, next message I'll open a PR dedicated to the subject!

keithdoggett · 2021-05-03T15:11:18Z

@BuonOmo this looks good. As far as I know, none of the methods mutate any geometries, but we should still give it a check. I don't think it makes sense to mutate geometries in general and the benefits of memoizing invalid_reason outweigh the risks of it being incorrect after doing something that wouldn't be considered "best practice".

BuonOmo · 2021-08-30T18:02:11Z

@keithdoggett FYI, I've removed the help wanted tag: if one want to help, they can just search through the created issues!

keithdoggett · 2023-03-21T23:09:07Z

Closing since V3 has been released

BuonOmo added feature help wanted 3.0 labels Feb 4, 2021

BuonOmo mentioned this issue Feb 27, 2021

How can I check element type? rgeo/rgeo-geojson#51

Closed

keithdoggett pinned this issue Mar 2, 2021

BuonOmo mentioned this issue Mar 14, 2021

Add delegation to geometry in Feature rgeo/rgeo-geojson#53

Merged

BuonOmo mentioned this issue May 4, 2021

Always allow loose geometry creation, raise on operation that need more strict geometries #256

Closed

4 tasks

BuonOmo mentioned this issue Jun 23, 2021

Monkey patch core with to_geojson rgeo/rgeo-geojson#54

Closed

3 tasks

BuonOmo mentioned this issue Jul 22, 2021

Changed all is_*? method names to *? #268

Merged

BuonOmo mentioned this issue Aug 30, 2021

Ship GEOS with the project for MRI users #272

Open

BuonOmo removed the help wanted label Aug 30, 2021

thestelz mentioned this issue Nov 4, 2021

RGeo::Geographic.simple_mercator_factory ignores lenient flags #277

Closed

keithdoggett closed this as completed Mar 21, 2023

keithdoggett unpinned this issue Mar 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RGeo 3.0 Roadmap #247

RGeo 3.0 Roadmap #247

BuonOmo commented Feb 4, 2021 •

edited

keithdoggett commented Feb 5, 2021

BuonOmo commented Apr 19, 2021 •

edited

keithdoggett commented Apr 21, 2021

BuonOmo commented May 1, 2021

keithdoggett commented May 3, 2021

BuonOmo commented Aug 30, 2021 •

edited

keithdoggett commented Mar 21, 2023

RGeo 3.0 Roadmap #247

RGeo 3.0 Roadmap #247

Comments

BuonOmo commented Feb 4, 2021 • edited

Summary

Loose Invalidity handling for geometries

CAPI Refactoring

Better GeoJSON integration

Consistency and clarification of interface

Proj Defs

Modus operandi

keithdoggett commented Feb 5, 2021

BuonOmo commented Apr 19, 2021 • edited

keithdoggett commented Apr 21, 2021

BuonOmo commented May 1, 2021

keithdoggett commented May 3, 2021

BuonOmo commented Aug 30, 2021 • edited

keithdoggett commented Mar 21, 2023

BuonOmo commented Feb 4, 2021 •

edited

BuonOmo commented Apr 19, 2021 •

edited

BuonOmo commented Aug 30, 2021 •

edited