Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid interpretation of where.not clause #367

Open
arthurwozniak opened this issue Aug 1, 2022 · 3 comments
Open

Invalid interpretation of where.not clause #367

arthurwozniak opened this issue Aug 1, 2022 · 3 comments

Comments

@arthurwozniak
Copy link

As I cannot find that our bug is intended for this gem, I am opening the issue.

In one place we need to search data by NOT matching specific GPS point. When using SQL condition, it works, but query is corrupted when using ActiveRecord.

some schema information

  create_table "orders", force: :cascade do |t|
    ...
    t.geography "gps", limit: {:srid=>4326, :type=>"st_point", :geographic=>true}
    ...
  end

Let's take a look at the following snippet with SQL statement.

order = Order.find(some_id)

Order.where(id: order.id).where("gps = ST_SetSRID( ST_Point( #{order.gps.x}, #{order.gps.y}), 4326)").explain
 => 
EXPLAIN for: SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 AND (gps = ST_SetSRID( ST_Point( 14.8569972, 50.8526169), 4326)) [["id", 12009736]]
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Index Scan using orders_id_company_branch_id_idx on orders  (cost=0.43..2.66 rows=1 width=1756)
   Index Cond: (id = '12009736'::bigint)
   Filter: (gps = '0101000020E6100000F3864556C8B62D402AC2F28C226D4940'::geography)
(3 rows)

Order.where(id: order.id).where.not("gps = ST_SetSRID( ST_Point( #{order.gps.x}, #{order.gps.y}), 4326)").explain
 => 
EXPLAIN for: SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 AND NOT (gps = ST_SetSRID( ST_Point( 14.8569972, 50.8526169), 4326)) [["id", 12009736]]
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Index Scan using orders_id_company_branch_id_idx on orders  (cost=0.43..2.66 rows=1 width=1756)
   Index Cond: (id = '12009736'::bigint)
   Filter: (NOT (gps = '0101000020E6100000F3864556C8B62D402AC2F28C226D4940'::geography))
(3 rows)

But when we use ActiveRecord referencing order.gps, data processed by DB are corrupted using where.not condition.

order = Order.find(some_id)

Order.where(id: order.id).where(gps: order.gps).explain
 => 
EXPLAIN for: SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 AND "orders"."gps" = $2 [["id", 12009736], ["gps", "0020000001000010e6402db6c8564586f340496d228cf2c22a"]]
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Index Scan using orders_id_company_branch_id_idx on orders  (cost=0.43..2.66 rows=1 width=1756)
   Index Cond: (id = '12009736'::bigint)
   Filter: (gps = '0101000020E6100000F3864556C8B62D402AC2F28C226D4940'::geography)
(3 rows)

Order.where(id: order.id).where.not(gps: order.gps).explain
 => 
EXPLAIN for: SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 AND "orders"."gps" != $2 [["id", 12009736], ["gps", "0020000001000010e6402db6c8564586f340496d228cf2c22a"]]
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using orders_id_company_branch_id_idx on orders  (cost=0.43..2.66 rows=1 width=1756)
   Index Cond: (id = '12009736'::bigint)
   Filter: ((gps)::bytea <> '\x3030323030303030303130303030313065363430326462366338353634353836663334303439366432323863663263323261'::bytea)
(3 rows)

The result of this behavior is that each statement using AR is returning the order object. It seems to be wrong to me. Rails passes correctly serialized GPS point (0020000001000010e6402db6c8564586f340496d228cf2c22a) as filter condition is same as in case using ST_Point, but where.not looks like it performs some extra encoding of data.

@mjy
Copy link

mjy commented Aug 1, 2022

Side note. IIRC Order as a rails model seems OK, but if you get meta at all with your code (e.g. composing complex queries with Arel) you can end up with some very confusing scenarious while debugging etc. YRMV.

@keithdoggett
Copy link
Member

Thanks for the thorough examples. I'll start looking into this soon and post any updates here.

@keithdoggett
Copy link
Member

I dug into this some more and was able to replicate it. I also tried it with a GEOS backed factory and got a similar but slightly different result.

EXPLAIN for: SELECT "spatial_models".* FROM "spatial_models" WHERE "spatial_models"."latlon" != $1 [["latlon", "002000000100000ec93ff00000000000003ff0000000000000"]]
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Seq Scan on spatial_models  (cost=0.00..77.38 rows=488 width=136)
   Filter: ((latlon)::text <> '002000000100000ec93ff00000000000003ff0000000000000'::text)

For some reason it's trying to cast it to text instead of bytea this time. Maybe it has something to do with how geography is encoded vs geometry columns.

The thing that seems strange is that the query looks ok to me, but the planner decides that it should try to compare text instead of using geography/geometry types when the != comparison is added.

In fact, I can do the same thing by querying my database directly and when using != for comparison, I see that the planner wants to cast it to text again and for = it casts the wkb to a geometry.

When I force the wkb to be cast to a geometry like so:

EXPLAIN SELECT id FROM spatial_models WHERE latlon != '002000000100000ec93ff00000000000003ff0000000000000'::geometry;

I get this error:

ERROR:  operator is not unique: geometry <> geometry
HINT:  Could not choose a best candidate operator. You might need to add explicit type casts.

So it seems this is some sort of implementation issue in PostGIS. I'll look more into their docs to see what they say about it. A potential solution could be to override the default where query when an RGeo node is detected and prefer ST_Equals instead, but that has other implications because that checks for being spatially equal which is different than just using the = operator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants