Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query returning duplicate values #10238

Closed
TristenHarr opened this issue May 14, 2024 · 7 comments
Closed

Query returning duplicate values #10238

TristenHarr opened this issue May 14, 2024 · 7 comments
Labels
k/v3-bug Bug affecting Hasura v3 (DDN)

Comments

@TristenHarr
Copy link
Contributor

A query like this:

query MyQuery {
  user(
    where: {_and: [{_or: [{fullName: {_ilike: "%drew%"}}, {email: {_ilike: "%drew%"}}, {userTags: {tag: {value: {_ilike: "%drew%"}}}}]}]}
    order_by: {fullName: Asc}
  ) {
    fullName
    id
    userTags {
      tag {
        value
      }
    }
  }
}

Is returning duplicated values due to the tags.

{
  "data": {
    "user": [
      {
        "fullName": "Drew 1",
        "id": "79f91d1e",
        "userTags": [
          {
            "tag": {
              "value": "tag1"
            }
          },
          {
            "tag": {
              "value": "tag2"
            }
          }
        ]
      },
      {
        "fullName": "Drew 1",
        "id": "79f91d1e",
        "userTags": [
          {
            "tag": {
              "value": "tag1"
            }
          },
          {
            "tag": {
              "value": "tag2"
            }
          }
        ]
      },
      {
        "fullName": "Drew 2",
        "id": "b4e9504f",
        "userTags": [
          {
            "tag": {
              "value": "tag1"
            }
          }
        ]
      },
      {
        "fullName": "Andrew 1",
        "id": "4967ad1f",
        "userTags": [
          {
            "tag": {
              "value": "tag1"
            }
          },
          {
            "tag": {
              "value": "tag2"
            }
          }
        ]
      },
      {
        "fullName": "Andrew 1",
        "id": "4967ad1f",
        "userTags": [
          {
            "tag": {
              "value": "tag1"
            }
          },
          {
            "tag": {
              "value": "tag2"
            }
          }
        ]
      }
    ]
  }
}

Notice the duplicates for Drew 1 and Andrew 1.

The SQL generated misses a DISTINCT statement.

See the generated SQL:

SELECT
  coalesce(json_agg(row_to_json("%11_universe")), '[]') AS "universe"
FROM
  (
    SELECT
      *
    FROM
      (
        SELECT
          coalesce(json_agg(row_to_json("%12_rows")), '[]') AS "rows"
        FROM
          (
            SELECT
              "%0_user"."full_name" AS "fullName",
              "%0_user"."id" AS "id",
              "%1_RELATIONSHIP_userTags"."userTags" AS "userTags"
            FROM
              "public"."user" AS "%0_user"
              LEFT OUTER JOIN LATERAL (
                SELECT
                  row_to_json("%1_RELATIONSHIP_userTags") AS "userTags"
                FROM
                  (
                    SELECT
                      *
                    FROM
                      (
                        SELECT
                          coalesce(json_agg(row_to_json("%9_rows")), '[]') AS "rows"
                        FROM
                          (
                            SELECT
                              "%5_RELATIONSHIP_tag"."tag" AS "tag"
                            FROM
                              "public"."user_tags" AS "%4_user_tags"
                              LEFT OUTER JOIN LATERAL (
                                SELECT
                                  row_to_json("%5_RELATIONSHIP_tag") AS "tag"
                                FROM
                                  (
                                    SELECT
                                      *
                                    FROM
                                      (
                                        SELECT
                                          coalesce(json_agg(row_to_json("%7_rows")), '[]') AS "rows"
                                        FROM
                                          (
                                            SELECT
                                              "%6_tag"."value" AS "value"
                                            FROM
                                              "public"."tag" AS "%6_tag"
                                            WHERE
                                              ("%4_user_tags"."tag_id" = "%6_tag"."id")
                                          ) AS "%7_rows"
                                      ) AS "%7_rows"
                                  ) AS "%5_RELATIONSHIP_tag"
                              ) AS "%5_RELATIONSHIP_tag" ON ('true')
                            WHERE
                              ("%0_user"."id" = "%4_user_tags"."user_id")
                          ) AS "%9_rows"
                      ) AS "%9_rows"
                  ) AS "%1_RELATIONSHIP_userTags"
              ) AS "%1_RELATIONSHIP_userTags" ON ('true')
              INNER JOIN LATERAL (
                SELECT
                  *
                FROM
                  "public"."user_tags" AS "%2_BOOLEXP_user_tags"
                WHERE
                  (
                    "%0_user"."id" = "%2_BOOLEXP_user_tags"."user_id"
                  )
              ) AS "%2_BOOLEXP_user_tags" ON ('true')
              INNER JOIN LATERAL (
                SELECT
                  *
                FROM
                  "public"."tag" AS "%3_BOOLEXP_tag"
                WHERE
                  (
                    "%2_BOOLEXP_user_tags"."tag_id" = "%3_BOOLEXP_tag"."id"
                  )
              ) AS "%3_BOOLEXP_tag" ON ('true')
            WHERE
              (
                (
                  ("%0_user"."full_name" ~~* cast('%drew%' as "text"))
                  OR ("%0_user"."email" ~~* cast('%drew%' as "text"))
                )
                OR ("%3_BOOLEXP_tag"."value" ~~* cast('%drew%' as "text"))
              )
            ORDER BY
              "%0_user"."full_name" DESC
          ) AS "%12_rows"
      ) AS "%12_rows"
  ) AS "%11_universe"
@TristenHarr TristenHarr added the k/v3-bug Bug affecting Hasura v3 (DDN) label May 14, 2024
@soupi
Copy link
Contributor

soupi commented May 15, 2024

Thanks for reporting. We are working on a fix.

github-merge-queue bot pushed a commit to hasura/ndc-postgres that referenced this issue May 16, 2024
…ows (#463)

### What

We are fixing two bugs:

1. Fix returning duplicate results when filtering
hasura/graphql-engine#10238
2. Fix over-filtering when a related table contains no results (due to
the use of inner join)

<!-- Consider: do we need to add a changelog entry? -->

### How

In order to filter by a nested relationship column, we first query that
nesting to fetch the relevant column, and then build our filter.

Before, we fetched the tables at the top-level using an inner join, then
added a where clause.

In this PR we change this generation scheme to use `WHERE EXISTS (SELECT
1 ...)` instead.
The fetching of columns will appear inside the `WHERE` clause, each path
element will have it's own select (where the nesting are combined using
inner joins) and separate path elements are combined with a FULL OUTER
JOIN instead of inner join.

The `EXISTS` part helps us avoid duplicates - while inner joins at the
top-level can create additional rows, an EXISTS cannot. It will just
inform if a particular row matches the predicate.

The `FULL OUTER JOIN` part helps us avoid over-eager filtering. See the
[note
here](#463 (comment)).
@arjunyel
Copy link
Contributor

@soupi thank you for taking a look! Any way you can cut a new release with the fix?

@soupi
Copy link
Contributor

soupi commented May 17, 2024

@arjunyel Yes, we've just deployed the fix to production and it should now be available for ddn cloud users.

@soupi soupi closed this as completed May 17, 2024
@arjunyel
Copy link
Contributor

@soupi sorry how do I use this in DDN cloud? The only control I have over versions only let's me do 0.6.0

@soupi
Copy link
Contributor

soupi commented May 17, 2024

@arjunyel no further action should be needed, it should just work as expected now. Please let me know if it doesn't.

@arjunyel
Copy link
Contributor

@soupi its working great, thank so much for the quick and comprehensive fix!

@soupi
Copy link
Contributor

soupi commented May 17, 2024

Great to hear. Thanks for the update!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
k/v3-bug Bug affecting Hasura v3 (DDN)
Projects
None yet
Development

No branches or pull requests

3 participants