Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return GRAPHQL_DEBUG message when meta query is used in a GraphQL Request #3028

Open
jasonbahl opened this issue Jan 30, 2024 · 7 comments
Open
Labels
Status: Discussion Requires a discussion to proceed Type: Enhancement New feature or request

Comments

@jasonbahl
Copy link
Collaborator

What problem does this address?

While WPGraphQL doesn't natively expose any fields in the Schema to allow folks to query via meta_query, folks can still add meta query support via filters.

I've responded to countless support requests about problems with performance only to find out that under the hood there's a meta_query being executed.

As a maintainer, I would like users to have more awareness of the potential performance downsides of using meta queries and educate them on alternatives. Ideally this will lead to a more positive user experience and a reduced maintainer burden.

What is your proposed solution?

I propose that if a meta_query is detected within a WPGraphQL request, we add a graphql_debug() message to the response warning them that the query could have a negative performance impact and point them to some info regarding alternatives.

Perhaps we can write a blog post or guide on the website that discusses the topic at more length and provides alternative methods, such as the "Utility Taxonomy" approach: https://tomjn.com/2018/03/16/utility-taxonomies/

What alternatives have you considered?

Leaving things as and repetitively answer support questions about it

Additional Context

No response

@jasonbahl jasonbahl assigned jasonbahl and unassigned jasonbahl Jan 30, 2024
@jasonbahl jasonbahl added the Type: Enhancement New feature or request label Jan 30, 2024
@justlevine
Copy link
Collaborator

justlevine commented Jan 30, 2024

As someone who works a lot with meta queries, this makes me really nervous.

Meta queries (when done correctly) are essential for the sorts of non-traditional sites where headless truly excels as a use case.
All those extra graphql_debug()s are noise and cognitive load, which won't just be surfaced by custom code, but also extensions that are using them correctly.

As far as the weight of addressing these underlying support issues (which I sadly also spend too much time on), though.

Improved performance + debugging docs we can point people to are IMO the way to go (along with the recommendation that developers follow WordPress coding standards in general - which if they want tooling for, there's already a WPCS sniff ).

Maybe there's a case for including it with tracing, or in the QueryAnalyser (defined literally instead of it's current primary caching use case), but even then it seems like it would be ideally be disabled most the time -> so it should have a flag -> doesn't seem to merit the extra UX + code 🤔

@jasonbahl
Copy link
Collaborator Author

Meta queries (when done correctly)

@justlevine can you give examples of this?

The way the database structure is lends to pretty poor performance that degrades as the size of the meta table grows

@justlevine
Copy link
Collaborator

justlevine commented Jan 30, 2024

@jasonbahl an event date is the use case I see more often in custom code. From an "ideal" pov, it's necessary for any filterable value that can't/shouldn't be represented as an Enum (ie a WP term).

The official Coding Standards recommendation is to avoid meta/tax queries when possible and cache the results when not possible. (And that's what WPCS and VIPCS linters both say too).

But back to the context of this ticket and congitive load, when you're writing a compatibility extension, it doesn't really matter that you know the ideal patterns, you're beholden to the DB structure of the plugins extension.

And ☝️ I believe is the bulk of the use cases that would now be getting debug spam, outweighing imo the individuals whose non-performant WP code is equally non-performant in WPGraphQL.

@jasonbahl
Copy link
Collaborator Author

@justlevine ya, it's tricky. There are definitely a lot of personas to consider here. A lot of them are developers extending existing plugins, but there's also a lot of agencies, etc that take over projects or get hired to consult on existing projects, etc.

A lot of these personas are lost when it comes to understanding best practices, etc. A linter helps for sure, but not for all situations.

Also, since things under the hood are filterable, i.e. WP_Query itself, there's a chance meta queries are being executed without realizing it because a plugin is over-stepping and filtering every WP_Query instead of only specific WP_Query's, etc.

Lots of things to consider.

The purpose of graphql_debug() was to allow debug info to be passed to the response without resulting in an error.

I think this use case is a pretty good example of when graphql_debug() message makes a lot of sense.

It doesn't prevent execution or cause an error, but it highlights the fact that something under the hood could be causing performance problems.

Of course, this could be silenced (via filter or setting or whatever) or we could scope it to only certain situations (i.e. only if a query takes longer than xx amount of time), but it meta queries rank pretty high in the list of things I've seen cause problems for folks over the years (both before and since working on WPGraphQL).

I think we can do all of the above though. . .add more / better docs, Help provide better linters/tooling/education. . .but also provide a debug message. 😉

@justlevine
Copy link
Collaborator

justlevine commented Jan 30, 2024

I think this use case is a pretty good example of when graphql_debug() message makes a lot of sense.

It doesn't prevent execution or cause an error, but it highlights the fact that something under the hood could be causing performance problems.

Of course, this could be silenced (via filter or setting or whatever) or we could scope it to only certain situations

This is the part that makes me pause. The rest we're in agreement on.

Unless we have a way to silence per initiator, then by having it on we're immediately going to get flooded with dozens of perfectly safe or regardless inactionable messages, obfuscating the good ones and requiring more triage than currently (via the SQL trace).
Any "real" debug message is lost in a sea of performance messages, and even when debugging performance is the goal, the support job isn't actually any easier because of the irrelevant false positives. And then there's the increase in false reports from people thinking a certain message is a 'bug' (like PHP 8.3 deprecation warnings when running 8.2 ;) ).

A slightly less intrusive alternative is to toggle this as part of tracing logs - they have these weird and long SQL queries, and now a message says part of that is caused by a meta/tax query. Then all those messages only get turned on when someone is already debugging performance, even if a big portion of them is noise.

I think long term, the ideal feature for this is "query cost" limits, where the QueryAnalyzer (or similar) checks the query args + number of non-self-referential child nodes or even the generated SQL, and computes a value use to prevent/warn about the performance implications and recommend an alternative approach (pagination, not using a meta query, using a GraphQL model, and anything else we want to guardrail).

@jasonbahl
Copy link
Collaborator Author

jasonbahl commented Jan 30, 2024

I think long term, the ideal feature for this is "query cost" limits

Agreed here for sure, but even that's difficult to know until the query has been executed.

Query Cost, typically, works by analyzing the Query compared the the Schema and each determining the cost of each field + arg (multiplier) and determining the cost before executing.

Since the Schema is filterable, and under-the-hood mechanisms are also filterable, it's not always possible to identify just from the Query / Schema comparison that any given query will or will not include a meta_query under the hood.

So while query cost would be a great feature and likely reduce some of the support burden, it's:

  • a lot further out from shipping
  • won't catch all current use cases I see where meta queries are causing issues

A slightly less intrusive alternative is to toggle this as part of tracing logs.

I think this might be the way to go short-term.

if "Query Logs" and/or "Tracing" is enabled, then we could also show this output. 🤔

@jasonbahl jasonbahl added Status: Discussion Requires a discussion to proceed Close Candidate Needs confirmation before closing and removed Close Candidate Needs confirmation before closing labels Feb 12, 2024
@justlevine
Copy link
Collaborator

justlevine commented Feb 15, 2024

Related #2438

With a dedicated Logger, we could be as verbose and granular as we want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Discussion Requires a discussion to proceed Type: Enhancement New feature or request
Projects
Status: 🆕 New
Development

No branches or pull requests

2 participants