Performance Question #469

Open
owengalenjones opened this issue Aug 31, 2022 · 10 comments

Comments

@owengalenjones

Hello, I have a large number of schemas linked by $ref, with a combined size on disk of 1380 KB, and the object being validated is 675 B. In benchmarks it consistently takes around 1.4 seconds to validate the object. Is this an expected time? We were hoping to validate objects at runtime during client requests, but this would make that impossible.

Are there any steps I can take to attempt to increase performance here?

Thanks!

@erosb
Contributor

erosb commented Aug 31, 2022

Hello @owengalenjones, I can't advise on that based on this information. You can:

  • check whether your schema has any large-scale "oneOf" applicator and, if so, whether it can be replaced by an "anyOf" without losing any guarantees (this can work because "anyOf" is short-circuit evaluated, unlike "oneOf")
  • check with a profiler where most of the time is spent
  • compare the performance with other validator implementations
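
To make the first suggestion concrete, here is a minimal sketch of why "anyOf" can stop early while "oneOf" cannot. It is a simplified model, not the library's implementation: the class `ApplicatorSketch`, its methods, and the use of plain `Predicate` objects as stand-ins for subschemas are all assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model: each subschema is represented by a Predicate.
final class ApplicatorSketch {
    /** Number of subschema checks performed by the most recent call. */
    static int evaluations;

    /** anyOf: one match is enough, so evaluation can stop at the first hit. */
    static boolean anyOf(List<Predicate<Object>> subschemas, Object value) {
        evaluations = 0;
        for (Predicate<Object> subschema : subschemas) {
            evaluations++;
            if (subschema.test(value)) {
                return true; // short-circuit
            }
        }
        return false;
    }

    /** oneOf: exactly one match is required, so a valid value is checked against every subschema. */
    static boolean oneOf(List<Predicate<Object>> subschemas, Object value) {
        evaluations = 0;
        int matches = 0;
        for (Predicate<Object> subschema : subschemas) {
            evaluations++;
            if (subschema.test(value)) {
                matches++;
            }
        }
        return matches == 1;
    }

    /** Three mutually exclusive stand-in subschemas. */
    static List<Predicate<Object>> sample() {
        return List.<Predicate<Object>>of(
                v -> v instanceof String,
                v -> v instanceof Integer,
                v -> v instanceof Boolean);
    }

    public static void main(String[] args) {
        anyOf(sample(), "text");
        System.out.println("anyOf checks: " + evaluations); // 1
        oneOf(sample(), "text");
        System.out.println("oneOf checks: " + evaluations); // 3
    }
}
```

In this model a valid value still costs "oneOf" one check per alternative, and every alternative that does not match is a subschema mismatch, even though the overall validation succeeds.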

@owengalenjones
Author

@erosb thanks for the reply!

  1. The schema I'm testing with is not failing so I would imagine short-circuiting wouldn't gain me anything?
  2. I will check
  3. So far this is the quickest of the validators I have examined, networknt/json-schema-validator took 9 seconds for the same validation 😨 but I guess I can try some of the others.

@owengalenjones
Author

As for profiling, I'm not sure it shows me anything:

[Image: Screen Shot 2022-08-31 at 1 59 20 PM]

I assume this is just standard recursive call stacks?

I ran the test 3 times to get data for profiling.

Hotspots:

[Image: Screen Shot 2022-08-31 at 2 01 11 PM]

@owengalenjones
Author

Is that correct in showing that most of the actual time is spent constructing exceptions?

@erosb
Contributor

erosb commented Aug 31, 2022

Yes, there can be many ValidationExceptions thrown, if there are many subschema mismatches during validation. But I can't see InternalValidationException on the screenshots. Is it because its self-time is low, or do you use an old version of the library?
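
As a rough illustration of why constructing exceptions can dominate a profile, the sketch below compares an ordinary exception, whose constructor captures the current stack trace, with one that disables capture through the four-argument `RuntimeException` constructor. The class names `TracedMismatch` and `StacklessMismatch` are hypothetical and are not taken from the library; whether the library does anything similar is not established by this thread.

```java
import java.util.function.Supplier;

final class ExceptionCostSketch {
    /** Ordinary exception: construction walks and records the call stack. */
    static final class TracedMismatch extends RuntimeException {
        TracedMismatch(String message) {
            super(message);
        }
    }

    /** Same exception with suppression and stack-trace capture disabled. */
    static final class StacklessMismatch extends RuntimeException {
        StacklessMismatch(String message) {
            super(message, null, false, false);
        }
    }

    /** Times the construction of `count` exceptions produced by `factory`. */
    static long nanosFor(Supplier<RuntimeException> factory, int count) {
        long sink = 0; // consume each result so the loop is not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            sink += factory.get().getMessage().length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unreachable");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int count = 200_000;
        // Run each variant once untimed so both are measured after warm-up.
        nanosFor(() -> new TracedMismatch("mismatch"), count);
        nanosFor(() -> new StacklessMismatch("mismatch"), count);
        System.out.println("traced ns:    " + nanosFor(() -> new TracedMismatch("mismatch"), count));
        System.out.println("stackless ns: " + nanosFor(() -> new StacklessMismatch("mismatch"), count));
    }
}
```

The absolute numbers depend on the JVM and on how deep the call stack is at the point of construction, so a deeply recursive validator would typically pay more per traced exception than this flat loop does.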

@owengalenjones
Author

Surprisingly, given the exceptions, I see no validation error messages when validation completes. I am testing the schema against a sample JSON object that is valid, i.e. it conforms to the schema.

We are using:

    <dependency>
      <groupId>com.github.erosb</groupId>
      <artifactId>everit-json-schema</artifactId>
      <version>1.9.2</version>
    </dependency>

@owengalenjones
Author

Ah I see that's slightly out of date. With 1.14.1 the same run is down to ~700ms.

@erosb
Contributor

erosb commented Aug 31, 2022 via email

@owengalenjones
Author

Thanks for your assistance @erosb. Despite the improvement, I'm still wondering whether my profile shows any potential site of further improvement:

[Image: Screen Shot 2022-09-01 at 11 19 24 AM]

I'm not sure what would be expected or if there is anything that leaps out to you.

Validating the same collection of schemas and the same JSON object with the Node.js library Ajv takes only 40 ms.

@erosb
Contributor

erosb commented Sep 2, 2022

Hello, my last guess: was the JVM warmed up? How many iterations of validation did you run?
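
For reference, a minimal sketch of what a warmed-up measurement could look like: run the task a number of times untimed so the JIT compiler can optimize the hot paths, then average over many timed runs. The class `WarmupSketch`, the method `averageMillis`, and the iteration counts are assumptions for illustration; the placeholder workload in `main` stands in for the actual validation call.

```java
final class WarmupSketch {
    /**
     * Runs `task` untimed for `warmupIterations`, then returns the average
     * wall-clock time in milliseconds over `measuredIterations` timed runs.
     */
    static double averageMillis(Runnable task, int warmupIterations, int measuredIterations) {
        if (measuredIterations <= 0) {
            throw new IllegalArgumentException("measuredIterations must be positive");
        }
        for (int i = 0; i < warmupIterations; i++) {
            task.run(); // results discarded; this only warms up the JVM
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredIterations; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measuredIterations;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the validation call being measured.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.printf("average: %.3f ms%n", averageMillis(task, 50, 200));
    }
}
```

A single cold run mostly measures class loading and interpreted execution, so it can differ substantially from the steady-state cost per request. A dedicated harness such as JMH handles warm-up and measurement more rigorously than this hand-rolled loop.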
