Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A proposal for supporting the JSON-P types of the javax.json package. #220

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

ceharris
Copy link

@ceharris ceharris commented Sep 13, 2018

A proposal for supporting the JSON-P types of the javax.json package.

Motivation

The JSON-P types provide the underlying JSON abstraction for applications
that utilize Java EE 7 or Java EE 8. In both EE 7 and 8 these types are
commonly employed in building REST APIs using JAX-RS, where high quality
JSON schema validation support provided from the Everit library would
provide great value. Java SE applications using the Jersey REST server
or client components would similarly benefit from JSON schema validation
support.

As JSON schema continues to make progress on the IETF standards track, its
use with the Java platform will need to be standardized through the Java
Community Process. Supporting the JSON-P types in this library will
provide an example implementation of JSON schema support for Java that fits
naturally within the existing support for JSON-P and JSON-B, and will
position the author(s) of the Everit library to significantly influence
the direction of the standardization process for JSON Schema support in
Java.

Goals

  1. Provide a means for the Everit JSON Schema library to support
    validation of JSON values (structures and scalars) using the JSON-P
    types from JSR-374 (and its predecessor JSR-353) in the javax.json
    package.
  2. Ensure full backwards compatibility for use of the library with
    org.json types.
  3. Minimize performance impacts of supporting more than one system of
    JSON types.
  4. Provide an SPI that would facilitate later adaptation to future
    specifications as well as other Java JSON implementations (e.g. Jackson)
  5. Allow the use of alternative JSON type systems to be configurable
    at runtime.

Non-Goals

  1. Replacement of the underlying JSON type system used to represent JSON
    Schema structures in the library.
  2. Implementation of support for type systems other than the existing
    org.json and the JSON-P types.

Analysis

Using the current releases of the Everit JSON Schema library, one
supported approach for using a JSON type system other than that provided
by org.json is translation. With this approach, in an application
that uses the JSON-P types in javax.json, it is necessary to transform
JSON structures from the JSON-P types to instances of org.json.JSONObject
and/or org.json.JSONArray. While the translation itself is reasonably
straightforward, it is a very expensive approach and may not be feasible
in applications that need to support JSON schema validation operations at
a high volume per unit time.

Another potential approach would be to adapt the JSON-P data types to the
API provided by the org.json types. This approach is significantly
complicated by the fact that these types are concrete classes rather than
interfaces. Any such approach would involve creating new concrete
subtypes that override virtually all of the functionality of the base types
without violating the contract of the base types. This is further
exacerbated by the fact that these types expose a very large number of
public methods which would need overrides. It is not clear that these
types were designed to be subclassed in this manner, and there may well be
implicit assumptions that are inherent to the design of these types that
make such an approach infeasible.

After extensive study of the architecture and implementation of the Everit
library, it becomes clear that there exists an (implicit) contract between
the validation implementation and the JSON types provided by org.json;
i.e. that while the org.json types provide a large API surface, only a
small subset of this is actually used in the validation implementation.
Moreover the scope of implementation classes that utilize this implicit
contract is relatively small due to the visitor-based relationship
between the JSON schema abstraction and the classes that implement
validation.

In this contract, the JSON type system is assumed to provide a JSON object
type with a very minimal Map-like interface. Similarly, the JSON array
type is assumed to provide a minimal List-like interface. It is
assumed that the JSON type system uses the wrapper form of the intrinsic
Java types for strings, booleans, numbers. With respect to nullity, it
assumes that the JSON type system supports not only the intrinsic Java
null but also a null object; i.e. an instance of some Java type that
represents the null JSON value.

With this understanding, it becomes clear that by creating a service
provider interface (SPI) that describes the contract between the JSON type
system and the validation components, it would then be possible to substitute
a different JSON type system. The SPI will act as a thin adaptation layer
between the validation components and the JSON type system. At the
expense of some additional method call dispatches through the adapter
layer at runtime, the validation components can easily support a different
JSON type system.

It is important to note that in this approach, neither the interface nor
the implementation of the schema components of the library are changed.
The Schema type and its related components are treated as a black box
and its underlying JSON type is irrelevant. Only the validation components
need to support adaptation to a different JSON type system.

Proposal

The proposal represented in the code of this pull request is as follows.

  1. Introduce a JsonAdaptation service provider interface (SPI), that
    represents the relatively small contract between validation components
    and the JSON type system.
  2. Implement the SPI using the org.json types and use this adaptation
    as the default provider in the validation components. This ensures that
    existing uses of the library are unaffected.
  3. Introduce an implementation of the SPI using the JSON-P types of
    the javax.json package and provide the means to configure the
    ValidatorBuilder to use this as an alternative to the default
    adaptation to the org.json types.

As can be observed in reviewing the pull request, the amount of change
required in the validation components is small and highly targeted. By
defining the SPI according to the implicit contract between validation
components and the org.json types, very little change is needed in most
of the validation components.

Moreover, using this approach the only required change to existing unit
tests for the library is to acknowledge the default org.json adaptation
in the ValidatingVisitorTest; i.e. passing an additional argument to
to ValidatingVisitor constructor invocations in the test. With
no other changes to the test suite, all existing tests pass.
Assuming that test coverage is reasonably complete, this should give some
measure of confidence that the introduction of the adaptation layer does
not change the externally visible behavior of the validation components.

Performance Considerations

Adapting the library to support another type system will inevitably involve
some trade-off of performance for flexibility. But the approach taken here
aims to minimize this impact. In the default org.json adaptation additional
method call dispatches through the adaptation layer should introduce
reasonably small additional overhead. These adapters are quite simple with
minimal branching logic, and conditional expressions that involve few operands
and corresponding fetches. Performance profiling could be used to validate
these assertions.

An additional performance improvement could be made and has been marked with
a TODO in the implementation. In the getFailureOfSchema method of
ValidatingVisitor, the adapted JSON value received as input from the caller
is passed to the adaptation layer to be inverted (un-adapted, so to speak).
This is done because there are a handful of test cases that make assumptions
that are violated by the presence of adapter types. This approach was used to
avoid modifying the tests at the same time the adaptation layer is introduced,
to minimize risks. However, after some successful experience with the
adaptation layer, it would be desirable to revisit the design of these tests
to avoid the additional call back into the adaptation layer in this method.

Addtional Testing Considerations

Naturally, all of the existing unit and integration tests are focused on
testing the implementation based on org.json types. Indeed, the
implementation of this proposal deliberately avoids changing the existing
tests to the extent possible to minimize the risks of introducing externally
visible changes in the behavior of the validation components.

One additional test class with tests for the ValidationVisitor using the
JSON-P adaptation was added, as were tests for the adaptation classes; while
there is not much logic in these classes, the correct operation of these adapters
is obviously essential.

It may be also be worthwhile to consider modifying the existing tests to
account for the use of adapter types for JSON arrays and objects, as noted
previously.

In the integration test phase of the build some thought and effort should be
expended to determine where and how to modify the integration test suite to
test validation using each of the supported JSON type systems, especially in
the integration test phase of the build. Again, guidance from the Everit
authors would be helpful here.

@coveralls
Copy link

coveralls commented Sep 13, 2018

Coverage Status

Coverage increased (+0.3%) to 92.86% when pulling 5dfcb2b on soulwing:master into a61c6a6 on everit-org:master.

@erosb
Copy link
Contributor

erosb commented Sep 18, 2018

Hello @ceharris ,

thank you for this PR and all the effort you put into this proposal. Much appreciated, and also . There are a couple of reasons why I'm not going to merge it.

Most importantly, the library already has an abstraction layer over org.json, but it is only used in the loader package - I mean the JsonValue class and its subclasses. Though these were created with the loading process in mind, they can probably be improved to support the validation flow too.

Do you have time and/or commitment to give it an other go with this in mind?
You can contact me at ebence88@gmail.com if you need any assistance (sidenote: this project is not related to Everit, I left the company a few months after the initial release and I maintain the library voluntarily since then).

@ceharris
Copy link
Author

When I first started studying the source code for this library, I noticed the abstraction layer used in the schema package. I was surprised that you hadn't used the same abstraction in the validation itself. I think it should be a relatively proposition to refactor to share the same JSON abstraction for both the schema and the validation and I'm willing to give it a try.

@erosb
Copy link
Contributor

erosb commented Sep 19, 2018

Sounds awesome. That abstraction layer was introduced much later than the original release / concept. Its core purpose is to let us reliably track the location of the current schema json during loading, relative to the root document. This would be useful to also do during validation, because currently on validation failure the schema pointer (ValidationException#schemaLocation) and instance pointer (ValidationException#pointerToViolation) are computed in a completely different way, and it would be useful to clean it up at some time in the future (again, historic reasons..)

The JsonValue class is tightly coupled to the LoadingState class. I suggest extracting a superclass from LoadingState which only maintains the pointerToCurrentObj value (we can call this subclass TraversalState) and use a TraversalState during validation.

One more thing: we use a maven plugin for automatically checking API backward-compatibility. Unfortunately it doesn't work on CI, but you can run it locally by

  • checking out the latest released git tag (1.9.1)
  • building the project with mvn clean install so that the latest release gets copied into the local maven repo
  • then building your current HEAD version with mvn clean install (the built jar will be compared to the latest release)

These are my thoughts on this for now, let me know if you need any further assistance.

@erosb
Copy link
Contributor

erosb commented Nov 12, 2018

Hello @ceharris could you make any progress on this effort?

@ceharris
Copy link
Author

It is still on my list. I got diverted to another project at work to resolve a crisis, which I hope is coming to closure late this week or early next.

I am motivated to work on it... the project in which I'm using your json-schema as a dependency is currently using my fork with the initial proposal for supporting javax.json. ;-)

@ceharris
Copy link
Author

After a very long diversion on a troubled project, I'm finally back to the project for which I developed this fork, and I'm still interested in doing the work to make it possible to get some variation on this PR merged.

Are you still interested in this work?

In your previous messages you mentioned the existing internal abstraction layer over org.json types that is used in the schema loader, and you suggested that I reconsider this proposal in light of the existing abstraction.

After re-reading your comments a few times, your goals aren't completely clear to me. It seems you wish to use the existing internal JSON abstraction in the validation subsystem, in addition to its use in schema loading. Your rationale seems to be that this will allow a common approach to reporting location across both the schema loader and the validator. Am I understanding this correctly?

One approach would be to replace the existing uses of org.json types in the validator with uses of the internal JSON types used in the schema loader. The internal JSON types would then be reworked to allow them to delegate to either the org.json types or to a different JSON type system such as JSON-P (in much the same manner as proposed in this PR as it currently stands). Effectively, this would mean that both the validator and the schema loader would depend only on the internal JSON abstraction, but that abstraction would be reworked to allow the JSON provider to be substituted at runtime (in much the same spirit as this PR as it presently stands).

Is this what you're wanting, or am I missing it?

@erosb
Copy link
Contributor

erosb commented Jan 16, 2020

Hello Carl,

thank you for getting back to this PR. I hope you succeeded on your troubled project.

you wish to use the existing internal JSON abstraction in the validation subsystem, in addition to its use in schema loading. Your rationale seems to be that this will allow a common approach to reporting location across both the schema loader and the validator. Am I understanding this correctly?

Yes, that's correct. Also, the other reason why I'd like to use the existing json abstraction is that (with some improvements) it could be used for resolving the tight coupling between the org.json library and this library. The goal of this improvement is to let us support other json libraries as well, also other formats which are similar to json. This feature was requested a few times, though it isn't a top-priority thing currently.

One approach would be to replace the existing uses of org.json types in the validator with uses of the internal JSON types used in the schema loader. The internal JSON types would then be reworked to allow them to delegate to either the org.json types or to a different JSON type system such as JSON-P

Yes, that would be the ideal approach.

Effectively, this would mean that both the validator and the schema loader would depend only on the internal JSON abstraction, but that abstraction would be reworked to allow the JSON provider to be substituted at runtime

Exactly.

Are you still interested in this work?

Well, I'm interested in a sense that I'd be very happy to see this happen. On the other it is a very ambitious rework and I simply won't have time to implement it on my own this year. I'm also not 100% sure if we can implement it without breaking backward-compatibility (BC verification is at least the part of the build process so we can simply check it).

Furthermore, this is currently not the top-priority feature request on the line. The most important major task is adding support to the latest draft specification (draft-2019-09, formerly draft-8).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants