Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ELM Typescript Types #266

Open
csenn opened this issue Mar 3, 2022 · 21 comments
Open

ELM Typescript Types #266

csenn opened this issue Mar 3, 2022 · 21 comments

Comments

@csenn
Copy link

csenn commented Mar 3, 2022

Thanks a lot for this library and the port to Typescript, awesome work in this space!

I have been working with ELM more and more while executing CQL, and was wondering if you have Typescript types for the ELM json? It would probably make dealing with the ELM tree much easier instead of everything being 'any'.

I know here you have the XSD schemas which could probably be in theory converted to Typescript definitions, just wanted to double check if it's been done yet... I could also give it a shot if you think that would be useful?

@cmoesel
Copy link
Member

cmoesel commented Mar 3, 2022

Hey @csenn! We've definitely talked about this, but haven't made any immediate plans for it just yet. The first step was just getting the basic types in so we could compile w/ tsc and then make incremental improvements. But... I think we're all in agreement that types on ELM would be awesome!

I'm tagging @mgramigna and @hossenlopp because I think they've thought about this even more than I have and I think @hossenlopp has even done some initial implementation of a subset of ELM types for a different project.

@hossenlopp
Copy link
Contributor

Hey @csenn! One of our projects does some shuffling through the ELM JSON outside of execution and I did some work to define types for that. It's a bit pared down from the entire set of Expressions since I mainly added them as I needed them. If thats at all helpful, they can be found here: https://github.com/projecttacoma/fqm-execution/blob/master/src/types/ELMTypes.ts

@csenn
Copy link
Author

csenn commented Mar 3, 2022

Thanks @cmoesel and @hossenlopp! Nice work creating these types in fqm-execution, I can't help but wonder if we could auto-generate the elm types using the XSD definitions here. I'll look into that a bit, I'm not super familiar with XSD but seems possible at first glance. That would just be a bit more maintainable than having to manually keep them in sync. We could also probably publish the ELM types to DefinitleyTyped and anyone could use them with yarn add @types/cql-elm --dev

@cmoesel
Copy link
Member

cmoesel commented Mar 3, 2022

@csenn -- I think that's a great idea. There may be other TS projects outside of cql-execution that could benefit from the ELM types, so I think it makes sense for them to be published via DefinitelyTyped. Before we go too far though, I think there is one other person we should ask:

@brynrhodes - have you or anyone in your shop given thought to TypeScript type definitions for ELM? Do you see any potential issues w/ attempting to convert the XSD to types (which would need to describe the JSON format of the types)? Or is there a better reference for the JSON format that the CQL-to-ELM translator produces?

@csenn
Copy link
Author

csenn commented Mar 4, 2022

Good points @cmoesel, there could def be an easier format to deal with JSON ELM than XSD...

It might also be possible to output a JSON Schema for ELM JSON as well, if that's useful. I believe Typescript types can be generated from json schema.

It looks like the ELM JSON is being generated in this function which directly marshals the JAXB class instances that were generated from the incoming XSD?

@brynrhodes
Copy link
Member

@cmoesel , @csenn, better support for JSON is one our priorities in the translator. We're currently working on JSON representation of ModelInfo, as well as improved JSON output for the ELM. We have both JSON (using JAXB) and JXSON (tongue firmly in cheek here, that uses the Jackson serializer). The JXSON format is required for deserialization in Android and we are proposing to migrate to that format for all JSON and deprecate the use of JAXB altogether.

As a side note, we did just use "reinert" to build a JSONSchema for the CQLTranslatorOptions class, that might be a round-about way to get there.

I feel like there must be a decent path from XSD to JSONSchema, but I tried a few online converters without success. I can log a tracker to take a deeper look at that: cqframework/clinical_quality_language#730

@brynrhodes
Copy link
Member

And sorry, just reading back on this thread, @csenn, if you want to take a crack at getting JSONSchema files for the XSD files, that would be fantastic. If we can get good representations there I'd be happy to propose them back to the CQL specification.

@csenn
Copy link
Author

csenn commented Mar 4, 2022

Hey @brynrhodes! Sounds good I'll look more into this. I did have a few a clarifying questions though. Is the current workflow:

  1. Manually Edit XSD file with desired ELM structures
  2. Use the ELM library to generate the javax classes
  3. Parse CQL using the Antlr4 parsers/lexers/visitors etc in cql-to-elm and build ELM tree instances using the javax classes
  4. Marshal finished class instances out to XML, JSON, JXSON etc...

If that's right, do your future future designs change this flow? And what do you mean by "improved JSON output for the ELM"? Whats the difference between the JXSON and JSON output that make Android able to consume it?

@brynrhodes
Copy link
Member

Hi @csenn , yes that's the correct flow currently, and no, the future designs don't change that, they are just focused on making the resulting ELM more consumable on more platforms. There are 3 areas we're currently working on to improve support:

  1. JSON ModelInfo (so we can separate XML dependencies entirely in the translator and you could use it without them)
  2. Improved annotation format. Right now we use mixed content in the XML to decorate the annotations, which is a big ugly mess when it comes through in JSON, so we're planning on serializing that to an HTML string directly, rather than using mixed content, because it needs to be HTML anyway, that's the whole point of it
  3. Deprecate JAXB JSON serialization entirely (switch to the "JXSON")

The difference between the JAXB JSON and the Jackson JSON is in the way that it renders the mixed content, it actually uses some JAXB-specific features, so you can't deserialize the JAXB JSON with a Jackson deserializer, it has to know how to handle the JAXB-specific usage. In the JavaScript engine it works because the JavaScript engine expects the behavior and handles it, but it doesn't work for Android, we have to serialize with the JXSON format (that's why we have both right now). And even if there was, we shouldn't be relying on a JAXB-specific feature, which is what #2 above is about).

Sorry, that was a lot of context, but hopefully that helps?

@cmoesel
Copy link
Member

cmoesel commented Mar 7, 2022

Just to loop this back a little bit (to the original conversation), based on this thread, I'm assuming that your team has not done anything to create TypeScript types for ELM then, right @brynrhodes? And you'd be fine w/ someone (@csenn or someone at MITRE) publishing ELM types to DefinitelyTyped?

@brynrhodes
Copy link
Member

We haven't done anything, no, and would be happy for them to be published, the only concern is that the ELM types and the serialization be compatible (so using the Jackson serializer approach).

@csenn
Copy link
Author

csenn commented Mar 7, 2022

Hey @cmoesel and @brynrhodes, I appreciate the clarifications! I'm still not 100% sure what "mixed content" means from the message above though?

I was able to make pretty good progress here and got something that appears to work pretty well at transforming the XSD files into Typescript Types. It could probably also be extended to do a json schema as well as the "processor" holds a representation of the XML schema in memory and a new generator could be made that does json schema.

But you're right, there are tricky parts of XSD to Typescript/ json-schema compatibility. The biggest issue is inheritance, and the way polymorphism using discriminators is used to differentiate nodes in the ELM tree.

So for instance in the following code any expression can be used in ExpressionDef. But the discriminator type property constants of "IdentifierRef" and "Literal" don't exist as properties in the XSD files, and appear to be auto-generated during JSON marshaling by either Jackson or somehow else.

interface ExpressionDef extends AbstractElement {
  name?: string;
  context?: string;
  accessLevel?: AccessModifier;
  expression?: Expression;
}

/*
The IdentifierRef type defines an expression that references an identifier that is either unresolved, or has been resolved to an attribute in an unambiguous iteration scope such as a sort. Implementations should attempt to resolve the identifier, only throwing an error at compile-time (or run-time for an interpretive system) if the identifier reference cannot be resolved.
*/
interface IdentifierRef extends AbstractExpression {
  type: "IdentifierRef" ;
  name?: string;
  libraryName?: string;
}

/*
The Literal type defines a single scalar value. For example, the literal 5, the boolean value true or the string "antithrombotic".
*/
interface Literal extends AbstractExpression {
  type: "Literal" ;
  valueType: string;
  value?: any;
}

So I add these types in the code automatically, however there seems to be occasional times where there are inconsistencies in how these discriminators are generated. They are not generated on ALL nodes that extends from abstract base classes, only some. And they are generated when inheriting from a non-abstract base class, but that non-abstract parent does not get the discriminator.

That being said, I think these types work in almost all cases. In the test-d folder these ELM structures are fully valid and type checked except for one single error, which might actually be a small error in the ELM generation.

There also may be a few changes that could be made to XSD files to make these types a little better, such as marking some properties required when they will always should be there.

In terms of your current flow, having a single source of truth to define the ELM structures that everything else down stream can be generated from definitely seems a good way to go. One could possibly argue there may be a more cross compatible format than XSD such as Protocol Buffers or even JSON Type Def that might be a little easier to deal with, but as long as you don't use too many XML specific rules in the XSD it should be ok...

@csenn
Copy link
Author

csenn commented Mar 8, 2022

And one more quick note...typescript-json-schema converts typescript types into json schema. So I ran the Typescript Types from above through that and it generated this schema which at first glance looks to work reasonably well. Wrote a quick script to test and then manually changed some properties to break the schema, and it does seem to work!

There could be weaknesses with this approach though, definitely interested in what you guys think. @cmoesel I have been going through some of the source in cql-execution too, and since you already have Typescript setup here it would be pretty easy to try out these types in a few places if you think it would be beneficial.

@cmoesel
Copy link
Member

cmoesel commented Mar 8, 2022

This sounds like great progress, @csenn! I've been heads down working on some other projects, so haven't had a chance to look at this yet. I just tried and found that I don't have access to your cql-elm-types repository. Can you give me access to it so I can take a look?

I'd be very open to trying out some of these types in cql-execution -- but we might need to coordinate timing w/ another team that is doing some significant work on this project (just to ensure we don't cause too many git conflicts). Let me check w/ them to see what they think.

@csenn
Copy link
Author

csenn commented Mar 8, 2022

Sorry @cmoesel, repo should be public now!

@mgramigna
Copy link
Contributor

@csenn Sorry to get around to this thread late! I'm one of the authors of fqm-execution, and was the contributor of the initial TypeScript code here. I also think my team is the one @cmoesel mentioned about coordination with other significant cql-execution work.

Chris and I talked about it briefly, and I agree that I think it'd be great to try out these types in cql-execution and see how things behave. I don't think it should conflict too much with the work we're doing, as the ELM types changes should be largely scoped to the code that ingests the ELM JSON and build it into expressions to be executed. The git conflicts should be minimal if any.

@cmoesel -- any next steps?

@cmoesel
Copy link
Member

cmoesel commented Mar 9, 2022

I agree w/ everything @mgramigna said. @csenn, I think you were offering to explore integrating these types into cql-execution on your own, but if you were hoping our team would do it, let me know.

If you're able to give it a try, I encourage you to fork cql-execution, add the ELM type definitions, then try updating some of the code to use them. Most of the relevant code would be in src/elm/.

One thing that might present a challenge is that we named many of our classes after the ELM types. So you may need to reference types via a namespace (e.g., elm.Add), rename imports (e.g., import { Add as AddELM }), or regenerate the types w/ different names (e.g., AddELM).

If it seems to work, then feel free to submit a PR (even if it is only incremental improvement).

Then at some point, I imagine we'll have proven out the types well enough that you can start the process to get them into DefinitelyTyped (again, assuming you want to do that yourself since you made them).

@csenn
Copy link
Author

csenn commented Mar 9, 2022

Sounds great @cmoesel and @mgramigna, I'll keep it simple at first, maybe just copy paste the index.d.ts and just try a few out to see how it looks. We can always publish to DefinitleyTyped later.

I'm using it some in my own code that traverses ELM already and the imports currently are namespaced using the ELM prefix which seems pretty standard.

import ELM from './elm-types';

function doThing (library: ELM.Library) : void {
    library.includes?.def?.forEach((includeDef) => {
      const range = getRangeFromLocator(includeDef?.locator || '');
      console.log(range)
    });
}

One small issue though is the need need for all those ? marks, because most properties are being generated as optional. Here is the library definition that was generated, and here is @mgramigna library that was done manually

It turns out though the XSD schema definitions could be interpreted a few ways.

	<xs:complexType name="IdentifierRef">
		<xs:complexContent>
			<xs:extension base="Expression">
				<xs:attribute name="name" type="xs:string"/>
				<xs:attribute name="libraryName" type="xs:string" use="optional"/>
			</xs:extension>
		</xs:complexContent>
	</xs:complexType>

	<xs:complexType name="Literal">
		<xs:complexContent>
			<xs:extension base="Expression">
				<xs:attribute name="valueType" type="xs:QName" use="required"/>
				<xs:attribute name="value" type="xs:anySimpleType" use="optional"/>
			</xs:extension>
		</xs:complexContent>
	</xs:complexType>

Notice there are 3 cases in these 2 complex types:

  1. use = required
  2. use = optional
  3. no use

In my code I think I assumed only use=required was the only time the property was required, but maybe "no use" should also be assumed required and a property is only optional when explicitly stated? Do you happen to know that as well @brynrhodes ?

@cmoesel
Copy link
Member

cmoesel commented Mar 9, 2022

Cool, @csenn. I think XSD states that attributes are optional by default -- so no use is the same as use="optional". (As opposed to elements, which are minOccurrs="1" by default).

That said, I can't think of a case where IdentifierRef wouldn't have a name -- so I don't know why it's sticking w/ leaving name optional. (But I may be missing an edge case).

I suppose it's possible that there are attributes marked (or unmarked) optional in the ELM XSD that, in practice, are always present anyway. In that case it may be worth thinking about if they really should be optional. A good question for @brynrhodes.

@csenn
Copy link
Author

csenn commented Mar 14, 2022

Hey @cmoesel! Here is the beginning of a small MR with a few types added in. I'm trying to change the code as little as possible, but there a few times where it's better to throw errors if certain properties dont exist. In these cases i believe the code would fail during execution anyways and would be caused by Malformed ELM in the first place.

There are some tricky parts to adding types into this repo, for instance code such as

const args = this.execArgs(ctx);

is always returning any, and then the rest of the code doesn't know how to use typed "args".

Let me know what you think would be best going forward, it may be a good idea to just keep adding types but I didn't want this MR to get too overwhelmingly large before you (and team) got eyes on it.

@cmoesel
Copy link
Member

cmoesel commented Mar 15, 2022

Thanks, @csenn! Things are a bit busy over here. Sorry for the delay! This is pretty exciting -- and we will take a look at it for sure -- but it might take a few days. Thanks for the heads up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants