RFC: Faster unions (vs z.switch()) #3407
Comments
LGTM! 💪
Wondering if [...] Am I missing something? P.S. abort early! 🎉
I think this makes a lot of sense! ArkType goes pretty far down the "let's create a runtime type system and handle discrimination robustly" rabbit hole, but I think using heuristics like checking only literals at the root of the object can get you most of the benefit and save a ton of complexity.
The other category of disjoints besides literals that might be worth considering is [...]
If it's of any use, this is the algorithm AT uses to determine the "best" discriminator (or sequence of discriminators, if it requires multiple steps) for a union: [...]
Happy to discuss in more detail any time if it would be useful! Definitely one of my favorite subjects if you don't mind hearing me nerd out a bit over it 😄
Am I understanding this right?
Before:
z.discriminatedUnion("kind", [
  z.object({ kind: z.literal("number"), amount: z.number() }),
  z.object({ kind: z.literal("boolean"), value: z.boolean() })
])
After:
z.union([
  z.object({ kind: z.literal("number"), amount: z.number() }),
  z.object({ kind: z.literal("boolean"), value: z.boolean() })
])
Optional:
z.union([
  z.object({ kind: z.literal("number"), amount: z.number() }),
  z.object({ kind: z.literal("boolean"), value: z.boolean() })
], {
  discriminator: "kind"
})
The "after" schema will initialize a bit slower(?) than it would today, but parse much faster than it would today and up to around as fast as the "before" schema does today? If so, I like it a lot, [...]
Yep this would be defined on [...]
Thanks David, I'll look into this! I haven't thought through how discriminator selection would work, so this is super useful.
I was thinking about this. Maybe a special symbol ([...])
Precisely 💯
LGTM and thanks for the breakdown of this neat pre-computation strategy 🚀! I have a few clarifying questions:
How would the fast checks generalize to unions where the "discriminator" is a non-primitive, e.g. an object? Here's an example building on @jrysana's example but nesting [...]
z.union([
  z.object({ info: z.object({ kind: z.literal("number") }), amount: z.number() }),
  z.object({ info: z.object({ kind: z.literal("boolean") }), value: z.boolean() })
])
This question is based on my understanding of [...]
Side note: this schema may not be the best, but it seems like something that could exist out in the wild. Lastly, if this has already been thought through or will be figured out later, I'm happy to wait and see what it would look like! Maybe this is what you are referring to as discriminator selection by: [...]
This seems like a nice idea to use special symbols (like [...])
Also, PS. I am glad that [...]
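One way the nested case could work is to collect literal values by path instead of only by root key. The sketch below is an assumption about how that might look, not anything the RFC specifies: `SchemaNode`, `collectLiteralPaths`, and `passesFastChecks` are hypothetical names, and the tiny schema model stands in for real Zod schemas.

```typescript
// Hypothetical schema model for illustration; not Zod's internal representation.
type Primitive = string | number | boolean | null;

type SchemaNode =
  | { type: "literal"; value: Primitive }
  | { type: "object"; shape: Record<string, SchemaNode> }
  | { type: "other" }; // anything without a finite literal set, e.g. z.number()

interface PathCheck {
  path: string[];
  value: Primitive;
}

// Walk the schema once (at creation time) and record every literal with its path.
function collectLiteralPaths(node: SchemaNode, prefix: string[] = []): PathCheck[] {
  if (node.type === "literal") return [{ path: prefix, value: node.value }];
  if (node.type !== "object") return [];
  const found: PathCheck[] = [];
  for (const [key, child] of Object.entries(node.shape)) {
    found.push(...collectLiteralPaths(child, [...prefix, key]));
  }
  return found;
}

// Read a nested property, returning undefined as soon as the path breaks.
function getPath(input: unknown, path: string[]): unknown {
  let current: unknown = input;
  for (const key of path) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// An option survives the fast check only if every recorded literal matches.
function passesFastChecks(checks: PathCheck[], input: unknown): boolean {
  return checks.every((c) => getPath(input, c.path) === c.value);
}

// Mirrors the first option of the nested example above.
const numberOption: SchemaNode = {
  type: "object",
  shape: {
    info: { type: "object", shape: { kind: { type: "literal", value: "number" } } },
    amount: { type: "other" },
  },
};

const numberChecks = collectLiteralPaths(numberOption);
console.log(numberChecks); // a single check on the path ["info", "kind"]
console.log(passesFastChecks(numberChecks, { info: { kind: "boolean" }, value: true }));
```

Storing paths as arrays avoids ambiguity when a key itself contains a dot. Whether walking nested objects is worth the extra setup cost is exactly the kind of trade-off the thread is discussing.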
Edit: Ok now I see it will solve my problem automatically. Thanks again for your great work, looking forward to the implementation.
Nice to see this is going to be addressed. Thank you for your hard work and ideas. One thing that I'd like to see being introduced to [...]
const u = z.union([
  z.object({ type: z.literal("a"), value: z.number() }),
  z.object({ type: z.literal("b"), value: z.string() }),
]);
console.log(JSON.stringify(u.safeParse({ type: "a", value: "b" })));
long error message given by [...]
I second that. I use [...] So far I really like the proposal, but I hope it also addresses the error messages; otherwise it will be unusable for me.
Forgive my ignorance, but would this allow for nested discriminated unions unlike [...]
I am using this example:
const myUnion = z.discriminatedUnion("status", [
  z.object({ status: z.literal("success"), data: z.string(), timestamp: z.date() }),
  z.object({ status: z.literal("failed"), error: z.instanceof(Error), timestamp: z.date() }),
]);
myUnion.parse({ status: "success", data: "yippie ki yay" });
This is maybe a dumb example with the [...]
EDIT: This can actually be done with [...]
New problem: The [...]
Why not instead of [...]
fastCheck(val) {
  let res: true | false | "unavailable" = "unavailable";
  for (const attr of this.attributes) {
    const fastCheck = attr.fastCheck(val[attr]);
    if (fastCheck === "unavailable") continue;
    else if (!fastCheck) return false;
    res = true;
  }
  return res;
}
Then to collect all possibilities in an [...]
const possibleSchemas = allSchemas.filter(s => s.fastCheck(val) !== false);
Maybe there is not much value to [...]
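A runnable take on the tri-state idea might look like the sketch below. It assumes each option exposes a `shape` mapping keys to field schemas (the pseudocode above indexes `val[attr]`, which leaves that mapping implicit), and every name here (`FieldSchema`, `UnionOption`, `literalField`, `stringField`) is made up for illustration.

```typescript
// Three outcomes: definitely matches, definitely fails, or no cheap answer.
type FastResult = true | false | "unavailable";

// Hypothetical field schema: only literal-like fields can answer cheaply.
interface FieldSchema {
  fastCheck(value: unknown): FastResult;
}

interface UnionOption {
  name: string;
  shape: Record<string, FieldSchema>;
}

const literalField = (expected: string): FieldSchema => ({
  fastCheck: (value) => value === expected,
});

const stringField = (): FieldSchema => ({
  fastCheck: () => "unavailable", // infinitely many valid strings, so no fast answer
});

// Same control flow as the pseudocode: any definite failure rejects the option,
// any definite success upgrades the result, and "unavailable" fields are skipped.
function fastCheck(option: UnionOption, val: Record<string, unknown>): FastResult {
  let res: FastResult = "unavailable";
  for (const [key, field] of Object.entries(option.shape)) {
    const fieldResult = field.fastCheck(val[key]);
    if (fieldResult === "unavailable") continue;
    if (fieldResult === false) return false;
    res = true;
  }
  return res;
}

const allSchemas: UnionOption[] = [
  { name: "a", shape: { type: literalField("a"), value: stringField() } },
  { name: "b", shape: { type: literalField("b"), value: stringField() } },
  { name: "c", shape: { value: stringField() } }, // no literals at all
];

const val = { type: "a", value: "hello" };

// Keep every option that was not definitively ruled out.
const possibleSchemas = allSchemas.filter((s) => fastCheck(s, val) !== false);
console.log(possibleSchemas.map((s) => s.name)); // [ 'a', 'c' ]
```

Option "c" survives because it has nothing to fast-check, so it still needs full validation. That is the main behavioral difference from a plain boolean check.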
I made my schema like this now:
export const store_order_item_schema = z.object({
session_token: z.string().length(128),
product_type: z.string(),
product_name: z.string().max(200),
value: z.string().max(200),
period: z.number().positive().default(1),
quantity: z.number().positive().default(1),
}).and(z.discriminatedUnion("product_type", [
z.object({
product_type: z.literal('domain_extension'),
extension: z.string().refine((v) => !v.startsWith('.'), 'Domain extension must not start with a dot'),
status: z.enum(['free', 'active']),
transfer_code: z.string().optional(),
//! you can not order multiple domains with the same name
quantity: z.literal(1),
period: z.literal(12).or(z.literal(24)).or(z.literal(36)).or(z.literal(48)).or(z.literal(60))
}),
z.object({
product_type: z.literal('pax8_license'),
term: z.enum(['Annual', 'Monthly']),
license_id: z.string().uuid(),
commitment_term_id: z.string().uuid(),
}),
z.object({
product_type: z.literal('hosting'),
hostname: z.string()
}),
z.object({product_type: z.literal('other')})
]))
The only thing that is bothering me is that I cannot use other values in this schema.
How about multiple discriminators that cause nesting issues like #1884? Would it be possible to accept an array of discriminators? For example:
z.union([ ... ], {
  discriminator: ["someKey1", "someKey2"]
})
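To make the array-of-discriminators idea concrete, here is a rough sketch of one possible dispatch scheme: build a lookup table keyed by the combined literal values of all listed keys. This is an illustration under stated assumptions, not a proposed Zod API. `OptionInfo`, `buildDispatch`, and `selectOption` are hypothetical, and each option is assumed to declare one primitive literal per discriminator key.

```typescript
type Literal = string | number | boolean;

// Hypothetical summary of a union option: its name and its literal-valued keys.
interface OptionInfo {
  name: string;
  literals: Record<string, Literal>;
}

// Combine the values of all discriminator keys into one lookup key.
// JSON.stringify keeps "1" and 1 distinct and avoids separator collisions.
function compositeKey(discriminators: string[], source: Record<string, unknown>): string {
  return JSON.stringify(discriminators.map((d) => source[d]));
}

// Built once at schema creation time; fails loudly if two options collide.
function buildDispatch(options: OptionInfo[], discriminators: string[]): Map<string, OptionInfo> {
  const table = new Map<string, OptionInfo>();
  for (const option of options) {
    const key = compositeKey(discriminators, option.literals);
    if (table.has(key)) {
      throw new Error(`Options are not distinguishable by ${discriminators.join(", ")}: ${key}`);
    }
    table.set(key, option);
  }
  return table;
}

// At parse time a single map lookup picks the candidate option, if any.
function selectOption(
  table: Map<string, OptionInfo>,
  discriminators: string[],
  input: Record<string, unknown>,
): OptionInfo | undefined {
  return table.get(compositeKey(discriminators, input));
}

const discriminators = ["someKey1", "someKey2"];
const table = buildDispatch(
  [
    { name: "first", literals: { someKey1: "x", someKey2: 1 } },
    { name: "second", literals: { someKey1: "x", someKey2: 2 } },
    { name: "third", literals: { someKey1: "y", someKey2: 1 } },
  ],
  discriminators,
);

console.log(selectOption(table, discriminators, { someKey1: "x", someKey2: 2 })?.name); // second
```

Neither key alone separates these three options, but the pair does. The sketch only handles primitive discriminator values on the input; anything else would need the slower path.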
Okay, I'm glad I didn't rush a z.switch() implementation, because I now think there's a better way forward. This is definitely my own brilliant idea and not something @gcanti suggested in a Twitter DM.
It's pretty simple, we just...make z.union() better. Here's the case for sticking with plain z.union():
[...] z.switch(). Code generation tools need a way to enumerate all elements of a union, and that isn't possible with z.switch().
[...] z.discriminatedUnion
How? The idea is for Zod to do some "pre-computation" at schema creation time to make parsing faster.
[...] .getLiterals(): Set<Primitive> method that returns the set of known literal values that will pass validation. For something like z.string(), this will be undefined (there are infinitely many valid inputs). For something like a ZodEnum or ZodLiteral, this will be a finite set of values.
[...] z.union(), Zod uses getLiterals() to extract a set of "fast checks" for each union element in the form {[k: string]: Set<Primitive>}.
[...] abortEarly mode to bail out ASAP in the event of validation errors. This mode will speed up ZodUnion performance even if there's no discrimination to be done. This was already on the Zod 4 roadmap but is made even more relevant now.
The API could also accept a "discriminator hint" to point the parser in the right direction.
In retrospect, this is what the API always should have been. Discriminated unions are not a type unto themselves, just an optimization over unions, and the API should have reflected that.
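For readers who want to see the pre-computation idea end to end, here is a minimal sketch under loud assumptions: the `Schema` interface, the `literal`/`num`/`bool`/`object`/`union` helpers, and the `fullCheckCount` counter are all invented for this example and are not Zod's actual internals. Only the `getLiterals()` name and the `{[k: string]: Set<Primitive>}` fast-check shape come from the RFC text.

```typescript
// Hypothetical minimal model; none of this is Zod's real implementation.
type Primitive = string | number | boolean | null | undefined;

interface Schema {
  // Finite set of accepted literal values, or undefined when the set is unbounded.
  getLiterals(): Set<Primitive> | undefined;
  // Full validation, standing in for the slow path.
  check(value: unknown): boolean;
}

interface ObjectSchema extends Schema {
  shape: Record<string, Schema>;
}

const literal = (v: Primitive): Schema => ({
  getLiterals: () => new Set<Primitive>([v]),
  check: (x) => x === v,
});

const num = (): Schema => ({
  getLiterals: () => undefined,
  check: (x) => typeof x === "number",
});

const bool = (): Schema => ({
  getLiterals: () => undefined,
  check: (x) => typeof x === "boolean",
});

const object = (shape: Record<string, Schema>): ObjectSchema => ({
  shape,
  getLiterals: () => undefined,
  check: (x) => {
    if (typeof x !== "object" || x === null) return false;
    const input = x as Record<string, unknown>;
    return Object.entries(shape).every(([key, s]) => s.check(input[key]));
  },
});

type FastChecks = { [k: string]: Set<Primitive> };

function union(options: ObjectSchema[]) {
  // Pre-computation at schema creation: for each option, record the root keys
  // whose schemas accept only a finite set of literals.
  const fastChecks: FastChecks[] = options.map((option) => {
    const checks: FastChecks = {};
    for (const [key, schema] of Object.entries(option.shape)) {
      const literals = schema.getLiterals();
      if (literals) checks[key] = literals;
    }
    return checks;
  });

  let fullCheckCount = 0; // demo-only counter of slow-path validations

  return {
    fastChecks,
    get fullCheckCount() {
      return fullCheckCount;
    },
    // Returns the index of the first matching option, or -1 if none matches.
    parse(value: unknown): number {
      if (typeof value !== "object" || value === null) return -1;
      const input = value as Record<string, unknown>;
      for (let i = 0; i < options.length; i++) {
        const passesFast = Object.entries(fastChecks[i]).every(([key, literals]) =>
          literals.has(input[key] as Primitive),
        );
        if (!passesFast) continue; // ruled out without running full validation
        fullCheckCount++;
        if (options[i].check(input)) return i;
      }
      return -1;
    },
  };
}

const u = union([
  object({ kind: literal("number"), amount: num() }),
  object({ kind: literal("boolean"), value: bool() }),
]);

console.log(u.parse({ kind: "boolean", value: true })); // 1
console.log(u.fullCheckCount); // 1
```

In this toy version, only the second option is ever fully validated, because the first fails its `kind` fast check. A "discriminator hint" could plausibly restrict the pre-computation to one named key, though the RFC does not spell that out.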