Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite Valibot with the pipe function #502

Merged
merged 299 commits into from May 21, 2024
Merged

Rewrite Valibot with the pipe function #502

merged 299 commits into from May 21, 2024

Conversation

fabian-hiller
Copy link
Owner

@fabian-hiller fabian-hiller commented Apr 3, 2024

One month after I published #463 to discuss Valibot's API, it is finally time to share my first draft with you. I have not only implemented the pipe functions. In the end, I basically rewrote the whole library. I improved many things while keeping performance, bundle size and developer experience in mind.

Except for the pipe function, the basic API remains the same. However, to make the migration as smooth as possible for you, we plan to partner with Codemod to automatically migrate most of the changes with a single CLI command.

pipe function

First of all, thank you for your feedback! It was amazing to see how the Valibot community came together to discuss this change. In total, my post on X got more than 15,000 impressions and more than 70 developers participated on GitHub.

Better mental model

The pipe function reduces the mental model of Valibot to schemas, methods and actions. Schemas validate data types like strings, numbers and objects. Methods are small utilities that help you use or modify a schema. For example, the pipe function is a "method" that adds a pipeline to a schema. Actions are used exclusively within a pipeline to further validate or transform a particular data type.

Simplified source code

By moving the pipeline into its own function, we made Valibot even more modular. This has several advantages. The most obvious is that it reduces the bundle size if your schemas do not require this feature. But even if you use the pipe function, you can expect an average 15-20% reduction in bundle size with the new implementation.

Another great benefit is that we were able to simplify the schema functions by removing the pipe argument. Schema functions are now more independent and only cover the validation of their data type. This allows us to simplify the unit tests, making the library even safer to use.

Furthermore, for any primitive data type schema such as string, we could remove its async stringAsync implementation, since this was only necessary to support async validation within the pipe argument. This will further reduce the overall bundle size of the library.

More flexibility and control

The pipe function also adds flexibility to the library and gives you more control over your schemas. Previously, it was very difficult to extend the pipeline of a schema with additional validations or transformations. This changes with the pipe functions because they can be nested, making your code more composable.

import * as v from 'valibot';

const EmailSchema = v.pipe(v.string(), v.email());
const GmailSchema = v.pipe(EmailSchema, v.endsWith('@gmail.com'));

Another great benefit is that pipelines now support data type transformations. There is no longer any confusion between a transformation method and pipeline transformations. You can even add another schema function after a transformation to validate its output.

import * as v from 'valibot';

const PixelSchema = v.pipe(
  v.string(),
  v.regex(/^\d+px$/),
  v.transform(parseInt),
  v.number(),
  v.maxValue(100)
);

When building forms, this also gives you more control over the input and output type of a schema. It is now possible for a field to have an input type of string | null but an output type of string without specifying a default value.

import * as v from 'valibot';

// This schema
const ProfileSchema = v.object({
  name: v.string(),
  age: v.pipe(v.nullable(v.number()), v.number()),
  bio: v.pipe(v.nullable(v.string()), v.string()),
});

// Has this input type
type ProfileInput = {
  name: string;
  age: number | null;
  bio: string | null;
};

// But this output type
type ProfileOutput = {
  name: string;
  age: number;
  bio: string;
};

object schema

Another major change is that I have removed the rest argument from the object schema. This reduces the bundle size by up to 150 bytes if you don't need this functionality. If you do need it, you can use the newly added objectWithRest schema, and to allow or not allow unknown entries, there is now a special looseObject and strictObject schema for a better developer experience.

import * as v from 'valibot';

const NormalObjectSchema = v.object({ ... });
const ObjectWithRestSchema = v.objectWithRest({ ... }, v.string());
const LooseObjectSchema = v.looseObject({ ... });
const StrictObjectSchema = v.strictObject({ ... });

I also plan to remove the rest argument from the tuple schema and provide a tupleWithRest, looseTuple and strictTuple schema.

Type safety

I am happy to announce that I have been able to drastically improve the type safety of the library. Previously, like many other libraries out there, less important parts were only generally typed.

A concrete example is the issues that Valibot returns when the data does not match your schema. Even though each issue contains very specific data depending on the schema and validation function, they were all previously typed with a generic SchemaIssue type.

With this rewrite, each schema and validation function brings its own issues type. This way each schema knows exactly what kind of issues can occur. We also added the isOfKind and isOfType util functions to filter and handle specific objects in a type-safe way.

valibot

Unit tests

Valibot was kind of my first project that came with unit testing. When I implemented the first tests, I had very little experience with testing. Even though the current tests cover 100% of the library, I don't feel comfortable releasing v1 before rewriting them and making them perfect. Also, we have had some type issues in the past. That's why I'm also planning to add type testing to ensure the developer experience when using TypeScript.

It is important to me to ensure the functionality of the library and to make sure that no unexpected bugs occur when changes are made to the code in the future. Developers should be able to fully rely on Valibot. If you are a testing expert, please have a look and review the string and object schema tests.

Next steps

I created this draft PR for two reasons. One is to get feedback from the community before rewriting the entire library. I have already discussed a lot of details with @Demivan and think we are on a good path, but maybe we missed something. Feel free to investigate the source code and reach out via the comments.

With the following commands you can clone, bundle and play with the new source code in advance:

# Clone repository
git clone git@github.com:fabian-hiller/valibot.git

# Switch branch
git switch rewrite-with-pipe

# Install dependencies
cd ./valibot && pnpm install

# Bundle library
cd ./library && pnpm build

# Modify `playground.ts`

# Run playground
pnpm play

On the other hand, I created this PR also to ask you to help me implement the missing parts. If you are passionate about open source and want to join an exponentially growing open source project, now might be the time.

Starting April 8, everyone is welcome to take over the implementation of a schema or action function. I recommend starting with a simple function like minBytes, where you only have to copy and modify the minLength action, and maybe look at the previous implementation of the same function.

To participate, just add a comment to this PR with the part you want to take over. Please only add a comment if you are able to do the implementation within 3 days, to not block the rewrite. After you get my thumbs up, you can create a PR that merges your changes into the rewrite-with-pipe branch.

The goal is to have everything implemented this month. ⚡️

@fabian-hiller fabian-hiller added enhancement New feature or request priority This has priority labels Apr 3, 2024
@fabian-hiller fabian-hiller self-assigned this Apr 3, 2024
@fabian-hiller fabian-hiller marked this pull request as draft April 3, 2024 01:01
@Afsoon
Copy link
Contributor

Afsoon commented Apr 4, 2024

I want to contribute, but I have a question. You say the following:

I recommend starting with a simple function like minBytes, where you only have to copy and modify the minLength action, and maybe look at the previous implementation of the same function.

But I have noticed the action has missing tests. Is it intentional?

@fabian-hiller
Copy link
Owner Author

But I have noticed the action has missing tests. Is it intentional?

I plan to add any missing tests over the weekend. We should be ready to go on April 8.

@IlyaSemenov
Copy link
Contributor

I noticed that you're using snake_case for validation type, for example:

export function minLength(...) { // <------- camelCase function name
  return {
    kind: 'validation',
    type: 'min_length', // <----- snake_case rephrase of the same name
    ...,
  };
}

What do we lose if we have a rule to keep validation.type to be always exactly the same as the function name?

@fabian-hiller
Copy link
Owner Author

Thank you for your review! I personally prefer snake_case for string values because it is easier to read and looks more professional (just personal opinion) then camelCase. In terms of bundle size, the difference is only about 2 bytes. Why would you prefer to use the same name as the function name? To make it "uniform"?

@IlyaSemenov
Copy link
Contributor

Personally, I am also a big fan of snake_case (and tabs for indentation, for that matter). I have to suffer this love for consistency within Node.js ecosphere.

However, in this case my point was different: to reduce the number of 'different' things/names a developer has to invent / to come up with in order to implement a new method. Currently, it's 4 elements:

  1. maxLength for the actual method name
  2. MaxLengthAction for the action interface (essentially a 1, capitalized and suffixed).
  3. MaxLengthIssue for the issue interface (essentially a 1, capitalized and suffixed).
  4. max_length for the type key

In my thinking, the last one is easily avoidable as (1) could be used for (4), presumably without downsides. That's not a big deal, just a heads up to give you some thinking. If you still think it's good to have it different, so be it!

@fabian-hiller
Copy link
Owner Author

Thank you for your detailed reply. I will think about it. It would be nice to get feedback from more developers, but it probably does not matter to most developers.

@Saeris
Copy link
Contributor

Saeris commented Apr 6, 2024

I noticed that you're using snake_case for validation type, for example:

export function minLength(...) { // <------- camelCase function name
  return {
    kind: 'validation',
    type: 'min_length', // <----- snake_case rephrase of the same name
    ...,
  };
}

What do we lose if we have a rule to keep validation.type to be always exactly the same as the function name?

Another way to further reduce the overall bundle size, ironically, is to change the return values of these functions to class instances rather than plain objects for the simple fact that we can then remove both kind and type properties in favor of using instanceof instead to check for heritage.

Okay, that may sound dumb, but hear me out! Here's an example from some exploratory work I did prior to #259 getting merged:

src/types/schema.ts

/**
 * Base schema type.
 */
export abstract class Schema<TInput = any, TOutput = TInput> {
  /**
   * Input and output type.
   *
   * @internal
   */
  _types?: { input: TInput; output: TOutput };
  /**
   * Parses unknown input based on its schema.
   *
   * @param input The input to be parsed.
   * @param info The parse info.
   *
   * @returns The parse result.
   *
   * @internal
   */
  abstract _parse(input: unknown, info?: ParseInfo): SchemaResult<TOutput>;
}

src/schemas/boolean/boolean.ts

export class BooleanSchema<TOutput = boolean> extends Schema<boolean, TOutput> {
  /**
   * The error message.
   */
  message: ErrorMessage;

  constructor(message?: ErrorMessage) {
    super();
    this.message = message;
  }

  _parse(input: unknown, info?: ParseInfo) {
    // ... schema parsing logic
  }
}

/**
 * Creates a boolean schema.
 *
 * @param message The error message.
 *
 * @returns A boolean schema.
 */
export const boolean = (message?: ErrorMessage) => new BooleanSchema(message);

With the above, we can use instanceof to both check whether the return value of boolean() is a Schema or more specifically, a BooleanSchema. Additionally, it enables us to consolidate the interface types we had before into the class declaration itself, reducing the overall amount of code that needs to be maintained in the library.

You can then use the returned class instance the exact same way you are now with the object these functions return. We only keep the original function for the convenience of not needing to use new all of the time.

Anyways, food for thought! Might not be the best way to go about things, but I think it's worth considering to further reduce the codebase and bundle size.

@fabian-hiller
Copy link
Owner Author

I have looked at your code and currently think that it is more complicated when using classes. Also, the bundle size has increased by about 20% in my investigations.

boolean with class instance (147 bytes GZIP):

export class BaseSchema {}

export class BooleanSchema extends BaseSchema {
  message;
  expects = 'boolean';
  constructor(message) {
    this.message = message;
  }
  _run(dataset, config) {
    // ...
  }
}

export const boolean = (message) => new BooleanSchema(message);

boolean with plain object (120 bytes GZIP):

export function boolean(message) {
  return {
    kind: 'schema',
    type: 'boolean',
    expects: 'boolean',
    async: false,
    message,
    _run(dataset, config) {
      // ...
    },
  };
}

@fabian-hiller
Copy link
Owner Author

I need 1 more day to add each missing unit and type test.

@fabian-hiller
Copy link
Owner Author

fabian-hiller commented Apr 9, 2024

I will add the missing tests for the methods and schemas tomorrow, but anyone who is interested can already start helping me port the previous validations and transformations to the new /actions directory in this PR. I recommend to copy a similar existing action (e.g. decimal for regex based validations like email), rename everything and if necessary port over the validation or transformation logic from the previous code. I am happy to help and guide you if you have any questions.

@EltonLobo07
Copy link
Contributor

@fabian-hiller Hi! I would like to work on includes validation. I have read the new implementation of minLength and the old implementation of includes. I think I have a decent idea of what needs to be done. Let me know if I can start working.

@Afsoon
Copy link
Contributor

Afsoon commented Apr 9, 2024

@fabian-hiller I would like to work on minBytes.

@xcfox
Copy link
Contributor

xcfox commented Apr 9, 2024

@fabian-hiller I would like to help with this refactoring. I've completed trimEnd and trimStart at #511 .

@EltonLobo07
Copy link
Contributor

@fabian-hiller I would like to work on excludes.

@IlyaSemenov
Copy link
Contributor

I can take integer.

@xcfox
Copy link
Contributor

xcfox commented Apr 10, 2024

I have completed upperCase and lowerCase at #513

@mxdvl
Copy link
Contributor

mxdvl commented Apr 10, 2024

Could I take the tuple and related methods?

@fabian-hiller
Copy link
Owner Author

At the moment I think it is better if I take care of the schemas and more complex functions and the community helps to create the actions to not block the rewrite. But if I am too slow I might change my mind next week.

@EltonLobo07
Copy link
Contributor

@fabian-hiller I would like to work on bytes and notBytes (validations).

@EltonLobo07
Copy link
Contributor

@fabian-hiller I have completed bytes and notBytes at #517 . I would like to work on minSize and maxSize validations next.

@fabian-hiller
Copy link
Owner Author

Done 🥹

@fabian-hiller fabian-hiller marked this pull request as ready for review May 20, 2024 22:09
@fabian-hiller fabian-hiller changed the title Help rewrite Valibot with the pipe function Rewrite Valibot with the pipe function May 20, 2024
@fabian-hiller
Copy link
Owner Author

v0.31.0-rc.0 is now available 🙌

@saturnonearth
Copy link

What is the equivalent way of writing with "merge" with the new pipes?

Having trouble with converting some thing like this:

const requestSchema = v.merge([
	v.pick(testSchema, ['name', 'id', 'description']),
	v.object({
		id: v.optional(testSchema.entries.id)
	})
]);

@fabian-hiller
Copy link
Owner Author

merge spreaded the .entries into a new object schema. This is no longer possible because there are now 4 different object schemas. So the new way to achieve the same is to do the spreading yourself. I welcome feedback on this.

const MergedSchema = v.object({ ...Schema1.entries, ...Schema2.entries });

@fabian-hiller fabian-hiller merged commit 0450934 into main May 21, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority This has priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet