Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reverting iter to previous position #1952

Open
avekarpov opened this issue Feb 13, 2023 · 4 comments
Open

Reverting iter to previous position #1952

avekarpov opened this issue Feb 13, 2023 · 4 comments

Comments

@avekarpov
Copy link

avekarpov commented Feb 13, 2023

In the case of using a one-pass version of the parser (using "find_fileld"), if the field is not found, then the object has to be reset, which is logical, but not for the case when we expect that there may not be a value with the passed keys. This results in performance degradation as we reprocess the keys. This can be avoided by adding the ability to remember the position of the iterator.

Result result;

auto error = object.find_field(key).get(result);

switch (error) {
    case simdjson::error_code::SUCCESS:
        break;

    case simdjson::error_code::NO_SUCH_FIELD:
        object.reset();
        break;

    default:
        ...
}

Make something like this

auto position = object.get_current_position();

Result result;

auto error = object.find_field(key).get(result);

switch (error) {
    case simdjson::error_code::SUCCESS:
        break;

    case simdjson::error_code::NO_SUCH_FIELD:
        object.revert_position(position);
        break;

    default:
        ...
}
@lemire
Copy link
Member

lemire commented Feb 13, 2023

With all issues having to do with performance, we really want to have measurements (benchmarks). This being said, what you describe is not too difficult to implement but we really would like to have to benchmarks supporting our beliefs especially if we make the API more complex.

@lemire
Copy link
Member

lemire commented Mar 9, 2023

Pull request invited.

@jkeiser
Copy link
Member

jkeiser commented Mar 10, 2023

It's been a minute, but I think it does actually return to the existing field position. To find out whether the field exists, it has to cycle through all the fields, including the initial ones.

@lemire
Copy link
Member

lemire commented Mar 10, 2023

@jkeiser The find_field assumes that you provide the keys in order, so if a key is missing, you end up at the end, and if you want to continue, you have to reset.

The default (operator[]) is find_field_unordered and it definitively remembers the location.

It is a matter of semantics at this point because the current behaviour is that `find_field("x")...find_field("y")...find_field("z")..." only succeeds if you have the fields x, y, and z in sequence.

If the field y is missing, then you have to reset the object.

You don't have to do that with the default (find_field_unordered).

(I think all of that was written and designed by @jkeiser, to be clear.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants