Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add lazy isolation mode #111

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

414owen
Copy link

@414owen 414owen commented Mar 14, 2024

This adds a lazy (incremental) isolation combinator.

The use case I have for it is this:

  • I'm using pinch, which decodes thrift messages using cereal
  • Sometimes pinch receives badly formed data
  • Pinch decodes the header of the packet, which might have an enormous size
  • Pinch isolates parsing the apparently astronomically huge message here
  • This crashes the system it's running on, by strictly trying to read it all into memory.

By introducing an incremental isolation primitive, users can make sure the data they're parsing is valid before it all hits main memory.

Implementation notes

I tried to make sure the errors were as close as possible between isolate and isolateLazy. This is useful for fuzzing the implementation with quickcheck, as well as just good practice. This meant I had to expose more of the failure API. I could have added an internal partial runner, which exposes this information, but that seemed like it would require a lot of changes, so I opted for adding the info to the current partial runner, and added a pattern synonym so that users (shouldn't) notice a difference.

These changes are stacked on #110

`ensure` is a publically exported function, which has some odd behaviour
when given a byte length of zero or less.

It raises an error when the value is zero, but happily returns a value
when the byte length is negative.

I've switched these two behaviours around.

```
λ> runGet (ensure (-1)) "0123"
Right "0123"
λ> runGet (isolate 0 getWord8) "0123"
Left "too few bytes\nFrom:\tdemandInput\n\n"
```

```
λ> runGet (ensure (-1)) "0123"
Left "Failed reading: Attempted to ensure negative amount of bytes\nEmpty call stack\n"
λ> runGet (ensure 0) "0123"
Right ""
```
This adds an internal variant, `ensure'`, which is used to avoid
checking for `0` and `<0` more than once in a given code path.
@414owen 414owen changed the title feat: Add lazy isolation mode draft: feat: Add lazy isolation mode Mar 15, 2024
@414owen 414owen changed the title draft: feat: Add lazy isolation mode feat: Add lazy isolation mode Mar 17, 2024
@414owen
Copy link
Author

414owen commented Mar 17, 2024

Taking this out of draft mode, because I've got it integrated with pinch, and passing its test suite, which caught some bugs that the cereal test suite didn't.

@414owen
Copy link
Author

414owen commented Mar 20, 2024

@glguy quick ping, because you're the last committer/merger to the repo.

It doesn't look like there have been any major updates to cereal in a few years.
I'm hoping that's just because it's generally stable :)
Do you think it would be worth pinging Trevor for a review, or do you think there are enough informed eyes available?

@414owen
Copy link
Author

414owen commented Apr 13, 2024

@glguy would you be able to take a look at this please? Or do you know of someone who could?

@414owen
Copy link
Author

414owen commented Apr 17, 2024

@elliottt any chance of a review, or an opinion on this change? I'd really appreciate it.

@elliottt
Copy link
Contributor

I'll take a week this weekend 👍

Copy link
Contributor

@elliottt elliottt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this, I like this new version of isolate! I had a few questions before approving.

src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
src/Data/Serialize/Get.hs Show resolved Hide resolved
src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
tests/GetTests.hs Show resolved Hide resolved
@414owen 414owen force-pushed the os/lazy-isolation branch 2 times, most recently from 9e3137e to e585e8d Compare May 16, 2024 09:13
cabal.project.local Outdated Show resolved Hide resolved
src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
src/Data/Serialize/Get.hs Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants