Export an eof getter that spits out the remaining bytestring #102

Open
raehik opened this issue Apr 29, 2022 · 5 comments


@raehik

raehik commented Apr 29, 2022

Thanks for cereal! I'm using it in a set of binary representation libraries with alternate typeclasses, and it's super handy.

I've found myself needing a dual to Put.putByteString for getters, where there's no length prefix, so all you can do is consume the whole remaining bytestring and spit it back out. Data.Serialize.Get doesn't export some of the underlying types or helpers needed for writing this, so users aren't able to define it themselves (efficiently, anyway).

I've forked and written something that apparently works, but I don't know how the buffer and other parts function.

getByteStringEOF :: Get B.ByteString
getByteStringEOF =
    Get (\s0 b0 m0 w _ k -> k B.empty Nothing m0 w (mergeBuf s0 b0))
  where mergeBuf s0 = \case Nothing -> s0
                            Just b  -> b <> s0

-- Adapted from the existing internal function below,
-- which returns the input without consuming it.
get :: Get B.ByteString
get  = Get (\s0 b0 m0 w _ k -> k s0 b0 m0 w s0)

Would it be useful for Data.Serialize.Get to export something like this? It wouldn't be used in the typeclass, or anywhere else, but it's convenient to piggyback off cereal's efficient internal parser rather than write my own.

@raehik
Author

raehik commented Apr 29, 2022

I'm looking at this sort of relation (pseudocode using my own definitions):

bs :: B.ByteString -- for any bytestring, including empty
-- runPut bs == bs
-- runGet (runPut bs) == Right (bs, "")

cereal doesn't provide the top law because it (correctly) serializes bytestrings with length prefixes. My library handles that in a different way, so I'd like to provide a "bad" instance for ByteString that you can only use at the top level. That's what I need the "to EOF" getter for.
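To make the length-prefix point concrete, here is a minimal sketch that compares the two encodings using only cereal's public API. The sample payload and the names `sample`, `rawBytes`, and `prefixedBytes` are illustrative assumptions, not part of cereal or of the library discussed above.

```haskell
import qualified Data.ByteString as B
import Data.Serialize (encode)
import Data.Serialize.Put (putByteString, runPut)

-- An arbitrary sample payload, chosen only for illustration.
sample :: B.ByteString
sample = B.pack [104, 101, 108, 108, 111]

-- putByteString writes the bytes as they are, with no length prefix,
-- so the output is identical to the input.
rawBytes :: B.ByteString
rawBytes = runPut (putByteString sample)

-- The Serialize instance for ByteString writes a length prefix
-- before the payload, so the output is longer than the input.
prefixedBytes :: B.ByteString
prefixedBytes = encode sample

main :: IO ()
main = do
  print (rawBytes == sample)
  print (B.length prefixedBytes > B.length sample)
  print (sample `B.isSuffixOf` prefixedBytes)
```

Only the unprefixed form can satisfy a law like `runPut bs == bs`, and reading it back requires a getter that runs to the end of the input.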

@elliottt
Contributor

What about combining lookAhead, remaining, and getBytes? Does lookAhead (getBytes =<< remaining) do what you're after?

@raehik
Author

raehik commented Apr 29, 2022

lookAhead doesn't consume the input, but getBytes =<< remaining by itself does, and it reads clearly. Thanks!
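The difference between the two combinations can be sketched as follows, using only functions exported by Data.Serialize.Get (note that the exported name is `lookAhead`). The names `getRemainingBytes`, `peekRemainingBytes`, `headerThenRest`, `leftAfterPeek`, and `leftAfterConsume` are hypothetical helpers introduced for illustration. This assumes the parser is run with `runGet` on a complete input; with `runGetPartial`, `remaining` may only count the bytes of the current chunk.

```haskell
import qualified Data.ByteString as B
import Data.Serialize.Get
  (Get, getBytes, getWord8, lookAhead, remaining, runGet)
import Data.Word (Word8)

-- Consume and return everything left in the input.
-- Hypothetical helper; cereal does not export this name.
getRemainingBytes :: Get B.ByteString
getRemainingBytes = getBytes =<< remaining

-- Return everything left in the input without consuming it.
peekRemainingBytes :: Get B.ByteString
peekRemainingBytes = lookAhead getRemainingBytes

-- Read a one-byte header, then take the rest as an opaque payload.
headerThenRest :: Get (Word8, B.ByteString)
headerThenRest = (,) <$> getWord8 <*> getRemainingBytes

-- Number of bytes still unread after peeking at the rest.
leftAfterPeek :: Get Int
leftAfterPeek = peekRemainingBytes *> remaining

-- Number of bytes still unread after consuming the rest.
leftAfterConsume :: Get Int
leftAfterConsume = getRemainingBytes *> remaining

main :: IO ()
main = do
  let input = B.pack [1, 2, 3, 4]
  print (runGet headerThenRest input)
  print (runGet leftAfterPeek input)
  print (runGet leftAfterConsume input)
```

After the peeking variant, all four bytes are still unread; after the consuming variant, none are.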

A shortcut that doesn't do the counting or copying would still be nice (though maybe the copying is important?). getBytes calls ensure, which does a lot of work. All I want is to extract the remaining bytes, so I don't think I need it. This is what I have:

getByteStringEOF :: Get B.ByteString
getByteStringEOF =
    Get (\s b m w _ k -> k B.empty emptyBuffer m w (s `B.append` bufferBytes b))

@elliottt
Contributor

elliottt commented May 3, 2022

The copying is important mainly so that you don't hold on to the entire input just to keep around the result of getBytes. The call to ensure is redundant when the argument is remaining, though, so perhaps an uncheckedGetBytes would make sense here?
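The retention concern can be illustrated with plain Data.ByteString, independent of cereal's internals. This is a sketch of the general sharing behaviour of strict bytestrings, not of how cereal implements its getters; `bigInput`, `slice`, and `detached` are hypothetical names.

```haskell
import qualified Data.ByteString as B

-- A large buffer standing in for the input handed to a parser.
bigInput :: B.ByteString
bigInput = B.replicate 1000000 0

-- B.take returns a slice that shares bigInput's underlying memory,
-- so keeping the slice alive keeps the whole buffer alive.
slice :: B.ByteString
slice = B.take 4 bigInput

-- B.copy allocates a fresh buffer holding only these four bytes,
-- so the large buffer can be collected once nothing else refers to it.
detached :: B.ByteString
detached = B.copy slice

main :: IO ()
main = print (slice == detached, B.length detached)
```

When the result is the entire remaining input, as in this issue, a copy saves little, since the slice already covers most of the buffer.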

@raehik
Author

raehik commented May 3, 2022

Yes, that would be useful!


2 participants