Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add spanEnd and breakEnd to Data.Text #312

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

TheMatten
Copy link

Provides part of #244 for Data.Text.

@Lysxia
Copy link
Contributor

Lysxia commented Mar 15, 2021

This looks good to me. Would any other @haskell/text maintainers like to chime in? Let's merge this with one more approval. This PR seems to actually close #244, or did I miss something else in that issue?

@TheMatten
Copy link
Author

It doesn't actually add corresponding functions to Data.Text.Lazy - while I guess that doesn't really matter for this specific PR, it means that #244 isn't technically fully resolved. I can try to add those if you're interested, but I imagine they are going to be slightly less trivial because of list-like nature of lazy Text.

@TheMatten
Copy link
Author

Looking at breakOnEnd, would similar implementation using reverse make sense? I'm not familiar enough with performance characteristics of text to tell whether it would break existing optimizations.

@TheMatten
Copy link
Author

I've locally prepared commit with

-- | /O(n)/ Similar to 'break', but searches from the end of the string.
--
-- >>> T.breakEnd (=='0') "180cm"
-- ("180","cm")
breakEnd :: (Char -> Bool) -> Text -> (Text, Text)
breakEnd p src = let (a,b) = break p (reverse src)
                 in  (reverse b, reverse a)
{-# INLINE breakEnd #-}

, spanEnd inmplemented in terms of it, and some fixes of whitespace in spanEnd_.

@Bodigrim
Copy link
Contributor

Would any other @haskell/text maintainers like to chime in?

@Lysxia you can request reviews from other maintainers, adding them under the gear in the top right corner of "Reviewers" panel.

@Bodigrim
Copy link
Contributor

@TheMatten would it be possible to add tests for new functions?

@Lysxia
Copy link
Contributor

Lysxia commented Mar 20, 2021

@Bodigrim Where would be a good place to add such tests? There doesn't seem to be a unit-testing suite, is there? It might be good to document this somewhere too.

@Bodigrim
Copy link
Contributor

@Lysxia I suggest adding tests somewhere near

t_span p = L.span p `eqP` (unpack2 . T.span p)
tl_span p = L.span p `eqP` (unpack2 . TL.span p)

@Lysxia
Copy link
Contributor

Lysxia commented Mar 20, 2021

Thanks. I didn't realize we could (and were already) using Data.List as a reference implementation for property testing.

@TheMatten
Copy link
Author

@Bodigrim no problem. Should I include mentioned Lazy versions too?

@Bodigrim
Copy link
Contributor

Yeah, it would be nice.

@Lysxia
Copy link
Contributor

Lysxia commented May 12, 2021

Ping.

  1. Add tests.
  2. (optional) Add lazy versions.

@Lysxia Lysxia added this to the 1.3.0.0 milestone May 12, 2021
@TheMatten
Copy link
Author

Oops, sorry, I forgot to upload changes - will do so today.

@TheMatten
Copy link
Author

Should I rebase it on current master?

@Lysxia
Copy link
Contributor

Lysxia commented May 13, 2021

Yes please :)

-- >>> T.breakEnd (=='0') "180cm"
-- ("180","cm")
breakEnd :: (Char -> Bool) -> Text -> (Text, Text)
breakEnd p src = let (a,b) = break p (reverse src)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that reversing is very expensive and annihilates all benefits of lazy Text. Would it be possible to have something lazier?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean something that wouldn't touch chunk contents on left side of text being broken?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See c76b9aa

@TheMatten
Copy link
Author

I've rebased on top of current master.

go r Empty = (empty, r)
go r (Chunk t ts) = case T.breakEnd p t of
("", _) -> go (Chunk t r) ts
(l, r') -> (reverseSpine (Chunk l ts), Chunk r' r)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but please do not enable OverloadedStrings in this module.

Would it be possible to implement via foldrChunks? Something like

breakEnd :: (Char -> Bool) -> Text -> (Text, Text)
breakEnd p = foldrChunks go (empty, empty) 
    where 
        go x (ys, zs) 
            | null ys   = let (y, z) = T.breakEnd p x in (chunk y empty, chunk z zs)
            | otherwise = (chunk x ys, zs)

@Bodigrim Bodigrim marked this pull request as draft February 28, 2023 18:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants