Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Data.Text.Lazy.Builder.toText #295

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chris-martin
Copy link

This is a trivial composition, but it has been a recurring inconvenience for me for years. Particularly when I'm writing introductory tutorials, it is difficult to explain why among the three major types (strict text, lazy text, and builder) only five of the six possible conversion functions are given in the library.

@phadej
Copy link
Contributor

phadej commented Aug 25, 2020

But there is a reason. If you make a Builder, then convert to lazy Text and then again to strict Text you are most likely doing some completely unnecessary copying, instead of directly doing thing you want to do with lazy Text.

I'm not convinced it is a good idea to have this function. If you really need that conversion, write it explicitly in your code, and wonder whether you can afford the conversion. Hiding this fact, especially in introductionary tutorials, is not good idea IMO.

The same reasoning there isn't way to convert bytestring's Builder to strict ByteString.

@chris-martin
Copy link
Author

Perhaps then should Data.Text.Lazy.toStrict have a warning on it, or perhaps even be deprecated, for the same reason? Or does the principle that lazy text ought to be used directly, rather than be made strict, only apply to lazy text that was produced using Builder?

@friedbrice
Copy link

I frequently need this function, as well. When I need it, I don't normally find myself thinking, "I wonder if I can afford the conversion." Instead, I normally find myself thinking "Dammit, why does this library function (or record) take strict Text? I guess I'd better stop what I'm doing, add an import to the top of my file, and do the conversion." Why am I using such terrible libraries, you might ask? Well, the function had to take something and strict Text is just what they decided.

To me, the above reasoning behind not including the one missing conversion is flimsy at best and at worst creates a sliding goalpost. As already pointed out above, less-than-optimal functions already abound. Why is the policy on admitting only the most optimal functions not uniformly followed? The answer is that it'd make for a library that nobody would want to use.

Finally, while I understand the passion for conceptual understanding that motivates opinions such as, "Hiding this fact, especially in introductionary tutorials, is not good idea IMO," I am obliged to point out that the literature disagrees with you. Identification of learners' zone of proximal development combined with careful scaffolding is an important component in lesson design, even for adult learners (https://journals.sagepub.com/doi/full/10.3102/0034654316670999). All this amounts to not overwhelming the learner with too much information at once. I'm sure you're familiar with the passage from the Haskell Report which I quote, "[Haskell] should be suitable for teaching, research, and applications, including building large systems." Notice that "teaching" is the first stated goal, followed by research, with applications coming in third.

I think we'd be losing sight of larger goals if we were to allow an argument from premature optimization to affect our decisions here when the suggested addition serves both an industrially-pragmatic and pedagogically-sound purpose.

@phadej
Copy link
Contributor

phadej commented Aug 25, 2020

Ok.

Let's have toStrictText and toStrictTextWith.


I'd also welcome a similar PR to bytestring, though I'm not a maintainer of that library. ping @sjakobi

@sjakobi
Copy link
Member

sjakobi commented Aug 25, 2020

I'd also welcome a similar PR to bytestring, though I'm not a maintainer of that library. ping @sjakobi

Another argument against this addition is the good old Fairbairn threshold.

I currently feel about -1/2 on these additions.

The other bytestring maintainers might disagree though. CC @Bodigrim, @hsyl20, @bgamari, @chessai, @dcoutts, @hvr.

@Bodigrim
Copy link
Contributor

As a maintainer of bytestring, I don't see much point in adding trivial, type-driven plumbing of this kind. GHC should be perfectly capable to suggest toStrict instead of a hole.

From my point of view the real annoyance here is that Data.Text does not export toStrictText, so one has to jump to the import list and add Data.Text.Lazy (with some qualified namespace). Could we possibly export toStrictText / fromStrictText from Data.Text?

@chris-martin
Copy link
Author

Two opportunities I think this function affords:

  1. It is a good place to attach documentation. If you search Hoogle for Builder -> Text, this function can come up, and it can include @phadej's warning about the circumstances in which one would be better off preferring lazy text to avoid copying.
  2. Although I unconfidently implemented this function as the trivial composition toText = L.toStrict . toLazyText, @Bodigrim, I was hoping that this simple starting point might eventually give way to a better definition. I'm not sure it's actually necessary to construct the intermediate lazy Text value; instead of passing the chunk list to Data.Text.Lazy.fromChunks (in definition of toLazyTextWith), perhaps we could apply Data.Text.concat to it directly? I am sympathetic to Fairbairn, @sjakobi, but adding more combinations to the library can often give the library authors more ability to give us a more performant composition.

@Lysxia Lysxia added API Addition (PVP minor) question Requires more investigation labels Mar 7, 2021
@Lysxia
Copy link
Contributor

Lysxia commented May 12, 2021

I am in favor of adding this function, agreeing with @chris-martin's motivations above, and particularly:

it is difficult to explain why among the three major types (strict text, lazy text, and builder) only five of the six possible conversion functions are given in the library.

  1. It's worth adding some comment that it does extra copying so users who care about that can think twice about it.
  2. I also think the more explicit name toStrictText is better.

@haskell/text any objections?

@Bodigrim
Copy link
Contributor

As long as there is no expectation for bytestring to follow suit, I don't object.

@parsonsmatt
Copy link

I'd be happy to include the function. Even if we immediately throw a WARNING or DEPRECATION pragma and suggest something better, it's always good to have easily discoverable names that we can attach documentation to.

@Lysxia Lysxia added this to the 1.3.0.0 milestone May 25, 2021
@Bodigrim
Copy link
Contributor

@Lysxia if you are still in favor of the idea, feel free to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Addition (PVP minor) question Requires more investigation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants