Generic structs with field layout specialized depending on generic type #101350

fabianoliver · 2024-04-21T13:48:16Z


First of all, apologies in case this is not the right place to discuss - it's not entirely clear to me if this fits better into the Runtime, or csharplang.

It could be quite useful if we were able to tailor the field layout of a generic struct depending on the generic parameter itself.
Would this be possible at all, or would this break any CLR specs? If not, would it need runtime support, or is the runtime in principle already able to handle this & we're "just" missing language features for this?

Motivating example

We currently don't have an out-of-the-box way for unconstrained generic methods to accept or return optional values:

T? ReturnSomething<T>() { /* ... */ }  // Potentially misleading/not very useful return type if T is a value type

In many practical use cases, it would be useful for ReturnSomething to return T? for reference types and Nullable<T> for value types, which is not possible.

In practice, that leaves two options:

Have separate overloads that explicitly differentiate between T: class and T: struct

T? ReturnSomething<T>() where T : class { /* ... */ }  // Sensible
T? ReturnSomething<T>(T witness = default) where T : struct { /* ... */ }

Or, introduce some notion of a custom Option<T> type

Option<T> ReturnSomething<T>() { /* ... */ }

The first approach quickly leads to quite an explosion of code and public APIs, and may increase the surface area for bugs as well as require more testing.
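For reference, a minimal sketch of what the first approach can look like in practice. The names `OverloadDemo` and `FirstOrNull` are hypothetical. The unused `witness` parameter is presumably there, as in the snippet above, only to give the two overloads distinct signatures, since C# does not allow overloads that differ by constraints alone. This relies on the compiler (C# 7.3 or later) discarding candidates whose constraints are not satisfied.

```csharp
#nullable enable
using System.Collections.Generic;

public static class OverloadDemo
{
    // Reference types: T? is just an annotated T, and null means "nothing".
    public static T? FirstOrNull<T>(IEnumerable<T> items) where T : class
    {
        foreach (var item in items) return item;
        return null;
    }

    // Value types: T? is Nullable<T>. The unused 'witness' parameter only
    // exists to make this signature differ from the overload above.
    public static T? FirstOrNull<T>(IEnumerable<T> items, T witness = default) where T : struct
    {
        foreach (var item in items) return item;
        return null;
    }
}
```

With this in place, `OverloadDemo.FirstOrNull(new int[0])` yields an `int?` without a value, and `OverloadDemo.FirstOrNull(new string[0])` yields a null `string?`. Every further method that needs the same distinction has to be duplicated in the same way, which is where the API growth comes from.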

The second is much simpler, but carries a performance penalty for reference types. A common Option<T> type would likely look quite similar to Nullable<T>, minus the type constraints:

readonly struct Option<T> where T : notnull
{
  // Fields of a readonly struct must themselves be readonly.
  private readonly bool _hasValue; // What a shame, we only really need this field if T is a value type
  private readonly T? _value;
}

In the common use case above, this is inefficient for notnull reference types: the _hasValue field isn't needed, because a null _value could already signal a missing value. Since we can't trim it away, we're always left with this overhead; in practice, an Option<T> field for a reference type is hence twice as large as needed.
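The size difference can be checked with `Unsafe.SizeOf<T>()`. The sketch below is only an illustration: `FlaggedOption<T>`, `RefOption<T>` and `LayoutDemo` are hypothetical names, with `FlaggedOption<T>` mirroring the flag-plus-value shape above and `RefOption<T>` being a reference-only variant in which null itself encodes "no value".

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;

// Same shape as the Option<T> above: a flag plus the value.
public readonly struct FlaggedOption<T> where T : notnull
{
    private readonly bool _hasValue;
    private readonly T? _value;

    public FlaggedOption(T value) { _hasValue = true; _value = value; }

    public bool HasValue => _hasValue;
    public T? ValueOrDefault => _value;
}

// Hypothetical reference-only variant: a null reference means "no value".
public readonly struct RefOption<T> where T : class
{
    private readonly T? _value;

    public RefOption(T value) => _value = value;

    public bool HasValue => _value is not null;
    public T? ValueOrDefault => _value;
}

public static class LayoutDemo
{
    // Managed size of each struct when instantiated over a reference type.
    public static int FlaggedSize() => Unsafe.SizeOf<FlaggedOption<string>>();
    public static int RefOnlySize() => Unsafe.SizeOf<RefOption<string>>();
}
```

On a typical 64-bit runtime, `LayoutDemo.RefOnlySize()` is one pointer wide, while `LayoutDemo.FlaggedSize()` is two pointers wide, because the single-byte flag is padded up to pointer alignment.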

The overall effects will surely depend a lot on circumstance. Quite anecdotal, but for one of the projects I'm currently working on, which relies heavily on transformations of optional values, benchmarks suggest a ~15% performance penalty and a ~25% memory penalty on the overall system compared to a solution that indeed specifies every API function for where T : struct and where T : class explicitly.

Now if we could conceptually do something like this instead:

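readonly struct Option<T> where T : notnull
{
  // Hypothetical attribute, not an existing API: drop this field when T is a reference type.
  [StructDropFieldIfIsRefType<T>] private readonly bool _hasValue;
  private readonly T? _value;
}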

We could possibly avoid the complexity of differentiating between structs and ref types explicitly, while incurring significantly less (or possibly even no?) performance & memory penalties for ref types.

The specific attribute is of course very much tailored to this example. Maybe there's a nicer way to generalize something like this, possibly without involving attributes at all.

I'd be curious though to hear thoughts on whether something like this would ever be feasible in principle, and if so, whether there's any wider interest in it or whether it's a niche use case that wouldn't warrant broader support.

jkotas · 2024-04-22T00:32:11Z

Collaborator

> We currently don't have an out-of-the-box way for unconstrained generic methods to accept or return optional values:

Consider this example that works today. It shows that it is possible, and it looks quite natural in C#:

foo<string?>("Hello");
foo<string?>(null);
foo<int?>(1);
foo<int?>(null);

T foo<T>(T t) { Console.WriteLine(t == null); return t; }

It is as efficient as it can be (string? is passed as reference without extra bool, int? is passed with extra bool that indicates nullability).

> I'd be curious though to hear thoughts

There are a lot of thoughts and discussion about this in #21014.

I do not think that introducing a new Nullable<T>-like type would help with these scenarios. Having two very similar nullable types would create a ton of confusion. Many gaps in these scenarios were filled by introducing nullability annotations. If there are more gaps worth filling, I would expect that we fill them in a similar way by building on top of what exists as much as possible.


fabianoliver · 2024-04-23T13:17:19Z

fabianoliver
Apr 23, 2024
Author

Thanks for the response.

You're certainly right with your snippet. Technically, the method in your example would accept both nullable and non-nullable values. If you wanted to enforce that a definitely nullable type is passed (the opposite of where T : notnull), I don't believe we'd have a way of doing so, at least not one that is type checked at compile time.

However, the much bigger issue in my view is any method or type that needs to expose definitely non-nullable and definitely nullable versions of a generic unconstrained type.

Say we want to implement a simple dictionary left-join:

static Dictionary<TKey, TResult> LeftJoin<TKey, TLeft, TRight, TResult>(
    this Dictionary<TKey, TLeft> left,
    Dictionary<TKey, TRight> right,
    Func<TKey, TLeft, TRight?, TResult> resultSelector)
  where TLeft : notnull
  where TRight : notnull
{ /* ... */ }

We want to ensure both the left and right source only contain non-null values; perfect, we can do that already.
But then, we need to pass an optional right value to the result selector here. That's where the wheels come off a little now.

If TRight is a value type, the only thing we can do here really is to pass default(TRight) to resultSelector; but that's clearly wrong of course, as that could very well be a legitimate non-missing value, so we can't do that.
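A small sketch of that ambiguity, using hypothetical names (`LookupDemo`, `FindOrDefault`, `IsMissing`): on an unconstrained or notnull-constrained generic, `T?` does not become `Nullable<T>` for value types, so a missing entry and a stored `0` come back as the same value.

```csharp
#nullable enable
using System.Collections.Generic;

public static class LookupDemo
{
    // For a value type T, the return type T? is plain T here, not Nullable<T>.
    public static T? FindOrDefault<TKey, T>(Dictionary<TKey, T> source, TKey key)
        where TKey : notnull
        where T : notnull
        => source.TryGetValue(key, out var value) ? value : default;

    // Compiles for any T, but can never be true when T is a value type.
    public static bool IsMissing<T>(T? value) => value is null;
}
```

Given a `Dictionary<string, int>` that contains `["apples"] = 0` and has no entry for `"pears"`, `FindOrDefault` returns `0` for both keys and `IsMissing` reports `false` for both results, so the caller cannot tell the two cases apart.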

We could introduce another generic parameter, something like TNullableRight and TNonNullableRight .. but aside from polluting APIs with the amount of generics (and wreaking havoc on generic type inference), we have no way of adding any constraint that would enforce TNullableRight is the nullable version of TNonNullableRight, so that wouldn't quite work either.

So I think we'd need to either

Create specific separate overloads for TRight : struct and TRight : class
Change TRight to Option<TRight> (and/or some other avenue that externally conveys this info, like adding an extra bool parameter in the resultSelector etc)

Which would then come back to the issues above: either we have to create heaps of explicit overloads for every nullable/non-nullable permutation that a generic function or type accepts, or we go down a path that comes with a sometimes not insignificant performance penalty.

> I do not think that introducing a new Nullable<T>-like type would help with these scenarios. Having two very similar nullable types would create a ton of confusion. Many gaps in these scenarios were filled by introducing nullability annotations. If there are more gaps worth filling, I would expect that we fill them in a similar way by building on top of what exists as much as possible.
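To make the second option concrete, here is a minimal sketch of the left join with an Option<TRight> parameter. The members `Some`, `None`, `HasValue` and `Value`, the class name `JoinExtensions`, and the extra `TKey : notnull` constraint are assumptions added for illustration; they are not part of the original snippet.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public readonly struct Option<T> where T : notnull
{
    private readonly bool _hasValue;
    private readonly T? _value;

    private Option(T value) { _hasValue = true; _value = value; }

    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> None => default;

    public bool HasValue => _hasValue;
    public T Value => _hasValue ? _value! : throw new InvalidOperationException("Option has no value.");
}

public static class JoinExtensions
{
    public static Dictionary<TKey, TResult> LeftJoin<TKey, TLeft, TRight, TResult>(
        this Dictionary<TKey, TLeft> left,
        Dictionary<TKey, TRight> right,
        Func<TKey, TLeft, Option<TRight>, TResult> resultSelector)
      where TKey : notnull
      where TLeft : notnull
      where TRight : notnull
    {
        var result = new Dictionary<TKey, TResult>(left.Count);
        foreach (var pair in left)
        {
            // A missing right-hand entry becomes None, never default(TRight).
            var match = right.TryGetValue(pair.Key, out var found)
                ? Option<TRight>.Some(found)
                : Option<TRight>.None;
            result[pair.Key] = resultSelector(pair.Key, pair.Value, match);
        }
        return result;
    }
}
```

A call such as `left.LeftJoin(right, (key, l, r) => r.HasValue ? l + r.Value : -1)` can now distinguish a right-hand value of `0` from a missing key, at the cost of carrying the extra flag for reference types as well.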

That's very fair. Another type related to optionality would definitely cause some confusion!

Personally though, I consider the current behaviour of nullability annotations in conjunction with unconstrained generics to probably be the biggest (out of the very few) footguns C# currently has.

I've seen many extremely seasoned devs introduce or miss various correctness bugs (surely myself included!) because they did not notice that a generic type annotated as nullable might not be nullable at all. I'd bet good money that if you were to scrape random GitHub repositories specifically for usages of unconstrained generics annotated as nullable, and possibly narrow it down to explicit null checks being performed on those, we'd see an absolutely huge amount of misuse / unintended behaviour.

So while the complexity of adding another type to signify optionality definitely isn't great - giving people an avenue to express optionality without these rather surprising & easy to miss pitfalls could well be a net benefit in my view, but I appreciate that's of course debatable.
