Avoid allocations in RaiseEvent when there's no listeners #15580

MrJul · 2024-05-01T14:38:36Z

What does the pull request do?

This PR adds new Interactive.RaiseEvent() overloads which avoid creating an event route and the corresponding event arguments if there are no listeners for a given RoutedEvent.

Note that while there's already an existing BuildEventRoute() method to avoid allocating the arguments when there isn't any listener, it doesn't prevent creating the route itself.

In addition, EventRoute.HasHandlers now returns true if there are no handlers inside the route but RoutedFinished handlers exist.

What is the current behavior?

Each time a RoutedEvent is raised, an EventRoute and a RoutedEventArgs (or derived) objects are always created.

What is the updated/expected behavior with this PR?

RoutedEventArgs are created only if necessary (when there's at least one listener).
EventRoute are never created anymore.

How was the solution implemented (if it's not obvious)?

A new LightweightEventRoute is introduced, which has the same behavior as an EventRoute, but is a struct, to avoid heap allocations.
This new type has RaiseEvent() methods taking an arguments factory alongside a context/state. The factory is only called if there are listeners for the event.
The expected usage is to have a static factory to avoid allocating an extra closure on each call (which would defeat the purpose of the method).

Since mutable structures are prone to errors (they must always be passed by reference to work correctly, which is hard to enforce in public API), LightweightEventRoute is internal and isn't exposed directly.
Instead, new public Interactive.RaiseEvent() overloads use the LightweightEventRoute.

The public EventRoute implementation now delegates its logic to LightweightEventRoute.

Notes

The first commit is the most interesting one as it implements the new RaiseEvent() overloads and the LightweightEventRoute type.
The other commits are using the new RaiseEvent() overloads where it's possible.

We have several places where it isn't possible to remove the arguments allocation currently without breaking changes.

For example, there's Control.OnLoaded/OnUnloaded/OnSizeChanged which takes the arguments directly, added in #11828. While there are good arguments exposed in this PR for this change, I think we should also optimize for the 99% case. There are usually hundreds of controls in a given visual tree, and only a few (if any) with a Loaded handler.

That said, we have tons of inconsistency here in the codebase. Most routed events don't have a matching virtual OnXxx() method. Some have. Some of these methods have a RoutedEventArgs parameter. Some don't. We should really unify the behavior here in v12, document it, and stick to it (while still avoiding unnecessary allocations when possible). I plan to open a proposal to discuss that.

avaloniaui-bot · 2024-05-01T15:02:37Z

You can test this PR using the following package version. 11.2.999-cibuild0047982-alpha. (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json) [PRBUILDID]

cla-avalonia · 2024-05-01T15:02:51Z

All contributors have signed the CLA.

MrJul · 2024-05-01T15:04:51Z

@cla-avalonia agree

kekekeks · 2024-05-01T15:09:42Z

src/Avalonia.Base/Input/InputExtensions.cs

+                return typedInputElement.RaiseEvent(routedEvent);
+            }
+
+            var eventArgs = new RoutedEventArgs(routedEvent);


Won't it make sense to pass a factory here too?

This would require adding all the new RaiseEvent overloads to the IInputElement interface, which I wasn't comfortable with, as I don't think it should be that specific.

Since IInputElement isn't client implementable, we should never get to that fallback anyways.

kekekeks · 2024-05-01T15:11:20Z

src/Avalonia.Base/Interactivity/Interactive.cs

+        /// The created <see cref="RoutedEventArgs"/> used to raise the event,
+        /// or null if the event wasn't raised because it didn't have any listeners.
+        /// </returns>
+        public RoutedEventArgs? RaiseEvent<TContext>(


I wonder if passing the context by value combined with a delegate call is less expensive than dealing with a single allocation

It's optimizing for the case where there are no listeners, where it's a clear win.

For the case where there are listeners, passing the state and calling the delegate is a little slower than a direct call, as you'd expect.

Note that the delegate+state pattern appears in .NET itself, for example in ILogger.Log<TState>() or String.Create<TState>().

Benchmarks

SmallEvent

Method Mean Error StdDev Gen0 Allocated

Standard_No_Listeners 22.091 ns 0.1672 ns 0.1564 ns 0.0024 40 B

Factory_No_Listeners 9.617 ns 0.0545 ns 0.0510 ns - -

Standard_With_Listeners 100.925 ns 0.8585 ns 0.8031 ns 0.0057 96 B

Factory_With_Listeners 111.128 ns 0.6346 ns 0.5936 ns 0.0057 96 B

LargeEvent:

Method Mean Error StdDev Gen0 Allocated

Standard_No_Listeners 39.63 ns 0.768 ns 0.854 ns 0.0076 128 B

Factory_No_Listeners 21.99 ns 0.115 ns 0.108 ns - -

Standard_With_Listeners 118.76 ns 1.438 ns 1.345 ns 0.0110 184 B

Factory_With_Listeners 126.16 ns 0.411 ns 0.343 ns 0.0110 184 B

Benchmark code

#nullable enable using Avalonia.Controls; using Avalonia.Input; using Avalonia.Interactivity; using BenchmarkDotNet.Attributes; namespace Avalonia.Benchmarks; [MemoryDiagnoser] public class RaiseEvent_SmallEvent_Benchmarks { private static readonly RoutedEvent<RoutedEventArgs> _smallEvent = RoutedEvent.Register<Button, RoutedEventArgs>("SomeEvent", RoutingStrategies.Bubble); private readonly Button _button = CreateButton(10); internal static Button CreateButton(int depth) { var button = new Button(); for (int i = 0; i < depth; i++) { var child = new Button(); button.Content = child; button = child; } return button; } [GlobalSetup(Targets = [nameof(Standard_With_Listeners), nameof(Factory_With_Listeners)])] public void SetupListeners() => _button.AddHandler(_smallEvent, (_, _) => { }); [Benchmark] public void Standard_No_Listeners() => Standard(); [Benchmark] public void Factory_No_Listeners() => Factory(); [Benchmark] public void Standard_With_Listeners() => Standard(); [Benchmark] public void Factory_With_Listeners() => Factory(); private void Standard() => _button.RaiseEvent(new RoutedEventArgs(_smallEvent)); private void Factory() => _button.RaiseEvent(_smallEvent); } [MemoryDiagnoser] public class RaiseEvent_LargeEvent_Benchmarks { private static readonly RoutedEvent<PointerEventArgs> _largeEvent = RoutedEvent.Register<Button, PointerEventArgs>("LargeEvent", RoutingStrategies.Bubble); private readonly Button _button = RaiseEvent_SmallEvent_Benchmarks.CreateButton(10); private readonly IPointer _pointer = new Pointer(1, PointerType.Mouse, true); [GlobalSetup(Targets = [nameof(Standard_With_Listeners), nameof(Factory_With_Listeners)])] public void SetupListeners() => _button.AddHandler(_largeEvent, (_, _) => { }); [Benchmark] public void Standard_No_Listeners() => Standard(); [Benchmark] public void Factory_No_Listeners() => Factory(); [Benchmark] public void Standard_With_Listeners() => Standard(); [Benchmark] public void Factory_With_Listeners() => Factory(); private void Standard() => _button.RaiseEvent(new PointerEventArgs( _largeEvent, _button, _pointer, _button, new Point(123.0, 456.0), 9876543210UL, new PointerPointProperties(RawInputModifiers.Control, PointerUpdateKind.LeftButtonPressed), KeyModifiers.Control)); private void Factory() => _button.RaiseEvent( _largeEvent, static (e, ctx) => new PointerEventArgs(e, ctx.source, ctx.pointer, ctx.rootVisual, ctx.rootVisualPosition, ctx.timestamp, ctx.properties, ctx.modifiers), (source: _button, pointer: _pointer, rootVisual: _button, rootVisualPosition: new Point(123.0, 456.0), timestamp: 9876543210UL, properties: new PointerPointProperties(RawInputModifiers.Control, PointerUpdateKind.LeftButtonPressed), modifiers: KeyModifiers.Control)); }

So it's about 6-10% slower when there's listeners, and twice as fast when there's none.
Note that the handler does nothing in these benchmarks - real code will probably have heavier handlers.

If the overhead for listeners isn't acceptable, we can avoid the delegate call and context and use the LightweightEventRoute directly in the solution - it's simply less convenient to write than then RaiseEvent() handlers.

TomEdwardsEnscape

Aside from the small performance drop identified above, the static delegate syntax is hard to use correctly.

Allocations could also be avoided with a null check, using syntax similar to this:

myControl.RaiseEvent(InputElement.TextInputEvent)?.With(new TextInputEventArgs() { Text = "foo" });

The return value of this RaiseEvent overload should be a nullable struct. If it's null, the event args object won't be constructed.

Perhaps a new name is needed for this design, since "RaiseEvent" won't actually raise the event.

maxkatz6 · 2024-05-02T07:14:42Z

myControl.RaiseEvent(InputElement.TextInputEvent)?.With(new TextInputEventArgs() { Text = "foo" });

It reminds me event?.Invoke(sender, args) C# syntax. So, similar naming could be used here.

But I don't have specific preferences here, since old (current) syntax is still available and likely to be the most common to use.

maxkatz6 · 2024-05-02T07:15:12Z

Aside from that, simple dev analyzer can be added to help with finding usages of slower RaiseEvent version.

MrJul · 2024-05-02T09:23:42Z

Allocations could also be avoided with a null check, using syntax similar to this:
myControl.RaiseEvent(InputElement.TextInputEvent)?.With(new TextInputEventArgs() { Text = "foo" });
The return value of this RaiseEvent overload should be a nullable struct. If it's null, the event args object won't be constructed.

While I very much like the simpler syntax you're proposing, it comes with new caveats.
The event route has a PooledList which should be disposed to return the underlying array to the pool.

The struct could implement IDisposable. The syntax isn't as nice, but still acceptable imo:

using var route = myControl.TryBuildEventRoute(Event); // final method name to be defined
route?.RaiseEvent(new RoutedEventArgs(...));

Or we can dispose the list on raise:

myControl.TryBuildEventRoute(Event)?.Raise(new RoutedEventArgs(...));

Forgetting to call Raise wouldn't dispose the list, which is minor as it doesn't impact correctness, only performance.
However, calling Raise twice on the same struct won't do anything the second time, which isn't expected.

Personally, I prefer the first one, with the using, as it's still straightforward even though it isn't a one-liner.

TomEdwardsEnscape · 2024-05-02T09:47:03Z

Both are OK, but I would prefer option B because I think it will be unusual for anyone to try to store the returned object and reuse it, especially if we choose a good name for the methods. "EventRouteBuilder", then "Complete"?

If someone does try to re-use the same struct, an exception can be thrown.

MrJul · 2024-05-02T10:02:14Z

For the sake of discussion, I want to add that option A is a similar to what we have today with the existing BuildEventRoute:

using var route = BuildEventRoute();
route.RaiseEvent(args);

MrJul · 2024-05-02T10:06:13Z

If someone does try to re-use the same struct, an exception can be thrown.

That works, but is a bit surprising.

That being said, I agree it should be unusual to store the returned object, and proper documentation comments should hopefully prevent incorrect usage. Outside of the Avalonia solution, only people caring about these extra allocations will use the new method anyway.

robloo · 2024-05-02T13:18:43Z

So unless this is a major bottleneck we might be going too far on the optimizations now. It's always a balance between optimization <-> maintainability and this seems error prone and not intuitive compared to what we've been doing.

MrJul · 2024-05-02T14:11:03Z

So unless this is a major bottleneck we might be going too far on the optimizations now. It's always a balance between optimization <-> maintainability and this seems error prone and not intuitive compared to what we've been doing.

Yes, it's always a balance, and a difficult one. That's why .NET, especially since Core, has multiple levels of API. For example you can go as high as File.ReadAllBytes() or as low as RandomAccess.Read, while passing by a good middleground with FileStream. Each lower level API is more difficult to use, but more performant. It's the same thing here: I don't propose removing the simple RaiseEvent(args) that's been everywhere! We just need an alternative. (This is my personal opinion and other team members may think differently. Let's discuss!)

While it's not a major bottleneck, allocating is a death by a thousand paper cuts, as I like to remind each time. I don't mind allocations when they are useful, and kept to a minimum if possible (while still having maintainable code). In the case of this PR, launch any Avalonia application, and observe that 95% of raised events on startup have no listeners. Yes, that's a real number.

For example, should a StackPanel raise a SnapPointsChanged event each time its measured, when that event is only for stackpanel-as-items-container? Personally, I think it's a waste that is easily removed, even if it isn't a bottleneck. That event was what prompted me to start this work actually, as I really can't stand code that caters for the 1% case while still running for everyone. No, alone it's not a big deal. Do that for every component, and in a few years you've got a terrible unoptimized mess and people complaining that the framework is slow.

In my opinion, everything should be "pay to use" as much as possible. You're not using that feature? It's free, no extra allocations and minimum CPU time wasted. If possible, it should even be completely trimmed away when using AOT compilation (not applicable to this PR).

As someone looking at memory snapshots very frequently, it's quite hard to analyze and improve things when the majority of the allocations are noise coming from libraries out of your control. I ideally want a "clean state" here as much as possible. Avalonia should be efficient, and never be a bottleneck. I'm very committed to improve things regarding memory allocations and CPU usage in the coming months.

My rant aside, if the proposed API isn't ideal - which I agree it isn't - we can go back to simply exposing LightweightEventRoute, which was my very first iteration of this problem. The API will be similar to the existing EventRoute, minus the mutation methods to avoid problems with mutable structs. If that's still too error prone, we can make it internal (but I believe we should expose performant methods as much as convenient ones and let the user choose depending on its needs).

jp2masa · 2024-05-02T22:51:01Z

Option B could allow reusing the route with Raise vs RaiseAndComplete or Raise(..., bool complete = true), for example. As this is a lower-level API, the increased complexity is acceptable, I think.

robloo · 2024-05-03T02:37:56Z

@MrJul I agree with basically everything you said. My only concern is the API is a bit ugly and non-standard (unwieldy?) for this type of thing 😄 I would prioritize a cleaner API in this area to avoid errors rather than micro optimizations.

But like I said, I agree with you that it makes sense to have different levels of API. If we don't expose this API to app developers (keep it internal) I am even more OK with this. It allows us to improve it in the future if a better idea comes along without committing to it as a public API.

I can live with the PR as is but if I was a gatekeeper I would still want an improved API. I won't hold up the discussion any more on this unless I think of a better API myself.

MrJul added 3 commits May 1, 2024 13:53

Added RaiseEvent overloads to avoid allocs without listeners

0e3f9e3

Use new RaiseEvent in Base project

468c992

Use new RaiseEvent in Controls project

f334469

MrJul added area-perf api API addition labels May 1, 2024

kekekeks reviewed May 1, 2024

View reviewed changes

TomEdwardsEnscape reviewed May 2, 2024

View reviewed changes

MrJul marked this pull request as draft May 3, 2024 08:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid allocations in RaiseEvent when there's no listeners #15580

Avoid allocations in RaiseEvent when there's no listeners #15580

MrJul commented May 1, 2024

avaloniaui-bot commented May 1, 2024

cla-avalonia commented May 1, 2024 •

edited

MrJul commented May 1, 2024

kekekeks May 1, 2024

MrJul May 1, 2024

kekekeks May 1, 2024

MrJul May 1, 2024 •

edited

TomEdwardsEnscape left a comment

maxkatz6 commented May 2, 2024

maxkatz6 commented May 2, 2024

MrJul commented May 2, 2024

TomEdwardsEnscape commented May 2, 2024

MrJul commented May 2, 2024

MrJul commented May 2, 2024

robloo commented May 2, 2024

MrJul commented May 2, 2024

jp2masa commented May 2, 2024

robloo commented May 3, 2024

Method	Mean	Error	StdDev	Gen0	Allocated
Standard_No_Listeners	22.091 ns	0.1672 ns	0.1564 ns	0.0024	40 B
Factory_No_Listeners	9.617 ns	0.0545 ns	0.0510 ns	-	-
Standard_With_Listeners	100.925 ns	0.8585 ns	0.8031 ns	0.0057	96 B
Factory_With_Listeners	111.128 ns	0.6346 ns	0.5936 ns	0.0057	96 B

Method	Mean	Error	StdDev	Gen0	Allocated
Standard_No_Listeners	39.63 ns	0.768 ns	0.854 ns	0.0076	128 B
Factory_No_Listeners	21.99 ns	0.115 ns	0.108 ns	-	-
Standard_With_Listeners	118.76 ns	1.438 ns	1.345 ns	0.0110	184 B
Factory_With_Listeners	126.16 ns	0.411 ns	0.343 ns	0.0110	184 B

Avoid allocations in RaiseEvent when there's no listeners #15580

Are you sure you want to change the base?

Avoid allocations in RaiseEvent when there's no listeners #15580

Conversation

MrJul commented May 1, 2024

What does the pull request do?

What is the current behavior?

What is the updated/expected behavior with this PR?

How was the solution implemented (if it's not obvious)?

Notes

avaloniaui-bot commented May 1, 2024

cla-avalonia commented May 1, 2024 • edited

MrJul commented May 1, 2024

kekekeks May 1, 2024

Choose a reason for hiding this comment

MrJul May 1, 2024

Choose a reason for hiding this comment

kekekeks May 1, 2024

Choose a reason for hiding this comment

MrJul May 1, 2024 • edited

Choose a reason for hiding this comment

Benchmarks

SmallEvent

LargeEvent:

TomEdwardsEnscape left a comment

Choose a reason for hiding this comment

maxkatz6 commented May 2, 2024

maxkatz6 commented May 2, 2024

MrJul commented May 2, 2024

TomEdwardsEnscape commented May 2, 2024

MrJul commented May 2, 2024

MrJul commented May 2, 2024

robloo commented May 2, 2024

MrJul commented May 2, 2024

jp2masa commented May 2, 2024

robloo commented May 3, 2024

cla-avalonia commented May 1, 2024 •

edited

MrJul May 1, 2024 •

edited