Improve performance of UriHelper.GetDisplayUrl #55611

Open
wants to merge 1 commit into
base: main

Conversation

paulomorgado
Contributor

@paulomorgado paulomorgado commented May 8, 2024

Replaced StringBuilder concatenation with the more efficient string.Create method for creating new strings by concatenating scheme, host, pathBase, path, and queryString. This method reduces overhead by avoiding multiple string concatenations and allocating the correct buffer size for the new string. The CopyTo method is used in a callback function to copy each string to the new string, slicing the buffer to remove the copied part. The SchemeDelimiter is always copied to the new string, regardless of its length. This change enhances the performance of the code.

Improve performance of UriHelper.GetDisplayUrl

  • You've read the Contributor Guide and Code of Conduct.
  • You've included unit or integration tests for your change, where applicable.
  • You've included inline docs for your change, where applicable.
  • There's an open issue for the PR that you are making. If you'd like to propose a new feature or change, please open an issue to discuss the change or find an existing issue.

Summary

UriHelper.GetDisplayUrl uses a non-pooled StringBuilder that is instantiated on every invocation. Although the builder is pre-sized to the final length, each call still heap-allocates the builder and an intermediate buffer in addition to the resulting string.

public static string GetDisplayUrl(this HttpRequest request)
{
    var scheme = request.Scheme ?? string.Empty;
    var host = request.Host.Value ?? string.Empty;
    var pathBase = request.PathBase.Value ?? string.Empty;
    var path = request.Path.Value ?? string.Empty;
    var queryString = request.QueryString.Value ?? string.Empty;

    // PERF: Calculate string length to allocate correct buffer size for StringBuilder.
    var length = scheme.Length + SchemeDelimiter.Length + host.Length
        + pathBase.Length + path.Length + queryString.Length;

    return new StringBuilder(length)
        .Append(scheme)
        .Append(SchemeDelimiter)
        .Append(host)
        .Append(pathBase)
        .Append(path)
        .Append(queryString)
        .ToString();
}
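
To make the allocation difference concrete, here is a minimal sketch that compares the bytes allocated by a StringBuilder-based build and a string.Create-based build of one URL. It is a rough illustration, not a benchmark: the class name AllocationSketch, the three-part signature, and the use of GC.GetAllocatedBytesForCurrentThread are assumptions made for this example, and the exact numbers will vary by runtime.

```csharp
using System;
using System.Text;

// Hypothetical helper type for illustration; not part of ASP.NET Core.
public static class AllocationSketch
{
    private const string SchemeDelimiter = "://";

    // Mirrors the StringBuilder approach: builder + internal buffer + final string.
    public static string WithStringBuilder(string scheme, string host, string path)
    {
        var length = scheme.Length + SchemeDelimiter.Length + host.Length + path.Length;
        return new StringBuilder(length)
            .Append(scheme)
            .Append(SchemeDelimiter)
            .Append(host)
            .Append(path)
            .ToString();
    }

    // Writes directly into the final string, with no intermediate buffer.
    public static string WithStringCreate(string scheme, string host, string path)
    {
        var length = scheme.Length + SchemeDelimiter.Length + host.Length + path.Length;
        return string.Create(length, (scheme, host, path), static (buffer, parts) =>
        {
            var (s, h, p) = parts;
            s.CopyTo(buffer);
            buffer = buffer.Slice(s.Length);
            SchemeDelimiter.CopyTo(buffer);
            buffer = buffer.Slice(SchemeDelimiter.Length);
            h.CopyTo(buffer);
            buffer = buffer.Slice(h.Length);
            p.CopyTo(buffer);
        });
    }

    // Returns the bytes allocated on the current thread by one call to 'build'.
    public static long MeasureAllocatedBytes(Func<string> build)
    {
        build(); // warm up so one-time JIT allocations are not counted
        var before = GC.GetAllocatedBytesForCurrentThread();
        build();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Calling AllocationSketch.MeasureAllocatedBytes once with each method (for example with "http", "cname.domain.tld" and "/") should report a larger figure for the StringBuilder version, since it allocates the builder and its buffer on top of the final string.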

Motivation and goals

This method is frequently used in hot paths like redirect and rewrite rules.

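From the benchmarks below, we can see that, compared to the current implementation using a StringBuilder with enough capacity, string interpolation takes roughly a third of the time (ratio 0.29 to 0.37) and allocates roughly 4 to 8 times less memory (allocation ratio 0.13 to 0.24).

String.Create is faster still, at 13% to 16% of the StringBuilder duration, with the same allocations as string interpolation.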

Benchmarks

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3593/23H2/2023Update/SunValley3)
13th Gen Intel Core i9-13900K, 1 CPU, 32 logical and 24 physical cores
.NET SDK 9.0.100-preview.4.24267.66
  [Host]     : .NET 9.0.0 (9.0.24.26619), X64 RyuJIT AVX2
  DefaultJob : .NET 9.0.0 (9.0.24.26619), X64 RyuJIT AVX2

| Method | scheme | host | basePath | path | query | Mean | Ratio | Gen0 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|
| StringBuilder | http | cname.domain.tld | | / | | 67.161 ns | 1.00 | 0.0288 | 544 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | | / | | 24.606 ns | 0.37 | 0.0038 | 72 B | 0.13 |
| String_Concat | http | cname.domain.tld | | / | | 23.160 ns | 0.34 | 0.0038 | 72 B | 0.13 |
| String_Create | http | cname.domain.tld | | / | | 9.903 ns | 0.15 | 0.0038 | 72 B | 0.13 |
| StringBuilder | http | cname.domain.tld | | / | ?para(...)alue3 [42] | 92.873 ns | 1.00 | 0.0446 | 840 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | | / | ?para(...)alue3 [42] | 26.817 ns | 0.29 | 0.0085 | 160 B | 0.19 |
| String_Concat | http | cname.domain.tld | | / | ?para(...)alue3 [42] | 25.303 ns | 0.27 | 0.0085 | 160 B | 0.19 |
| String_Create | http | cname.domain.tld | | / | ?para(...)alue3 [42] | 11.978 ns | 0.13 | 0.0085 | 160 B | 0.19 |
| StringBuilder | http | cname.domain.tld | | /path/one/two/three | | 74.314 ns | 1.00 | 0.0314 | 592 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | | /path/one/two/three | | 23.582 ns | 0.32 | 0.0059 | 112 B | 0.19 |
| String_Concat | http | cname.domain.tld | | /path/one/two/three | | 35.836 ns | 0.48 | 0.0059 | 112 B | 0.19 |
| String_Create | http | cname.domain.tld | | /path/one/two/three | | 9.352 ns | 0.13 | 0.0059 | 112 B | 0.19 |
| StringBuilder | http | cname.domain.tld | | /path/one/two/three | ?para(...)alue3 [42] | 93.593 ns | 1.00 | 0.0467 | 880 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | | /path/one/two/three | ?para(...)alue3 [42] | 28.930 ns | 0.31 | 0.0102 | 192 B | 0.22 |
| String_Concat | http | cname.domain.tld | | /path/one/two/three | ?para(...)alue3 [42] | 41.730 ns | 0.45 | 0.0102 | 192 B | 0.22 |
| String_Create | http | cname.domain.tld | | /path/one/two/three | ?para(...)alue3 [42] | 13.065 ns | 0.14 | 0.0102 | 192 B | 0.22 |
| StringBuilder | http | cname.domain.tld | /base-path | / | | 71.984 ns | 1.00 | 0.0305 | 576 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | /base-path | / | | 23.342 ns | 0.32 | 0.0051 | 96 B | 0.17 |
| String_Concat | http | cname.domain.tld | /base-path | / | | 20.272 ns | 0.28 | 0.0051 | 96 B | 0.17 |
| String_Create | http | cname.domain.tld | /base-path | / | | 10.282 ns | 0.14 | 0.0051 | 96 B | 0.17 |
| StringBuilder | http | cname.domain.tld | /base-path | / | ?para(...)alue3 [42] | 93.702 ns | 1.00 | 0.0459 | 864 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | /base-path | / | ?para(...)alue3 [42] | 28.924 ns | 0.31 | 0.0093 | 176 B | 0.20 |
| String_Concat | http | cname.domain.tld | /base-path | / | ?para(...)alue3 [42] | 25.931 ns | 0.28 | 0.0093 | 176 B | 0.20 |
| String_Create | http | cname.domain.tld | /base-path | / | ?para(...)alue3 [42] | 13.951 ns | 0.15 | 0.0093 | 176 B | 0.20 |
| StringBuilder | http | cname.domain.tld | /base-path | /path/one/two/three | | 75.755 ns | 1.00 | 0.0327 | 616 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | /base-path | /path/one/two/three | | 24.781 ns | 0.33 | 0.0068 | 128 B | 0.21 |
| String_Concat | http | cname.domain.tld | /base-path | /path/one/two/three | | 36.724 ns | 0.48 | 0.0068 | 128 B | 0.21 |
| String_Create | http | cname.domain.tld | /base-path | /path/one/two/three | | 11.092 ns | 0.15 | 0.0068 | 128 B | 0.21 |
| StringBuilder | http | cname.domain.tld | /base-path | /path/one/two/three | ?para(...)alue3 [42] | 89.961 ns | 1.00 | 0.0479 | 904 B | 1.00 |
| String_Interpolation | http | cname.domain.tld | /base-path | /path/one/two/three | ?para(...)alue3 [42] | 31.002 ns | 0.34 | 0.0114 | 216 B | 0.24 |
| String_Concat | http | cname.domain.tld | /base-path | /path/one/two/three | ?para(...)alue3 [42] | 41.576 ns | 0.46 | 0.0114 | 216 B | 0.24 |
| String_Create | http | cname.domain.tld | /base-path | /path/one/two/three | ?para(...)alue3 [42] | 14.374 ns | 0.16 | 0.0115 | 216 B | 0.24 |

StringBuilder

This benchmark uses the same implementation as UriHelper.GetDisplayUrl.

String_Interpolation

This benchmark uses string interpolation to build the URL.

String_Concat

This benchmark uses the String.Concat(ReadOnlySpan<string?> values) overload, which is new in .NET 9.0, to build the URL.

String_Create

This benchmark uses String.Create and spans to build the URL.

Code

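using System;
using System.Collections.Generic;
using System.Text;
using BenchmarkDotNet.Attributes;
using Microsoft.AspNetCore.Http;

[MemoryDiagnoser]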
[HideColumns("Error", "StdDev", "Median", "RatioSD")]
public class DisplayUrlBenchmark
{
    private static readonly string SchemeDelimiter = Uri.SchemeDelimiter;

    private static readonly string[] schemes = ["http"];
    private static readonly string[] hosts = ["cname.domain.tld"];
    private static readonly string[] basePaths = [null, "/base-path",];
    private static readonly string[] paths = ["/", "/path/one/two/three",];
    private static readonly string[] queries = [null, "?param1=value1&param2=value2&param3=value3",];

    public IEnumerable<object[]> Data()
    {
        foreach (var scheme in schemes)
        {
            foreach (var host in hosts)
            {
                foreach (var basePath in basePaths)
                {
                    foreach (var path in paths)
                    {
                        foreach (var query in queries)
                        {
                            yield return new object[] { scheme, new HostString(host), new PathString(basePath), new PathString(path), new QueryString(query), };
                        }
                    }
                }
            }
        }
    }

    [Benchmark(Baseline = true)]
    [ArgumentsSource(nameof(Data))]
    public string StringBuilder(string scheme, HostString host, PathString basePath, PathString path, QueryString query)
    {
        var schemeValue = scheme ?? string.Empty;
        var hostValue = host.Value ?? string.Empty;
        var basePathValue = basePath.Value ?? string.Empty;
        var pathValue = path.Value ?? string.Empty;
        var queryValue = query.Value ?? string.Empty;

        // Sum the part lengths so the StringBuilder is created with the exact capacity.
        var length =
            schemeValue.Length
            + SchemeDelimiter.Length
            + hostValue.Length
            + basePathValue.Length
            + pathValue.Length
            + queryValue.Length;

        return new StringBuilder(length)
                .Append(schemeValue)
                .Append(SchemeDelimiter)
                .Append(hostValue)
                .Append(basePathValue)
                .Append(pathValue)
                .Append(queryValue)
                .ToString();
    }

    [Benchmark]
    [ArgumentsSource(nameof(Data))]
    public string String_Interpolation(string scheme, HostString host, PathString basePath, PathString path, QueryString query)
    {
        return $"{scheme}://{host.Value}{basePath.Value}{path.Value}{query.Value}";
    }

    [Benchmark]
    [ArgumentsSource(nameof(Data))]
    public string String_Concat(string scheme, HostString host, PathString basePath, PathString path, QueryString query)
    {
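        // Use path.Value: passing the PathString itself goes through its implicit
        // string conversion, which URI-escapes the path instead of just reading it.
        return string.Concat((ReadOnlySpan<string>)[scheme, "://", host.Value, basePath.Value, path.Value, query.Value]);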
    }

    [Benchmark]
    [ArgumentsSource(nameof(Data))]
    public string String_Create(string scheme, HostString host, PathString basePath, PathString path, QueryString query)
    {
        var schemeValue = scheme ?? string.Empty;
        var hostValue = host.Value ?? string.Empty;
        var basePathValue = basePath.Value ?? string.Empty;
        var pathValue = path.Value ?? string.Empty;
        var queryValue = query.Value ?? string.Empty;

        var length =
            +schemeValue.Length
            + SchemeDelimiter.Length
            + hostValue.Length
            + basePathValue.Length
            + pathValue.Length
            + queryValue.Length;

        return string.Create(
            length,
            (schemeValue, hostValue, basePathValue, pathValue, queryValue),
            static (buffer, uriParts) =>
            {
                var (scheme, host, basePath, path, query) = uriParts;

                if (scheme.Length > 0)
                {
                    scheme.CopyTo(buffer);
                    buffer = buffer.Slice(scheme.Length);
                }

                SchemeDelimiter.CopyTo(buffer);
                buffer = buffer.Slice(SchemeDelimiter.Length);

                if (host.Length > 0)
                {
                    host.CopyTo(buffer);
                    buffer = buffer.Slice(host.Length);
                }

                if (basePath.Length > 0)
                {
                    basePath.CopyTo(buffer);
                    buffer = buffer.Slice(basePath.Length);
                }

                if (path.Length > 0)
                {
                    path.CopyTo(buffer);
                    buffer = buffer.Slice(path.Length);
                }

                if (query.Length > 0)
                {
                    query.CopyTo(buffer);
                }
            });
    }
}


Fixes #28906

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label May 8, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 8, 2024
.Append(path)
.Append(queryString)
.ToString();
return string.Create(
Member

@MihaZupan MihaZupan May 9, 2024

Is this meaningfully better than just calling the span Concat overload?

return string.Concat((ReadOnlySpan<string?>)[
    request.Scheme,
    SchemeDelimiter,
    request.Host.Value,
    request.PathBase.Value,
    request.Path.Value,
    request.QueryString.Value
]);

Contributor Author

Because the string.Concat overloads that take ReadOnlySpan<char> only go up to 4 parameters and 6 are needed, an array allocation is needed, as you figured out (the API you "used" doesn't exist).

I haven't benchmarked this one, but I don't expect it to be better than string.Create.

Member

It does exist in .NET 9; it might be fairly recent.

Contributor Author

I still can't see how that could be better than string.Create. Have you benchmarked it?

Member

@MihaZupan MihaZupan May 9, 2024

It's better by way of being 10x shorter and thus more readable/maintainable.

Unless rolling out the custom Concat is meaningfully faster, I don't think it's worth the extra logic.
Judging by the implementation, I would expect the two to perform very similarly.

Member

> It's around 3 times faster and uses around 4 times less memory and is a pattern already used in the code base.

I'm not saying we shouldn't make any changes here. Thank you for looking into this and improving things.
Going through string.Create definitely is faster than StringBuilder, that isn't being disputed.

I'm saying that using an existing helper, string.Concat(ReadOnlySpan<string>), should perform very similarly to going through string.Create, which needs significantly more logic.

Member

@MihaZupan MihaZupan May 9, 2024

E.g. for a sample input from your benchmark:

| Method | Mean | Allocated |
|---|---|---|
| StringBuilder | 262.35 ns | 904 B |
| String_Interpolation | 111.77 ns | 216 B |
| Concat | 88.82 ns | 216 B |
| String_Create | 75.50 ns | 216 B |

Then the question becomes whether the extra LOC are worth it for 10 ns on GetDisplayUrl.
I'll leave that up to the maintainers of this repo.

Side note: I wouldn't be surprised if Interpolation ends up compiling to the same thing as Concat in future compiler versions.

Member

Seems odd that Concat is slower, it should basically be doing the same thing as string.Create here.

Is it possible that the CopyStringContent calls are adding overhead that could be reduced?

Member

@MihaZupan MihaZupan May 13, 2024

That helper is most likely getting inlined. My guess would be it's a combination of

  • extra branches to guard against null values (both when calculating length and before copying)
  • extra branch to check for length overflow (which theoretically the manual string.Create should also be doing)
  • extra branches for length checks before copying (defensive in case the backing values changed)
  • extra branch at the end to check if the length is correct
  • branches from having the loops at all instead of being effectively manually unrolled
  • One extra Memmove call for "://" which would otherwise be turned into a couple movs when used directly in a CopyTo like in the string.Create impl
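
The branches listed above can be pictured with a rough sketch of what a general-purpose concatenation helper has to do. This is a hypothetical illustration written for this discussion, not the runtime's string.Concat implementation: the names ConcatIllustration and ConcatSketch are assumptions, and it takes an array rather than a span so the values can be passed as string.Create state on any recent SDK.

```csharp
using System;

// Hypothetical type for illustration only; NOT the runtime's string.Concat.
public static class ConcatIllustration
{
    public static string ConcatSketch(params string?[] values)
    {
        // Pass 1: sum the lengths, with a null guard per element.
        long totalLength = 0;
        foreach (var value in values)
        {
            if (value is not null)
            {
                totalLength += value.Length;
            }
        }

        // Overflow guard on the combined length.
        if (totalLength > int.MaxValue)
        {
            throw new OutOfMemoryException();
        }

        if (totalLength == 0)
        {
            return string.Empty;
        }

        // Pass 2: copy in a loop, re-checking nulls and the remaining space.
        return string.Create((int)totalLength, values, static (buffer, parts) =>
        {
            var position = 0;
            foreach (var part in parts)
            {
                if (string.IsNullOrEmpty(part))
                {
                    continue;
                }

                // Defensive length check in case the inputs changed after pass 1.
                if (part.Length > buffer.Length - position)
                {
                    throw new InvalidOperationException("Input changed during concatenation.");
                }

                part.CopyTo(buffer.Slice(position));
                position += part.Length;
            }

            // Final check that the whole buffer was filled.
            if (position != buffer.Length)
            {
                throw new InvalidOperationException("Input changed during concatenation.");
            }
        });
    }
}
```

For example, ConcatIllustration.ConcatSketch("http", "://", "cname.domain.tld", null, "/", null) returns "http://cname.domain.tld/". A hand-written string.Create callback for a fixed set of six parts can skip the loop and most of these guards, which is the difference being guessed at here.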

Contributor Author

@BrennanConroy,

> Seems odd that Concat is slower, it should basically be doing the same thing as string.Create here.
>
> Is it possible that the CopyStringContent calls are adding overhead that could be reduced?

public static string Concat(/*params*/ ReadOnlySpan<string?> values) is not yet in .NET 9.0.0-preview.3.24172.9. As soon as it is in a preview, I'll update my benchmarks to use it.

The current overloads of string.Concat that do not require allocating a string[] accept at most 4 strings, and this requires 6.

@MihaZupan,

> That helper is most likely getting inlined. My guess would be it's a combination of
>
>   • extra branches to guard against null values (both when calculating length and before copying)
>   • extra branch to check for length overflow (which theoretically the manual string.Create should also be doing)
>   • extra branches for length checks before copying (defensive in case the backing values changed)
>   • extra branch at the end to check if the length is correct
>   • branches from having the loops at all instead of being effectively manually unrolled
>   • One extra Memmove call for "://" which would otherwise be turned into a couple movs when used directly in a CopyTo like in the string.Create impl

When hot path performance is involved, I tend to not rely on guessing. I rely on personal and community experience and concrete measurements.

@amcasey
Member

amcasey commented May 15, 2024

I'm going to be sad if this is the most efficient way to concatenate six strings.

@paulomorgado
Contributor Author

> I'm going to be sad if this is the most efficient way to concatenate six strings.

Up to 9.0.0-preview.3.24175.3, it is.

@paulomorgado
Contributor Author

Updated to use the String.Concat(ReadOnlySpan<string?> values) method, which is new in .NET 9.0.

It is always slower than String.Create, and slower than string interpolation in the cases with the longer path (/path/one/two/three).

Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions community-contribution Indicates that the PR has been added by a community member Perf
Development

Successfully merging this pull request may close these issues.

UriHelper.GetDisplayUrl: opportunity for performance improvement
5 participants