ConcurrentDictionary.TryGetValue crashes with System.IndexOutOfRangeException: Arg_IndexOutOfRangeException #8866

Closed
igor-sidorovich opened this issue Apr 11, 2024 · 20 comments
Labels
Area: Mono Runtime Mono-related issues: BCL bugs, AOT issues, etc. need-attention A xamarin-android contributor needs to review

Comments

@igor-sidorovich
igor-sidorovich commented Apr 11, 2024

Android application type

.NET Android (net7.0-android, net8.0-android, etc.)

Affected platform version

.NET 8.0.200, workload 34.0.52/8.0.100

Description

Since updating from Xamarin to .NET 8, we have been getting the following crashes in ConcurrentDictionary.

00:49:25:118|FATAL| 6|AndroidRestCrashHandler|-Core-|AndroidEnvironment.UnhandledExceptionRaiser. UtcNow is 2024-04-09T19:19:25. --> System.IndexOutOfRangeException: Arg_IndexOutOfRangeException
at System.Collections.Concurrent.ConcurrentDictionary`2[[System.String, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[BB.Features.Resources.Remote.Models.RemoteResourcePackage, BB.Features, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]].TryGetValue(String key, RemoteResourcePackage& value)

It seems strange to me to see such crashes in ConcurrentDictionary.

Steps to Reproduce

Unfortunately, I do not have a repro (I tried but didn't succeed), but we see these crashes in production.

Did you find any workaround?

No

Relevant log output

No response
@igor-sidorovich igor-sidorovich added Area: App Runtime Issues in `libmonodroid.so`. needs-triage Issues that need to be assigned. labels Apr 11, 2024
@grendello
Member

@igor-sidorovich even though you cannot reproduce it, can you attach a snippet of code which shows more-or-less what you're doing when the crash happens?

Also, please gather and post info about the crash environment(s) - Android version, locale, device(s).

Lastly, could you attach full crash stack trace, with as much context as possible, that you get in the Google console?

Thanks!

@grendello grendello removed their assignment Apr 11, 2024
@grendello grendello added Area: Mono Runtime Mono-related issues: BCL bugs, AOT issues, etc. and removed Area: App Runtime Issues in `libmonodroid.so`. needs-triage Issues that need to be assigned. labels Apr 11, 2024
@grendello
Member

@vitek-karas would you mind looking into this issue?

@grendello grendello added the need-info Issues that need more information from the author. label Apr 11, 2024
@igor-sidorovich
Author

igor-sidorovich commented Apr 11, 2024

  1. We do not have logs in the Google console for this crash (probably because it is a managed crash), only SIGSEGV and SIGABRT entries without stack traces.
  2. I don't think I can attach a code snippet, because it is a really huge application and the crash happens in different places where we use ConcurrentDictionary.
  3. List of devices:
    [image: list of affected devices]

P.S. As far as I can see, we only have problems with ConcurrentDictionary instances whose key is a string. The key length is approximately 30-80 characters, and each ConcurrentDictionary holds approximately 20,000-34,000 keys. I tried different samples, and also overrode GetHashCode to increase collisions, but still could not reproduce the crash.
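
For readers following along, a collision-forcing test of the kind described above might look like the sketch below. This is not the author's actual code. The names `CollidingStringComparer` and `CollisionStress` are hypothetical, and the key shape (a prefix plus a zero-padded number, 38 characters long) is only an assumption chosen to land in the 30-80 character range mentioned above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical comparer that squeezes every key into a handful of hash codes,
// so that many keys share the same bucket chain.
public sealed class CollidingStringComparer : IEqualityComparer<string>
{
    private readonly uint _distinctHashes;

    public CollidingStringComparer(uint distinctHashes) => _distinctHashes = distinctHashes;

    public bool Equals(string? x, string? y) => string.Equals(x, y, StringComparison.Ordinal);

    public int GetHashCode(string obj) =>
        (int)((uint)StringComparer.Ordinal.GetHashCode(obj) % _distinctHashes);
}

public static class CollisionStress
{
    // Adds and reads back keyCount keys in parallel.
    // Returns how many operations either threw IndexOutOfRangeException
    // or read back an unexpected value.
    public static int Run(int keyCount, uint distinctHashes)
    {
        var dict = new ConcurrentDictionary<string, int>(new CollidingStringComparer(distinctHashes));
        int failures = 0;

        Parallel.For(0, keyCount, i =>
        {
            try
            {
                // Assumed key shape: 8-character prefix + 30 digits.
                string key = "package/" + i.ToString("D30");
                dict.TryAdd(key, i);
                if (!dict.TryGetValue(key, out int value) || value != i)
                {
                    Interlocked.Increment(ref failures);
                }
            }
            catch (IndexOutOfRangeException)
            {
                Interlocked.Increment(ref failures);
            }
        });

        return failures;
    }
}
```

Calling `CollisionStress.Run(20000, 16)` exercises long bucket chains under concurrent access. A return value of 0 means the run did not hit the crash, which matches what was reported here.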

@microsoft-github-policy-service microsoft-github-policy-service bot added need-attention A xamarin-android contributor needs to review and removed need-info Issues that need more information from the author. labels Apr 11, 2024
@simonrozsival
Contributor

@igor-sidorovich do you use custom IEqualityComparer<string>? If yes, would it be possible to share the code with us?

If the problem is not with the equality comparer, it might be in HashHelpers.FastMod, which is used to compute the index into the hash table. I don't see anything obviously wrong with it, but maybe it doesn't behave correctly on some ARM64 processors?
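
To make the FastMod idea concrete, here is a minimal sketch of a multiply-and-shift modulo, written from memory of the approach used in the runtime's HashHelpers. Treat the exact formula as an assumption and check it against the dotnet/runtime source before relying on it. The class name `FastModSketch` and the helper `AgreesWithModulo` are hypothetical and exist only for this illustration.

```csharp
using System;

public static class FastModSketch
{
    // Precomputes the multiplier for a fixed divisor (for example, the bucket count).
    public static ulong GetFastModMultiplier(uint divisor) =>
        unchecked(ulong.MaxValue / divisor + 1);

    // Computes value % divisor with multiplications and shifts instead of a division.
    // Assumed to be valid only for divisors no larger than int.MaxValue.
    public static uint FastMod(uint value, uint divisor, ulong multiplier) =>
        unchecked((uint)(((((multiplier * value) >> 32) + 1) * divisor) >> 32));

    // Compares FastMod against the plain % operator on pseudo-random inputs
    // spanning the full uint range.
    public static bool AgreesWithModulo(uint divisor, int samples, int seed)
    {
        var rng = new Random(seed);
        ulong multiplier = GetFastModMultiplier(divisor);

        for (int i = 0; i < samples; i++)
        {
            uint value = (uint)rng.NextInt64(0, (long)uint.MaxValue + 1);
            if (FastMod(value, divisor, multiplier) != value % divisor)
            {
                return false;
            }
        }

        return true;
    }
}
```

If the two computations agree for a given bucket count, the resulting index is always smaller than the bucket count. An out-of-range index would then require the multiplier and the bucket array to be out of sync with each other, or a genuine miscomputation on the device.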

@igor-sidorovich
Author

  1. We use ConcurrentDictionary with the default constructor: _packages = new ConcurrentDictionary<string, RemoteResourcePackage>(). As far as I can see from the .NET source code, the default constructor passes the following parameters:
public ConcurrentDictionary()
    : this(DefaultConcurrencyLevel, DefaultCapacity, growLockArray: true, null) { }

So it looks like we are using the default string comparer:

private static readonly NonRandomizedStringEqualityComparer WrappedAroundDefaultComparer = new OrdinalComparer(EqualityComparer<string?>.Default);
  2. But, as far as I can see, FastMod is always used on 64-bit CPUs:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static Node? GetBucket(Tables tables, int hashcode)
{
    VolatileNode[] buckets = tables._buckets;
    if (IntPtr.Size == 8)
    {
        return buckets[HashHelpers.FastMod((uint)hashcode, (uint)buckets.Length, tables._fastModBucketsMultiplier)]._node;
    }
    else
    {
        return buckets[(uint)hashcode % (uint)buckets.Length]._node;
    }
}

Is there an option to not use FastMod?

  3. And please clarify, what does "it doesn't behave correctly on some ARM64 processors?" mean? Thanks.

@simonrozsival
Contributor

@igor-sidorovich thanks for the additional information.

I tried reproducing the exception today and I wasn't successful. I tried running the following code on my Samsung S23, and neither TryGetValue nor any other method threw an exception:

private readonly ConcurrentDictionary<string, string> _dictionary = new();
private Thread[] _threads = new Thread[11];
private int count; // total iterations completed across all worker threads

private void OnCounterClicked(object sender, EventArgs e)
{
	bool gotException = false;

	for (int i = 1; i < _threads.Length; i++)
	{
		_threads[i] = new Thread(() =>
		{
			while (!gotException)
			{
				try
				{
					var key = GenerateRandomString(30, 80);
					var value = "abc";

					_dictionary.TryAdd(key, value);
					_dictionary.TryGetValue(key, out _);

					Interlocked.Increment(ref count);
				}
				catch (Exception ex)
				{
					Console.WriteLine($"Exception: {ex.Message}");
					gotException = true;
				}
			}
		});
	}

	_threads[0] = new Thread(() =>
	{
		while (!gotException)
		{
			Device.BeginInvokeOnMainThread(() => CounterBtn.Text = $"i: {count}, dc: {_dictionary.Count}");
			Thread.Sleep(100);
		}
	});

	foreach (var thread in _threads)
	{
		thread.Start();
	}
}

static string GenerateRandomString(int min, int max)
{
	int length = Random.Shared.Next(min, max);
	var sb = new System.Text.StringBuilder(length);
	for (int i = 0; i < length; i++)
	{
		sb.Append((char)Random.Shared.Next('A', 'Z' + 1));
	}
	return sb.ToString();
}

The OS eventually kills the app, but nothing suggests it is because of ConcurrentDictionary.

And pls clarify, what does it mean "it doesn't behave correctly on some ARM64 processors?"?

It means that maybe this could be some glitch in ARM chips. But since it crashes on all those devices in the screenshot, which run different chips, that's probably not the problem.

I think we need a repro if we want to fix this.

@igor-sidorovich
Author

I think we need a repro if we want to fix this.

I understand, but I also tried to reproduce it on different ARM phones without any success.
That's why I created this ticket: on Xamarin we did not have such crashes.

Is there an option to not use FastMod?

@simonrozsival
Contributor

Is there an option to not use FastMod?

I don't think so. But honestly, I think the chance that this is the problem is quite slim, since it happens on so many different devices and not just on some specific model.

@simonrozsival
Contributor

@igor-sidorovich have you had any success reproducing this issue? Or do you have any data about the frequency of the crashes and the affected OS versions or phone models that you could share with us?

@igor-sidorovich
Author

I do not have a repro; I tried but didn't succeed. I will send the affected OS versions and phone models later, thanks.

@igor-sidorovich
Author

igor-sidorovich commented May 14, 2024

Device              OS
realme 3            Android 10
Redmi 13C           Android 14
A58                 Android 13
Redmi 8             Android 10
V2027               Android 12
Galaxy S23 Ultra    Android 14
HUAWEI P30 lite     Android 10
moto g(30)          Android 12
Alcatel 1SE         Android 10
ZERO X Pro          Android 11
Galaxy A22s 5G      Android 13

@simonrozsival
Contributor

@igor-sidorovich thanks for the additional info. I was also interested in the frequency - are these rare cases or is this affecting a large number of customers?

@igor-sidorovich
Author

igor-sidorovich commented May 14, 2024

We opened it to only 0.5% of customers, in India, Colombia, Thailand, and Pakistan. And of course, these are rare cases.

@igor-sidorovich
Author

For 50,000 users, we have only 10 crashes.

@simonrozsival
Contributor

@jpobst So far, this issue doesn't seem to be Android-specific. I think we should transfer it into dotnet/runtime. Is there some easy way to do it or is it easiest to close this one and open a new one?

@jpobst
Contributor

jpobst commented May 17, 2024

Unfortunately there isn't an easy way to transfer issues to a different organization. We will probably need to open a new one in dotnet/runtime.

@igor-sidorovich
Author

Should I open a new one?

@jpobst
Contributor

jpobst commented May 17, 2024

If you don't mind, please do.

Having the issue in dotnet/runtime will help ensure it stays on the Runtime team's radar and is included in their prioritization.

Please include a link to this issue in the description so they can reference the context here.

Thanks!

@igor-sidorovich
Author

igor-sidorovich commented May 22, 2024

@simonrozsival, @jpobst Done - dotnet/runtime#102545.

@jpobst
Contributor

jpobst commented May 22, 2024

Thanks!

Tracking this here: dotnet/runtime#102545.

@jpobst jpobst closed this as not planned May 22, 2024