twisted.internet.test.test_tcp.AbortConnectionTests_*Tests.test_fullWriteBuffer
test timeout on macOS
#12151
Comments
I am trying to reproduce it. I am using something like this on Ubuntu...
It failed with
If I don't run in parallel... it doesn't look like it's failing.
Another failure after running in a loop for about 1 minute
The issue here has to do with vagaries regarding the maximum outbound buffer size. The test writes one big outgoing chunk, then assumes that this chunk is big enough to flood its own buffer. If the OS were to actually consume that data somehow, and just put it into a big buffer, hanging is what we would logically see.
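To make the buffer-size point concrete, here is a rough sketch (plain sockets, no Twisted) that measures how much data the OS will accept on a loopback TCP connection whose peer never reads. The function name `bytes_until_full` and its parameters are made up for illustration; the number it returns varies by OS and kernel settings, which is exactly why a fixed-size "big chunk" may or may not fill the buffer.

```python
import socket


def bytes_until_full(chunk_size=65536, limit=1 << 30):
    """Return how many bytes a non-blocking loopback TCP socket accepts
    before the OS refuses more, given a peer that never reads.

    `limit` is only a safety cap for this sketch (hypothetical parameter).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    client.setblocking(False)

    sent = 0
    chunk = b"x" * chunk_size
    try:
        while sent < limit:
            try:
                # send() may accept fewer bytes than offered.
                sent += client.send(chunk)
            except BlockingIOError:
                # Both the peer's receive buffer and our send buffer are full.
                break
    finally:
        client.close()
        server.close()
        listener.close()
    return sent


if __name__ == "__main__":
    print("bytes accepted before the buffer filled:", bytes_until_full())
```

If the value printed on a given machine is larger than the chunk the test writes, the test's write would be swallowed whole and the "full write buffer" condition it relies on would never be reached.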
I can actually get a failure relatively quickly with just
On Ubuntu Desktop 23.10 (default kernel setup) I am not able to get that test to fail.
After the merge of #12160 I still get an error... but it might be due to OS limitations
Sometimes the error is something like this... not sure if it's related
I am pretty sure this one is an OS limitation.
I suspect this one is yet another, different test flake :)
I have instrumented it and I can get it to fail locally, reliably, in the same way: the server connection doesn't get notified of its connection dropping. … And, oh, of course, this is a predictable failure mode, because the server does
This might just be a bug in asyncio. There are definitely some potential race conditions in the Twisted code here too, but when I fixed them and instrumented
Scratch that, not asyncio, I can reproduce it on the select reactor.
It happens much more frequently if I lose focus on the terminal window where the test is running, which makes me wonder if the OS is implicated too
I think I'm going to stop debugging this for a while because this looks like a kernel bug where select (and kqueue, and CFRunLoop, and friends) will notify that a socket whose peer has closed has some work to do, recv() will give errno 35 (EAGAIN on macOS), and if you busy-loop polling the file descriptor, it will start giving errno 54 (ECONNRESET on macOS) as expected after 5 seconds or so, but if you just select() on it, it will not start appearing as readable at that transition. Building a test case to illustrate that the sockets layer is broken in this way would be an annoying amount of work, because we'd really want an absolutely minimal test case that doesn't involve any Twisted code, just sockets.
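As a starting point for such a sockets-only test case, something like the following helper could be used to compare what select() reports with what a non-blocking recv() actually does. This is only a sketch of the probe, not a reproduction of the bug; `probe` is a hypothetical name, and it assumes the socket has already been put in non-blocking mode.

```python
import select
import socket


def probe(sock):
    """Return (readable, outcome) for a non-blocking socket.

    `readable` is what select() reports with a zero timeout; `outcome` is
    what a one-byte peeking recv() does: "data", "eof", "EAGAIN", or
    "ECONNRESET".
    """
    readable, _, _ = select.select([sock], [], [], 0)
    try:
        # MSG_PEEK leaves any pending data in the receive queue.
        data = sock.recv(1, socket.MSG_PEEK)
        outcome = "eof" if data == b"" else "data"
    except BlockingIOError:
        outcome = "EAGAIN"
    except ConnectionResetError:
        outcome = "ECONNRESET"
    return bool(readable), outcome


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.setblocking(False)
    print(probe(a))  # nothing pending yet
    b.send(b"x")
    print(probe(a))  # one byte pending
```

Calling `probe` in a loop on the server side of an aborted connection, and logging each change in the returned pair, would show whether `outcome` moves from "EAGAIN" to "ECONNRESET" while `readable` stays false, which is the behaviour described above.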
While debugging #12150 I saw this error locally:
which suggests that maybe CFReactor isn't the only flaky test in there…