Nursery does not re-raise error from .run_in_actor() remote task #287

Open
didimelli opened this issue Jan 3, 2022 · 22 comments · May be fixed by #288
Labels: bug (Something isn't working) · cancellation (SC teardown semantics and anti-zombie semantics) · supervision

Comments

didimelli commented Jan 3, 2022

Hello,
I am on WSL2, with Python 3.9.9 and tractor 0.1.0a4.

I am testing cancellation of sibling tasks when one of them raises, using this simple snippet:

import tractor
import trio
from trio import sleep
from trio import sleep_forever

async def not_raising() -> None:
    print("inside not raising")
    await sleep_forever()

async def raising() -> None:
    print("inside raising!")
    await sleep(5)
    raise ValueError("random raise from bad task")

async def main() -> None:
    async with tractor.open_nursery() as n:
        t1 = await n.run_in_actor(not_raising, name="raising")
        t2 = await n.run_in_actor(raising, name="not raising")
    print("outside the tractor nursery")

async def main2() -> None:
    async with trio.open_nursery() as n:
        n.start_soon(not_raising)
        n.start_soon(raising)
    print("outside trio nursery")

if __name__ == "__main__":
    trio.run(main)

I would expect that when the raising task raises ValueError, the other task gets cancelled (as trio does in the main2 function). However, when I run this code, I get:

No actor could be found @ 127.0.0.1:1616
inside not raising
inside raising!
Jan 03 16:03:58 (not raising, 11045, raising)) [ERROR] tractor _actor.py:260 Actor crashed:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task
Remote context error for ('not raising', '8e3309a2-2e39-4a8e-a3ec-15ce0977c1a8'):244eb763-1b98-4f46-9d09-93aa6791e400:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Cancelling ('not raising', '8e3309a2-2e39-4a8e-a3ec-15ce0977c1a8') after error ('not raising', '8e3309a2-2e39-4a8e-a3ec-15ce0977c1a8')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Cancelling existing result waiter task for ('not raising', '8e3309a2-2e39-4a8e-a3ec-15ce0977c1a8')

and the program hangs indefinitely. I need to press ctrl+c to kill the process.

The traceback shows that tractor is trying to cancel the not_raising task (Cancelling ('not raising', '8e3309a2-2e39-4a8e-a3ec-15ce0977c1a8') after error ('not raising', '8e3309a2-2e39-4a8e-a3ec-15ce0977c1a8')), but the program hangs there.

If I use sleep(20) instead of sleep_forever(), the program actually sleeps for the full 20 seconds before exiting.

Since this is a basic feature of the library, I feel like I am missing something.
Could you please help me understand what is happening?
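
For reference, here is a sketch of the semantics I would expect (untested, and assuming the remote error surfaces wrapped as tractor.RemoteActorError, which is the type shown in the tracebacks further down):

async def expected() -> None:
    try:
        async with tractor.open_nursery() as n:
            await n.run_in_actor(not_raising)
            await n.run_in_actor(raising)
    except tractor.RemoteActorError as err:
        # by the time the nursery re-raises here, the sibling
        # not_raising actor should already have been cancelled
        print(f"nursery re-raised: {err}")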

didimelli (Author) commented Jan 3, 2022

I can work around this by doing:

async def main() -> None:
    async with tractor.open_nursery() as n:
        t1 = await n.run_in_actor(not_raising, name="raising")
        t2 = await n.run_in_actor(raising, name="not raising")
        async with trio.open_nursery() as tn:
            tn.start_soon(t1.result)
            tn.start_soon(t2.result)
    print("outside the tractor nursery")

but the traceback is weird (reported below):

No actor could be found @ 127.0.0.1:1616
inside not raising
inside raising!
Jan 03 16:27:41 (not raising, 14349, raising)) [ERROR] tractor _actor.py:260 Actor crashed:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task
Remote context error for ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14'):5ca62b9f-f356-49b0-b38a-3465dd97eaa4:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Nursery for ('root', '1768cd15-f853-44ba-bff9-bd556f2229a4') errored with ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task
,
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 83, in _unwrap_msg
    return msg['return']
KeyError: 'return'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 325, in _open_and_supervise_one_cancels_all_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 482, in open_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 37, in main
    tn.start_soon(t2.result)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/trio/_core/_run.py", line 815, in __aexit__
    raise combined_error_from_nursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 182, in result
    return _unwrap_msg(self._result_msg, self.channel)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 87, in _unwrap_msg
    raise unpack_error(msg, channel)
tractor._exceptions.RemoteActorError: ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Cancelling ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14') after error ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Cancelled result waiter for ('root', '1768cd15-f853-44ba-bff9-bd556f2229a4')
Root actor crashed:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 83, in _unwrap_msg
    return msg['return']
KeyError: 'return'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 325, in _open_and_supervise_one_cancels_all_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 482, in open_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 37, in main
    tn.start_soon(t2.result)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/trio/_core/_run.py", line 815, in __aexit__
    raise combined_error_from_nursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 182, in result
    return _unwrap_msg(self._result_msg, self.channel)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 87, in _unwrap_msg
    raise unpack_error(msg, channel)
tractor._exceptions.RemoteActorError: ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_root.py", line 204, in open_root_actor
    yield actor
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 482, in open_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/async_generator/_util.py", line 53, in __aexit__
    await self._agen.athrow(type, value, traceback)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 434, in _open_and_supervise_one_cancels_all_nursery
    raise list(errors.values())[0]
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_spawn.py", line 137, in exhaust_portal
    final = await portal.result()
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 182, in result
    return _unwrap_msg(self._result_msg, self.channel)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 87, in _unwrap_msg
    raise unpack_error(msg, channel)
tractor._exceptions.RemoteActorError: ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Channel for ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14') was already closed
Channel for ('raising', 'b24609b7-c3f6-4772-9e77-e639e5fbac9e') was already closed
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 83, in _unwrap_msg
    return msg['return']
KeyError: 'return'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 325, in _open_and_supervise_one_cancels_all_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 482, in open_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 37, in main
    tn.start_soon(t2.result)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/trio/_core/_run.py", line 815, in __aexit__
    raise combined_error_from_nursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 182, in result
    return _unwrap_msg(self._result_msg, self.channel)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 87, in _unwrap_msg
    raise unpack_error(msg, channel)
tractor._exceptions.RemoteActorError: ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 49, in <module>
    trio.run(main)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/trio/_core/_run.py", line 1932, in run
    raise runner.main_task_outcome.error
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 37, in main
    tn.start_soon(t2.result)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/async_generator/_util.py", line 53, in __aexit__
    await self._agen.athrow(type, value, traceback)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 482, in open_nursery
    yield anursery
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/async_generator/_util.py", line 53, in __aexit__
    await self._agen.athrow(type, value, traceback)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_supervise.py", line 434, in _open_and_supervise_one_cancels_all_nursery
    raise list(errors.values())[0]
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_spawn.py", line 137, in exhaust_portal
    final = await portal.result()
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 182, in result
    return _unwrap_msg(self._result_msg, self.channel)
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_portal.py", line 87, in _unwrap_msg
    raise unpack_error(msg, channel)
tractor._exceptions.RemoteActorError: ('not raising', '021f0dd9-a52c-4b8b-b116-6fd013d30a14')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 232, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

goodboy added the bug, cancellation and supervision labels on Jan 3, 2022
goodboy (Owner) commented Jan 3, 2022

@didimelli ahah! Nice catch.

This does appear to be a bug to do with the nursery reaping.
You are correct: when using run_in_actor() the nursery should reap portals (your t1 and t2 here) on __aexit__().
I'm not sure why this is now broken, but clearly we need a better test for it.

My suspicion is that the issue is a race between when the nursery exit happens and when the error is collected from the portal. This is likely a regression from the latest release: we changed some cancel-request semantics to be more deterministic, and that may have touched error-retrieval logic which we aren't testing in this way.

Would you mind trying this on .alpha3 as well to see if you get the same?
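
In the meantime, this is roughly the kind of regression test I have in mind (a rough sketch, untested; the error type is taken from the tracebacks above):

import pytest
import tractor
import trio

async def sleeper() -> None:
    await trio.sleep_forever()

async def bad_task() -> None:
    await trio.sleep(0.5)
    raise ValueError("boom")

def test_run_in_actor_error_reraised_on_exit() -> None:

    async def main() -> None:
        # guard against the reported hang so the test fails loudly
        # instead of blocking forever
        with trio.fail_after(10):
            async with tractor.open_nursery() as n:
                await n.run_in_actor(sleeper)
                await n.run_in_actor(bad_task)

    # the nursery __aexit__() should re-raise the remote error
    # (assumed here to arrive wrapped as tractor.RemoteActorError,
    # matching the tracebacks above)
    with pytest.raises(tractor.RemoteActorError):
        trio.run(main)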

Cheers and thanks for finding this bug!

goodboy (Owner) commented Jan 3, 2022

As some tips for your example (and how I figured out this was indeed a bug):

  • you probably want to reverse the name= for each .run_in_actor() call 😉 (or just drop name= entirely, since the func name is used by default); see the sketch below
  • you can pass loglevel='runtime' to the .open_nursery() call to see more detailed logging
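
Applied to your main(), that would look something like this (a quick sketch, untested):

async def main() -> None:
    # loglevel='runtime' turns on the runtime's detailed logging
    async with tractor.open_nursery(loglevel='runtime') as n:
        # names now match their target funcs (or drop name= entirely
        # and the function name is used)
        t1 = await n.run_in_actor(not_raising, name="not raising")
        t2 = await n.run_in_actor(raising, name="raising")
    print("outside the tractor nursery")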

goodboy (Owner) commented Jan 3, 2022

Huh, I just tried on the last 2 alphas and the same issue still seems to exist!?
So it's quite possible this has been a long-lived bug.

I will dig in this afternoon.

didimelli (Author) commented Jan 3, 2022

Here are the logs with loglevel='runtime' on a3 (still hanging):

❯ poetry run python gse/main.py
Jan 03 18:30:44 (no actor context, 20253, __main__.main)) [WARNING] tractor _root.py:121 No actor could be found @ 127.0.0.1:1616
Jan 03 18:30:44 (root, 20253, __main__.main)) [INFO] tractor _root.py:158 Starting local <tractor._actor.Arbiter object at 0x7fbc79053be0> @ 127.0.0.1:1616
Jan 03 18:30:44 (root, 20253, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:391 Attempting to import tractor._debug@/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_debug.py
Jan 03 18:30:44 (root, 20253, tractor._actor.Actor._serve_forever)) [RUNTIME] tractor _actor.py:1079 Started tcp server(s) on [<trio.socket.socket fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616)>]
Jan 03 18:30:44 (root, 20253, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:932 Registering <tractor._actor.Arbiter object at 0x7fbc79053be0> for role `root`
Jan 03 18:30:44 (root, 20253, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:960 Waiting on service nursery to complete
Jan 03 18:30:44 (root, 20253, tractor._spawn.new_proc)) [RUNTIME] tractor _spawn.py:236 Started <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20256>
Jan 03 18:30:44 (root, 20253, tractor._spawn.new_proc)) [RUNTIME] tractor _actor.py:364 Waiting for peer ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173') to connect
Jan 03 18:30:45 (no actor context, 20256, no task context)) [INFO] tractor._entry _entry.py:74 Started ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:431 New connection to us <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)>
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')@('127.0.0.1', 1616) complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')@('127.0.0.1', 47296) complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:460 Waking channel waiters _ParkingLotStatistics(tasks_waiting=1)
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:474 Registered <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)> for ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)> from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (root, 20253, tractor._spawn.new_proc)) [RUNTIME] tractor _actor.py:367 ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173') successfully connected back to us
Jan 03 18:30:45 (root, 20253, __main__.main)) [RUNTIME] tractor _actor.py:575 Getting result queue for ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173') cid 2cbfe0e1-728f-482e-94f9-a981fede9067
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:819 Received state from parent:
{'_parent_main_data': {'init_main_from_path': '/home/didi/leaf/leaf-radio/gse/main.py'}, 'enable_modules': {'__main__': '/home/didi/leaf/leaf-radio/gse/main.py', 'tractor._debug': '/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_debug.py'}, '_arb_addr': ('127.0.0.1', 1616), 'bind_host': '127.0.0.1', 'bind_port': 0, '_runtime_vars': {'_debug_mode': False, '_is_root': False, '_root_mailbox': ('127.0.0.1', 1616)}}
Jan 03 18:30:45 (root, 20253, __main__.main)) [RUNTIME] tractor _actor.py:600 Sending cmd to ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'): __main__.not_raising({})
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:828 Runtime vars are: {'_debug_mode': False, '_is_root': False, '_root_mailbox': ('127.0.0.1', 1616)}
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:391 Attempting to import __main__@/home/didi/leaf/leaf-radio/gse/main.py
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:391 Attempting to import tractor._debug@/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_debug.py
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._serve_forever)) [RUNTIME] tractor _actor.py:1079 Started tcp server(s) on [<trio.socket.socket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 36111)>]
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:932 Registering <tractor._actor.Actor object at 0x7ff44bb1a670> for role `not raising`
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:431 New connection to us <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)>
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')@('127.0.0.1', 1616) complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')@('127.0.0.1', 47298) complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:470 already have channel(s) for ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'):[<Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)>]?
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:474 Registered <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)> for ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47298), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)> from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:575 Getting result queue for ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') cid ad6c55a5-fa8b-4047-b89a-e905a04d8d63
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:600 Sending cmd to ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db'): self.register_actor({'uid': ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'), 'sockaddr': ('127.0.0.1', 36111)})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:671 Processing request from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
self.register_actor({'uid': ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'), 'sockaddr': ('127.0.0.1', 36111)})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:720 Spawning task for <bound method Arbiter.register_actor of <tractor._actor.Arbiter object at 0x7fbc79053be0>>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:743 RPC func is <bound method Arbiter.register_actor of <tractor._actor.Arbiter object at 0x7fbc79053be0>>
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:553 Delivering {'functype': 'asyncfunc', 'cid': 'ad6c55a5-fa8b-4047-b89a-e905a04d8d63'} from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') to caller ad6c55a5-fa8b-4047-b89a-e905a04d8d63
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:749 Waiting on next msg for <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)> from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47298), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:553 Delivering {'return': None, 'cid': 'ad6c55a5-fa8b-4047-b89a-e905a04d8d63'} from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') to caller ad6c55a5-fa8b-4047-b89a-e905a04d8d63
Jan 03 18:30:45 (root, 20253, register_actor)) [RUNTIME] tractor _actor.py:254 All RPC tasks have completed
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47298), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47298), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') with last msg:
{'return': None, 'cid': 'ad6c55a5-fa8b-4047-b89a-e905a04d8d63'}
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47296), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:766 channel from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173') closed abruptly:
<Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)>
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:960 Waiting on service nursery to complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)> from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173') with last msg:
{'cmd': ('self', 'register_actor', {'uid': ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'), 'sockaddr': ('127.0.0.1', 36111)}, ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'), 'ad6c55a5-fa8b-4047-b89a-e905a04d8d63')}
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:671 Processing request from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
__main__.not_raising({})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:497 Releasing channel <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)> from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:720 Spawning task for <function not_raising at 0x7ff44b875160>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:505 Peers is defaultdict(<class 'list'>, {('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'): [<Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)>]})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:518 Disconnecting channel <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47298)>
inside not raising
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:743 RPC func is <function not_raising at 0x7ff44b875160>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:553 Delivering {'functype': 'asyncfunc', 'cid': '2cbfe0e1-728f-482e-94f9-a981fede9067'} from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173') to caller 2cbfe0e1-728f-482e-94f9-a981fede9067
Jan 03 18:30:45 (not raising, 20256, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:749 Waiting on next msg for <Channel fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47296), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)> from ('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')
Jan 03 18:30:45 (root, 20253, tractor._spawn.new_proc)) [RUNTIME] tractor _spawn.py:236 Started <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20257>
Jan 03 18:30:45 (root, 20253, tractor._spawn.new_proc)) [RUNTIME] tractor _actor.py:364 Waiting for peer ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') to connect
Jan 03 18:30:45 (no actor context, 20257, no task context)) [INFO] tractor._entry _entry.py:74 Started ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:431 New connection to us <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')@('127.0.0.1', 47300) complete
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')@('127.0.0.1', 1616) complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:460 Waking channel waiters _ParkingLotStatistics(tasks_waiting=1)
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:474 Registered <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (root, 20253, tractor._spawn.new_proc)) [RUNTIME] tractor _actor.py:367 ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') successfully connected back to us
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:819 Received state from parent:
{'_parent_main_data': {'init_main_from_path': '/home/didi/leaf/leaf-radio/gse/main.py'}, 'enable_modules': {'__main__': '/home/didi/leaf/leaf-radio/gse/main.py', 'tractor._debug': '/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_debug.py'}, '_arb_addr': ('127.0.0.1', 1616), 'bind_host': '127.0.0.1', 'bind_port': 0, '_runtime_vars': {'_debug_mode': False, '_is_root': False, '_root_mailbox': ('127.0.0.1', 1616)}}
Jan 03 18:30:45 (root, 20253, __main__.main)) [RUNTIME] tractor _actor.py:575 Getting result queue for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') cid 2f7ff8aa-1192-45d0-a2b7-7e6d7e3cf831
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:828 Runtime vars are: {'_debug_mode': False, '_is_root': False, '_root_mailbox': ('127.0.0.1', 1616)}
Jan 03 18:30:45 (root, 20253, __main__.main)) [RUNTIME] tractor _actor.py:600 Sending cmd to ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'): __main__.raising({})
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:391 Attempting to import __main__@/home/didi/leaf/leaf-radio/gse/main.py
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:391 Attempting to import tractor._debug@/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_debug.py
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._serve_forever)) [RUNTIME] tractor _actor.py:1079 Started tcp server(s) on [<trio.socket.socket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 34913)>]
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:932 Registering <tractor._actor.Actor object at 0x7f7715b756d0> for role `raising`
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:431 New connection to us <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')@('127.0.0.1', 47302) complete
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')@('127.0.0.1', 1616) complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:470 already have channel(s) for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'):[<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>]?
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47302), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:474 Registered <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)> for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:575 Getting result queue for ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') cid b6c8574d-4f5b-48f7-b563-81fa65e742d8
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:600 Sending cmd to ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db'): self.register_actor({'uid': ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'), 'sockaddr': ('127.0.0.1', 34913)})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:671 Processing request from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
self.register_actor({'uid': ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'), 'sockaddr': ('127.0.0.1', 34913)})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:720 Spawning task for <bound method Arbiter.register_actor of <tractor._actor.Arbiter object at 0x7fbc79053be0>>
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:553 Delivering {'functype': 'asyncfunc', 'cid': 'b6c8574d-4f5b-48f7-b563-81fa65e742d8'} from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') to caller b6c8574d-4f5b-48f7-b563-81fa65e742d8
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:743 RPC func is <bound method Arbiter.register_actor of <tractor._actor.Arbiter object at 0x7fbc79053be0>>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:749 Waiting on next msg for <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47302), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, register_actor)) [RUNTIME] tractor _actor.py:254 All RPC tasks have completed
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:553 Delivering {'return': None, 'cid': 'b6c8574d-4f5b-48f7-b563-81fa65e742d8'} from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') to caller b6c8574d-4f5b-48f7-b563-81fa65e742d8
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47302), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47302), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') with last msg:
{'return': None, 'cid': 'b6c8574d-4f5b-48f7-b563-81fa65e742d8'}
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47300), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:766 channel from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') closed abruptly:
<Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') with last msg:
{'cmd': ('self', 'register_actor', {'uid': ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'), 'sockaddr': ('127.0.0.1', 34913)}, ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'), 'b6c8574d-4f5b-48f7-b563-81fa65e742d8')}
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:671 Processing request from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
__main__.raising({})
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:497 Releasing channel <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:720 Spawning task for <function raising at 0x7f77158d18b0>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:505 Peers is defaultdict(<class 'list'>, {('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'): [<Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)>], ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'): [<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>]})
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:960 Waiting on service nursery to complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:518 Disconnecting channel <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47302)>
inside raising!
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:743 RPC func is <function raising at 0x7f77158d18b0>
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:553 Delivering {'functype': 'asyncfunc', 'cid': '2f7ff8aa-1192-45d0-a2b7-7e6d7e3cf831'} from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') to caller 2f7ff8aa-1192-45d0-a2b7-7e6d7e3cf831
Jan 03 18:30:45 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:749 Waiting on next msg for <Channel fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47300), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:45 (root, 20253, __main__.main)) [RUNTIME] tractor._supervise _supervise.py:264 Waiting on subactors {('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'): (<tractor._actor.Actor object at 0x7fbc790b9d30>, <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20256>, <tractor._portal.Portal object at 0x7fbc790b87f0>), ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'): (<tractor._actor.Actor object at 0x7fbc790b99d0>, <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20257>, <tractor._portal.Portal object at 0x7fbc78f27460>)} to complete
Jan 03 18:30:45 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (raising, 20257, raising)) [ERROR] tractor _actor.py:223 Actor crashed:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 195, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task
Jan 03 18:30:50 (raising, 20257, raising)) [RUNTIME] tractor _actor.py:254 All RPC tasks have completed
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:553 Delivering {'error': {'tb_str': 'Traceback (most recent call last):\n  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 195, in _invoke\n    await chan.send({\'return\': await coro, \'cid\': cid})\n  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising\n    raise ValueError("random raise from bad task")\nValueError: random raise from bad task\n', 'type_str': 'ValueError'}, 'cid': '2f7ff8aa-1192-45d0-a2b7-7e6d7e3cf831'} from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') to caller 2f7ff8aa-1192-45d0-a2b7-7e6d7e3cf831
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (root, 20253, tractor._spawn.cancel_on_completion)) [WARNING] tractor _spawn.py:147 Cancelling ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') after error ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 195, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Jan 03 18:30:50 (root, 20253, tractor._spawn.cancel_on_completion)) [CANCEL] tractor._portal _portal.py:205 Sending actor cancel request to ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') on <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>
Jan 03 18:30:50 (root, 20253, tractor._spawn.cancel_on_completion)) [RUNTIME] tractor _actor.py:575 Getting result queue for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') cid 6e4c447e-c42c-444d-b282-3267e113cf5a
Jan 03 18:30:50 (root, 20253, tractor._spawn.cancel_on_completion)) [RUNTIME] tractor _actor.py:600 Sending cmd to ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'): self.cancel({})
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:671 Processing request from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
self.cancel({})
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [CANCEL] tractor _actor.py:688 Actor ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') was remotely cancelled; waiting on cancellation completion..
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [CANCEL] tractor _actor.py:1108 ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') is trying to cancel
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:553 Delivering {'functype': 'asyncfunc', 'cid': '6e4c447e-c42c-444d-b282-3267e113cf5a'} from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') to caller 6e4c447e-c42c-444d-b282-3267e113cf5a
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:1211 Shutting down channel server
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [CANCEL] tractor _actor.py:1132 ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') called `Actor.cancel()`
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:961 Service nursery complete
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:962 Waiting on root nursery to complete
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:254 All RPC tasks have completed
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:553 Delivering {'return': True, 'cid': '6e4c447e-c42c-444d-b282-3267e113cf5a'} from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') to caller 6e4c447e-c42c-444d-b282-3267e113cf5a
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:753 <Channel fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47300), raddr=('127.0.0.1', 1616)> for ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') disconnected, cancelling tasks
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47300), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') with last msg:
{'cmd': ('self', 'cancel', {}, ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db'), '6e4c447e-c42c-444d-b282-3267e113cf5a')}
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:999 Runtime nursery complete
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:1003 Closing all actor lifetime contexts
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:431 New connection to us <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)>
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')@('127.0.0.1', 1616) complete
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:1248 Handshake with actor ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')@('127.0.0.1', 47304) complete
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:470 already have channel(s) for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'):[<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>]?
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47304), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:474 Registered <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)> for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:575 Getting result queue for ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') cid 8c30088b-3c50-4090-8d48-5735267fb570
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:619 Entering msg loop for <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:600 Sending cmd to ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db'): self.unregister_actor({'uid': ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')})
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:671 Processing request from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
self.unregister_actor({'uid': ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')})
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:720 Spawning task for <bound method Arbiter.unregister_actor of <tractor._actor.Arbiter object at 0x7fbc79053be0>>
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:743 RPC func is <bound method Arbiter.unregister_actor of <tractor._actor.Arbiter object at 0x7fbc79053be0>>
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:553 Delivering {'functype': 'asyncfunc', 'cid': '8c30088b-3c50-4090-8d48-5735267fb570'} from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') to caller 8c30088b-3c50-4090-8d48-5735267fb570
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:749 Waiting on next msg for <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47304), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:553 Delivering {'return': None, 'cid': '8c30088b-3c50-4090-8d48-5735267fb570'} from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') to caller 8c30088b-3c50-4090-8d48-5735267fb570
Jan 03 18:30:50 (root, 20253, unregister_actor)) [RUNTIME] tractor _actor.py:254 All RPC tasks have completed
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:653 Waiting on next msg for <Channel fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47304), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db')
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._process_messages)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 47304), raddr=('127.0.0.1', 1616)> from ('root', '4f1793e7-ab4d-4ec8-9ce9-96a725c952db') with last msg:
{'return': None, 'cid': '8c30088b-3c50-4090-8d48-5735267fb570'}
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1047 All peer channels are complete
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:766 channel from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') closed abruptly:
<Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)>
Jan 03 18:30:50 (raising, 20257, tractor._actor.Actor._async_main)) [RUNTIME] tractor _actor.py:1049 Runtime completed
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') with last msg:
{'cmd': ('self', 'unregister_actor', {'uid': ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')}, ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'), '8c30088b-3c50-4090-8d48-5735267fb570')}
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:497 Releasing channel <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:505 Peers is defaultdict(<class 'list'>, {('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'): [<Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)>], ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b'): [<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>]})
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:518 Disconnecting channel <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47304)>
Jan 03 18:30:50 (raising, 20257, no task context)) [INFO] tractor._entry _entry.py:91 Actor ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') terminated
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:766 channel from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') closed abruptly:
<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:793 Exiting msg loop for <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b') with last msg:
{'return': True, 'cid': '6e4c447e-c42c-444d-b282-3267e113cf5a'}
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:497 Releasing channel <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)> from ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:502 No more channels for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:505 Peers is defaultdict(<class 'list'>, {('not raising', 'a8a39a36-f205-4cac-9c33-40f209ca3173'): [<Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47296)>]})
Jan 03 18:30:50 (root, 20253, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor _actor.py:518 Disconnecting channel <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47300)>
Jan 03 18:30:50 (root, 20253, tractor._spawn.new_proc)) [WARNING] tractor _spawn.py:302 Cancelling existing result waiter task for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')
Jan 03 18:30:50 (root, 20253, tractor._spawn.new_proc)) [CANCEL] tractor _spawn.py:311 Hard reap sequence starting for ('raising', 'be83a540-0b92-4b4d-9d71-77aad921285b')

you probably want to reverse the name= for each .run_in_actor() call 😉

lol you are right! 🤦🏼‍♂️

@didimelli
Author

With a2, it hangs with these logs:

❯ poetry run python gse/main.py
Jan 03 18:37:02 (no actor context, 20930, __main__.main)) [WARNING] tractor _root.py:121 No actor could be found @ 127.0.0.1:1616
Jan 03 18:37:02 (root, 20930, __main__.main)) [INFO] tractor _root.py:158 Starting local <tractor._actor.Arbiter object at 0x7f1d4c8bad90> @ 127.0.0.1:1616
Jan 03 18:37:02 (root, 20930, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:907 Waiting on service nursery to complete
Jan 03 18:37:02 (root, 20930, tractor._spawn.new_proc)) [RUNTIME] tractor log.py:68 Started <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20933>
Jan 03 18:37:02 (no actor context, 20933, no task context)) [INFO] tractor._entry _entry.py:74 Started ('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 New connection to us <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47354)>
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Handshake with actor ('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')@('127.0.0.1', 47354) complete
Jan 03 18:37:02 (not raising, 20933, tractor._actor.Actor._async_main)) [RUNTIME] tractor log.py:68 Handshake with actor ('root', '61c1be9e-92a4-43ba-ac13-01473349c0b0')@('127.0.0.1', 1616) complete
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Registered <Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47354)> for ('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 New connection to us <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47356)>
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Handshake with actor ('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')@('127.0.0.1', 47356) complete
Jan 03 18:37:02 (not raising, 20933, tractor._actor.Actor._async_main)) [RUNTIME] tractor log.py:68 Handshake with actor ('root', '61c1be9e-92a4-43ba-ac13-01473349c0b0')@('127.0.0.1', 1616) complete
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 already have channel(s) for ('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5'):[<Channel fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47354)>]?
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Registered <Channel fd=11, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47356)> for ('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 RPC func is <bound method Arbiter.register_actor of <tractor._actor.Arbiter object at 0x7f1d4c8bad90>>
Jan 03 18:37:02 (root, 20930, register_actor)) [RUNTIME] tractor log.py:68 All RPC tasks have completed
Jan 03 18:37:02 (not raising, 20933, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:907 Waiting on service nursery to complete
inside not raising
Jan 03 18:37:02 (not raising, 20933, tractor._actor.Actor._process_messages)) [RUNTIME] tractor log.py:68 RPC func is <function not_raising at 0x7f08e2a003a0>
Jan 03 18:37:02 (root, 20930, tractor._spawn.new_proc)) [RUNTIME] tractor log.py:68 Started <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('raising', '300d385f-bc39-47da-9014-412cdae4bf31')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20934>
Jan 03 18:37:02 (no actor context, 20934, no task context)) [INFO] tractor._entry _entry.py:74 Started ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 New connection to us <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47358)>
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Handshake with actor ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')@('127.0.0.1', 47358) complete
Jan 03 18:37:02 (raising, 20934, tractor._actor.Actor._async_main)) [RUNTIME] tractor log.py:68 Handshake with actor ('root', '61c1be9e-92a4-43ba-ac13-01473349c0b0')@('127.0.0.1', 1616) complete
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Registered <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47358)> for ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 New connection to us <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47360)>
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Handshake with actor ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')@('127.0.0.1', 47360) complete
Jan 03 18:37:02 (raising, 20934, tractor._actor.Actor._async_main)) [RUNTIME] tractor log.py:68 Handshake with actor ('root', '61c1be9e-92a4-43ba-ac13-01473349c0b0')@('127.0.0.1', 1616) complete
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 already have channel(s) for ('raising', '300d385f-bc39-47da-9014-412cdae4bf31'):[<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47358)>]?
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Registered <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47360)> for ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')
Jan 03 18:37:02 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 RPC func is <bound method Arbiter.register_actor of <tractor._actor.Arbiter object at 0x7f1d4c8bad90>>
Jan 03 18:37:02 (root, 20930, register_actor)) [RUNTIME] tractor log.py:68 All RPC tasks have completed
Jan 03 18:37:02 (raising, 20934, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:907 Waiting on service nursery to complete
inside raising!
Jan 03 18:37:02 (raising, 20934, tractor._actor.Actor._process_messages)) [RUNTIME] tractor log.py:68 RPC func is <function raising at 0x7f9ac38cf790>
Jan 03 18:37:02 (root, 20930, __main__.main)) [RUNTIME] tractor._trionics log.py:68 Waiting on subactors {('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5'): (<tractor._actor.Actor object at 0x7f1d4c8e8700>, <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('not raising', '684c2479-63bf-4e33-a350-4b0f3ae9c6e5')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20933>, <tractor._portal.Portal object at 0x7f1d4c8dfb80>), ('raising', '300d385f-bc39-47da-9014-412cdae4bf31'): (<tractor._actor.Actor object at 0x7f1d4c8e83a0>, <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('raising', '300d385f-bc39-47da-9014-412cdae4bf31')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: running with PID 20934>, <tractor._portal.Portal object at 0x7f1d4bddc280>)} to complete
Jan 03 18:37:07 (raising, 20934, raising)) [ERROR] tractor _actor.py:198 Actor crashed:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 192, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task
Jan 03 18:37:07 (raising, 20934, raising)) [ERROR] tractor _actor.py:222 Actor crashed:
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 192, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task
Jan 03 18:37:07 (raising, 20934, raising)) [RUNTIME] tractor log.py:68 All RPC tasks have completed
Jan 03 18:37:07 (root, 20930, tractor._spawn.cancel_on_completion)) [WARNING] tractor _spawn.py:144 Cancelling ('raising', '300d385f-bc39-47da-9014-412cdae4bf31') after error ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')
Traceback (most recent call last):
  File "/home/didi/leaf/leaf-radio/.venv/lib/python3.9/site-packages/tractor/_actor.py", line 192, in _invoke
    await chan.send({'return': await coro, 'cid': cid})
  File "/home/didi/leaf/leaf-radio/gse/main.py", line 27, in raising
    raise ValueError("random raise from bad task")
ValueError: random raise from bad task

Jan 03 18:37:07 (root, 20930, tractor._spawn.cancel_on_completion)) [WARNING] tractor._portal _portal.py:205 Sending actor cancel request to ('raising', '300d385f-bc39-47da-9014-412cdae4bf31') on <Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47358)>
Jan 03 18:37:07 (raising, 20934, cancel)) [WARNING] tractor _actor.py:1042 ('raising', '300d385f-bc39-47da-9014-412cdae4bf31') is trying to cancel
Jan 03 18:37:07 (raising, 20934, cancel)) [INFO] tractor _actor.py:1125 Cancelling all 0 rpc tasks:
{}
Jan 03 18:37:07 (raising, 20934, cancel)) [INFO] tractor _actor.py:1134 Waiting for remaining rpc tasks to complete {}
Jan 03 18:37:07 (raising, 20934, tractor._actor.Actor._process_messages)) [WARNING] tractor _actor.py:709 Actor ('raising', '300d385f-bc39-47da-9014-412cdae4bf31') was remotely cancelled; waiting on cancellation completion..
Jan 03 18:37:07 (raising, 20934, cancel)) [WARNING] tractor _actor.py:1066 ('raising', '300d385f-bc39-47da-9014-412cdae4bf31') was sucessfullly cancelled
Jan 03 18:37:07 (raising, 20934, cancel)) [WARNING] tractor _actor.py:248 Task <bound method Actor.cancel of <tractor._actor.Actor object at 0x7f9ac3d37bb0>> likely errored or cancelled before it started
Jan 03 18:37:07 (raising, 20934, cancel)) [RUNTIME] tractor log.py:68 All RPC tasks have completed
Jan 03 18:37:07 (raising, 20934, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:908 Service nursery complete
Jan 03 18:37:07 (raising, 20934, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:909 Waiting on root nursery to complete
Jan 03 18:37:07 (raising, 20934, tractor._actor.Actor._async_main)) [INFO] tractor _actor.py:943 Root nursery complete
Jan 03 18:37:07 (raising, 20934, tractor._actor.Actor._async_main)) [WARNING] tractor _actor.py:947 Closing all actor lifetime contexts
Jan 03 18:37:07 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 New connection to us <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47362)>
Jan 03 18:37:07 (raising, 20934, tractor._actor.Actor._async_main)) [RUNTIME] tractor log.py:68 Handshake with actor ('root', '61c1be9e-92a4-43ba-ac13-01473349c0b0')@('127.0.0.1', 1616) complete
Jan 03 18:37:07 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Handshake with actor ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')@('127.0.0.1', 47362) complete
Jan 03 18:37:07 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 already have channel(s) for ('raising', '300d385f-bc39-47da-9014-412cdae4bf31'):[<Channel fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47358)>]?
Jan 03 18:37:07 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 Registered <Channel fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 1616), raddr=('127.0.0.1', 47362)> for ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')
Jan 03 18:37:07 (root, 20930, trio._highlevel_serve_listeners._run_handler)) [RUNTIME] tractor log.py:68 RPC func is <bound method Arbiter.unregister_actor of <tractor._actor.Arbiter object at 0x7f1d4c8bad90>>
Jan 03 18:37:07 (root, 20930, unregister_actor)) [RUNTIME] tractor log.py:68 All RPC tasks have completed
Jan 03 18:37:07 (raising, 20934, no task context)) [INFO] tractor._entry _entry.py:91 Actor ('raising', '300d385f-bc39-47da-9014-412cdae4bf31') terminated
Jan 03 18:37:07 (root, 20930, tractor._spawn.new_proc)) [RUNTIME] tractor log.py:68 Attempting to kill <trio.Process ['/home/didi/leaf/leaf-radio/.venv/bin/python', '-m', 'tractor._child', '--uid', "('raising', '300d385f-bc39-47da-9014-412cdae4bf31')", '--parent_addr', "('127.0.0.1', 1616)", '--loglevel', 'runtime']: exited with status 0>
Jan 03 18:37:07 (root, 20930, tractor._spawn.new_proc)) [WARNING] tractor _spawn.py:311 Cancelling existing result waiter task for ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')
Jan 03 18:37:07 (root, 20930, tractor._spawn.cancel_on_completion)) [WARNING] tractor._portal _portal.py:220 May have failed to cancel ('raising', '300d385f-bc39-47da-9014-412cdae4bf31')

@didimelli
Author

Just tried on all the alphas, and it is hanging with all of them!

@goodboy goodboy linked a pull request Jan 3, 2022 that will close this issue
@goodboy
Owner

goodboy commented Jan 3, 2022

@didimelli yah i found the root cause and put up the patch in #288.

We need a test and hopefully this change won't blow up CI 😂

@goodboy
Owner

goodboy commented Jan 3, 2022

Heh, it's definitely blowing up CI, looks like 3 tests; so we might need a slightly different approach or to fix those tests.

Having a test in the suite would be great if you wanted to write up your test script as one 😉
You could put it under the cancellation suite.

@goodboy goodboy changed the title Tasks are not cancelled Nursery does not re-raise error from .run_in_actor() remote task Jan 3, 2022
@didimelli
Author

Ok, I will look into this tomorrow!

@goodboy
Owner

goodboy commented Jan 3, 2022

@didimelli sweet!

Yeah just to clarify the issue, basically it's summarized as follows:

  • ActorNursery.run_in_actor() is syntactic sugar around .start_actor() where the portal is requested to implicitly run a single task (it's annotated for the nursery to collect its lone result on nursery exit), and when the nursery is exited any .run_in_actor() portals are waited on for their results. If the actor task fails, the nursery should do the work of capturing the result or error from the target task and then, based on the supervisor strat (for now we only have one-cancels-all, like trio), raise that error and deal with other actors appropriately.
  • this scenario enters the nursery exit before the remote error is raised by the child actor
  • we are capturing the error but it is not being raised in the child's spawn task, which is responsible for managing the lifetime of the child actor. This means we never see this error until the root nursery task checks for accumulated errors from all children; but since that top level nursery task blocks waiting for all children to finish, there is no way for it to know about the remote error until not_raising terminates.

Some solutions:

  1. raising the error immediately as in Always raise a received result-as-error in spawn tasks #288 though i'm not sure this is the most graceful approach
  2. make the spawner task cancel the ria_nursery on error such that error collection is triggered.
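
For illustration, here's a minimal trio-only sketch of option 2 under made-up names (spawn_and_watch, child), with none of tractor's internals: each spawn-management task records its child's error and cancels the shared nursery, so "error collection" runs once the group has unwound:

import trio

async def child(name: str, fail: bool) -> None:
    if fail:
        raise ValueError(f"{name} crashed")
    await trio.sleep_forever()

async def main() -> None:
    errors = {}

    async def spawn_and_watch(nursery, name, fail):
        try:
            await child(name, fail)
        except Exception as err:
            # stash the error, then trigger one-cancels-all so the
            # sibling spawn tasks unwind instead of blocking forever
            errors[name] = err
            nursery.cancel_scope.cancel()

    async with trio.open_nursery() as n:
        n.start_soon(spawn_and_watch, n, "not_raising", False)
        n.start_soon(spawn_and_watch, n, "raising", True)

    # error collection happens here, after the nursery exits
    if errors:
        raise trio.MultiError(list(errors.values()))

try:
    trio.run(main)
except ValueError as err:
    print(f"re-raised after group teardown: {err!r}")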

@htnjack

htnjack commented Jan 4, 2022

Hello!

I was having a look at this, and I had an idea:
How about instead of raising the error directly as in #288, you cancel the actor nursery itself?

I'm by no means good at trio (or tractor), but I think this would trigger cancellation for all the other tasks, ensuring that they are cancelled (and errors accounted for) cleanly.

Are there any side effects I am missing?

I can push some code to the PR that does what I said, if you want to have a look.

@goodboy
Owner

goodboy commented Jan 4, 2022

@houtenjack thanks for the input!

Yes, cancellation will also work but (and this is important) we also want to make sure we aren't limiting users' ability to eventually define custom supervisors, as discussed in #22 😎

So the question is if we decide to use a cancel-actor-nursery-on-any-.run_in_actor()-error approach, where do we enable changing this one-cancels-all strategy in the api?


To clarify the reason CI is failing in #288, error aggregation:

  • cancelling nursery spawn tasks greedily once a first error is received from any one child will cause other spawn management tasks to be cancelled on their result retrieval call
  • the checks for the contents of the trio.MultiError sub-exceptions are breaking since we do not have a final collection of all errors from all child actors on nursery completion (after all subprocs have terminated)
    • One approach we can try is to re-iterate all child portals, call each one's .result(), and push those into the MultiError set?
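
A rough sketch of that final-collection idea, assuming each child Portal exposes the .result() call mentioned above (the helper name here is hypothetical):

import trio

async def collect_child_errors(portals) -> None:
    # after all subprocs have terminated, re-iterate every child
    # portal and aggregate whatever each one returned or raised
    errors = []
    for portal in portals:
        try:
            await portal.result()
        except Exception as err:
            errors.append(err)
    if errors:
        raise trio.MultiError(errors)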

@goodboy
Owner

goodboy commented Jan 4, 2022

Another question is whether a child actor which is cancelled due to an error from a sibling in the nursery group (in this case not_raising) should also raise a trio.Cancelled which is re-raised locally by the actor nursery. Currently Cancelled is filtered out, just as in trio's nursery (i think).

@htnjack

htnjack commented Jan 4, 2022

Let's see if I understand the issue:

  • Task A raises an error
  • it goes through exhaust_portal via cancel_on_completion
  • the error is collected for propagation
  • the actor is cancelled via portal.cancel_actor()
  • ria_nursery won't pick up the error until the end, because Cancelled is never raised.

In the second solution you proposed, you want to cancel ria_nursery as it would make sense for #22 to manage "cancellation strategies" in the supervisor.

How would the supervisor "tap" into a child nursery to check for errors before all the tasks are collected, if tasks are not raising errors?

It seems to me that supervisor code needs to branch for implementing different strategies, but this would require "lower level" branching for handling these cases, which means having supervisor logic in places like _spawn.cancel_on_completion.

I hope everything is clear, please let me know if I'm making wrong assumptions!

@goodboy
Owner

goodboy commented Jan 4, 2022

Let's see if I understand the issue:

💯. The only things I might emphasize are that task A is a remote task from the perspective of the nursery and,

the actor is cancelled via portal.cancel_actor()

Only because we have a default OCA strat right now.

The rest of it i'd say you have correct and I'd encourage solidifying your understanding by running the OP's example and enabling loglevel='cancel' in the top level nursery.


How would the supervisor "tap" into a child nursery to check for errors before all the tasks are collected, if tasks are not raising errors?

A very good question!

Something I've been thinking about is that we may need to make our default strat one-cancels-one which would mean every actor started with .run_in_actor() would explicitly report crashes upward to the nursery and the OCA nursery (actually a special case of OCO) will just decide to cancel siblings on any such error.

OCO seems to actually be the most general case (since you'll need error handling around each and every task it spawns and a way to ask the super for control of sibling lifetimes) and likely can be implemented solely as a generalization of a trio nursery - particularly so because really, .run_in_actor() is the special case where you tie a single task to some actor.
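
As a loose sketch of that claim (made-up names, plain trio, no tractor machinery), OCO can be just a per-task wrapper that hands crashes to a pluggable strategy, with OCA falling out as one particular strategy:

import trio

async def oco_wrapped(nursery, strategy, task_fn, *args):
    # run one task; on crash, let the strategy decide what happens
    # to its siblings (a real impl would also stash the error for
    # re-raise on nursery exit)
    try:
        await task_fn(*args)
    except Exception as err:
        await strategy(nursery, err)

async def one_cancels_all(nursery, err):
    # the OCA special case: any crash tears down every sibling
    nursery.cancel_scope.cancel()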

How would the supervisor "tap" into a child nursery to check for errors before all the tasks are collected, if tasks are not raising errors?

Presuming tasks do raise errors (as per the comment on OCO above), then I think we can maybe just offer a strat API implemented as async context managers? Worth checking out is the ctx mngr style API i was dreaming up in the comment in #22, though I'm not sure of the implications or limitations of that chaining style.

Ideally we can get ourselves some kind of nice, composable strategy combining API in the long run.
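
Purely speculative, but that ctx mngr style strat API might end up looking something like the following. Nothing like this exists in tractor today, and the ActorNursery.cancel() call is an assumption of this sketch:

from contextlib import asynccontextmanager

import tractor

@asynccontextmanager
async def one_cancels_all_strat():
    # wrap an actor nursery and map any error in the block to a
    # group-wide cancel; other strats would swap out this policy
    async with tractor.open_nursery() as an:
        try:
            yield an
        except Exception:
            await an.cancel()
            raise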

@goodboy
Owner

goodboy commented Jan 4, 2022

Heh, so despite trying all the things the test suite is still being grumpy, and it's making me wonder if we should just redo the entire .run_in_actor() API, in the sense that it's first and foremost a legacy interface and technically syntactic sugar on top of something like:

import trio
import tractor

async def target():
    await trio.sleep(1)
    return 'blah'

async def main():
    done = trio.Event()
    result = None

    async with (
        trio.open_nursery() as tn,
        tractor.open_nursery() as an,
    ):
        portal = await an.start_actor('name', enable_modules=[__name__])

        async def run_task_and_get_result():
            nonlocal result  # write back to the enclosing scope
            result = await portal.run(target)
            done.set()

        tn.start_soon(run_task_and_get_result)

        # simulate waiting on result
        await done.wait()
        assert result

        # reap the daemon actor so both nurseries can exit
        await portal.cancel_actor()

trio.run(main)

In this case you can never really have errors swallowed since the trio nursery will always capture whatever error is propagated up from portal.run(), and it totally sidesteps the ActorNursery having to wait on results. You could also just rewrap the entire thing as some kind of API on top of the actor nursery, like a tractor.open_single_task_per_actor() type thing?
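
A sketch of what that could look like as a tiny helper (hypothetical name, built only from .start_actor() + Portal.run(), exactly as in the snippet above):

import tractor

async def run_one_shot(an, func, name: str, results: dict) -> None:
    # spawn an actor, run a single one-shot task in it, record the
    # result, then reap the actor; start this from a *local* trio
    # nursery, e.g.:
    #     tn.start_soon(run_one_shot, an, target, 'w0', results)
    portal = await an.start_actor(name, enable_modules=[func.__module__])
    try:
        results[name] = await portal.run(func)
    finally:
        await portal.cancel_actor()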

I dunno, open to suggestions here. I personally have not used .run_in_actor() whatsoever in parent project / client code since the intro of the .open_context() API so hearing feedback from someone who is using this interface would probably be a lot better 😂.

If we were to, let's say (😲 ), drop this api it would definitely simplify a lot of things, and it makes me wonder if we'd even need anything non-trio / task-level specific in order to make alternate supervisor strats just work.

@goodboy
Owner

goodboy commented Jan 5, 2022

Yeah, after spelunking the original design a bit yesterday I think I'm of the opinion that we should at the least decouple .run_in_actor() from the nursery internals and at most possibly deprecate it from the nursery API.

Argument:

  • .run_in_actor() calls can be implemented using .start_actor() + a Portal.run() call spawned from a local trio nursery task
  • if you want to create a single actor and run a single (one shot) task in it, why aren't you just using trio-parallel?

-> we can offer a "wrapper" high level API that provides .run_in_actor() as part of a small worker pool API?


The benefits include:

  • massively simplifying our actor nursery cancellation/spawn machinery
  • moving error collection and propagation up to some local trio task nursery which is not the same nursery responsible for creating subproc management / spawn tasks
  • a simpler API where we can demonstrate how to build higher level constructs like a worker pool from primitives, and there are fewer ways to do the same thing.

Interested to see what y'all think.

Also linking #172 as maybe another tidbit to consider as sugar level API.

@goodboy
Owner

goodboy commented Jan 5, 2022

Added a couple more patches to #288 to attempt to make the test suite happy (in terms of errors we previously expected in a MultiError) but as mentioned I wonder if this whole approach is a bit of wasted effort and we should re-design this API.

@didimelli
Author

I am not particularly interested in .run_in_actor() calls specifically. However, approaching tractor from trio, it seemed like the most trio-like API, so I went with that.

I think that since this project is heavily inspired by trio, many people will approach tractor the same way I did, and so they will reach for a trio-like api.

But apart from this, I understand the benefits and agree with your arguments on the API.

@goodboy
Owner

goodboy commented Jan 5, 2022

I think that since this project is heavily inspired by trio, many people will approach tractor the same way I did, and so they will reach for a trio-like api.

@didimelli i agree i'm just becoming convinced the layering here is wrong.

One point I do want to emphasize: an actor nursery is actually quite a bit different. tractor's nursery can spawn actors, which are wayyy more sophisticated units-of-concurrency than simple tasks. For example,

  • an actor is an executing instance of many trio scheduled tasks
  • actor tasks are running in parallel relative to the parent and maybe even in an entirely different memory space
  • there are no guarantees about being able to access an eventual expected error if cancellation is happening in parallel in the parent who owns the nursery (which is exactly why the test suite is failing with the changes in Always raise a received result-as-error in spawn tasks #288)

Right now the error collection should probably be done above our actor nursery, instead of being intermingled in our nursery's cancellation/teardown logic with special cases all throughout the spawning code 😂 to deal with the .run_in_actor() usage.

The issue is when I first started working on the idea of an actor nursery I wasn't entirely sure how RPC semantics would work but now that they're more refined with the Portal.open_context() generalization (also see #24, and maybe #66), this older API doesn't make a ton of sense.

IMO it's super confusing that Portal is a half baked future / deferred result type (with .run_in_actor() and .result() coupled in its API) as well as a memory portal for conducting SC RPC between inter-actor tasks.

Why would you ever want to use .run_in_actor() and not have it be one shot? If you do want to do that you probably shouldn't be using .run_in_actor() but in that case the method should probably have signature .run_in_actor() -> RemoteResult[MyBoxedResult] to ensure the caller knows this is a one shot task that gives one output.
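
For illustration only, here's a minimal shape such a RemoteResult could take (entirely hypothetical, just a trio.Event-backed one-shot box; nothing like this exists in tractor today):

from typing import Generic, Optional, TypeVar

import trio

T = TypeVar('T')

class RemoteResult(Generic[T]):
    # one-shot: exactly one value or error, delivered exactly once
    def __init__(self) -> None:
        self._done = trio.Event()
        self._value: Optional[T] = None
        self._error: Optional[BaseException] = None

    def set_result(self, value: T) -> None:
        self._value = value
        self._done.set()

    def set_error(self, err: BaseException) -> None:
        self._error = err
        self._done.set()

    async def wait(self) -> T:
        await self._done.wait()
        if self._error is not None:
            raise self._error
        return self._value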

Further, if we drop .run_in_actor as a nursery API then we can drastically simplify how custom supervisors are going to work and we can add back this API probably as part of some kind of worker pool api which works over a cluster of actors that you can request one-shot-task work from.

It'd be interesting if someone wanted to try writing a .run_in_actor() API on top of the nursery API (my guess is it'll look a lot like the worker pool example we link to from the readme) and then we can expose it from our higher level cluster API stuff?

To me .run_in_actor() is a special use case and forcing it into the nursery implementation feels pretty wrong especially when trying to make cases like this work and keep the nursery primitive simple and easy to extend/compose.

@goodboy
Owner

goodboy commented Jan 5, 2022

I am not particularly interested in .run_in_actor() calls specifically.

Yeah, if there's one thing I've discovered since adding .open_context() it's that everything can be built from that set of semantics (aka 2 way streaming with synchronized setup/teardown) and once you start building real-world SC systems you pretty much always want that granular control over remote task management.

In fact rolling your own worker pool that caches streams kind of seems so trivial once you're familiar with these APIs that you wonder why one shot work was ever a thing 😂
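
For the curious, a sketch of that pattern based on the Portal.open_context() / MsgStream APIs (exact call signatures may differ across the alphas, so treat this as approximate rather than as the canonical worker pool example):

import tractor
import trio

@tractor.context
async def worker(ctx: tractor.Context) -> None:
    # double whatever the caller streams to us, forever
    await ctx.started()
    async with ctx.open_stream() as stream:
        async for msg in stream:
            await stream.send(msg * 2)

async def main() -> None:
    async with tractor.open_nursery() as an:
        portal = await an.start_actor('w0', enable_modules=[__name__])
        async with (
            portal.open_context(worker) as (ctx, _first),
            ctx.open_stream() as stream,
        ):
            # the "cached" stream: reuse it for many requests instead
            # of spawning a one-shot task per unit of work
            for i in range(3):
                await stream.send(i)
                print(await stream.receive())
        await portal.cancel_actor()

if __name__ == '__main__':
    trio.run(main)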
