Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prim_inet don't clean up monitor if port_command crashed #8484

Open
zzydxm opened this issue May 15, 2024 · 3 comments · May be fixed by #8545 or #8546
Open

prim_inet don't clean up monitor if port_command crashed #8484

zzydxm opened this issue May 15, 2024 · 3 comments · May be fixed by #8545 or #8546
Assignees
Labels
bug Issue is reported as a bug team:PS Assigned to OTP team PS

Comments

@zzydxm
Copy link
Contributor

zzydxm commented May 15, 2024

Describe the bug
When the port owner dies because for example a linked process died, the erlang:port_command can crash and fall into this clause
https://github.com/erlang/otp/blob/master/erts/preloaded/src/prim_inet.erl#L616-L619

Problem (1): This will cause a uncleared monitor message so we should demonitor here
Problem (2): The late monitor message appears to us in format
{'DOWN',#Ref<0.754902196.4233101336.24126>,process,#Port<0.55426>,{timeout,{...}}}
which type is process but object is a port, which is not expected anyway and we think it is a bug in BEAM

To Reproduce
No local reproduce yet but from the code problem (1) is straightforward
Problem (2) reproduce see comment below

Expected behavior
There should not be any DOWN message when prim_inet:send failed

Affected versions
OTP 26

@zzydxm zzydxm added the bug Issue is reported as a bug label May 15, 2024
@potatosalad
Copy link
Contributor

potatosalad commented May 15, 2024

Here is a simple way to reproduce the issue manually:

(fun() ->
_ = catch erlang:process_flag(trap_exit, true),
Parent = self(),
{Pid, PidMon} = spawn_monitor(fun() ->
Port = erlang:open_port({spawn, "sleep 10"}, []),
Parent ! {self(), Port},
timer:sleep(timer:seconds(10)),
exit({timeout, expired})
end),
Port = receive {Pid, P} -> P end,
_ = catch erlang:link(Port),
PortMon = erlang:monitor(port, Port),
_ = spawn(fun() -> timer:sleep(timer:seconds(5)), erlang:exit(Pid, {timeout, killed}) end),
#{
    port => Port,
    port_mon => PortMon,
    pid => Pid,
    pid_mon => PidMon
}
end)().
% wait 5 seconds...
flush().
Shell got {'DOWN',#Ref<0.3044587589.3799252993.136686>,process,<0.91.0>,
                  {timeout,killed}}
Shell got {'EXIT',#Port<0.3>,{timeout,killed}}
Shell got {'DOWN',#Ref<0.3044587589.3799252993.136688>,process,#Port<0.3>,
                  {timeout,killed}}

Edit: Confirmed that the above behavior is present in both OTP 26.x and OTP 27.0-rc3

@IngelaAndin IngelaAndin added the team:PS Assigned to OTP team PS label May 20, 2024
@RaimoNiskanen
Copy link
Contributor

It is obviously a bug, and the solution would be simply to add a demonitor(Mref, [flush]) in the catch error: _ -> clause.

But I will have to get the VM guys to take a look at the weird process tag first...

@RaimoNiskanen RaimoNiskanen linked a pull request Jun 4, 2024 that will close this issue
rickard-green added a commit to rickard-green/otp that referenced this issue Jun 4, 2024
…rickard/port-down-fix/25/erlangGH-8484/OTP-19123

* rickard/port-down-fix/24/erlangGH-8484/OTP-19123:
  [erts] Fix DOWN signal from port
@rickard-green rickard-green linked a pull request Jun 4, 2024 that will close this issue
@RaimoNiskanen
Copy link
Contributor

The process tag problem was apparently easy to find. It is on the ToDo list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug team:PS Assigned to OTP team PS
Projects
None yet
4 participants