Segmentation fault on exit with Derecho using TCP #135

Open
songweijia opened this issue Jun 17, 2019 · 0 comments
@songweijia
Contributor

This happens at line 120 of view_manager.cpp, when the ViewManager object is destroyed. GDB shows that the segmentation fault originates in gethostbyname(). It only happens when Derecho is configured to use TCP (provider=sockets).

songweijia self-assigned this Jun 17, 2019
etremel added a commit that referenced this issue Oct 1, 2021
I discovered one reason why Derecho processes often end with a
segmentation fault when attempting to shut down "cleanly" (as noted in
issues #135, #192, etc.): When a node marks itself as failed, the
SST will be told to freeze the node's own row (in process_suspicions()),
but SST::freeze() will dereference a null pointer if it is called on the
local row (res_vec has no entry for the node's own row).

The solution is to add a check for row_index == my_index in freeze(),
and also to ensure a node shuts itself down more promptly when it
detects that it has been marked as "failed" by the rest of the group.
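
The guard described in the commit message can be illustrated with a minimal sketch. This is not Derecho's actual SST code: `SketchSST`, `RemoteResources`, `stop()`, and `is_frozen()` are hypothetical names, and only `freeze()`, `res_vec`, `row_index`, and `my_index` are taken from the commit message. The sketch assumes that `res_vec` holds a null entry for the local row and that returning early is an acceptable response to a self-freeze request.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the per-remote-row connection resources.
struct RemoteResources {
    bool stopped = false;
    void stop() { stopped = true; }
};

// Simplified, hypothetical model of an SST; not Derecho's real class.
class SketchSST {
public:
    SketchSST(std::size_t num_rows, std::size_t my_index)
        : my_index(my_index), res_vec(num_rows), row_is_frozen(num_rows, false) {
        // Assumption: every row except the local one gets a resources object,
        // so res_vec[my_index] stays null.
        for (std::size_t i = 0; i < num_rows; ++i) {
            if (i != my_index) {
                res_vec[i] = std::make_unique<RemoteResources>();
            }
        }
    }

    void freeze(std::size_t row_index) {
        assert(row_index < res_vec.size());
        // The guard from the commit message: the local row has no entry in
        // res_vec, so dereferencing it below would be a null-pointer access.
        if (row_index == my_index) {
            return;
        }
        if (row_is_frozen[row_index]) {
            return;  // Already frozen; nothing to do.
        }
        row_is_frozen[row_index] = true;
        res_vec[row_index]->stop();
    }

    bool is_frozen(std::size_t row_index) const { return row_is_frozen[row_index]; }

private:
    std::size_t my_index;
    std::vector<std::unique_ptr<RemoteResources>> res_vec;
    std::vector<bool> row_is_frozen;
};
```

Without the `row_index == my_index` check, a call such as `freeze(my_index)` would reach `res_vec[row_index]->stop()` with a null pointer, which matches the crash the commit message describes.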