echo example - recovery from failure #50

Open
csdrane opened this issue Dec 4, 2015 · 2 comments

csdrane commented Dec 4, 2015

I have a question about recovering from TCP socket errors. To reproduce, run client.js and server.js from the echo example.

If you abruptly restart client.js, server.js recovers and a new stream is created. But if you restart server.js, no new stream is ever created, even though the client reconnects.

I'm unclear on why they behave differently. I suspect that in client.js the stream returned by link.stream() breaks once the server restarts, and for some reason a new link status event isn't emitted, so link.stream() is never called again.

What changes would need to be made to the code to have both client and server able to recover from a restart?

rynomad (Contributor) commented Dec 4, 2015

This is a good issue to tackle. It comes from the fact that we support a number of underlying transports by default: HTTP, UDP, and TCP in Node. In the case of TCP, a broken pipe is a pretty surefire sign that a pipe/path is down, but the problem is that our notion of a Link encompasses multiple paths between hashnames. UDP, for example, tracks liveness via keepalive messages that I think go out about once a minute or so.

It seems like what's happening here is that, since UDP hasn't had a chance to confirm its down-ness, the client never registers that the link is down. So when it rediscovers the server and re-establishes the connection, it just hums along without ever noticing and keeps writing to a stream that it believes to be valid. On the server side, of course, there's no persisted state for individual channels, so the restarted server has no handler for the incoming channel packets and they get dropped.

I can think of a couple workarounds for the echo example, but really this is a broader issue of how to decide when a link is up/down when you've got multiple potential transports. I'm trying to think of an approach that doesn't involve blasting UDP with keepalive messages too often.
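To make the up/down question concrete, here is a toy model of a link with several paths. It is only a sketch for illustration and is not how telehash stores its state: `ToyLink`, its methods, and the 60-second timeout are all made-up names and values. The point it shows is that a closed TCP path does not bring the link down while a UDP path is still inside its keepalive window.

```javascript
// Toy model only: NOT telehash's real data structures or API.
// A link is "up" while at least one of its paths still looks alive.
class ToyLink {
  constructor(keepaliveTimeoutMs) {
    this.keepaliveTimeoutMs = keepaliveTimeoutMs;
    this.paths = new Map(); // transport name -> { lastSeen, closed }
  }

  // Record that a packet arrived over this transport at time `now` (ms).
  seen(transport, now) {
    this.paths.set(transport, { lastSeen: now, closed: false });
  }

  // Record a definite failure, e.g. a broken TCP pipe.
  closed(transport) {
    const path = this.paths.get(transport);
    if (path) path.closed = true;
  }

  // A path is up if it was not closed and was heard from recently enough.
  pathIsUp(transport, now) {
    const path = this.paths.get(transport);
    return Boolean(path) && !path.closed &&
      now - path.lastSeen < this.keepaliveTimeoutMs;
  }

  // The link is up if any single path is up.
  isUp(now) {
    for (const transport of this.paths.keys()) {
      if (this.pathIsUp(transport, now)) return true;
    }
    return false;
  }
}
```

In this model, if both `tcp` and `udp` were seen at time 0 and `tcp` is then closed, `isUp` keeps returning true until the UDP path ages past the timeout. That is the window in which the client in the echo example would keep writing to a stale stream.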

As a short-term fix, I would add a process.on('exit') listener to server.js that manually ends all connected client streams. On the client side, I would use stream.on('end') to clean up the link and try to establish a new one.
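A rough sketch of that workaround, written against plain Node streams rather than the telehash API so that nothing here is mistaken for a real library call. `createStreamRegistry`, `keepStreamOpen`, `openStream`, and `schedule` are hypothetical names; in the echo example `openStream` would presumably wrap whatever client.js does to get a stream from the link, and the one-second retry delay is an arbitrary choice.

```javascript
const { PassThrough } = require('stream');

// Server side: remember every open client stream so that all of them
// can be ended in one go when the process shuts down.
function createStreamRegistry() {
  const open = new Set();
  return {
    track(stream) {
      open.add(stream);
      const forget = () => open.delete(stream);
      stream.once('end', forget);
      stream.once('close', forget);
      return stream;
    },
    endAll() {
      // Copy first, because ending a stream may trigger `forget`.
      for (const stream of Array.from(open)) stream.end();
      open.clear();
    },
    size() {
      return open.size;
    },
  };
}

// Client side: open a stream, and open a fresh one whenever the current
// one ends. `openStream` is a hypothetical callback that returns a new
// stream; `schedule` decides when to retry (default: after one second).
function keepStreamOpen(openStream, onStream, schedule) {
  const retry = schedule || ((fn) => setTimeout(fn, 1000));
  let stopped = false;

  function connect() {
    if (stopped) return;
    const stream = openStream();
    stream.once('end', () => retry(connect));
    onStream(stream);
  }

  connect();
  // Call the returned function to stop reconnecting.
  return () => { stopped = true; };
}

// Local demonstration with an in-memory stream standing in for a client.
const registry = createStreamRegistry();
const fakeClient = registry.track(new PassThrough());
registry.endAll();
console.log(fakeClient.writableEnded); // true
```

On the server, the registry would be wired up with something like `process.on('SIGINT', () => { registry.endAll(); process.exit(0); })`. As far as I know, an 'exit' listener can only do synchronous work and may not run at all when the process is killed by a signal, so a signal handler is probably the safer hook for a Ctrl-C restart.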

I'm sorry all of this sounds terribly hackish, but this is a tricky issue and I want to make sure that any fix in the library is well suited to handle all the possible channel/transport/topology configurations, of which there are many, not to mention being 100% certain that any feature is compatible with the telehash spec.

csdrane (Author) commented Dec 4, 2015

Thanks, I was able to get this working with your suggestions.
