Particle connection watchdog needed #112

Open
patfreeman opened this issue Mar 16, 2017 · 9 comments

Comments

@patfreeman
Collaborator

patfreeman commented Mar 16, 2017

If my particle loses power or its network connection, the TCP socket in the kegbot app is not closed and remains in an ESTABLISHED state. This leads to all future sensor data being lost to the app.

It looks like the USB Controller / Manager classes have some sort of watchdog implemented to ensure the connection to the Arduino kegboard remains functional. This functionality does not seem to exist in the NetworkController class. To help maintain a consistent connection to the Particle, it may work to implement some sort of status ping and response to ensure the board is properly communicating with the app.
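To illustrate how an app could notice a peer that has gone silent while the socket still looks ESTABLISHED, here is a minimal sketch using the standard `Socket.setSoTimeout` read timeout. The class name `SilentPeerDetector`, the `demo` helper, and the 200 ms timeout are assumptions made for illustration; this is not the kegbot `NetworkController` code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper, not part of kegbot-android.
class SilentPeerDetector {
    /** Blocks on one read and reports whether the peer sent nothing within timeoutMillis. */
    static boolean peerWentSilent(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis); // read() now throws instead of blocking forever
        InputStream in = socket.getInputStream();
        try {
            in.read(); // returns a byte, or -1 if the peer closed cleanly
            return false;
        } catch (SocketTimeoutException e) {
            return true; // nothing arrived in time: treat the link as dead
        }
    }

    /** Demo: connect to a local server that accepts the connection but never writes. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            return peerWentSilent(client, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A read timeout only helps if the board is expected to send something regularly; with an idle but healthy board it would report a false alarm.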

@AdamCavaliere

I have found the same thing in my setup as well. Along with that, I have found that after a while the Particle may run into an issue, so I put in some code to restart it once a day. This throws off the Android tablet as well.

I am not familiar enough with Android development to offer a fix - just requesting that this get some attention!

@patfreeman
Collaborator Author

@AdamCavaliere I've found that adding a temperature sensor and running the latest master of kegbot-android and kegboard-particle version 0.3.0 keeps a fairly stable connection. It's not a true watchdog in the Network Controller, but I think it keeps the connection active due to the constant updates of temp from the particle.

@rplankenhorn
Contributor

I'm starting to work with the Kegbot on Particle and I figure I could chime in here.

Something to note: in this situation, the Particle would be the server and the Android app would be the client. When the Particle launches, it should start listening on a specific port for a connection and block until it gets one. When it gets a connection it then starts sending that data on that connection but should also start asynchronously listening for additional connections. From my brief look at the Particle code, it seems to be behaving this way.

I think what is missing is an ACK response from the Android tablet to the Particle when something is received. Regardless of the message we receive from the Particle, we should send back a message (probably in `private void handleMessage(NetworkMessage message) {`). If the Particle doesn't receive an ACK within a reasonable period of time, it closes the connection and listens for a new one.
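As a rough sketch of the "ACK whatever we receive" idea on the app side: the `AckingHandler` class, the `String` message type, and the `ack` wire format below are all assumptions for illustration, not the real kegbot `handleMessage` or the kegboard protocol.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the "ack" line format is an assumption.
class AckingHandler {
    private final OutputStream toController;

    AckingHandler(OutputStream toController) {
        this.toController = toController;
    }

    /** Processes one incoming message, then acknowledges it regardless of its content. */
    void handleMessage(String message) {
        // ... normal message processing would go here ...
        try {
            toController.write("ack\n".getBytes(StandardCharsets.US_ASCII));
            toController.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real client the `OutputStream` would presumably be the one belonging to the socket connected to the Particle.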

@mik3y
Member

mik3y commented Dec 24, 2020

> I think what is missing is an ACK response from the Android tablet to the Particle when something is received. Regardless of the message we receive from the Particle, we should send back a message

One potential pitfall of that approach is, it gets very "chatty" to ACK every message. It also means at least a little extra bookkeeping on the controller side ("which message(s) are unack'd", or, "how many acks am i still waiting for?").

A variation which I'd slightly prefer is to add an explicit PING/PONG message to the client->controller protocol, and have a controller-side watchdog timer based on it: If a PING message hasn't been received from the client in more than timeout seconds, the controller should terminate the connection. It works in the other direction too: If the client issues a ping and no pong is received, it may assume the controller is dead.
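A minimal sketch of the controller-side half of this PING/PONG idea, written in Java for illustration even though the controller firmware is not Java. The class name `PingWatchdog`, the literal `PING`/`PONG` strings, and passing the clock in as a parameter are assumptions.

```java
// Hypothetical controller-side watchdog; message names and timeout are assumptions.
class PingWatchdog {
    private final long timeoutMillis;
    private long lastPingMillis;

    PingWatchdog(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastPingMillis = nowMillis; // start the timer when the client connects
    }

    /** Handles one client message; returns the reply to send, or null if none is needed. */
    String onMessage(String message, long nowMillis) {
        if ("PING".equals(message)) {
            lastPingMillis = nowMillis; // the PING resets the watchdog timer
            return "PONG";              // the reply lets the client verify the controller too
        }
        return null;
    }

    /** True when no PING has arrived for more than the timeout; the caller should drop the connection. */
    boolean shouldTerminate(long nowMillis) {
        return nowMillis - lastPingMillis > timeoutMillis;
    }
}
```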

@rplankenhorn
Contributor

> One potential pitfall of that approach is, it gets very "chatty" to ACK every message. It also means at least a little extra bookkeeping on the controller side ("which message(s) are unack'd", or, "how many acks am i still waiting for?").

So we don't care if the client has ACKed every response. Only that they have sent an ACK more recently than our timeout requirements. We also don't need to associate an ACK with a particular request. This piggybacks the watchdog on existing requests and just asks the question: "when did I last hear from the client?"

I've done the PingPong style approach before but actually found it more chatty than the above approach because you need to add an additional command on both the client and the server that gets sent back and forth.

We only care whether the client is still connected and if it can't prove it's connected, then we close the connection and ready the server again. If the client actually detects this, then we can have it automatically reestablish the connection.
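The "when did I last hear from the client?" check needs no per-message bookkeeping and no dedicated message type, which the following sketch tries to show. `LastHeardTracker`, its method names, and the injected clock are assumptions for illustration, and it is written in Java rather than the controller's own firmware language.

```java
import java.util.function.LongSupplier;

// Hypothetical server-side tracker: any inbound traffic counts as proof of life.
class LastHeardTracker {
    private final LongSupplier clockMillis;
    private final long timeoutMillis;
    private long lastHeardMillis;

    LastHeardTracker(LongSupplier clockMillis, long timeoutMillis) {
        this.clockMillis = clockMillis;
        this.timeoutMillis = timeoutMillis;
        this.lastHeardMillis = clockMillis.getAsLong();
    }

    /** Call for every message from the client; ACKs are not matched to requests. */
    void onAnyMessage() {
        lastHeardMillis = clockMillis.getAsLong();
    }

    /** True when the client has been quiet longer than the timeout: close and listen again. */
    boolean clientLooksDead() {
        return clockMillis.getAsLong() - lastHeardMillis > timeoutMillis;
    }
}
```

In real use the clock could be `System::currentTimeMillis`; injecting it just keeps the logic easy to exercise without waiting.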

@mik3y
Member

mik3y commented Dec 24, 2020

> So we don't care if the client has ACKed every response. Only that they have sent an ACK more recently than our timeout requirements

Ah, OK, I got slightly confused by the terminology; I'd call that more like a heartbeat than an ack.

> I've done the PingPong style approach before but actually found it more chatty than the above approach because you need to add an additional command on both the client and the server that gets sent back and forth.

It would add a PONG message, from controller to client, in response to a PING (aka ACK). But I think that's the only difference.

I think this difference also allows the client to verify controller liveness, since a PING provokes a message from the controller. That is especially useful when there is no other controller-to-client traffic from which the client could infer liveness. That seems like a good feature to have.

@rplankenhorn
Contributor

> It would add a PONG message, from controller to client, in response to a PING (aka ACK). But I think that's the only difference.

Just so I understand, you are saying that the server (controller) would send data (temp reading or flow reading) and then the client would receive that, send a ping back, and the controller would send a pong?

I think if we know that the server is outputting a message every second then we can use that to verify the server is up and running once the connection is established. When we send a command every second to the client, the client knows the server is alive. When the client responds to that command, the server knows the client is alive.

After talking with @patfreeman a little over Slack, it sounds like after a while the connection just "closes" and needs to be reopened. The connection works well when both client and server are active and sending data. Is that accurate?

Also, if we require some type of response from the client or else close the connection, we probably want to create a second server on a different port for telnet connections.

Regardless, I'm willing to try a few of the suggestions in this thread to debug the issue and see if I can make it a more reliable connection.

@rplankenhorn
Contributor

Coming back to this. Wanted to relay what I am thinking of trying:

  1. I'm going to make a small change so that regardless of whether there is a new meter read or not, the kegboard still sends the current meter status at the interval if the connection is open. This idea spawned from this thread talking about how connections go stale after a while if nothing is being sent. The Android app already supports ignoring the request if there is no actual meter change.
  2. Update the Android app to automatically turn on the Watchdog when it connects. Does it already do this? I need to look closer at the code to see if it does but I see that the watchdog is built in to the particle code.

I think making these two changes will do a better job keeping the socket open.

@rplankenhorn
Contributor

I made the above changes and the controller has been connected with the Particle for several days now with no issues. I updated the Android app so that each time it receives a message from the controller, it just sends back the watchdog kick command. I'll put up a PR to both this repo and the Particle repo and tag this issue.
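Roughly, the app-side behavior described above could look like the sketch below: every incoming status message triggers a watchdog kick, and repeated readings with no meter change are ignored. The class `KickOnReceive`, the `watchdog-kick` string, and the tick-count representation are assumptions, not the code from the actual PRs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the command name and method names are assumptions.
class KickOnReceive {
    final List<String> sent = new ArrayList<>();        // stands in for the socket's output
    final List<Long> recordedTicks = new ArrayList<>(); // meter changes passed on to the app
    private long lastTicks = -1;                        // -1 means no reading seen yet

    /** Called for each meter status message the controller sends at its fixed interval. */
    void onMeterStatus(long ticks) {
        sent.add("watchdog-kick"); // every message gets a kick back, changed or not
        if (ticks != lastTicks) {  // a repeated status with no change is ignored
            lastTicks = ticks;
            recordedTicks.add(ticks);
        }
    }
}
```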


4 participants