Redis consumer does not reconnect after failure to ack message #206

Open
mbowersox opened this issue Nov 12, 2020 · 2 comments
Labels
bug Something isn't working M-redis This issue is related to the Redis module P2 Priority 2

Comments

@mbowersox

It appears that the Redis consumer does not reconnect after an ack failure. We believe the ack failure occurred because the stream was dropped from the Redis instance after exceeding its maximum cache size. In-flight messages could not be acked because the stream key no longer existed. The producer for this stream then re-created the key, but the consumer never recovered.

@sklose sklose added bug Something isn't working M-redis This issue is related to the Redis module labels Nov 12, 2020
@sklose
Collaborator

sklose commented Nov 12, 2020

a couple of observations:

  • the stream was lost because it exceeded its MAXLEN setting
  • the CC application reported errors acking messages
  • the CC application did not report any errors reading from the non-existent stream
  • the stream was recreated at some point, however the CC application still did not receive any new messages until restarted

things to investigate:

  • what's the behavior of the redis client when we read from a non-existent stream? does it produce an error or just return "0 messages"? is there a difference between the stream not existing to begin with vs. the stream disappearing mid-way?
  • why did the CC application not receive new messages after the stream was recreated? does it have to do with the consumer group? does it need to be recreated as well / rejoined by the application?

options to fix this issue:

  • detect when streams don't exist and throw an error from the input source which will terminate the CC application (brute force fix)
  • gracefully handle rejoining consumer groups for re-created streams (in case that turns out to be a problem)
  • streams are already created as part of the initialization logic. when the application detects that a stream is gone it could just re-create it
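
The last two options could be combined roughly as follows. This is only a sketch in Python for illustration, not the project's code: `StreamClient`, `read_with_recovery`, and their parameters are hypothetical names. It assumes the client surfaces Redis's `NOGROUP` error reply (what Redis returns for `XREADGROUP` when the key or the consumer group is missing) as an exception whose message contains `NOGROUP`. Any other failure is re-raised, which corresponds to the brute force option.

```python
from typing import Any, Protocol


class StreamClient(Protocol):
    """Hypothetical minimal client interface; method names are assumptions."""

    def xreadgroup(self, group: str, consumer: str, stream: str) -> list[Any]: ...

    def xgroup_create(self, stream: str, group: str, start_id: str, mkstream: bool) -> None: ...


def is_missing_group(error: Exception) -> bool:
    # Redis replies with an error starting with "NOGROUP" when the stream key
    # or the consumer group does not exist.
    return "NOGROUP" in str(error)


def read_with_recovery(
    client: StreamClient, stream: str, group: str, consumer: str
) -> list[Any]:
    """Read once; if the stream or group is gone, re-create both and retry once."""
    try:
        return client.xreadgroup(group, consumer, stream)
    except Exception as error:
        if not is_missing_group(error):
            raise  # unrelated failure: let it terminate the application
        # Equivalent of: XGROUP CREATE <stream> <group> 0 MKSTREAM
        client.xgroup_create(stream, group, start_id="0", mkstream=True)
        return client.xreadgroup(group, consumer, stream)
```

Starting the re-created group at `0` rather than `$` is a choice made for this sketch: it would let the consumer pick up messages the producer wrote to the new stream before the group existed, at the cost of possibly re-reading entries. Whether that is the right trade-off here still needs to be checked.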

@chrnola
Contributor

chrnola commented Nov 12, 2020

In this case, I think that if the app had crashed when that first ack failed, everything would have self-healed. Propagating that exception might be a decent enough short-term fix.
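
A minimal sketch of that short-term fix, in Python for illustration only. `ack_or_crash`, `AckFailed`, and the `ack` callable are hypothetical names, not the project's API. The only point is that an ack error is re-raised instead of being logged and swallowed, so whatever supervises the process can restart it.

```python
from typing import Callable


class AckFailed(RuntimeError):
    """Raised when a message could not be acknowledged (hypothetical type)."""


def ack_or_crash(ack: Callable[[str], None], message_id: str) -> None:
    """Acknowledge one message and propagate any failure to the caller."""
    try:
        ack(message_id)
    except Exception as error:
        # Keep the original error as the cause so the root problem stays visible.
        raise AckFailed(f"could not ack message {message_id}") from error
```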

@sklose sklose added this to the 1.4 milestone Dec 14, 2020
@sklose sklose self-assigned this Dec 14, 2020
@plameniv plameniv added the P2 Priority 2 label Jul 27, 2021