100+ Consecutive fails on some nodes #29

sadgb · 2019-03-14T06:17:37Z

After investigating those nodes, I was unable to find any clues.
CPU usage is low
Memory usage is less than 50%
Same configuration as the others; make restart didn't help
Server restart didn't help

The only thing I was able to find in the logs is:

chainpoint-node | 2019-01-20T12:39:21.196654386Z WARN : Calendar : Could not retrieve block range 25735 (blocks 2573500 to 2573599)
......
chainpoint-node | 2019-03-07T12:04:20.243424875Z WARN : Calendar : Could not retrieve block range 28231 (blocks 2823100 to 2823199)
chainpoint-node | 2019-03-13T06:44:41.190138103Z WARN : Calendar : Could not retrieve block range 28511 (blocks 2851100 to 2851199)

The problem appears in batches. For example, I have some nodes with 320-350 consecutive failures and another batch with 738-800 failures.

Please tell me how to find more information or how to fix this.
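
The warnings quoted above follow a regular pattern, so they can be pulled out of a log dump mechanically. The following is a minimal sketch, assuming the line format stays exactly as shown in this thread; the regular expression and the `parse_warnings` helper are illustrative names, not part of chainpoint-node. It also checks the relationship visible in the three quoted lines, where range N covers blocks N*100 to N*100+99.

```python
import re

# Pattern written against the three warning lines quoted in this thread.
# Assumption: other nodes and versions log the warning in the same format.
WARN_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+Z) WARN : Calendar : "
    r"Could not retrieve block range (?P<range>\d+) "
    r"\(blocks (?P<first>\d+) to (?P<last>\d+)\)"
)


def parse_warnings(text):
    """Return (timestamp, range_index, first_block, last_block) tuples."""
    return [
        (m["ts"], int(m["range"]), int(m["first"]), int(m["last"]))
        for m in WARN_RE.finditer(text)
    ]


# The three lines from the report, one per line.
sample = (
    "chainpoint-node | 2019-01-20T12:39:21.196654386Z WARN : Calendar : "
    "Could not retrieve block range 25735 (blocks 2573500 to 2573599)\n"
    "chainpoint-node | 2019-03-07T12:04:20.243424875Z WARN : Calendar : "
    "Could not retrieve block range 28231 (blocks 2823100 to 2823199)\n"
    "chainpoint-node | 2019-03-13T06:44:41.190138103Z WARN : Calendar : "
    "Could not retrieve block range 28511 (blocks 2851100 to 2851199)\n"
)

warnings = parse_warnings(sample)
for ts, idx, first, last in warnings:
    # Check that the range index and the block numbers agree (N*100 .. N*100+99).
    consistent = first == idx * 100 and last == idx * 100 + 99
    print(ts, idx, first, last, consistent)
```

Listing the range indices this way makes it easier to see whether failures cluster around particular block ranges or are spread out over time.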


jacohend · 2019-03-14T18:35:53Z

Hi @sadgb, thanks for reaching out. Could you send us your node IP and node version? This will help us debug the issue. You can send them to jacob@tierion.com if you don't want your information public on GitHub.

sadgb · 2019-03-15T11:31:02Z

All of them are 1.5.4.
I've sent you an email with the details.

michael-iglesias · 2019-03-15T23:29:24Z

Hello @sadgb,

Can you please provide us with a more complete account of what is being logged on the Nodes suffering from consecutive failures? We've noted in the log output pasted above that there is a considerable block range gap between the first and last reported failures: block range 28231 & block range 28511, respectively.

A more verbose snapshot of the log output and the steps that you've taken to try to remedy the issue will give us a bit more insight into what is going on.

Feel free to post the requested info here in this thread or email either jacob@tierion.com or miglesias@tierion.com.

Thanks,
Michael I.

sadgb · 2019-03-19T16:36:52Z

So I ran docker-compose logs > 1-1.txt on one of my nodes, 80.211.216.147. File attached. If you can give some instructions on how to gather more data, please share them.
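
Since the failures are reported to arrive in batches, one way to make a saved log dump easier to read is to count the calendar warnings per day. This is only a sketch under assumptions: it expects a file like the `1-1.txt` produced by the command above, and it assumes the warning text matches the lines quoted earlier in this thread. The helper names `warnings_per_day` and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Matches the date part of the timestamp on the calendar warning lines
# quoted in this thread (assumed format, not confirmed for other versions).
DATE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2})T\S+ WARN : Calendar : Could not retrieve block range"
)


def warnings_per_day(lines):
    """Count 'Could not retrieve block range' warnings per calendar day."""
    counts = Counter()
    for line in lines:
        match = DATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(path):
    """Read a saved log dump (for example 1-1.txt) and count warnings per day."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return warnings_per_day(handle)


# Demo on two lines from the report plus a made-up placeholder line
# that is not a warning and should be ignored.
demo = [
    "chainpoint-node | 2019-03-07T12:04:20.243424875Z WARN : Calendar : "
    "Could not retrieve block range 28231 (blocks 2823100 to 2823199)",
    "chainpoint-node | (placeholder for any other log line)",
    "chainpoint-node | 2019-03-13T06:44:41.190138103Z WARN : Calendar : "
    "Could not retrieve block range 28511 (blocks 2851100 to 2851199)",
]
demo_counts = warnings_per_day(demo)
for day, count in sorted(demo_counts.items()):
    print(day, count)
```

Calling `summarize("1-1.txt")` on the real dump would give the same per-day counts for the whole file, which can then be compared against the times the consecutive failures started.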

sadgb · 2019-03-19T17:22:50Z

Also, I would like to mention that the number of problem nodes has increased a bit.

michael-iglesias · 2019-03-20T16:59:25Z

@sadgb I think you may have forgotten to attach the file.
