fluent-bit service fails due to segmentation fault #1266

Open
diegoauad opened this issue May 23, 2023 · 1 comment

diegoauad commented May 23, 2023

Describe the bug
google-cloud-ops-agent-fluent-bit.service fails to start. A 'segmentation fault' error is shown in journalctl logs.

To Reproduce
Steps to reproduce the behavior:

  1. Start a GCE VM with a Debian 10 image.
  2. Install MongoDB Community Edition 5.0.11.
  3. Install Ops Agent 2.32.0.
  4. Apply the example configuration provided in the guide about Operations Suite for MongoDB.
  5. In our case, the agent worked fine for a few hours. Then google-cloud-ops-agent-fluent-bit.service failed and could not be started again.

Expected behavior
All Ops Agent services run correctly, or at least produce an informative error message about what is misconfigured.

Environment (please complete the following information):

  • Project ID pulse-api-370218
  • VM ID 7971989526812200110
  • VM distro / OS: Debian 10
  • Ops Agent version 2.32.0
  • Ops Agent configuration MongoDB operations suite guide
  • Ops Agent log
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]: 2023/05/23 16:46:26 Built-in config:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]: logging:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   receivers:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     syslog:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: files
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       include_paths:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       - /var/log/messages
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       - /var/log/syslog
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   service:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     pipelines:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       default_pipeline:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         receivers: [syslog]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]: metrics:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   receivers:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     hostmetrics:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: hostmetrics
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       collection_interval: 60s
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   processors:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     metrics_filter:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: exclude_metrics
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       metrics_pattern: []
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   service:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     pipelines:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       default_pipeline:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         receivers: [hostmetrics]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         processors: [metrics_filter]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]: 2023/05/23 16:46:26 Merged config:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]: logging:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   receivers:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     mongodb:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: mongodb
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     syslog:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: files
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       include_paths:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       - /var/log/messages
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       - /var/log/syslog
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   service:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     pipelines:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       default_pipeline:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         receivers: [syslog]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       mongo:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         receivers: [mongodb]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]: metrics:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   receivers:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     hostmetrics:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: hostmetrics
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       collection_interval: 60s
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     mongodb:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: mongodb
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       insecure: true
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       insecure_skip_verify: null
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       cert_file: ""
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       key_file: ""
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       ca_file: ""
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       collection_interval: ""
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   processors:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     metrics_filter:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       type: exclude_metrics
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       metrics_pattern: []
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:   service:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:     pipelines:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       default_pipeline:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         receivers: [hostmetrics]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         processors: [metrics_filter]
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:       mongo:
May 23 16:46:26 mongodb-dev-1 google_cloud_ops_agent_engine[17334]:         receivers: [mongodb]
May 23 16:46:26 mongodb-dev-1 systemd[1]: Started Google Cloud Ops Agent - Logging Agent.
May 23 16:46:27 mongodb-dev-1 google_cloud_ops_agent_wrapper[17343]: 2023/05/23 16:46:27 signal: segmentation fault
May 23 16:46:27 mongodb-dev-1 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Main process exited, code=exited, status=255/EXCEPTION
May 23 16:46:27 mongodb-dev-1 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Failed with result 'exit-code'.
May 23 16:46:27 mongodb-dev-1 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Service RestartSec=100ms expired, scheduling restart.
May 23 16:46:27 mongodb-dev-1 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Scheduled restart job, restart counter is at 4.
May 23 16:46:27 mongodb-dev-1 systemd[1]: Stopped Google Cloud Ops Agent - Logging Agent.
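
For anyone who wants to compare the merged configuration above against the guide, here is a minimal sketch that strips the journald prefix from lines like these and returns the YAML that follows the "Merged config:" marker. The function name `extract_merged_config` and the prefix pattern are assumptions based only on the log format shown in this issue; they are not part of the Ops Agent.

```python
import re

# Assumed journald line shape, taken from the log above:
#   "May 23 16:46:26 <host> <unit>[<pid>]: <body>"
PREFIX = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+ "
    r"(?P<unit>[^\[\s]+)\[\d+\]: (?P<body>.*)$"
)

ENGINE_UNIT = "google_cloud_ops_agent_engine"


def extract_merged_config(journal_text: str) -> str:
    """Return the YAML printed after the last 'Merged config:' marker."""
    lines = []
    capturing = False
    for raw in journal_text.splitlines():
        match = PREFIX.match(raw)
        if match is None:
            continue  # not a journald-formatted line
        body = match["body"]
        if body.endswith("Merged config:"):
            # Start over so that only the most recent dump is kept.
            capturing = True
            lines = []
            continue
        if capturing:
            if match["unit"] != ENGINE_UNIT:
                # Another process (e.g. systemd) is logging: the dump is over.
                capturing = False
                continue
            lines.append(body)  # indentation after the prefix is preserved
    return "\n".join(lines)
```

Passing the log text above to `extract_merged_config` should yield the `logging:` and `metrics:` sections as plain YAML, which is easier to diff against the example configuration.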

Additional context
We are running MongoDB as a standalone replica, with authentication enabled. Monitoring works fine; we are experiencing issues with logging only.

braydonk (Contributor) commented

Hi @diegoauad, thank you for opening an issue. I have been investigating similar issues that other users have reported with MongoDB logging. In particular, we have seen problems with especially complicated nested logs coming from MongoDB.

Would you please open a support case so that we can assist in greater detail? In the support case, please include the output from our diagnostic tool and, if possible, some representative MongoDB logs. You can also mention this GitHub issue in the case.

If you do open a support case, please respond here with the case number so we can investigate.
