I have an HCI (oVirt with Gluster) cluster that will not boot the engine correctly. The error is the "vdsm" service, which keeps cycling, and I need a second pair of eyes on it. Both nodes of the cluster are cycling with the same error.
I did the usual test of stopping gluster and running fsck on the drives, and they seem clean. I also ran updates (this is the same cluster that I converted to CentOS Stream 8, so that may play into this, since patches were applied).
Symptom (the error below loops continually):
#############
Dec 21 11:24:09 medusa systemd[1]: Starting Virtual Desktop Server Manager...
Dec 21 11:24:09 medusa vdsmd_init_common.sh[28133]: vdsm: Running mkdirs
Dec 21 11:24:09 medusa vdsmd_init_common.sh[28133]: vdsm: Running configure_vdsm_logs
Dec 21 11:24:09 medusa vdsmd_init_common.sh[28133]: vdsm: Running run_init_hooks
Dec 21 11:24:09 medusa vdsmd_init_common.sh[28133]: vdsm: Running check_is_configured
Dec 21 11:24:09 medusa systemd[1]: ovirt-ha-broker.service: Service RestartSec=100ms expired, scheduling restart.
Dec 21 11:24:09 medusa systemd[1]: ovirt-ha-broker.service: Scheduled restart job, restart counter is at 34.
Dec 21 11:24:09 medusa systemd[1]: Stopped oVirt Hosted Engine High Availability Communications Broker.
Dec 21 11:24:09 medusa systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Dec 21 11:24:09 medusa abrt-server[28130]: Deleting problem directory Python3-2020-12-21-11:24:09-27657 (dup of Python3-2020-12-21-08:12:37-988628)
Dec 21 11:24:09 medusa dbus-daemon[1088]: [system] Activating service name='org.freedesktop.problems' requested by ':1.1594' (uid=0 pid=28160 comm="/usr/libexec/platform-python /usr/bin/abrt-action-" label="system_u:system_r:abrt_t:s0-s0:c0.c1023") (using servicehelper)
Dec 21 11:24:09 medusa dbus-daemon[28162]: [system] Failed to reset fd limit before activating service: org.freedesktop.DBus.Error.AccessDenied: Failed to restore old fd limit: Operation not permitted
Dec 21 11:24:10 medusa dbus-daemon[1088]: [system] Successfully activated service 'org.freedesktop.problems'
Dec 21 11:24:10 medusa abrt-server[28130]: /bin/sh: reporter-systemd-journal: command not found
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: Error:
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: One of the modules is not configured to work with VDSM.
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: To configure the module use the following:
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: 'vdsm-tool configure [--module module-name]'.
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: If all modules are not configured try to use:
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: 'vdsm-tool configure --force'
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: (The force flag will stop the module's service and start it
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: afterwards automatically to load the new configuration.)
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]:
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: Manual override for multipath.conf detected - preserving current configuration
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: This manual override for multipath.conf was based on downrevved template. You are strongly advised to contact your support representatives
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: abrt is already configured for vdsm
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: Managed volume database is already configured
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: lvm is configured for vdsm
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: sanlock config file needs options: {'our_host_name': '03000200-0400-0500-0006-000700080009', 'max_worker_threads': '50'}
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: libvirt is already configured for vdsm
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: Modules sanlock are not configured
Dec 21 11:24:10 medusa vdsmd_init_common.sh[28133]: vdsm: stopped during execute check_is_configured task (task returned with error code 1).
Dec 21 11:24:10 medusa systemd[1]: vdsmd.service: Control process exited, code=exited status=1
Dec 21 11:24:10 medusa systemd[1]: vdsmd.service: Failed with result 'exit-code'.
Dec 21 11:24:10 medusa systemd[1]: Failed to start Virtual Desktop Server Manager.
Dec 21 11:24:10 medusa systemd[1]: Dependency failed for MOM instance configured for VDSM purposes.
Dec 21 11:24:10 medusa systemd[1]: mom-vdsm.service: Job mom-vdsm.service/start failed with result 'dependency'.
Dec 21 11:24:11 medusa systemd[1]: vdsmd.service: Service RestartSec=100ms expired, scheduling restart.
Dec 21 11:24:11 medusa systemd[1]: vdsmd.service: Scheduled restart job, restart counter is at 1.
Dec 21 11:24:11 medusa systemd[1]: Stopped Virtual Desktop Server Manager.
Dec 21 11:24:11 medusa systemd[1]: Starting Virtual Desktop Server Manager...
################
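Since the loop above buries the useful line ("Modules sanlock are not configured") in a lot of noise, a small filter can pull out just the module names. This is only a sketch: unconfigured_modules is a hypothetical helper name, and splitting on commas and spaces is an assumption about how vdsm prints more than one module.

```shell
# Hypothetical helper: list the modules vdsm reports as unconfigured.
# Reads journal text on stdin and prints one module name per line.
unconfigured_modules() {
    # Keep only the text between "Modules " and " are not configured".
    sed -n 's/.*Modules \(.*\) are not configured.*/\1/p' \
        | tr ', ' '\n\n' \
        | sed '/^$/d' \
        | sort -u
}

# Usage on a host (assumes the vdsmd unit logs to the journal, as above):
#   journalctl -u vdsmd -b --no-pager | unconfigured_modules
```

On the log above this should narrow things down to sanlock, which matches the "sanlock config file needs options" line a little earlier in the same output.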
Some Google hits on this, but I am not really feeling warm and fuzzy about the fixes...
https://archived.forum.manjaro.org/t/so ... er/79486/5
https://access.redhat.com/solutions/3900301
# not really sure if this is a red herring and/or a bad thing to do..
https://access.redhat.com/solutions/5167301
https://bugzilla.redhat.com/show_bug.cgi?id=1839753
# could be the answer, but I don't want to shotgun things. Why does the other node work... maybe because the failing node is the one trying to launch the oVirt engine.
# Debug Notes:
Dec 21 10:49:21 medusa.penguinpages.local setroubleshoot[8389]: AnalyzeThread.run(): Set alarm timeout to 10
Dec 21 10:49:40 medusa.penguinpages.local dbus-daemon[1088]: [system] Activating service name='org.freedesktop.problems' requested by ':1.96' (uid=0 pid=8793 comm="/usr/libexec/platform-python /usr/bin/abrt-action-" label="system_u:>
Dec 21 10:49:40 medusa.penguinpages.local dbus-daemon[8797]: [system] Failed to reset fd limit before activating service: org.freedesktop.DBus.Error.AccessDenied: Failed to restore old fd limit: Operation not permitted
Dec 21 10:49:40 medusa.penguinpages.local dbus-daemon[1088]: [system] Successfully activated service 'org.freedesktop.problems'
Dec 21 10:52:47 medusa.penguinpages.local dbus-daemon[1088]: [system] Activating service name='org.freedesktop.problems' requested by ':1.226' (uid=0 pid=10363 comm="/usr/libexec/platform-python /usr/bin/abrt-action-" label="system_>
Dec 21 10:52:47 medusa.penguinpages.local dbus-daemon[10365]: [system] Failed to reset fd limit before activating service: org.freedesktop.DBus.Error.AccessDenied: Failed to restore old fd limit: Operation not permitted
Dec 21 10:52:47 medusa.penguinpages.local dbus-daemon[1088]: [system] Successfully activated service 'org.freedesktop.problems'
....
[root@medusa ~]# ulimit -Hn
262144
[root@medusa systemd]# ls -alh /etc/systemd/
total 44K
drwxr-xr-x. 4 root root 150 Nov 13 18:01 .
drwxr-xr-x. 139 root root 12K Dec 21 10:48 ..
-rw-r--r--. 1 root root 615 Jun 22 2018 coredump.conf
-rw-r--r--. 1 root root 1.1K Jun 22 2018 journald.conf
-rw-r--r--. 1 root root 1.1K Nov 13 18:00 logind.conf
-rw-r--r--. 1 root root 631 Nov 13 18:00 resolved.conf
drwxr-xr-x. 25 root root 4.0K Dec 13 04:24 system
-rw-r--r--. 1 root root 1.7K Nov 13 18:00 system.conf
drwxr-xr-x. 2 root root 6 Nov 13 18:01 user
-rw-r--r--. 1 root root 1.2K Jun 22 2018 user.conf
# These settings and permissions match the other servers in the cluster, which are working fine.
Saw a note in the dbus journal about SELinux (required for oVirt and gluster) and applied the suggested changes:
journalctl --unit dbus
Dec 21 10:48:42 medusa.penguinpages.local setroubleshoot[1710]: SELinux is preventing glusterd from setopt access on the netlink_rdma_socket labeled glusterd_t. For complete SELinux messages run: sealert -l b95d8446-a5c5-494d-ac9d-e>
Dec 21 10:48:42 medusa.penguinpages.local setroubleshoot[1710]: SELinux is preventing glusterd from setopt access on the netlink_rdma_socket labeled glusterd_t.
ausearch -c 'glusterd' --raw | audit2allow -M my-glusterd
semodule -X 300 -i my-glusterd.pp
# rebooted after the SELinux change (which should not have been needed, but.. meh...)
Post reboot, no change. Looking for ideas.
As usual, thanks in advance.
HCI - VDSM Service Failure Looping
- penguinpages
Re: HCI - VDSM Service Failure Looping
Found a similar issue to my failure:
https://bugzilla.redhat.com/show_bug.cgi?id=1368115
[root@thor ~]# cp /etc/vdsm/vdsm.conf /etc/vdsm/vdsm.conf.orig
[root@thor ~]# vdsm-tool validate-config
SUCCESS: ssl configured to true. No conflicts
[root@thor ~]# vdsm-tool validate-config
SUCCESS: ssl configured to true. No conflicts
[root@thor ~]# vdsm-tool configure --force
Checking configuration status...
sanlock is configured for vdsm
Current revision of multipath.conf detected, preserving
lvm is configured for vdsm
abrt is already configured for vdsm
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Managed volume database is already configured
Error: ServiceOperationError: _systemctlStop failed
b'Job for vdsmd.service canceled.\n'
[root@thor ~]# systemctl restart vdsmd
[root@thor ~]# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2020-12-28 12:53:06 EST; 3s ago
Process: 136537 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
Main PID: 136588 (vdsmd)
Tasks: 41 (limit: 1235322)
Memory: 68.5M
CGroup: /system.slice/vdsmd.service
└─136588 /usr/bin/python3 /usr/share/vdsm/vdsmd
Dec 28 12:53:05 thor.penguinpages.local vdsmd_init_common.sh[136537]: vdsm: Running syslog_available
Dec 28 12:53:05 thor.penguinpages.local vdsmd_init_common.sh[136537]: vdsm: Running nwfilter
Dec 28 12:53:05 thor.penguinpages.local vdsmd_init_common.sh[136537]: vdsm: Running dummybr
Dec 28 12:53:06 thor.penguinpages.local vdsmd_init_common.sh[136537]: vdsm: Running tune_system
Dec 28 12:53:06 thor.penguinpages.local vdsmd_init_common.sh[136537]: vdsm: Running test_space
Dec 28 12:53:06 thor.penguinpages.local vdsmd_init_common.sh[136537]: vdsm: Running test_lo
Dec 28 12:53:06 thor.penguinpages.local systemd[1]: Started Virtual Desktop Server Manager.
Dec 28 12:53:06 thor.penguinpages.local vdsm[136588]: WARN MOM not available. Error: [Errno 111] Connection refused
Dec 28 12:53:06 thor.penguinpages.local vdsm[136588]: WARN MOM not available, KSM stats will be missing. Error:
Dec 28 12:53:06 thor.penguinpages.local vdsm[136588]: WARN Failed to retrieve Hosted Engine HA info, is Hosted Engine setup finished?
[root@thor ~]#
I think the ticket describes vdsmd not starting. Here it does start, but MOM seems to be having issues.
https://bugzilla.redhat.com/show_bug.cgi?id=1393012
[root@thor ~]# yum install mom
Last metadata expiration check: 0:56:04 ago on Mon 28 Dec 2020 11:59:02 AM EST.
Package mom-0.6.0-1.el8.noarch is already installed.
Dependencies resolved.
Nothing to do.
Complete!
[root@thor ~]# systemctl status mom-vdsm.service
● mom-vdsm.service - MOM instance configured for VDSM purposes
Loaded: loaded (/usr/lib/systemd/system/mom-vdsm.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2020-12-28 12:53:11 EST; 3min 50s ago
Main PID: 136691 (momd)
Tasks: 6 (limit: 1235322)
Memory: 26.1M
CGroup: /system.slice/mom-vdsm.service
└─136691 /usr/libexec/platform-python /usr/sbin/momd -c /etc/vdsm/mom.conf
Dec 28 12:53:11 thor.penguinpages.local systemd[1]: Started MOM instance configured for VDSM purposes.
[root@thor ~]# cat /var/log/vdsm/mom.log
<snip hundreds of lines about ping>
2020-12-28 12:51:54,533 - mom.RPCServer - INFO - ping()
2020-12-28 12:51:54,534 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:52:09,560 - mom.RPCServer - INFO - ping()
2020-12-28 12:52:09,561 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:52:24,587 - mom.RPCServer - INFO - ping()
2020-12-28 12:52:24,588 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:52:39,613 - mom.RPCServer - INFO - ping()
2020-12-28 12:52:39,614 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:52:54,640 - mom.RPCServer - INFO - ping()
2020-12-28 12:52:54,648 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:53:06,259 - mom - INFO - MOM starting
2020-12-28 12:53:06,297 - mom.HostMonitor - INFO - Host Monitor starting
2020-12-28 12:53:06,297 - mom - INFO - hypervisor interface vdsmjsonrpcclient
2020-12-28 12:53:06,323 - mom.HostMonitor - INFO - HostMonitor is ready
2020-12-28 12:53:06,347 - mom - ERROR - Cannot connect to VDSM. This can happen when VDSM is starting. Error: Connection to localhost:54321 with use_tls=True, timeout=60 failed: [Errno 111] Connection refused
2020-12-28 12:53:11,715 - mom - INFO - MOM starting
2020-12-28 12:53:11,754 - mom.HostMonitor - INFO - Host Monitor starting
2020-12-28 12:53:11,754 - mom - INFO - hypervisor interface vdsmjsonrpcclient
2020-12-28 12:53:11,773 - mom.HostMonitor - INFO - HostMonitor is ready
2020-12-28 12:53:11,823 - mom.GuestManager - INFO - Guest Manager starting: multi-thread
2020-12-28 12:53:11,827 - mom.Policy - INFO - Loaded policy '00-defines'
2020-12-28 12:53:11,829 - mom.Policy - INFO - Loaded policy '01-parameters'
2020-12-28 12:53:11,842 - mom.Policy - INFO - Loaded policy '02-balloon'
2020-12-28 12:53:11,862 - mom.Policy - INFO - Loaded policy '03-ksm'
2020-12-28 12:53:11,888 - mom.Policy - INFO - Loaded policy '04-cputune'
2020-12-28 12:53:11,922 - mom.Policy - INFO - Loaded policy '05-iotune'
2020-12-28 12:53:11,922 - mom.PolicyEngine - INFO - Policy Engine starting
2020-12-28 12:53:11,922 - mom.RPCServer - INFO - Using unix socket /run/vdsm/mom-vdsm.sock
2020-12-28 12:53:11,923 - mom.RPCServer - INFO - RPC Server starting
2020-12-28 12:53:22,018 - mom.RPCServer - INFO - ping()
2020-12-28 12:53:22,019 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:53:26,963 - mom.Controllers.KSM - INFO - Updating KSM configuration: run:0 pages_to_scan:0 sleep_millisecs:0 merge_across_nodes:1
2020-12-28 12:53:37,044 - mom.RPCServer - INFO - ping()
2020-12-28 12:53:37,045 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:53:52,071 - mom.RPCServer - INFO - ping()
2020-12-28 12:53:52,072 - mom.RPCServer - INFO - getStatistics()
<snip hundreds of lines of ping issues>
2020-12-28 12:56:37,349 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:56:52,373 - mom.RPCServer - INFO - ping()
2020-12-28 12:56:52,374 - mom.RPCServer - INFO - getStatistics()
2020-12-28 12:57:07,396 - mom.RPCServer - INFO - ping()
2020-12-28 12:57:07,397 - mom.RPCServer - INFO - getStatistics()
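Most of mom.log is routine ping() / getStatistics() polling, so to see the startup and error lines without snipping by hand, something like the filter below may help. It is a rough sketch: mom_log_filter is a hypothetical helper name, and the pattern assumes the exact line format shown in the log above.

```shell
# Hypothetical helper: drop routine RPC polling lines from mom.log output.
# Reads log text on stdin and prints everything else unchanged.
mom_log_filter() {
    grep -v -E ' - mom\.RPCServer - INFO - (ping|getStatistics)\(\)$'
}

# Usage on a host (log path taken from the session above):
#   mom_log_filter < /var/log/vdsm/mom.log
```

With the polling lines removed, what is left of the excerpt above is the two "MOM starting" sequences and the single "Cannot connect to VDSM" error at 12:53:06, the same second that systemd reports vdsmd as started.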
... Still digging... will post if / when I find the issue.