Network doesn't come back up after an outage

Post by indefatigableman » 2021/01/06 01:53:17

Hi. I hope someone here can help me. I have a Linux desktop running CentOS 7.9 that is configured to use NetworkManager. It has a wired Gigabit Ethernet connection and a static IP address on the Internet.

Whenever there's a network outage, the machine loses network access and doesn't regain it once the outage is resolved. While it's in this state you can't ping it, but it's not frozen. After that, roughly once per day, it regains network access for a brief window. I know this because the computer sends me emails during that window. I wanted to ping it during one of these windows, but it was rebooted before I had the chance.

Rebooting always fixes the problem until the next network outage. None of the other Linux boxes on the same LAN have this problem, but I note that this is one of a select few with a static IP address. Almost all of the others use DHCP.

My sys admin seems convinced that it's hardware-related, either a problem with the Ethernet jack in the wall, some patch cable, or the computer itself. I just don't see how that is consistent with the evidence though. I think it's some kind of configuration issue. I would like some additional opinions.

Here's the output of `ifconfig -a` (some values redacted for security reasons):

Code:

enp0s25: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 1xx.xxx.53.160  netmask 255.255.255.255  broadcast 1xx.xxx.53.160
        inet6 fe80::xxxx:xxxx:xxxx:509b  prefixlen 64  scopeid 0x20<link>
        ether b8:xx:xx:xx:xx:xx  txqueuelen 1000  (Ethernet)
        RX packets 122001  bytes 40089765 (38.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 73284  bytes 12721881 (12.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  memory 0xfb200000-fb220000  

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 406  bytes 26212 (25.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 406  bytes 26212 (25.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:xx:xx:xx:xx:47  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

virbr0-nic: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 52:xx:xx:xx:xx:47  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

And here's the output of `nmcli device show`:

Code:

GENERAL.DEVICE:                         enp0s25
GENERAL.TYPE:                           ethernet
GENERAL.HWADDR:                         b8:xx:xx:xx:xx:xx
GENERAL.MTU:                            1500
GENERAL.STATE:                          100 (connected)
GENERAL.CONNECTION:                     Wired connection 1
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/1
WIRED-PROPERTIES.CARRIER:               on
IP4.ADDRESS[1]:                         1xx.xxx.53.160/32
IP4.GATEWAY:                            1xx.xxx.52.1
IP4.ROUTE[1]:                           dst = 1xx.xxx.53.160/32, nh = 0.0.0.0, mt = 100
IP4.ROUTE[2]:                           dst = 1xx.xxx.52.1/32, nh = 0.0.0.0, mt = 100
IP4.ROUTE[3]:                           dst = 0.0.0.0/0, nh = 1xx.xxx.52.1, mt = 100
IP4.DNS[1]:                             1xx.xxx.10.134
IP4.DNS[2]:                             1xx.xxx.50.17
IP6.ADDRESS[1]:                         fe80::xxxx:xxxx:xxxx:509b/64
IP6.GATEWAY:                            --
IP6.ROUTE[1]:                           dst = ff00::/8, nh = ::, mt = 256, table=255
IP6.ROUTE[2]:                           dst = fe80::/64, nh = ::, mt = 256

GENERAL.DEVICE:                         virbr0
GENERAL.TYPE:                           bridge
GENERAL.HWADDR:                         52:xx:xx:xx:xx:47
GENERAL.MTU:                            1500
GENERAL.STATE:                          100 (connected)
GENERAL.CONNECTION:                     virbr0
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/2
IP4.ADDRESS[1]:                         192.168.122.1/24
IP4.GATEWAY:                            --
IP4.ROUTE[1]:                           dst = 192.168.122.0/24, nh = 0.0.0.0, mt = 0
IP6.GATEWAY:                            --

GENERAL.DEVICE:                         lo
GENERAL.TYPE:                           loopback
GENERAL.HWADDR:                         00:00:00:00:00:00
GENERAL.MTU:                            65536
GENERAL.STATE:                          10 (unmanaged)
GENERAL.CONNECTION:                     --
GENERAL.CON-PATH:                       --
IP4.ADDRESS[1]:                         127.0.0.1/8
IP4.GATEWAY:                            --
IP6.ADDRESS[1]:                         ::1/128
IP6.GATEWAY:                            --

GENERAL.DEVICE:                         virbr0-nic
GENERAL.TYPE:                           tun
GENERAL.HWADDR:                         52:xx:xx:xx:xx:47
GENERAL.MTU:                            1500
GENERAL.STATE:                          10 (unmanaged)

Any ideas?

Thanks!

Post by TrevorH » 2021/01/06 10:44:55

Read your logs, probably in /var/log/messages, and see what errors are present.
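
As a rough sketch of how to narrow that log down, the helper below keeps only the kernel link messages and the NetworkManager state changes for one interface. The function name `net_events` is made up for illustration, and the patterns assume the e1000e and NetworkManager message formats that CentOS 7 typically writes to /var/log/messages; adjust them if your logs look different.

```shell
#!/bin/sh
# net_events FILE [IFACE]  (hypothetical helper, not part of any package)
# Print kernel "NIC Link is Up/Down" lines and NetworkManager
# "state change" lines for one interface from a syslog-style file.
net_events() {
    log=${1:-/var/log/messages}   # default log path, as suggested above
    iface=${2:-enp0s25}           # interface name taken from this thread
    grep -E "kernel: .*${iface}.*NIC Link is|NetworkManager\[[0-9]+\]: .*device \(${iface}\): state change" "$log"
}

# Example (reading /var/log/messages usually needs root):
#   net_events /var/log/messages enp0s25
```

If the system keeps a persistent journal, `journalctl -u NetworkManager` is another place to look for the same state changes.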

Post by indefatigableman » 2021/01/06 20:06:03

TrevorH wrote:
2021/01/06 10:44:55
Read your logs, probably in /var/log/messages and see what errors are present.

Code:

Dec 23 20:14:09 xxxxxx kernel: e1000e: enp0s25 NIC Link is Down
Dec 23 20:14:12 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to receive; errno = No route to host
Dec 23 20:14:15 xxxxxx NetworkManager[1115]: <info>  [1608772455.2955] device (enp0s25): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Dec 23 20:14:15 xxxxxx avahi-daemon[978]: Withdrawing address record for fe80::xxx:xxx:xxx:xxx on enp0s25.
Dec 23 20:14:15 xxxxxx avahi-daemon[978]: Withdrawing address record for 1xx.xxx.xxx.160 on enp0s25.
Dec 23 20:14:15 xxxxxx avahi-daemon[978]: Leaving mDNS multicast group on interface enp0s25.IPv4 with address 1xx.xxx.xxx.160.
Dec 23 20:14:15 xxxxxx avahi-daemon[978]: Interface enp0s25.IPv4 no longer relevant for mDNS.
Dec 23 20:14:15 xxxxxx NetworkManager[1115]: <info>  [1608772455.3211] manager: NetworkManager state is now CONNECTED_LOCAL
Dec 23 20:14:15 xxxxxx dbus[1003]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Dec 23 20:14:15 xxxxxx systemd: Starting Network Manager Script Dispatcher Service...
Dec 23 20:14:15 xxxxxx dbus[1003]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec 23 20:14:15 xxxxxx systemd: Started Network Manager Script Dispatcher Service.
Dec 23 20:14:15 xxxxxx nm-dispatcher: req:1 'down' [enp0s25]: new request (4 scripts)
Dec 23 20:14:15 xxxxxx nm-dispatcher: req:1 'down' [enp0s25]: start running ordered scripts...
Dec 23 20:14:15 xxxxxx nm-dispatcher: req:2 'connectivity-change': new request (4 scripts)
Dec 23 20:14:15 xxxxxx chronyd[1061]: Source 1xx.xxx.xxx.46 offline
Dec 23 20:14:15 xxxxxx chronyd[1061]: Can't synchronise: no selectable sources
Dec 23 20:14:15 xxxxxx nm-dispatcher: req:2 'connectivity-change': start running ordered scripts...
Dec 23 20:15:12 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable

Then, when the network outage was resolved about 10 minutes later:

Code:

Dec 23 20:26:17 xxxxxx kernel: e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 23 20:26:17 xxxxxx kernel: IPv6: ADDRCONF(NETDEV_CHANGE): enp0s25: link becomes ready
Dec 23 20:26:17 xxxxxx NetworkManager[1115]: <info>  [1608773177.6393] device (enp0s25): carrier: link connected
Dec 23 20:26:17 xxxxxx NetworkManager[1115]: <info>  [1608773177.6402] device (enp0s25): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
Dec 23 20:26:17 xxxxxx NetworkManager: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Dec 23 20:26:23 xxxxxx kernel: e1000e: enp0s25 NIC Link is Down
Dec 23 20:26:28 xxxxxx kernel: e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 23 20:26:35 xxxxxx kernel: e1000e: enp0s25 NIC Link is Down
Dec 23 20:26:38 xxxxxx cupsd: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Dec 23 20:26:40 xxxxxx kernel: e1000e: enp0s25 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: Rx/Tx
Dec 23 20:26:40 xxxxxx kernel: e1000e: enp0s25 NIC Link is Down
Dec 23 20:26:44 xxxxxx kernel: e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 23 20:27:13 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable

And 11 mins later:

Code:

Dec 23 20:38:14 xxxxxx NetworkManager[1115]: <info>  [1608773894.0365] device (enp0s25): state change: config -> deactivating (reason 'connection-removed', sys-iface-state: 'managed')
Dec 23 20:38:14 xxxxxx NetworkManager[1115]: <info>  [1608773894.0371] manager: NetworkManager state is now CONNECTED_LOCAL
Dec 23 20:38:14 xxxxxx NetworkManager[1115]: <info>  [1608773894.0378] device (enp0s25): state change: deactivating -> disconnected (reason 'connection-removed', sys-iface-state: 'managed')
Dec 23 20:38:14 xxxxxx dbus[1003]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Dec 23 20:38:14 xxxxxx systemd: Starting Network Manager Script Dispatcher Service...
Dec 23 20:38:14 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Dec 23 20:38:14 xxxxxx NetworkManager: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Dec 23 20:38:14 xxxxxx dbus[1003]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec 23 20:38:14 xxxxxx systemd: Started Network Manager Script Dispatcher Service.
Dec 23 20:38:14 xxxxxx nm-dispatcher: req:1 'down' [enp0s25]: new request (4 scripts)
Dec 23 20:38:14 xxxxxx nm-dispatcher: req:1 'down' [enp0s25]: start running ordered scripts...
Dec 23 20:39:14 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Dec 23 20:39:14 xxxxxx NetworkManager: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable

Here's a log snippet from during one of those brief daily windows when it was able to get onto the network:

Code:

Jan  5 04:09:03 xxxxxx rpcbind[976]: warning: /etc/hosts.deny, line 14: open /usr3/etc/denyhosts.list: Permission denied
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9365] device (enp0s25): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9370] device (enp0s25): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9377] manager: NetworkManager state is now CONNECTED_LOCAL
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9419] manager: NetworkManager state is now CONNECTED_SITE
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9421] policy: set 'Wired connection 1' (enp0s25) as default for IPv4 routing and DNS
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9427] device (enp0s25): Activation: successful, device activated.
Jan  5 04:09:06 xxxxxx NetworkManager[1115]: <info>  [1609837746.9435] manager: NetworkManager state is now CONNECTED_GLOBAL
Jan  5 04:09:06 xxxxxx dbus[1003]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jan  5 04:09:06 xxxxxx systemd: Starting Network Manager Script Dispatcher Service...
Jan  5 04:09:06 xxxxxx dbus[1003]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jan  5 04:09:06 xxxxxx systemd: Started Network Manager Script Dispatcher Service.
Jan  5 04:09:06 xxxxxx nm-dispatcher: req:1 'up' [enp0s25]: new request (4 scripts)
Jan  5 04:09:06 xxxxxx nm-dispatcher: req:1 'up' [enp0s25]: start running ordered scripts...
Jan  5 04:09:06 xxxxxx nm-dispatcher: req:2 'connectivity-change': new request (4 scripts)
Jan  5 04:09:06 xxxxxx systemd: Unit iscsi.service cannot be reloaded because it is inactive.
Jan  5 04:09:07 xxxxxx chronyd[1061]: Source 1xx.xxx.xxx.46 online
Jan  5 04:09:07 xxxxxx nm-dispatcher: req:2 'connectivity-change': start running ordered scripts...
Jan  5 04:09:07 xxxxxx chronyd[1061]: Selected source 1xx.xxx.xxx.46
Jan  5 04:10:01 xxxxxx systemd: Started Session 17335 of user root.
Jan  5 04:12:01 xxxxxx systemd: Started Session 17336 of user xxxxx.
Jan  5 04:13:00 xxxxxx NetworkManager[1115]: <info>  [1609837980.8929] device (enp0s25): state change: activated -> deactivating (reason 'connection-removed', sys-iface-state: 'managed')
Jan  5 04:13:00 xxxxxx systemd: Removed slice User Slice of root.
Jan  5 04:13:00 xxxxxx NetworkManager[1115]: <info>  [1609837980.8979] manager: NetworkManager state is now CONNECTED_LOCAL
Jan  5 04:13:00 xxxxxx NetworkManager[1115]: <info>  [1609837980.8985] device (enp0s25): state change: deactivating -> disconnected (reason 'connection-removed', sys-iface-state: 'managed')
Jan  5 04:13:00 xxxxxx avahi-daemon[978]: Withdrawing address record for fe80::xxxx:xxxx:xxxx:509b on enp0s25.
Jan  5 04:13:00 xxxxxx dbus[1003]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jan  5 04:13:00 xxxxxx avahi-daemon[978]: Withdrawing address record for 1xx.xxx.xxx.160 on enp0s25.
Jan  5 04:13:00 xxxxxx systemd: Starting Network Manager Script Dispatcher Service...
Jan  5 04:13:00 xxxxxx avahi-daemon[978]: Leaving mDNS multicast group on interface enp0s25.IPv4 with address 1xx.xxx.xxx.160.
Jan  5 04:13:00 xxxxxx avahi-daemon[978]: Interface enp0s25.IPv4 no longer relevant for mDNS.
Jan  5 04:13:00 xxxxxx NetworkManager: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Jan  5 04:13:00 xxxxxx dbus[1003]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jan  5 04:13:00 xxxxxx systemd: Started Network Manager Script Dispatcher Service.
Jan  5 04:13:00 xxxxxx nm-dispatcher: req:1 'connectivity-change': new request (4 scripts)
Jan  5 04:13:00 xxxxxx nm-dispatcher: req:1 'connectivity-change': start running ordered scripts...
Jan  5 04:13:00 xxxxxx nm-dispatcher: req:2 'down' [enp0s25]: new request (4 scripts)
Jan  5 04:13:00 xxxxxx nm-dispatcher: req:2 'down' [enp0s25]: start running ordered scripts...
Jan  5 04:13:00 xxxxxx chronyd[1061]: Source 1xx.xxx.xxx.46 offline
Jan  5 04:13:00 xxxxxx chronyd[1061]: Can't synchronise: no selectable sources
Jan  5 04:13:01 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable
Jan  5 04:14:00 xxxxxx NetworkManager: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable

Any ideas? Suggestions? Help!
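
To put numbers on how often the link bounced in excerpts like the ones above, something along these lines could tally the kernel messages. It's only a sketch: `link_summary` is a made-up name, and it assumes the e1000e driver's "NIC Link is Up/Down" wording exactly as it appears in these logs.

```shell
#!/bin/sh
# link_summary FILE [IFACE]  (hypothetical helper)
# Count link-up and link-down kernel messages for one interface, and how
# many of the link-up messages report Half Duplex.
link_summary() {
    log=$1
    iface=${2:-enp0s25}   # interface name taken from this thread
    awk -v iface="$iface" '
        index($0, iface " NIC Link is Up")   { up++ }
        index($0, iface " NIC Link is Up") && /Half Duplex/ { half++ }
        index($0, iface " NIC Link is Down") { down++ }
        END { printf "up=%d down=%d half_duplex=%d\n", up, down, half }
    ' "$log"
}

# Example:
#   link_summary /var/log/messages enp0s25
```

Several up/down transitions within a minute, or any Half Duplex negotiation on a gigabit link, would be worth showing to whoever looks after the switch and cabling.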

Post by indefatigableman » 2021/01/19 23:24:33

Can anyone help? Thanks!

Post by Whoever » 2021/01/26 01:59:50

Code:

Dec 23 20:15:12 xxxxxx crond: do_ypcall: clnt_call: RPC: Unable to send; errno = Network is unreachable

I have never got yp (NIS) to work with NetworkManager. My advice would be to disable NetworkManager and make sure you have the appropriate "NM_CONTROLLED=no" in the ifcfg-* files.

Edit:
See this bug report and the resolution within it:
https://bugzilla.redhat.com/show_bug.cgi?id=877789
I haven't tried this. Disabling NetworkManager has always worked for me.
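
For anyone who wants to try the NM_CONTROLLED route, here is a minimal sketch of the ifcfg edit. The helper name `set_nm_controlled_no` is invented for illustration, and the path in the example assumes the usual CentOS 7 location and an ifcfg file named after the interface in this thread; that file may not exist yet if the connection was auto-created by NetworkManager.

```shell
#!/bin/sh
# set_nm_controlled_no FILE  (hypothetical helper)
# Ensure FILE contains exactly one NM_CONTROLLED line, set to "no".
set_nm_controlled_no() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp="${file}.tmp.$$"
    {
        grep -v '^NM_CONTROLLED=' "$file"   # keep every other setting
        echo 'NM_CONTROLLED=no'             # then add the one we want
    } > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example, as root (path is an assumption, see above):
#   set_nm_controlled_no /etc/sysconfig/network-scripts/ifcfg-enp0s25
#
# Switching to the legacy network service on CentOS 7 typically looks like:
#   systemctl stop NetworkManager
#   systemctl disable NetworkManager
#   systemctl enable network
#   systemctl restart network
```

Writing back with `cat` instead of `mv` keeps the original file's ownership, mode, and SELinux context. Do this from the console rather than over SSH, since the interface will drop while the services are switched.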

Post by indefatigableman » 2021/01/28 22:38:52

Thanks for the suggestion, @Whoever!
