CentOS won't let me console or SSH in until restarted


Post by senaps » 2019/12/14 08:49:00

Hi there,
I have 7-8 CentOS 7 installations on the same type of hardware (Nexcom 3130 machines), all with the same software and updates.
Now one of these devices, after 2 years of running without any problems, is dropping off the network. We can't telnet to it, we can't browse the web application running on it, we can't SSH in, and we can't use the console. All we can do is restart it, and then it is fixed.
This is the second time this has happened; the previous occurrence was 2-3 months ago.
Where should I look for the problem? There are 2 devices on the same network as this machine, and both are running exactly the same software. The only difference is that one is on an intranet segment covering the internal network, and the other is on the edge of the network, connected to a firewall and then the internet.
Load isn't the problem: we had fewer than 200,000 connections in the past 7 days, and the previous incident happened in a week with fewer than 5,000 connections in total.

So again, where should I look for the problem? How should I go about finding the cause of this behavior?

PS: the device is installed in a bank's datacenter, and every time I go there to check the logs etc., it takes a whole round of requests and letters between the managers of our company and theirs.


Post by TrevorH » 2019/12/14 14:02:04

Better get form filling then because your first step with problems of this nature will be to read /var/log/messages.


Post by senaps » 2019/12/15 09:38:16

Yeah, that was the first thing we checked, and there was nothing in it. Kind of strange. I have been searching for this, and no one seems to have seen such behaviour.


Post by aks » 2019/12/16 06:53:45

All we know right now is that the machine has "gone away". Apparently there's no pertinent information in the message log. Course you could check journal and dmesg too. You should also look at the log(s) on the connected network device(s) (switches, routers etc.) if that is possible.

Sudden remote problems of this nature are hard to troubleshoot (and form filling in does not help). Check your change control - what are the last (say) 10 changes to the node and the network? Are you/your company responsible for both the node and the network?

The "normal" way to go about this is to follow the OSI reference model.

Physical: is the node unplugged?
Data link: has the link gone down? Years ago I had a case where a "name branded" switch would switch a port off on overload (overload of the whole switch, not just the specific port); the fix was a firmware update from the vendor. I also recall a particular problem where ARP (IPv4) entries were "jumping" from port to port, in the sense that a MAC address was seen on one port and the next second on another, when the machines involved had not physically moved.
Network: does the node still have a valid IP address? For example, a DHCP server may send a NAK to the node due to the DHCP configuration being wrong (or because there are multiple competing DHCP servers), leaving the node without a valid IP address. What about routes? Does the affected node still have the correct routes, and have any changed? Can the node get from here to the next hop when affected? If not, why not? If it can, then the problem is probably inbound at the next hop.
Transport: we know TCP is not working. Is UDP (and how would you know)? What about ICMP?
Session: are only new connections affected?
The remaining upper layers are not relevant until we know these lower layers are okay.
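
As a rough illustration of the transport-layer question in the list above, here is a minimal Python 3 sketch for probing the node from another machine on the same network. It only distinguishes how a TCP connection attempt fails: a refusal means the node's IP stack answered, while a timeout means nothing came back at all. The function names and labels are made up for this example, and the address in the usage comment is a placeholder, not anything from this thread.

```python
import errno
import socket


def classify(exc):
    """Map a socket exception to a coarse label (labels are illustrative)."""
    if isinstance(exc, ConnectionRefusedError):
        return "refused"      # a reset came back: the remote IP stack is alive
    if isinstance(exc, socket.timeout):
        return "timeout"      # no reply at all within the time limit
    if isinstance(exc, OSError) and exc.errno in (errno.EHOSTUNREACH,
                                                  errno.ENETUNREACH):
        return "unreachable"  # local stack or a router reported no path
    return "error"            # anything else, e.g. name resolution failure


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection and report 'open' or a failure label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except OSError as exc:
        return classify(exc)


# Example (placeholder address and ports):
#   for port in (22, 80, 1):
#       print(port, probe_tcp("192.0.2.10", port))
```

If a closed port answers "refused" while SSH times out, that would point at the service rather than the network. A "timeout" on every port points lower down the stack, or at something silently dropping packets on the way.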

Beyond that, is there any correlation between events on the network and the node going down?

Beyond the beyond, try not to tear your hair out as the pressure to resolve goes up.


Post by senaps » 2019/12/16 11:43:34

Awesome checklist...
I have checked: we have no connection whatsoever (TCP/UDP).
The strange part for me is that I can't even use the console.
I connect a monitor and a keyboard, and all I get is a blank screen. I hit all the keys in case it's asleep, but nothing changes.
But applications seem to be working: we have an app controlling an LCD on the device, which shows a message with dots as a progress bar.
The progress bar is up and running, but all connectivity is gone.
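
Since user-space applications apparently keep running while the box is unreachable, one option is to have the node record its own network state to disk at intervals, so that the evidence survives the restart. Below is a minimal Python 3 sketch of that idea, not a finished tool. It reads `/sys/class/net/*/operstate` and `/proc/net/route`, which are standard Linux interfaces. The log path and all function names are hypothetical, scheduling (for example from cron) is left out, and the gateway decoding assumes a little-endian machine such as x86.

```python
import os
import socket
import struct
import time


def parse_default_gateways(route_text):
    """Return [(iface, gateway_ip)] for default routes in /proc/net/route text."""
    gateways = []
    for line in route_text.splitlines()[1:]:        # skip the header row
        fields = line.split()
        if len(fields) < 3 or fields[1] != "00000000":
            continue                                # not a default route
        # The gateway column is a hex IPv4 address in little-endian order.
        gw = socket.inet_ntoa(struct.pack("<L", int(fields[2], 16)))
        gateways.append((fields[0], gw))
    return gateways


def link_states(sys_net="/sys/class/net"):
    """Return {iface: operstate} as reported by the kernel."""
    states = {}
    for iface in sorted(os.listdir(sys_net)):
        try:
            with open(os.path.join(sys_net, iface, "operstate")) as fh:
                states[iface] = fh.read().strip()
        except OSError:
            states[iface] = "unknown"
    return states


def snapshot_line(route_text, states, now=None):
    """Format one timestamped log line from already-collected data."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    links = ",".join("%s=%s" % kv for kv in sorted(states.items()))
    gws = ",".join("%s via %s" % (gw, iface)
                   for iface, gw in parse_default_gateways(route_text)) or "none"
    return "%s links[%s] default-gw[%s]" % (stamp, links, gws)


def append_snapshot(path="/var/log/netstate.log"):  # hypothetical log path
    """Collect the current state and append it, forcing it out to disk."""
    with open("/proc/net/route") as fh:
        route_text = fh.read()
    with open(path, "a") as out:
        out.write(snapshot_line(route_text, link_states()) + "\n")
        out.flush()
        os.fsync(out.fileno())  # make sure the line survives a hard reset
```

After the next incident, the last lines written before the restart would show whether the link was still up and the default route still present while the node was unreachable, and the timestamps give something to correlate against the switch and firewall logs.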


Post by MartinR » 2019/12/16 12:22:27

Is your memory OK? Standard advice is to run memtest overnight (or better over the weekend). I've seen similar problems on a system with a failed DIMM. It was a diskless system and when it ran out of memory it just hung; IIRC that would be under C5.8.
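
This does not replace memtest, which is what finds a failed DIMM, but for the "ran out of memory and hung" case a small logger can show whether available memory was shrinking before the hang. A minimal Python 3 sketch that parses `/proc/meminfo`; the function names are made up for this example, and it falls back to `MemFree` on the assumption that `MemAvailable` may be missing on some kernels.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: value}, values mostly in kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_fraction(info):
    """Fraction of RAM still available, or None if MemTotal is missing."""
    total = info.get("MemTotal", 0)
    if total <= 0:
        return None
    # Prefer MemAvailable; fall back to MemFree if the kernel lacks it.
    avail = info.get("MemAvailable", info.get("MemFree", 0))
    return float(avail) / total


# Example:
#   with open("/proc/meminfo") as fh:
#       print(available_fraction(parse_meminfo(fh.read())))
```

Logged every few minutes with a timestamp, a steadily falling fraction in the hours before a hang would suggest a leak rather than a network fault.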


Post by senaps » 2019/12/16 13:09:24

MartinR wrote:
Is your memory OK? Standard advice is to run memtest overnight (or better over the weekend). I've seen similar problems on a system with a failed DIMM. It was a diskless system and when it ran out of memory it just hung; IIRC that would be under C5.8.

I'll check, thanks for the tip.
