clvmd is hanging on startup of CentOS 6 cluster with iscsi

eModul
Posts: 1
Joined: 2020/11/20 07:02:36

Post by eModul » 2020/11/20 17:41:42

Dear community,

I've been digging into the following problem for several days now, but have not found the root cause.
There was a short power surge on the following older, redundant production system:

Hardware configuration
Cluster of 5 servers (1 Dell PowerEdge M520, 4 Dell PowerEdge M620).
Attached SAN (iSCSI storage): Dell PowerVault MD32xxi
RAID-6

Operating system

# cat /etc/redhat-release
CentOS release 6.10 (Final)
# uname -a
Linux srv01.local 2.6.32-754.29.2.el6.x86_64 #1 SMP Tue May 12 17:39:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
I know that this version of CentOS is at its end of life, but I'd like to get this system working again.
  • One of the four switches failed, but the rest are still working. I reconfigured the servers so that they no longer use the failed route over em1 and use only em2.
  • I booted the servers into runlevel 1 and managed to bring up their network and the cluster as well.

# service cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]
#

# clustat
Cluster Status for My-Cluster @ Tue Nov 17 14:37:23 2020
Member Status: Quorate

 Member Name                                                 ID   Status
 ------ ----                                                 ---- ------
 srv01.local                                             1 Online, Local
 srv02.local                                             2 Offline
 srv03.local                                             3 Online
 srv04.local                                             4 Online
 srv05.local                                             5 Online

#
  • However, clvmd then hangs indefinitely on startup.
  • iSCSI is working correctly.
  • Multipath also seems to be correct.

# multipath -ll
projekte (3690b11c00053b79f0000036a524b1980) dm-4 DELL,MD32xxi
size=1.8T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=9 status=active
  `- 15:0:0:1 sdc 8:32 active ready running
grundlage (3690b11c00053b858000001f43a105417) dm-3 DELL,MD32xxi
size=350G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=14 status=active
  `- 15:0:0:0 sdb 8:16 active ready running
#

# lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                           8:0    0 278,9G  0 disk
+-sda1                        8:1    0   500M  0 part  /boot
+-sda2                        8:2    0 278,4G  0 part
  +-vg_srv01-lv_root (dm-0) 253:0    0    50G  0 lvm   /
  +-vg_srv01-lv_swap (dm-1) 253:1    0  15,7G  0 lvm   [SWAP]
  +-vg_srv01-lv_home (dm-2) 253:2    0 212,7G  0 lvm   /old_home
sdb                           8:16   0   350G  0 disk
+-grundlage (dm-3)          253:3    0   350G  0 mpath
  +-grundlagep1 (dm-5)      253:5    0   350G  0 part
sdc                           8:32   0   1,9T  0 disk
+-projekte (dm-4)           253:4    0   1,9T  0 mpath
  +-projektep1 (dm-6)       253:6    0   1,3T  0 part
#
I hope that you can help me and give me a hint in the right direction. I'm happy to provide any further information you need.
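
One detail in the clustat output above is that srv02.local is Offline. On cman-based clusters, DLM recovery generally waits until a failed node has been fenced, so it may be worth confirming exactly which members are down before looking at clvmd itself. The following is only a minimal sketch: `offline_members` is a hypothetical helper name, and it assumes the clustat column layout shown in this post (name, numeric ID, status).

```shell
#!/bin/sh
# Hypothetical helper (not part of cman/rgmanager): reads `clustat` output
# on stdin and prints the name of every member whose status is "Offline".
# Assumes the three-column member table shown above: name, numeric ID, status.
offline_members() {
    # Only rows with a numeric ID in column 2 are member rows; this skips
    # the "Cluster Status", "Member Status", header, and separator lines.
    awk '$2 ~ /^[0-9]+$/ && $3 ~ /^Offline/ { print $1 }'
}

# Example use on a cluster node (as root):
#   clustat | offline_members
```

If a node is listed here, `fence_tool ls` and `dlm_tool ls` on a surviving node should show whether fencing or a lockspace is still waiting on it.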

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: clvmd is hanging on startup of CentOS 6 cluster with iscsi

Post by TrevorH » 2020/11/20 18:02:49

If you're using clvmd then you must also be running dlm. Do you have firewall rules that allow the dlm port (21064?) through one interface and not via the other?
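
A rough way to check this, as a sketch only: the hypothetical helper `dlm_ifaces` below filters `iptables -S` output for rules that match the DLM port and shows which input interface (`-i`) each rule is bound to. It assumes plain `--dport` rules and the default port 21064; rules written with `-m multiport --dports` would need a different match.

```shell
#!/bin/sh
# Hypothetical helper: reads `iptables -S` output on stdin and prints every
# rule that matches the given destination port, prefixed with the input
# interface it is restricted to ("any" if the rule has no -i option).
# The port defaults to 21064, the usual DLM TCP port (an assumption here).
dlm_ifaces() {
    port="${1:-21064}"
    awk -v port="$port" '
        {
            iface = "any"; hit = 0
            for (i = 1; i < NF; i++) {
                if ($i == "-i") iface = $(i + 1)
                if ($i == "--dport" && $(i + 1) == port) hit = 1
            }
            if (hit) print iface ": " $0
        }'
}

# Example use on a cluster node (as root):
#   iptables -S INPUT | dlm_ifaces
```

If the only matching rule is prefixed with em1 and nothing is listed for em2 or "any", DLM traffic over the remaining interface is probably being dropped, which would fit clvmd waiting forever.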

And, yes, CentOS 6 is nearly EOL, so you need to be looking at how to replace this with CentOS 7 or 8. You have 10 more days...
