Install Pacemaker and Corosync on SLES 11 SP3 - Postgres streaming - Part1

Guess what: no Bitcoins from ANYONE. Please buy some and donate them to me.... - 16tb2Rgn4uDptrEuR94BkhQAZNgfoMj3ug

I recently had the 'pleasure' of installing Pacemaker and Corosync on a 4-node cluster. The configuration originally called for 2 clusters, hence the naming: Cluster1 and Cluster2, cl1 and cl2.

As it normally goes in the IT world, things change and I implemented the following configuration.

Cluster (only one now):
4 nodes, all running SLES 11 SP3: cl1_lb1, cl1_lb2, cl2_lb1, cl2_lb2
2 Postgres nodes, one master and one slave using Postgres streaming replication: cl1_lb1 and cl2_lb2

2 applications, one called CBC, the other CGW

CBC runs on one of the Postgres machines (either cl1_lb1 or cl2_lb1), depending on which one is the master, and uses a VIP address so the outside world can communicate with it.

CGW runs on all the nodes and uses a VIP, in round-robin fashion, to serve Apache2 pages.



Most of this was made possible by the brilliant work of the guys over at ClusterLabs:

http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster


See diagram below


So let's get to it then, I hear you say.
First off, let's get all the Ethernet interfaces defined: eth0 will be used for the VIPs, and eth1 for Corosync (the heartbeat, etc.).
Configure the following files to reflect the above.

CLUSTER1 - LOAD BALANCER1 - cl1_lb1

cl1_lb1:~ # cd /etc/sysconfig/network/
cl1_lb1:/etc/sysconfig/network # ls -ltr
total 96
drwx------ 2 root root  4096 May  5  2010 providers
-rw-r--r-- 1 root root   239 May 11  2013 ifroute-lo
-rw-r--r-- 1 root root 29333 May 11  2013 ifcfg.template
-rw------- 1 root root   172 May 11  2013 ifcfg-lo
drwxr-xr-x 2 root root  4096 Feb 10 13:43 scripts
drwxr-xr-x 2 root root  4096 Feb 10 13:43 if-up.d
drwxr-xr-x 2 root root  4096 Feb 10 13:43 if-down.d
-rw-r--r-- 1 root root 10590 Feb 10 13:51 dhcp
-rw-r--r-- 1 root root    25 Feb 11 05:16 routes
-rw-r--r-- 1 root root   137 Feb 11 05:19 ifcfg-eth0
-rw-r--r-- 1 root root    78 Feb 11 13:43 ifcfg-eth1
-rw-r--r-- 1 root root 14206 Mar 12 08:54 config
cl1_lb1:/etc/sysconfig/network # cat ifcfg-eth0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.28.200.166'
NETMASK='255.255.255.0'
NM_CONTROLLED='no'
USERCONTROL='no'
cl1_lb1:/etc/sysconfig/network # cat ifcfg-eth1
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.16.0.1/24'
NM_CONTROLLED='no'
cl1_lb1:/etc/sysconfig/network #

CLUSTER1 - LOAD BALANCER2 - cl1_lb2

cl1_lb2:~ # cd /etc/sysconfig/network/
cl1_lb2:/etc/sysconfig/network # ls -ltr
total 96
drwx------ 2 root root  4096 May  5  2010 providers
-rw-r--r-- 1 root root   239 May 11  2013 ifroute-lo
-rw-r--r-- 1 root root 29333 May 11  2013 ifcfg.template
-rw------- 1 root root   172 May 11  2013 ifcfg-lo
drwxr-xr-x 2 root root  4096 Feb 10 13:29 scripts
drwxr-xr-x 2 root root  4096 Feb 10 13:29 if-up.d
drwxr-xr-x 2 root root  4096 Feb 10 13:29 if-down.d
-rw-r--r-- 1 root root 10590 Feb 10 13:36 dhcp
-rw-r--r-- 1 root root    25 Feb 11 07:48 routes
-rw-r--r-- 1 root root 14206 Feb 11 07:48 config
-rw-r--r-- 1 root root   137 Feb 11 07:57 ifcfg-eth0
-rw-r--r-- 1 root root    78 Feb 11 09:03 ifcfg-eth1
cl1_lb2:/etc/sysconfig/network # cat ifcfg-eth0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.28.200.167'
NETMASK='255.255.255.0'
NM_CONTROLLED='no'
USERCONTROL='no'
cl1_lb2:/etc/sysconfig/network # cat ifcfg-eth1
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.16.0.2/24'
NM_CONTROLLED='no'
cl1_lb2:/etc/sysconfig/network #

CLUSTER2 - LOAD BALANCER1 - cl2_lb1

cl2_lb1:~ # cd /etc/sysconfig/network/
cl2_lb1:/etc/sysconfig/network # ls -ltr
total 96
drwx------ 2 root root  4096 May  5  2010 providers
-rw-r--r-- 1 root root   239 May 11  2013 ifroute-lo
-rw-r--r-- 1 root root 29333 May 11  2013 ifcfg.template
-rw------- 1 root root   172 May 11  2013 ifcfg-lo
drwxr-xr-x 2 root root  4096 Dec 20  2013 scripts
drwxr-xr-x 2 root root  4096 Dec 20  2013 if-up.d
drwxr-xr-x 2 root root  4096 Dec 20  2013 if-down.d
-rw-r--r-- 1 root root 10590 Dec 20  2013 dhcp
-rw-r--r-- 1 root root    25 Dec 21  2013 routes
-rw-r--r-- 1 root root   104 Dec 21  2013 ifcfg-eth0
-rw-r--r-- 1 root root 14206 Jan 21  2014 config
-rw-r--r-- 1 root root    79 Feb 11 09:04 ifcfg-eth1
cl2_lb1:/etc/sysconfig/network # cat ifcfg-eth0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.28.200.168'
NETMASK='255.255.255.0'
NM_CONTROLLED='no'
cl2_lb1:/etc/sysconfig/network # cat ifcfg-eth1
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.16.0.3/24'
NM_CONTROLLED='no'
cl2_lb1:/etc/sysconfig/network #

CLUSTER2 - LOAD BALANCER2 - cl2_lb2

cl2_lb2:~ # cd /etc/sysconfig/network/
cl2_lb2:/etc/sysconfig/network # ls -ltr
total 96
drwx------ 2 root root  4096 May  5  2010 providers
-rw-r--r-- 1 root root   239 May 11  2013 ifroute-lo
-rw-r--r-- 1 root root 29333 May 11  2013 ifcfg.template
-rw------- 1 root root   172 May 11  2013 ifcfg-lo
drwxr-xr-x 2 root root  4096 Feb  9 15:16 scripts
drwxr-xr-x 2 root root  4096 Feb  9 15:16 if-up.d
drwxr-xr-x 2 root root  4096 Feb  9 15:16 if-down.d
-rw-r--r-- 1 root root 10590 Feb  9 15:22 dhcp
-rw-r--r-- 1 root root    25 Feb 10 07:52 routes
-rw-r--r-- 1 root root 14206 Feb 10 07:52 config
-rw-r--r-- 1 root root   137 Feb 10 08:02 ifcfg-eth0
-rw-r--r-- 1 root root    78 Feb 11 09:04 ifcfg-eth1
cl2_lb2:/etc/sysconfig/network # cat ifcfg-eth0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.28.200.169'
NETMASK='255.255.255.0'
NM_CONTROLLED='no'
USERCONTROL='no'
cl2_lb2:/etc/sysconfig/network # cat ifcfg-eth1
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='172.16.0.4/24'
NM_CONTROLLED='no'
cl2_lb2:/etc/sysconfig/network #

You have to restart the network for this configuration to load. Restart the network on ALL the nodes with the following:

cl2_lb2:/etc/sysconfig/network # service network restart
Shutting down network interfaces:
    eth0      device: Intel Corporation 82579LM Gigabit Network Con                                                                                                                       done
    eth1      device: D-Link System Inc DGE-528T Gigabit Ethernet A                                                                                                                       done
Shutting down service network  .  .  .  .  .  .  .  .  .                                                                                                                                  done
Hint: you may set mandatory devices in /etc/sysconfig/network/config
Setting up network interfaces:
    eth0      device: Intel Corporation 82579LM Gigabit Network Con
    eth0      IP address: 172.28.200.169/24                                                                                                                                               done
    eth1      device: D-Link System Inc DGE-528T Gigabit Ethernet A
    eth1      IP address: 172.16.0.4/24                                                                                                                                                   done
Setting up service network  .  .  .  .  .  .  .  .  .  .                                                                                                                                  done
cl2_lb2:/etc/sysconfig/network #
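With the interfaces up, it's worth confirming that every node can reach every internal (eth1) address before touching Corosync. A minimal sketch using the 172.16.0.x addresses configured above; run it on each node:

```shell
# Ping each Corosync back-net address once and report reachability.
check_backnet() {
  for ip in 172.16.0.1 172.16.0.2 172.16.0.3 172.16.0.4; do
    if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
      echo "$ip ok"
    else
      echo "$ip UNREACHABLE"
    fi
  done
}
# check_backnet
```

Any UNREACHABLE line means eth1 or the cabling needs attention before Corosync will form a ring.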

Update the /etc/hosts file on all the nodes; for my internal addresses, I used a different hostname (prefixed int_).
To populate /etc/hosts, I used the following mappings:

172.28.200.166 - cl1_lb1
172.16.0.1 - int_cl1_lb1

172.28.200.167 - cl1_lb2
172.16.0.2 - int_cl1_lb2

172.28.200.168 - cl2_lb1
172.16.0.3 - int_cl2_lb1

172.28.200.169 - cl2_lb2
172.16.0.4 - int_cl2_lb2


cl1_lb1:/ # cat /etc/hosts
#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#    
# IP-Address  Full-Qualified-Hostname  Short-Hostname
#


# special IPv6 addresses
::1             localhost ipv6-localhost ipv6-loopback

fe00::0         ipv6-localnet

ff00::0         ipv6-mcastprefix
ff02::1         ipv6-allnodes
ff02::2         ipv6-allrouters
ff02::3         ipv6-allhosts
172.28.200.166  cl1_lb1.xxxx.com    cl1_lb1
172.16.0.1      int_cl1_lb1.xxxx.com        int_cl1_lb1
172.28.200.167  cl1_lb2.xxxx.com    cl1_lb2
172.16.0.2      int_cl1_lb2.xxxx.com        int_cl1_lb2
172.28.200.168  cl2_lb1.xxxx.com    cl2_lb1
172.16.0.3      int_cl2_lb1.xxxx.com        int_cl2_lb1
172.28.200.169  cl2_lb2.xxxx.com    cl2_lb2
172.16.0.4      int_cl2_lb2.xxxx.com        int_cl2_lb2
127.0.0.1 localhost.localdomain localhost
cl1_lb1:/ # 
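To avoid editing /etc/hosts by hand on four machines, the same entries can be appended in one go. A sketch: the function takes the target file as an argument so you can try it against a scratch file first (xxxx.com is the placeholder domain from the listing above):

```shell
# Append the cluster's public and internal (int_*) name mappings to a hosts file.
add_cluster_hosts() {
  cat >> "$1" <<'EOF'
172.28.200.166  cl1_lb1.xxxx.com        cl1_lb1
172.16.0.1      int_cl1_lb1.xxxx.com    int_cl1_lb1
172.28.200.167  cl1_lb2.xxxx.com        cl1_lb2
172.16.0.2      int_cl1_lb2.xxxx.com    int_cl1_lb2
172.28.200.168  cl2_lb1.xxxx.com        cl2_lb1
172.16.0.3      int_cl2_lb1.xxxx.com    int_cl2_lb1
172.28.200.169  cl2_lb2.xxxx.com        cl2_lb2
172.16.0.4      int_cl2_lb2.xxxx.com    int_cl2_lb2
EOF
}
# add_cluster_hosts /etc/hosts
```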


Set up Corosync: edit the file /etc/corosync/corosync.conf and make the changes as per the printout below.

For the MultiCast address, see this Wiki article
http://en.wikipedia.org/wiki/Multicast_address

I used 239.192.6.119 and port 5405, just to keep in line with all the examples I could find

cl1_lb1:/etc/sysconfig/network # cd /etc/corosync/
cl1_lb1:/etc/corosync # cat corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
 # Run as root - this is necessary to be able to manage
 # resources with Pacemaker
 user:           root
 group:          root
}

service {
 # Load the Pacemaker Cluster Resource Manager
 ver:            0
 name:           pacemaker
 use_mgmtd:      yes
 use_logd:       yes
}

totem {
 # The only valid version is 2
 version: 2

 # How long before declaring a token lost (ms)
 token:          5000

 # How many token retransmits before forming a new configuration
 token_retransmits_before_loss_const: 10

 # How long to wait for join messages in the membership protocol (ms)
 join:           60

 # How long to wait for consensus to be achieved before starting
 # a new round of membership configuration (ms)
 consensus:      6000

 # Turn off the virtual synchrony filter
 vsftype:        none

 # Number of messages that may be sent by one processor on
 # receipt of the token
 max_messages:   20

 # Limit generated nodeids to 31-bits (positive signed integers)
 # you would set it to 'yes', the new option 'new' means wiping 
 # off the highest bit in network order to avoid possible nodeid
 # conflicting.
 clear_node_high_bit: new

 # secauth: Enable mutual node authentication. If you choose to
 # enable this ("on"), then do remember to create a shared
 # secret with "corosync-keygen".
 secauth: off

 # How many threads to use for encryption/decryption
 threads: 0

 # Optionally assign a fixed node id (integer)
 # nodeid: 124

 # interface: define at least one interface to communicate
 # over. If you define more than one interface stanza, you must
 # also set rrp_mode.
 interface {
                # Rings must be consecutively numbered, starting at 0.
  ringnumber: 0
  # This is normally the *network* address of the
  # interface to bind to. This ensures that you can use
  # identical instances of this configuration file
  # across all your cluster nodes, without having to
  # modify this option.
  bindnetaddr: 172.16.0.0
  # However, if you have multiple physical network
  # interfaces configured for the same subnet, then the
  # network address alone is not sufficient to identify
  # the interface Corosync should bind to. In that case,
  # configure the *host* address of the interface
  # instead:
  # bindnetaddr: 192.168.1.1
  # When selecting a multicast address, consider RFC
  # 2365 (which, among other things, specifies that
  # 239.255.x.x addresses are left to the discretion of
  # the network administrator). Do not reuse multicast
  # addresses across multiple Corosync clusters sharing
  # the same network.
  mcastaddr: 239.192.6.119
  # Corosync uses the port you specify here for UDP
  # messaging, and also the immediately preceding
  # port. Thus if you set this to 5405, Corosync sends
  # messages over UDP ports 5405 and 5404.
  mcastport: 5405
  # Time-to-live for cluster communication packets. The
  # number of hops (routers) that this ring will allow
  # itself to pass. Note that multicast routing must be
  # specifically enabled on most network routers.
  ttl: 1
 }
}

logging {
 # Log the source file and line where messages are being
 # generated. When in doubt, leave off. Potentially useful for
 # debugging.
 fileline: off
 # Log to standard error. When in doubt, set to no. Useful when
 # running in the foreground (when invoking "corosync -f")
 to_stderr: no
 # Log to a log file. When set to "no", the "logfile" option
 # must not be set.
 to_logfile: yes
 logfile: /var/log/cluster/corosync.log
 # Log to the system log daemon. When in doubt, set to yes.
 to_syslog: yes
 syslog_facility: daemon
 # Log debug messages (very verbose). When in doubt, leave off.
 debug: off
 # Log messages with time stamps. When in doubt, set to on
 # (unless you are only logging to syslog, where double
 # timestamps can be annoying).
 timestamp: off
 logger_subsys {
  subsys: AMF
  debug: off
 }
}
quorum {
           provider: corosync_votequorum
           expected_votes: 4
}
cl1_lb1:/etc/corosync #

The file above (corosync.conf) has to be copied to ALL the nodes in the cluster. The file contents stay the same, i.e. all nodes must have identical contents in this file.
To set up ssh between the nodes, use the following.

For example, from cl1_lb1 I generate cl1_lb1's public key and then copy it over to all the other nodes. Follow this procedure on all the nodes in the cluster, so that any node can ssh to any other node without using passwords.

cl1_lb1:/etc/corosync # ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
17:dc:a5:d0:cb:13:cf:e8:d6:08:61:75:37:f3:5c:84 [MD5] root@cl1_lb1
The key's randomart image is:
+--[ RSA 2048]----+
|          .o. o=+|
|         .ooo+Eo=|
|         .ooo*  o|
|          ..= o  |
|        S .o +   |
|         .  + .  |
|           .     |
|                 |
|                 |
+--[MD5]----------+
cl1_lb1:/etc/corosync # ssh-copy-id -i /root/.ssh/id_rsa.pub root@cl1_lb2
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@cl1_lb2'"
and check to make sure that only the key(s) you wanted were added.

cl1_lb1:/etc/corosync # ssh-copy-id -i /root/.ssh/id_rsa.pub root@cl2_lb1
The authenticity of host 'cl2_lb1 (172.28.200.168)' can't be established.
ECDSA key fingerprint is 65:11:64:26:4b:21:31:36:f9:8d:51:fb:fe:65:44:45 [MD5].
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@cl2_lb1'"
and check to make sure that only the key(s) you wanted were added.

cl1_lb1:/etc/corosync # ssh-copy-id -i /root/.ssh/id_rsa.pub root@cl2_lb2
The authenticity of host 'cl2_lb2 (172.28.200.97)' can't be established.
ECDSA key fingerprint is 05:80:c1:c1:5f:5e:55:09:6b:6d:da:18:74:af:86:16 [MD5].
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@cl2_lb2'"
and check to make sure that only the key(s) you wanted were added.

cl1_lb1:/etc/corosync # ssh root@cl1_lb2
Last login: Tue Mar 25 14:57:01 2014 from cl1_lb1.xxxx.com
cl1_lb2:~ # exit
logout
Connection to cl1_lb2 closed.
cl1_lb1:/etc/corosync # ssh root@cl2_lb1
Last login: Tue Mar 25 14:57:04 2014 from cl1_lb1.xxxx.com
cl2_lb1:~ # exit
logout
Connection to cl2_lb1 closed.
cl1_lb1:/etc/corosync # ssh root@cl2_lb2
Last login: Tue Mar 25 10:41:39 2014 from cl1_lb1.xxxx.com
cl2_lb2:~ # exit
logout
Connection to cl2_lb2 closed.
cl1_lb1:/etc/corosync # 

Test and make sure that you can ssh from any node to any other node using the root user with ssh root@clx_lbx
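The manual logins above can be condensed into a loop. BatchMode makes ssh fail instead of prompting, so a broken key shows up immediately. A sketch; run it from each node in turn and adjust the peer list to exclude the node you are on:

```shell
# Check passwordless root ssh to each peer; prints the remote hostname on success.
check_ssh() {
  for n in cl1_lb2 cl2_lb1 cl2_lb2; do
    ssh -o BatchMode=yes "root@$n" hostname || echo "FAILED: $n"
  done
}
# check_ssh
```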

Now you can copy the corosync.conf file to all the other nodes, using the command below.
Copy the file from cl1_lb1 to cl1_lb2, cl2_lb1 and cl2_lb2 (only the copy to cl1_lb2 is shown; repeat for the other two nodes).

cl1_lb1:/etc/corosync # scp /etc/corosync/corosync.conf root@cl1_lb2:/etc/corosync/corosync.conf
corosync.conf       100% 4051     4.0KB/s   00:00    
cl1_lb1:/etc/corosync #
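One scp per node works, but a loop plus a checksum comparison catches a stale copy. A sketch, using the peer names from /etc/hosts above and the passwordless root ssh set up earlier:

```shell
# Copy corosync.conf to every peer, then print md5sums from all nodes;
# all four checksums must be identical.
PEERS="cl1_lb2 cl2_lb1 cl2_lb2"
push_corosync_conf() {
  for n in $PEERS; do
    scp /etc/corosync/corosync.conf "root@$n:/etc/corosync/corosync.conf"
  done
  md5sum /etc/corosync/corosync.conf
  for n in $PEERS; do
    ssh "root@$n" md5sum /etc/corosync/corosync.conf
  done
}
# push_corosync_conf
```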

Whoopee, let's start Corosync.

As I want corosync to start when the node boots, I have to copy the corosync start script to /etc/init.d and then use chkconfig to set the run levels. This has to be done on all the nodes in the cluster. (Note: /usr/sbin/corosync is the daemon binary itself rather than an LSB init script; copying it as below does work, but it is what causes the 'missing LSB tags' warnings from insserv further down. If your corosync package already installed /etc/init.d/corosync, use that instead.)


cl1_lb1:/ # ls -ltr /usr/sbin/corosync
-rwxr-xr-x 1 root root 95184 May 11  2013 /usr/sbin/corosync
cl1_lb1:/ # cp /usr/sbin/corosync /etc/init.d/corosync
cl1_lb1:/ #

Set the run levels; this has to be done on all the nodes in the cluster.


cl1_lb1:/ # chkconfig --add corosync
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: warning: script 'S11corosync' missing LSB tags
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
insserv: warning: script 'corosync' missing LSB tags
insserv: warning: script 'corosync' missing LSB tags
insserv: script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:'  entry: please add even if empty.
corosync                  0:off  1:off  2:off  3:on   4:off  5:on   6:off
cl1_lb1:/ # chkconfig --level 35 corosync
corosync  on
cl1_lb1:/ # chkconfig --list | grep corosync
corosync                  0:off  1:off  2:off  3:on   4:off  5:on   6:off
cl1_lb1:/ # 
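Rather than repeating the chkconfig dance on each console, the enablement can be pushed to the remaining nodes over ssh. A sketch; it assumes the init script has already been put in place on every node as above:

```shell
# Enable corosync at run levels 3 and 5 on each peer node.
enable_corosync_peers() {
  for n in cl1_lb2 cl2_lb1 cl2_lb2; do
    ssh "root@$n" "chkconfig --add corosync && chkconfig corosync 35"
  done
}
# enable_corosync_peers
```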

And start corosync; again, start corosync on all the nodes in the cluster.

cl1_lb1:/ # /etc/init.d/corosync
cl1_lb1:/ #

Logging is enabled, corosync logs to /var/log/cluster

cl1_lb1:/ # ls -ltr /var/log/cluster/
total 2283568
-rw-rw---- 1 hacluster haclient 2336087690 Mar 30 09:03 corosync.log
cl1_lb1:/ # tail -f /var/log/cluster/corosync.log 
Mar 30 09:03:17 [4084] cl1_lb1        cib:     info: cib_process_request:  Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.1155.78)
Mar 30 09:03:17 [4084] cl1_lb1        cib:     info: crm_client_destroy:  Destroying 0 events
Mar 30 09:03:17 [4084] cl1_lb1        cib:     info: crm_client_new:  Connecting 0x8c1af0 for uid=0 gid=0 pid=10414 id=cadcc4b3-5edf-4bbb-a033-665539dc196d
Mar 30 09:03:17 [4084] cl1_lb1        cib:     info: cib_process_request:  Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.1155.78)
Mar 30 09:03:17 [4084] cl1_lb1        cib:     info: cib_process_request:  Completed cib_query operation for section //cib/configuration/nodes//node[@id='cl1_lb1']//instance_attributes//nvpair[@name='pgsql-data-status']: OK (rc=0, origin=local/crm_attribute/3, version=0.1155.78)
Mar 30 09:03:17 [4084] cl1_lb1        cib:     info: crm_client_destroy:  Destroying 0 events
Mar 30 09:03:18 [4084] cl1_lb1        cib:     info: crm_client_new:  Connecting 0x8c1af0 for uid=1102 gid=1004 pid=10418 id=8ab1f25a-bb8d-4c95-b06d-14dcd514c992
Mar 30 09:03:18 [4084] cl1_lb1        cib:     info: crm_client_destroy:  Destroying 0 events
Mar 30 09:03:19 [4084] cl1_lb1        cib:     info: crm_client_new:  Connecting 0x8c1af0 for uid=1102 gid=1004 pid=10423 id=2aeaf592-174e-4464-8d7f-205f96bb98bf
Mar 30 09:03:19 [4084] cl1_lb1        cib:     info: crm_client_destroy:  Destroying 0 events
Mar 30 09:03:20 [4084] cl1_lb1        cib:     info: crm_client_new:  Connecting 0x8c1af0 for uid=0 gid=0 pid=10490 id=03cc505c-3f94-47cc-b8e2-4757731987e7
Mar 30 09:03:20 [4084] cl1_lb1        cib:     info: cib_process_request:  Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.1155.78)
Mar 30 09:03:20 [4084] cl1_lb1        cib:     info: crm_client_destroy:  Destroying 0 events
^C
cl1_lb1:/ #
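Note that the log above is already 2.3 GB: Corosync will happily fill the disk, so a rotation rule is worth adding on every node. A sketch of a logrotate drop-in, assuming logrotate is installed (the policy values are illustrative):

```
# /etc/logrotate.d/corosync
/var/log/cluster/corosync.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
```

copytruncate keeps the daemon writing to the same open file, so nothing has to be restarted after rotation.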

Check that all is OK with the following commands.
Make sure that the correct IP address is reflected in the output; do this on all the nodes in the cluster.


cl1_lb1:/ # corosync-cfgtool -s
Printing ring status.
Local node ID 739246081
RING ID 0
 id = 172.16.0.1
 status = ring 0 active with no faults
cl1_lb1:/ # corosync-quorumtool -l
Nodeid     Votes  Name
739246081     1  int_cl1_lb1.xxxx.com
739246082     1  int_cl1_lb2.xxxx.com
739246083     1  int_cl2_lb1.xxxx.com
739246084     1  int_cl2_lb2.xxxx.com
cl1_lb1:/ # corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.739246081.ip=r(0) ip(172.16.0.1) 
runtime.totem.pg.mrp.srp.members.739246081.join_count=1
runtime.totem.pg.mrp.srp.members.739246081.status=joined
runtime.totem.pg.mrp.srp.members.739246082.ip=r(0) ip(172.16.0.2) 
runtime.totem.pg.mrp.srp.members.739246082.join_count=1
runtime.totem.pg.mrp.srp.members.739246082.status=joined
runtime.totem.pg.mrp.srp.members.739246083.ip=r(0) ip(172.16.0.3) 
runtime.totem.pg.mrp.srp.members.739246083.join_count=1
runtime.totem.pg.mrp.srp.members.739246083.status=joined
runtime.totem.pg.mrp.srp.members.739246084.ip=r(0) ip(172.16.0.4) 
runtime.totem.pg.mrp.srp.members.739246084.join_count=1
runtime.totem.pg.mrp.srp.members.739246084.status=joined
cl1_lb1:/ #
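With four nodes, the members output should show exactly four status=joined lines. A small helper makes that a one-liner check; a sketch (count_joined is a name I made up, and it simply counts joined members on stdin):

```shell
# Count cluster members reporting status=joined
# (reads "corosync-objctl | grep members" style output on stdin).
count_joined() {
  grep -c 'members\.[0-9]*\.status=joined'
}
# Usage: corosync-objctl | count_joined    (should print 4 on this cluster)
```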

Take a break; if you get similar results to the above, your corosync is running.... WoooopEEEEE

Part 2 will follow after this short break..

Install Pacemaker and Corosync on SLES 11 SP3 - Postgres streaming - Part2
