Cluster Membership

Cluster membership means that the cluster must accurately determine which nodes are active at any given time. To take corrective action when a node fails, the surviving nodes must agree on when that node has departed. Membership must therefore be accurate and coordinated among the active members. This is critical because nodes can be added, rebooted, powered off, faulted, and so on. VCS uses its cluster membership capability to dynamically track the overall cluster topology. Cluster membership is maintained through the use of heartbeats.

LLT is responsible for sending and receiving heartbeat traffic over network links. Each node sends heartbeat packets on all configured LLT interfaces. By using an LLT ARP response, each node sends a single packet that tells all other nodes it is alive and also includes the communications information that other nodes need to send unicast messages back to the broadcaster.
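As a rough illustration of the idea, the following Python sketch models a heartbeat that both announces a node and carries the address peers need to reply. It is not LLT's packet format or code; the names Heartbeat, build_heartbeats, and learn_peer, and the addresses used, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heartbeat:
    """Hypothetical heartbeat record; not the real LLT packet layout."""
    node_id: int   # identifies the sender
    link: str      # interface the heartbeat was sent on
    address: str   # address peers can use to send unicast replies


def build_heartbeats(node_id, link_addresses):
    """Build one heartbeat per configured link for this node."""
    return [Heartbeat(node_id, link, addr)
            for link, addr in link_addresses.items()]


def learn_peer(peer_table, hb):
    """Record how to reach the sender on the link the heartbeat arrived on."""
    peer_table.setdefault(hb.node_id, {})[hb.link] = hb.address


# Example: node 1 announces itself on two links and a peer records both.
peer_table = {}
for hb in build_heartbeats(1, {"link1": "aa:bb:cc:00:00:01",
                               "link2": "aa:bb:cc:00:00:02"}):
    learn_peer(peer_table, hb)
```

The point of the sketch is that a single message does double duty: receiving it shows the sender is alive, and its contents tell the receiver where to send unicast traffic.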

LLT can be configured to designate specific links as high priority and others as low priority. High priority links carry cluster communications (GAB) as well as heartbeat traffic. Low priority links carry only heartbeat traffic unless all configured high priority links fail. At that point, LLT switches cluster communications to the first available low priority link. Traffic reverts to the high priority links as soon as they are available again.
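The link-selection rule described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not LLT's implementation: the Link class and select_cluster_links function are hypothetical names, and returning every available high priority link is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical model of one configured link."""
    name: str
    high_priority: bool
    up: bool = True


def select_cluster_links(links):
    """Return the links that should carry cluster (GAB) traffic.

    Any available high priority link wins. Only when every high priority
    link is down does traffic move to the first available low priority link.
    """
    high_up = [l for l in links if l.high_priority and l.up]
    if high_up:
        return high_up
    for l in links:
        if not l.high_priority and l.up:
            return [l]
    return []  # no usable link at all


links = [Link("hi1", True), Link("hi2", True), Link("lo1", False)]
normal = select_cluster_links(links)      # both high priority links

links[0].up = links[1].up = False
degraded = select_cluster_links(links)    # falls back to lo1

links[0].up = True
restored = select_cluster_links(links)    # reverts to hi1
```

Because the function is re-evaluated from current link state each time, the revert to high priority links needs no separate logic.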

LLT passes the status of the heartbeat to the Group Membership Services function of GAB. When LLT on a system no longer receives heartbeat messages from a peer on any configured LLT interface for a predefined time, LLT informs GAB of the heartbeat loss for that system. GAB receives input on the status of heartbeat from all nodes and makes its membership determination based on this information. When LLT informs GAB of a heartbeat loss, GAB marks the peer as down and excludes the peer from the cluster. In most configurations, the I/O fencing module is then used to ensure that the loss was not caused by a partition or split of the cluster interconnect. Once the new membership is determined, GAB informs processes on the remaining nodes that the cluster membership has changed. VCS then carries out failover actions as necessary to recover.
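The timeout rule above (a peer is lost only when it has been silent on every configured link for longer than the predefined time) can be sketched in Python. This is an illustrative model, not the LLT or GAB implementation: HeartbeatTracker, new_membership, the 5-second timeout, and the node and link names are all hypothetical, and the I/O fencing step is left out.

```python
class HeartbeatTracker:
    """Hypothetical tracker of the last heartbeat seen per (peer, link)."""

    def __init__(self, peers, links, timeout):
        self.timeout = timeout
        # Time of the most recent heartbeat on each (peer, link) pair.
        self.last_seen = {(p, l): 0.0 for p in peers for l in links}

    def record(self, peer, link, now):
        """Note that a heartbeat from `peer` arrived on `link` at time `now`."""
        self.last_seen[(peer, link)] = now

    def lost_peers(self, now):
        """Peers that have been silent on all links for longer than the timeout."""
        peers = {p for p, _ in self.last_seen}
        return sorted(
            p for p in peers
            if all(now - t > self.timeout
                   for (q, _), t in self.last_seen.items() if q == p)
        )


def new_membership(members, lost):
    """Membership after excluding the peers reported as lost."""
    return sorted(set(members) - set(lost))


tracker = HeartbeatTracker(peers=["B", "C"], links=["link1", "link2"], timeout=5.0)
for peer in ("B", "C"):
    for link in ("link1", "link2"):
        tracker.record(peer, link, now=10.0)

# B is still heard on one link; C goes silent on both.
tracker.record("B", "link2", now=14.0)

lost = tracker.lost_peers(now=16.0)
members = new_membership(["A", "B", "C"], lost)
```

In this run B stays in the membership because one link is still delivering its heartbeats, while C is excluded because it has been silent on every link past the timeout.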
