How To Build a Linux High Availability Cluster With Our Software
This is a high-level overview of how our cluster software can take you from a pile of computers to an operational HA Linux cluster. Our "ClusterMaker" High Availability Cluster / IP Load Balancing software can be used for two-node failover clustering, multi-node IP load balancing, or a combination of both. In this howto we examine only the two-node failover scenario.
Overview
In brief, you install our high availability cluster software on a server or other fast computer running Linux. At this time we support Red Hat Enterprise Linux 5.x and Centos 5.x. In an IP load balancing scenario you could then use ClusterMaker to add some computers to the cluster simply by setting them for PXE boot, turning them on, and allowing the master server to see them. At this point, you log in to the graphical web interface and with a few clicks you can add the new computers as cluster nodes. Simply reboot them and they will come online and run whatever services are set to start in runlevel 2. They operate in a "shared root", so whatever is installed on the master server can be run by any of the nodes.
Our Linux high availability cluster / IP load balancing software provides several important benefits:
- Data replication can be configured for any type of data
- Built in Virtual IP / service failover
- Built in IP load balancing across cluster nodes
- Worker node duplication: Take nodes from bare metal to online in a few clicks
- OS / application updates to master server propagate automatically to stateless nodes
- Combines several HA, load balancing, and scaling technologies under a single management interface
- Space efficient snapshot capability at the filesystem level makes rollbacks easy
There is no installation or management required on the cluster nodes. In fact, they can all be diskless nodes. They are completely stateless and every computer you boot into the cluster will come up fully loaded with the applications that are installed on the master. Nodes can be dynamically added and removed in seconds, and the cluster monitoring service will automatically add and remove them from IP load balancing as they pass or fail whatever service checks you define.
However, in this howto we are only dealing with high availability. To provide high availability, a Backup Master must be created. Running the 'master-maker' tool duplicates your primary server onto another server and configures real-time data replication and virtual IP address failover. The data replication is tied to the virtual IP: whichever server currently holds the virtual IP address replicates its data to the other server. Therefore, if the original server comes back online after a failover, the backup server will automatically bring the original's data back up to date.
For example: A highly available mysql server setup
For a common example, we'll use ClusterMaker to build a mysql failover cluster with just two nodes, a primary and a backup. We want ServerA to run mysql server and monitor the mysql daemon listening on port 3306. If the port fails to respond, ServerA should initiate a failover to ServerB. Likewise, if ServerA becomes unresponsive, ServerB should take over the virtual IP address and start serving clients current data within a few seconds. If ServerA later comes back online and sees ServerB working, it should NOT immediately take the IP address back. Instead, it should assume that ServerB has more current data and run data synchronization in reverse (from B --> A) to get back up to speed. This way we are immediately ready for more failovers, we don't have to stop mysql and manually copy data to get back in sync, and we never have to find our place in the mysql transaction log files.
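Since replication direction follows the virtual IP, it can help to see which server holds it at a given moment. The following is a minimal sketch, not part of ClusterMaker: the function names holds_vip and replication_direction are hypothetical, and it assumes the output format of `ip -o -4 addr show`.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with ClusterMaker).

# Reads `ip -o -4 addr show` style output on stdin and succeeds if the
# given virtual IP is configured on this host.
holds_vip() {
    vip=$1
    awk -v vip="$vip" '
        $3 == "inet" { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Prints which way replication flows under the rule that the server
# holding the virtual IP replicates to the other one.
replication_direction() {
    if holds_vip "$1"; then
        echo "this server -> peer"
    else
        echo "peer -> this server"
    fi
}

# Example, using the virtual IP that appears later in this howto:
#   ip -o -4 addr show | replication_direction 192.168.1.49
```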
Hardware
The first step is to obtain the hardware to use for a "master server". This is the computer that will control all the cluster nodes, run the web interface, and is the only one you will need to interact with in order to manage the cluster. It should have plenty of hard drive space, reasonably fast CPU(s), and at least 1 - 2 gigabytes of memory. Everything will run faster over a gigabit ethernet or faster network, but this is not required. As with all Linux high availability cluster and load balancing applications, the required network speed is dependent on your particular use case. Sometimes a faster network will speed things up considerably, and if the fastest possible I/O is required our software can use Infiniband as the interconnect that the cluster nodes use to communicate with the master server(s). In our simple mysql example, this is not necessary.
Linux Installation
You'll need to install Linux on the master server. At this time, there are two choices: Red Hat Enterprise Linux version 5.x, or Centos version 5.x. Centos is a free clone of Red Hat Enterprise Linux, so at a technical level these are identical. For complete instructions on how to install these versions of Linux, see our master server Linux installation guide. It will take you step by step through the process of installing Centos 5.x on typical computer hardware. Please note that if you use a 64 bit version of Centos / RHEL, then ALL of your diskless nodes must be 64 bit capable. They will operate as a shared root cluster where all nodes concurrently run the same operating system. If you use the standard 32 bit version, then 64 bit nodes will run fine but they will not be able to address more than 4 GB of memory each.
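Before choosing the 64 bit version, you may want to confirm that every machine that will join the cluster is 64 bit capable. A minimal sketch, assuming x86 hardware where the "lm" (long mode) flag in /proc/cpuinfo indicates a 64 bit capable CPU; the function name is a hypothetical helper and would be run on each candidate node:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the cpuinfo-style file lists the "lm"
# (long mode) flag. Defaults to /proc/cpuinfo when no file is given.
cpu_is_64bit_capable() {
    grep -E '^flags[[:space:]]*:' "${1:-/proc/cpuinfo}" | grep -qw lm
}

# Example:
#   cpu_is_64bit_capable && echo "64 bit capable" || echo "32 bit only"
```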
Prerequisites
If you already have RHEL or Centos installed, there are only a few prerequisites for installing ClusterMaker and deploying your scalable high availability cluster.
- /etc/hosts must show the correct hostname and IP address. See the example in the installation guide for more information.
- your kernel source headers must be installed and linked to /usr/src/linux. See the example in the installation guide for more information.
- the IP address the master server will use to communicate with the cluster nodes must be on eth0. For example, if the master is 10.0.0.5 and the clients will be 10.0.0.6, 10.0.0.7, and 10.0.0.8, then 10.0.0.5 needs to be on eth0. If this is not the case, it must be fixed before a successful installation can occur.
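The three prerequisites above can be checked by hand before running the installer. The sketch below is one possible way to do so; the function names are hypothetical and are not part of the ClusterMaker installer, which performs its own checks. The 10.0.0.5 address is the example value from the list.

```shell
#!/bin/sh
# Hypothetical pre-install checks (not part of the ClusterMaker installer).

# Succeeds if FILE (normally /etc/hosts) maps IP to NAME.
hosts_has_entry() {
    file=$1 ip=$2 name=$3
    awk -v ip="$ip" -v name="$name" '
        { sub(/#.*/, "") }                      # ignore comments
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Succeeds if the kernel source path exists (a symlink to a directory counts).
kernel_src_linked() {
    [ -d "${1:-/usr/src/linux}" ]
}

# Reads `ip -o -4 addr show` style output on stdin and succeeds if IP is
# configured on IFACE.
iface_has_ip() {
    iface=$1 ip=$2
    awk -v iface="$iface" -v ip="$ip" '
        $2 == iface && $3 == "inet" { split($4, a, "/"); if (a[1] == ip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example:
#   hosts_has_entry /etc/hosts 10.0.0.5 "$(hostname)" || echo "fix /etc/hosts"
#   kernel_src_linked /usr/src/linux || echo "install and link kernel headers"
#   ip -o -4 addr show | iface_has_ip eth0 10.0.0.5 || echo "move the IP to eth0"
```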
ClusterMaker Installation
Our ClusterMaker installation page covers this brief process in detail. But in a nutshell, all you need to do is run our installer. It will check for all the prerequisites and ask some basic questions like: What IP address should we use for the virtual IP that clients will connect to? What email address should cluster alerts be sent to? Then it will install and configure all the required software while you wait.
When this is complete, we can run master-maker and configure a backup master for high availability. Before doing this, connect a serial (null modem) cable between ServerA and ServerB, and verify connectivity by running:
cat < /dev/ttyS0
on one server and then echoing some text into /dev/ttyS0:
echo sometext > /dev/ttyS0
on the other. If the server running 'cat' displays what was echoed on the other server, the serial cable is functional. If for some reason you use a port other than COM1 (like /dev/ttyS1, for example), make sure to update the /etc/ha.d/ha.cf file with the correct serial port device after running master-maker. With this in place, we can now run master-maker:
cd /cluster/bin/
./master-maker
Master-maker will ask for the hostname of the backup master. If this name is not found in /etc/hosts, it will then ask for the IP address. After this it will ssh into the target server (ServerB, in our example) and ask one time for the root password. After this it will automatically configure passwordless ssh between the two servers, and it will then determine everything it needs to know in order to clone ServerA onto ServerB while leaving certain settings intact, like IP addresses and hard drive configurations. Now master-maker will build a special ram disk and configure ServerB to boot from it, and then finally it will reboot ServerB.
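If your serial link is not on /dev/ttyS0, the ha.cf change mentioned earlier could be scripted roughly as follows once master-maker has run. This is a sketch only: set_ha_serial is a hypothetical helper, and it assumes the file uses Heartbeat's standard "serial <device>" directive. Apply it on both servers; Heartbeat reads ha.cf at startup, so the change takes effect the next time the service is restarted.

```shell
#!/bin/sh
# Hypothetical helper: rewrites every "serial <device>" line in a Heartbeat
# ha.cf file so that it names the given device.
set_ha_serial() {
    file=$1 dev=$2
    tmp=$(mktemp) || return 1
    # Keep the directive and its whitespace, replace only the device path.
    sed "s|^\([[:space:]]*serial[[:space:]][[:space:]]*\).*|\1$dev|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example:
#   set_ha_serial /etc/ha.d/ha.cf /dev/ttyS1
```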
After the first sync has finished, you will need to reboot into the newly cloned ServerB. Note the instructions that appear after the sync is complete, and press a key during the boot up to manually choose the correct boot entry from the Grub screen. After the boot is complete, edit the grub configuration file at /boot/grub/menu.lst. Remove the "One Time Boot Entry" at the bottom, and set the "default=x" line correctly (probably default=0). Alternatively, you could restore the backup of the original file from /boot/grub/.menu.lst.orig.
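The restore alternative can be sketched as a small helper. The function name and the menu.lst.onetime copy are hypothetical choices made for this example; only the two file paths named above come from the procedure itself.

```shell
#!/bin/sh
# Hypothetical helper: restores the saved GRUB menu and prints the resulting
# "default=" line so it can be checked. Pass /boot/grub on a real system.
restore_grub_menu() {
    dir=${1:-/boot/grub}
    [ -f "$dir/.menu.lst.orig" ] || { echo "no backup found in $dir" >&2; return 1; }
    # Keep the one-time version for reference (file name is an assumption).
    cp -p "$dir/menu.lst" "$dir/menu.lst.onetime" &&
        cp -p "$dir/.menu.lst.orig" "$dir/menu.lst" &&
        grep '^default=' "$dir/menu.lst"
}

# Example:
#   restore_grub_menu /boot/grub
```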
ServerA and ServerB are identical now and data replication is flowing for the default directories:
( /cluster /etc /var /usr /lib /root )
This covers most parts of the server, but if you want to add more directories just add them to the configuration file and restart the service. Now we need to configure the monitoring system. Open the file /cluster/mon/mon.cf and uncomment this line:
hostgroup local_mysql localhost
and these lines:
watch local_mysql
    service mysql
        interval 10s
        randskew 4s
        monitor tcp.monitor -p 3306
        period hr {12am-11pm}
            alert cause_hb_failover.alert
            alert mail.alert -S "MYSQL monitor is requesting failover" EMAILADDRESS
            numalerts 1
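The watch block above asks mon to test whether port 3306 accepts TCP connections. To try roughly the same check by hand, a sketch like the following may help. It is not how tcp.monitor is implemented; it assumes bash with /dev/tcp support is installed, and it uses the `timeout` command only if the system provides one.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT can be opened.
tcp_port_open() {
    host=$1 port=$2
    if command -v timeout >/dev/null 2>&1; then
        # Give up after 3 seconds if the host does not answer at all.
        timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    else
        bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    fi
}

# Example:
#   tcp_port_open localhost 3306 && echo "mysql is listening" || echo "mysql port is down"
```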
If the word EMAILADDRESS is not already replaced with a valid email address for alerts, please do so now. After restarting mon with this command:
service mon restart
we can add mysql to the heartbeat configuration file /etc/ha.d/haresources. Open the file and add the name of the mysql daemon on the end like this:
sqlmaster.local IPaddr::192.168.1.49/24/eth0 MailTo::EMAILADDRESS::Cluster_Failover_Event mon syslog ipvsadm smb dhcpd tftpd watcher mysqld
As in mon.cf, EMAILADDRESS stands for the address that should receive alerts. Because Heartbeat now starts and stops mysqld itself, disable its normal start at boot:
chkconfig mysqld off
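Heartbeat starts and stops the services named in haresources, so mysqld should no longer be started by init. One way to confirm the chkconfig change took effect is to inspect the output of `chkconfig --list mysqld`. The helper below is a hypothetical sketch that assumes the usual "0:off 1:off ..." runlevel format.

```shell
#!/bin/sh
# Hypothetical helper: reads `chkconfig --list <service>` output on stdin and
# succeeds only if runlevels are listed and none of them is "on".
boot_start_disabled() {
    awk '
        /:on/  { on = 1 }
        /:off/ { seen = 1 }
        END { exit ((seen && !on) ? 0 : 1) }
    '
}

# Example:
#   chkconfig --list mysqld | boot_start_disabled || echo "mysqld still starts at boot"
```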
Customization
Now that everything is installed and you have a working configuration, the settings may be changed as you wish. Throughout our development, we left all of the original open source code untouched so that total customization was possible. Any Heartbeat configuration that works somewhere else will work here. All components remain independent of each other. As a result, you can change the Heartbeat configuration, the LVS configuration (not used in this howto, but present), and/or the Mon configuration without breaking our data synchronization process or the graphical monitoring system (Ganglia).
To keep an eye on resource utilization, go to the ClusterAdmin web interface at "http://master_ip_address" and log in with username "master" and password "admin". Since we have not added any IP Load Balancing nodes to this cluster, the only part of the web interface we are interested in is the Status, Performance Graphs page. This will bring up a new window (some pop-up blockers prevent this) where the performance history (CPU, memory, etc.) of all cluster nodes can be seen at a glance.