Cloud computing opens up huge opportunities for deploying identical servers quickly and efficiently. The challenge is that most applications are not built for distributed or parallel operation across many servers, leaving us with some underlying problems that remain to be solved:
- Shared Data: How will all nodes access shared data like application configuration files? Do I need a cluster filesystem?
- Load Balancing: I can start 20 nodes, but how will I distribute inbound connections across them?
- Health Checking: When a node fails, how will it get removed from load balancing until it becomes healthy again?
- Failover / High Availability: What if the database server or load balancer fails? How will my 20 application servers continue to operate if this single point of failure (SPOF) has a problem?
Today we can deploy more nodes than ever before in record time. What the aforementioned challenges show is that simply having more nodes does not mean you have an effective cluster that allows your application to scale. You need to consider these additional factors if you want scalability and high availability that works in the real world.
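To make the load balancing and health checking items above concrete, here is a minimal sketch in Python. It is only an illustration and does not reflect any particular product: the `Node` and `RoundRobinBalancer` names and the `probe` callback are assumptions made up for this example. The idea is that connections rotate across nodes, and a node that fails its health check is skipped until a later check marks it healthy again.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Node:
    """Hypothetical application node tracked by the balancer."""
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Illustrative round-robin balancer that skips unhealthy nodes."""

    def __init__(self, nodes: Iterable[Node], probe: Callable[[Node], bool]):
        self.nodes = list(nodes)
        self.probe = probe  # assumed callback: returns True if the node responds
        self._next = 0

    def run_health_checks(self) -> None:
        # Re-probe every node so failed nodes drop out and recovered nodes return.
        for node in self.nodes:
            node.healthy = self.probe(node)

    def pick(self) -> Node:
        # Try each node at most once, starting where the last pick left off.
        total = len(self.nodes)
        for _ in range(total):
            node = self.nodes[self._next % total]
            self._next += 1
            if node.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


if __name__ == "__main__":
    down = {"app2"}  # pretend app2 has failed
    balancer = RoundRobinBalancer(
        [Node("app1"), Node("app2"), Node("app3")],
        probe=lambda node: node.name not in down,
    )
    balancer.run_health_checks()
    print([balancer.pick().name for _ in range(4)])
```

Note that the balancer in this sketch is itself a single point of failure, which is exactly why the failover question in the list still has to be answered separately.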
A Linux system administrator may be surprised to look at this list and see the very same issues they face with in-house clusters. But this is as it should be; after all, cloud computing is really about creating servers. The fact that they are virtual instead of physical doesn't make the networking challenges go away.
Yesterday I saw a solution from RightScale that addresses these very issues. It was refreshing to see another company that understands true scalability and doesn't try to mislead its customers by telling them that they can scale their performance instantly by just switching on more nodes.
The RightScale solution is called RightScale Website Edition and they provide a diagram that neatly captures all the plumbing required to make a service scalable and highly available. There are two load balancers, two database servers with replication, and multiple app servers.
But all this comes at a cost. Their offering is weighed down by a $2500 setup fee and a $500 monthly payment. You are getting good technology, and their approach is well thought out. What is needed is a way to get these same services in your own local environment.
ClusterMaker software does exactly that. It is the only way to deliver cloud style ease of use into your own datacenter. After installing it on a Linux server along with your applications, you can add application nodes (or compute nodes, for the HPC crowd) by simply setting them for PXE boot and turning them on. The master server running ClusterMaker will take care of the rest:
- Deploying an operating system
- Configuring a cluster filesystem
- Creating a shared root cluster where all nodes see the same data
- Setting up load balancing and automatic health monitoring
- Providing web-based performance monitoring and node management
- Creating a backup master by cloning the original server
- Setting up data replication and synchronization between the two
- Creating a failover virtual IP address that is always active so the nodes always have access to the shared root despite inevitable failures
- Installing easy-to-use system state snapshotting with system restore points
- Providing separate snapshotting for data / databases that can be used without taking the database offline
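The failover virtual IP item in the list above can be illustrated with a short sketch. This is not ClusterMaker's actual implementation, which the list does not describe; it is a generic heartbeat-based approach, and the `Master` class, the `choose_vip_owner` function, and the 5-second timeout are all assumptions invented for the example. The virtual IP stays with the original master while its heartbeat is fresh and moves to the backup master when it is not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Master:
    """Hypothetical master server that reports periodic heartbeats."""
    name: str
    last_heartbeat: float  # time of the last heartbeat, in seconds


def choose_vip_owner(
    primary: Master,
    backup: Master,
    now: float,
    timeout: float = 5.0,  # assumed value; tune for the environment
) -> Optional[Master]:
    """Return the master that should hold the virtual IP, or None if both are stale."""
    # Prefer the primary; fall back to the backup only if the primary is stale.
    for candidate in (primary, backup):
        if now - candidate.last_heartbeat <= timeout:
            return candidate
    return None


if __name__ == "__main__":
    primary = Master("master-a", last_heartbeat=100.0)
    backup = Master("master-b", last_heartbeat=108.0)
    owner = choose_vip_owner(primary, backup, now=110.0)
    print(owner.name if owner else "no master available")
```

Because the nodes only ever talk to the virtual IP, they do not need to know which of the two masters is answering at any given moment.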
While the cloud is an amazing platform with many scalability advantages over traditional hardware, it is not the right choice for all applications. Our goal is to provide an alternative that lets you quickly and easily implement similar levels of Linux high availability and scalability in your own datacenter.
ClusterMaker is easy to use, extremely cost-effective, and has one more big advantage: it provides a consistent framework for cluster building. In the open source world, there are so many different cluster tools that Linux system administrators each tend to have their own favorite ways to approach the problem. This reduces productivity every time one admin has to pick up where another left off. Using ClusterMaker, "brain drain" from employee turnover or transfer in this area is virtually eliminated, saving the company hundreds or thousands of dollars in administrator time.