Clustering technologies can bring greater reliability, fault-tolerance, availability, and scalability to a distributed computing system. Achieving any one of these goals effectively, however, calls for astute management, said Chris Smith, integration architect for Toronto-based Platform Computing Inc. Poor management of such tasks as assigning user names, creating configurations, and installing applications can derail cluster efficiency. Based on his experience with enterprise clustering projects and Platform's LSF Suite cluster management software, Smith offers these dos and don'ts for cluster management.
Don't overlook the importance of creating a common user name space. It sounds like common sense, but making sure you are in a domain or Active Directory environment -- where users have the same username on every machine -- is a fundamental first step in minimizing cluster management headaches.
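One way to catch violations of a common user name space is to compare the user-to-UID mapping on every node. The sketch below is a minimal illustration with hypothetical host data; in practice the per-host maps would come from each node's /etc/passwd or from a directory service.

```python
# Sketch: detect username/UID mismatches across cluster nodes.
# The host-to-passwd maps below are hypothetical stand-ins for data
# collected from each node or from a central directory service.

def find_uid_mismatches(passwd_by_host):
    """Return {username: {host: uid}} for users whose UID differs across hosts."""
    seen = {}  # username -> {host: uid}
    for host, users in passwd_by_host.items():
        for name, uid in users.items():
            seen.setdefault(name, {})[host] = uid
    return {name: hosts for name, hosts in seen.items()
            if len(set(hosts.values())) > 1}

cluster = {
    "node1": {"alice": 1001, "bob": 1002},
    "node2": {"alice": 1001, "bob": 1050},  # bob's UID has drifted
}
print(find_uid_mismatches(cluster))  # {'bob': {'node1': 1002, 'node2': 1050}}
```

A directory service makes this check unnecessary, which is precisely the point of the tip: one authoritative source for identities instead of per-machine files that drift apart.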
Do make sure that every system in the cluster has the same configuration: the same software, operating system version, and hardware. Standardization ensures that applications run consistently across the cluster and makes it easier to integrate new hardware as you scale out.
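Keeping nodes identical is easier when drift is detected automatically. A minimal sketch, assuming a per-node inventory of settings (the values below are hypothetical; a real inventory would record OS version, kernel, installed packages, memory, and so on):

```python
# Sketch: flag nodes whose configuration drifts from a reference node.
# The inventory dicts are hypothetical examples of collected settings.

def config_drift(reference, nodes):
    """Return {host: {key: (expected, actual)}} for mismatched settings."""
    drift = {}
    for host, cfg in nodes.items():
        diffs = {k: (v, cfg.get(k)) for k, v in reference.items()
                 if cfg.get(k) != v}
        if diffs:
            drift[host] = diffs
    return drift

reference = {"os": "RHEL 7.9", "kernel": "3.10.0-1160", "ram_gb": 64}
nodes = {
    "node1": {"os": "RHEL 7.9", "kernel": "3.10.0-1160", "ram_gb": 64},
    "node2": {"os": "RHEL 7.9", "kernel": "3.10.0-957", "ram_gb": 64},
}
print(config_drift(reference, nodes))
```

Running a check like this before adding a node to the cluster turns "seamless integration of new hardware" from a hope into a gate.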
Do install software locally on each system rather than across the network. When applications run across the network, performance often degrades substantially, and the problem only worsens as you add systems to the cluster and more machines contend for the network. Local software installations take a little more work at the outset, but the performance benefits are often worth it.
Do set up a dedicated test cluster, and validate applications there before putting them into production.
Do keep critical data in a centralized location within the network. Given availability concerns, people are often hesitant to keep all their "crown jewels" in one place, said Smith. But when all the data is in one place, IT can focus fault-tolerant or highly available infrastructure support in that one area and implement centralized and reliable backup systems.
Do process data locally. Technical applications do lots of I/O, and this I/O should be done on local disks to avoid saturating the network bandwidth and overloading the centralized data store. Copy data in, do processing locally, and then copy the results back out to the data store.
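The copy-in, process-locally, copy-out pattern can be sketched as follows. This is a minimal illustration using temporary directories to stand in for the central data store and a node's local scratch disk; names like `process_locally` are hypothetical, not part of any particular tool.

```python
# Sketch of the stage-in / process / stage-out pattern: copy input from
# the central store to local scratch, do all I/O locally, copy results
# back. Temp directories stand in for the shared store and local disk.

import shutil, tempfile
from pathlib import Path

def process_locally(shared_input, shared_output_dir, work):
    scratch = Path(tempfile.mkdtemp(prefix="scratch-"))  # node-local disk
    try:
        local_in = scratch / shared_input.name
        shutil.copy(shared_input, local_in)                # stage in
        local_out = scratch / "result.txt"
        local_out.write_text(work(local_in.read_text()))   # local I/O only
        dest = Path(shared_output_dir) / local_out.name
        shutil.copy(local_out, dest)                       # stage out
        return dest
    finally:
        shutil.rmtree(scratch)                             # clean scratch

# Demo with a stand-in "central store" directory:
store = Path(tempfile.mkdtemp(prefix="store-"))
(store / "input.txt").write_text("raw data")
result = process_locally(store / "input.txt", store, str.upper)
print(result.read_text())  # RAW DATA
```

The network carries only two bulk transfers per job instead of every read and write the application performs, which is what keeps the central store and the network from saturating.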
Do use a resource management tool to handle load sharing, leverage idle computing cycles, and improve cluster performance. Resource management tools maximize the return on critical computing resources while providing a highly available and reliable computing infrastructure.
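At its core, load sharing means placing each job where capacity is free. The toy sketch below shows the idea with a least-loaded placement rule; it is a deliberate simplification, as real resource managers such as Platform's LSF also handle queues, priorities, preemption, and failover.

```python
# Toy sketch of what a resource manager automates: dispatch each job to
# the node with the most free slots, so idle cycles get used. Node names
# and slot counts are hypothetical.

def dispatch(jobs, slots):
    """Assign each job to the node with the most free slots."""
    free = dict(slots)          # node -> remaining free slots
    placement = {}
    for job in jobs:
        node = max(free, key=free.get)
        if free[node] == 0:
            raise RuntimeError("cluster is full")
        free[node] -= 1
        placement[job] = node
    return placement

print(dispatch(["j1", "j2", "j3"], {"node1": 2, "node2": 1}))
```

Doing this by hand for dozens of users and hundreds of jobs is exactly the management burden the tip says to automate.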
Do "think distributed" in order to harness the power of your distributed computing resources, said Smith. Be aware that you are using computing resources across a network and work with your ISV partners to ensure that software products can function in a distributed environment. Make distributed computing functionality a requirement in your purchasing decisions.