Are you spending too many lunch hours adding more disks to your servers? Do your backup cycles take 26 hours to complete? If so, then it's time to face the facts: You're not controlling your servers; they're controlling you!
Thankfully, there are ways to reestablish your server dominance. In this story, three server experts -- David Flawn, Donnie Bell and Karl Friedrich -- offer a dos-and-don'ts guide to enlightened server control. Bell and Friedrich hail from Round Rock, Texas, where they work as senior marketing consultants for server and PC maker Dell Computer Corp. Flawn is vice president of worldwide business development at Stratus Technologies, Inc., a server vendor in Maynard, Mass.
Don't let users download new drivers and fool with their systems, advised Flawn. Lock down your configurations so no changes are made until they are tested and verified. In his opinion, the more tightly you control an environment, the better its availability will be.
Do recognize that systems are going to fail, Flawn said. Have a strong root-cause analysis methodology to find out why a failure took place. Then, take strides to prevent that failure from happening again.
Don't immediately restart the system after a failure. The "core dump" left behind records the operating system, hardware and application interactions prior to the failure, said Flawn. That data could help with root-cause analysis. A quick restart leaves no room to understand exactly why the failure occurred.
Do acknowledge that any kind of redundancy built into a system is important, said Flawn.
Do eliminate single points of failure in hardware. "A well put together cluster will have all the redundancies that you would expect and eliminates single points of failure," Flawn said.
Don't manage servers across the Internet, said Friedrich. This is a questionable practice, which can create major security issues. Your assets are open to the public if you manage via the Web, he said.
Do use a third-party central management console product to manage a heterogeneous environment. "Otherwise, you'll need a console for every platform you're running," said Bell. That can lead to much confusion. Hewlett-Packard Co.'s OpenView is one example of a central management console, he said.
Do install the manageability agents provided by your vendor, said Bell. Often those tools allow you to be anywhere on your network and access a server.
Do use the installation tools that come with server software, said Bell. When they're not used, administrators often install incorrect drivers, such as generic drivers. Generic drivers don't always work, he said. "In essence, it slows you down not to use installation tools."
Do keep all servers configured as similarly as possible. That will minimize problems. "You won't worry about the differences between them," said Bell.
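One way to act on Bell's advice is to compare each server's settings against a single baseline and flag anything that differs. The Python sketch below is only an illustration: the function name, the server names and the settings are all hypothetical, and it assumes you have already collected each server's configuration into a simple dictionary.

```python
def find_config_drift(baseline, servers):
    """Return {server: {setting: (expected, actual)}} for every mismatch."""
    drift = {}
    for name, config in servers.items():
        diffs = {}
        # Check every setting that appears in either the baseline or the server.
        for key in sorted(set(baseline) | set(config)):
            expected = baseline.get(key)
            actual = config.get(key)
            if expected != actual:
                diffs[key] = (expected, actual)
        if diffs:
            drift[name] = diffs
    return drift


# Hypothetical inventory; real data would come from your management tools.
baseline = {"os_patch": "SP6", "nic_driver": "vendor-4.2", "raid_level": "5"}
servers = {
    "web01": {"os_patch": "SP6", "nic_driver": "vendor-4.2", "raid_level": "5"},
    "web02": {"os_patch": "SP6", "nic_driver": "generic-1.0", "raid_level": "5"},
}

for server, diffs in find_config_drift(baseline, servers).items():
    for setting, (expected, actual) in diffs.items():
        print(f"{server}: {setting} is {actual!r}, baseline is {expected!r}")
```

In this made-up inventory, only web02 is reported, because it is running a generic network driver instead of the vendor's.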
Do test a new product in a separate pilot environment before rolling it out, Friedrich said.
Don't fix what's not broken, Bell said. If you are in control and things are working well, think twice before tinkering. Otherwise, you could be back to working during lunch hour.