All drives fail.
Ironically, because direct-attached storage (DAS) is so much simpler than any other form of storage architecture, drive failures are more of a problem for DAS devices. With DAS, there is a tendency to "set it and forget it," which is impossible with a storage area network (SAN).
Furthermore, because most of the DAS applications in today's enterprises, such as desktop PCs and departmental servers, tend to be fairly low-performance, it is harder to collect the statistics necessary to monitor a drive's health. An administrator can use a remote management tool to check on drives on a SAN or a network-attached storage (NAS) system, or set an alarm to be alerted to a failing drive, but such options are rarely available for desktop systems. (Obviously, the situation is considerably different for DAS devices attached to major servers or those used for performance reasons.)
Disk drives show a pronounced bathtub curve when it comes to failures. They tend to have a relatively high failure rate in their very early life as manufacturing and component defects take their toll, but then the failure rate drops rapidly to a low level and the drives hum along for years. Then the effects of age and wear mount, and the drives fail.
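The shape of the bathtub curve can be sketched numerically. The short Python example below is only an illustration: it adds a falling early-life term, a constant background term and a rising wear-out term, using Weibull hazard rates. Every parameter value is an assumption chosen to give the curve the shape described above, not measured drive data, and the function names are hypothetical.

```python
def weibull_hazard(t_months, shape, scale_months):
    """Weibull hazard rate h(t) = (k / lam) * (t / lam) ** (k - 1), for t > 0."""
    return (shape / scale_months) * (t_months / scale_months) ** (shape - 1)


def bathtub_hazard(t_months):
    """Illustrative failure rate per month at a given drive age (t > 0).

    All constants below are made-up values that only shape the curve.
    """
    # shape < 1: rate falls with age (manufacturing and component defects)
    infant = weibull_hazard(t_months, shape=0.5, scale_months=2000.0)
    # constant background rate during the long, quiet middle of the drive's life
    background = 0.001
    # shape > 1: rate climbs with age (wear)
    wear_out = weibull_hazard(t_months, shape=5.0, scale_months=90.0)
    return infant + background + wear_out


if __name__ == "__main__":
    for age in (1, 12, 24, 48, 60, 72):
        print(f"{age:3d} months: {bathtub_hazard(age):.4f}")
```

With these assumed values the rate is high in the first month, lowest at around two years, and climbing steeply by the five-year mark.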
Drives in DAS applications tend to start failing at significantly higher rates after a period of about five years. For many enterprises, that's longer than the replacement period. However, as IT budgets tighten up, there is an increasing tendency to repurpose older desktop systems by migrating them to less critical applications within the enterprise.
There are two methods -- aside from replacing the entire system -- for dealing with aging disk drives. One strategy is to simply replace all disk drives over a certain age, say 48 months. This means tracking the age of the drives and automatically replacing them. The second is to budget for the inevitable failures, but not replace the drives until drive errors reach an unacceptable level.
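The age-based strategy comes down to simple record keeping. As a minimal sketch, assuming the enterprise keeps an inventory that maps each system to the date its drive was installed (the inventory layout, asset tags and function names here are hypothetical), a few lines of Python can list the drives that have reached the 48-month mark:

```python
from datetime import date

MAX_AGE_MONTHS = 48  # the example threshold from the text; adjust to local policy


def age_in_months(installed, today):
    """Whole months elapsed between the install date and today."""
    months = (today.year - installed.year) * 12 + (today.month - installed.month)
    if today.day < installed.day:
        months -= 1  # the current month has not been completed yet
    return months


def drives_due_for_replacement(inventory, today, max_age_months=MAX_AGE_MONTHS):
    """Return asset tags whose drives are at or beyond the age limit."""
    return [
        asset_tag
        for asset_tag, installed in inventory.items()
        if age_in_months(installed, today) >= max_age_months
    ]


if __name__ == "__main__":
    # Hypothetical inventory: asset tag -> drive install date
    inventory = {
        "pc-001": date(2019, 1, 15),
        "pc-002": date(2022, 6, 1),
    }
    print(drives_due_for_replacement(inventory, today=date(2023, 1, 15)))
```

The check is only as good as the install dates behind it, so the tracking has to start when the drive goes into service.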
The more common strategy is to replace drives as they begin to fail, but this method runs the risk of data loss. Although old drives seldom fail catastrophically, users tend to ignore or work around bad sectors and other signs of a failing disk. If you're going to use the replace-as-failed strategy, you must make sure that users know to alert storage administration personnel to the signs of a failing drive. The alternative is to check older drives every so often for signs of failure.
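A periodic check of older drives can be reduced to a simple rule. The sketch below assumes that some diagnostic tool already reports a bad-sector count for each drive and that the counts from the previous and current checks are on record. The record layout, the 36-month cutoff and the limit of 10 bad sectors are all hypothetical values for illustration, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class DriveCheck:
    """One drive's results from two consecutive periodic checks (hypothetical layout)."""
    asset_tag: str
    age_months: int
    bad_sectors_previous: int
    bad_sectors_current: int


def needs_attention(check, max_bad_sectors=10):
    """Flag a drive whose bad-sector count is growing or has reached the limit."""
    growing = check.bad_sectors_current > check.bad_sectors_previous
    at_limit = check.bad_sectors_current >= max_bad_sectors
    return growing or at_limit


def sweep_older_drives(checks, min_age_months=36, max_bad_sectors=10):
    """Return asset tags of older drives that show signs of failure."""
    return [
        c.asset_tag
        for c in checks
        if c.age_months >= min_age_months and needs_attention(c, max_bad_sectors)
    ]


if __name__ == "__main__":
    checks = [
        DriveCheck("pc-010", age_months=50, bad_sectors_previous=2, bad_sectors_current=5),
        DriveCheck("pc-011", age_months=60, bad_sectors_previous=0, bad_sectors_current=0),
        DriveCheck("pc-012", age_months=12, bad_sectors_previous=0, bad_sectors_current=1),
    ]
    print(sweep_older_drives(checks))
```

This sweep covers only the older drives the strategy targets. A younger drive that starts reporting errors still depends on its user raising the alarm.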
Whichever method you choose, for security purposes, make sure you scrub or destroy the old drives before you dispose of them. To take care of failing drives, the simplest solution probably involves a sledgehammer and a concrete floor.
Part two of this three-part series discusses cabling problems inherent to DAS. Part three takes a look at DAS-related SCSI issues.
Rick Cook has been writing about mass storage since the days when the term meant an 80 K floppy disk. The computers he learned on used ferrite cores and magnetic drums. For the last 20 years he has been a freelance writer specializing in issues related to storage and storage management.