Although the actual disk fault management process will vary between organizations, depending on the policies, tools and personnel expertise available, there are some common elements of the disk replacement process that Windows administrators can follow.
First, you need to identify the faulty disk. Windows Server 2012 R2 provides several sources of disk fault and identification data: Event Viewer logs, the Physical Disks report in Server Manager, the alerts dialog in System Center Operations Manager (SCOM) and Windows PowerShell queries. While tools such as SCOM can report the specific location of a disk fault -- slot, tray and position -- other tools report a disk failure only as a physical disk number or globally unique identifier (GUID). GUIDs can be translated into physical disk numbers using the PowerShell Get-PhysicalDisk cmdlet.
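As a rough illustration of the PowerShell route, the sketch below lists disks that are not reporting a healthy state and then looks up a disk from an identifier copied out of an alert or event. The identifier value is a placeholder, and matching it against UniqueId or ObjectId is an assumption about how the identifier appears in your logs.

```powershell
# List physical disks that are not reporting a healthy state.
Get-PhysicalDisk |
    Where-Object { $_.HealthStatus -ne 'Healthy' } |
    Format-Table FriendlyName, SerialNumber, HealthStatus, OperationalStatus, Usage -AutoSize

# Translate an identifier from an alert or event into a physical disk.
# '<id-from-event>' is a placeholder; paste the real value from your log entry.
$id = '<id-from-event>'
Get-PhysicalDisk |
    Where-Object { $_.UniqueId -eq $id -or $_.ObjectId -like "*$id*" } |
    Format-List FriendlyName, SerialNumber, UniqueId, HealthStatus
```

On Windows Server 2012 R2 the FriendlyName typically takes the form PhysicalDiskN, which gives the physical disk number referred to above.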
After determining which disk has failed, find it in the storage array enclosure. Many storage arrays provide LEDs that blink when the corresponding disk fails. If the array has no such indicators, technicians will need extra time to find the correct disk by its physical position or serial number.
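Where the enclosure supports it, PowerShell can help with the physical search. The following is a minimal sketch, assuming a disk named PhysicalDisk5 (a hypothetical name) and an enclosure that reports slot information and honors indicator requests; enclosures without that support will leave these fields empty or return an error.

```powershell
# 'PhysicalDisk5' is a hypothetical name; substitute the disk identified earlier.
$diskName = 'PhysicalDisk5'

# Show whatever location details the enclosure exposes for this disk.
Get-PhysicalDisk -FriendlyName $diskName |
    Format-List FriendlyName, SerialNumber, EnclosureNumber, SlotNumber, PhysicalLocation

# Ask the enclosure to light the disk's indicator LED (Windows Server 2012 R2 cmdlet name).
Enable-PhysicalDiskIndication -FriendlyName $diskName

# Turn the indicator off again once the disk has been found.
Disable-PhysicalDiskIndication -FriendlyName $diskName
```

Later Windows Server releases rename these cmdlets to use Identification in place of Indication, so check which name your Storage module provides.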
Next, many technicians check the disk connections by reseating the troubled disk in its slot or reseating its cables. If this works, clear the blinking LED by resetting the physical disk usage or removing the disk from the storage pool through the PowerShell PhysicalDisk cmdlets. If disk problems persist, replace the disk using the instructions for the particular storage array. Typical best practice states the new disk's characteristics should match those of the failed disk to prevent performance mismatches that might cause storage problems later. Replace the physical disk before removing the failed disk from any storage pool configuration, and give the new disk a chance to rebuild; otherwise, data may be lost.
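If reseating the disk does clear the fault, resetting its usage might look like the sketch below. The disk name is hypothetical, and AutoSelect is assumed to be the usage the disk had before the fault; a disk that was a hot spare or manually assigned would need its original setting instead.

```powershell
# 'PhysicalDisk5' is a hypothetical name for the disk that was reseated.
$diskName = 'PhysicalDisk5'

# Return the disk to automatic allocation (assumes AutoSelect was its original usage).
Set-PhysicalDisk -FriendlyName $diskName -Usage AutoSelect

# Confirm the disk now reports a healthy state before treating the fault as resolved.
Get-PhysicalDisk -FriendlyName $diskName |
    Format-Table FriendlyName, HealthStatus, OperationalStatus, Usage -AutoSize
```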
Make sure that each identical disk in the group or array is using the same firmware version. Once the new disk is in place, update its firmware to the latest accepted version used on the other disks in the group or the larger array. Remember that each new firmware version can introduce changes in timing and access. While this should improve the disk itself, firmware version differences can also introduce performance differences that might trigger unexpected or intermittent storage errors. Tools such as Server Manager or Windows PowerShell can report on disk firmware versions, and updates should follow the disk manufacturer's instructions.
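One way to spot a firmware mismatch from PowerShell is to list each disk's model and firmware version and then count how many disks share each combination. This is only a reporting sketch; it does not update firmware.

```powershell
# Report model and firmware version for every physical disk.
Get-PhysicalDisk |
    Sort-Object Model, FirmwareVersion |
    Format-Table FriendlyName, Model, FirmwareVersion, SerialNumber -AutoSize

# Count disks per model and firmware combination.
# More than one row for the same model suggests a firmware mismatch.
Get-PhysicalDisk |
    Group-Object Model, FirmwareVersion |
    Select-Object Count, Name
```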
At this point, use Server Manager or Windows PowerShell to add the new physical disk to the storage pool, and then retire and remove the old disk from the storage pool. In the event of a complete disk failure, the failed disk should have been retired automatically. If the disk is being replaced pre-emptively -- such as in response to intermittent problems -- retire the disk first through PowerShell.
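The PowerShell version of this sequence could be sketched as follows. The pool name Pool01 and disk name PhysicalDisk5 are hypothetical, and the script assumes the only poolable disk on the server is the replacement; if other unpooled disks are present, select the new disk explicitly instead.

```powershell
$poolName   = 'Pool01'          # hypothetical storage pool name
$failedName = 'PhysicalDisk5'   # hypothetical name of the disk being replaced

# Add the replacement disk. Assumes it is the only disk currently eligible for pooling.
$newDisk = Get-PhysicalDisk -CanPool $true
Add-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $newDisk

# Retire the old disk so no new data is allocated to it
# (needed mainly for pre-emptive replacements).
Set-PhysicalDisk -FriendlyName $failedName -Usage Retired

# Start repairs on the pool's virtual disks, then watch the rebuild jobs.
Get-StoragePool -FriendlyName $poolName | Get-VirtualDisk | Repair-VirtualDisk
Get-StorageJob

# Only after the repair jobs have finished, remove the old disk from the pool.
$failed = Get-PhysicalDisk -FriendlyName $failedName
Remove-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $failed
```

Remove-PhysicalDisk asks for confirmation by default, which is a useful last check that the rebuild has completed.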
As a final step in disk fault management, technicians can run a storage health test to verify the storage pool or cluster, and then dismiss any alerts.
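A basic health check can be approximated with the standard storage cmdlets, as in the sketch below. It only reads the status values that Windows reports and is not a substitute for a dedicated storage health test or cluster validation.

```powershell
# Pool-level health, excluding the primordial pool of unallocated disks.
Get-StoragePool -IsPrimordial $false |
    Format-Table FriendlyName, HealthStatus, OperationalStatus -AutoSize

# Virtual disk health; degraded or incomplete entries suggest a rebuild is still pending.
Get-VirtualDisk |
    Format-Table FriendlyName, HealthStatus, OperationalStatus -AutoSize

# Any physical disk still reporting a problem.
Get-PhysicalDisk | Where-Object HealthStatus -ne 'Healthy'

# Any repair or rebuild jobs still running.
Get-StorageJob
```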