Today, many medium to large organizations have a dedicated team whose job is to review and approve software patches. As a SharePoint administrator, you may initially think that handing the burden of patch management to another team is a great idea. However, doing so can mean losing some control over your SharePoint servers.
The big problem when you allow a dedicated team to handle patch management for SharePoint servers is that you have no way of knowing how thoroughly the patches are being tested. The patch management team should test each patch meticulously, but I have seen IT professionals who skip the testing process altogether, assuming that if a patch was released by Microsoft then it must be OK.
Microsoft openly admits that patches do not receive the same degree of testing as service packs do. Normally, Microsoft seems to do a decent job of releasing patches that work as advertised, but there have been a couple of buggy patches that have made it out the door. And buggy patches can severely cripple your servers.
The only way to completely avoid the potential for installing untested -- or under-tested -- patches is to take control of the patch management process for your SharePoint servers. Corporate bureaucracy being what it is, though, that may be impossible. In that case, the next best thing is to hedge your bets and address patch management within your governance plan by placing an emphasis on code retention policies.
So what are code retention policies? They dictate how existing and legacy versions of known code should be retained. Such policies may document how many different versions of a file should be kept on hand and how that code is to be stored and documented. The bottom line is that you should have a damage control plan in place so that your team will know how to respond when a buggy patch puts SharePoint at risk.
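The "how many versions should be kept" part of such a policy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the archive layout (one folder per retained version, named so that the folders sort chronologically) and the function name `prune_old_versions` are hypothetical and do not come from SharePoint or any Microsoft tool.

```python
import shutil
from pathlib import Path


def prune_old_versions(archive_dir: Path, keep: int) -> list[Path]:
    """Delete all but the `keep` newest version folders and return the removed ones.

    Assumption: every subfolder of archive_dir holds one retained version and
    is named so that alphabetical order matches chronological order
    (for example, an ISO date prefix).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    versions = sorted(p for p in archive_dir.iterdir() if p.is_dir())
    to_remove = versions[:-keep]  # empty when there are `keep` or fewer versions
    for folder in to_remove:
        shutil.rmtree(folder)
    return to_remove
```

Whatever number you choose for `keep`, it should be written into the governance plan rather than left to whoever happens to run the cleanup.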
Documentation. Start out by creating a policy stating that each patch that is to be applied must be documented. The trick is to set up the documentation requirements in a way that will help you recover your servers should the need arise. Begin by documenting the Microsoft Knowledge Base article ID number that corresponds to the patch, along with the date and time when the patch is supposed to be applied. Having this information on hand makes the troubleshooting process a lot easier if something goes wrong.
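One possible shape for such a documentation record is sketched below in Python. The Knowledge Base ID and the scheduled date and time come from the policy described above; the `servers` and `notes` fields, the class name and the JSON format are assumptions added for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class PatchRecord:
    kb_id: str               # Microsoft Knowledge Base article ID for the patch
    scheduled_for: datetime  # date and time the patch is supposed to be applied
    servers: list[str]       # hypothetical field: servers that will receive the patch
    notes: str = ""          # hypothetical field: free-form remarks


def to_json_line(record: PatchRecord) -> str:
    """Serialize one record as a single JSON line that can be appended to a log file."""
    data = asdict(record)
    data["scheduled_for"] = record.scheduled_for.isoformat()
    return json.dumps(data, sort_keys=True)
```

Appending one line per patch to a shared log gives the troubleshooting team a searchable history of what was applied and when.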
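Another step in the documentation process involves obtaining a copy of the patch that is to be applied and extracting its contents to an empty folder. For many Microsoft update packages, you can do this by specifying the /extract switch after the executable file name. The exact syntax varies from package to package, so check the options that the package itself supports.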
After extracting the patch file, make a note of the files that it contains. That way, you will know which system files are going to be replaced, and you can make a backup copy of those files before you install the patch.
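The inventory-and-backup step could be scripted roughly as follows. This is a minimal sketch that assumes the extracted folder contains plain files and that matching by file name is good enough; neither is guaranteed for every patch format. The function names and folder arguments are hypothetical, and `backup_dir` is assumed to sit outside `system_dir`.

```python
import shutil
from pathlib import Path


def list_patch_files(extracted_dir: Path) -> list[str]:
    """Return the sorted, de-duplicated file names found under the extracted patch folder."""
    return sorted({p.name for p in extracted_dir.rglob("*") if p.is_file()})


def back_up_matching_files(file_names: list[str], system_dir: Path,
                           backup_dir: Path) -> list[Path]:
    """Copy every file under system_dir whose name appears in file_names.

    The relative folder structure is preserved under backup_dir so that the
    copies can later be restored to their original locations.
    """
    wanted = {name.lower() for name in file_names}  # Windows file names are case-insensitive
    copied = []
    for source in sorted(system_dir.rglob("*")):
        if source.is_file() and source.name.lower() in wanted:
            target = backup_dir / source.relative_to(system_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)  # copy2 also preserves file timestamps
            copied.append(target)
    return copied
```

Storing the output of `list_patch_files` alongside the patch documentation records which files the patch was expected to replace.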
Backups. So why not just make a full system state backup of your SharePoint servers before a patch is applied? You could use that approach, but it isn't always practical. Imagine, for example, the time and resources required to manually create full system state backups of a hundred different SharePoint servers every time there is a patch that needs to be deployed.
System recovery points. I once read an article in which someone wrote that backups prior to patch deployment are unnecessary because Windows automatically creates system restore points that you can fall back to should the patch cause problems. I will be the first to admit that system restore points are a great feature, but I wouldn't stake my job on their ability to reverse server damage.
The problem with system recovery points is that they are not retained indefinitely. If you apply a patch to a server, and then half an hour later you notice that the server is having problems, then reverting to a recovery point is probably the way to go.
But what happens if you don't notice the problem for six weeks? By that time, the recovery point that you need may have been overwritten by newer recovery points. In the end, code retention is simply more reliable than system recovery points.
Patch removal. The argument could be made that retaining legacy code is unnecessary because buggy patches can be uninstalled. In a perfect world, this is absolutely true. In fact, your governance plan should directly state that the first course of action against a buggy patch should be to try to uninstall that patch.
Sometimes, though, you may find that a patch's flaws are severe enough that uninstalling the patch or rolling back the system to a recovery point becomes impossible. Restoring legacy system files may be the only means of recovery in such situations.
Structure your backups. Even if creating a special backup every time a new patch is to be deployed ends up being impractical, anyone who regularly backs up their SharePoint servers is already performing at least some degree of code retention. As such, it makes a lot of sense to examine your backup processes to see how well they would serve you if a buggy patch were to cause problems with your SharePoint servers.
One way to simplify recovery is to structure your backups with recovery in mind. SharePoint stores all of its data and most of its configuration information in SQL Server databases. If you are not already doing so, back up those databases in a separate backup job from the SharePoint server's system files. That way, if problems do occur, the person who is tasked with fixing the server can restore the server's system files without any fear of accidentally overwriting data.
One last recommendation: Avoid applying large numbers of new patches at the same time -- aside from initially provisioning a server.
Imagine, for instance, that the patch management team informs you that it needs to apply 20 new patches to your SharePoint servers. If problems occur after the patches have been applied, how will you know which patch caused the problem? Applying patches individually or in small batches goes a long way toward making the troubleshooting process easier if problems should occur.
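Splitting a long patch list into small batches is simple to automate. The Python sketch below is only an illustration: the batch size of three and the placeholder patch names are assumptions, not recommendations from Microsoft.

```python
from typing import Iterator, Sequence


def small_batches(patches: Sequence[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most batch_size patches, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(patches), batch_size):
        yield list(patches[start:start + batch_size])


# Hypothetical usage: 20 pending patches applied three at a time.
pending = [f"patch-{n:02d}" for n in range(1, 21)]
batches = list(small_batches(pending, 3))
```

With 20 patches and a batch size of three, this produces seven batches, the last of which holds the two remaining patches. If a problem appears after a given batch, only the patches in that batch need to be investigated.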
Even if the patch management and testing process is out of your hands, you aren't powerless to protect your SharePoint servers against buggy patches. Code retention policies can help to ensure that previous versions of system files remain available even after a patch has been applied, and that the servers can be reverted to a previous state.
About the author: Brien M. Posey, MCSE, is a seven-time recipient of Microsoft's Most Valuable Professional (MVP) award for his work with Exchange Server, Windows Server, Internet Information Services (IIS), and File Systems and Storage. Brien has served as CIO for a nationwide chain of hospitals and was once responsible for the Department of Information Management at Fort Knox. As a freelance technical writer, Brien has written for Microsoft, TechTarget, CNET, ZDNet, MSD2D, Relevant Technologies and other technology companies. You can visit Brien's personal website at www.brienposey.com.