I’ve seen a lot of different patching processes throughout my time in the industry: automated patching with minimal capabilities for exclusions, patches excluded site-wide because a few dozen servers couldn’t take them, manual (!) patching, and shops that avoid patching altogether (!!). More often than not, the administrators responsible for this effort who are achieving less-than-stellar results are hampered by factors outside their control: technology, preexisting processes, politics, bureaucracy, etc. These factors can also compound one another: good process can be hampered by technological limitations or unreliability, solid combinations of technology and process can be undermined by politics or bureaucracy, and so on.
Rather than cover those limitations, I’d like to explore what I’ve seen work and what can be achieved when you have the right technology, adhere to a process that is well-established enough to be automated but flexible enough to be accepted, and are wholeheartedly backed by the organizational willpower to achieve extraordinary outcomes.
Without further ado, let’s get into some high-level details. The technologies that my team and I are currently leveraging are as follows:
- Tanium (With Patch module)
- Active Directory
- PowerShell
I will reference these technologies in both this post and posts to come, but the point of this post isn’t to proselytize any given technology (except perhaps the precious PowerShell). Tanium has helped us achieve great patching outcomes, but I’ve seen other technologies come close. Likewise, Active Directory has made this process easier, but there are other methods for making logical collections of endpoints.
Step 1: The starting point
We utilize a relatively simple formula to establish consistency from inconsistency in the form of Patch Tuesday. Generally all administrators are aware of Patch Tuesday (Windows admins need to know when to be scared and Linux admins need to know when to pull up a chair and watch the fireworks). Patch Tuesday is the 2nd Tuesday of every month and the ubiquity of this knowledge makes it a fairly logical starting point for a process.
- Patch Tuesday + $x
That simple formula is the catalyst for a significant amount of automation. You can see how the function works in the Determining Patch Tuesday with PowerShell post.
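To make the formula concrete, here is a minimal sketch of the date arithmetic. It is written in Python purely for illustration and is not the PowerShell function from the linked post; the function names are hypothetical, and it assumes $x is an offset in whole days.

```python
from datetime import date, timedelta


def patch_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, so Tuesday is 1.
    days_until_tuesday = (1 - first.weekday()) % 7
    first_tuesday = first + timedelta(days=days_until_tuesday)
    return first_tuesday + timedelta(days=7)


def patch_tuesday_plus(year: int, month: int, x: int) -> date:
    """Return Patch Tuesday + x, treating x as a number of days (assumption)."""
    return patch_tuesday(year, month) + timedelta(days=x)


if __name__ == "__main__":
    print(patch_tuesday(2024, 1))          # second Tuesday of January 2024
    print(patch_tuesday_plus(2024, 1, 7))  # one week after Patch Tuesday
```

Because every later date is derived from this one anchor, nothing else needs to hard-code a calendar date.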
Step 2: Bridging the gap between flexibility and automation
Before you hurl your coffee mug at me with reckless abandon, hear me out. I recognize flexibility is typically the antithesis of automation, but sometimes you can have your cake and eat it too. In order to provide our customers (in this instance, app owners) with flexibility, we’ve defined a significant number of slots that they can opt into for patching activities. Essentially, it works like this: after Patch Tuesday, our automated configurations kick off to configure maintenance windows, patch lists, and deployments. The deployments themselves begin the following week and run each evening (with the exception of Wednesday, which serves as an ‘Uh Oh’ day in case something goes wrong with the first couple of days of patching). Each evening has a slot available every 30 minutes; the slots are defined by Security Groups within Active Directory that are mostly populated at server build but can be changed by request at a later date.
Within Tanium, each slot has an identically named Computer Group and an identically named Maintenance Window:
- Tanium Computer Group: Day_02_2200
- Tanium Maintenance Window: Day_02_2200
- AD Security Group: Day_02_2200
With this uniformity, the number of configurations at play is relatively unimportant because the predictability makes configuration easy to automate around. Whether you have a handful or a hundred, the process is consistent throughout. Different domains in the AD forest have the same AD Security Groups and Tanium is domain-agnostic.
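The payoff of a shared name is that one string can be generated once and reused for the Computer Group, the Maintenance Window, and the Security Group. A rough sketch of that idea follows, in Python for illustration only. It assumes the name encodes a two-digit day number and a 24-hour HHMM start time, which is inferred from the Day_02_2200 example; the helper names and the 18:00 to 23:30 evening range are hypothetical.

```python
import re
from datetime import time

# Assumed layout, inferred from "Day_02_2200": Day_<two-digit day>_<HHMM>.
SLOT_PATTERN = re.compile(r"^Day_(\d{2})_([01]\d|2[0-3])([0-5]\d)$")


def slot_name(day: int, start: time) -> str:
    """Build the shared name used for all three objects."""
    return f"Day_{day:02d}_{start:%H%M}"


def parse_slot(name: str) -> tuple[int, time]:
    """Split a slot name back into its day number and start time."""
    match = SLOT_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a slot name: {name!r}")
    day, hour, minute = (int(part) for part in match.groups())
    return day, time(hour, minute)


def evening_slots(day: int, first_hour: int = 18, last_hour: int = 23) -> list[str]:
    """One slot every 30 minutes; the hour range is a made-up example."""
    return [
        slot_name(day, time(hour, minute))
        for hour in range(first_hour, last_hour + 1)
        for minute in (0, 30)
    ]


if __name__ == "__main__":
    for name in evening_slots(2):
        print(name)
```

Whether there are a handful of slots or a hundred, the same loop covers them, which is the point of keeping the names identical across systems.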
Step 3: Automate
Automation is the easy part of these types of equations. With the Tanium platform, the necessary actions can be carried out with PowerShell, TanRest (a Tanium-specific PowerShell module), and the Tanium APIs. I will describe the high-level process here and update these items with links to technical posts that get into greater detail:
- New month begins
- Update maintenance window objects in both DEV and PROD to reflect the current $patchTuesday + $x values (Automation).
- Email sent to team detailing the maintenance window objects that were changed (Automation).
- Patch Tuesday occurs
- Update the Preproduction patch lists in both the DEV and PROD environments using the Set-TaniumPatchPatchlist function from TanRest (Automation).
- Email sent to team detailing the patch lists that were changed (Automation).
- Endpoints in the preexisting, ongoing deployments to Preproduction endpoints see that the patch list for the deployment has changed, pull down the content for the new patches, and await maintenance windows for application (Automation).
- Email sent to team detailing the deployment outcomes (Automation).
- Scrutinize Preproduction outcomes for go/no-go decision.
- Update the Production patch lists in both the DEV and PROD environments using the Set-TaniumPatchPatchlist function from TanRest (Automation).
- Email sent to team detailing the patch lists that were changed (Automation).
- Endpoints in the preexisting, ongoing deployments to Production endpoints see that the patch list for the deployment has changed, pull down the content for the new patches, and await maintenance windows for application (Automation).
- Email sent to team detailing the deployment outcomes (Automation).
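The first automated step in the list, refreshing the maintenance windows each month, mostly comes down to recomputing a start time per slot from Patch Tuesday + $x and reporting what changed. The sketch below shows only that arithmetic, in Python for illustration. It makes no Tanium or TanRest calls, and the day-to-offset table is a hypothetical example (Monday through Friday of the following week, with Wednesday skipped), not our actual configuration.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical mapping of slot day number to $x (days after Patch Tuesday).
# Offset 8 (the following Wednesday) is deliberately absent.
DAY_OFFSETS = {1: 6, 2: 7, 3: 9, 4: 10}


def window_start(patch_tuesday: date, slot: str) -> datetime:
    """Compute this month's start for a slot named like Day_02_2200."""
    _, day, hhmm = slot.split("_")
    offset = DAY_OFFSETS[int(day)]
    start = time(int(hhmm[:2]), int(hhmm[2:]))
    return datetime.combine(patch_tuesday + timedelta(days=offset), start)


def change_summary(patch_tuesday: date, slots: list[str]) -> str:
    """Plain-text body listing each window and its new start time."""
    lines = [
        f"{slot}: {window_start(patch_tuesday, slot):%Y-%m-%d %H:%M}"
        for slot in slots
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # January 2024: Patch Tuesday fell on the 9th.
    print(change_summary(date(2024, 1, 9), ["Day_01_2200", "Day_02_2200", "Day_03_2230"]))
```

In a real run, each computed start time would be pushed to the matching maintenance window object and the summary would become the body of the notification email.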