dredmorbius@joindiaspora.com

Systems Operations is a Risk Mitigation Practice

Having done ops for much of my professional life, a few things I've realised (largely since stepping out of the role):

  • Ops is largely about risk mitigation and management.
  • Running regular scenario drills should in fact be a large part of the role.
  • So should updating procedures with lessons learnt from those exercises.

That is, ops should test what happens when some eventuality occurs and how your organisation responds to it, with the scenarios evolving as the landscape around you evolves. E.g., ransomware and associated threats are a major concern now, though they are only one of a number of potential risks.

I am not aware of any significant or widely-known guide to systems administration and operations which takes this viewpoint. The model does address many of the frustrations I've had with the role over my own career.

Keep in mind that a specific countermeasure may only address part of a risk. E.g., backups address the "we can get our data back" problem. Backups do not address the "we cannot unpublish that which has been made public" problem. So depending on your threat model, backups alone are not a complete mitigation.
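
As a rough sketch of that point, one way to check coverage is to split each risk into the separate concerns it raises and see which ones no countermeasure touches. Everything here is illustrative: the risk names, the concern labels, and the `uncovered` helper are assumptions made up for the example, not a standard taxonomy or tool.

```python
# Hypothetical threat model: each risk is split into the concerns it raises.
RISKS = {
    "ransomware": {"data recovery", "data disclosure"},
    "disk failure": {"data recovery"},
}

# Hypothetical countermeasures and the concerns each one addresses.
COUNTERMEASURES = {
    "backups": {"data recovery"},
}


def uncovered(risks, countermeasures):
    """Return, per risk, the concerns that no countermeasure addresses."""
    covered = set().union(*countermeasures.values())
    gaps = {}
    for risk, concerns in risks.items():
        missing = concerns - covered
        if missing:
            gaps[risk] = missing
    return gaps


if __name__ == "__main__":
    for risk, concerns in sorted(uncovered(RISKS, COUNTERMEASURES).items()):
        print(f"{risk}: not addressed: {', '.join(sorted(concerns))}")
```

With only backups in place, disk failure is fully covered while ransomware still shows an open disclosure concern, which is the gap a drill against that scenario should surface.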

#Sysadmin #DevOps #Operations #Risk #DrillBabyDrill #ScenarioPlanning #TrainingInPractice