The Tale of the Network Automator: From Knight Errant to Engineer / by James Kelly


Podcast on YouTube

Perchance the temptation of technology and for certain the pangs of toil have persuaded some networkers to stray from the path of the CLI and *IE certification race.

Swinging APIs, they’ve heard, is the promised land and wielding shiny tools like Ansible can, not only confirm their chivalry, but also chase away any lament of rote configuration changes and misfortunes of fat fingers.

“Tonight, Sancho, we drink from the cup of automation. And we will be safely protected from any insurgent application issues rising to the heights of our virtuous network.”

Chapter 1: Initiation

And so once upon a time, after the ecstasy of donning the automation helmet and downloading a few horses, networkers looking for a rite of passage into the world of automation looked bravely in the mirror and said to themselves, “what is there to do that’s dangerous around here?”

The way many networkers made haste to an automation quest was ostensibly with all the strategy of the ad hoc adventures of a knight errant, looking to prove their worth.

These trailblazers were defined by the natural signposts in their workflows, by vendor toolkits, and by tools imported from the far away land of sysadmin. And so, often under the tutelage of nothing but their own trial and error, the networker knight errants learned to conquer a couple of the workflows that they were presented with, slaying manual steps one by one.

Chapter 2: Calamity

Shortly thereafter, while riding proudly toward a well-deserved weekend, the knights were so enchanted with their new weaponry and so pleased with themselves—their heads, swollen with victories, may have tightened helmets to their heads—that their attention and vision was impaired. And it was while fantasizing of their next rout that the slings and arrows started flying.

The actual assault is not important: a sudden unreachability, an impetuous line-of-business change request, a cloud VPC meltdown. It was as unexpected as it was inevitable. For some of the knights the affliction would uncover a weakness in armor. For others, it may have been the sting of the automation sword that was erratically swung and cut the wrong way, this time automating and thus swiftly amplifying an accident.

Chapter 3: Dedication

For some beginners, the setback of an automation blunder was so vexing that they turned back to peasant life. For many however, after finding their feet again and after some commiseration with knight companions, they found the strength to press on, carrying the lesson of their lapse forward to not repeat that unforgettable venomous mistake.

In knighthood, the networkers aspired to be so valiant and so adroit that they got the notion that if they’re doing it right, they won’t have difficulties. But difficulties, it turns out, are precisely the best circumstances for learning.

Ergo, the knight automators that persisted and persevered learned to be magnanimous with themselves first and foremost. To uphold future reliability, they resolved to automate around the woes that wounded them. In this way, an indiscretion, once defeated with automation, would never weasel its way back to a second sour encounter.

Chapter 4: Trials

Again and again in the hardship of automating their accomplishments, the networker knights became more astute and attuned to their environment. They began forging their own armaments and virtual squires to do their bidding. They also became less foolish and blunt on their escapades, taking what was useful from other toolsmiths and intrepid automators with software engineering skills.

In their training, the knights’ tenacity grew used to absorbing the lessons of failure, but they grew tired of the disrepute and thorns of trial and error.

As fortune would have it, one day they encountered software sages that spoke of staged replica battles. “But what is the use of repeating the past if we have integrated its teachings?” the knights inquired. “Not the past;” explained a sage, “we stage in preparation for any future change and crusade with many possibilities accounted for and tested.”  Testing and practice ahead of affronting a matter: it sounded ingenious, and so the knights too started building training grounds to rehearse their affairs.

Through testing, stressing and staging, and then automating and strengthening that preparation work too, the knights became known for their fastidious preparation and resulting dependability to conquer new projects and change conquests in production, and now with less havoc than sorely familiar from the past. Soon they became so mighty that their automation could take on more incessant action and circumstances of increasing variety.

Chapter 5: Consummation

Becoming even more zealous about automating trials not to error, the knights decided to consummate their erudition and experience by taking new identities as knight engineers.

See, each knight’s struggles and the many stories shared at the inns gradually made them less wandering and more disciplined and deliberate about their path. Their practice became more ritual and rigorous, and their automating more rewarding. Through their evolving wisdom, they had transformed from errant to engineer, worthy of an order or iron ring.

In due course, through tournaments and their own track record, the knight engineers decided they could not be passionate, proud engineers unless they could also measure and show their success. After all, they vowed reliability to others. And so, they decided to build public displays for every pledge of reliability that would maintain accurate assessments of their results. Constructed with even more automation, these displays objectively told of the current conduct and past performance of the systems in their lands.

Ultimately, so well-known these reliability signs of service became, that the ardent automation engineers are today called network reliability engineers.

Coda: The Canons of NRE

There are two discernable standards to which the knights held.

First, reliability was their utmost value. They determined that at the base of the hierarchy of needs for their service, if the network was not measurably reliable in the ways they promised, that nothing else mattered. Instead of trading off reliability with other values like agility, efficiency and so on, these boons were incidental because they first solved for reliability with automation.

Second, engineering is the best road to take as an automator. The processes and skills of software engineering guide far better than getting consumed with happened-upon technology, tools and APIs. Networking workflows are the battleground in which to practice automating and a brilliant place to start ad hoc, but ultimately the rigors of software engineering will provide strategy and structure for the tactics and tools.

For more on the culture, skills, processes, behaviors and common technologies of network reliability engineer (NRE) roles and teams, look to the famous older cousins: site reliability engineers (SRE). Check out the free online SRE book and my past blogs.


Thanks to the NREs, SREs and network automator forebears for your inspiration, enlightenment and advice in putting together the 5-step journey to automated NetOps. With that guide, may the path of forthcoming automators be smoother and more straightforward than your own story.

Cliff Notes Translation

For those unfamiliar with the famous humor and satire of the book Don Quixote, the above story and style may require some explanation. However, I suspect that those, even the slightest bit attentive to the network automation space, can see the parallel between this metaphor and real life.

Here are the Cliff Notes of the stereotypical story:

Chapter 1: Not looking before they leap into automation, many people progress in a rather ad hoc manner, tuning tribal knowledge and apparent workflows into an opportunity to learn while aggregating manual tasks. This is mostly governed by the tools they know or those put in front of them—which may or may not be the best tool for the jobs at hand.

Chapter 2: Many people get stung by automation in one way or another. Especially with config management automation, one small issue could easily get propagated to a massive blast radius. Change workflows aren’t the safest place to learn, but they are the most infamous for some reason. Automating directly in production when learning and without testing is also a catastrophe waiting to happen.

Chapter 3: Continuous improvement and learning is always the foundation of good automation. People learn that small changes and sometimes small mistakes are where the lessons lay to help discover what to automate next. This can be taken all the way to NRE concepts like chaos engineering, where failure is induced on purpose.

Chapter 4: The rigors of software engineering like test-driven engineering, continuous integration and delivery (CICD) and automated deployment are critical to success and safely moving quickly, instead of trading off reliability for velocity. Automation is learned in pre-production and environments like labs or virtual labs. Over time, engineers also create well-tested in-production continuous response.

Chapter 5: Instead of relying on only general monitoring tools, NREs, like SREs, engineer what matters most: service-level indicators to objectively measure success of higher-order service goals and promises. With the help of data, they manage their way to truly measured continuous improvement.

Podcast on YouTube

image credit blitzmaerker/pixabay