Forging DevOps Culture with Hedge-fund Flair by James Kelly


People: your most important resource and your greatest predicament to DevOps potency.

When the DevOps consultants recess and you need to scale a pilot-project team’s savvy, how do you affect the wider organization with DevOps principles?

Balancing the ingredients of this so-called mentality is trickier than revamping tools and processes. We all know to let tooling lead thy process, and process lead thy tooling. We know the approach is a rolling upgrade, not a mass reboot.

But in the plethora of chapter and verse on DevOps, cultural principles are still scarce: not another definition, nor “automate everything,” nor the trite dev and ops working closely, but real principles of cultural behaviors, their reasoning and an implementation track record.

When I was poring over the pages of Principles by the Steve Jobs of investing, Ray Dalio, I was expecting to learn about life, finance and business from this famed hedge-fund investment and business guru. I did. I also realized that Ray’s high-performing investment and management principles codify common aspects of the DevOps mentality with some new ideas and revisions. And he’s got the CEO and CIO track record to support it, only his C-level ‘I’ stands for investment.

In the spirit of the ‘S’ for sharing in DevOps’s CALMS, Ray has provided a principles manifesto in clear, practical terms. I won’t reveal them all—I encourage you to read the book for that—but here are five of his greatest principles, distilled and steeped with my own perspective for the DevOps anthology.

1. Expedite Evolution, Not Perfection

From the opening biography, we come to know Ray as a continual learner by trial and error. He’s always looking for lessons in failures to carry forward, to do it better next time. He doesn’t regret failures; he values them more than successes because they provide learning.

Ray tells how he wouldn’t be where he is today—one of TIME’s top-100 most influential people in the world—if he had not hit rock bottom, having to let go of all his employees and forced to borrow $4000 from his dad to pay household bills until his family could sell their second car.

Because Ray upcycles painful mistakes into lessons and principles, learning and efficiency compound. He embraces evolutionary cycles, and knows a thing or two about compounding. Our human intelligence allows us to falter and adapt in rapid cycles that compound wisdom, without waiting for effects of generations. This iterative, rather than intellectual, approach performs better with the added benefit that, being experiential, you know it works.

If you’re a DevOps advocate, your Kaizen lightbulb may have lit. Kaizen is continuous learning: as I say, it’s the most important of all continuous practices in DevOps—and in life. Drawing from Ray’s rapid iteration of trial, error, reflect and learn, we see how he pairs Kaizen with Agile, values learning from failure, and takes many small quick steps for faster evolution.

To solidify the value behind this concept pairing, imagine a fixed savings interest rate, but change the cycle. What’s better: 12% annually or 1% monthly? “Periods do Matter” in this Investopedia article will show you that shorter cycles are better than longer ones. Therein lies the technical reasoning for why failing faster leads to better evolution.
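The arithmetic is easy to check for yourself. Here is a minimal sketch in Python, assuming a starting balance of 100 and simple per-period compounding; the function name `grow` is my own, purely for illustration.

```python
def grow(principal: float, rate_per_period: float, periods: int) -> float:
    """Compound a principal at a fixed per-period rate for a number of periods."""
    return principal * (1 + rate_per_period) ** periods


# Same nominal 12% per year, split into different cycle lengths.
annual = grow(100.0, 0.12, 1)    # one 12% cycle
monthly = grow(100.0, 0.01, 12)  # twelve 1% cycles

print(f"12% once a year:   {annual:.2f}")
print(f"1% every month:    {monthly:.2f}")
print(f"Gain from cycling: {monthly - annual:.2f}")
```

The monthly cycle ends the year ahead because each small gain starts earning on itself sooner, which is the same argument for short feedback loops.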

In another great read, 4 Seconds, Peter Bregman exemplifies how to manage learning and failure in business by telling the story of teaching his daughter to ride a bike without training wheels. Managing is knowing just the right time to step in and catch her. Too soon and she won’t learn to rebalance herself. Too late and...wipeout! He explains, “Learning to ride a bike, learning anything actually, isn't about doing it right: it's about doing it wrong and then adjusting. Learning isn't about being in balance, it's about recovering balance. And you can't recover balance if someone keeps you from losing balance in the first place.”

In summary, allow failure, cycle quickly and record the lessons. Deprive your people of the opportunity to fail, and you deprive them of the opportunity to succeed and the opportunity to improve. Breed a culture of rapid feedback and experimentation with guardrails, allowing failure without fatality.

2. Triangulate and Be Actively Open Minded

DevOps aficionados are familiar with “fail fast,” Agile and Kaizen. What’s more interesting about Ray is how he allows for failure and equally reaches for high standards. And beyond technology, excellence is rarely discussed in DevOps circles.

Ray pursues life’s best. “You can have virtually anything you want, you just can’t have everything you want,” he says. Aside from his uncompromising principles in hiring and maintaining excellent people, Ray insists on excellent decision making to instill quality into evolution.

If failure doesn’t form progress, “fail fast” falls flat. Just like machine learning uses new and quality data to improve, our cycle progress is proportional to the quality and newness of abilities and information we use to pursue our goals.

The approach Ray hammers again and again is triangulation: exploring opinions different from his own or the first one offered up. Varying judgments can’t all be right, but understanding different viewpoints is like making a quantum leap in an evolutionary cycle compared to learning from one source.

Ray’s dramatic story of receiving a cancer diagnosis indelibly impresses the importance of triangulation.

Obviously shaken up, he began to estate plan and spend more time with family, but he also consulted three experts. The first two doctors had wildly different prognoses and proposals for treatment or surgery. So he got them speaking with one another; they were respectful in understanding each other’s take, and Ray learned a lot. Finally, the third doctor suggested a regimen of neither treatment nor surgery, but instead to monitor a biopsy of the cells every 6 months because his data showed treatments and surgery didn’t necessarily extend life in cases of cancer of the esophagus.

The three specialists, Ray and his family doctor agreed that this final approach wouldn’t hurt. The learning value of this triangulation aside, the outcome of the story will floor you: Ray’s first biopsy showed that he didn’t have any cancerous cells.

Back to DevOps, the CALMS ‘S’ for sharing is brilliant, but we can push beyond sharing. Actively seeking, not only sharing, information is key to boosting the quality of our decision making and evolution. Companies like Google do this with a manic focus on data, and data is just one avenue of information that may or may not go against our own beliefs.

In general, DevOps leaders must advocate for a culture and habits of active open mindedness, seeking opinions of other believable people and data. Like Ray, assertively explain your own opinions, while maintaining poise and humility to change your mind.

3. Radical Truth and Transparency

At the heart of Ray’s high-performing company, Bridgewater, is a culture of radical truth and transparency. Their patriarch trusts in truth, and loves his people like family, but also equitably protects the whole more than any part. For the greater good he doesn’t hold back in accurate evaluation, root-cause analysis, and openly pointing out problems, even in people. “Love the people you shoot,” he writes, “Tough love is both the hardest and most important kind of love to give.”

The firm keeps an internally available “baseball card” on each employee’s strengths and weaknesses synthesized from evidence-based patterns and a collection of business tools with psychographic-data crunching backends. Weaknesses aren’t misconstrued for weak people, and employees aren’t pigeonholed; the transparency enables orchestrating employees’ best work and identifying their believability in decision making.

For decades, Bridgewater has been using data on people and their track records to do believability-weighted decision making with the help of computers. The company’s “Idea Meritocracy” tools like the Dot Collector Matrix collect data and help teams make believability-weighted decisions, even instantly in meetings. While this was pioneered for investment decisions, Bridgewater later adopted the system for management decisions. Ray also hints he’s working toward offering the system as a service.
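To make the idea of believability weighting concrete, here is a minimal sketch in Python. It is my own toy illustration, not Bridgewater’s system: the function name, the 0-to-1 believability scores and the sample votes are all assumptions.

```python
from collections import defaultdict


def believability_weighted_vote(votes):
    """Tally (choice, believability) pairs and return (winner, share per choice).

    Believability is a hypothetical 0-to-1 score standing in for a
    person's track record on the topic being decided.
    """
    totals = defaultdict(float)
    for choice, believability in votes:
        totals[choice] += believability
    total_weight = sum(totals.values())
    if total_weight == 0:
        raise ValueError("at least one vote needs a non-zero believability")
    shares = {choice: weight / total_weight for choice, weight in totals.items()}
    winner = max(shares, key=shares.get)
    return winner, shares


# Three people say "hold", two say "ship", but the two have stronger track records.
votes = [("ship", 0.9), ("hold", 0.3), ("hold", 0.4), ("hold", 0.2), ("ship", 0.8)]
winner, shares = believability_weighted_vote(votes)
print(winner, {choice: round(share, 2) for choice, share in shares.items()})
```

A plain headcount would pick “hold” here; weighting by track record picks “ship”. The hard part in practice is earning trustworthy believability scores, which the sketch simply takes as given.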

This principle is about being ruthless in demanding integrity, honesty, accuracy and openness. Common workplace biases like loyalty, niceness, confidentiality and secrecy might seem safe or well-intentioned in small contexts, but are ultimately self-defeating of the big-picture success of the whole.

Every person and organization has a unique twist on values and workplace politics, but while Bridgewater’s success speaks volumes, its radically straightforward approach is also reported to be the preference of the techies and millennials who make up many DevOps-forward teams.

Embracing DevOps results in more than dev and ops working together; it means working more closely with the business too. While DevOps leaders can’t control the culture in the wider organization, they can shape the sub-culture of their own teams. Not only is it more manageable on that scale, but this cultural principle and corresponding tools seem a natural fit for IT workers. Just maybe, as the role of IT is growing in most businesses today, the culture might catch on.

4. Be Candid and Fearless, Rather Than Blameless

Blameless post-mortem or retrospective meetings are not uncommon in a DevOps culture.

You can probably guess how Ray might see this differently.

If your culture is blameless, there’s less accountability, so you’re more likely to miss lessons and chances for improvement. It’s not just about fixing the machine either, it’s about helping individuals. And if someone is truly not capable, you could fail to see it if you don’t dispassionately trace the blame.

Accuracy requires great diagnosis, and Ray’s method for root-cause analysis makes Toyota’s 5 Whys look skin deep.

Ray advocates having the people responsible for an investigation report up independently of the area being diagnosed, so there’s no fear of recrimination. “Remember people tend to be more defensive than self-critical. It is your job as a manager to get at truth and excellence, not to make people happy. Everyone's objective must be to get the best answers, not the answers that will make the most people happy.”

Having said that, Bridgewater’s culture also pushes everyone to tell the truth without fear of adverse consequences from admission of mistake.

When an employee missed placing a trade for a client, it ended up costing millions for younger, smaller, less-capitalized Bridgewater. With the whole company watching so to speak, Ray decided not to fire this employee—he knew that would lead to a culture of people hiding their mistakes instead of bringing them to light as soon as possible.

With respect to handling missteps, this hedge funder would admonish blamelessness in favor of candor and staff fearlessness. It has the efficiency of earlier learning and earlier redesign for prevention. It also doesn’t eschew accountability, encouraging individual improvement that eventually lifts the whole team.

5. Management by Machine and Metrics

Techies will appreciate how Ray talks about his business as a machine.

If you have great principles that guide you from your values to your day-to-day decisions, but don't have a way of making sure they're systematically applied, you leave their usefulness to chance. We need to cement culture into habits and help others do so as well. As for systematizing any cultural principle into a process, tool, or both, “it typically takes twice as long,” Ray says, “but pays off many times over because the learning compounds.”

Bridgewater always put investment principles into algorithms and expert systems, and has long since run the rest of their business by software machinery as well.

Is this just the well-worn “automate everything” DevOps call?

Automation advances scale, performance, correctness, consistency and instrumentation. But high-performing businesses like Bridgewater also manage by metrics: they compare outcomes and measurements to goals.

While data-driven decision making is prominent these days, data-driven measurement and accountability is less common. We have KPIs, QBRs and performance reviews, but how many teams are consistently managed by metrics? We more easily look forward than take an objective look in the mirror, even though it’s critical for evolution.

Goal-setting zealots argue that goals must be measurable, and Ray’s advice takes it one step further: Don’t look at the numbers you have and adapt them to your needs. Instead, “start with the most important questions and come up with the metrics that will answer them,” he says. “Remember any single metric can mislead.” Furthermore, like big-data analytics, data garbage in equals information garbage out.
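Starting from questions rather than from available numbers can be sketched in a few lines of Python. Everything here is hypothetical: the `Metric` class, the questions, the goals and the measured values are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    question: str            # the question this metric exists to answer
    name: str
    goal: float
    actual: float
    higher_is_better: bool = True

    def met(self) -> bool:
        """Compare the measured outcome to the goal in the right direction."""
        if self.higher_is_better:
            return self.actual >= self.goal
        return self.actual <= self.goal


# Invented numbers: one question per metric, several metrics per scorecard.
scorecard = [
    Metric("Are we shipping often enough?", "deploys_per_week", goal=10, actual=12),
    Metric("Do our changes break production?", "change_failure_rate",
           goal=0.15, actual=0.22, higher_is_better=False),
    Metric("How fast do we recover?", "mttr_minutes",
           goal=60, actual=45, higher_is_better=False),
]

missed = [m.name for m in scorecard if not m.met()]
for m in scorecard:
    status = "met" if m.met() else "MISSED"
    print(f"{m.question:<34} {m.name:<20} goal={m.goal:<6} actual={m.actual:<6} {status}")
```

Looking at deploy frequency alone, this imaginary team seems healthy; the change failure rate tells a different story, which is the point of never trusting a single metric.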

Be the Change

Ray also says, “An organization is the opposite of a building: its foundation is at the top.”

But we all know stories of change percolating from all levels of organizations, communities and countries. If you’re not a CEO like Ray was, you can still make a meaningful difference bottom-up or managing your own team, leading by example.

You could simply publish your team’s principles, create a tool, or ignite behaviors you want to spread. Of the DevOps people-process-technology, people are your most important resource; so forge the principles of their operating systems: sharpen, tweak, prioritize and balance. With the transformation door open in your digital business and DevOps journey, there’s no better time to make an invaluable mark on culture—in IT and beyond.

image credit Jacob Lund/Shutterstock

New Heroes in the DevOps Saga: DevSecOps and DevNetOps by James Kelly


This article was originally published on September 26 at

The evolution of DevOps is by no means done, but it’s safe to say that there is enough agreement and acceptance to declare it a hero. DevOps has helped glorify IT to the point where it no longer impedes the business, nor is it merely a provider or a partner of the business.

Often IT is the business, or its vanguard for competitive disruption and differentiation.

Splintering the success of this portmanteau hero, we now hear more and more of two trusty sidekicks: DevSecOps and DevNetOps. Still in their adolescence, these tots are frequently misunderstood, are still forming their identities, and still need a lot of development if they’re to enter the IT hall of fame like their forerunner.

Just as the terms look, DevSecOps and DevNetOps are often assumed to be about wrapping DevOps principles around security and networking: operators hope to assuage technical debt and drudgery by automating in proficiency and resiliency. For networking, I’ve covered how there is a lot more to that than coding, but to be sure, these sidekicks certainly espouse operators learning how to develop, while DevOps was equally, if not more, about developers learning to operate.

The Shift Left: SecDevOps and NetDevOps

As if it wasn’t hard enough to tell what DevSecOps and DevNetOps want to be when they grow up, we’ve gone and given them alter egos: SecDevOps (aka “rugged” DevOps) and NetDevOps. Think about them exactly as the words look – it’s about the shift to the left. Left of what?

Traditional DevOps practices focus on business-specific applications development. The development timeline is known as concept to cash, and with all the superpowers of DevOps we try to reduce our enemy: the lead time and repeatable processes between code and cash.

Security and building infrastructure – like networks – were supporting tasks, not revenue-generating nor competitive advantages. Thus, security and networking were far to the right on the timeline with concerns that deal with operational scale, performance and protection.

Today’s shift left propels security and infrastructure considerations earlier on the timeline, into coding, architecture and pre-production systems. It’s a palpable penny-drop amid daily news of security breaches and infrastructure outages causing technology-defined establishments to bleed money and brand equity.

Fill the bucket with cash, but don’t forget to forestall the leaks!

DevOps and Infrastructure: Challenge and Opportunity

Automation sparks have flown over the proverbial wall into the camp of I&O pros. Operators trading physical for virtual, macro for micro, converged for composed, and configuration for code is proof that the fire has caught security and networking. Controlling the burn now is key, so that healthier skills and structures arise in place of the I&O dogma and duff. Fortunately, this is precisely the destiny for our newfound heroes, DevSecOps and DevNetOps.

However, doing DevSecOps and DevNetOps, embracing security and networks as code, we mustn’t be so credulous as to forget the formidable DevOps practices and patterns that need transforming along the ultimate automation journey. Testability, immutability, upgradability, traceability, auditability, reliability, and other -abilities are not straightforward to achieve.

Discounting “aaS” technology consumed as a service, a fundamental challenge to innovating SecOps and NetOps, compared to application ops, is that applications are crafted and built; security and networking solutions are mostly still bought and assembled.

Security and network infrastructure as code is something that needs to be co-created with the vendors. Other than in the cloud, it will take a while before security and networking systems are driven API-first, and are redesigned and broken down to offer simulation, composition and orchestration with scale and resilience.

While this will land first in software-defined infrastructure, there is still a ways to go to manage most software-defined security and networking systems with continuous practices of artifact integration, testing, and deployment. Hardware and embedded software will be even more challenging.

Finding Strength in Challenge

So on one hand, DevOps is evolving with security and networking shifting left. On the other hand, traditional security and networking ops are transforming with DevOps principles.

Is the ultimate innovation to squeeze out those traditional operations altogether? Does NetDevOps + DevNetOps = DevOps?

There is a parallel train of thought and debate, with success on both sides. Purist teams cut out operations with the “you build it, you run it” attitude. Other companies like Google have dedicated operations specialist teams of SREs. While the SRE reporting structure is isolated, SRE jobs are very integrated with those of development teams. It’s easy to imagine the purist approach, subsuming security and networking into DevOps practices, but only if we assume the presence of cloud infrastructure and services as a platform. Even then, there is still substantiation for the SRE.

Layers below, however, somebody still needs to build the foundations of the cloud IaaS and data center hardware. As they say, “Even serverless computing is not actually serverless.”

Underpinning the clouds are data centers. And then there’s transport, IoT, mobile or other secure networks to and between clouds. In these areas, it’s obvious there is a niche for our two trusty sidekicks, DevSecOps and DevNetOps, to shake up ops culture and principles. These two heroes can rescue software-defined and physical infrastructure from the clutches of so many anti-pattern evils, like maintenance windows and change controls (ahem, it’s called a “commit”).

We may not require rapid experimentation in our infrastructure, but we would warmly welcome automated deployments, automated updates, failure and attack testing drills, and intent-driven continuous response. They will boost resiliency and optimization for the business and peace of mind for the builders.

Teams operating security, networks, and especially clouds, need to honor and elevate DevSecOps and DevNetOps, so that on the journey now afoot, our teams and our new heroes may realize their potential.


Good Habits to Make the Multi-Cloud Work For You – Part 2 of 2 by James Kelly


This article was originally published on September 12 at Data Center Knowledge

In my previous article, I talked about the state of infatuation with hybrid and multi-cloud environments. Would you be surprised that in the stresses and mania surrounding IT cloud strategy, some folks fixate more on the playing field than the game itself? You probably already know that you’ve got to get your head in the game in this unforgiving age, and a winning strategy for digitally speeding and feeding the business across the multi-cloud is not: taste the rainbow; it’s choosing and consuming cloud wisely.

Too bad that how you do so isn’t obvious, and as if it wasn’t difficult enough to anticipate technology turns ahead, there are so many captivating cloud services that might lead you down treacherous roads to traps and debt. But there are also well-known tactics emerging that you can model to ready and steady your organization for change and success. If, like most, you have already begun your journey, you’re picking these up along the way and adjusting your habits as you go.

You know how bad habits are easy to form and hard to live with? Similarly, it’s very easy to jump into multi-cloud or unwittingly let it happen to you. At this precipice, the warning signs and early stories of cloud lock-in, overwhelming multiple-cloud context switches, runaway expenses, and situational blindness, are hopefully enough to grab your attention. Multi-cloud is inevitable; these fatalities are not.

A multi-cloud platform is a powerful environment, and it requires proper preparation so you can control it, instead of it controlling you. With that, here are four of the best preparations I’ve seen, like good habits that are hard to form, but easy to live with.

1. Unify Your Toolchain

In the eternal deluge and disruption of new tech tooling and systems, remember those good old-fashioned IT values of standardization and consolidation? Don’t throw those babies out with the ITIL bathwater.

As you embrace cloud and bimodal IT with new and improved tools, you might loosen the reins on your traditional values, using public cloud and building private cloud infrastructure alongside your physical and virtualized data centers. In loosening the reins or spinning out agile side projects, just watch out for the trap of hasty developers rolling their own stack or going stackless/serverless, only to get caught in a web of proprietary cloud services.

Don’t rush into an obstinate knee-jerk reaction to block this either. Think of a unified toolchain effort as one with the developers to rationalize a base devops pipeline, cluster, and middleware stack that could serve 80 percent of projects.

  • Your tools need to work on any cloud infrastructure, and if they can work with your legacy infrastructure, even better.
  • Free yourself from lock-in to cluster and pipeline orchestration tools and infrastructure-as-code lifecycle management: keep them untethered from any specific underlying IaaS, with portable shims like Terraform.
  • While you don’t want to throttle developers back from using services outside of your stack – they’ll go around you anyway – encourage managed open-source-based services. Then incorporate such services into your middleware toolchain as it matures. Tools like Helm make it easier than ever before to manage services yourself.
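One way to picture a toolchain that stays untethered from any one IaaS is a thin interface with a swappable adapter per cloud. The Python sketch below is only an illustration of that pattern: every class, function and URL scheme in it is made up, and no real cloud SDK is being called.

```python
from abc import ABC, abstractmethod


class ClusterProvider(ABC):
    """Hypothetical seam between the shared toolchain and any one cloud."""

    @abstractmethod
    def create_cluster(self, name: str, nodes: int) -> str:
        """Provision a cluster and return an identifier for it."""


class CloudAProvider(ClusterProvider):
    def create_cluster(self, name: str, nodes: int) -> str:
        # A real adapter would call that vendor's API here.
        return f"cloud-a://{name}?nodes={nodes}"


class OnPremProvider(ClusterProvider):
    def create_cluster(self, name: str, nodes: int) -> str:
        # Legacy infrastructure sits behind the same interface.
        return f"onprem://{name}?nodes={nodes}"


def bootstrap_pipeline(provider: ClusterProvider, project: str) -> str:
    """Toolchain code depends only on the interface, never on a vendor."""
    return provider.create_cluster(f"{project}-ci", nodes=3)


for provider in (CloudAProvider(), OnPremProvider()):
    print(bootstrap_pipeline(provider, "shop"))
```

Swapping clouds then means writing one new adapter rather than rewriting the pipeline, which is roughly the job that portable tools do for you at a much larger scale.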

If you’re a lean IT shop, let’s face it, following this to the letter may take you away from getting to market ASAP. Maybe you’re a startup or in that mode? You don’t just want, but need, to focus on developing your core competitive technology, not a portable multi-cloud toolchain.

How do you balance moving fast, employing low-hanging SaaP, with the concern of vendor and architectural lock-in?

If a tool is a competitive differentiator, then you should probably build it. Otherwise, remember there are a lot of open-source tools that are glued together with reference implementations of other open-source tools: large projects like Kubernetes and Spinnaker are easy to adopt with a bunch of pre-canned sensible defaults. Another option is to choose managed open-source services that are more easily insourced later or offered by multiple cloud vendors.

Finally, software design is probably the most important and challenging factor of all. Architecting for scale is obvious, but flexibility enables business agility; so consider not only today’s lock-in, but also getting locked out of a competitive advantage tomorrow. Assembling API-driven (capital ‘S’) Services from micro-services is a well-established pattern to do this, and I’d recommend software alchemists investigate evolutionary architecture from ThoughtWorks for more wisdom.

2. Connect Your Clouds

Connection was a given in the world of hybrid cloud. That still holds true. However, cloud bursting, the most bombastic of all use cases for hybrid cloud, is the least common. Multiple clouds need to be connected together for many more realistic and common use cases:

  • Imagine pipeline automation that includes environments or steps split across clouds. Dev/test can happen anywhere, but you may have higher requirements for staging/production.
  • Secure data replication for warehousing or distributed applications, and backups for disaster recovery and avoidance.
  • Split application tiers, where there are different non-functional requirements for the various application tiers like sovereignty, security, scale, performance, etc. that must be met in various geographies or optimized with split economics. Some applications may be split because of functional requirements too, because certain clouds have unique advantages that others can’t reproduce.

Such cloud interconnections demand higher security than using the internet, and often clouds simply require a secure connection back to your enterprise staff or users. Beyond security, unique routing and legacy layer-2 unicast or even multicast connectivity could be an application requirement. While there are cloud-vendor tools for basic security and networking, you can also search for your own software-defined and virtualized security and networking solutions that are agnostic to any cloud infrastructure, unifying this toolchain too and incorporating it into your infrastructure-as-code policies.

3. Harmonize, Unify and Simplify Policy

If you have software deployed or scaled across multiple cloud locations, the configuration, monitoring and automatic-response systems may get unwieldy unless you seek to elevate your orchestration across clouds. Of course, there are cloud management platforms for this. With or without them, you can also do some multi-cloud management with your own centrally harmonized configurations and management as code. A further step might unify configuration and management with global controllers, but with the track record of humans causing most errors, be careful with your blast radius for a fat-finger typo.
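Here is a minimal sketch in Python of harmonized configuration as code with a limited blast radius. It assumes one shared base config, small per-cloud overrides, and a rollout in waves that halts before touching a wave containing an invalid config. The target names, settings and validation rules are all invented for illustration.

```python
BASE = {"log_level": "info", "alert_cpu_pct": 85, "replicas": 3}
OVERRIDES = {"on-prem": {"log_level": "debug"}, "cloud-a": {"replicas": 5}, "cloud-b": {}}
WAVES = [["on-prem"], ["cloud-a", "cloud-b"]]  # small canary wave first


def render(base, overrides, target):
    """Merge the shared base with one target's overrides."""
    return {**base, **overrides.get(target, {})}


def validate(config):
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    if not 0 < config["alert_cpu_pct"] <= 100:
        problems.append("alert_cpu_pct must be between 1 and 100")
    if config["replicas"] < 1:
        problems.append("replicas must be at least 1")
    return problems


def rollout(base, overrides, waves, apply):
    """Apply configs wave by wave, halting before any wave that fails validation."""
    applied = []
    for wave in waves:
        rendered = {target: render(base, overrides, target) for target in wave}
        if any(validate(config) for config in rendered.values()):
            break  # nothing in this wave or later waves is touched
        for target, config in rendered.items():
            apply(target, config)
            applied.append(target)
    return applied


# A fat-finger typo: 850 instead of 85 for one cloud.
typo = {**OVERRIDES, "cloud-b": {"alert_cpu_pct": 850}}
print(rollout(BASE, OVERRIDES, WAVES, lambda target, config: None))
print(rollout(BASE, typo, WAVES, lambda target, config: None))
```

With the typo in place, only the canary wave is applied and the wave containing the bad value is left alone. A global controller without waves and validation would have pushed the mistake everywhere at once.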

Another trend in provisioning models and APIs is abstraction, which can be at many levels like multi-cloud orchestration, individual stack, pipeline or application. By making things more intuitive and concise for humans and leaving the execution to your software machinery and machine learning, you’re likely to improve the lives of your operators, your applications and your application users.

4. Hold Up Before You Speed Up

Cloud will move you faster, and if that’s not enough cause for care, even with no IT strategy, you’ll still end up with multi-cloud in no time: multiple owners, vendors, regions, and availability zones. The increased danger is that multi-cloud can multiply messes and mistakes. Preparation in building a platform is key, and like many things that take a bit of time upfront, it’s worth the effort in the long run.

Consciously holding up isolated quick gains, and seeing them as short-term one-offs that generally beget debt down the road, is the critical gambit that will return long-term payouts in adaptability and speed atop a united multi-cloud platform.

IT leaders know that digital transformation is a journey, not a destination. With continuous learning, the first of all healthy continuous IT practices, mastering the tactics and good habits for structuring your multi-cloud platform and using the ins and outs of devops atop it can be fun and rewarding. It allows safe acceleration and agility for IT, and it’s essential to sustainably advance the speeds and smarts of your business.