Good Habits to Make the Multi-Cloud Work For You – Part 2 of 2 / September 12, 2017 by James Kelly

This article was originally published on September 12 at Data Center Knowledge http://www.datacenterknowledge.com/industry-perspectives/good-habits-make-multi-cloud-work-you-part-2

In my previous article, I talked about the state of infatuation with hybrid and multi-cloud environments. Would you be surprised that in the stresses and mania surrounding IT cloud strategy, some folks fixate more on the playing field than the game itself? You probably already know that you’ve got to get your head in the game in this unforgiving age, and a winning strategy for digitally speeding and feeding the business across the multi-cloud is not: taste the rainbow; it’s choosing and consuming cloud wisely.

Too bad that how you do so isn’t obvious, and as if it wasn’t difficult enough to anticipate technology turns ahead, there are so many captivating cloud services that might lead you down treacherous roads to traps and debt. But there are also well-known tactics emerging that you can model to ready and steady your organization for change and success. Like most, if your journey has already begun, you’re picking these up along the way and adjusting your habits as you go.

You know how bad habits are easy to form and hard to live with? Similarly, it’s very easy to jump into multi-cloud or unwittingly let it happen to you. At this precipice, the warning signs and early stories of cloud lock-in, overwhelming multiple-cloud context switches, runaway expenses, and situational blindness, are hopefully enough to grab your attention. Multi-cloud is inevitable; these fatalities are not.

A multi-cloud platform is a powerful environment, and it requires proper preparation so you can control it, instead of it controlling you. With that, here are four of the best preparations I’ve seen, like good habits that are hard to form, but easy to live with.

1. Unify Your Toolchain

In the eternal deluge and disruption of new tech tooling and systems, remember those good old-fashioned IT values of standardization and consolidation? Don’t throw those babies out with the ITIL bathwater.

As you embrace cloud and bimodal IT with new and improved tools, you might lessen the reins on your traditional values, using public cloud and building private cloud infrastructure alongside your physical and virtualized data centers. In loosening the reins or spinning out agile side projects, just watch out for the trap of hasty developers rolling their own stack or going stackless/serverless, only to get caught in a web of proprietary cloud services.

Don’t rush an obstinate knee-jerk to block this neither. Think of a unified toolchain effort as one with the developers to rationalize a base devops pipeline, cluster, and middleware stack, that could serve 80 percent of projects.

Your tools need to work on any cloud infrastructure, and if they can work with your legacy infrastructure, even better.
Freeing yourself from lock-in of cluster and pipeline orchestration tools and infrastructure-as-code lifecycle management: keep them untethered from any specific underlying IaaS, with portable shims like Terraform.
While you don’t want to throttle developers back from using services outside of your stack – they’ll go around you anyway – encourage managed open-source-based services. Then incorporate such services into your middleware toolchain as it matures. Tools like Helm, make it easier to manage services yourself, more than ever before.

If you’re a lean IT shop, let’s face it, following this to the letter may take you away from getting to market ASAP. Maybe you’re a startup or in that mode? You don’t just want, but need, to focus on developing your core competitive technology, not a portable multi-cloud toolchain.

How do you balance moving fast, employing low-hanging SaaP, with the concern of vendor and architectural lock-in?

If a tool is a competitive differentiator, then you should probably build it. Otherwise, remember there are a lot of open-source tools that are glued together with reference implementations of other open-source tools: large projects like Kubernetes and Spinnaker are easy to adopt with a bunch of pre-canned sensible defaults. Another option is to choose managed open-source services, that are more easily insourced later or offered by multiple cloud vendors.

Finally, software design is probably the most important and challenging factor of all. Architecting for scale is obvious, but flexibility enables business agility; so consider not only today’s lock-in, but also getting locked out of a competitive advantage tomorrow. Assembling API-driven (capital ‘S’) Services from micro-services is a well-established pattern to do this, and I’d recommend software alchemists investigate evolutionary architecture from ThoughtWorks for more wisdom.

2. Connect Your Clouds

Connection was a given in the world of hybrid cloud. That still holds true. However, cloud bursting, the most bombastic of all use cases for hybrid cloud, is the least common. Multiple clouds need to be connected together for many more realistic and common use cases:

Imagine pipeline automation that includes environments or steps split across clouds. Dev/test can happen anywhere, but you may have higher requirements for staging/production.
Secure data replication for warehousing or distributed applications, and backups for disaster recovery and avoidance.
Split application tiers, where there are different non-functional requirements for the various application tiers like sovereignty, security, scale, performance, etc. that must be met in various geographies or optimized with split economics. Some applications may be split because of functional requirements too because certain clouds have unique advantages that others can’t reproduce.

Such cloud interconnections demand higher security than using the internet, and often clouds simply require a secure connection back to your enterprise staff or users. Beyond security, unique routing and legacy layer-2 unicast or even multicast connectivity could be an application requirement. While there are cloud-vendor tools for basic security and networking, you can also search for your own software-defined and virtualized security and networking solutions that are agnostic to any cloud infrastructure, unifying this toolchain too and incorporating it into your infrastructure-as-code policies.

3. Harmonize, Unify and Simplify Policy

If you have software deployed or scaled across multiple cloud locations, the configuration, monitoring and automatic-response systems may get unwieldly unless you seek to elevate your orchestration across clouds. Of course, there are cloud management platforms for this. With or without them, you can also do some multi-cloud management with your own centrally harmonized configurations and management as code. A further step might unify configuration and management with global controllers, but with the track record of humans causing most errors, be careful with your blast radius for a fat-finger typo.

Another trend in provisioning models and APIs is abstraction, which can be at many levels like multi-cloud orchestration, individual stack, pipeline or application. By making things more intuitive and concise for humans and leaving the execution to your software machinery and machine learning, you’re likely to improve the lives of your operators, your applications and your application users.

4. Hold Up Before You Speed Up

Cloud will move you faster, and if that’s not enough cause for care, even with no IT strategy, you’ll still end up with multi-cloud in no time: multiple owners, vendors, regions, and availability zones. The increased danger is that multi-cloud can multiply messes and mistakes. Preparation in building a platform is key, and like many things that take a bit of time upfront, it’s worth the effort in the long run.

Consciousness to hold up isolated quick gains as short-term one-offs, that generally beget debt down the road, is the critical gambit that will return long-term payouts in adaptability and speed atop a united multi-cloud platform.

IT leaders know that digital transformation is a journey, not a destination. With continuous learning, the first of all healthy continuous IT practices, mastering the tactics and good habits for structuring your multi-cloud platform and using the ins and outs of devops atop it, can be fun and rewarding. It allows safe acceleration and agility for IT, and it’s essential to sustainably advance the speeds and smarts of your business.

Multi-cloud is A Reality, Not A Strategy - Part 1 of 2 / August 25, 2017 by James Kelly

This article was originally posted on August 25th, 2017 on Data Center Knowledge at http://www.datacenterknowledge.com/archives/2017/08/25/multi-cloud-reality-not-strategy-part-1/

So you’re doing cloud, and there is no sign of slowing down. Maybe your IT strategies are measured, maybe you’re following the wisdom of the crowd, maybe you’re under the gun, you’re impetuous or you’re oblivious. Maybe all of the above apply. In any case, like all businesses, you’ve realized the future belongs to the fast and that cloud is the vehicle for your newly dubbed software-defined enterprise: a definition carrying onerous, what I call, “daft pressures” for harder, better, faster, stronger IT.

You may as well be solving the climate-change crisis because to have a fighting chance today, it feels like you have to do everything all at once.

Enduring such exploding forces and pressures, most crapulent enterprises simultaneously devour cloud, left, right and center; the name for said zeitgeist: multi-cloud. If you’ve not kept abreast, overwhelming evidence – like years of the State of the Cloud Report and other analyst research – show that multi-cloud is the present direction for most enterprises. But the name “multi-cloud” would suggest that using multiple clouds is all there is to it. Thus, in the time it takes to read this article, any technocrat can have their own multi-cloud. It sounds too easy, and as they say, “when life looks like Easy Street, danger is at the door.”

Apparition du Jour

While some cloud pundits forecasting the future of IT infrastructure will shift from hybrid cloud to multi-cloud or go round and round in sanctimonious circles, those who are perceptive have realized that multi-cloud pretense is merely a catch-all moniker, rather than an IT pattern to model. As with many terms washed over too much, “multi-cloud” has quickly washed out. It’s delusional as a strategy, and isn’t particularly useful, other than as le mot du jour to attract attention. How’s that for hypocrisy?

Just like you wouldn’t last on only one food, it’s obvious that you choose multiple options at the cloud buffet. How you choose and how you consume is what will make the difference to if you perish, survive or thrive. Choosing and consuming wisely and with discipline is a strategy.

Is the Strategy Hybrid Cloud?

Whether you want to renew your vows to hybrid cloud or divorce that term, somewhere along the way, it got confused with bimodal, hybrid IT and ops. Nonetheless, removing the public-plus-private qualification, there was real value in the model to unite multiple clouds for one purpose. It’s precisely that aspect that would also add civility to the barbarism of blatantly pursuing multi-cloud. However, precipitating hybrid clouds for a single application with, say, the firecracker whiz-bang of the lofty cloud bursting use case, is really a distraction. The greater goal of using hybrid or multiple clouds should be as a unified, elastic IT infrastructure platform, exploiting the best of many environments.

Opportunities for Ops

Whether public or private, true clouds (which are far from every data center—even those well virtualized) ubiquitously offer elastic, API-controllable and observable infrastructure atop which devops can flourish to enable the speeds and feeds of your business. An obvious opportunity and challenge for IT operations professionals is in building true cloud infrastructure, but if that’s not in the cards for your enterprise, there is still even greater opportunity in managing and executing the strategy for the wisdom and discipline in choosing and consuming cloud.

In the new generation of IT, ops discipline doesn’t mean being the “no” person, it means shaping the path for developers, with developers. This is where devops teaches us that new fun exists to engineer, integrate and operate things. It’s just that they are stacks, pipelines and services, not servers. Of course, to ride this elevator up, traditional infrastructure and operations pros need to elevate their skills, practicing devops kung-fu across the multi-cloud and possibly applying it in building private clouds too.

Imagine what would happen if we exalted our trite multi-cloud environment with strategies and tactics to master it? We can indeed extract the most value from each cloud of choice, avoid cloud lock-in, and ensure evolvability, but you probably wouldn’t be surprised to learn that there are also shortcuts with new technical debt spirals lurking ahead.

To secure the multi-cloud as a platform serving us, and not the other way around, some care, commitment, time and work is needed toward forming the right skills, habits, and using or building tooling, services, and middleware. There is indeed a maturing shared map leading to portability, efficiency, situational awareness and orchestration. In my next article, we’ll examine some of the journey and important strategies to use the multi-cloud for speed with sustainability. Stay tuned.

This article was originally posted on August 25th, 2017 on Data Center Knowledge at http://www.datacenterknowledge.com/archives/2017/08/25/multi-cloud-reality-not-strategy-part-1/

It’s the End of Network Automation as We Know It (and I Feel Fine) / July 25, 2017 by James Kelly

This article was originally posted on July 5th on The New Stack at https://thenewstack.io/end-network-automation-know-feel-fine/

Network automation does not an automated network make. Today’s network engineers are frequently guilty of two indulgences. First, random acts of automation hacking. Second, pursuing aspirational visions of networking grandeur — complete with their literary adornments like “self-driving” and “intent-driven” — without a plan or a healthy automation practice to take them there.

Can a Middle Way be found, enabling engineers to set achievable goals, while attaining the broader vision of automated networks as code? Taking some inspiration from our software engineering brethren doing DevOps, I believe so.

How Not to Automate

There’s a phrase going around in our business: “To err is human; to propagate errors massively at scale is automation.”

Automation is a great way for any organization to speed up bad practices and #fail bigger. Unfortunately, when your business is network ops, the desire to be a cool “Ops” kid with some “Dev” chops — as opposed to just a CLI jockey — will quickly lead you down the automation road. That road might not lead you to those aspirational goals, although it certainly could expand your blast radius for failures.

Before we further contemplate self-driving, intent-driven networking, and every other phrase that’s all the rage today (although I’m just as guilty of such contemplation as anyone else at Juniper), we should take the time to define what we mean by “proper” in the phrase, “building an automated network properly.”

If you haven’t guessed already, it’s not about writing Python scripts. Programming is all well and good, but twenty minutes of design often really does save about two weeks of coding. To start hacking at a problem right away is probably the wrong approach. We need to step back from our goals, think about what gives them meaning, apply those goals to the broader picture, and plan accordingly.

To see what is possible with automation, we should look at successful patterns of automation outside of networking and the reasons behind them, so we may avoid the known bad habits and anti-patterns, and sidestep avoidable pitfalls. For well-tested patterns of automation, we needn’t look any further than the wealth of knowledge and experience in the arena of DevOps.

It matters what we call things. For better or worse, a name focuses the mind. The overall IT strategy to improve the speed, resilience, quality and intelligence of our applications is not called automation or orchestration. While ITIL volumes make their steady march into museums, the new strategy to enable business speed and smarts is incontrovertible, and it’s called DevOps.

Initially that term may invoke a blank page or even a transformation conundrum. But you can learn what it means to practice successful DevOps culture, processes, design, tooling and coding. DevOps can define your approach to the network, and is why we ought not promote network automation (which could focus the mind on the wrong objectives) and instead talk about DevNetOps as the application of patterns of DevOps applied to networking.

Networks as Code

The idea of infrastructure-as-code (IaC) has been around for a while, but surprisingly has seldom been applied to networking. Juniper Networks (where I hang my hat) and other networking vendors like Apstra have made some efforts over the years to move folks in this direction, but there is still a lot of work to do. For example, Juniper has had virtual form factors of most series of hardware systems, projects like Junosphere for network modeling in the cloud (many of us now use Ravello), and impressive presentations on IaC and professional services consulting. Juniper’s senior marketing director Mike Bushong (formerly with Plexxi) wrote about the network as code back in 2014.

IaC is generally well applied to cloud infrastructure, but it’s way harder to apply to bare metal. For evidence of this, just look at Triple-O, Kubernetes on Kubernetes, or Kubernetes on OpenStack on Kubernetes! That bottom, metal layer is quite the predicament.

To me, this means in networking, we should be easily applying IaC to software defined networking (SDN). But can we apply it to our network devices and manage the physical network? I asked Siri, and she said it was a trick question. As an armchair architect myself, I don’t have all the answers. But as I see it, here are some under-considered aspects for designing networks as code with DevNetOps:

1. Tooling

In tech, everyone loves shiny objects, so let’s start there. Few network operators — even those who have learned some programming — are knowledgeable about the ecosystem of DevOps tooling, and few consider applying those same tools to networking. Learning Python and Ansible is just scratching the surface. There is a vast swath of DevOps tools for CI/CD, site reliability engineering (SRE), and at-scale cloud-native ops.

2. Chain the tool chain: a pipeline as code

When we approach the network as code, we need to consider network elements and their configurations as building blocks created in a code-development pipeline of dev/test/staging/production phases. Stringing together this pipeline shouldn’t be a manual process; it should be mostly coded to be automatic.

As with software engineering, there are hardware and foundational software elements with network engineering, such as operating systems that the operator will not create themselves, but rather just configure and extend. These configurations and extensions, with their underlying dependencies, can be built together, versioned, tested, and delivered. Thinking about the network as an exercise in development, automation should start in the development infrastructure itself.

3. Immutable infrastructure

Virtualization and especially containers have made the concept of baking images very accessible, and immutable infrastructure popular. While there is still much work to do with network software disaggregation, containerization, and decoupling of services, there are many benefits of adopting immutable infrastructure that are equally applicable to networking. Today’s network devices are poster children for config drift, but to call them “snowflakes” would be an insult to actual snowflakes.

Applying principles of immutable infrastructure, I imagine a network where each device PXE-boots into a minimal OS and runs signed micro-service building blocks. Each device has declarative configs, decoupled secret management and rotation, and logging and other monitoring data with good overall audit ability and traceability — all of which is geared to take the network off the box ASAP.

Interestingly, practices such as SSH’ing into boxes would be rendered impossible, and practices that “savvy” network automators do today like running Ansible playbooks against an inventory of devices would be banished.

4. Upgrades

Upgrades to network software and even firmware/microcode on devices could be managed automatically, by means of canary tests and rolling upgrade patterns. To do this on a per-box or per-port basis, or at finer levels of flows or traffic-processing components, we need to be able to orchestrate traffic balancing and draining.

If that sounds complex, we can make things simpler. We could treat devices and their traffic like cattle instead of pets, and rely on their resilience. Killing and resurrecting a component would restart it with a new version. While this is suitable for some software applications, treating traffic as disposable is not yet desirable for all network applications and SLAs. Still, it would go a long way toward properly designing for resilience.

5. Resilience

One implication of all this is the presence of redundancy in the network paths. As with any networking component, that’s very important for resilience. Drawing inspiration from scale-out architectures in DevOps and the microservices application model, redundancy and scale would go hand-in-hand by means of instance replication. So redundancy would neither be 1:1 nor active-passive. We should always be skeptical of architectures that include those anti-patterns.

Good design would tolerate a veritable networking chaos monkey. Burning down network software would circuit-break to limit failures. Killing links and even boxes, we would quickly re-converge as we often do today, but dead boxes, dead SDN functions or dead virtual network functions would act like phoenix servers, rising back up or staying down in case of repeated failures or detected hardware failures.

The pattern for preventing black-swan event failures is to practice forcing these failures, and thus practice automating around them, so that the connectivity or other network service and its performance is tolerant and acceptably resilient on its own SLA measuring stick, whatever the meta-service in question may be.

Doing DevNetOps

In each one of these above topics lies much more complexity than I will dive head-first into here. By introducing them here, my aim has been to demonstrate there are interesting patterns we may draw from, and some operators are doing so already. If you’ve ever heard the old Zen Buddhist koan of the sound of one hand clapping, that’s the sound you’re likely to hear from your own forehead, once the obviousness of applying DevOps to DevNetOps hits you squarely in the face.

Just as the hardest part of adopting DevOps is often cited to be breaking off one manageable goal at a time and focusing on that, I think we’ll find the same is true of DevNetOps. Before we even get there in networking, I think we need to scope the transformation properly of applying DevNetOps to the challenges and design of networking, especially with issues of basic physical connectivity and transport.

While “network automation” leads the mind to jump to things like applying configuration management tooling and programming today’s manual tasks, DevNetOps should remind us that there is a larger scope than mere automation coding. That scope includes culture, processes, design and tools. Collectively, they may all lead us to a happier place.

This article was originally posted on July 5th on The New Stack at https://thenewstack.io/end-network-automation-know-feel-fine/

Title image of a Bell System telephone switchboard, circa 1943, from the U.S. National Archives.