
The Wisdom of the Giants in SD-WAN by James Kelly


Podcast on YouTube

When it comes to your branch, how can SD-WAN upgrade without also uprooting? Tall trees may tell.

A Branch’s Reach Should Not Exceed Its Grasp

They are the showy exterior of your organization: your branches, your stores, your schools, your sites. But as networking domains go, these are the humblest of locations, with little or no networking expertise and sophistication. In the past, your networking grasp was feeble in the far reaches of the branch.

Now the story goes that SD-WAN is changing that. It’s putting the prowess of your brightest networking pros and the autopilot automation of SDN steadily into these network extremities. But this is only the beginning of the story. So allow me to disabuse you of the enrapturing, shining fruits and perfumed flowers of the branch that is SD-WAN today.

You have been tricked. This was not the story, merely the first act.

Focusing on SD-WAN, my friends, we see the fruits. Take a step back and look wider. Now we see the tree. Now we see the roots.

One Tree: Everything Is Connected

The levity with which some people and vendors approach branch networking with SD-WAN quickly fades when they realize the simple truth that, beyond the branch, everything is connected. It is one tree.

Ungrounded SD-WAN solutions ignore what’s below the branches and clouds at the treetops. But approaching enterprise networking grounded in reality, you see the whole picture: your wide-area network is not only your remote and branch connectivity. Everything is connected between branch sites, campuses, headquarters, data centers and, certainly today, multicloud—SaaS and your own cloud-based applications.

You would never be so credulous as to protect only a tree’s exterior and believe it safe from harm. And no one would mistake strung-up ornaments for the tree itself. How about vines overlaying the tree? Yes, they could reach the branches. But they still aren’t your tree, nor its species, and they cannot be grafted on. This is SD-WAN for dummies and by decoration, but it parallels some SD-WAN propaganda.

SD-WAN savvy would never use proprietary control- and data-plane protocols that won’t graft and interoperate with your wider network. Security would not be a secondary sheath, but foremost in the immune system of the network. Add-on network functions like VNFs would be symbiotic and seamless with network design and management. And other virtualized branch services would felicitously fold into the SD-Branch canopy or NFV-centers in nearby limbs.

This is multicloud and multi-site thinking, end to end and top to bottom. While it’s natural given Juniper’s portfolio, it’s quite different from the thinking of some other SD-WAN vendors, whose niche interests I leave to be addressed with the words of a fine woodsman: “When we try to pick out anything by itself, we find it hitched to everything else in the universe.” -John Muir

Layer Upon Layer

Just under the bark are the newest layers of a tree. Pushing out and up, a tree’s trunk core and deep roots nourish new growth and give it strength to endure the tests of time.

Drawing a parallel to networking growth and longevity, you may have observed this strategy at Juniper, where investment is steadfast in Junos and our platforms. Customers enjoy the benefits of this continuity: investment protection and the ability to simply extend and build on base systems with SDN like SD-WAN, employing our NFX, SRX, and MX Series systems and interoperating with the routing of all Junos-powered platforms.

You may observe another approach in the industry, too: vendors that continually force rip-and-replacement of systems. There are sales motivations for this, but another cause runs deeper...

When you engineer something anew, you usually architect for a minimum viable product and getting to market quickly. Take a tech startup, for example: it’s faster to build software as a monolith or a mesh of purely cloud services than to construct a DevOps pipeline, platform architecture, and microservices that scale. Taking that MVP route, they will eventually throw away their early work to redo it at scale, with extensibility, reliability, and economics in mind. This is invisible to customers of SaaS companies, but when translated to packaged-and-sold hardware and software systems, this architecture fetters customers with technical debt and forces rip-and-replace inefficiency.

In networking, it’s wiser to sow scale and flexibility into the seeds of your base networking technologies and topologies. Architecting for growth in layers allows you to scale your rootstock and your core, so to speak, evolving today’s investments tomorrow.

Evolvable architecture is how the cloud giants design their software, and happens to be how Juniper designs our portfolio. This is why we did not acquire an SD-WAN solution. And this is why we built SD-WAN backward: we tackled the hard problems first (multi-tenancy, scale, reliability, NFV, etc.), so we could design once and for all, and offer the simplicity of one solution.

Reach for the Clouds

With so many SD-WAN solutions in the market, most built in haste, you might imagine the winds of technology change will cause many to snap and topple. They weren’t designed beyond SD-WAN connections for the branch and cloud endpoints.

The wisdom of giant trees would suggest that as you reach for the multicloud, strength lies in swaying and adapting with the winds of change, and evolving and using the strength of the whole.

About Juniper Contrail SD-WAN

Juniper’s newly dubbed Contrail SD-WAN solution and its component parts were designed to be inherently secure from within and to scale to support thousands of tenants, each with thousands of sites. It was designed for a journey where SD-WAN is merely the first act of your transformation story. So it will grow with you to SD-Branch for site virtualization and consolidation, and even incorporate NFV-cloud services into your network service. Of course it’s multicloud-ready, connecting up to the likes of AWS, but just as importantly, it ties right into your core WAN routing today, from your campuses and data centers.

Podcast on Soundcloud

image credit MichaelGaida/pixabay

It’s the End of Network Automation as We Know It (and I Feel Fine) by James Kelly

This article was originally posted on July 5th on The New Stack at https://thenewstack.io/end-network-automation-know-feel-fine/

Network automation does not an automated network make. Today’s network engineers are frequently guilty of two indulgences. First, random acts of automation hacking. Second, pursuing aspirational visions of networking grandeur — complete with their literary adornments like “self-driving” and “intent-driven” — without a plan or a healthy automation practice to take them there.

Can a Middle Way be found, enabling engineers to set achievable goals, while attaining the broader vision of automated networks as code? Taking some inspiration from our software engineering brethren doing DevOps, I believe so.

How Not to Automate

There’s a phrase going around in our business: “To err is human; to propagate errors massively at scale is automation.”

Automation is a great way for any organization to speed up bad practices and #fail bigger. Unfortunately, when your business is network ops, the desire to be a cool “Ops” kid with some “Dev” chops — as opposed to just a CLI jockey — will quickly lead you down the automation road. That road might not lead you to those aspirational goals, although it certainly could expand your blast radius for failures.

Before we further contemplate self-driving, intent-driven networking, and every other phrase that’s all the rage today (although I’m just as guilty of such contemplation as anyone else at Juniper), we should take the time to define what we mean by “proper” in the phrase, “building an automated network properly.”

If you haven’t guessed already, it’s not about writing Python scripts. Programming is all well and good, but twenty minutes of design often really does save about two weeks of coding. To start hacking at a problem right away is probably the wrong approach. We need to step back from our goals, think about what gives them meaning, apply those goals to the broader picture, and plan accordingly.

To see what is possible with automation, we should look at successful patterns of automation outside of networking and the reasons behind them, so we may avoid the known bad habits and anti-patterns, and sidestep avoidable pitfalls. For well-tested patterns of automation, we needn’t look any further than the wealth of knowledge and experience in the arena of DevOps.

It matters what we call things. For better or worse, a name focuses the mind. The overall IT strategy to improve the speed, resilience, quality and intelligence of our applications is not called automation or orchestration. While ITIL volumes make their steady march into museums, the new strategy to enable business speed and smarts is incontrovertible, and it’s called DevOps.

Initially that term may evoke a blank page or even a transformation conundrum. But you can learn what it means to practice successful DevOps culture, processes, design, tooling and coding. DevOps can define your approach to the network, which is why we ought not promote network automation (which could focus the mind on the wrong objectives) and instead talk about DevNetOps: the application of DevOps patterns to networking.

Networks as Code

The idea of infrastructure-as-code (IaC) has been around for a while, but surprisingly has seldom been applied to networking. Juniper Networks (where I hang my hat) and other networking vendors like Apstra have made some efforts over the years to move folks in this direction, but there is still a lot of work to do. For example, Juniper has had virtual form factors of most series of hardware systems, projects like Junosphere for network modeling in the cloud (many of us now use Ravello), and impressive presentations on IaC and professional services consulting. Juniper’s senior marketing director Mike Bushong (formerly with Plexxi) wrote about the network as code back in 2014.

IaC is generally well applied to cloud infrastructure, but it’s way harder to apply to bare metal. For evidence of this, just look at Triple-O, Kubernetes on Kubernetes, or Kubernetes on OpenStack on Kubernetes! That bottom, metal layer is quite the predicament.

To me, this means in networking, we should be easily applying IaC to software defined networking (SDN). But can we apply it to our network devices and manage the physical network? I asked Siri, and she said it was a trick question. As an armchair architect myself, I don’t have all the answers.  But as I see it, here are some under-considered aspects for designing networks as code with DevNetOps:

1. Tooling

In tech, everyone loves shiny objects, so let’s start there. Few network operators — even those who have learned some programming — are knowledgeable about the ecosystem of DevOps tooling, and few consider applying those same tools to networking. Learning Python and Ansible is just scratching the surface. There is a vast swath of DevOps tools for CI/CD, site reliability engineering (SRE), and at-scale cloud-native ops.

2. Chain the tool chain: a pipeline as code

When we approach the network as code, we need to consider network elements and their configurations as building blocks created in a code-development pipeline of dev/test/staging/production phases. Stringing together this pipeline shouldn’t be a manual process; it should be mostly coded to be automatic.

As with software engineering, there are hardware and foundational software elements with network engineering, such as operating systems that the operator will not create themselves, but rather just configure and extend. These configurations and extensions, with their underlying dependencies, can be built together, versioned, tested, and delivered. Thinking about the network as an exercise in development, automation should start in the development infrastructure itself.
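To make that concrete, here is a minimal, purely illustrative sketch (in Python, with made-up stage names and gate checks, not any particular CI product) of a network-config artifact being promoted through dev/test/staging gates, failing fast at the first broken gate:

```python
# Hypothetical sketch of a dev/test/staging pipeline for a network config
# artifact. Stage names and gate checks are illustrative only.

def lint(config: dict) -> bool:
    # "Dev" gate: every interface must declare a description and an MTU.
    return all("description" in i and "mtu" in i for i in config["interfaces"])

def unit_test(config: dict) -> bool:
    # "Test" gate: MTUs must be within a sane range for this design.
    return all(1280 <= i["mtu"] <= 9216 for i in config["interfaces"])

def staging_check(config: dict) -> bool:
    # "Staging" gate: stand-in for deploying to a virtual device and probing it.
    return config.get("version") is not None

def run_pipeline(config: dict) -> list:
    """Run each gate in order; stop at the first failure."""
    passed = []
    for name, gate in [("dev", lint), ("test", unit_test), ("staging", staging_check)]:
        if not gate(config):
            break
        passed.append(name)
    return passed

candidate = {
    "version": "1.4.2",
    "interfaces": [
        {"name": "ge-0/0/0", "description": "uplink", "mtu": 9000},
        {"name": "ge-0/0/1", "description": "lan", "mtu": 1500},
    ],
}
print(run_pipeline(candidate))  # ['dev', 'test', 'staging']
```

A real pipeline would replace these toy gates with linting, virtual-device tests, and staged deployments, but the shape is the same: the pipeline itself is code.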

3. Immutable infrastructure

Virtualization and especially containers have made the concept of baking images very accessible, and immutable infrastructure popular. While there is still much work to do with network software disaggregation, containerization, and decoupling of services, there are many benefits of adopting immutable infrastructure that are equally applicable to networking. Today’s network devices are poster children for config drift, but to call them “snowflakes” would be an insult to actual snowflakes.

Applying principles of immutable infrastructure, I imagine a network where each device PXE-boots into a minimal OS and runs signed micro-service building blocks. Each device has declarative configs, decoupled secret management and rotation, and logging and other monitoring data with good overall auditability and traceability — all of which is geared to take the network off the box ASAP.
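As a rough illustration of the “baked image” idea (the spec fields and hashing scheme below are invented for the example), an immutable approach renders a declarative spec into a content-addressed artifact, so any change produces a new versioned artifact rather than drift on a live box:

```python
# Hypothetical sketch of immutable config artifacts: a declarative spec is
# rendered deterministically and content-addressed, so a change produces a
# *new* artifact instead of mutating an existing device in place.
import hashlib
import json

def build_artifact(spec: dict) -> dict:
    """Render a declarative spec into an immutable, versioned artifact."""
    rendered = json.dumps(spec, sort_keys=True)           # deterministic rendering
    digest = hashlib.sha256(rendered.encode()).hexdigest()[:12]
    return {"image": rendered, "version": digest}

spec = {"hostname": "leaf-1", "bgp": {"asn": 65001, "neighbors": ["10.0.0.1"]}}
a1 = build_artifact(spec)
a2 = build_artifact(spec)                                  # same spec, same version
assert a1["version"] == a2["version"]

spec["bgp"]["neighbors"].append("10.0.0.2")                # any change...
a3 = build_artifact(spec)
assert a3["version"] != a1["version"]                      # ...is a new artifact
```

Config drift becomes visible by construction: if a device’s running version no longer matches any built artifact, something mutated it out of band.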

Interestingly, practices such as SSH’ing into boxes would be rendered impossible, and practices that “savvy” network automators do today like running Ansible playbooks against an inventory of devices would be banished.

4. Upgrades

Upgrades to network software and even firmware/microcode on devices could be managed automatically, by means of canary tests and rolling upgrade patterns. To do this on a per-box or per-port basis, or at finer levels of flows or traffic-processing components, we need to be able to orchestrate traffic balancing and draining.

If that sounds complex, we can make things simpler. We could treat devices and their traffic like cattle instead of pets, and rely on their resilience. Killing and resurrecting a component would restart it with a new version. While this is suitable for some software applications, treating traffic as disposable is not yet desirable for all network applications and SLAs. Still, it would go a long way toward properly designing for resilience.
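A hedged sketch of that pattern (device names, versions, and the health check are all hypothetical): drain a device, upgrade it, canary-check it, and roll back and halt the rollout if the canary fails:

```python
# Hypothetical sketch of a drain-then-canary rolling upgrade across a fleet.
# Device names, versions, and the health check are illustrative, not a vendor API.

def rolling_upgrade(fleet: dict, new_version: str, healthy) -> dict:
    """Upgrade one device at a time; roll back and stop on a failed canary."""
    for name in list(fleet):
        old = fleet[name]
        # 1. Drain: shift traffic off the device before touching it.
        # 2. Upgrade the drained device.
        fleet[name] = new_version
        # 3. Canary: verify health before moving on to the next device.
        if not healthy(name, new_version):
            fleet[name] = old      # roll back the failed canary
            break                  # halt the rollout for a human to inspect
    return fleet

fleet = {"spine-1": "17.2", "spine-2": "17.2", "leaf-1": "17.2"}
result = rolling_upgrade(fleet, "18.1", healthy=lambda name, ver: name != "leaf-1")
print(result)  # {'spine-1': '18.1', 'spine-2': '18.1', 'leaf-1': '17.2'}
```

The hard part in networking is, of course, step 1: orchestrating the traffic draining and balancing so the canary is safe to kill.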

5. Resilience

One implication of all this is the presence of redundancy in the network paths.  As with any networking component, that’s very important for resilience. Drawing inspiration from scale-out architectures in DevOps and the microservices application model, redundancy and scale would go hand-in-hand by means of instance replication. So redundancy would neither be 1:1 nor active-passive. We should always be skeptical of architectures that include those anti-patterns.

Good design would tolerate a veritable networking chaos monkey. Burning down network software would circuit-break to limit failures. Killing links and even boxes, we would quickly re-converge as we often do today, but dead boxes, dead SDN functions or dead virtual network functions would act like phoenix servers, rising back up or staying down in case of repeated failures or detected hardware failures.

The pattern for preventing black-swan event failures is to practice forcing these failures, and thus practice automating around them, so that the connectivity or other network service and its performance is tolerant and acceptably resilient on its own SLA measuring stick, whatever the meta-service in question may be.
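Sketching the phoenix and circuit-breaker behavior together (the threshold and component fields are invented for illustration): a failed component is resurrected with the current version until repeated failures trip the breaker and leave it down:

```python
# Hypothetical sketch of "phoenix" restart logic with a circuit breaker:
# a failed component rises again with a fresh image, but repeated failures
# (e.g. a suspected hardware fault) keep it down. Threshold is illustrative.

MAX_RESTARTS = 3

def on_failure(component: dict) -> str:
    """Decide whether a failed component rises again or stays down."""
    component["failures"] += 1
    if component["failures"] > MAX_RESTARTS:
        component["state"] = "down"        # circuit open: stop resurrecting
        return "quarantined"
    component["state"] = "running"         # phoenix: restart with a fresh image
    return "restarted"

vnf = {"name": "fw-1", "failures": 0, "state": "running"}
outcomes = [on_failure(vnf) for _ in range(4)]
print(outcomes)  # ['restarted', 'restarted', 'restarted', 'quarantined']
```

A chaos-monkey-style practice would exercise exactly this path on purpose, so the quarantine and re-convergence behavior is tested before a black-swan event forces it.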

Doing DevNetOps

In each one of these above topics lies much more complexity than I will dive head-first into here. By introducing them here, my aim has been to demonstrate there are interesting patterns we may draw from, and some operators are doing so already. If you’ve ever heard the old Zen Buddhist koan of the sound of one hand clapping, that’s the sound you’re likely to hear from your own forehead, once the obviousness of applying DevOps to DevNetOps hits you squarely in the face.

Just as the hardest part of adopting DevOps is often cited to be breaking off one manageable goal at a time and focusing on that, I think we’ll find the same is true of DevNetOps. Before we even get there in networking, I think we need to properly scope the transformation of applying DevNetOps to the challenges and design of networking, especially with issues of basic physical connectivity and transport.

While “network automation” leads the mind to jump to things like applying configuration management tooling and programming today’s manual tasks, DevNetOps should remind us that there is a larger scope than mere automation coding.  That scope includes culture, processes, design and tools. Collectively, they may all lead us to a happier place.


Title image of a Bell System telephone switchboard, circa 1943, from the U.S. National Archives.

Getting to GIFEE with SDN: Demo by James Kelly

A few short years ago, advocating for open source and cloud computing was even more difficult than touting the importance of clean energy and the realities of climate change. The doubters and naysayers, vocal as they are, are full of reasons why things are (fine) as they are. Reasons, however, don’t get you results. We needed transformative action in IT, and today, as we sit right between the Google NEXT event and the OpenStack Summit in Austin, open source and cloud are the norm for the majority.

After pausing for a moment of vindication – we told you so – we get back to work to improve further and look forward, and a good place to look is indeed at Google: a technology trailblazer by sheer necessity. We heard a lot about the GCP at NEXT, especially their open source project Kubernetes, powering GKE. What’s most exciting about such container-based computing with Docker is that we’ve finally hit the sweet spot in the stack with the right abstractions for developers and infrastructure & ops pros. With this innovation now accessible to all in the Kubernetes project, Google’s infrastructure for everyone else (#GIFEE) and NoOps is within reach. Best of all, the change this time around is less transformative and more incremental…

One thing you’ll like about a serverless architecture stack like Kubernetes is that you can run it on bare metal if you want the best performance possible, but you can just as easily run it on top of IaaS-provided VMs in public or private cloud, which gives us a great deal of flexibility. Then of course, if you just want to deploy workloads and not worry about the stack, an aaS offering like GKE or ECS is a great way to get to NoOps faster. We have a level playing field across public and private clouds and a variety of underpinnings.

For those that are not only using a public micro-service stack aaS offering like GKE, but supplementing or fully building one internally with Kubernetes or a PaaS on top of it like OpenShift, you’ll need some support. Just like you didn’t build an OpenStack IaaS by yourself (I hope), there’s no reason to go it alone for your serverless-architecture micro-services stack. There are many parts under the hood, and one that you need baked into your stack from the get-go is software-defined secure networking. It was a pleasure to get back in touch with my developer roots and put together a demo of how you can solve your networking and security microsegmentation challenges using OpenContrail.

I’ve taken the test setup for OpenContrail with OpenShift, and forked and modified it to create a pure demo cluster of OpenContrail + OpenShift (thus including Kubernetes) showing off the OpenContrail features with Kubernetes and OpenShift. If you learn by doing like me, then maybe best of all, this demo cluster is also open source and Ansible-automated, making it easy to stand up or tear down on AWS with just a few commands, going from nada to running OpenShift and OpenContrail consoles with a running sample app. Enjoy getting your hands dirty, or sit back and watch the demo video.

For those who want to replicate the demo, here's the link I mentioned in the video: https://github.com/jameskellynet/container-networking-ansible