The Wisdom of the Giants in SD-WAN / April 16, 2018 by James Kelly

When it comes to your branch how can SD-WAN upgrade without also uprooting? Tall trees may tell.

A Branch’s Reach Should Not Exceed Its Grasp

They are the showy exterior of your organization: your branches, your stores, your schools, your sites. But insofar as networking domains, these are the humblest of locations with little or no networking expertise and sophistication. In the past, your networking grasp was feeble in the far reaches of the branch.

Now the story goes that SD-WAN is changing that. It’s putting the prowess of your brightest networking pros and the autopilot automation of SDN steadily into these network extremities. But this is only the beginning of the story. So allow me to disabuse you from the enrapture of the shining fruits and perfumed flowers of the branch that is SD-WAN today.

You have been tricked. This was not the story, merely the first act.

Focusing on SD-WAN, my friends, we see the fruits. Take a step back and look wider. Now we see the tree. Now we see the roots.

One Tree: Everything Is Connected

The levity with which some people and vendors approach branch networking with SD-WAN quickly fades when they realize the simple truth that, beyond the branch, everything is connected. It is one tree.

Ungrounded SD-WAN solutions ignore what’s below the branches and clouds at tree tops. But approaching enterprise networking grounded in reality, you see the whole picture: your wide-area is not only your remote and branch connectivity. Everything is connected between branch sites, campuses, headquarters, data centers, and certainly today, multicloud—SaaS and your own cloud-based applications.

You would never be so credulous as to protect a tree’s exterior, believing it’s safe from harm. And no one would mistake strung-up ornaments for the tree itself. How about vines overlaying the tree? Yes, they could reach the branches. But they still aren’t your tree, nor its species, and they cannot be grafted on. This is SD-WAN for dummies and by decoration, but it parallels some SD-WAN propaganda.

SD-WAN savvy would never use proprietary control and data plane protocols that won’t graft and interoperate with your wider network. Security would not be secondary and sheath, but foremost in the immune system of the network first. Add-on network functions like VNFs would be symbiotic and seamless with network design and management. And other virtualized branch services would felicitously fold into the SD-Branch canopy or NFV-centers in nearby limbs.

This is multicloud and multi-site thinking, end to end and top to bottom. While its natural given Juniper’s portfolio, it’s quite different than the thinking of some other SD-WAN vendors whose niche interests, I leave to be addressed with the words of a fine woodsman. “When we try to pick out anything by itself, we find it hitched to everything else in the universe.” -John Muir

Layer Upon Layer

Just under the bark are the newest layers of a tree. Pushing out and up, a tree’s trunk core and deep roots nourish new growth and give it strength to endure the tests of time.

Drawing a parallel to networking growth and longevity, you may have observed this strategy at Juniper, where investment is steadfast in Junos and our platforms. Customers enjoy the benefit of this continuity, as investment protection and the ability to simply extend and build on base systems with SDN, like SD-WAN, employing our NFX, SRX, and MX Series systems and interoperating with the routing of all Junos-powered platforms.

You may observe another approach in the industry too. Vendors that continually force rip and replacement of systems. There are sales motivations for this, but another cause runs deeper...

When you engineer something anew, you usually architect for a minimum viable product and getting to market quickly. Take a tech startup for example: it’s faster to build software as a monolith or a mesh of purely cloud services, than to construct a devops pipeline, platform architecture, and microservices that scale. Taking that MVP route, eventually they will throw away their early work, to redo it at scale, with extensibility, with reliability and economically. This is invisible to customers of SaaS companies, but when translated to packaged-and-sold hardware and software systems, this architecture fetters customers with technical debt and forces rip and replacement inefficiency.

In networking, it’s wiser to sow scale and flexibility into the seeds of your base networking technologies and topologies. Architecting for growth in layers, allows you to scale your rootstock and your core so to speak, evolving today’s investments tomorrow.

Evolvable architecture is how the cloud giants design their software, and happens to be how Juniper designs our portfolio. This is why we did not acquire an SD-WAN solution. And this is why we built SD-WAN backward: we tackled the hard problems first (multi-tenancy, scale, reliability, NFV, etc.), so we could design once and for all, and offer the simplicity of one solution.

Reach for the Clouds

With so many SD-WAN solutions in the market, and mostly built with haste, as you might imagine, the winds of technology change will cause many to snap and topple. They weren’t designed beyond SD-WAN connections for the branch and cloud endpoints.

The wisdom of giant trees would suggest that as you reach for the multicloud, strength lies in swaying and adapting with the winds of change, and evolving and using the strength of the whole.

About Juniper Contrail SD-WAN

Juniper’s newly dubbed Contrail SD-WAN solution and its component parts were designed to inherently secure from within and to scale to support thousands of tenants each with thousands of sites. It was designed where SD-WAN is merely the first act of your transformation story. So it will grow with you to SD-Branch for site virtualization and consolidation, and even incorporate NFV-cloud services into your network service. Of course it’s multicloud-ready, connecting up to the likes of AWS, but just as importantly, it ties right into your core WAN routing today from your campuses and data centers.

Podcast on Soundcloud
Podcast on YouTube

image credit MichaelGaida/pixabay

Seven Deadly Deceptions of Network Automation / March 20, 2018 by James Kelly

Podcast on YouTube

The greatest deception men suffer is from their own opinions. - Leonardo da Vinci

“Network automation does not an automated network make.” Those same words started my formative piece on DevNetOps. It reasoned that we must elevate DevOps culture, processes and principles above technology, end random acts of network automation, and instead pursue holistically automated network engineering and operations. The professional that implements this—from code to production—is the network reliability engineer (NRE). The NRE implements DevNetOps for network infrastructure just as the SRE implements DevOps for apps and platforms.

It’s been a journey discovering DevNetOps and network reliability engineering. With help from peers and NRE friends, I’ve faced debate and dogma forged in the fiery cynicism of the networking I&O silo. To share these lessons, let’s overturn some anti-patterns and deceptions, starting with the opposite of the NRE: those who say automation is “not for me.”

1. It’s not for me

You used to hear people say, “We’re not Google. We don’t have those problems or need those solutions.” Today everyone is mad for #GIFEE and racing for the same outcomes as the unicorns. If you think you’re a thoroughbred horse in a different race, you’re utterly deceiving yourself, and your business is heading to the glue factory.

Before we can change our minds, we must open our minds. Life is an inside job.

If you're a network admin, the rationale to retitle yourself as an NRE is right in front of you. Look forward. You’ll see a future less doldrum, more creative, and one where you have more control over your own destiny and that of your organization. And more pay and job opportunities too. Yes, NRE is already an actual job title.

With retitling comes reform. You used to rely on vendors for all network engineering, but this relegated operations people to technicians instead of technologists. As an NRE, you don’t need to hop over the proverbial dev-ops wall, to engineer boxes and SDN systems. You just need to lower the wall and pick up where vendors leave off. Their day of product delivery to market and to you, is your day zero where you automate, not only integration workflows, but outcomes like accuracy, reliability, scale, efficiency and ops speed.

2. It’s all about automation and technology

Rod Michael said, “If you automate a mess, you get an automated mess.” Automation must follow architecture and accuracy.

It’s common for builders to want to build, but you cannot be so swift as to forget the blueprints. There’s a balance to strike between build and design. To be sure, a DevOps mindset promotes build iteration, as did Mark Twain when he said, “Progressive improvement beats delayed perfection.”

Of course it’s not all planning processes or forging culture, but technocrats tend to obsess over technology too much. Network reliability engineering and DevNetOps is not only about technology, just as racing is not only about cars.

3. It’s only about open source

I’m a proponent of open source and believe it aligns with human nature. From GitHub to the growth of the CNCF and so many other projects, the open-source watershed has hit.

However, especially in networking, there is plenty of closed source. Fortunately, open APIs make integration and automation possible even for closed systems. Today, open APIs are increasingly commonplace because they’re not a nicety—they’re a necessity.

Moreover, the as-code, gitOps, and CI/CD movements shine the automation spotlight onto pre-production pipelines and processes. These trends are supported by and apply equally to open- and closed-source software, so don’t let closed source deter your DevNetOps desires.

4. It’s incompatible with ITIL, InfoSec, I&O or hardware

You might believe there’s no need for infrastructure rapid iteration, agility and experimentation. But just because you don’t need all the benefits of a DevOps culture, doesn’t mean you don’t need any.

You may also deceive yourself, thinking networking is different. But just because network hardware is more foundational than application software and less flexible today, doesn’t make DevNetOps ideas impossible. It’s precisely because networking is foundational that having it automated is crucial and will add simplicity and flexibility.

First of all, there is a large software side to networking—SDN, NFV and network management—where we can more easily apply DevNetOps behaviors. Translating some behaviors like CI/CD and chaos engineering to network devices, however, isn't straightforward. In a past article on TheNewStack, I examined the difficulties aligning Agile to today’s architectures in network operating systems, boxes and topologies. In re-architecting networking for DevNetOps, we ought to draw inspiration from microservices—a catalyst for the traditional DevOps transformation—because smaller architectural units allow for smaller, safer, and speedier steps of change.

Finally, many DevOps practitioners have overcome organizational policy “barriers” like ITIL and InfoSec. As well established in the DevOps handbook, success lies not in rallying anarchy; rather the DevOps principles automate in security, compliance and consistency.

5. It’s not obvious where to start

The territory is now increasingly marked with maps: training and didactic case studies. But don't mistake studying for starting. Complement your wonder with some wander. Try playing with git. Sharpen up your programming fingers. Give that tool a whirl.

There are many paths to success. Even if your journey is serpentine, even if you lose true north, you may pick up useful tools and lessons in unexpected places.

Like building any new habit, it’s useful to have a buddy, or better yet a two-pizza team. You'll progress quickest in green fields. Choose a team project with no technical debt when you’re just starting out and take small wins and small risks. When you allow for failure and iteration, you record lessons into processes and automated systems, and you grow people.

The easiest place to start is at the beginning of the project stream, where it is small, not down in the ocean. Start at day-0 and flow from pre-production to production. Build as simple an automated pipeline as possible to integrate artifacts, secrets and configuration as code. As you mature, expand the middle: pipeline orchestration, building, testing, integration, more testing, immutable deliveries, and finally orchestrated deployments into staging and then production. Eventually beyond network and SDN automated deployments, you will have other in-production automation extensions for systems integration and event handling that can follow the same pipeline.

6. It’s all about speed

I wanna go fast! - Ricky Bobby

Keeping up with the pace of technology is ever harder. And so goes the saying, “the future belongs to the fast.” But when it comes to automation, the NRE title tells us something very important: we must focus on reliability.

Speed alone will never win a race, and speed without reliability is a glorious way to crash and burn, just ask rocket scientists. If you were a racecar driver—one smarter than Ricky Bobby—you would say that to finish first, you must first finish.

A twin burden today, equally as confronting as the need for speed, is complexity. You know if Dijkstra, a networking hero for his SPF algorithm, were alive today, he would be a champion of network reliability engineering simplicity (a coincidental portmanteau-ing of NRE and the Juniper anthem) because of his famous quote, “Simplicity is prerequisite to reliability.”

In summary, we need speed, and we need smart. We must be consistent with simplicity, effectiveness, efficiency and reliability (...smart) while employing the economies of velocity, agility, scale and reach (...speed). We all love going fast, but it's not how fast you drive, it's how you drive fast.

7. It’s all about DevNetOps & NRE

The hype of DevOps and SRE is probably warranted if you seriously put it to work. I believe the same is true for DevNetOps and NRE.

However, these are just signposts. Like the Buddhist lesson that the finger pointing to the moon is not the moon itself. If you miss the moon for the finger, you’ve missed the glory.

The real truth in technology is that transformation is the only timeless topic. Digital transformation has been around for three decades, and the digital intelligence transformation is on its way next.

To manage transformation: in technology, equip for an evolvable architecture; in process, incorporate continuous improvement; and in people, embrace continuous learning.

I’ll leave you with a final quote, I often use when speaking on these topics.

It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change. - Charles Darwin

Waving the NRE Flag, Live at Open Networking Summit

Next Wednesday at 11:15, Matt Oswalt and I will be speaking on NRE and DevNetOps at the Open Networking Summit. We’ll be joined by Doug Lardo from Riot Games who will share lessons from the front lines. If you happen to be there, please join us. If not, take to Twitter (jk, mo) or comment below on how you see the evolution of the #NRE.

Podcast on Soundcloud
Podcast on YouTube

image credit steampunk/pixabay

Serverless Containers Intensify Secure Networking Requirements / January 28, 2018 by James Kelly

When you’re off to the races with Kubernetes, the first order of business as a developer is figuring out a micro-services architecture and your DevOps pipeline to build pods. However, if you are the Kubernetes cluster I&O pro, also known as site reliability engineer (SRE), then your first order of business is figuring out Kubernetes itself, as the cluster becomes the computer for pod-packaged applications. One of the things a cluster SRE deals with—even for managed Kubernetes offerings like GKE—is the cluster infrastructure: servers, VMs or IaaS. These servers are known as the Kubernetes nodes.

The Pursuit of Efficiency

When you get into higher-order Kubernetes there are a few things you chase.

First, multi-purpose clusters can squeeze out more efficiency of the underlying server resources and your SRE time. By multi-purpose cluster, I mean running a multitude of applications, projects, teams/tenants, and devops pipeline stages (dev/test, build/bake, staging, production), all on the same cluster.

When you’re new to Kubernetes, such dimensions are often created on separate clusters, per project, per team, etc. As your K8s journey matures though, there is only so long you can ignore the waste this causes in the underlying server-resource capacity. Across your multicloud, consolidating many clusters into as few as practical for your reliability constraints also saves you time and less swivel-chairing for: patching, cluster upgrades, secrets and artifact distribution, compliance, monitoring, and more.

Second, there’s the constant chase of scaling efficiency. Kubernetes and active monitoring agents can help take care of auto-scaling individual micro-services, but scaling out assumes you have capacity in your cluster nodes. Especially if you’re running your cluster atop IaaS, it’s actually wasteful to maintain—and pay for—extra capacity in spare VM instances. You probably need some buffer because spinning up VMs is much slower than for containers and pods. Dynamically right-sizing your cluster is quite the predicament, particularly as it becomes more multi-purpose.

True CaaS: Serverless Containers

When it comes to right-sizing your cluster scale, while the cloud providers are happy to take your money for extra VMs powering your spare node capacity, they do have a better solution. At Re:Invent 2017, AWS announced Fargate to abstract away the servers underneath your cluster. Eventually it should support EKS in addition to ECS. In the meantime, Azure Container Instances (ACI) is a true Kubernetes-pods as a service offering that frees you from worrying about the server group upon which it’s running.

Higher-order Kubernetes SRE

While at Networking Field Day 17 (NFD17 video recording), I presented on “shifting left” your networking and security considerations to deal with DevOps and multi-purpose clusters. It turns out that on the same day, Software Engineering Daily released their Serverless Containers podcast. In listening to it you’ll realize that such serverless container stacks are probably the epitome of multi-purpose Kubernetes clusters.

What cloud providers offer in terms of separation of concerns with serverless container stacks, great cluster SREs will also aim to provide to the developers they support.

When you get to this level of maturity in Kubernetes operations, you’re thinking about a lot of things that you may not have originally considered. This happens in many areas, but certainly in networking and security. Hence me talking about “shift left,” so you can prepare to meet certain challenges that you otherwise wouldn’t see if you’re just getting Kubernetes up and running (great book by that name).

In the domain of open networking and security, there is no project that approaches the scalability and maturity of OpenContrail. You may have heard of the immortal moment, at least in the community, when AT&T chose it to run their 100+ clouds, some of enormous size. Riot Games has also blogged about how it underpins their DevOps and container runtime environments for League of Legends, one of the hugest online games around.

Cloud-grade Networking and Security

For cluster multi-tenancy, it goes without saying that it’s useful to have multi-tenant networking and security like OpenContrail provides. You can hack together isolation boundaries with access policies in simpler SDN systems (indeed, today, more popular due to their simplicity), but actually having multi-tenant domain and project isolation in your SDN system is far more elegant, scalable and sane. It’s a cleaner hierarchy to contain virtual network designs, IP address management, network policy and stateful security policy.

The other topic I covered at NFD17 is the goal of making networking and security more invisible to the cluster SRE and certainly to the developer, but providing plenty of control and visibility to the security and network reliability engineers (NREs) or NetOps/SecOps pros. OpenContrail helps here in two crucial ways.

First, virtual network overlays are a first-class concept and object. This is very useful for your DevOps pipeline because you can create exactly the same networking and security environment for your staging and production deployments (here’s how Riot does it). Landmines lurk when staging and production aren’t really the same, but with OpenContrail you can easily have exactly the same IP subnets, addresses, networking and security policies. This is impossible and impractical to do without overlays. You may also perceive that overlays are themselves a healthy separation of concerns from the underlay transport network. That’s true, and they easily enable you to use OpenContrail across the multicloud on any infrastructure. You can even nest OpenContrail inside of lower-layer OpenContrail overlays, although for OpenStack underlays, it provides ways to collapse such layers too.

Second, OpenContrail can secure applications on Kubernetes with better invisibility to your developers—and transparency to SecOps. Today, a CNI provider for Kubernetes implements pod connectivity and usually NetworkPolicy objects. OpenContrail does this too, and much more that other CNI providers cannot. But do you really want to require your developers to write Kubernetes NetworkPolicy objects to blacklist-whitelist the inter-micro-service access across application tiers, DevOps stages, namespaces, etc? I’d love to say security is shifting left into developers’ minds, and that they’ll get this right, but realistically when they have to write code, tests, fixes, documentation and more, why not take this off their plates? With OpenContrail you can easily implement security policies that are outside of Kubernetes and outside of the developers’ purview. I think that’s a good idea for the sanity of developers, but also to solve growing security complexity in multi-purpose clusters.

If you’ve made it this far, I hope you won’t be willfully blind to the Kubernetes SRE-fu you’ll need sooner or later. Definitely give OpenContrail a try for your K8s networking-security needs. The community has recently made it much more accessible to quick-start with Helm packaging, and the work continues to make day-1 as easy as possible. The Slack team is also helpful. The good news is that with the OpenContrail project very battle tested and going on 5 years old, your day-N should be smooth and steady.

PS. OpenContrail will soon be joining Linux Foundation Networking, and likely renamed, making this article a vestige of early SDN and cloud-native antiquity.

image credit alexandersonscc / Pixabay