What’s next for DevOps: Q&A with Lightstep CEO Ben Sigelman
In the decade since the word “DevOps” was coined, its goal has stayed the same, but the way organizations implement DevOps is constantly changing. For a closer look, we sat down with Ben Sigelman, CEO of Lightstep, whose observability platform is the easiest way for developers and SREs to monitor health and respond to changes in cloud-native applications. Hear the full discussion in our next DevOps webcast with Ben and panelists from GitHub, Red Hat, and RedMonk on February 4.
Let’s kick things off with the big question. In five years, what will DevOps look like?
DevOps isn’t new, but it also hasn’t settled in yet. As a result, there is a staggering proliferation of tools and technologies (CNCF Landscape, anyone?), and yet there’s an equally staggering lack of consensus about how to actually do DevOps. Of course there will not be—and should not be—a single answer for that, but in five years, we will have crisper definitions and success criteria for everything from CI/CD to observability and back again, and we’ll also have a few well-balanced, opinionated, and holistic approaches to DevOps that organizations can adapt and adopt. The end result will be less time spent experimenting with tooling and more time spent actually shipping new products to market.
It’s easy to equate DevOps with CI/CD and/or automation. Although it seems like the goal of DevOps is to “automate everything,” why aren’t these the same thing? Does automating everything equal doing DevOps?
Automation is an important component of any software engineering practice, and there’s no exception for DevOps. But it’s imprecise (and maybe even a bit lazy) to define DevOps solely in terms of automation. The real driver for DevOps is the need to parallelize the complete software development lifecycle. It should be possible for an individual human being to take something off of a backlog and get it into production without blocking on anything, and especially without blocking on the slowest and least reliable thing in any organization: Namely, other human beings!
“The big change for DevOps is giving individuals enough scope to develop, secure, deploy, and operate their own piece of a larger software application while minimizing the number of roundtrip communications with other people.”
This then brings us to automation: Once you’ve eliminated blocking dependencies on other human beings, any good engineer, DevOps or otherwise, will always want to automate everything else that stands in their way, and with the broadened scope of DevOps, they finally can! So indeed there has been a lot of low-hanging “automation fruit,” and that’s super, but “automation != DevOps.” Automation is just something wonderful that DevOps enables.
A lot has changed in the last year. We know team culture and collaboration play a huge role in DevOps, and as we’ve seen in our 2020 Octoverse Report, many teams have had to find entirely new (remote) ways to address both. What changes have you seen to DevOps with this shift to remote work?
Last year’s mass migration to 100 percent remote work has underlined the importance of self-reliance and parallelism in software development and operations. As a consequence, there certainly has been more urgency and more need as far as DevOps is concerned. I also think we’re all thoroughly sick of back-to-back virtual meetings, so there’s been a refreshing added incentive to make the entire product development and deployment process “self-service” from the perspective of the people building and maintaining that software.
On the operational side of DevOps, the shift to remote work means that you can’t just walk across the office to “the haggard old-timer expert-in-everything” when something breaks in production. Maybe you can find that person on Slack, but they might be in another window and it’s certainly far easier for them to ignore you. That said, the pace of production changes continues to rise, and with each change there’s new risk and new ways for services to interact in unplanned and problematic ways. So we need observability tooling that is built to take nothing more than “an unplanned change” and help conjure up potential explanations—dynamically, and without finding “the experts.” This is certainly where observability is headed, and the shift to purely remote work is helping to accelerate that change.
Let’s zoom out a bit: What’s been the most transformative part of DevOps to software development overall?
Before DevOps was a thing, any improvement to software didn’t just require multiple people, it required multiple teams and even multiple organizations. To my mind, the single most transformative part of DevOps has been the parallelism and independence it’s brought to the individual software developer-operators themselves. As a side effect, these DevOps engineers are automating everything in sight—which is fantastic!—and we’re seeing organizations deploying software literally hundreds of times more frequently than they could before. It’s difficult to overstate the difference that can make, particularly when we consider the engineering-cultural implications and the compounding effects over the course of years.
Speaking of parallelism and automation, we’re also seeing security teams adopting these same practices in the form of DevSecOps. In the future, do you think there will be a distinction between DevOps and DevSecOps? Or will security just become fundamental to DevOps as a whole?
Security is (obviously) a complicated topic. I’m sure there are many intelligent and well-informed people who will disagree with me here, but so it goes! DevOps is a durable and valuable concept because 99 percent of Dev and Ops can truly be consolidated—one person can fulfill multiple roles. Now, certainly many aspects of security can also be included in this: automated package vulnerability testing, real-time threat detection in production, tools that audit configuration, and so forth. It’s all good, and I guess I can see how that might qualify as DevSecOps.
The distinction in my mind, though, is that there are so many attack vectors that have nothing much to do with software development: your corporate email accounts, spear phishing, edge network hacks, DDoS, elaborate supply chain attacks like the recent SolarWinds exploit, and so on. So, while 99 percent of Dev and Ops can truly be consolidated, there are so many vital parts of a CISO’s world that simply cannot be managed by the person writing, deploying, and managing a service in production. For that reason, I certainly see security becoming one of many aspects of the “Ops” in DevOps, but I am not a fan of DevSecOps as a term.
While we may not have a perfect definition now, what does successful DevOps look like for you/your team at Lightstep? How do you know when you’ve “made it”?
Good question, and over the years our Engineering, Product, and Design (EPD) organization has gone through different phases and iterations. I would say that, particularly after my experience working at Google for many years, I would never again want “Ops” (or site reliability engineering in Google’s case) to be its own separate part of the org chart. Incentives align when reliability and product goals are considered and balanced by a single team with a single strategy.
That said, there’s no single approach to DevOps that works at every scale. So as our customer base grows, and particularly as our EPD organization grows, we have to listen to each other and realize when the current set of processes and practices isn’t working anymore. Do people feel like they’re responsible for things they control? Is the scope of operational responsibility in line with the scope of development responsibility? Like any organization that’s honest with itself, at Lightstep we are always learning and growing, and we’ll never be able to say we’ve “made it”—at least not in a durable sense. And that’s fine. What’s important to us is to be introspective and try to get ahead of issues before they become crises, and of course to lean into things that work well for us.
You’re joining us on a DevOps panel later this week—where you’ll also be answering questions live. What’s something people should be asking about DevOps, but might not have considered yet?
People often think about DevOps teams working cheerfully and perfectly in parallel, completely independent of each other. This is as appealing as it is fictional! The reality is that the services that DevOps teams maintain depend upon and interact with each other in surprising and often problematic ways. What I want people to think about is how they can (a) plan better so that their “independent [sic]” “improvements [sic]” don’t inadvertently ruin some other team’s day or week, and (b) respond more effectively when their own service goes sideways from someone else’s change.
With thousands of software changes going to production every week, we need a far more robust and dynamic way of understanding how changes in one service affect others—and the DevOps teams responsible for them.
Learn more about the future of DevOps
Want to hear more from Ben? Join him along with panelists from GitHub, Red Hat, and RedMonk for our next GitHub webcast this Thursday, February 4. We’ll dive into the changes and trends we’ve seen in how DevOps teams collaborate, including deploying on Fridays, treating monitoring the same as observability, and DevSecOps.
When
February 4, 2021
11:00 am PT / 2:00 pm ET