Measuring DevOps and Proving ROI

Legendary management teacher and thinker Peter Drucker told us, “You can’t manage what you can’t measure.” He later paraphrased himself to say, “If you can’t measure it, you can’t improve it.” This advice is not lost on those who are implementing DevOps in their organizations. Because the effort is so all-encompassing, it is important not only to management but to everyone involved to be able to point to significant improvements and increased value derived from it. Many analysts and pundits suggest that measuring DevOps is difficult or even impossible, but others choose to invoke the sage advice in one of Stephen Covey’s 7 Habits of Highly Effective People, “Begin with the end in mind.”

Define Your DevOps Goals and Objectives

We measure to determine our progress toward specific goals and objectives. The goals tend to be somewhat subjective and focused on achieving quality improvements. The objectives are, just as they say, much more objective and empirical.

Continuous Improvement

Continuous Improvement is most often cited as the primary goal of any DevOps initiative. One commonly cited definition among the many says, “DevOps is the delivery of application changes at the speed of business.” By replacing the infrequent updating of monolithic blocks of code with very frequently updated microservices, DevOps enables dramatic acceleration of improvement.

How fast? In its 2018 State of DevOps Report, the DevOps Research and Assessment (DORA) team tells us, “…elite performers are optimizing lead times, reporting that the time from committing code to having that code successfully deployed in production is less than one hour, whereas low performers required lead times between one month and six months,” estimating that, “the elite group has 2,555 times faster change lead times than low performers.” A 2015 ZDNet article carried the headline, “How Amazon handles a new software deployment every second.”

Change at the speed of business.
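
The 2,555 figure quoted above can be sanity-checked with simple arithmetic. The sketch below is an assumption about how a ratio of that size could be derived, not DORA's documented method: it takes the midpoint of the low performers' one-to-six-month range (3.5 months), converts it to hours using an average month of 8,760 / 12 hours, and compares it with a one-hour elite lead time. The function name is illustrative.

```python
# Assumptions (not taken from the report): the midpoint of the 1-6 month
# range, an average month of 8760 / 12 hours, and an elite lead time of 1 hour.
HOURS_PER_YEAR = 365 * 24               # 8760
HOURS_PER_MONTH = HOURS_PER_YEAR / 12   # 730.0


def lead_time_ratio(slow_months: float, fast_hours: float) -> float:
    """Return how many times longer the slow lead time is than the fast one."""
    return (slow_months * HOURS_PER_MONTH) / fast_hours


low_performer_midpoint_months = (1 + 6) / 2  # 3.5 months
ratio = lead_time_ratio(low_performer_midpoint_months, fast_hours=1)
print(f"Low performers take about {ratio:,.0f} times longer")
```

Under these assumptions the ratio works out to 2,555, which is consistent with the quoted estimate.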

Superior Business Outcomes

Superior Business Outcomes are the very “end” goal of every process, including DevOps. Microsoft CEO Satya Nadella reminds us that today, “Every company is a software company,” drawing a direct relationship between how quickly a company can improve its software and how quickly it can increase its profits. Some DevOps goals will relate to the outcomes achieved through the software continuously improved by DevOps processes.

The components of continuous improvement include velocity, quality, performance, and outcomes. Some goals may also be set in consideration of known challenges within the enterprise. Whatever goals and objectives a given organization identifies, it is critical to relate them to the value received.

Why Should You Be Measuring DevOps?

Following Simon Sinek’s advice to “Start with Why,” we ask why you’d want to be measuring DevOps.

The observer effect, often conflated with the Heisenberg Uncertainty Principle, teaches us that the mere act of observing something affects the thing being observed. Metrics are a highly effective way to motivate teams, and DevOps success depends more on the teams involved than on anything else.

Implementing DevOps requires significant investments of people, time, organizational attention, and money. Anyone who makes an investment does so in order to receive an identifiable return on investment (ROI). This is the most tangible, significant reason for establishing and executing a discrete set of metrics to determine the return achieved from the DevOps investments.
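
As a minimal sketch of the ROI calculation described above, the snippet below uses the standard formula, ROI = (benefit - cost) / cost. The function name and the dollar figures are hypothetical and only illustrate the arithmetic; what counts as cost and benefit is for each organization to decide.

```python
def roi_percent(total_benefit: float, total_cost: float) -> float:
    """Return ROI as a percentage: (benefit - cost) / cost * 100."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost * 100


# Hypothetical annual figures, for illustration only.
investment = 400_000   # e.g. tooling, training, staff time
benefit = 550_000      # e.g. rework avoided, downtime avoided
print(f"ROI: {roi_percent(benefit, investment):.1f}%")
```

With these sample numbers the return is 37.5%.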

The most meaningful reason to measure DevOps and the ROI created by it is cited by DORA:

“Traditionally, IT has been viewed as a cost center and, as such, was expected to justify its costs and return on investment (ROI) up front. However, IT done right is a value driver and innovation engine. Companies that fail to leverage the transformative, value-generating power of IT risk being disrupted by those who do.”

DORA also identifies three performance levels for software development speed and stability that help organizations determine how much they will invest in their DevOps effort based on which level they desire to achieve:

High IT Performers

Realize the highest benefits from superior software delivery, such as low unnecessary rework and high employee satisfaction.

Medium IT Performers

Have the most to gain by burning down technical debt and optimizing for speed and value over cost.

Low IT Performers

Have the most opportunities for improvement by addressing low-hanging fruit and setting measurable goals.

You Should Be Measuring DevOps By These Elements

DevOps combines three key elements (people, process, and technology) to deliver its startling acceleration in software delivery. Most DevOps metrics will correspond to these fundamental components.

People

People-related metrics include task duration, response times, incidence of failure, and more. These are often the most difficult metrics to obtain, which makes them a good place to start.

Process

Process is what DevOps is really all about. Rapid user feedback drives the fast development and delivery of software upgrades, which operations deploys immediately before gathering the next round of user feedback to begin the cycle again. Quality and performance gains from one iteration to the next are a key metric, but a highly subjective one. Development-to-deployment time is more objective and measurable, and it is most useful when combined with measures of velocity, relevance, effectiveness, efficiency, and smoothness of flow.

Technology

Technology metrics encompass hardware, software, and service functions. System uptime is critical. Software failure rate connects directly to development and deployment metrics. It’s pointless to be moving fast when the failure rate is too high.

Velocity is a Key Measure of DevOps

Velocity is the key consideration, followed closely by performance quality, though DevOps experts frequently cite their willingness to “break things” along the way and learn from those failures. Increasing competitive pressure drives an ever-increasing need to achieve continuous rapid improvement. Software developer Stackify lists useful metrics contributing to the achievement of high-speed iterations:

Deployment frequency

One objective quantity that can easily be tracked is the number of deployments performed by the DevOps team. The goal is always to deliver smaller improvements more often.
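
Counting deployments can be as simple as grouping deployment dates by week. The sketch below assumes you can export a list of deployment dates from your pipeline; the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def deployments_per_week(deploy_dates: list[date]) -> dict[tuple[int, int], int]:
    """Count deployments per ISO (year, week number)."""
    counts = Counter()
    for d in deploy_dates:
        iso = d.isocalendar()          # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Hypothetical deployment dates, for illustration only.
sample = [
    date(2024, 1, 1), date(2024, 1, 3),                     # ISO week 1
    date(2024, 1, 8), date(2024, 1, 10), date(2024, 1, 12)  # ISO week 2
]
for (year, week), count in deployments_per_week(sample).items():
    print(f"{year}-W{week:02d}: {count} deployments")
```

A rising count per week, with smaller changes in each deployment, is the trend to look for.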

Change volume

How many improvements and changes derived from user feedback are embodied in each new deployment?

Deployment time

Reducing the time expended at every step of the DevOps process contributes to increases in overall speed. Clocking the actual time it takes operations to deploy new improvements helps to determine whether this task is contributing appropriately.

Lead time

Expanding beyond deployment time, lead time measures the elapsed time from receipt of a new request to availability in production.
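
Lead time can be computed from two timestamps per change: when the request was received and when the change became available in production. The sketch below assumes those timestamps are available as pairs; the function name and sample data are hypothetical. It reports the median, which is less sensitive to one unusually slow change than the mean.

```python
from datetime import datetime
from statistics import median


def lead_times_hours(changes):
    """Return the lead time in hours for each (requested_at, deployed_at) pair."""
    return [
        (deployed_at - requested_at).total_seconds() / 3600
        for requested_at, deployed_at in changes
    ]


# Hypothetical (requested_at, deployed_at) pairs, for illustration only.
changes = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 15, 0)),   # 6 hours
    (datetime(2024, 3, 2, 10, 0), datetime(2024, 3, 3, 10, 0)),  # 24 hours
    (datetime(2024, 3, 4, 8, 0), datetime(2024, 3, 4, 20, 0)),   # 12 hours
]
hours = lead_times_hours(changes)
print(f"Median lead time: {median(hours):.1f} hours")
```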

Customer tickets

Trouble tickets are the most readily available measure of software bugs and other deficiencies that cause rework and user disruption. This is a key element of quality.

Automated test pass %

Another contributor to the overall speed of the DevOps process is the incorporation of automation into the software testing stage. Automated tools evaluate new software far more quickly than humans can. Tracking the incidence of failure helps monitor the efficiency of development, and also the quality of the automated tools themselves.

Defect escape rate

Code defects are going to happen. That’s inescapable. But you’d clearly prefer to catch those defects in the quality assurance (QA) testing stage rather than have users report them in production. Comparing defects caught in testing to those caught in production is a useful gauge of the efficiency of both the development process and the testing infrastructure.
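
One common way to express this comparison is as the share of all known defects that were found in production. The sketch below assumes that definition; the function name and the counts are hypothetical.

```python
def defect_escape_rate(found_in_testing: int, found_in_production: int) -> float:
    """Return the fraction of all known defects that escaped to production."""
    total = found_in_testing + found_in_production
    if total == 0:
        return 0.0  # no defects recorded, so nothing escaped
    return found_in_production / total


# Hypothetical counts for one release, for illustration only.
rate = defect_escape_rate(found_in_testing=45, found_in_production=5)
print(f"Defect escape rate: {rate:.0%}")
```

Here 5 of 50 known defects reached users, an escape rate of 10%.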

Availability

Anyone in IT is vividly aware of the importance of “five nines” availability, the ability to keep the system available for users 99.999% of the time. When users complain of “constant downtime,” having this ratio calculated helps to resolve their concerns.
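
To make “five nines” concrete, the sketch below converts an availability target into a downtime budget and converts measured downtime back into an availability percentage. It assumes a 365-day year and that downtime is tracked in minutes; the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(target_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime budget implied by an availability target."""
    return period_minutes * (100 - target_percent) / 100


def availability_percent(downtime_minutes: float,
                         period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return measured availability as a percentage of the period."""
    return (period_minutes - downtime_minutes) / period_minutes * 100


print(f"Five nines budget: {allowed_downtime_minutes(99.999):.2f} minutes per year")
print(f"One hour of downtime: {availability_percent(60):.4f}% availability")
```

Five nines leaves a budget of only about 5.26 minutes of downtime per year, so a single one-hour outage is enough to miss the target.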

Service level agreements (SLA)

Commitment is a key element to obtaining the confidence of the user community. Establishing a firm Service Level Agreement and regular reporting on fulfillment of it achieves this, as long as you are continually meeting or exceeding your agreements.

Failed deployments

Leveraging the resilience of microservices in containers means that any given defect will likely not bring down the entire system. Full system unavailability disrupts user workflow and directly impacts your availability, which is the key component of most SLAs.

Error rates

Tracking the frequency of occurrence of errors is far more valuable than simply identifying occasional ones. Errors are going to occur for a wide variety of reasons, but a pattern of errors occurring with regularity is a clear indicator of a deeper problem.
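
One simple way to separate occasional errors from a pattern is to compute an error rate per time window and flag only sustained breaches. The sketch below assumes hourly error and request counts are available; the 5% threshold, the three-window rule, and the function names are all hypothetical choices.

```python
def error_rate(errors: int, requests: int) -> float:
    """Return errors as a fraction of requests (0.0 when there was no traffic)."""
    return errors / requests if requests else 0.0


def has_sustained_errors(rates, threshold: float, min_consecutive: int = 3) -> bool:
    """Return True if the rate exceeds the threshold in consecutive windows."""
    streak = 0
    for rate in rates:
        streak = streak + 1 if rate > threshold else 0
        if streak >= min_consecutive:
            return True
    return False


# Hypothetical (errors, requests) counts for five consecutive hours.
hourly_counts = [(2, 1000), (60, 1000), (75, 1000), (80, 1000), (5, 1000)]
rates = [error_rate(e, r) for e, r in hourly_counts]
print("Sustained error pattern:", has_sustained_errors(rates, threshold=0.05))
```

A single spike resets the streak and is ignored, while three hours in a row above the threshold is reported as a pattern worth investigating.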

Application usage and traffic

Developers code solutions for their users to use. If your traffic monitoring reports no activity, there is clearly a problem that must be addressed immediately. Similarly, if you see inordinate traffic, a faulty microservice may be causing the anomaly.

Application performance

Many elements can deteriorate application performance. The causative factor may come from the code, the storage, compilers, the database itself, protocol errors, the service bus, or many other elements. Effective application performance monitoring is a requirement in all environments.

Mean time to detection (MTTD)

The DevOps “need for speed” extends beyond development and deployment to include detection of anomalies. The faster you detect them, the faster you can resolve them.

Mean time to recovery (MTTR)

MTTR sits at the other end of the error handling sequence. Once you’ve detected and identified an anomaly, the time it takes to actually resolve it and return the application to full availability must be measured.
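
Both MTTD and MTTR can be derived from three timestamps per incident. The sketch below follows the description above and measures recovery from the moment of detection; definitions vary, and some teams measure recovery from the start of the failure instead. The `Incident` class, the function names, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Incident:
    started_at: datetime    # when the anomaly began
    detected_at: datetime   # when monitoring or a user reported it
    resolved_at: datetime   # when full availability was restored


def mttd_minutes(incidents) -> float:
    """Mean time from the start of an anomaly to its detection."""
    return mean([(i.detected_at - i.started_at).total_seconds() / 60
                 for i in incidents])


def mttr_minutes(incidents) -> float:
    """Mean time from detection to resolution (assumed definition)."""
    return mean([(i.resolved_at - i.detected_at).total_seconds() / 60
                 for i in incidents])


# Hypothetical incidents, for illustration only.
incidents = [
    Incident(datetime(2024, 5, 1, 10, 0), datetime(2024, 5, 1, 10, 10),
             datetime(2024, 5, 1, 10, 40)),
    Incident(datetime(2024, 5, 2, 14, 0), datetime(2024, 5, 2, 14, 20),
             datetime(2024, 5, 2, 15, 10)),
]
print(f"MTTD: {mttd_minutes(incidents):.0f} min, MTTR: {mttr_minutes(incidents):.0f} min")
```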

Don’t Let Perfection Be the Enemy of Good

Many complain that measuring DevOps is “impossible.” From an end-to-end perspective that may be so, but there are clearly many variables, both objective and subjective, that can be measured to provide valuable and useful insights.
