How To Measure A Process When There’s No Metric

“What gets measured gets managed.”

–Peter Drucker


“If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.”

–H. James Harrington


Sometimes There’s No Tool That Does The Job


You want to measure something, but there is no metric. Now what? Maybe you just can’t find a way to measure something that feels really important, or something that hasn’t been measured before but may really need to be improved. Read on…


There are some endpoints that are very important to a quality improvement project. Your organization may feel that there are certain things that “everyone knows” but are difficult to capture–and therefore to improve. Here we answer the question ‘What if we think something is important but there is no known way to measure it?’


Typical Endpoints Don’t Always Capture What You Want

We’ve run into this situation before. For example, a hospital system was focused on how to admit patients via its trauma service effectively. Staff felt something was “just off,” and there were certain features of the system that people intuitively believed were poor. Typical endpoints didn’t seem to capture all of the pain of the process. The team even tried to look up endpoints used by others in similar circumstances. No luck. What to do now? The team created its own measurement tools and validated its metrics as being both reliable and useful.


How do you create your own? There is a very useful and under-utilized tool in healthcare called the Gauge R&R. “Gauge R&R” stands for gauge repeatability and reproducibility. (Some authors spell “gauge” as “gage.”) This is a unique way to develop your own tools for an important project.


Use The Gauge R&R

The Gauge R&R is a validation process that can be used in several ways. Here’s an example of the Gauge R&R in action. Pretend your health system has a problem with resident oversight. Perhaps state accrediting bodies have come through your organization and said that your ability to watch over residents is poor at best. However, people on the team feel that residents are supervised well. Let’s say you want to settle this with data so that you at least know where you stand, can decide whether to make changes, and can demonstrate the current level of performance to the team.


Perhaps you go to the literature to find different endpoints that have been used for resident oversight. You would like a continuous data endpoint, because continuous data lets you demonstrate meaningful improvement with a much smaller sample. Unfortunately, it is very challenging to find a continuous data endpoint that captures resident supervision. Now what?


You decide to design a tool for your organization. You pull a sample of charts and look for commonalities that seem to distinguish excellent supervision from poor supervision. You formulate a tool that rates resident supervision on a scale from 0 to 5 and serves as a continuous data endpoint. Further, let’s pretend your tool gives examples of what each score, from 0 up to 5, looks like. Now, you would like to test the tool to find out whether it really tells the difference between weak resident supervision and excellent resident supervision. Can different people who use the tool come to the same conclusion reliably (minimizing the chance involved) and repeatedly? Here is where we utilize the Gauge R&R.


There are many specific ways to perform the Gauge R&R. Let’s pretend, here, we perform what is called a 3 by 30 analysis. Three reviewers are placed into a room with 10 charts. An additional staff member keeps track of the charts and hands them out so that each reviewer sees (and rates) each chart three times. The charts are passed to the blinded reviewers in an order set by a random number generator, and they are prepared so that reviewers can’t tell when they’ve received the same chart again. Therefore each of the three reviewers rates 10 charts three times, for 30 ratings per reviewer (90 ratings in all). The ratings are recorded in Excel and analyzed with the Gauge R&R tool from a quality software package.
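The hand-out order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_schedule`, the chart numbering from 1 to 10, and the fixed seed are all hypothetical, and a real study would also need the chart preparation that keeps reviewers from recognizing a repeat.

```python
import random

def build_schedule(n_reviewers=3, n_charts=10, n_repeats=3, seed=0):
    """Return a random chart order for each reviewer.

    Every reviewer sees every chart n_repeats times, in shuffled order.
    (Hypothetical helper, not part of any quality software package.)
    """
    rng = random.Random(seed)  # fixed seed so the schedule can be reproduced
    schedule = {}
    for reviewer in range(1, n_reviewers + 1):
        # Each chart number appears n_repeats times in this reviewer's pile.
        order = [chart for chart in range(1, n_charts + 1)
                 for _ in range(n_repeats)]
        rng.shuffle(order)
        schedule[reviewer] = order
    return schedule

schedule = build_schedule()
total_ratings = sum(len(order) for order in schedule.values())

for reviewer, order in schedule.items():
    print(f"Reviewer {reviewer}: {len(order)} ratings, first five charts {order[:5]}")
print(f"Total ratings collected: {total_ratings}")
```

The staff member tracking the charts would hold this schedule; the reviewers would only ever see de-identified copies.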


It’s Actually An ANOVA…

The Gauge R&R tool is actually an analysis of variance (ANOVA). It can tell you how much of the variability in the examiners’ scores is due to variability in the charts themselves, chance, and other influencers versus the tool itself. Therefore, you can tell whether staff members can utilize this custom-made tool and whether the tool will be useful and reliable for collecting data on your unique endpoint.
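To make the ANOVA idea concrete, here is a minimal sketch of the standard crossed two-way ANOVA variance split (charts by reviewers, with repeats). It is not necessarily what any particular quality software package computes, and everything specific in it is an assumption for illustration: the function name `gauge_rr_components`, the simulated ratings, the reviewer biases, and the noise level are all invented.

```python
import random
from statistics import mean

def gauge_rr_components(ratings):
    """Split rating variance into chart-to-chart, repeatability, reproducibility.

    ratings[chart][reviewer] is a list of that reviewer's repeated scores.
    """
    p, o, r = len(ratings), len(ratings[0]), len(ratings[0][0])
    grand = mean(x for chart in ratings for cell in chart for x in cell)
    chart_means = [mean(x for cell in chart for x in cell) for chart in ratings]
    reviewer_means = [mean(x for chart in ratings for x in chart[j])
                      for j in range(o)]
    cell_means = [[mean(cell) for cell in chart] for chart in ratings]

    # Sums of squares for the two-way crossed design with interaction.
    ss_chart = o * r * sum((m - grand) ** 2 for m in chart_means)
    ss_reviewer = p * r * sum((m - grand) ** 2 for m in reviewer_means)
    ss_cells = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_interaction = ss_cells - ss_chart - ss_reviewer
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i in range(p) for j in range(o) for x in ratings[i][j])

    # Mean squares.
    ms_chart = ss_chart / (p - 1)
    ms_reviewer = ss_reviewer / (o - 1)
    ms_interaction = ss_interaction / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Variance components; negative estimates are set to zero.
    repeatability = ms_error
    interaction = max(0.0, (ms_interaction - ms_error) / r)
    reviewer = max(0.0, (ms_reviewer - ms_interaction) / (p * r))
    chart = max(0.0, (ms_chart - ms_interaction) / (o * r))
    return {
        "chart_to_chart": chart,
        "repeatability": repeatability,            # same reviewer, same chart
        "reproducibility": reviewer + interaction, # differences between reviewers
    }

# Simulated study (invented numbers): 10 charts, 3 reviewers, 3 repeats.
rng = random.Random(0)
true_scores = [0.5 + 4.0 * i / 9 for i in range(10)]
reviewer_bias = [-0.1, 0.0, 0.1]
ratings = [[[score + bias + rng.gauss(0.0, 0.2) for _ in range(3)]
            for bias in reviewer_bias] for score in true_scores]

components = gauge_rr_components(ratings)
total = sum(components.values())
shares = {name: value / total for name, value in components.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of total variance")
```

In a tool that works well, most of the variance should come from the charts themselves, with only a small share from repeatability and reproducibility combined.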


The Gauge R&R can allow you to determine the kappa statistic. The kappa statistic is the amount of inter-rater (between people using the tool) agreement that is NOT owing to chance. This concept took me a moment when I first heard of it, so let’s discuss it a bit further.



The Kappa Statistic Tells You How Much People Really Agree

The kappa statistic measures how much two people agree on a certain finding beyond the agreement that chance alone would produce. This example helped me and so I’ll pass it along.


Consider a room with 100 patients in it. Pretend I tell you that 50 patients have heart murmurs and yet I won’t tell you which ones. You go into the room with your stethoscope, listen to each person, and leave the room to tell me (quite accurately) that 50 people have heart murmurs. I enter the room and also say that 50 people of the 100 have heart murmurs. Now, the question is, do we agree on which 50?  On the face of things, we agree 100% that 50 of 100 have heart murmurs.


The kappa statistic solves this issue. What I described above is called simple agreement. We both agree, on the face of things, 100%. We both said that 50 of 100 people have heart murmurs. The kappa statistic helps tell us how much overlap there is between the 50 people each of us said had heart murmurs. It tells us how much we agree, where that agreement is not owing to chance alone. The Gauge R&R will tell you the kappa statistic for your test. Importantly, we accept a kappa statistic of 0.8 or higher to indicate strong enough agreement to make the tool useful. Now, complexities aside, you have a methodology you can employ to develop an endpoint that has meaning for you.
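The heart murmur example can be worked through with Cohen’s kappa, the usual two-rater form of the statistic: kappa = (observed agreement − chance agreement) / (1 − chance agreement). The sketch below is illustrative only. The helper names `cohen_kappa` and `murmur_calls` are hypothetical, and the patients are simply numbered 0 to 99 so that the overlap between the two examiners can be controlled.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same patients."""
    n = len(rater_a)
    # Observed agreement: fraction of patients given the same label by both.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: based on how often each rater uses each label.
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(label) / n) * (rater_b.count(label) / n)
                   for label in labels)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

def murmur_calls(first_patient, n_patients=100, n_murmurs=50):
    """Mark n_murmurs consecutive patients, starting at first_patient, as murmurs."""
    return [first_patient <= i < first_patient + n_murmurs
            for i in range(n_patients)]

you = murmur_calls(0)  # you say patients 0-49 have murmurs

same_fifty = cohen_kappa(you, murmur_calls(0))     # I pick the same 50
half_overlap = cohen_kappa(you, murmur_calls(25))  # we share only 25 patients
mostly_same = cohen_kappa(you, murmur_calls(5))    # we share 45 patients

print(round(same_fifty, 2))    # 1.0: perfect agreement
print(round(half_overlap, 2))  # 0.0: no better than chance
print(round(mostly_same, 2))   # 0.8: right at the acceptance threshold
```

In all three cases both examiners report “50 of 100,” so simple agreement on the count is 100%, yet kappa ranges from 0 to 1 depending on which 50 patients were named.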


Use The Gauge R&R To Design A New Tool When Necessary

In conclusion, sometimes we want to measure things for which there is no straightforward measurement. The uniqueness of our system may require, at times, a new or different tool to be created. The Gauge R&R is an advanced statistical tool, used in Six Sigma and statistical process control, that determines whether a tool you have designed can be used reliably to tell whether changes have yielded meaningful improvement for your system.


Caution: don’t try this at home. It can be very challenging to execute a Gauge R&R. Remember, you are using a Gauge R&R where there are no good endpoints to capture something that you feel is meaningful or necessary for your quality improvement project. Therefore, take care to execute the Gauge R&R correctly. This can be a challenge because the tool is not commonly employed in statistical process control, especially in healthcare. Want to learn more about the Gauge R&R for your system, or find professional help? Let me know below!