http://bit.ly/2jJ2qlq This episode describes a useful healthcare value metric based on a process capability measure (Cpk) and waste measurement (Cost of Poor Quality).
David Kashmer (@DavidKashmer)
In the last entry, you saw a novel, straightforward metric to capture the value provided by a healthcare service: the Healthcare Value Process Index (HVPI). In this entry, let’s work through another example of exactly how to apply the index to a healthcare service.
At America’s Best Hospital, a recent quality improvement project focused on time patients spent in the waiting room of a certain physician group’s practice. The project group had already gone through the steps of creating a sample plan and collecting data that represents how well the system is working.
From a patient survey, sent out as part of the project, the team learned that patients were willing to wait, at most, 20 minutes before seeing the physician. So, the Voice of the Customer (VOC) was used to set the Upper Specification Limit (USL) of 20 minutes.
A normality test (the Anderson-Darling test) was performed, and the data collected follow the normal distribution as per Figure 1 beneath. (Wonder why the p >0.05 is a good thing when you use the Anderson-Darling test? Read about it here.)
The results of the data collection and USL were reviewed for that continuous data endpoint “Time Spent In Waiting Room” and were plotted as Figure 2 beneath.
The Cpk value for the waiting room system was noted to be 0.20, indicating that (long term) the system in place would produce more than 500,000 Defects Per Million Opportunities (DPMO) with the accompanying Sigma level of < 1.5. Is that a good level of performance for a system? Heck no. Look at how many patients wait more than 20 minutes in the system. There’s a quality issue there for sure.
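To get a feel for what a Cpk of 0.20 implies, here is a minimal sketch that converts a Cpk into the expected share of patients beyond the upper spec limit. It assumes normally distributed data and a one-sided limit (as in this example), and it assumes the long-term figure comes from the conventional 1.5 standard deviation shift used in Six Sigma tables. The function name `fraction_beyond_usl` is just for illustration.

```python
from statistics import NormalDist


def fraction_beyond_usl(cpk: float, shift: float = 0.0) -> float:
    """Expected fraction of normally distributed values above a one-sided USL.

    With only an upper limit, Cpk = (USL - mean) / (3 * sd), so the USL sits
    3 * Cpk standard deviations above the mean. `shift` subtracts an assumed
    long-term drift (1.5 sd is the usual Six Sigma convention).
    """
    z = 3.0 * cpk - shift
    return 1.0 - NormalDist().cdf(z)


cpk = 0.20
short_term = fraction_beyond_usl(cpk)             # no drift assumed
long_term = fraction_beyond_usl(cpk, shift=1.5)   # with the 1.5 sd convention

print(f"short-term DPMO: {short_term * 1_000_000:,.0f}")
print(f"long-term DPMO:  {long_term * 1_000_000:,.0f}")
```

Under those assumptions, roughly 27% of patients wait past 20 minutes in the short term, and the long-term estimate lands above 80%, which agrees with the "more than 500,000 DPMO" statement.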
What about the Costs of Poor Quality (COPQ) associated with waiting in the waiting room? Based on the four buckets of the COPQ, your team determines that the COPQ for the waiting room system (per year) is about $200,000. Surprisingly high, yes, but everyone realizes (when they think about it) that the time Ms. Smith fell in the waiting room after being there 22 minutes because she tried to raise the volume on the TV had gotten quite expensive. You and the team take special note of which items you included from the Profit and Loss statement as part of the COPQ because you want to be able to go back and look after changes have been made to see if waste has been reduced.
In this case, for the physician waiting room you’re looking at, you calculate the HVPI as
(100)(0.20) / (200) or 0.1
That’s not very good! Remember, the COPQ is expressed in thousands of dollars to calculate the HVPI.
Just then, at the project meeting to review the data, your ears perk up when a practice manager named Jill says: “Well, our patients never complain about the wait in our waiting room, which I think is better than that data we are looking at. It feels like our patients wait less than 20 minutes routinely, AND I think we don’t have much waste in the system. Maybe we could do some things like we do them in our practice.”
As a quality improvement facilitator, you’re always looking for ideas, tools, and best practices to apply in projects like this one. So you and the team plan to look in on the waiting room run by the practice manager.
Just like before, the group samples the performance of the system. It runs the Anderson-Darling test on the data and they are found to be normally distributed. (By the way, we don’t see that routinely in waiting room times!)
Then, the team graphs the data as beneath:
Interestingly, it turns out that this system has a central tendency very similar to the first waiting room you looked at–about 18 minutes. Jill mentioned how most patients don’t wait more than 18 minutes and the data show that her instinct was spot on.
…but, you and the team notice that the performance of Jill’s waiting room is much worse than the first one you examined. The Cpk for that system is 0.06–ouch! Jill is disappointed, but you reassure her that it’s very common to see that how we feel about a system’s performance doesn’t match the data when we actually get them. (More on that here.) It’s ok because we are working together to improve.
When you calculate the COPQ for Jill’s waiting room, you notice that (although the performance is poor) there’s less waste as measured by the costs to deliver that performance. The COPQ for Jill’s waiting room system is $125,000. (It’s mostly owing to the wasted time the office staff spend trying to figure out who’s next, and some other specifics to how they run the system.) What is the HVPI for Jill’s waiting room?
(100)(0.06) / (125) = 0.048
Again, not good!
So, despite having lower costs associated with poor quality, Jill’s waiting room provides less value for patients than does the first waiting room that you all looked at. It doesn’t mean that the team can’t learn anything from Jill and her team (after all, they are wasting less as measured by the COPQ) but it does mean that both Jill’s waiting room and the earlier one have a LONG way to go to improve their quality and value!
Fortunately, after completing the waiting room quality improvement project, the Cpk for the first system studied increased to 1.3 and Jill’s waiting room Cpk increased to 1.2–MUCH better. The COPQ for each system decreased to $10,000 after the team made changes and went back to calculate the new COPQ based on the same items it had measured previously.
The new HVPI (with VOC from the patients) for the first waiting room? That increased to 13 and the HVPI for Jill’s room rose to 12. Each represents an awesome increase in value to the patients involved. Now, of course, the challenge is to maintain those levels of value over time.
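The before-and-after comparison for both waiting rooms can be reproduced in a few lines. This is only a sketch of the arithmetic described in the text: the helper name `hvpi` and the dictionary layout are mine, while the Cpk and COPQ values come from the example.

```python
def hvpi(cpk: float, copq_dollars: float) -> float:
    """Healthcare Value Process Index: 100 * Cpk / COPQ, with COPQ in thousands of dollars."""
    return 100.0 * cpk / (copq_dollars / 1000.0)


# (Cpk, COPQ in dollars) before and after the improvement project
rooms = {
    "first waiting room": {"before": (0.20, 200_000), "after": (1.3, 10_000)},
    "Jill's waiting room": {"before": (0.06, 125_000), "after": (1.2, 10_000)},
}

for name, stages in rooms.items():
    before = hvpi(*stages["before"])
    after = hvpi(*stages["after"])
    print(f"{name}: HVPI {before:.3f} -> {after:.1f}")
```

Keeping the same COPQ line items before and after, as the team did, is what makes the two numbers for each room comparable.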
This example highlights how the value provided by a healthcare system for any continuous data endpoint can be calculated and compared across systems. It can be tracked over time to demonstrate increases. The HVPI represents a unique value measure composed of a system capability measure and the costs of poor quality.
Questions or thoughts about the HVPI? Let me know & let’s discuss!
You’ve probably heard the catchphrase “volume to value” to describe the current transition in healthcare. It’s based on the idea that healthcare volume of services should no longer be the focus when it comes to reimbursement and performance. Instead of being reimbursed a fee per service episode (volume of care), healthcare is transitioning toward reimbursement with a focus on value provided by the care given. The Department of Health and Human Services (HHS) has recently called for 50% or more of payments to health systems to be value-based by 2018.
Here’s a recent book I completed on just that topic: Volume to Value. Do you know what’s not in that book, by the way? One clear metric on how exactly to measure value across services! That matters because, after all
If you can’t measure it, you can’t manage it. –Peter Drucker
An entire book on value in healthcare and not one metric which points right to it! Why not? (By the way, some aren’t sure that Peter Drucker actually said that.)
Here’s why not: in healthcare, we don’t yet agree on what “value” means. For example, look here. Yeesh, that’s a lot of different definitions of value. We can talk about ways to improve value by decreasing cost of care and increasing quality, but we don’t have one clear metric on value (in part) because we don’t yet agree on a definition of what value is.
In this entry, I’ll share a straightforward definition of value in healthcare and a straightforward metric to measure that value across services. Like all entries, this one is open for your discussion and consideration. I’m looking for feedback on it. An OVID, Google, and Pubmed search revealed nothing similar to the metric I propose beneath.
First, let’s start with a definition of value. Here’s a classic, arguably the classic, from Michael Porter (citation here).
Value is “defined as the health outcomes per dollar spent.”
Ok so there are several issues that prevent us from easily applying this definition in healthcare. Let’s talk about some of the barriers to making something measurable out of the definition. Here are some now:
(1) Remarkably, we often don’t know how much (exactly) everything costs in healthcare. Amazing, yes, but nonetheless true. With rare exception, most hospitals do not know exactly how much it costs to perform a hip replacement and perform the after-care in the hospital for the patient. The time spent by FTE employees, the equipment used, all of it…nope, they don’t know. There are, of course, exceptions to this. I know of at least one health system that knows how much it costs to perform a hip replacement down to the number and amount of gauze used in the OR. Amazing, but true.
(2) We don’t have a standardized way for assessing health outcomes. There are some attempts at this, such as QALYs, but one of the fundamental problems is: how do you express quality in situations where the outcome you’re looking for is different than quality & quantity of life? The QALY measures outcome, in part, in years of life, but how does that make sense for very acute diseases like necrotizing soft tissue infections (often in patients who won’t be alive many more years whether the disease is addressed or not), or for other items to improve like days on the ventilator? It is VERY difficult to come up with a standard to demonstrate outcomes–especially across service lines.
(3) The entity that pays is not usually the person receiving the care. This is a huge problem when it comes to measuring value. To illustrate the point: imagine America’s Best Hospital (ABH) where every patient has the best outcome possible.
No matter what patient with what condition comes to the ABH, they will have the BEST outcome possible. By every outcome metric, it’s the best! It even spends little to nothing (compared to most centers) to achieve these incredible outcomes. One catch: the staff at ABH is so busy that they just never write anything down. ABH, of course, would likely not be in business for long. Why? Despite these incredible outcomes for patients, ABH would NEVER be reimbursed. This thought experiment shows that valuable care must somehow include not just the attention to patients (the Voice of the Patient or Voice of the Customer in Lean & Six Sigma parlance), but also to the necessary mechanics required to be reimbursed by the third party payors. I’m not saying whether it’s a good or bad thing…only that it simply is.
So, while those are some of the barriers to creating a good value metric for healthcare, let’s discuss how one might look. What would be necessary to measure value across different services in healthcare? A useful value metric would
(1) Capture how well the system it is applied to is working. It would demonstrate the variation in that system. In order to determine “how well” the system is working, it would probably need to incorporate the Voice of the Customer or Voice of the Patient. The VOP/VOC often is the upper or lower specification limit for the system as my Lean Six Sigma and other quality improvement colleagues know. The ability to capture this performance would be key to represent the “health outcomes” portion of the definition.
(2) Be applicable across different service lines and perhaps even different hospitals. This requirement is very important for a useful metric. Can we create something that captures outcomes as disparate as time spent waiting in the ER and something like patients who have NOT had a colonoscopy (but should have)?
(3) Incorporate cost as an element. This item, also, is required for a useful metric. How can we incorporate cost if, as said earlier, most health systems can’t tell you exactly how much something costs?
With that, let’s discuss the proposed metric called the “Healthcare Value Process Index”:
Healthcare Value Process Index = (100) Cpk / COPQ
where Cpk = the Cpk value for the system being considered, COPQ is the Cost of Poor Quality for that same system in thousands of dollars, and 100 is an arbitrary constant. (You’ll see why that 100 is in there under the example later on.)
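Written out as a formula, and using the standard textbook definition of Cpk (which the entry relies on but does not spell out), the index looks like this. The symbols \mu and \sigma are the process mean and standard deviation, and USL and LSL are the upper and lower specification limits set by the VOC; when only one limit exists, only that term applies.

```latex
\mathrm{HVPI} = \frac{100 \cdot C_{pk}}{\mathrm{COPQ}\ \text{(in thousands of dollars)}},
\qquad
C_{pk} = \min\!\left( \frac{\mathrm{USL} - \mu}{3\sigma},\ \frac{\mu - \mathrm{LSL}}{3\sigma} \right)
```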
Yup, that’s it. Take a minute with me to discover the use of this new value metric.
First, Cpk is well-known in quality circles as a representation of how capable a system is at delivering a specified output long term. It gives a LOT of useful information in a tight package. The Cpk, in one number, describes the number of defects a process is creating. It incorporates the element of the Voice of the Patient (sometimes called the Voice of the Customer [VOC] as described earlier) and uses that important element to define what values in the system are acceptable and which are not. In essence, the Cpk tells us, clearly, how the system is performing versus specification limits set by the VOC. Of course, we could use sigma levels to represent the same concepts.
Weaknesses? Yes. For example, some systems follow non-normal data distributions. Box-Cox transformations or other tools could be used in those circumstances. Also, the Cpk depends on the specification limits, and those depend on whose voice set them. So, for each Healthcare Value Process Index, it would make sense to specify where the VOC came from. Is it a patient-defined endpoint or a third party payor one?
That’s it. Not a lot of mess or fuss. That’s because when you say the Cpk is some number, we have a sense of the variation in the process compared to the specification limits of the process. We know how whatever process you are talking about is performing, from systems as different as time spent peeling bananas to others like time spent flying on a plane. Again, healthcare colleagues, here’s the bottom line: there’s a named measure for how well a system represented by continuous data (eg time, length, etc.) is performing. This system works for continuous data endpoints of all sorts. Let’s use what’s out there & not re-invent the wheel!
(By the way, wondering why I didn’t suggest the Cp or Ppk? Look here & here and have confidence you are way beyond the level most of us in healthcare are with process centering. Have a look at those links and pass along some comments on why you think one of those other measures would be better!)
Ok, and now for the denominator of the Healthcare Value Process Index: the Cost of Poor Quality. Remember how I said earlier that health systems often don’t know exactly how much services cost? They are often much more able to tell when costs decrease or something changes. In fact, the COPQ captures the Cost of Poor Quality very well according to four buckets. It’s often used in Lean Six Sigma and other quality improvement systems. With a P&L statement, and some time with the Finance team, the amount the healthcare system is spending on a certain system can usually be sorted out. For more info on the COPQ and 4 buckets, take a look at this article for the Healthcare Financial Management Association. The COPQ is much easier to get at than trying to calculate the cost of an entire system. When the COPQ is high, there’s lots of waste as represented by cost. When low, it means there is little waste as quantified by cost to achieve whichever outcome you’re looking at.
So, this metric checks all the boxes described earlier for exactly what a good metric for healthcare value would look like. It is applicable across service lines, captures how well the system is working, and represents the cost of the care that’s being rendered in that system. Let’s do an example.
Pretend you’re looking at a sample of the times that patients wait in the ER waiting room. The Voice of the Customer says that patients, no matter how not-sick they may seem, shouldn’t have to wait any more than two hours in the waiting room.
(Of course, it’s just an example. That upper specification limit for wait time could have been anything that the Voice of the Customer said it was. And, by the way, who is the Voice of the Customer that determined that upper spec limit? It could be a regulatory agency, hospital policy, or even the director of the ER. Maybe you sent out a patient survey and the patients said no one should ever have to wait more than two hours!)
When you look at the data you collected, you find that 200 patients came through the ER waiting room in the time period studied, and 2 of them waited longer than the two-hour limit. That means 2 defects per 200 opportunities, which is a DPMO (Defects Per Million Opportunities) of 10,000. Let’s look at the Cpk level associated with that level of defect:
Ok, that’s a Cpk of approximately 1.3 as per the table above. Now what about the costs?
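For readers without a conversion table at hand, the DPMO-to-Cpk step just used can be approximated in code. This is a sketch under stated assumptions, not the table itself: it assumes normal data, a one-sided limit, and the conventional 1.5 standard deviation long-term shift that common sigma-level tables build in. The function name `cpk_from_defects` is hypothetical.

```python
from statistics import NormalDist


def cpk_from_defects(defects: int, opportunities: int, shift: float = 1.5) -> tuple[float, float]:
    """Return (DPMO, approximate Cpk) for an observed defect count.

    Assumes normally distributed data, a one-sided specification limit, and
    a 1.5 sd long-term shift. Requires at least one defect, because a zero
    defect rate has no finite z value.
    """
    defect_rate = defects / opportunities
    dpmo = defect_rate * 1_000_000
    z_long_term = NormalDist().inv_cdf(1.0 - defect_rate)
    sigma_level = z_long_term + shift
    return dpmo, sigma_level / 3.0


dpmo, cpk = cpk_from_defects(defects=2, opportunities=200)
print(f"DPMO: {dpmo:,.0f}, Cpk: {cpk:.2f}")
```

With 2 defects in 200 opportunities this gives a Cpk of about 1.28, which rounds to the 1.3 used in the example.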
We look at each of the four buckets associated with the Cost of Poor Quality. (Remember those four buckets?) First, the surveillance bucket: an FTE takes 10 minutes of their time every shift to check how long people have been waiting in the waiting room. (In real life, there are probably more surveillance costs than this.) Ok, so those are the costs required to check in on the system because of its level of function.
What about the second bucket, the cost of internal failures? That bucket includes all of the costs associated with issues that arise in the system but do not make it to the patient. In this example, it would be the costs attributed to problems with the amount of time a person is in the waiting room that don’t cause the patient any problems. For example, were there any events when one staff member from the waiting room had to walk back to the main ED because the phone didn’t work and so they didn’t know if it was time to send another patient back? Did the software crash and require IT to help repair it? These are problems with the system which may not have made it to the patient and yet did have legitimate costs.
The third bucket, often the most visible and high-profile, includes the costs associated with defects that make it to the patient. Did someone with chest pain somehow wind up waiting in the waiting room for too long, and require more care than they would have otherwise? Did someone wait more than the upper spec limit and then the system incurred some cost as a result? Those costs are waste and, of course, are due to external failure of waiting too long.
The last bucket, my favorite, is the costs of prevention. As you’ve probably learned before, this is the only portion of the COPQ that generates a positive Return On Investment (ROI) because money spent on prevention usually goes very far toward preventing many more costs downstream. In this example, if the health system spent money on preventing defects (eg some new computer system or process that freed up the ED to get patients out of the waiting room faster) that investment would still count in the COPQ and would be a cost of prevention. Yes, if there were no defects there would be no need to spend money on preventative measures; however, again, that does not mean funds spent on prevention are a bad idea!
After all of that time with the four buckets and the P&L, the total COPQ is discovered to be $325,000. Yes, that’s a very typical size for many quality improvement projects in healthcare.
Now, to calculate the Healthcare Value Process Index, we take the system’s performance (Cpk of 1.3), multiply it by 100, and divide by 325 (the COPQ in thousands of dollars). We see a Healthcare Value Process Index of 0.4. We carefully remember that the upper spec limit was 120 minutes and came from the VOC, which we list when we report it out. The 100 is there simply to scale the typical answer to a size that’s easier to remember.
We would report this Healthcare Value Process Index as “Healthcare Value Process Index of 0.4 with VOC of 120 min from state regulation” or whoever (whichever VOC) gave us the specification limits to calculate the Cpk. Doing that allows us to compare a Healthcare Value Process Index from institution to institution, or to know when they should NOT be compared. It keeps it apples to apples!
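One way to keep the VOC attached to the number is to store them together. The sketch below is only an illustration of that reporting habit: the class `HvpiReport`, its field names, and the `comparable` rule (same limit from the same source) are my assumptions, and the second report uses made-up VOC details purely to show a mismatch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HvpiReport:
    cpk: float
    copq_dollars: float
    spec_limit: str   # e.g. "120 min"
    voc_source: str   # who set the specification limit

    @property
    def hvpi(self) -> float:
        # COPQ enters the index in thousands of dollars
        return 100.0 * self.cpk / (self.copq_dollars / 1000.0)

    def __str__(self) -> str:
        return (f"Healthcare Value Process Index of {self.hvpi:.1f} "
                f"with VOC of {self.spec_limit} from {self.voc_source}")


def comparable(a: HvpiReport, b: HvpiReport) -> bool:
    """Treat two indices as apples to apples only when the same VOC set the same limit."""
    return (a.spec_limit, a.voc_source) == (b.spec_limit, b.voc_source)


er = HvpiReport(cpk=1.3, copq_dollars=325_000,
                spec_limit="120 min", voc_source="state regulation")
# Hypothetical second site whose limit came from a patient survey instead
other = HvpiReport(cpk=1.3, copq_dollars=325_000,
                   spec_limit="90 min", voc_source="patient survey")

print(er)
print("comparable:", comparable(er, other))
```

The strict equality check is deliberately conservative. A real comparison might accept different sources that happen to set the same limit.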
Now imagine the same system performing worse: a Cpk of 0.7. It even costs more, with a COPQ of $425,000. The Healthcare Value Process Index (HVPI)? That’s (100)(0.7) / (425), or about 0.16. Easy to see it’s bad!
How about a great system for getting patient screening colonoscopies in less than a certain amount of time or age? It performs really well with a Cpk of 1.9 (wow!) and has a COPQ of $200,000. Its HVPI? That’s 0.95. Much better than those other systems!
Perhaps even more useful than comparing systems with the HVPI is tracking the HVPI for a service process. After all, no matter what costs were initially assigned to a service process, watching them change over time with improvements (or worsening of the costs) would likely prove more valuable. If the Cpk improves and costs go down, expect a higher HVPI next time you check the system.
At the end of the day, the HVPI is a simple, intuitive, straightforward measure to track value across a spectrum of healthcare services. The HVPI helps clarify when value can (and cannot) be compared across services. Calculating the HVPI requires knowledge of system capability measures and clarity in assigning COPQ. Regardless of the initial values for a given system and the different ways in which costs may be assigned, trending the HVPI over time may be the more valuable use of the index for a given system.
Questions? Thoughts? Hate the arbitrary 100 constant? Leave your thoughts in the comments and let’s discuss.
http://bit.ly/2izQO63 In this podcast, we discuss 2 key ideas to evaluate your quality improvement system.
David Kashmer (@DavidKashmer)
How would you evaluate a healthcare quality improvement program? Let’s say you’re looking at your healthcare system’s process improvement system and wondering “How good are we at process improvement?” How would you know just how well the quality system was performing?
I’ve sometimes heard this called “PI-ing the PI”, and it makes sense–after all, the idea of building a quality system even extends to learning how well the process improvement (PI) system works.
In the many systems I’ve either worked in, helped design, or have consulted for, I’ve found the question of “How good are we at PI?” can often be boiled down to a matter of efficiency and effectiveness.
Efficiency can be thought of as how little waste there is in the PI process. What is the cycle time from issue identification until closure? How much paper & cost does the PI process incur? Do projects take more than 120 days?
The efficiency question is very difficult to answer in healthcare process improvement, and I think that’s because our systems are not so well developed yet as to have many benchmarks for how long things should take from identification until closure (for example). I often use three months (90 days) as the median time from issue identification to closure, because there are a few papers that cite that number for formal DMAIC projects.
Now, there are a few important statements here: (1) when I say 90 days to issue closure I mean meaningful closure & (2) if 90 days is a median target…what’s the variance of the population?
Let me explain a bit: Lean Six Sigma practitioners are often comfortable with thinking of continuous variables as a distribution with a measure of variance (like range or standard deviation) to indicate just how wide the population at hand is. Quality projects often focus on decreasing the standard deviation to make sure things go better in general. This same approach can be used to “PI the PI” effectiveness. What is the standard deviation of how long it takes to identify and close out an issue for the PI system, for example? How can it be reduced?
These are some of the key questions when it comes to measuring the efficiency of the PI system.
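The efficiency questions above (median time to closure, spread, and how many projects run past 120 days) can be answered from a simple list of cycle times. Here is a minimal sketch. The cycle times are hypothetical placeholder values, not data from any real PI program, and the 90-day and 120-day thresholds are the ones mentioned in the text.

```python
from statistics import median, stdev

# Hypothetical days from issue identification to meaningful closure for ten PI projects
cycle_times_days = [45, 60, 72, 85, 90, 95, 110, 130, 150, 210]

TARGET_MEDIAN_DAYS = 90   # benchmark mentioned in the text
LONG_PROJECT_DAYS = 120   # threshold from the efficiency questions

med = median(cycle_times_days)
spread = stdev(cycle_times_days)  # sample standard deviation
long_share = sum(t > LONG_PROJECT_DAYS for t in cycle_times_days) / len(cycle_times_days)

print(f"median: {med} days (target {TARGET_MEDIAN_DAYS})")
print(f"standard deviation: {spread:.1f} days")
print(f"share over {LONG_PROJECT_DAYS} days: {long_share:.0%}")
```

Tracking the standard deviation alongside the median is the point: a program can hit a 90-day median and still have a long tail of projects that never really close.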
Effectiveness is, arguably, more important than efficiency. For example, imagine working really hard to decrease the amount of time it takes someone to throw something away. Yup, imagine working hard on improving how well someone throws away a piece of trash. Making a process efficient, but ultimately ineffective, probably isn’t worth your time. (I’m sure there’s some counter example that describes a situation where waste disposal efficiency is very important! I just use that example to show how efficiency can be very far removed from effectiveness.)
When it comes to measuring the effectiveness of your PI system, where would you start? Being busy is one thing, but being busy about the right things is likely more important.
One important consideration is issue identification. How does your PI system learn about its issues? Does it just tackle each item that comes up from difficult cases? How do staff highlight issues to PI staff? Is that easy to do? Does your system gather data and decide which issues are a “big enough deal” to move ahead? Does it use a FMEA and COPQ to look at factors that help prioritize issues?
These are some of the most important issue identification factors for your PI system, but by no means are the only ones related to effectiveness.
Once the right issues are acquired in the right way at the right time, where do they go from there? Are all the stakeholders involved in a process to make improvement? Does the system use data and follow those data to decide what really needs to happen, or does it only use its “gut”? Is the PI system politicized, so that data aren’t used, aren’t important, aren’t regarded, or just aren’t made?
In an effective system, the staff at the “tip of the sword” (the end of the process that touches patients) and even those who never see a patient but whose efforts impact them (that’s every staff member, right?) are armed with data they can understand that describe performance. Even better, the staff receive data that they’ve asked for because the PI/QI process tailor-made what data the staff receive. (More on that a little later.)
Once issues are identified, and the PI system performs, what happens with the output? This is another key question regarding effectiveness that can let you know a lot about the health system. There’s an element of user design (#UX) in good PI systems. Do the data get to the staff who need to know? Do the staff understand what the data mean? Are the data in a format that allows them to impact performance? Are the data endpoints (at least some of them) something unique and particular that the staff asked about way-back-when?
Lean Six Sigma is 80% people and 20% math.
You may have heard that old saying. In fact, it’s been said about several quality programs. (I’ve discussed previously that, yes, the system is 80% people but getting the 20% math correct is essential–otherwise the boat won’t float!) It is on this point about effectiveness that I’d like to take a second with you before we go:
One of the major items with quality improvement is the ability to use trusted data to impact what we do for patients for the better.
That’s the whole point right? If the data don’t represent what we do, are the wrong data at the wrong time, or are beautiful but no one can understand them, well, the PI process is not effective.
This, to my mind, is the key question to gauge PI / QI success:
Do we see data impact our behavior on important topics in a timely fashion?
If we do, we have checked many of the boxes regarding efficiency and effectiveness, because, for that to happen, we must have identified key issues, experienced a process that somehow takes those issues and creates meaningful data, taken those data in a format that is understood by the organization, and done it all in a timely fashion that actually changes what we do. That is efficient and effective.
http://bit.ly/2iigxwl This episode explores To Err Is Human, & the idea that healthcare is a decade behind other industries in some important areas.