Use Continuous Data (!)




For quality improvement projects, I prefer continuous data to discrete data.  Here, let’s discuss why classifying data as discrete or continuous matters and the influence this choice can have on your quality improvement project.  For those of you who want to skip to the headline: continuous data is preferable to discrete data for your quality improvement project because you can do a lot more with a lot less of it.


First, let’s define continuous versus discrete data.  Continuous data is data that is infinitely divisible: the kind of data you can divide forever, ad infinitum.  Time is a good example.  One hour can be divided into two groups of thirty minutes, minutes can be divided into seconds, and seconds can continue to be divided on down the line.  Contrast this with discrete data:  discrete data is, in short, data which is not continuous.  (Revolutionary definition, I know.) Things like percentages, levels, and colors come in distinct packets and so can be called discrete.


Now that we have our definitions sorted, let’s talk about why discrete data can be so challenging.  First, when we sample directly from a system, discrete data often demands a larger sample size.  Consider our simple sample size equation for how big a sample of discrete data we need to detect a change:


n = p(1-p) x (2 / delta)^2.


This sample size equation for discrete data has several important consequences.  First, consider the terms.  Here p is the probability of a certain outcome occurring; this is for percentage-type data where we have a yes or no, go or stop, etc.  The delta is the smallest change we want to be able to detect with our sample.


The 2 in the equation is the (approximate) z-score at the 95% level of confidence.  We round the true value of z (about 1.96) up to 2 because that gives us a whole-number sample slightly larger than what’s required rather than a sample with a fraction in it.  (How do you have 29.2 of a patient, for example?) Rounding up matters because rounding down would yield a sample that is slightly too small.


In truth, there are many other factors in sampling besides sample size alone.  However, notice what happens when we work through this sample size equation for discrete data.  Let’s say we have an event with a 5% probability of occurring.  This would be fairly typical for many things in medicine, such as wound infections in contaminated wounds.  Working through the sample size equation, to detect a 2% change in that percentage we have 0.05 x 0.95 x (2 / 0.02)^2.  This gives us approximately 475 samples required to detect a smallest possible change of 2%.  In other words, we need a fairly large sample to see a reasonable change.  We can’t detect a change of 1% with that sample size, so if we think we see 4.8% as the new percentage after interventions to reduce wound infections…well, by force of our sample size, we can’t really say whether anything has changed.
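The arithmetic above can be sketched in a few lines of Python.  This is a minimal sketch of the simplified formula from the text; the function name is mine, not from any standard library:

```python
# Sample size for detecting a change in a proportion (discrete data),
# per the simplified formula: n = p(1-p) * (2/delta)^2.
# The "2" is the rounded z-score for ~95% confidence (true value ~1.96).
import math

def discrete_sample_size(p: float, delta: float) -> int:
    """Estimate samples needed to detect a change of `delta`
    in an event with baseline probability `p`, at ~95% confidence."""
    n = p * (1 - p) * (2 / delta) ** 2
    return math.ceil(n)  # round up: you can't sample 29.2 of a patient

# Worked example from the text: 5% wound infection rate, detect a 2% change.
print(discrete_sample_size(0.05, 0.02))  # 475
```

Note how quickly the required sample grows as delta shrinks: halving delta quadruples the sample size.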


One more thing:  don’t know the probability of an event because you’ve never measured it before?  Make your best guess.  Many of us use 0.5 as the p if we really have no idea; it is also the conservative choice, since p(1-p) is largest at 0.5.  Some sample size calculation is better than none, and you can always revise the p as you start to collect data from the system and get a sense of what the p actually is.


Now let’s consider continuous data.  For continuous data, the sample size required to detect some delta at the 95% level of confidence can be represented as



n = (2 x [historic standard deviation of the data] / delta)^2.


When we plug numbers into this simplified sample size equation, we see very quickly that much smaller samples are required to show significant change.  This is one of the main reasons I prefer continuous to discrete data:  smaller sample sizes can show meaningful change.  That said, for many of the different endpoints you will collect in your quality project, you will need both.  Remember, as with the discrete data equation, you set the delta as the smallest change you want to be able to find with your data collection project.


Interesting trick:  if you don’t know the historic standard deviation of your data (or you don’t have one), take the highest value of your continuous data, subtract the lowest, and divide the result by 3.  Voilà…an estimate of the historic standard deviation.
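As a sketch, the continuous-data formula and the range-based sigma trick might look like this in Python.  The function names and the turnover-time numbers are hypothetical illustrations, not from the text:

```python
# Sample size for continuous data: n = (2*sigma/delta)^2, with the
# range/3 heuristic from the text when no historic sigma is available.
import math

def continuous_sample_size(sigma: float, delta: float) -> int:
    """Samples needed to detect a change of `delta` at ~95% confidence,
    given a (historic or estimated) standard deviation `sigma`."""
    return math.ceil((2 * sigma / delta) ** 2)

def estimate_sigma(highest: float, lowest: float) -> float:
    """Rough historic standard deviation: (max - min) / 3, per the text."""
    return (highest - lowest) / 3

# Hypothetical example: OR turnover times ranging from 20 to 80 minutes,
# where we want to detect a 10-minute change.
sigma = estimate_sigma(80, 20)            # 20.0
print(continuous_sample_size(sigma, 10))  # 16
```

Sixteen cases versus 475: this is the "more with less" advantage of continuous endpoints in concrete terms.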


Another reason continuous data is preferable to discrete data is the number of powerful tools it unlocks.  Continuous data allows us to use many other quality tools, such as the Cpk (process capability index), data power transforms, and useful hypothesis testing.  This can be more challenging with discrete data.  One of the best ways we have seen to represent discrete data is a Pareto diagram.  For more information regarding the Pareto diagram, visit here.


Other than the Pareto diagram and a few other useful tools, discrete data presents more challenges for analysis.  Yes, there are statistical tests, such as the chi-squared proportions test, that can determine statistical significance.  However, continuous data plainly opens up a wider array of options for us.


Continuous data often allows better visual representations and lets the team see the robustness of the process along with its current level of variation.  This can be more challenging with discrete data endpoints.


In conclusion, I like continuous data more than discrete data, and I use it wherever I can in a project.  Continuous data endpoints often allow better visualization of variation in a process.  They also require smaller sample sizes and unlock a fuller cabinet of tools with which we can demonstrate our current level of performance.  In your next healthcare quality improvement project, be sure to use continuous data points where possible and life will be easier!


Disagree?  Do you like discrete data better or argue “proper data for proper questions”?  Let us know!




A Very Personal Take On Medical Errors





It’s A Little Personal…

Let me share a personal story about the important differences between how I was trained in Surgery to think about medical errors and my later training in statistical process control. Here, let’s discuss some personal thoughts on the differences between a systemic approach to error and the more traditional take on error, which focuses on personal assignability. I am sharing these thoughts owing to my experience in different organizations, which have ranged from those that seek to lay blame specifically on one person to those that focus on error reduction at the systemic level. What are some characteristics of each of these approaches?


M&M Is Useful, But Not For Quality Improvement (At Least Not Much)

I remember my general surgical training and subsequent fellowships well, and I’m grateful for them.  I didn’t realize, at the time, how much of my training focused on the personal with respect to quality improvement. What I mean is that, at the morbidity and mortality (M&M) conference, I was trained both directly and indirectly to look at the patient care I provided and to focus on what I could improve personally.  This experience was shared by my colleagues.  The illusion by which we all abide in M&M conference is that we can (and should) overcome all the friction inherent in the system, and that by force of personal will and dedication we should be able to achieve excellent results or great outcomes based on our performance alone. Our M&M presentations don’t focus on how the lab tests weren’t available, how the patient didn’t have their imaging in a timely fashion, or any of the other friction that can add to uncertainty in fluid situations. M&M, as many of my colleagues have said, is a contrivance. Read on, however, because there’s more: while M&M may be a contrivance, it is a very useful contrivance for training staff.


Consider that in the personally assignable world of the M&M conference we often take responsibility for decisions we didn’t make. Part of the idea of the M&M conference to this day (despite the 80-hour restrictions for residents) is that the resident understand the choices made in the OR and be able to defend, or at least represent, them effectively…even if that resident wasn’t in the OR. So, from the standpoint of preventing defects, a case presentation by someone who wasn’t in the OR may help educate the staff…yet it probably doesn’t make for effective process improvement, at least not by itself.


Clearly, this “personal responsibility” tack is an excellent training tool for residents.  Morbidity and mortality conference focuses on what we could do better personally. It forces us to read the relevant data and literature on the choices that were made in the operating room.  It is extraordinarily adaptive to place trainees in a position where they must defend certain choices, understand them, and discuss the risks versus benefits of care in the pre-operative, intra-operative, and post-operative phases.  However, classic M&M is not a vehicle for quality improvement.


In The Real World, There Are Many Reasons For A Positive (Or Negative) Outlier


What I mean by this is that we in statistical process control know (and we in healthcare are learning) that there are many reasons both positive and negative outliers exist.  Personal failure on the part of the provider and staff is only one cause of a “bad” outcome; in fact, most issues have roots in the other categories of what creates variation. This does not mean that, as a provider, I advocate a lack of personal responsibility for poor medical outcomes and outliers in the system.  (I’ve noticed that staff who, like me, grew up with the focus on personally assignable error and a “who screwed up” mentality typically accuse the process of ignoring personal responsibility, owing to their lack of training in or understanding of the process.) However, I recognize that outcomes have “man” or “people” as only one cause of variation.  In fact, as we have described previously on the blog, there are six causes of special cause variation.


There are six categories of reasons why things occur outside the routine variation of a system.  This doesn’t mean that a system’s normal (common cause) variation is even acceptable. In fact, sometimes a system performs at its routine level of variation and that routine level is unacceptable because it generates too many defects. Here, let’s focus on the fact that there are six causes of special cause variation which can yield outliers above and beyond the other values we might see. As mentioned before on the blog, these six causes are the 6 M’s, sometimes referred to as the 5 M’s and 1 P.


Which approach to error and process improvement do I favor?  I favor the more comprehensive approach to error reduction inherent in the statistical process control methodology.  This process is not just for manufacturing; I know this because I’ve seen it succeed in healthcare, where much of the work involved helping other physicians understand what was going on and the philosophy behind it.


Let me explain why there can be so much friction in bringing this rigorous methodology to healthcare.  In healthcare, I believe, we are often slower to adopt changes.  This is especially true for changes in our thought processes and philosophy, and that is largely acceptable: this conservative approach protects patients.  We don’t accept something until we know it works, and works very well.  It does, however, make us later to adopt certain changes (late to the party) compared to the rest of society.  One of these changes is the rigorous approach to process control.


Physicians and surgeons may even feel that patients are so different that no data could embody their complexities.  (That’s another classic challenge to the Lean and Six Sigma process, by the way.) Of course, in time, physicians realize that we use population-level data all the time; we see it in the New England Journal of Medicine, The Lancet, and other journals.  A rigorous study with the scientific method, which is what statistical process control brings, allows us to narrow variation in a population without ignoring individual patient complexities.  After all, we do not commit the fallacy of applying population-level data directly to individual patients; instead, we make system-wide changes that support certain outcomes.  Surprise: after only a month of experiencing the improvements, even physicians come to believe in the methods.


Also, physicians are not trained in this; we see only its fringes in medical education.  This too is adaptive, as there is a great deal to learn in medical education and a complete introduction to quality control may be out of place.  However, the culture of medicine (of which I am a part) often still favors, at least in Surgery, a very personal and self-focused approach to error. I can say with confidence, and after experimentation, that the systemic approach to error reduction is more effective.


Lean & Six Sigma Have Been Deployed In The Real World Of Healthcare…And Yes They Work


As a Medical Director and Section Chief for a trauma and acute care surgery center, I had the opportunity to deploy statistical process control in the real world as my colleagues and I rebuilt a program with administrative support.  This was highly effective and allowed our surgical team to focus on our rate of defect production as a system.  It shifted the focus away from individual differences, which in turn helped team building.  It also gave us a rigorous way to measure interval improvement. These are just a few advantages of statistical process control.


Other advantages included knowing when to make changes and when to let the system chug along. Using statistical process control allowed us to know our type one and type two error rates, which is key to knowing when to change a system.  For more information regarding type one and type two error rates, look here.


There are advantages to both approaches to error.  The straightforward and often more simplistic view of personal responsibility is highly adaptive and very advantageous for training surgeons.  I think that, while training surgeons, we should realize (and make transparent) that this personal approach to error is merely a convention, one useful for teaching, keeping us humble, and focusing on how we can improve personally.  After all, surgical trainees are often in the position of taking responsibility, in a conference format, for decisions over which they had no influence. They also must maintain the illusion that there were no barriers to excellent patient care beyond their control, such as multiple simultaneous trauma activations, lab tests not being performed, or short-staffing on holidays.  Again, personal responsibility and the illusion of complete control over the production of errors are important when the focus is on education, and for this reason the personal approach to error is highly adaptive.


However, when we want to actually make fewer defects, a systemic approach to error, one that recognizes personal issues as just one of six causes of potential defects, is key, as is a rigorous methodology for bringing about change.  Being able to quantify the common cause variation and the special causes of variation in a system is very useful for actually making fewer defects.  As statistical process control teaches us, prevention is the only portion of the cost of poor quality that has a return on investment.  For more information on the cost of poor quality, visit here.


Personal Responsibility Is One Part Of A More Comprehensive View

At the end of the day, I view personal responsibility for medical error as just one portion of a more comprehensive view of error reduction, risk reduction, and true quality control.  As a surgeon, I strongly advocate personal responsibility for patient care and excellent direct patient care.  This is how I was trained.  However, although this is key, my focus on how I can do better personally is part of a larger, more comprehensive focus on the reduction and elimination of defects.  Statistical process control gives us a prefabricated format that uses rigorous mathematical methods to embody, and allow visualization of, our error rate.  Another difference from the more classic model of process improvement in healthcare is that statistical process control degenerates into pejorative discussion less often than the personally focused approach does.


Unfortunately, I have been in systems where staff were overly focused on who made an error (and how) while ignoring the clear issues that contributed to the outcome.  Sometimes it is not an individual’s wanton maliciousness, indolence, or poor care that yielded a defect.  Often, there was friction inherent in the system and the provider didn’t “go the extra mile” that M&M makes us believe is always possible.  Sometimes it is a combination of all these issues and such non-controllable factors as the weather.


The bottom line is that statistical process control, as demonstrated in Lean and Six Sigma methodology, allows us to see where we fit in a bigger picture and to rigorously eliminate errors.  I have found that providers in these systems tend to “look much better” when there has been a rigorous focus on systems issues.  Outcomes that were previously thought unachievable become routine.  In other words, when the system is repaired and supportive, the number of things we attribute to provider defects or patient disease factors substantially decreases.  I have had the pleasure of deploying this at least once in my life as part of a team, and I will remember it as an example of the power of statistical process control and Lean thinking in Medicine.


Questions, comments, or thoughts on error reduction in medicine and surgery?  Disagree with the author’s take on personal error attribution versus a systemic approach?  Please leave your comments below; they are always welcome.

These Two Tools Are More Powerful Together








Using two quality improvement tools together can be more powerful than using one alone. One great example is the fishbone diagram paired with multiple regression, a highly complementary combination.  In this entry, let’s explore how these two tools, together, can give powerful insight and decision-making direction to your system.


You may have heard of a fishbone, or Ishikawa, diagram before. This diagram organizes the multiple possible causes of special cause variation.  From previous blog entries, recall that special cause variation may be loosely defined as variation above and beyond the normal variation seen in a system.  These categories are often used in root cause analysis in hospitals.  See Figure 1.


Figure 1:  Fishbone (Ishikawa) diagram example


As you may recall from previous discussions, there are six categories of special cause variation. These are sometimes called the “6 M’s” or “5 M’s and one P”. They are Man, Materials, Machine, Method, Mother Nature, and Management (the 6 M’s).  We can replace the word “man” with the word “people” to obtain the 5 M’s and one P version of the mnemonic.  In any event, the point is that an Ishikawa diagram is a powerful tool for demonstrating the root causes of different defects.


Although fishbone diagrams are intuitively satisfying, they can also be very frustrating.  For example, once a team has met and created a fishbone diagram, well…now what?  Other than opinion, there is no data demonstrating that what the team THINKS is associated with the defect or outcome variable actually is associated with that outcome.  In other words, the Ishikawa represents the team’s opinions and intuitions.  But is it actionable?  Can we take action based on the diagram and expect tangible improvements?  Who knows.  This is what’s challenging about fishbones:  we feel good about them, yet can we ever regard them as more than a team’s opinion about a system?


Using another tool alongside the fishbone yields greater insight and more actionable data.  We can rigorously demonstrate whether the outcome, variable, or defect is directly and significantly related to the elements of the fishbone the group hypothesized about.  For this reason, we typically advocate taking that fishbone diagram and using it to frame a multiple regression.  Here’s how.


We do this in several steps.  First, we label each portion of the fishbone as “controllable” or “noise”.  Said differently, we try to get a sense of which factors we have control over and which we don’t.  For example, we cannot control the weather.  If sunny weather is significantly related to the number of patients on the trauma service, well, so it is; we can’t change it.  When we perform our multiple regression, we include every identified factor, each labeled as controllable or not, in the model.  Then, depending on how well the model fits the data, we may see what happens when the elements beyond our control are removed, so that only the controllable elements remain.  Let me explain this interesting technique in greater detail.


Pretend we create the fishbone diagram in a meeting with stakeholders. This tells us, intuitively, what factors are related to different measures. We sometimes talk about the fishbone as a hunt for Y = f(x), where Y is the outcome we’re considering, expressed as a function of underlying x’s. The candidate x’s (which may or may not be significantly associated with Y) are identified with the fishbone diagram.  Next, we identify the fishbone elements for which we already have useful data.  We may have rigorous data from some source we trust, or we may need to collect data on our system; either way, we take specific time to identify the x’s for which we have data and then establish a data collection plan. Remember, all the data for the model should cover the same time period.  That is, we can’t mix data from one time period with data from another to predict a Y value at some other time.  Throughout, we label the candidate x’s as controllable or noise (non-controllable).


Next, we create a multiple regression model with Minitab or another program. There are many ways to do this, and some of the specifics are ideas we routinely teach Lean Six Sigma practitioners and clients. These include the use of dummy variables for yes/no data questions, such as “was it sunny or not?” (You can use 0 as no and 1 as yes in your model.) We then perform the regression, taking care to account for confounders if we think two or more x’s are clearly related.  (We will describe this more in a later blog entry on confounding.) Finally, when we review the multiple regression output, we look for an r^2 value greater than 0.80, which indicates that more than 80% of the variability in our outcome data, our Y, is explained by the x’s in the model. We also check r^2 adjusted, a more stringent measure based on the specifics of your data, and we like both values to be high.
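As a rough sketch of this step, here is a least-squares fit with one dummy variable and an r^2 calculation in plain NumPy, standing in for the Minitab output described above. The data, variable names, and relationships are entirely hypothetical:

```python
# Minimal multiple-regression sketch: fit Y = b0 + b1*icu_beds + b2*sunny.
# Y is monthly diversion hours; icu_beds is controllable, sunny is noise.
import numpy as np

# Hypothetical monthly observations.
X = np.array([
    # icu_beds, sunny (1 = yes, 0 = no)
    [10, 1],
    [12, 0],
    [8,  1],
    [14, 0],
    [9,  1],
    [11, 0],
])
y = np.array([30.0, 22.0, 41.0, 15.0, 37.0, 26.0])  # diversion hours

# Least-squares fit with an intercept column.
A = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# r^2: share of the variability in Y explained by the model.
resid = y - A @ coef
r2 = 1 - resid.dot(resid) / ((y - y.mean()) ** 2).sum()
print(f"coefficients: {coef.round(2)}, r^2 = {r2:.3f}")
```

In a real project you would also examine p-values and r^2 adjusted, which a stats package reports directly; the point here is only the mechanics of framing fishbone elements as regression inputs.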


Next, we look at the p-values associated with each of the x’s to determine whether any of them affect Y in a statistically significant manner.  As a final and interesting step, we remove the factors we cannot control and run the model again to determine what portion of the outcome is in our control.  We ask the question: “What portion of the variability in the outcome data is in our control per the choices we can make?”


So, at the end of the day, the Ishikawa / fishbone diagram and the multiple regression are powerful tools that complement each other well.


Next, let me highlight an example of a multiple regression analysis, in combination with a fishbone, and its application to the real world of healthcare:


A trauma center had issues with perceived excess time on “diversion”, the time during which patients are not being accepted and so are diverted to other centers. The center had more than 200 hours of diversion over a brief time period.  For that reason, the administration was floating multiple explanations for why this was occurring.  Clearly, diversion could impact the quality of care for injured patients (who would need to travel further to reach another center) and could represent lost revenue.


Candidate reasons included the idea that the emergency room physicians (names changed in the figure beneath) were simply not talented enough to avoid the situation.  Other proposed reasons included the weather and a lack of available regular hospital floor beds.  The system was at a loss for where to start, and it was challenging for everyone to get on the same page about what to do next with this complex issue.


For this reason, the trauma and acute care surgery team performed an Ishikawa diagram with relevant stakeholders and combined this with the technique of multiple regression to allow for sophisticated analysis and decision making.  See Figure 2.


Figure 2:  Multiple regression output


Variables utilized included the emergency room provider who was working when the diversion occurred (as they had been impugned previously), the day of the week, the weather, and the availability of intensive care unit beds, to name just a sample.  The final regression gave an r^2 value less than 0.80 and, interestingly, the only variable that reached significance was the presence or absence of ICU beds.  How do we interpret this?  The variables included in the model explain less than 80% of the variation in the amount of time the hospital was in a state of diversion (“on divert”) for the month.  However, we can say that the availability of ICU beds is significantly associated with whether the hospital was on divert:  fewer ICU beds was associated with more time on divert.  This gave the system a starting point for correcting the issue.


Just as important was what the model did NOT show.  The diversion issue was NOT significantly associated with the emergency room doctor.  Again, as we’ve found before, data can help foster positive relationships.  Here, it disabused the staff and administration of the idea that the emergency room providers were somehow responsible for (or associated with) the diversion issue.


The issue was staffed beds, so the hospital realized that hiring more nursing staff was one necessary intervention.  Recruitment and retention of new nurses were linked directly to the hospital’s diversion time.  This led to a recruitment and retention push and, shortly thereafter, an increase in the number of staffed beds, making the ICU more available to accept patients.  The diversion challenge resolved as soon as the additional staff was available.


In conclusion, you can see how the fishbone diagram, when combined with multiple regression, is a very powerful technique for determining which issues underlie the seemingly complex choices we make daily.  In the example above, a trauma center utilized these techniques together to resolve a difficult problem. At the end of the day, consider using a fishbone diagram in conjunction with a multiple regression to help make complex decisions in our data-intensive world.


Thoughts, questions, or feedback regarding your use of multiple regression or fishbone diagram techniques? We would love to hear from you.


How The COPQ Helps Your Healthcare Quality Project




Challenging To Demonstrate The Business Case For Your Healthcare Quality Project

One of the biggest challenges with quality improvement projects is clearly demonstrating the business case that drives them.  It can be very useful to estimate the costs recovered by improving quality.  One of the useful tools in Lean and Six Sigma for this is called the Cost of Poor Quality, or COPQ.  Here we will discuss the COPQ and some ways you can use it in your next quality improvement project.


Use The COPQ To Make The Case, And Here’s Why

The COPQ forms a portion of the business case for the quality improvement project you are performing.  Usually, the COPQ is positioned prominently in the project charter.  It may sit after the problem statement or in another location, depending on the template you are using.  Of the many tools of Six Sigma, most black belts employ a project charter as part of their DMAIC project.  For those of you who are new to Six Sigma, DMAIC is the acronym for the steps in a Six Sigma project:  Define, Measure, Analyze, Improve, and Control.  We use these steps routinely, and each step has objectives, often called tollgates, that must be achieved before progressing to the next step.  One of the tools we can use, and again most project leaders do use it routinely, is the project charter.


The project charter defines the scope of the problem.  Importantly, it defines the stakeholders who will participate and the timeline for completion of the project.  It fulfills other important roles too, as it clearly lays out the specific problem to be addressed.  Here is where the COPQ comes in:  we use the COPQ to give managers, stakeholders, and financial professionals in the organization an estimate of the costs associated with current levels of performance.


The Four Buckets That Compose The COPQ

The COPQ is composed of four ‘buckets’:  the cost of internal failures, the cost of external failures, the cost of surveillance, and the cost of prevention.  Let’s consider each as we describe how to determine the cost of poor quality.  The cost of internal failures comprises the costs of problems with the system that do not make it to the customer or end user.  In healthcare, the question of who the customer is can be particularly tricky.  For example, do we consider the customer to be the patient or the third-party payer?  This is challenging because, although we deliver care to the patient, the third-party payer is the one who actually pays for the value added.  That can make it very difficult to establish the cost of internal and other failures.  I believe, personally, this is one of the sources of difficulty when we bring Lean, Six Sigma, and other business initiatives into the healthcare arena:  who, exactly, is the customer?  Whomever we regard as the customer, internal failures, again, are those issues that do not make it to the patient, third-party payer, or eventual recipient of the output of the process.


External failures, by contrast, are those issues and defects that do make it to the customer of the system.  These are often more egregious:  they may be less numerous than internal failures, but they are often visible, important challenges.


Next is the cost of surveillance.  These are the costs associated with things like intermittent inspections from state accrediting bodies, or similar costs that we incur more frequently because of poor quality.  Perhaps our state regulatory body has to come back yearly instead of every three years because of our quality issues; that incurs increased cost.


The final bucket is the cost of prevention.  Costs associated with prevention are the only expenditures in the COPQ on which we earn a Return On Investment (ROI).  For that reason, prevention is perhaps the most important element of the COPQ:  money spent on prevention often actually translates into a return.
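As a rough sketch of how the four buckets roll up into a single COPQ figure for a project charter, consider the following; all dollar figures and cost examples here are hypothetical, purely for illustration:

```python
# Sketch: tallying the four COPQ buckets for a project charter.
# All figures below are hypothetical, for illustration only.

copq_buckets = {
    "internal_failure": 120_000,  # e.g., rework caught before reaching the patient
    "external_failure": 80_000,   # e.g., readmissions, claim denials
    "surveillance": 30_000,       # e.g., extra yearly accreditation visits
    "prevention": 25_000,         # e.g., staff training (the only bucket with ROI)
}

copq_total = sum(copq_buckets.values())
print(f"Estimated annual COPQ: ${copq_total:,}")  # Estimated annual COPQ: $255,000
```

In practice, each line item would come from the finance stakeholder on the project, as described in the next section.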


A Transparent Finance Department Gives Us The Numbers

In order to construct the COPQ, we need ties to the financial part of our organization.  This is where transparency in the organization is key.  In some organizations it can be very challenging to get the numbers we require; in others it is very straightforward.  Arming the team with good financial data helps make a stronger case for quality improvement.  It is key, therefore, that each project have a financial stakeholder so that the quality improvement effort never strays too far from a clear idea of the costs associated with the project and the expectation of costs recovered.  Interestingly, in the Villanova University Lean Six Sigma healthcare courses, a commonly cited statistic is that each Lean and Six Sigma project recovers a median value of approximately $250,000.  This is a routine amount of recovered COPQ, even for healthcare projects.  It can be very striking just how much good quality translates into cost cutting.  In fact, I have found that decreasing the variance in systems, outliers, and bad outcomes has a substantial impact on costs in just the manner we described.


Conclusion:  COPQ Is Key For Your Healthcare Quality Project

In conclusion, the Cost of Poor Quality is a useful construct for your next quality improvement project because it clearly describes exactly what the financial stakeholders can expect to recover from the expenditure of time and effort.  The COPQ features prominently in the project charter used by many project leaders in the DMAIC process.  To establish the COPQ, we obtain financial data from our colleagues in finance who are part of our project.  We then review the cost statements with them and earmark certain costs as costs of internal failure, external failure, surveillance, or prevention.  We then use these to determine the total cost of poor quality.  Additionally, we recognize that the COPQ is often a significant figure, on the order of $200,000 to $300,000 for many healthcare-related projects.


We hope that use of the COPQ for your next quality improvement project helps you garner support and have a successful project outcome.  Remember, prevention is the only category of expenditures in the COPQ that has a positive return on investment.


Thoughts, questions, or discussion points about the COPQ?  Let us know your thoughts below.

CPK Does Not Just Stand For Creatine Phosphokinase





One of the most entertaining things we have found with respect to statistical process control and quality improvement is how some of its many acronyms overlap with those we typically use in healthcare.  One acronym we frequently use in healthcare, but which takes on a very different meaning in quality control, is CPK.  In healthcare, CPK typically stands for creatine phosphokinase.  (Yes, you’re right:  who knows how we in the healthcare world turn creatine phosphokinase into “CPK,” since both the letter p and the letter k come from the second word.  We just have to suspend disbelief on that one.) CPK may be elevated in rhabdomyolysis and other conditions.  In statistical process control, CpK is a process performance indicator that can help tell us how well our system is performing.  CpK could not be more different from CPK, and it is a useful tool in developing systems.


As we said, healthcare abounds with acronyms.  So does statistical process control.  Consider the normal distribution that we have discussed in previous blogs.  The normal, or Gaussian, distribution is frequently seen in processes.  We can run tests such as the Anderson-Darling test, where a p value greater than 0.05 is ‘good’ in that it indicates our data do not deviate from the normal distribution.  See the previous entry, “When is it good to have a p > 0.05?”


As mentioned, having a data set that follows the normal distribution allows us to utilize well-known and comfortable statistical tests for hypothesis testing.  Let’s explore normal data sets more carefully as we build up to considering the utility of Cpk.  Normal data sets display common cause variation.  Common cause variation is variation in a system that is not due to an imbalance in certain underlying factors.  These underlying factors are known as the 6 M’s:  man (or, sometimes, nowadays, just “person”), materials, machines, methods, mother nature, and management/measurement.  These are described differently in different texts, but the key is that they are well-established sources of variability in data.  Again, the normal distribution demonstrates what’s called common cause variation in that none of the 6 M’s is highly imbalanced.


By way of contrast, sometimes we see special cause variation.  Special cause variation occurs when certain findings make the data set vary from the normal distribution to a substantial degree; it arises when one of the 6Ms is so imbalanced that it contributes a great deal of variation and a normal distribution is no longer present.  While such insights can tell us a great deal about our data, there are other process indicators that are commonly used and may yield even more insight.


Did you know, for example, the six sigma process is called six sigma because the goal is to fit six standard deviations’ worth of data between the upper and lower specification limit?  This would ensure a robust process where even a relative outlier of the data set is not near an unacceptable level.  In other words, the chances of making a defect are VERY slim.


We have introduced some vocabulary here, so let’s take a second to review it.  The lower specification limit (or “lower spec limit”) is the lowest acceptable value for a certain system, and the upper spec limit is the highest acceptable value.  In Six Sigma we say the spec limits should be set by the Voice of the Customer (VOC), whether that customer is Medicare, an internal customer, or another group such as patients, a regulatory body, or some other entity.  As mentioned, the term six sigma comes from the fact that one of the important goals at Motorola (which formalized the process) and other companies is to have systems where over 6 standard deviations of data can fit between the upper and lower spec limits.


Interestingly, there are some arguments that only 4.5 standard deviations of data should be able to fit between the upper and lower spec limits in idealized systems, because systems tend to shift slowly over time by plus or minus 1.5 sigma, and forcing 6 standard deviations’ worth between the limits is over-controlling the system.  This so-called 1.5 “sigma shift” is debated among practitioners of six sigma.


In any event, let’s take a few more moments to talk about why all of this is worthwhile.  First, service industries such as healthcare and law generically operate at certain levels of error.  This error rate, again in a general sense, is approximately 1 defect per 1,000 opportunities of making a defect.  This level of error is what is called the 1-1.5 sigma level:  when the data are drawn as a distribution, a defect rate of 1 per 1,000 means a portion of the bell curve falls outside the upper spec limit, the lower spec limit, or both, with only 1-1.5 standard deviations’ worth of data fitting between the limits.  In other words, you don’t have to go far from the central tendency of the data, or the most common values you “feel” when practicing in a system, before you see errors.


…and that, colleagues, is some of the power of this process:  it takes how we feel when practicing in a system (“hey, things are pretty good…I only rarely see problems with patient prescriptions…”) and highlights the sometimes counter-intuitive fact that the rate of defects in our complex service system just isn’t ok.  Best of all, the Six Sigma process makes it clear that more than provider error goes into an issue with a system; in fact, several of the 6Ms usually conspire to make a problem.  This does NOT mean that we are excused from any responsibility as providers, yet it recognizes what the data tell us (over and over again):  many things go into making a defect, and these are modifiable by us.  The idea that it is the nurse’s fault, the doctor’s fault, or any one person’s issue is rarely (yet not never) the case and is almost a non-starter in Six Sigma.  To get to the low error rates we target, and the patient safety we want, the effective system must rely on more than just one of the 6Ms.  I have many stories where defects that lead to patient care issues were built into the system and were discovered when the team collected data on a process, including one where computerized order entry automatically displayed the wrong order on the nurse’s view of the chart.  Eliminating the computer issue, combined with nursing education and physician teamwork as part of a comprehensive program, greatly improved compliance with certain hospital quality measures.


Let’s be more specific about the nuts and bolts of this process as we start to describe CpK.  The bell curve approach demonstrates that a certain portion of the distribution may fall below the lower spec limit, above the upper spec limit, or both.  We can think of the two spec limits as goalposts:  there is a lower spec limit goalpost and an upper spec limit goalpost, and our goal is to have at least 4.5, or more ambitiously 6, standard deviations of data fit easily between them.  This ensures a very, very low error rate.  Again, this is where the term six sigma comes from.  If we can fit approximately 6 standard deviations of data between the upper and lower spec limits, we have a process that makes approximately 3.4 defects per one million opportunities.
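For readers curious where the 3.4 figure comes from, here is a quick sketch using the normal distribution and the conventional 1.5-sigma long-term shift discussed above; the scipy call is simply one common way to compute the relevant tail area:

```python
from scipy.stats import norm

def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million opportunities for a one-sided spec limit,
    applying the conventional 1.5-sigma long-term shift."""
    return norm.sf(sigma_level - shift) * 1_000_000

print(round(dpmo(6.0), 1))  # 3.4 defects per million opportunities
print(round(dpmo(3.0)))     # 66807 -- a typical "3 sigma" process
```

The tail area beyond 4.5 standard deviations (6 minus the 1.5 shift) is what yields the famous 3.4 defects per million.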


Six sigma is more of a goal to achieve for systems than a rigid, prescribed, absolute requirement.  We attempt to progress toward this level with some of the various techniques we will discuss.  Interestingly, we can tie defect rates to sigma levels as described.  Again, one defect per every thousand opportunities is approximately the 1-1.5 sigma level.


There are also other ways to quantify defect rate and system performance.  One of these is the CpK we mention above.  The CpK is a single number that represents how well the process is centered between the lower and upper spec limits and how well it fits between them.  Each CpK value corresponds to a ‘sigma’ value, which in turn corresponds to an error rate.  So a CpK tells us a great deal about system performance in one compact number.
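The post doesn't spell out the formula, but the standard textbook definition of Cpk, the distance from the process mean to the nearer spec limit expressed in units of three standard deviations, can be sketched like this (the example numbers are illustrative):

```python
def cpk(mean: float, stdev: float, lsl: float, usl: float) -> float:
    """Standard Cpk: distance from the process center to the nearer
    spec limit, in units of three standard deviations."""
    return min(usl - mean, mean - lsl) / (3 * stdev)

# A perfectly centered process with spec limits 6 sigma away on each
# side (i.e., a "six sigma" process) has Cpk = 2.0:
print(cpk(mean=10.0, stdev=1.0, lsl=4.0, usl=16.0))  # 2.0
```

A process drifting toward one spec limit lowers the `min(...)` term, so Cpk captures both spread and centering in a single number.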


Before we progress to our next blog entry, take a moment to consider some important facts about defect rates.  You may feel that one error in one thousand opportunities is not bad.  That’s how complex systems fool us…they lull us to sleep because the most common experience is perfectly acceptable, and we’ve already said typical error rates are 1 defect per 1,000 opportunities…that’s low!  However, if that 1-1.5 sigma rate were acceptable, there would be several important consequences.  Let’s use that error rate to highlight some real-world manifestations in high-stakes situations.  If the 1-1.5 sigma rate were acceptable, we would be ok with one plane crash each day at O’Hare airport.  We would also be comfortable with thousands of wrong site surgeries every day across the United States.  In short, the 1-1.5 sigma defect rate is not appropriate for high-stakes situations such as healthcare.  Tools such as the CpK, sigma level, and defect rate give us a common understanding of how different systems perform and a sense of the level at which each system should perform.  This framework is easily shared by staff across companies who are trained in six sigma, and practitioners looking at similar data sets come to similar conclusions.  We can benchmark systems and follow them over time.  We can show ourselves our true performance (as a team) and make improvements.  This is very valuable from a quality standpoint and gives us a common approach to often complex data.


In conclusion, it is interesting to see that a term we typically use in healthcare has a different meaning in statistical process control.  CPK is a very valuable lab test in patients who are at risk for rhabdomyolysis, and for those who have the condition, yet CpK is also key for describing process centering and defect rates.  Consider using CpK to describe the level of performance for your next complex system and to help represent overall process functionality.


Questions, thoughts, or stories of how you have used the CpK in lean and six sigma?  Please let us know your thoughts.

When Is It Good To Have p > 0.05?

Some of the things I learned as a junior surgical resident were oversimplified.  One of these is that a p value less than 0.01 is “good.”  In this entry we discuss when it is appropriate for a p value to be greater than 0.01, and those times when it’s even favorable to have one greater than 0.05.  We invite you to take some time with this blog entry, as it contains some of the most interesting facts we have found about hypothesis testing and statistics.


In Lean and Six Sigma, much of what we do is take existing statistical tools and apply them to business scenarios so that we have a more rigorous approach to process improvement.  Although we call the process Six Sigma or Lean depending on the toolset we are using, in fact both are pathways to set up a sampling plan, capture data, and rigorously test those data to determine whether we are doing better or worse with certain system changes–and we get this done with people, as a people sport, in a complex organization.  I have found, personally, that the value of using data to tell us how we are doing is that it disabuses us of instances where we think we are doing well and we are not.  It also focuses our team on team factors and system factors, which are, in fact, responsible for most defects.  Using data prevents us from becoming defensive or angry at ourselves and our colleagues.  That said, there are some interesting facts about hypothesis testing that many of us knew nothing about as surgical residents.  In particular, consider the p value.


Did you know, for example, that you actually set certain characteristics of your hypothesis testing when you design your experiment or data collection?  When designing a project or experiment, you need to decide at what level to set your alpha.  (This relates to p values in just a moment.) The alpha is the risk of making a type 1 error.  For more information about type 1 errors, please visit our earlier blog entry about type 1 and type 2 errors here.  For now, let’s say the alpha risk is the risk of tampering with a system that is ok; that is, alpha is the risk of thinking there is an effect or change when in fact there is none.  So, when we set up an experiment or data collection, we set the alpha risk inherent in our hypothesis testing.  Of course, certain conventions in the medical literature determine what alpha level we accept.


Be careful, by the way, because alpha is used in other fields too.  For example, in investing, alpha is the amount of return on your mutual fund investment that you will get IN EXCESS of the risk inherent in investing in the mutual fund.  In that context, alpha is a great thing.  There are even investor blogs out there that focus on how to find and get this extra return above and beyond the level of risk you take by investing.  If you’re ever interested, visit


Anyhow, let’s pretend we say we are willing to accept a 10% risk of concluding that there is some change or difference in our post-changes state when in fact there is no actual difference (10% alpha).  In most cases the difference we see could vary in either direction:  our values post changes could be either higher or lower than they were pre changes.  For this reason, it is customary to use what is called a two tailed p value.  The alpha risk is split between the two tails of the distribution (i.e., the values post changes are higher or lower than by chance alone), so we say that if the p value is greater than 0.05 (a 5% alpha risk in either direction) we conclude there is no significant difference in our data between the pre and post changes we made to the system.


The take home is that we decide, before we collect data (to keep the ethics of it clean), how we will test these data to conclude whether there is a change or difference between the two states.  We determine what we will count as a statistically significant change based on the conditions we set:  what alpha risk is too high to be acceptable in our estimation?


Sometimes, if we have reason to suspect the data may or can vary in only one direction (such as prior evidence indicating an effect only going one direction or some other factor) we may use a one tailed p value.  A one tailed p value simply says that all of our alpha risk is lumped in one tail of the distribution.  In either case we should set up how we will test our data before we collect them.  Of course, in real life, sometimes there are already data that exist, are high quality (clear operating definition etc.) and we need to analyze them for some project.
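As a small illustration of the one-tailed versus two-tailed relationship described above, here is a sketch using scipy's two-sample t test; the pre/post measurements are made-up numbers purely for illustration:

```python
from scipy.stats import ttest_ind

# Hypothetical pre/post measurements (minutes); we use a one-tailed test
# only if prior evidence says the effect can run in one direction.
pre = [5.1, 4.9, 5.0, 5.2, 4.8]
post = [5.6, 5.4, 5.5, 5.7, 5.3]

_, p_two = ttest_ind(post, pre)                         # two-tailed
_, p_one = ttest_ind(post, pre, alternative="greater")  # one-tailed

print(f"two-tailed p = {p_two:.5f}, one-tailed p = {p_one:.5f}")
```

When the observed effect lies in the hypothesized direction, the one-tailed p value is half the two-tailed value, because all of the alpha risk sits in one tail.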


Next, let’s build up to when it’s good to have a p > 0.05.  After all, that was the teaser for this entry.  This brings us to some other interesting facts about data collection and sampling.  In Lean and Six Sigma, we tend to classify data as either discrete or continuous.  Discrete data can take only certain defined values, such as yes/no or red/yellow/blue.  Continuous data, by contrast, is data that is infinitely divisible.  One way I have heard continuous data described, and one I use when I teach, is that continuous data can be divided in half forever and still make sense:  an hour can be divided into two groups of 30 minutes, minutes can be divided into seconds, and seconds can continue to be divided.  This infinitely divisible type of data makes a continuous curve when plotted.  In Lean and Six Sigma we attempt to utilize continuous data whenever possible.  Why?  The answer makes for some interesting facts about sampling.


First, did you know that we need much smaller samples of continuous data in order to demonstrate statistically significant changes?  Consider a boiled-down sampling equation for continuous data versus discrete data.  A sampling equation for continuous data is (2s/delta)^2, where s is the historic standard deviation of the data and delta is the smallest change you want to be able to detect.  The 2 comes from rounding up the z score at the 95% level of confidence.  For now, just remember that this is a generic, conservative sampling equation for continuous data.


Now let’s look at the sampling equation for discrete data:  p(1-p)(2/delta)^2.  Let’s plug in what it would take to detect a 10% difference in discrete data.  Using p = 50% for the probability of a yes or no, we find that we need a large sample to detect a small change:  (0.5)(0.5)(2/0.1)^2 = 100 data points.  For continuous data, using similar methodology, we need much smaller samples; for reasonably small deltas this may be only 35 data points or so.  Again, this is why Lean and Six Sigma utilize continuous data whenever possible.  So, now, we focus on some sampling methodology issues and the nature of the p value.
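The two sampling equations can be sketched in a few lines of Python; the example inputs (p = 0.5 with a 10% delta, and a 15-minute standard deviation with a 5-minute delta) are illustrative assumptions, not values from a real project:

```python
import math

Z = 2  # rounded-up z score at ~95% confidence, as in the equations above

def n_continuous(s: float, delta: float) -> int:
    """(2s/delta)^2 -- sample size for continuous data."""
    return math.ceil((Z * s / delta) ** 2)

def n_discrete(p: float, delta: float) -> int:
    """p(1-p)(2/delta)^2 -- sample size for discrete (proportion) data."""
    return math.ceil(p * (1 - p) * (Z / delta) ** 2)

# Detecting a 10% shift in a yes/no rate near p = 0.5:
print(n_discrete(0.5, 0.10))  # 100 data points
# Detecting a 5-minute shift when the historic SD is 15 minutes:
print(n_continuous(15, 5))    # 36 data points
```

The contrast is the whole point: the same modest detection goal costs roughly three times as many observations with discrete data as with continuous data in this example.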


Next, consider the nature of statistical testing and some things you may not have learned in school.  Did you know that underlying most of the common statistical tests is the assumption that the data involved are normally distributed?  Again, normally distributed means the data form a histogram that follows a Gaussian curve.  In the real world of business, manufacturing, and healthcare, it is often not the case that data are actually distributed normally.  Sometimes data may be plotted and look normally distributed when in fact they are not.  This invalidates some of the assumptions behind common statistical tests:  we can’t use a t test on data that are not normally distributed, because Student’s t test assumes the data are normally distributed.  What can we do in this situation?


First, we can rigorously test whether our data are normally distributed.  There is a named test, the Anderson-Darling test, that compares our data’s distribution to the normal distribution.  If the p value accompanying the Anderson-Darling test statistic is greater than 0.05, our data do not deviate significantly from the normal distribution:  we conclude the data are normal and can use the common statistical tests known and loved by general surgery residents (and beyond) everywhere.  However, if the Anderson-Darling test indicates that our data are not normally distributed (that is, the p value is less than 0.05), we must look for alternative ways to test our data.  This was very interesting to me when I first learned it.  In other words, a p value greater than 0.05 can be good, especially if we are looking to demonstrate that our data are normal so that we can go on to use hypothesis tests which require normally distributed data.  Here are some screen captures that highlight Anderson-Darling.  Note that, in Fig. 1, the data DON’T appear to be normally distributed by the “eyeball test” (the “eyeball test” is when we just look at the data and go with our gut).  Yet, in fact, the data ARE normally distributed and p > 0.05.  Figure 2 highlights how a data distribution follows the expected frequencies of the normal distribution.
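If you want to try this yourself, scipy ships an Anderson-Darling implementation.  One caveat: scipy reports the test statistic against fixed critical values (at the 15%, 10%, 5%, 2.5%, and 1% levels) rather than an exact p value, so the decision rule below compares against the 5% critical value; the simulated "time" data are purely illustrative:

```python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(42)
sample = rng.normal(loc=50, scale=5, size=200)  # simulated "time" data

result = anderson(sample, dist="norm")
crit_5pct = result.critical_values[2]  # critical value at the 5% level
if result.statistic < crit_5pct:
    print("Consistent with a normal distribution (fail to reject at 5%)")
else:
    print("Evidence of non-normality (reject at 5%)")
```

Tools such as Minitab present the same test with an explicit p value, which is the presentation the figures below show.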



Figure 1:  A histogram with its associated Anderson-Darling test statistic and p value > 0.05.  Here, p > 0.05 means these data do NOT deviate from the normal distribution…and that’s a good thing if you want to use hypothesis tests that assume your data are normally distributed.



Figure 2:  These data follow the expected frequencies associated with the normal distribution.  The small plot in Figure 2 demonstrates the frequencies of data in the distribution versus those of the normal distribution.

As with most things, the message that a p value less than 0.01 is good and one greater than 0.01 is bad is a vast oversimplification.  However, it is probably useful as we teach statistics to general surgery residents and beyond.

So, now that you have a methodology for determining whether your data are normally distributed, let’s talk about what to do next–especially when you find that your data are NOT normally distributed and you wonder where to go.  In general, there are two options for continuous data sets that are not normally distributed.  The first is to transform the data with what is called a power transformation.  There are many different power transformations, including the Box-Cox transformation and the Johnson transformation, to name a few.


The power transforms take the raw, non-normally distributed data and raise them to different powers:  the 1/2 power (taking the square root), the second power, the third power, and so on.  The optimal power, the one that brings the data closest to the normal distribution, is identified.  The data are then replotted after transformation to that power, and the Anderson-Darling test (or a similar test) is performed on the transformed data to determine whether they are now normally distributed.
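As a sketch of that workflow, scipy's boxcox function searches for the optimal lambda (the power) and returns the transformed data, which can then be re-tested for normality exactly as described above; the right-skewed "wait time" data here are simulated for illustration:

```python
import numpy as np
from scipy.stats import anderson, boxcox

rng = np.random.default_rng(7)
waits = rng.exponential(scale=30.0, size=300)  # right-skewed "wait time" data

# boxcox finds the power (lambda) that best normalizes the data
transformed, lam = boxcox(waits)
print(f"Optimal Box-Cox lambda: {lam:.2f}")

# Re-test normality on the transformed data, as described above
print(f"A-D statistic after transform: {anderson(transformed, dist='norm').statistic:.3f}")
```

Note that Box-Cox requires strictly positive data, which is one reason time-based measurements are a natural fit for it.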


Often the power transformations will allow the data to become normally distributed.  This brings up an interesting point:  pretend we are looking at a system where time is the focus.  The data are not normally distributed and we perform a power transform which demonstrates that time squared is a normally distributed variable.  Interestingly we may have a philosophic management question.  What does it mean to manage time squared instead of time?  These and other interesting questions arise when we use power transforms.  The use of power transforms is somewhat controversial for that reason. Sometimes it is challenging to know whether the variables have meaning for management when we use power transforms.


However, on the bright side, if we successfully “Box-Cox-ed” or otherwise power-transformed the data to normality, we can now use the common statistical tests.  Remember, if the initial data set is transformed, all subsequent data must be transformed to the same power.  We have to compare apples to apples.


The next option for dealing with a non-normal data set is to utilize statistical tests that do not require normally distributed input.  These include the Levene test and the so-called KW, or Kruskal-Wallis, test.  The Levene test examines data variability, while the Kruskal-Wallis test is a rank-based comparison of groups that does not assume normality.  Another test, the Mood’s median test, tests the median value for non-normal data.  So, again, we have several options for addressing non-normal data sets.  Usually, as we teach the Lean and Six Sigma process, we reserve the handling of non-normal data for at least a black belt level of understanding.
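A minimal sketch of these non-parametric options using scipy; the door-to-doctor times under three hypothetical staffing models are invented for illustration:

```python
from scipy.stats import kruskal, levene

# Hypothetical door-to-doctor times (minutes) under three staffing models
shift_a = [12, 15, 14, 30, 11, 13, 45, 12]
shift_b = [22, 25, 24, 60, 21, 23, 19, 28]
shift_c = [12, 16, 13, 14, 35, 12, 15, 11]

# Kruskal-Wallis: rank-based comparison of the groups (no normality assumed)
kw_stat, kw_p = kruskal(shift_a, shift_b, shift_c)
# Levene: do the groups differ in variability?
lev_stat, lev_p = levene(shift_a, shift_b, shift_c)

print(f"Kruskal-Wallis p = {kw_p:.3f}, Levene p = {lev_p:.3f}")
```

Because these tests work on ranks or absolute deviations rather than raw values, the skewed outliers (the 45- and 60-minute waits) do not invalidate the analysis the way they would for a t test or ANOVA.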


At the end of the day, this blog post explores some interesting consequences of the choices we make with respect to data and some interesting facts about hypothesis testing.  Again, there is much more choice involved than I ever understood as a general surgical resident.  Working through the Lean and Six Sigma courses (and finally the master black belt course) taught me about the importance of how we manage data and, in fact, ourselves.  The more than ten projects in which I have participated have really highlighted these facts about data and reinforced the textbook learning.


An interesting take-home message is that a p value less than 0.01 does not mean all is right with the world, just as a p value greater than 0.05 is not necessarily bad.  After all, tests like the Anderson-Darling test tell us when our data are normally distributed and when we can continue using the more comfortable hypothesis tests that require normally distributed data.  In this blog post, we described some interesting ways to deal with data that are not normally distributed so as to improve our understanding and conclusions based on continuous data sets.  Whenever possible, we favor continuous data, as it requires a smaller sample size with which to make meaningful conclusions.  However, as with all sampling, we have to be sure that our continuous data sample adequately represents the system we are attempting to characterize.


Our team hopes you enjoyed this review of some interesting statistics related to the nature and complexity of p values.  As always, we invite your input as statisticians or mathematicians, especially if you have special expertise or interest in these topics.  None of us, as Lean or Six Sigma practitioners, claim to be statisticians or mathematicians.  However, the Lean and Six Sigma process is extremely valuable in applying classic statistical tools to business decision-making.  In our experience, this approach to data-driven decision making has yielded vast improvements over models based on opinion or personal experience.


As a parting gift, please enjoy (and use!) the file below to help you select which tool to use to analyze your data.  This tool, taken from Villanova’s Master Black Belt Course, helps me a great deal on a weekly basis.  No viruses or spam from me, I promise!


Takt Time and Value Added Time in Surgery and Healthcare Processes

By:  David M. Kashmer, MD MBA MBB


These Tools Are Valuable

Two of the most undervalued tool sets in healthcare are the Lean and Six Sigma tool sets.  We hear a common refrain from physicians, nurses, and healthcare workers that Lean and Six Sigma tools, along with other statistical process control tools, are not useful in service industries–particularly not in healthcare.  In our experience, this isn’t correct.


It’s A Matter Of Training

Often, healthcare workers are not trained in these tools and therefore find little value in them.  However, in our experience, with training and understanding, healthcare workers find these tools just as useful as the broader audience that uses them frequently.  In fact, many of the tools healthcare workers are looking for to articulate certain words or ideas have already been worked out in the well-known tools of statistical process control.  These can be quite valuable in healthcare and other service lines.


Sometimes We Use “Stealth Sigma”

Our team has experience with turnarounds and realignments in more than five trauma centers where we have found the Lean and Six sigma toolset to be invaluable.  We often have to change the moniker associated with this set of tools so as to avoid being too off-putting towards our healthcare colleagues.  Sometimes we call them “statistical process control” so that there’s less pushback caused by use of the term “Lean” or “Six Sigma”.  In fact, some colleagues have a term for the type of deployment where we avoid “Lean” and “Six Sigma”–those deployments get called “stealth sigma”.


Many of us on the team were trained in healthcare and currently practice clinically.  We understand the skepticism of our colleagues, as we initially had it ourselves before we were trained in the tools.  Healthcare colleagues, here’s an important headline:  many of the tools you are currently re-inventing in your various fields have already been worked out.  There are even processes for using them.  They’re called Lean and Six Sigma.  Ok, stepping down off the soapbox…


It’s only natural for us to be biased and a little evangelical.  After all, several of us are Master Black Belts in Lean and Six Sigma (degrees of Six Sigma education have names that sound like karate belts), accredited by various bodies throughout the United States.  Until we learned these tools, we didn’t appreciate their power to improve healthcare.  Here, allow me to stop testifying and start sharing some of our experience as we focus on two useful Lean tools.

Let’s Talk About A Case


A healthcare system was having issues with stressed workers and backlog.  The concept of takt time was easily applied to demonstrate issues with the system.  Takt time represents the drumbeat of a service line.  Another way to describe it, and one we often use with healthcare workers, is as the heartbeat of their patient.  Takt time is the time required to produce one unit of whatever the service line is producing, such as a patient admission or a surgical procedure.  Takt time is an average, and of course there is variability in the rates of production in practice.  However, takt time gives us an idea of what the drumbeat of the situation should be based on customer demand.


Definition of Takt Time

Takt time is calculated as the total available work time divided by the demand on the system.  That is, if, after breaks and other interruptions, there is one hour available in a day to actually do work, and three patients usually show up to the hospital to be admitted in that hour (the demand on that system), the takt time for admissions is one third of an hour per admission.  Said differently, it’s 1 hour available to do admissions / 3 admissions to be done.  This is one third of 60 minutes, or 20 minutes per admission.  Concepts like these give us an idea of what the drumbeat of the system needs to be.
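To make the arithmetic concrete, here is a minimal sketch of the takt time calculation; the function name is ours, and the numbers are simply the example above.

```python
# Takt time = total available work time / customer demand.
# The numbers below come straight from the example in the text:
# 60 minutes available for admissions, 3 admissions demanded.

def takt_time(available_minutes, demand_units):
    """Return the takt time in minutes per unit of demand."""
    return available_minutes / demand_units

print(takt_time(60, 3))  # 20.0 minutes per admission
```

The same function works at any scale: an 8-hour (480-minute) workday with 24 expected admissions gives the same 20-minute drumbeat.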


We can use takt time in many ways.  One of the most useful is to pair it with a visual diagram of the process called a value stream map.  Value stream mapping helps us better understand processes and services.  By mapping out the times associated with different portions of the value stream, we can figure out how long it actually takes us to produce one unit of whatever we are trying to accomplish, and compare that to the takt time.  If takt time is 20 minutes per admission, yet it usually takes us 40 minutes per admission, we probably need to look for where we can cut wasted time and bring the process speed closer to takt time.
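As a rough sketch of that comparison, the snippet below sums step times from a value stream map and measures the gap against takt time.  The step names and minute values are invented for illustration; only the 40-versus-20 totals echo the example in the text.

```python
# Hypothetical value stream for one admission: every step name and
# duration below is invented for illustration, not measured data.
process_steps_minutes = {
    "triage and registration": 10,
    "waiting for a bed assignment": 15,
    "history and physical": 12,
    "orders entered": 3,
}

takt_minutes_per_admission = 20  # drumbeat demanded by patient arrivals
actual_minutes_per_admission = sum(process_steps_minutes.values())

# A positive gap means the process runs slower than demand requires.
gap = actual_minutes_per_admission - takt_minutes_per_admission
print(f"Actual: {actual_minutes_per_admission} min, "
      f"takt: {takt_minutes_per_admission} min, "
      f"to recover per admission: {gap} min")
```

Laying the numbers out this way points directly at the steps (here, the waits) where time can be cut.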


We can then see whether there are discrepancies between our takt time, the drumbeat the demand requires, and the time it actually takes us to produce one unit.

Focus On Value Added Time (VAT)


Another useful output of the value stream map is something called value-added time.  A troubling statistic often taught in Lean and Six Sigma courses is that, in most systems, only approximately 1% of the time spent in the system actually adds value to a product, service, or patient.  “What adds value” is defined as the benefit or item for which the customer will pay.


In healthcare there are some special issues in applying this definition.  For example, who is the payer in the situation?  If value-added time is time spent on anything for which the customer will pay, who is the customer?  We usually take the third party payer’s perspective, because they are usually the ones actually paying for the services and systems.  Rather than debate who should be paying for services in American healthcare, we focus on who does.  In this respect we treat the third party payer, the source of funds, as the entity actually paying for the services.


This also has some interesting consequences.  The third party payer bases their payment on the physician, surgeon, or other healthcare provider’s notation.  In effect, what they are actually paying for is the tangible product they see:  the note.  They also use the note as a rationale to decline payment.  If we provided a service but didn’t write it down, we would not be reimbursed.  This is part of how third party payers control costs, whether they mean to or not.  We may have done several procedures, yet it is unlikely we would be reimbursed if we didn’t document exactly what we did with clear and often exacting documentation.  The note is the product for which the provider is paid.  Of course, without rendering the service there can be no note.


We feel strongly that it is improper (to say the least) to write a note for services or procedures that were not performed.  This is likely fraud in most cases, and we do not suggest writing notes on patients for procedures or care that was not delivered.  However, we acknowledge that a strong focus in the healthcare value chain needs to be on the production of a medicolegally compliant, provider-protective, and exact note that satisfies the ever-increasing regulatory requirements of third party payers.  At the end of the day, what third party payers pay for is the service given, as represented by the note.

Apply VAT Concept To Everyday Processes


Let’s return to the claim that only 1% of time in most systems is spent adding value to a patient, process, or other entity.  Experientially, this seems to be true.  When we have mapped out value streams in healthcare, we have found that only approximately 1% of the time is used to actually add something the third party payer will eventually reimburse.  A four-day hospital stay for cholecystitis is reimbursed with one global payment based on the service as represented by the note.  Are there opportunities to streamline note-writing, patient care, and cholecystectomy performance to decrease the amount of time spent in non-value-added activities?  (You may be laughing, because the answer is clear to anyone who has worked in healthcare:  yes, of course.  There is a great deal of waste and re-work.)


The fact that only about 1% of time in a system is value-added is often counterintuitive to the project group until they see the numbers.  Once the amount of non-value-added time is established and made tangible, it becomes much more straightforward to reduce it.  Reducing non-value-added time is useful because much of it is waste.  There are exceptions, as you can imagine.  (Sometimes one process must be completed in preparation for a value-added step later.)  However, making the amount of non-value-added time crystal clear, tangible, and visible on a value stream map greatly improves processes and builds consensus among physicians, nurses, and other allied health practitioners.
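One way to make the non-value-added time tangible is to tag each mapped step as value-added or not and compute the fraction.  The sketch below does this for a hypothetical cholecystitis stay; every step name and duration is invented to illustrate the calculation, not taken from a real chart review.

```python
# Each tuple is (step, minutes, adds_value).  All entries are
# hypothetical, chosen only to show how the fraction is computed.
steps = [
    ("waiting for an inpatient bed", 360, False),
    ("admission H&P and note", 30, True),
    ("waiting for OR availability", 2160, False),
    ("cholecystectomy performed", 90, True),
    ("postoperative monitoring and waits", 2800, False),
    ("discharge note and instructions", 20, True),
]

total_minutes = sum(minutes for _, minutes, _ in steps)
value_added_minutes = sum(m for _, m, adds_value in steps if adds_value)

print(f"Value-added: {value_added_minutes} of {total_minutes} minutes "
      f"({value_added_minutes / total_minutes:.1%})")
```

Even with these made-up numbers, the value-added fraction lands in the low single digits; seeing a figure like that computed from a team’s own value stream map is what makes the waste tangible.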


Use These Two Tools Together

In our practice, we suggest focusing on these two tools as important adjuncts to process development.  Again, takt time gives us a sense of our patient’s heart rhythm, and value stream mapping lets us see how our processes are performing relative to it.  For more information on takt time, value stream mapping, and value-added time, we invite you to visit Wikipedia or a Lean Six Sigma site after a Google search.


Remember, to all our friends in healthcare:  we’ve been there and feel your pain.  We are surgeons, advanced practitioners, and nurses too.  Let us tell you:  the tools for which you are looking, or the ones you are re-inventing, have already been worked out and are called Lean and Six Sigma.  Feel free to borrow our wheel anytime rather than working out how to build your own.