Healthcare Quality Podcast: 4 Types of Bad Metrics In Healthcare

By:  David Kashmer MD MBA FACS (@DavidKashmer) & Vivienne Neale (@SupposeIAm)



DDD, bringing you the metrics behind the data. Here is your host, Vivienne Neale.

Hi, and welcome to DDD, which is Data Driven Decision Radio, episode six. My name is Vivienne Neale and I am delighted to be back with you. For those who have asked, my background is in education, training, broadcasting and social media. Of course, I am also a sometime patient curious to know what decisions are being taken in my name that might just affect me and of course thousands of people just like me. So, once again I am joined by David Kashmer, Chair of Surgery at Signature Healthcare. David is an expert in statistical process control, including Lean and Six Sigma. He has a special interest in new tools to improve healthcare, like gamification. David also edits and writes for a blog called So, hi David and welcome back.


Vivienne, thank you so much for the warm welcome and it’s great to be here again with you and the listeners.


Right, well I don’t know about you, but I’ve actually found something in the news this week that I thought you and the listeners might find quite interesting.


Vivienne, it would be great to hear. You always have interesting news stories for us and I’m interested to hear another.


DDD News


I have gone slightly left field here and I’ve decided to look at a story that actually has nothing to do with health care, but has a considerable amount to do with big data. So, in the UK we have an underground, or would I be better to call it a metro system?


[Laughter]. I think we would understand you with either word, Vivienne, but both are welcome. The underground or metro system in London is a great mode of transport, really amazing.


It is and it gets better and better. However, we are about to introduce all night trains and there is a little bit of an issue with the staff who are going to run them. There have been a series of strikes, including one in February 2014. You might say to me, “Well, excuse me, what has a strike on the London underground got to do with health or anything else?” Well, it was interesting because the Universities of Oxford and Cambridge did a joint study where they explored the data that is generated by travellers when they use a swipe card, which you load money onto and then you can literally run through the underground all day and then at the end of the day it will calculate how much you’ve spent.


It’s the Oyster Card.


It is the Oyster Card, yes. 10 out of 10 [laughter]. So, anyway, they found out that a sizable fraction of commuters were able to find better routes to work and ironically actually produced a net economic benefit. The reason they found this out was that they examined 20 days’ worth of anonymised Oyster Card data that actually contained more than 200 million data points. They were able to see how individual tube journeys changed during the strike and what happened… this particular strike only resulted in a partial closure of the tube network and so not every commuter was affected by the strike. So, you could actually compare directly what was going on. So, the data enabled researchers to see whether people chose to go back to their normal commute once the strike was over, or if they found a more efficient route and decided to switch. So, therefore you can start seeing that actually regular commuters affected by the strike, either because certain stations were closed or because travel times were considerably different, about 1 in 20 decided to stick with their new route once the strike was over. So, I started thinking that they also did… while the proportion of individuals who ended up changing their routes may sound small, researchers found that the strike actually ended up producing a net economic benefit. By performing a cost benefit analysis of the amount of time saved by those who changed their daily commute, the researchers found that the amount of time saved in the longer term actually outweighed the time lost by commuters during the strike. You wouldn’t expect that kind of knowledge to be thrown up really and it’s only through big data, and we are talking massive data, 200 million data points that throw this up. They also found out, which I also found interesting that the London tube map, which is iconic, may have been a reason why many commuters didn’t find their optimal journey before the strike. 
In many parts of London the actual distances between stations are distorted on the iconic map. By digitising the tube map and comparing it to the actual distances between stations, the researchers found that those commuters living in or travelling to parts of London where distortion is greatest were more likely to have learned from the strike and found a more efficient route. So, what I was thinking is how is big data throwing up very different, surprising insights into health practice in the US?
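The cost benefit reasoning described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the researchers' actual method: the function name `net_minutes_saved` and every input figure are hypothetical, apart from the "about 1 in 20" switching rate mentioned in the discussion.

```python
def net_minutes_saved(affected_commuters, one_in_n_switch,
                      minutes_lost_each, minutes_saved_per_trip,
                      trips_after_strike):
    """Long-run minutes saved by switchers minus minutes lost during the strike."""
    # One-off cost: every affected commuter loses some time during the strike.
    minutes_lost = affected_commuters * minutes_lost_each
    # Recurring benefit: only commuters who keep their new route save time.
    switchers = affected_commuters // one_in_n_switch
    minutes_saved = switchers * minutes_saved_per_trip * trips_after_strike
    return minutes_saved - minutes_lost


# Hypothetical inputs for illustration only; none of these figures come from the study.
net = net_minutes_saved(
    affected_commuters=100_000,
    one_in_n_switch=20,          # "about 1 in 20" kept the new route
    minutes_lost_each=20,
    minutes_saved_per_trip=2,
    trips_after_strike=400,
)
print(f"Net minutes saved: {net:,}")
```

The point of the sketch is that a small share of switchers can still outweigh a one-off loss, because their saving repeats on every later trip while the strike cost is paid only once.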


Vivienne, big data, just as you say today and as we’ve said previously on the podcast, is powerful especially in that it generates these surprising conclusions and the ability to process these large data sets is part of what has empowered us to see these counter intuitive or surprising conclusions. In the United States, we see that in many different places. For example, on a somewhat smaller scale in what is sometimes called “small data” or “little data”, as an allusion to big data, we have the ability to use data sets from the hospital to do predictive modeling for things like when our hospital will be so full we’ll have to divert patients. When I say small data or little data, these are still large data sets. The bottom line is that the computing power we have now coupled with statistical sophistication lets us see things that we’ve never seen before and just as with this news story, the classic is necessity is the mother of invention, we can see how changes in our department of surgery or in our hospital… sometimes small tweaks, like for example, opening up more intensive care unit beds can have large impact and even predictable impact on the rest of our system. That’s one use of fairly large data and larger data sets that I have used before in part of our surgical system. We have made a predictive model for when our hospital would have to divert patients so that we would know ahead of time and be able to offset it. Things like ICU bed capacity made a world of difference for all sorts of end points through our hospital and it would not have been as easy to tease out, or we wouldn’t have been as focused with our interventions without the use of larger data sets and sophisticated statistical modelling techniques. 
So, I think the lesson learned from analogous situations like the London tube strike, or the most recent London tube strike, is that this power that we have now can bring us to sometimes counterintuitive conclusions, or sometimes just surprising conclusions that we wouldn’t have guessed otherwise. I think it’s a good lesson, and as you said, it’s right there in the news today.


What I found really interesting was the fact that most people hadn’t bothered to find their optimal route until they were forced to experiment. That says something about human nature, but it’s also something we should learn from. Perhaps we shouldn’t be too frustrated that we can’t always get what we want or that others sometimes take decisions for us. That was the conclusion of one of the co-authors, Dr Tim Willems from Oxford University’s Department of Economics. He said, and the final note to the article was, if we behave anything like London commuters and experiment too little, hitting such constraints may very well be to our long term advantage. That turns everything on its head, doesn’t it? That through adversity, another take on your necessity is the mother of invention, but through adversity we do discover other things. I think with the use of big data, those gains could be huge.


Vivienne, I think it’s a very interesting message and in the quality control and improvement world we have two related techniques. One is called the Design of Experiments, where on a small scale we tweak certain portions of a system and through the use of statistics we can tell if the changes in the system’s output are related to our initial tweaks. Again, it’s called DOE, or Design of Experiments, and we do it prospectively rather than analysing data retrospectively and using a more big data technique. So, there are lots of ways to get it… just what the professor said. The fact that sometimes tweaking and innovating around that is key. The last technique is more of a theory actually called the theory of constraints. The theory of constraints goes along with this idea of these boundaries. You need to understand what they are, and again, if you just stayed within the current frame you would never find, for example, your faster way to work. So, what an interesting article.
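As a rough illustration of the Design of Experiments idea, here is a minimal Python sketch of a two-factor, two-level full factorial design with its main effects. The factor names (`icu_beds`, `early_discharge`) and the response values are made up for the example, and a real analysis would normally add replicated runs and a significance test such as ANOVA before drawing conclusions.

```python
from itertools import product
from statistics import mean

# Two hypothetical factors, each run at a low (-1) and a high (+1) setting.
FACTORS = ("icu_beds", "early_discharge")

# Full factorial design: every combination of the two settings (4 runs).
design = list(product((-1, 1), repeat=len(FACTORS)))

# Made-up responses, e.g. hours on diversion per week, one per run.
responses = [12, 10, 6, 4]

# Each run records its factor settings plus the measured response "y".
runs = [dict(zip(FACTORS, levels), y=y) for levels, y in zip(design, responses)]


def main_effect(runs, factor):
    """Average response at the high setting minus average at the low setting."""
    high = [r["y"] for r in runs if r[factor] == 1]
    low = [r["y"] for r in runs if r[factor] == -1]
    return mean(high) - mean(low)


effects = {factor: main_effect(runs, factor) for factor in FACTORS}
print(effects)
```

With these invented numbers, moving `icu_beds` from low to high changes the response far more than `early_discharge` does, which is the kind of comparison a designed experiment is set up to make.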


You mentioned retrospective data and I think that whole business of using data retrospectively has lots of problems to it, don’t you think? I think it is something that perhaps we ought to talk about today in a bit more detail than we’ve mentioned in the past.


Well Vivienne, thanks for highlighting that issue and I think today it’s worth spending some time on some of the different types of bad metrics we have seen in healthcare. I think there are four broad categories that we see recurrently in healthcare that really make it challenging to have a data driven department or hospital, when after all it’s very valuable to have a data driven mentality because it allows us to make all of these advancements, like the ones you talked about. So, I will share with you today some of the different types of difficult metrics I have seen and how they impact us as patients.


Right. So, would one of those be the difficulties when you can’t actually collect accurate or complete data? Does that cause a problem? Well, I suppose it does [laughter].


You have highlighted one right at the outset and that is these metrics for which we cannot collect accurate or complete data. Vivienne, it is surprisingly challenging in hospitals to collect data, accurate complete data. Often, in fact, in some western management systems data collection is frowned upon or it’s an afterthought. People are so busy that it’s an imposition. So, as we launch in here, let me just tell you that often in hospitals we will hear that we can’t collect any data or any more data or data on a particular problem. Of course, if it’s not measured, it’s really challenging to manage. So, if you can’t afford the time to collect good data, well sometimes it’s useful to stop and think, we really can’t afford not to collect good data.


So, in fact, you are talking about some staff. I don’t think it’s specifically in health. It’s in education certainly, I know that, where people are seeing it as an imposition. I suppose before you do anything, a change of perception, perspectives and attitudes is important before you do anything. Like you say, you can’t afford not to collect big data or data of any kind.


You are exactly right. Not only is it changing people’s thoughts about data collection, there are certain techniques that can make data collection much quicker and more accurate. For Six Sigma in particular, we have certain ways to collect prospective data that really take a second, two seconds at most as patients come through the system. So, part of it is leading the culture and our colleagues around us to understand the importance of data collection and then there are specific techniques that can be used and specific ways to set up data collection for both discrete and continuous data that get us the data we need a lot faster and I’m happy to share those with the listeners. If they want to email us through our website or get in touch, we can talk about some specifics for how to set up data collection to get at good data for difficult managerial problems.


We are DDD. Data Driven Radio.


Well, I’m going to throw you a curve ball. We have been talking in this country, just today in fact, about robotics and how robots are becoming more useful to us in the workplace, and I assume in healthcare and everything else, where there will be certain manual tasks or simple tasks, or not quite so simple tasks that will be done by robots. Now, that would be quite interesting if those robots automatically collected data as well, wouldn’t it?


It would be and you probably know that robot comes from a Czech word for servant. In fact, if I remember right, it was coined by Isaac Asimov, but I may be wrong about that part. [I was wrong here. It was Josef Capek.] The fact is, yes, it would be very useful, especially with some of these rapid data collection techniques, to have a robot or a similar automaton that does some pre-described straightforward task and also collects us some data. Yeah, it’s an interesting opportunity and it segues nicely into another type of bad data, or bad data metric. Those are metrics that complicate operations and create excessive overhead. Those are some of the worst ones, Vivienne. The ones where you think you have something important to measure and just the doing of it is so arduous and it makes things so difficult and it takes staff time. Those are metrics that should be designed out and probably aren’t as useful.


Have you got any examples of those?


Sure. Often in healthcare, the reason that staff recoil at data collection is when you hear data collection it can mean one page filled with 10-12 checkboxes that float amidst the sea of checkboxes, vital signs, prescribed forms that we have to fill out. Something even as arduous as one more page can make a big difference for our colleagues when we are at the front line taking care of a patient, or our nursing colleagues who are checking a patient into the post anaesthesia recovery unit. It can be challenging if it’s even one more form. So, yes, I can think of probably five or six stories where management wanted some data collected and the doing of it was just really difficult.


So, in fact the development of specialised software which is ongoing will probably take a lot of the pain away, enabling healthcare workers, surgeons, teachers, whoever, to actually get on with the business that they feel that they are trained for. I saw in the news today actually that there is software being pioneered by some law firms which takes out the really basic jobs like checking whether a contract is appropriate and legal. Which are really tedious jobs that you would give a junior to do that can be done really quickly by software alone.


Well, electronic health records and similar tools hold great potential to allow us to retrospectively pull data. For the Six Sigma and quality projects, we really like to collect prospective data and often the end points that have the most meaning in our systems are not ones that are typically thought of to be included in an EHR. Again, electronic health records can be very useful, but the Six Sigma teaching is to collect data prospectively in statistically valid sample sizes directly from the line or the process. So, I think you are exactly right. Electronic health records hold great potential, and I would add that just as valuable, if not more, is prospective data as we do it.
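To give a feel for what a "statistically valid sample size" can mean in practice, here is a small Python sketch using the standard textbook formulas for estimating a proportion (discrete data) and a mean (continuous data) to within a chosen margin of error. The function names and example inputs are assumptions for illustration; the discussion does not say which formulas the speakers themselves use.

```python
from math import ceil
from statistics import NormalDist


def z_value(confidence=0.95):
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def sample_size_proportion(p, margin, confidence=0.95):
    """Samples needed to estimate a proportion p to within +/- margin."""
    z = z_value(confidence)
    return ceil(z**2 * p * (1 - p) / margin**2)


def sample_size_mean(sigma, margin, confidence=0.95):
    """Samples needed to estimate a mean to within +/- margin, given std dev sigma."""
    z = z_value(confidence)
    return ceil((z * sigma / margin) ** 2)


# Hypothetical examples: a defect rate guessed at 50% (the worst case),
# and a turnaround time with an assumed standard deviation of 10 minutes.
print(sample_size_proportion(p=0.5, margin=0.05))
print(sample_size_mean(sigma=10, margin=2))
```

Both formulas show the same trade-off: halving the acceptable margin of error roughly quadruples the number of observations to collect, which is worth knowing before asking busy staff to record anything.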


Have you got another example of bad data for us?


Vivienne, one of the other big problems we see routinely is metrics that are exceedingly complex and ones that don’t tell a story. There is a value to having a metric that gives a feel for how things are going. You may have seen ones created, like the happiness index for countries or some similar metric that almost is a statistic with a humanised element built in. Metrics that are exceedingly complex, that are difficult to explain, that don’t have an intuitive feel, those are tough. On the blog, we have an entry where we briefly describe a metric for operating room readiness. Are you ready to hear, Vivienne, what the operating room readiness was for our department of surgery for last month? Do you want to hear what it was?


Yeah, hit me with it.


Okay, it was…pumpkin. Now, if you are recoiling or confused, well “pumpkin” is a confusing answer or metric or statistic for how well the operating room is doing in terms of readiness. Yet, all the time throughout healthcare and surgery, we have metrics like pumpkin for operating room readiness that really don’t tell us much, or that answer a question in an odd way. That is really challenging. So, like using the score pumpkin for operating room readiness, sometimes in healthcare we have metrics that are complex or difficult to explain or lack a feel to them.


I am speechless here [laughter].


It’s a strange example, but what is stranger are some of the unusual metrics we see on quality dashboards all the time. That idea of just putting some metric on a dashboard… a bad metric, Vivienne, can also cause employees to just ‘make their numbers’. When staff can’t feel the metrics or you have something that really doesn’t have a lot of meaning or you can’t see how it affects patient care, it can be very hard to engage yourself in wanting to collect it, wanting to improve it and that type of challenge makes a metric less useful. So, a metric that causes an employee to just make their numbers or focus on the number rather than what it means, that’s a whole other category of bad metric in healthcare, and they are ones that we see all the time.


Right, so would you like to give us a couple of takeaways, bearing in mind that you’ve been researching and looking into these problems with data?


Yes. I would say, Vivienne, wherever possible, especially as we continue in the information age with so much data and big data that we should take the opportunity to make sure the end points we collect are tailored to have meaning for us, ones that we can feel. I think the way to avoid being hit by the train of bad metrics, and we can see the train coming, it’s important to step out of the way. Some of the metrics that would be most useful for healthcare are the ones that can help us tell a story of our patient care, ones that have a meaning to them, rather than just as we often see, dry percentages and different ones like that. So, especially as we collect more and more data in healthcare, I have this feel like [22:51], a visual display of quantitative information. We have a lot of info and it’s important to represent it in a way that holds meaning for us. So, wherever possible it’s useful to have humanised endpoints that don’t complicate operations or create excessive overhead and that are straightforward to explain to others, and ones that we can collect accurately and completely. I think those are the takeaways.


Well, thank you very much, David, and I’m sure this is something health teams all over the place will consider very carefully. In fact, we are very interested to hear what innovative practices are being undertaken in your health provision. So, if you’d like to appear on the show, contact us through our website. David, would you like to give us the address?


Sure. They can contact us via and there is an address linked to that page, or they can contact us at our info address, which is


So, we look forward to hearing from you. Meanwhile, if you’ve liked the show, do leave us a rating on iTunes, and you can catch us on Soundcloud too. It’s one way we can ensure the word is spread. So, we look forward to being with you next time. So, from David and from me, it’s bye for now.

We are DDD, Data Driven Radio. Catch us on Soundcloud and iTunes.