Conceptualizing “over-treatment” waste: Don’t deny health economics

A Health Policy Brief published in Health Affairs on December 13, 2012 referenced an analysis published in JAMA in April 2012 regarding waste in health care.  In this analysis, Don Berwick (one of my health care heroes) and Andrew Hackbarth (from RAND) estimated that waste in health care consumed between $476 billion and $992 billion of the $2.6 trillion annual health care expenditures in the US.  That’s roughly 18-38% waste.  They divided this estimate into 6 categories of waste.  Their mid-point estimates are as follows:

Berwick and Hackbarth estimates of waste in health care - JAMA 2012

They consider “failures of care delivery” to include failures to execute preventive services or safety best practices, resulting in avoidable adverse events that require expensive remediation.  By “failures of care coordination,” they mean care that is fragmented, such as poorly planned transitions of care, resulting in avoidable hospital readmissions.  They categorize as “overtreatment” care that ignores scientific evidence, or that providers order to increase income, to avoid medical malpractice liability, or simply out of convenience or habit.  They consider “administrative complexity” to be spending that results from the “inefficient or flawed rules” of insurance companies, government agencies or accreditation organizations; they estimated its magnitude by comparing administrative expense in the US to that in Canada’s single-payer system.  They consider “pricing failures” to be prices greater than those justified by the cost of production plus a “reasonable profit,” presumably due to the absence of price transparency or market competition.  Finally, they consider “fraud and abuse” to be the cost of fraudulent medical bills, plus the cost of the additional inspections and regulations needed to catch such wrongdoing.

Underestimating Over-treatment

These estimates are generally in alignment with other attempts to categorize and assess the magnitude of waste in health care.  But, I think Berwick and Hackbarth’s estimates of “overtreatment” are probably far too low.  That’s because they, like so many other health care leaders, are so reluctant to address the issue of cost-effectiveness.  Obviously, the definition of over-treatment depends on one’s philosophy for determining which treatments are necessary in the first place.  Everyone would agree that a service that does more harm than good for the patient is not necessary.  Most would agree that a service that a competent, informed patient does not want is not necessary.  Some argue that, if there is no evidence that a treatment is effective, it should not be considered necessary, while others argue that even unproven treatments should be considered necessary if the patient wants them.  Berwick and Hackbarth are ambiguous about how they handle this last category.

But, the big disagreement occurs when evaluating treatments for which there is evidence that the treatment offers some benefit, but the magnitude of the benefit is small in relation to the cost of the treatment.  This is a question about cost-effectiveness.  It is at the heart of medical economics.  In my experience, most health care leaders and an even higher proportion of political leaders choose to deny the principles of medical economics and the concept of cost-effectiveness.  They describe attempts to apply those principles as “rationing” — a term which has taken on a sinister, greedy meaning, rather than connoting the sincere application of rational thought to the allocation of limited resources.   Berwick and Hackbarth implicitly take that view.  They are unwilling to define over-treatment based on cost-ineffectiveness.

The analysis I want to see

For years, I’ve been looking for an analysis that attempted to estimate the magnitude of waste from over-treatment based on the principles of health economics.  The diagram below illustrates the hypothetical results of the type of analysis I’d like to see.

Diagram re Conceptualizing Overtreatment

In this diagram, the horizontal axis represents the total cost of health care for a population.  I don’t mean the entire US health care system; more relevant is the population served by an Accountable Care Organization or an HMO.  To create such a diagram, we would first need to break down health care cost into a large number of specific treatment scenarios.  Each of these scenarios would specify a particular treatment (or diagnostic test) with enough clinical context to permit an assessment of the likely health and economic outcomes.  For each scenario, each of the possible health outcomes would be assigned a probability, a duration, and a quality of life factor.  By multiplying the duration by the quality of life factor, we could calculate the “quality-adjusted life years” (or “QALY”) for the outcome.  Then, by taking the probability-weighted average of the QALYs across all the possible health outcomes for the scenario, and dividing the cost of the scenario by that result, we could calculate the “cost-effectiveness ratio” for the scenario, measured in “$/QALY.”  Finally, we would sort all the treatment scenarios by their cost-effectiveness ratios, with the scenarios having the most favorable health economic characteristics on the left.
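
To make the arithmetic concrete, here is a minimal sketch of that scenario-level calculation in Python.  The scenario names, probabilities, durations, quality-of-life factors and costs are all invented for illustration; nothing here comes from an actual analysis.

```python
def expected_qalys(outcomes):
    """Probability-weighted QALYs across a scenario's possible outcomes.
    Each outcome is a (probability, duration_years, quality_of_life) tuple."""
    return sum(p * years * qol for p, years, qol in outcomes)

# Hypothetical treatment scenarios; every name and number is invented.
scenarios = [
    {"name": "Scenario A", "cost": 12_000,
     "outcomes": [(0.80, 10, 0.9), (0.20, 10, 0.5)]},
    {"name": "Scenario B", "cost": 45_000,
     "outcomes": [(0.95, 2.0, 0.7), (0.05, 0.5, 0.3)]},
]

for s in scenarios:
    qalys = expected_qalys(s["outcomes"])
    s["cost_per_qaly"] = s["cost"] / qalys          # the "$/QALY" ratio

# Sort so the most favorable (lowest $/QALY) scenarios sit on the left
# of the diagram's horizontal axis.
scenarios.sort(key=lambda s: s["cost_per_qaly"])
for s in scenarios:
    print(f'{s["name"]}: ${s["cost_per_qaly"]:,.0f} per QALY')
```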

Some of the scenarios will generate net savings, such as for certain preventive services where the cost of the avoided disease is greater than the initial cost of the preventive service.  These are clearly cost-effective.  On the other end of the spectrum are scenarios that offer net harm to the patient, such as when adverse side-effects are worse than the benefits of the treatment.  These are clearly cost-ineffective.  In the middle of these extremes are scenarios where there is a positive net benefit to the patient and a positive net cost borne by the population.

If a person rejects the principles of health economics, they would consider all of these middle scenarios to be “necessary” or “appropriate” regardless of how small the benefits or how large the costs.  But, among those who accept the principles of health economics, some of these scenarios could be judged to be cost-effective and others to be cost-ineffective.  Such judgments would presumably reveal some threshold cost-effectiveness ratio that generally separated the scenarios into cost-effective and cost-ineffective.  Since different people have different values, their judgments could reveal different cost-effectiveness thresholds.  If we had many people making these judgments, we could find a range of cost-effectiveness ratios that were considered to be reasonable by 90% of the people.    Applying this range to all the treatment scenarios, one could find a group of scenarios that were considered wasteful by most, and another group of scenarios that were considered wasteful only by some.
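
The final classification step might look like the following sketch, assuming a hypothetical pair of threshold values bounding the range that most judges would consider reasonable.  The ratios and thresholds are invented.

```python
# Hypothetical $/QALY ratios for a handful of scenarios, and an equally
# hypothetical band of thresholds: the range of ratios that, say, 90% of
# judges might accept as a reasonable dividing line.
LOWER, UPPER = 50_000, 150_000     # $/QALY

ratios = {"Scenario A": 1_500, "Scenario B": 34_000,
          "Scenario C": 90_000, "Scenario D": 400_000}

def classify(cost_per_qaly):
    if cost_per_qaly <= LOWER:
        return "cost-effective by nearly everyone's standard"
    if cost_per_qaly <= UPPER:
        return "considered wasteful only by some"
    return "considered wasteful by most"

for name, ratio in ratios.items():
    print(f"{name}: {classify(ratio)}")
```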

Variations on this theme have been used throughout the world for decades by various practice guidelines developers, healthcare policy analysts, health services researchers and health economists.  It is complex and time-consuming.  As I’ve discussed before, it is also controversial in the United States.

Right now, in the US, we all recognize that health care costs are too high.  We’re all focusing on merging providers into larger organizations, installing computer software, and exploring new reimbursement arrangements to address the problem.  But, I’m convinced that over-treatment with cost-ineffective services is a major source of waste.  We will inevitably get back to the core issue of having to figure out which treatment scenarios are wasteful.  We will inevitably have to overcome our denial of health economics and our irrational fear of rational allocation.

 


The debate about what to maximize when selecting candidates for care management programs: Accuracy? ROI? Net Savings? Or Cost-Effectiveness?

When doing analysis, it is really important to clarify up front what it is you are actually trying to figure out.  This sounds so obvious.  But, I am always amazed at how often sophisticated, eager analysts can zip past this important first step.

Health plans, vendors and health care providers are all involved in the design and execution of wellness and care management programs.  Each program can be conceptualized as an intervention process applied to a target population.  A program will add more value if the target population is composed of the people for whom the intervention process can have the greatest beneficial impact.

I have found it useful to conceptualize the process of selecting the final target population as having two parts.  The first part is the process of identification of the population for whom the intervention process is deemed relevant.  For example, a diabetes disease management program is only relevant to patients with diabetes.  An acute-care-to-ambulatory-care transitions program is only relevant to people in an acute care hospital.  A smoking cessation program is only relevant to smokers.  The second part is to determine which members of the relevant population are to actually be targeted.  To do this, a program designer must figure out what characteristics of the relevant candidates are associated with having a higher opportunity to benefit from the intervention.  For example, in disease management programs, program designers often use scores from predictive models designed to predict expected cost or probability of disease-related hospitalization.  They are using cost or likelihood of hospitalization as a proxy for the opportunity of the disease management program to be beneficial.  Program designers figure that higher utilization or cost means that there is more to be saved.  This is illustrated in the graph below, where a predictive model is used to sort patients into percentile buckets.  The highest-opportunity 1% have an expected annual admit rate of almost 4000 admits per 1000 members, while the lowest 1% have fewer than 200 admits per 1000 members.  A predictive model is doing a better job when more of the area under this curve is shifted to the left.
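
For readers who like to see the mechanics, here is a small Python sketch of that percentile-bucket view.  The members, scores and admission probabilities are simulated stand-ins rather than real data, so the resulting rates will not match the graph.

```python
import random

random.seed(0)

# Simulated members, each with a predictive-model "opportunity" score in [0, 1].
members = [{"score": random.random()} for _ in range(10_000)]
for m in members:
    # Hypothetical relationship: higher score, higher chance of an admission.
    m["admitted"] = random.random() < m["score"] * 0.4

# Sort highest-opportunity first and cut into 100 percentile buckets.
members.sort(key=lambda m: m["score"], reverse=True)
bucket_size = len(members) // 100
for pct in (1, 50, 100):                 # print a few sample percentiles
    bucket = members[(pct - 1) * bucket_size : pct * bucket_size]
    admits_per_1000 = 1000 * sum(m["admitted"] for m in bucket) / len(bucket)
    print(f"percentile {pct}: {admits_per_1000:.0f} admits per 1,000 members")
```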

Although it is common to use expected cost or utilization as a proxy for opportunity, what a program designer would really like to know is how much good the intervention process is likely to do.  Other variables besides expected cost or use can contribute to a higher opportunity.  For example, in a disease management program, the program might be more worthwhile for a patient who is already motivated to change their self-management behaviors, or one who has known gaps in care or treatment non-compliance issues that the intervention process is designed to address.

Applying micro-economics to care management targeting

Once the definition of “opportunity” is determined and operationally defined to create an “opportunity score” for each member of the relevant population, we can figure out which members of the relevant population to actually target for outreach.  Conceptually, we would sort all the people in the relevant population by their opportunity score.  Then, we would start by doing outreach to the person at the top of the list and work our way down.  But, the question then becomes: how far down the list do we go?  As we go down the list, we are investing program resources in outreach and intervention efforts directed at patients for whom the program accomplishes less and less.  Economists call this “diminishing returns.”

As illustrated by the red line in the graph above, there is some fixed cost to operating the program, regardless of the target rate.  For example, there are data processing costs.  Then, if the program does outreach to a greater and greater portion of the relevant population, more and more people say “yes” and the costs for the intervention go up in a more or less linear manner.  As shown by the green line, the savings increase rapidly at first, when the program is targeting the candidates with the greatest opportunity.  But, as the threshold for targeting is shifted to the right, the additional candidates being targeted have lower opportunity, and the green line begins to flatten.  The blue line shows the result of subtracting the costs from the savings to get net savings.  It shows that net savings increases for a while and then begins to decrease, as the cost of intervening with additional patients begins to outweigh the savings expected to accrue from those patients.  In this analysis, net savings is highest when 41% of the relevant population of diabetic patients is targeted for the diabetes disease management program.  The black dotted line shows the result of dividing savings by cost to get the return on investment, or ROI.  With very low target rates, too few patients are accumulating savings to overcome the fixed cost, so the ROI is less than 1.  The ROI then hits a peak at a target rate of 18% and declines thereafter.  This decline is expected, since we are starting with the highest-opportunity patients and working down to lower-opportunity patients.  Note that in this analysis, increasing the target penetration rate from 18% to 41% leads to a lower ROI, but the net savings increases by 24%.  So, if the goal is to reduce overall cost, that is achieved by maximizing net savings, not by maximizing ROI.
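
Here is a rough sketch of that targeting calculation in Python.  The fixed cost, per-member intervention cost and savings curve are all invented; the point is only to show how net savings and ROI can peak at different target rates.

```python
# Hypothetical inputs (not taken from the analysis described above).
FIXED_COST = 50_000        # e.g., data processing, regardless of target rate
COST_PER_TARGET = 400      # outreach + intervention cost per targeted member
N = 10_000                 # size of the relevant population

def expected_savings(i):
    """Expected savings from the i-th highest-opportunity member.
    Declines as we move down the list: diminishing returns."""
    return 2_000 * (1 - i / N) ** 3

best_net = best_roi = None
cumulative_savings = 0.0
for i in range(1, N + 1):
    cumulative_savings += expected_savings(i)
    cost = FIXED_COST + COST_PER_TARGET * i
    net_savings = cumulative_savings - cost
    roi = cumulative_savings / cost
    rate = i / N
    if best_net is None or net_savings > best_net[1]:
        best_net = (rate, net_savings)
    if best_roi is None or roi > best_roi[1]:
        best_roi = (rate, roi)

print(f"net savings peaks at a {best_net[0]:.0%} target rate")
print(f"ROI peaks at a {best_roi[0]:.0%} target rate")
```

With these made-up inputs, ROI peaks at a much lower target rate than net savings does, which echoes the pattern described above.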

Should we try to maximize accuracy?

In a recent paper published in the journal Population Health Management by Shannon Murphy, Heather Castro and Martha Sylvia from Johns Hopkins HealthCare, the authors describe their sophisticated methodology for disease management targeting using “condition-specific cut-points.”  A close examination of the method reveals that it is fundamentally designed to maximize the “accuracy” of the targeting process in terms of correctly identifying, in advance, the members of the relevant disease-specific population who will end up being among the 5% of members with the highest actual cost.  In this context, the word accuracy is a technical term used by epidemiologists.  It means the percentage of the time that the predictive model, at the selected cut-point, correctly categorizes patients.  In this application, the categorization attempts to correctly sort patients into a group that would end up among the 5% with the highest cost vs. a group that would not.  By selecting the cut-point based on accuracy, the Hopkins methodology implicitly equates the value of the two types of inaccuracy: false positives, where the patient would be targeted but would not have been in the high-cost group, and false negatives, where the patient would not be targeted but would have been in the high-cost group.  But, there is no reason to think that, in the selection of targets for care management interventions, false negatives and false positives have the same value.  The value of avoiding a false negative includes the health benefits and health care cost savings that would be expected from offering the intervention.  The value of avoiding a false positive includes the program cost of the intervention.  There is no reason to think that these values are equivalent.  If it is more important to avoid false positives, then a more stringent (higher) cut-point is optimal.  If it is more important to avoid false negatives, then a more lenient (lower) cut-point is optimal.  Furthermore, the 5% cost threshold used in the Hopkins methodology is completely arbitrary, selected without regard to the marginal costs or benefits of the intervention process at that threshold.  Therefore, I don’t advise adopting the methodology proposed by the Hopkins team.
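
By contrast, a value-based approach to selecting the cut-point might look like the sketch below, in which the two kinds of error are weighted by hypothetical benefit and cost figures rather than treated as equal.  The scores, probabilities and dollar values are invented and are not from the Hopkins paper.

```python
# Hypothetical values (not from the Hopkins paper or any real program).
BENEFIT_IF_TRULY_HIGH_COST = 3_000   # expected savings when a targeted member
                                     # really would have been high cost
INTERVENTION_COST = 400              # program cost per targeted member

def expected_net_value(candidates, cutoff):
    """Expected net value of targeting everyone whose score >= cutoff.
    candidates: list of (score, probability_of_truly_being_high_cost)."""
    return sum(p * BENEFIT_IF_TRULY_HIGH_COST - INTERVENTION_COST
               for score, p in candidates if score >= cutoff)

# Hypothetical candidate pool in which the score roughly tracks true risk.
candidates = [(s / 100, 0.6 * s / 100) for s in range(100)]

best_cutoff = max((c / 100 for c in range(100)),
                  key=lambda cut: expected_net_value(candidates, cut))
print(f"value-maximizing cut-point: {best_cutoff:.2f}")
```

The optimal cut-point then falls wherever the expected benefit of targeting one more member stops exceeding the intervention cost, which is exactly the marginal reasoning an accuracy-maximizing rule ignores.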

What about cost-effectiveness?

The concept of maximizing ROI or net savings is based on the idea that the reason a health plan invests in these programs is to save money.  But, the whole purpose of a health plan is to cover expenses for necessary health care services for its beneficiaries.  A health plan does not determine whether to cover hip replacement surgery based on whether it will save money.  It covers hip replacement surgery based on whether the surgery is considered “standard practice,” or whether there is adequate evidence proving that the surgery is efficacious.  Ideally, health care services are determined based on whether they are worthwhile — whether the entire collection of health and economic outcomes is deemed favorable compared with the available alternatives.  In the case of hip replacement surgery, the health outcomes include pain reduction, physical function improvement, and various possible complications such as surgical mortality, stroke during recovery, etc.  Economic outcomes include the cost of the surgery, the cost of dealing with complications, rehabilitation and follow-up, and the savings from avoiding whatever health care would have been required to deal with ongoing pain and disability.  When attempting to compare alternatives with diverse outcomes, it is helpful to reduce all of the health outcomes to a single summary measure, such as the Quality-Adjusted Life Year (QALY).  Then, the incremental net cost is divided by the incremental QALYs to calculate the cost-effectiveness ratio, which plays a role analogous to the business concept of return on investment (although for cost-effectiveness, a lower ratio is better).  If the cost-effectiveness ratio is sufficiently low (that is, if the incremental cost per QALY gained falls below an accepted threshold), the health service is deemed worth doing.  There is no reason why wellness and care management interventions should not be considered alongside other health care services based on cost-effectiveness criteria.
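
As a purely hypothetical illustration of that arithmetic (the numbers are invented, not taken from any actual study of hip replacement):

```python
# Hypothetical outcomes for an intervention vs. its best available alternative.
cost_with, cost_without = 40_000, 12_000      # lifetime costs, dollars
qalys_with, qalys_without = 8.5, 7.3          # quality-adjusted life years

incremental_cost = cost_with - cost_without
incremental_qalys = qalys_with - qalys_without
cost_effectiveness_ratio = incremental_cost / incremental_qalys  # $ per QALY

THRESHOLD = 100_000   # illustrative willingness-to-pay per QALY gained
print(f"${cost_effectiveness_ratio:,.0f} per QALY gained")
print("worth doing" if cost_effectiveness_ratio <= THRESHOLD
      else "not cost-effective at this threshold")
```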

The idea that wellness and care management interventions should only be done if they save money is really just a consequence of the approach having been primarily initiated by health plans over the last decade.  I suspect that as care management services shift from health plans to health care providers over the next few years, there will be increased pressure to use the same decision criteria as are used for other provider-delivered health care services.


Reports of the death of Cost-Effectiveness Analysis in the U.S. may have been exaggerated: The ongoing case of Mammography

Guidelines for the use of mammograms to screen for breast cancer have been the topic of one of the fiercest and longest-running debates in medicine.  Back in the early 1990s, I participated in that debate as the leader of a guideline development team at the Henry Ford Health System.  We developed one of the earliest cost-effectiveness analytic models for breast cancer screening to be used as an integral part of the guideline development process.  I described that process and model in an earlier blog post.  Over the intervening 20 years, however, our nation has fallen behind the rest of the world in the use of cost-effectiveness analysis to drive clinical policy-making.  As described in another recent blog post, other advanced nations use sophisticated analysis to determine which treatments to use, while Americans’ sense of entitlement and duty has turned us against such analysis, which gets derided as “rationing by death panels.”  Cost-effectiveness analysis and health economics, it seemed, were dead.

But, maybe reports of its death have been exaggerated.

A paper published on July 5, 2011 in the Annals of Internal Medicine described the results of an analysis of the cost-effectiveness of mammography in various types of women.  The study was conducted by John T. Schousboe, MD, PhD, Karla Kerlikowske, MD, MS, Andrew Loh, BA, and Steven R. Cummings, MD.  It was described in a recent article in the Los Angeles Times.  The authors used a computer model to estimate the lifetime costs and health outcomes associated with mammography.  They used a modeling technique called Markov microsimulation, which tracks a hypothetical population of women through time as they transition among various health states, such as being well and cancer-free, having undetected or detected cancer of various stages, and, ultimately, death.

They ran the models for women with different sets of characteristics, including 4 age categories, 4 categories of breast tissue density (based on the so-called BI-RADS score), whether or not the women had a family history of breast cancer, and whether or not the women had a previous breast biopsy.  So, that’s 4 x 4 x 2 x 2 = 64 different types of women.  They ran the model for no screening, annual screening, and screening at 2-, 3- or 4-year intervals.  For each screening interval, they estimated a number of health outcomes and summarized them into a single summary measure, the Quality-Adjusted Life Year (QALY).  They also calculated the lifetime health care costs from the perspective of a health plan.  Then, they compared the QALYs and costs for each screening interval to the QALYs and costs associated with no screening to calculate the cost per QALY gained.  Finally, they compared the cost per QALY to arbitrary thresholds of $50K and $100K to determine whether screening at a particular interval for a particular type of woman would be considered by most policy-makers to be clearly cost-effective, reasonably cost-effective, or cost-ineffective.
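
To give a feel for the technique, here is a toy Markov microsimulation in the same spirit, written in Python.  Every health state, transition probability, cost and quality-of-life weight below is invented for illustration; none of it comes from the Schousboe model, so the resulting ratio means nothing clinically.

```python
import random

random.seed(1)

def simulate_woman(screen_interval):
    """Return (lifetime_cost, lifetime_qalys) for one simulated woman,
    followed from age 40 to 79. All parameters are invented."""
    state, cost, qalys = "well", 0.0, 0.0
    for age in range(40, 80):
        # new (undetected) cancers arise at a hypothetical annual rate
        if state == "well" and random.random() < 0.004:
            state = "undetected"
        # undetected cancer may present clinically (later stage, costlier)
        elif state == "undetected" and random.random() < 0.15:
            state = "detected"
            cost += 80_000
        # screening may find undetected cancer earlier (cheaper to treat)
        if screen_interval and (age - 40) % screen_interval == 0:
            cost += 150                                   # mammogram
            if state == "undetected" and random.random() < 0.8:
                state = "detected"
                cost += 60_000
        # hypothetical annual mortality by state
        mortality = {"well": 0.005, "undetected": 0.05, "detected": 0.01}[state]
        if random.random() < mortality:
            break                                         # death: stop accruing
        qalys += {"well": 1.0, "undetected": 0.95, "detected": 0.85}[state]
    return cost, qalys

def run(screen_interval, n=20_000):
    results = [simulate_woman(screen_interval) for _ in range(n)]
    return (sum(c for c, _ in results) / n, sum(q for _, q in results) / n)

cost_none, qaly_none = run(None)          # no screening
cost_2yr, qaly_2yr = run(2)               # biennial screening
dc, dq = cost_2yr - cost_none, qaly_2yr - qaly_none
if dq > 0:
    print(f"biennial vs. none: ${dc / dq:,.0f} per QALY gained (toy numbers)")
else:
    print("no QALY gain in this toy run; adjust parameters or increase n")
```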

The authors took all those cost-effectiveness numbers and tried to convert them into a simple guideline:

“Biennial mammography cost less than $100 000 per QALY gained for women aged 40 to 79 years with BI-RADS category 3 or 4 breast density or aged 50 to 69 years with category 2 density; women aged 60 to 79 years with category 1 density and either a family history of breast cancer or a previous breast biopsy; and all women aged 40 to 79 years with both a family history of breast cancer and a previous breast biopsy, regardless of breast density. Biennial mammography cost less than $50 000 per QALY gained for women aged 40 to 49 years with category 3 or 4 breast density and either a previous breast biopsy or a family history of breast cancer. Annual mammography was not cost-effective for any group, regardless of age or breast density.”

Not exactly something that rolls off the tongue.  But, with electronic patient registries and medical records systems that have rule-based decision support, it should be feasible to implement such logic.  Doing so would represent a step forward in terms of tailoring mammography recommendations to the specific characteristics that drive a woman’s breast cancer risk.  And it would be a great example of how clinical trials and computer-based models can work together, and of how to balance the health outcomes experienced by individuals against the economic outcomes borne by the insured population.  It’s not evil.  It’s progress.

It will be interesting to see if breast cancer patient advocacy groups, mammographers and breast surgeons respond as negatively to the authors’ proposal as they did to the last set of guidelines approved by the U.S. Preventive Services Task Force, which called for a reduction in recommended breast cancer screening for some categories of women.

 
