New HIT ROI lit review is a methodologic tour de force, but misses the point

Recently, Jesdeep Bassi and Francis Lau of the University of Victoria (British Columbia) published in the Journal of the American Medical Informatics Association (JAMIA) the latest in a series of review articles summarizing the literature on the economic outcomes of investments in health information technology (HIT).  Such articles answer the questions:

  • “How much do various HIT technologies cost?”
  • “How much do they save?”
  • “Are they worth the investment?”

They reviewed 5,348 citations found through a mix of automated and manual search methods, and selected a set of 42 “high quality” studies to be summarized.  The studies were quite diverse, including a mix of types of systems evaluated, methods of evaluation, and measures included.  Some were retrospective data analyses; others were based on simulation models.  They included 7 papers on primary care electronic health record (EHR) systems, 6 on computerized physician order entry (CPOE) systems, 5 on medication management systems, 5 on immunization information systems, 4 on institutional information systems, 3 on disease management systems, 2 on clinical documentation systems, and 1 on health information exchange (HIE) networks.

[Figure: Bassi and Lau HIT ROI results]

Key results:

  • Overall, 70% of the studies showed positive economic results, 24% were inconclusive, and 6% were negative.
  • Of 15 papers on primary care EHR, medication management, and disease management systems, 87% showed net savings.
  • CPOE, immunization, and documentation systems showed mixed results.
  • The single paper on HIE suggested net savings, but the authors expressed doubts about the optimistic assumptions made in that analysis about a national roll-out in only ten years.

My take:

Bassi and Lau have made an important contribution to the field by establishing and documenting a very good literature review methodology – including a useful list of economic measures, a nice taxonomy of types of HIT, and many other tools, which they graciously shared for free in a series of online appendices that accompany the article.  They also made a contribution by doing the tedious work of sorting through and classifying a large and diverse HIT economics literature.

But, I think they missed the point.

Like many others, Bassi and Lau have implicitly accepted the mental model that health information technology is, itself, a thing that produces outcomes.  They evaluate it the way one would evaluate a drug or a cancer treatment protocol or a disease management protocol.  Such a conceptualization of HIT as an “intervention” is, unfortunately, aligned with the way many healthcare leaders conceptualize their investment decision as “should I buy this software?”  I admit to contributing to this conceptualization over the years, having published the results of retrospective studies and prospective analytic models of the outcomes resulting from investments in various types of health information technologies.

But, I think it would be far better for health care leaders to first focus on improvements to care processes — little things like consistently tracking orders to completion to assure none fall through the cracks, bigger things like care transition protocols that coordinate inpatient and ambulatory care team members to reduce the likelihood that a patient will be re-hospitalized shortly after discharge, and really big things like an overall “care model” that covers the processes, organizational structures, incentives and other design features of a clinically integrated network.  Once health care leaders have a particular care process innovation clearly in sight, they can turn their attention to the health information technology capabilities required to enable and support the target state care process.  If the technology is conceptualized as an enabler of a care process, then evaluation studies are more naturally conceptualized as evaluations of the outcomes of that process.  The technology investment is just one of a number of investments needed to support the new care process.  The evaluation “camera” zooms out to include the bigger picture, not just the computers.

I know this is stating the obvious.  But, if it is so obvious, why does it seem so rare?

This inappropriate conceptualization of HIT as an intervention is not limited to our field’s approach to economic evaluation studies.  It is also baked into our approach to HIT funding and incentives, such as the “Meaningful Use” incentives for investments in EHR technology, and the incentives created by HIT-related “points” in accreditation evaluations and designations for patient-centered medical homes (PCMH), accountable care organizations (ACOs), organized systems of care (OSC), etc.  The designers of such point systems seem conscious of this issue.  The term “meaningful use” was intended to emphasize the process being supported, rather than the technology itself.  But, that intention runs only about one millimeter deep.  As soon as the point system designers put any level of detail on the specifications, as demanded by the folks being evaluated, the emphasis on technology becomes instantly clear to all involved.  As a result, the intended focus on enabling care process improvement with technology slides back down to a requirement to buy and install software.  The people being evaluated and incentivized lament that they are being micromanaged and subjected to big burdens.  But they nevertheless expend their energies to score the points by installing the software.

So, my plea to Bassi and Lau, and to future publishers of HIT evaluation studies, is to stop evaluating HIT.  Rather, evaluate care processes, and require that care process evaluations include details on the HIT capabilities (and associated one time and ongoing costs) that were required to support the care processes.


Conceptualizing “over-treatment” waste: Don’t deny health economics

A Health Policy Brief published in Health Affairs on December 13, 2012 referenced an analysis published last April in JAMA regarding waste in health care.  In this analysis, Don Berwick (one of my health care heroes) and Andrew Hackbarth (from RAND) estimated that waste in health care consumed between $476 billion and $992 billion of the $2.6 trillion annual health care expenditures in the US.  That’s 18-37% waste.  They divided this estimate into 5 categories of waste.  Their mid-point estimates are as follows:

[Figure: Berwick and Hackbarth estimates of waste in health care (JAMA, 2012)]

They consider “failures in care delivery” to include failures to execute preventive services or safety best practices, resulting in avoidable adverse events that require expensive remediation.  By “failures of care coordination,” they mean care that is fragmented, such as poorly planned transitions of care, resulting in avoidable hospital readmissions.  They categorize as “overtreatment” care that ignores scientific evidence, or that is ordered to increase income, to avoid medical malpractice liability, or simply out of convenience or habit.  They consider “administrative complexity” to be spending resulting from “inefficient or flawed rules” of insurance companies, government agencies or accreditation organizations; they estimated its magnitude by comparing administrative expense in the US to that in Canada’s single-payer system.  They consider “pricing failures” to be prices greater than those justified by the cost of production plus a “reasonable profit,” presumably due to the absence of price transparency or market competition.  Finally, they consider “fraud and abuse” to be the cost of fake medical bills plus the additional inspections and regulations needed to catch such wrongdoing.

Underestimating Over-treatment

These estimates are generally in alignment with other attempts to categorize and assess the magnitude of waste in health care.  But, I think Berwick and Hackbarth’s estimates of “overtreatment” are probably far too low.  That’s because they, like so many other health care leaders, are reluctant to address the issue of cost-effectiveness.  Obviously, the definition of over-treatment depends on one’s philosophy for determining which treatments are necessary in the first place.  Everyone would agree that a service that does more harm than good for the patient is not necessary.  Most would agree that a service that a competent, informed patient does not want is not necessary.  Some argue that, if there is no evidence that a treatment is effective, it should not be considered necessary, while others argue that even unproven treatments should be considered necessary if the patient wants them.  Berwick and Hackbarth are ambiguous about their handling of this last category.

But, the big disagreement occurs when evaluating treatments for which there is evidence that the treatment offers some benefit, but the magnitude of the benefit is small in relation to the cost of the treatment.  This is a question about cost-effectiveness.  It is at the heart of medical economics.  In my experience, most health care leaders and an even higher proportion of political leaders choose to deny the principles of medical economics and the concept of cost-effectiveness.  They describe attempts to apply those principles as “rationing” — a term which has taken on a sinister, greedy meaning, rather than connoting the sincere application of rational thought to the allocation of limited resources.   Berwick and Hackbarth implicitly take that view.  They are unwilling to define over-treatment based on cost-ineffectiveness.

The analysis I want to see

For years, I’ve been looking for an analysis that attempted to estimate the magnitude of waste from over-treatment based on the principles of health economics.  The diagram below illustrates the hypothetical results of the type of analysis I’d like to see.

[Figure: Diagram conceptualizing over-treatment]

In this diagram, the horizontal axis represents the total cost of health care to a population.  I don’t want to see the entire US health care system.  What is more relevant is the population served by an Accountable Care Organization or an HMO.  To create such a diagram, we would first need to break down health care cost into a large number of specific treatment scenarios.  Each of these scenarios would specify a particular treatment (or diagnostic test) with enough clinical context to permit an assessment of the likely health and economic outcomes.  For each scenario, each of the possible health outcomes would be assigned a probability, a duration, and a quality of life factor.  By multiplying the duration by the quality of life factor, we could calculate the “quality-adjusted life years” (or “QALYs”) for the outcome.  Then, by taking the probability-weighted average of the QALYs across all the possible health states for the scenario, and dividing the cost by that average, we could calculate the “cost-effectiveness ratio” for the scenario, measured in “$/QALY.”  Then, we would sort all the treatment scenarios by their cost-effectiveness ratios, with the scenarios having the most favorable health economic characteristics on the left.
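The arithmetic described above can be sketched in a few lines of code.  All scenario names, probabilities, durations, quality-of-life factors, and costs below are invented for illustration; a real analysis would also compute costs and QALYs incrementally, relative to a no-treatment alternative.

```python
# Hypothetical sketch of the scenario-level cost-effectiveness calculation.
# All scenario data are invented for illustration.

def expected_qalys(outcomes):
    """Probability-weighted QALYs across possible health outcomes.

    Each outcome is (probability, duration_years, quality_of_life_factor);
    QALYs for one outcome = duration * quality-of-life factor.
    """
    return sum(p * duration * qol for p, duration, qol in outcomes)

# (treatment scenario, cost per patient, possible outcomes)
scenarios = [
    ("flu vaccination",     40, [(0.95, 1.0, 1.00), (0.05, 1.0, 0.90)]),
    ("hip replacement",  15000, [(0.90, 10.0, 0.85), (0.10, 10.0, 0.60)]),
    ("marginal imaging",   800, [(0.99, 1.0, 0.70), (0.01, 1.0, 0.71)]),
]

rated = []
for name, cost, outcomes in scenarios:
    qalys = expected_qalys(outcomes)
    rated.append((cost / qalys, name))  # cost-effectiveness ratio in $/QALY

# Most favorable (lowest $/QALY) scenarios first, as on the left of the diagram.
rated.sort()
for ratio, name in rated:
    print(f"{name}: ${ratio:,.0f}/QALY")
```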

Some of the scenarios will generate net savings, such as for certain preventive services where the cost of the avoided disease is greater than the initial cost of the preventive service.  These are clearly cost-effective.  On the other end of the spectrum are scenarios that offer net harm to the patient, such as when adverse side-effects are worse than the benefits of the treatment.  These are clearly cost-ineffective.  In the middle of these extremes are scenarios where there is a positive net benefit to the patient and a positive net cost borne by the population.

If a person rejects the principles of health economics, they would consider all of these middle scenarios to be “necessary” or “appropriate” regardless of how small the benefits or how large the costs.  But, among those who accept the principles of health economics, some of these scenarios could be judged to be cost-effective and others to be cost-ineffective.  Such judgments would presumably reveal some threshold cost-effectiveness ratio that generally separated the scenarios into cost-effective and cost-ineffective.  Since different people have different values, their judgments could reveal different cost-effectiveness thresholds.  If we had many people making these judgments, we could find a range of cost-effectiveness ratios that were considered to be reasonable by 90% of the people.    Applying this range to all the treatment scenarios, one could find a group of scenarios that were considered wasteful by most, and another group of scenarios that were considered wasteful only by some.

Variations on this theme have been used throughout the world for decades by various practice guidelines developers, healthcare policy analysts, health services researchers and health economists.  It is complex and time-consuming.  As I’ve discussed before, it is also controversial in the United States.

Right now, in the US, we all recognize that health care costs are too high.  We’re all focusing on merging providers into larger organizations, installing computer software, and exploring new reimbursement arrangements to address the problem.  But, I’m convinced that over-treatment with cost-ineffective services is a major source of waste.  We will inevitably get back to the core issue of having to figure out which treatment scenarios are wasteful.  We will inevitably have to overcome our denial of health economics and our irrational fear of rational allocation.

 


The debate about what to maximize when selecting candidates for care management programs: Accuracy? ROI? Net Savings? Or Cost-Effectiveness?

When doing analysis, it is really important to clarify up front what it is you are actually trying to figure out.  This sounds so obvious.  But, I am always amazed at how often sophisticated, eager analysts can zip past this important first step.

Health plans, vendors and health care providers are all involved in the design and execution of wellness and care management programs.  Each program can be conceptualized as some intervention process applied to some target population.  A program is going to add more value if the target population is comprised of the people for whom the intervention process can have the greatest beneficial impact.

I have found it useful to conceptualize the process of selecting the final target population as having two parts.  The first part is the process of identification of the population for whom the intervention process is deemed relevant.  For example, a diabetes disease management program is only relevant to patients with diabetes.  An acute-care-to-ambulatory-care transitions program is only relevant to people in an acute care hospital. A smoking cessation program is only relevant to smokers.  The second part is to determine which members of the relevant population are to actually be targeted.  To do this, a program designer must figure out what characteristics of the relevant candidates are associated with having a higher opportunity to benefit from the intervention.  For example, in disease management programs, program designers often use scores from predictive models designed to predict expected cost or probability of disease-related hospitalization.  They are using cost or likelihood of hospitalization as a proxy for the opportunity of the disease management program to be beneficial.  Program designers figure that higher utilization or cost means that there is more to be saved.  This is illustrated in the graph below, where a predictive model is used to sort patients into percentile buckets.  The highest-opportunity 1% have an expected annual admit rate of almost 4,000 admits per 1,000 members, while the lowest 1% have fewer than 200 admits per 1,000 members.  A predictive model is doing a better job when more of the area under this curve is shifted to the left.

Although it is common to use expected cost or use as a proxy for opportunity, what a program designer would really like to know is how much good the intervention process is likely to do.  Other variables besides expected cost or use can contribute to a higher opportunity.  For example, in a disease management program, the program might be more worthwhile for a patient that is already motivated to change their self-management behaviors or one that had known gaps in care or treatment non-compliance issues that the intervention process is designed to address.

Applying micro-economics to care management targeting

Once the definition of “opportunity” is determined and operationally defined to create an “opportunity score” for each member of the relevant population, we can figure out which members of the relevant population to actually target for outreach for the program.  Conceptually, we would sort all the people in the relevant population by their opportunity score.  Then, we would start by doing outreach to the person at the top of the list and work our way down the list.  But, the question then becomes how far down the list do we go?  As we go down the list, we are investing program resources to outreach and intervention effort directed at patients for which the program is accomplishing less and less.   Economists call this “diminishing returns.”

As illustrated by the red line in the graph above, there is some fixed cost to operating the program, regardless of the target rate.  For example, there are data processing costs.  Then, if the program does outreach to a greater and greater portion of the relevant population, more and more people say “yes” and the costs for the intervention go up in a more or less linear manner.  As shown by the green line, the savings increase rapidly at first, when the program is targeting the candidates with the greatest opportunity.  But, as the threshold for targeting is shifted to the right, the additional candidates being targeted have lower opportunity, and the green line begins to flatten. The blue line shows the result of subtracting the costs from the savings to get net savings.  It shows that net savings increases for a while and then begins to decrease, as the cost of intervening with additional patients begins to outweigh the savings expected to accrue from those patients.  In this analysis, net savings is highest when 41% of the relevant population of diabetic patients is targeted for the diabetes disease management program.  The black dotted line shows the result of dividing savings by cost to get the return on investment, or ROI.   With very low target rates, too few patients are accumulating savings to overcome the fixed cost, so the ROI is less than 1.  Then, the ROI hits a peak at a target rate of 18%, and declines thereafter.  This decline is expected, since we are starting with the highest opportunity patients and working down to lower opportunity patients.   Note that in this analysis, increasing the target penetration rate from 18% to 41% leads to a lower ROI, but the net savings increases by 24%.  So, if the goal is to reduce overall cost, that is achieved by maximizing net savings, not by maximizing ROI.
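A minimal sketch of this targeting economics follows.  The fixed cost, per-patient cost, and concave savings curve are all invented (not calibrated to the 18% and 41% optima from the analysis in the text); only the shapes, a linear cost line and a diminishing-returns savings line, follow the description above.

```python
import math

# Invented targeting-economics sketch: linear costs, diminishing savings.
POPULATION = 10_000        # relevant population, sorted by opportunity score
FIXED_COST = 50_000        # program overhead (e.g., data processing)
COST_PER_TARGET = 300      # outreach + intervention cost per targeted patient

def cumulative_savings(n):
    """Savings grow quickly for the highest-opportunity patients, then
    flatten out (diminishing returns), like the green line in the graph."""
    return 4_000_000 * (1 - math.exp(-n / 2_000))

best_net = best_roi = float("-inf")
best_net_rate = best_roi_rate = 0
for pct in range(1, 101):                     # candidate target rates, 1%..100%
    n = POPULATION * pct // 100
    cost = FIXED_COST + COST_PER_TARGET * n   # red line: fixed + linear
    net = cumulative_savings(n) - cost        # blue line: net savings
    roi = cumulative_savings(n) / cost        # black dotted line: ROI
    if net > best_net:
        best_net, best_net_rate = net, pct
    if roi > best_roi:
        best_roi, best_roi_rate = roi, pct

print(f"ROI peaks at a {best_roi_rate}% target rate")
print(f"Net savings peaks at a {best_net_rate}% target rate")
```

With these invented parameters, ROI peaks at a much lower target rate than net savings does, reproducing the qualitative conclusion above: maximizing ROI stops outreach well short of the target rate that minimizes overall cost.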

Should we try to maximize accuracy?

In a recent paper published in the journal Population Health Management by Shannon Murphy, Heather Castro and Martha Sylvia from Johns Hopkins HealthCare, the authors describe their sophisticated methodology for targeting for disease management programs using “condition-specific cut-points.”  A close examination of the method reveals that it is fundamentally designed to maximize the “accuracy” of the targeting process in terms of correctly identifying in advance the members of the relevant disease-specific population that will end up being among the 5% of members with the highest actual cost.   In this context, the word accuracy is a technical term used by epidemiologists.  It means the percentage of time that the predictive model, at the selected cut-point, correctly categorized patients.  In this application, the categorization is attempting to correctly sort patients into a group that would end up among the 5% with highest cost vs. a group that would not.  By selecting the cut-point based on accuracy, the Hopkins methodology is implicitly equating the value of the two types of inaccuracy: false positives, where the patient would be targeted but would not have been in the high cost group, and false negatives, where the patient would not be targeted but would have been in the high cost group. But, there is no reason to think that, in the selection of targets for care management interventions, false negatives and false positives would have the same value. The value of avoiding a false negative includes the health benefits and health care cost savings that would be expected by offering the intervention. The value of avoiding a false positive includes the program cost of the intervention.  There is no reason to think that these values are equivalent.  If it is more important to avoid false positives, then a higher cut-point is optimal.  If it is more valuable to avoid false negatives, then a lower cut-point is optimal.
Furthermore, the 5% cost threshold used in the Hopkins methodology is completely arbitrary, selected without regard to the marginal costs or benefits of the intervention process at that threshold.  Therefore, I don’t advise adopting the methodology proposed by the Hopkins team.
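This cut-point argument can be made concrete with a small hypothetical calculation.  Assuming a well-calibrated probability score (so that accuracy-maximization implies a 0.5 cut-point) and invented savings and cost figures, weighing the two error types by their actual value yields a different cut-point:

```python
# Sketch: choosing a risk-score cut-point by expected value rather than
# accuracy. The benefit and cost figures are invented for illustration.

SAVINGS_IF_TRUE_POSITIVE = 4_000  # expected savings from intervening on a
                                  # patient who would have been high-cost
COST_OF_INTERVENTION = 1_000      # spent on every targeted patient

# Accuracy weights false positives and false negatives equally; for a
# well-calibrated probability score that implies a cut-point of 0.5.
accuracy_cutpoint = 0.5

# Expected value of targeting a patient with predicted probability p of
# being high-cost: p * savings - cost. Target whenever this is positive,
# so the value-based cut-point is cost / savings.
value_cutpoint = COST_OF_INTERVENTION / SAVINGS_IF_TRUE_POSITIVE

print(f"accuracy-based cut-point: {accuracy_cutpoint}")
print(f"value-based cut-point:    {value_cutpoint}")  # cast a wider net
```

Here, because the expected savings from a true positive exceed the intervention cost, the value-based cut-point (0.25) sits well below the accuracy-based one, targeting many patients that the accuracy-maximizing rule would skip.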

What about cost-effectiveness?

The concept of maximizing ROI or net savings is based on the idea that the reason a health plan invests in these programs is to save money.  But, the whole purpose of a health plan is to cover expenses for necessary health care services for the beneficiaries.  A health plan does not determine whether to cover hip replacement surgery based on whether it will save money.  They cover hip replacement surgery based on whether it is considered a “standard practice,” or whether there is adequate evidence proving that the surgery is efficacious.  Ideally, health care services are determined based on whether they are worthwhile — whether the entire collection of health and economic outcomes is deemed favorable relative to available alternatives.  In the case of hip replacement surgery, the health outcomes include pain reduction, physical function improvement, and various possible complications such as surgical mortality, stroke during recovery, etc.  Economic outcomes include the cost of the surgery, the cost of dealing with complications, rehabilitation and follow-up, and the savings from avoiding whatever health care would have been required to deal with ongoing pain and disability.  When attempting to compare alternatives with diverse outcomes, it is helpful to reduce all health outcomes into a single summary measure, such as the Quality-Adjusted Life Year (QALY).  Then, the incremental net cost is divided by the incremental QALYs to calculate the cost-effectiveness ratio, which plays a role analogous to the business concept of return on investment.  If the cost-effectiveness ratio is sufficiently low (below some accepted willingness-to-pay threshold), the health service is deemed worth doing.  There is no reason why wellness and care management interventions should not be considered along with other health care services based on cost-effectiveness criteria.
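As a numeric sketch of the incremental cost-effectiveness calculation just described, using entirely invented hip-replacement figures:

```python
# Hypothetical incremental cost-effectiveness ratio (ICER) calculation.
# All cost and QALY figures are invented for illustration.

# Lifetime outcomes with surgery vs. conservative management (hypothetical)
cost_surgery, qalys_surgery = 25_000, 9.0
cost_conservative, qalys_conservative = 8_000, 7.5

incremental_cost = cost_surgery - cost_conservative      # net cost of surgery
incremental_qalys = qalys_surgery - qalys_conservative   # net health gain
icer = incremental_cost / incremental_qalys              # $ per QALY gained

THRESHOLD = 100_000  # a commonly cited willingness-to-pay per QALY
verdict = "worth doing" if icer < THRESHOLD else "not cost-effective"
print(f"ICER: ${icer:,.0f}/QALY -> {verdict}")
```

With these invented numbers the surgery costs about $11,333 per QALY gained, comfortably below the threshold, so it would be deemed worth doing.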

The idea that wellness and care management interventions should only be done if they save money is really just a consequence of the approach being primarily initiated by health plans in the last decade.  I suspect that as care management services shift from health plans to health care providers over the next few years, there will be increased pressure to use the same decision criteria as are used for other provider-delivered health care services.


Reports of the death of Cost-Effectiveness Analysis in the U.S. may have been exaggerated: The ongoing case of Mammography

Guidelines for the use of mammograms to screen for breast cancer have been the topic of one of the fiercest and longest-running debates in medicine.  Back in the early 1990s, I participated in that debate as the leader of a guideline development team at the Henry Ford Health System.  We developed one of the earliest cost-effectiveness analytic models for breast cancer screening to be used as an integral part of the guideline development process.  I described that process and model in an earlier blog post.  Over the intervening 20 years, however, our nation has fallen behind the rest of the world in the use of cost-effectiveness analysis to drive clinical policy-making.  As described in another recent blog post, other advanced nations use sophisticated analysis to determine which treatments to use, while Americans’ sense of entitlement and duty has turned us against such analysis — describing it as “rationing by death panels.”  Cost-effectiveness analysis and health economics are dead.

But, maybe reports of its death have been exaggerated.

A recent paper published on July 5, 2011 in the Annals of Internal Medicine described the results of an analysis of the cost-effectiveness of mammography in various types of women.  The study was conducted by John T. Schousboe, MD, PhD, Karla Kerlikowske, MD, MS, Andrew Loh, BA, and Steven R. Cummings, MD.  It was described in a recent article in the Los Angeles Times.  The authors used a computer model to estimate the lifetime costs and health outcomes associated with mammography.  They used a modeling technique called Markov microsimulation, basically tracking a hypothetical population of women through time as they transition among various health states such as being well and cancer free, having undetected or detected cancer of various stages and, ultimately, death.

They ran the models for women with different sets of characteristics, including 4 age categories, 4 categories based on the density of the breast tissue (based on the so-called BI-RADS score), whether or not the women had a family history of breast cancer, and whether or not the women had a previous breast biopsy.  So, that’s 4 x 4 x 2 x 2 = 64 different types of women.  They ran the model for no screening, annual screening, and screening at 2-, 3- or 4-year intervals.  For each screening interval, they estimated each of a number of health outcomes, and summarized all the health outcomes into a single summary measure called the Quality-Adjusted Life Year (QALY).  They also calculated the lifetime health care costs from the perspective of a health plan.  Then, they compared the QALYs and costs for each screening interval to the QALYs and costs associated with no screening to calculate the cost per QALY.  Finally, they compared the cost per QALY to arbitrary thresholds of $50K and $100K to determine whether screening at a particular interval for a particular type of woman would be considered by most policy-makers to be clearly cost-effective, reasonably cost-effective, or cost-ineffective.
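The final classification step, comparing each stratum's cost per QALY against the $50K and $100K thresholds, might be sketched as follows.  The cost-per-QALY values below are invented placeholders, not results from the Schousboe analysis.

```python
# Sketch of classifying (woman type, screening interval) strata by cost per
# QALY against the $50K / $100K thresholds. Values are invented.

def classify(cost_per_qaly):
    """Map a cost-per-QALY estimate to the policy categories in the text."""
    if cost_per_qaly < 50_000:
        return "clearly cost-effective"
    if cost_per_qaly < 100_000:
        return "reasonably cost-effective"
    return "cost-ineffective"

# (age band, BI-RADS density, family history, prior biopsy, interval) -> $/QALY
hypothetical_results = {
    ("40-49", 4, True,  True,  "biennial"):  35_000,
    ("50-69", 2, False, False, "biennial"):  80_000,
    ("40-49", 1, False, False, "annual"):   210_000,
}

for stratum, cpq in hypothetical_results.items():
    print(stratum, "->", classify(cpq))
```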

The authors took all those cost-effectiveness numbers and tried to convert them into a simple guideline:

“Biennial mammography cost less than $100 000 per QALY gained for women aged 40 to 79 years with BI-RADS category 3 or 4 breast density or aged 50 to 69 years with category 2 density; women aged 60 to 79 years with category 1 density and either a family history of breast cancer or a previous breast biopsy; and all women aged 40 to 79 years with both a family history of breast cancer and a previous breast biopsy, regardless of breast density. Biennial mammography cost less than $50 000 per QALY gained for women aged 40 to 49 years with category 3 or 4 breast density and either a previous breast biopsy or a family history of breast cancer. Annual mammography was not cost-effective for any group, regardless of age or breast density.”

Not exactly something that rolls off the tongue.  But, with electronic patient registries and medical records systems that have rule-based decision-support, it should be feasible to implement such logic.  Doing so would represent a step forward in terms of tailoring mammography recommendations to specific characteristics that drive a woman’s breast cancer risk.  And, it would be a great example of how clinical trials and computer-based models work together, and a great example of how to balance the health outcomes experienced by individuals with the economic outcomes borne by the insured population.  It’s not evil.  It’s progress.

It will be interesting to see if breast cancer patient advocacy groups, mammographers and breast surgeons respond as negatively to the authors’ proposal as they did to the last set of guidelines approved by the U.S. Preventive Services Task Force, which called for a reduction in recommended breast cancer screening in some categories of women.

 


Why do other countries have different attitudes about Health Economics?

Wednesday morning, I attended a thought-provoking panel discussion entitled “Is Health Economics an Un-American Activity?” — a reference to the McCarthy-era Congressional committees that judged Hollywood movie directors and others considered to be communist sympathizers.  The panel presentation was part of the annual meeting of the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) in Baltimore.  It featured John Bridges, PhD from Johns Hopkins, Peter Neumann, ScD from Tufts, and Jeff White, PharmD, from WellPoint.

The panel started by noting that the field of health economics has, as its fundamental premise, the rational allocation of scarce health care resources. This allocation is informed by “cost-effectiveness analysis” — broadly defined as the process of preparing estimates of both health and economic outcomes for different health care services to support decisions about which services are worth doing.  Health care services often produce a mixture of different health outcomes — sometimes extending life and sometimes affecting different aspects of the quality of life.  So, to deal with the mixed basket of different outcomes, cost-effectiveness analysts commonly combine all the health outcomes into a single summary measure called the “quality-adjusted life year,” or “QALY.”  Then, QALYs are compared to the costs of the health care service.  Based on this comparison, decision-makers determine if the service is worth doing, or “cost-effective.”

The panel noted that the United States is far less supportive of these basic concepts of the field of health economics, compared to almost all other developed nations.  In the U.S. stimulus bill, Congress provided substantial new funding to establish a Patient-Centered Outcomes Research Institute (PCORI).  But, Congress specifically forbade that Institute from using QALYs or establishing cost-effectiveness thresholds.  In the debates leading up to passage of the health care reform legislation, U.S. political leaders went out of their way to emphasize that they did not condone any type of “rationing” or “death panels.”  In contrast, the ISPOR meeting was full of presentations by health economists from Europe, Australia, Asia and elsewhere describing their government-sponsored programs to formally assess the cost-effectiveness of health services and to use those assessments to determine whether to grant market access to drugs, biomedical devices and other health care products and services.

Although the panel discussion was enlightening and interesting, I felt they generally focused too much on QALYs and too little on the deeper cultural issues.  They made only vague comments on any evidence or theories about why there would be such an obvious difference in attitudes between the US and other developed countries.  One presenter noted that America was founded by individuals fleeing tyranny, which led Americans to be distrustful of government hundreds of years later.  Another jokingly hypothesized that support for health economics had something to do with having a monarchy.

So why does the US see things differently?

It seems to me that there are two competing explanations for why Americans are so troubled by health economics and cost-effectiveness analysis: Entitlement and Duty.

According to the entitlement hypothesis, after a few generations of economic largess, Americans have come to feel entitled to a worry-free life.  As a result, Americans are supposedly unwilling to accept limits or burdens.  This is described as the decline of our culture.  It supposedly applies not only to health care, but also to our unwillingness to make tough decisions and sacrifices to solve the federal budget deficit, global warming, urban sprawl and even childhood obesity.  Both political parties implicitly support this view when they assert that their party will revive American exceptionalism and put the country back on the right track.  This sense of entitlement applies to both rich and poor.  Rich people hate rationing because they associate it with big government, which they equate with high taxes to pay for generous social welfare programs that transfer their wealth to the poor.  Poor people hate rationing because they fear that it will provide the pretext for the “establishment” to avoid providing them with the high quality health care to which they feel entitled.  According to the entitlement hypothesis, both rich and poor are like spoiled children, stomping their feet at the prospect of any limits to the health care they expect.

In contrast, the duty hypothesis makes seemingly opposite assumptions about the state of American culture.  It emphasizes that Americans have a strong sense of duty, and a romantic sense of chivalry, loyalty and patriotism.  Its advocates note that Americans, compared to their European counterparts, tend to hold more fundamentalist religious beliefs, with a strong sense of right and wrong that sees moral issues as black and white rather than the more relativistic shades of grey prevalent in other developed countries.  They point out that Americans feel strongly about not leaving a soldier behind in battle, no matter what the risk.  This sense of duty translates into an insistence that we spare no expense to rescue the sick from illness.

I can’t say I know which point of view is right.  Perhaps both forces are at work to animate Americans’ opposition to health economics.

What are the implications for ACOs?

ACOs involve providers taking responsibility for the quality and cost of care for a population.  Controlling cost requires reducing waste.  Many health care leaders would like to believe that we can control costs just by eliminating medical procedures that offer zero benefit or that actually harm patients, and by creating leaner care delivery processes so each necessary service is delivered at lower cost.  But the elephant in the room is the far larger waste in the form of procedures that do offer some benefit, just not enough to be worth the high cost.  Reducing the use of such procedures will face opposition and resistance.  To succeed in the face of that resistance, ACOs must overcome the sense of entitlement, and channel Americans’ strong sense of duty, honor and righteousness into the act of triage: directing resources to the people who need high-value services.  Courts and churches use ritual and solemn settings to convey seriousness and integrity.  Perhaps ACOs should use some form of ritual and a solemn setting to build a sense of rigor, transparency and integrity into the process of determining the practice guidelines that direct resources to the “right” clinical needs.  In this manner, the American culture of duty could overcome the sense of entitlement, enabling the ACO to carry out its stewardship responsibilities for the quality and cost of care of its population.

I suspect that for-profit health care provider organizations will have a far more difficult time overcoming this resistance to health economics.  For people to internalize a sense of duty to triage, they must have confidence that when practice guidelines cause providers to say no to one patient regarding a low value service, the preserved resources go instead to provide a high value service to another patient.  If they suspect the savings is going into the pockets of stockholders, the cultural opposition to health economics will be strengthened.


NEJM report of ACO financial model fails to include risk of delayed start on transformation

On March 23, 2011, Trent Haywood, MD, JD, and Keith Kosel, PhD, MBA, published on the New England Journal of Medicine (NEJM) web site the results of a financial model of a hypothetical Accountable Care Organization (ACO).  The model shows that ACOs are likely to lose money during the first three years of the Medicare Shared Savings Program called for in the Patient Protection and Affordable Care Act, based on the up-front investment expected to be required.  The authors conclude that “the high up-front investments make the model a poor fit for most physician group practices.”  They call for modifications to the Medicare Shared Savings Program to make it more generous to participating ACOs.

The model is based on assumptions derived from data from the Physician Group Practice (PGP) Demonstration, carried out by CMS from 2005 to 2010.  In the PGP, the average up-front investment by participants was $1.7M, or $737 per primary care physician (PCP).  The authors calculate that an “unlikely” 20% margin would be required to break even during the 3-year time frame of the Medicare Shared Savings Program scheduled to start in 2012.

Haywood and Kosel are to be commended for taking the time to develop a financial model and publish results.  I think that such models are extremely helpful to real-world decision-makers because they force people to be explicit about the assumptions they are making, and they provide some quantitative estimates of the outcomes relevant to the comparison of available alternatives so people can make better choices.  Unfortunately, in my opinion, the authors misconceptualized the model, creating a risk that people will use the negative results of the model to justify inaction, to their own detriment.

Every decision is a choice among available alternatives.  To create a useful model to support decision-making, an analyst must follow four basic steps:

  1. Identify the available alternatives being compared
  2. Identify the outcomes that are relevant to the decision-maker and that are thought to be potentially materially different across the available alternatives
  3. Make quantitative estimates of the magnitude and uncertainty of all such outcomes for all the available alternatives, and
  4. Apply values (including ethical principles and preferences) to determine which set of outcomes is most desirable or optimal
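The four steps can be sketched as a toy comparison. Everything here is hypothetical: the alternatives are the two facing providers (per the discussion in this post), but the outcome names, numbers, and weights are placeholders I invented for illustration, and a real model would carry uncertainty ranges rather than point estimates.

```python
# Minimal sketch of the four-step process with hypothetical numbers.

# Step 1: identify the available alternatives.
alternatives = ["invest in transformation now", "wait until later"]

# Steps 2-3: relevant outcomes thought to differ between alternatives,
# with point estimates (hypothetical units; real models add uncertainty).
outcomes = {
    "invest in transformation now": {"3yr_net_margin_M": -0.5, "competitive_standing": +1.0},
    "wait until later":             {"3yr_net_margin_M":  0.0, "competitive_standing": -1.0},
}

# Step 4: apply values, here as simple weights on each outcome.
weights = {"3yr_net_margin_M": 1.0, "competitive_standing": 0.8}

def score(alt):
    """Weighted sum of outcome estimates for one alternative."""
    return sum(weights[o] * v for o, v in outcomes[alt].items())

best = max(alternatives, key=score)
print(f"Preferred alternative: {best}")
```

With these (invented) inputs, the non-Medicare outcome dominates the decision, which is exactly the kind of driver a single-payer-focused model would miss.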

Although this basic process seems simple and straightforward, experienced analysts know that each of these steps is devilishly difficult.  In the case of Haywood and Kosel’s financial analysis, in my opinion, they ran into trouble with the first two steps: they failed to identify the available alternatives and misconceptualized the choice the model is designed to support, and therefore failed to recognize non-Medicare outcomes that differ across the available alternatives.  Of course, an error in any particular step cascades to the remaining steps.

Haywood and Kosel did not explicitly explain the decision their model was intended to support.  But one could infer from the conclusion that among the decisions they intended to support was the decision by physician organizations whether or not to make a $737-per-PCP up-front investment and then sign up for the optional Medicare Shared Savings Program in order to reap a return in the form of increased Medicare revenue.  But the up-front investment required to create a successful ACO takes the form of fundamental transformation of care processes and the organizational structures, human resources, information systems, and cultural changes required to support them.  Such fundamental transformations affect the entire population served by the nascent ACO, not just Medicare patients.  And they don’t just affect the providers’ relationship with payers; they also affect the providers’ competitive standing with respect to other providers and their relationships with other stakeholders such as employers, state and federal legislators, accreditation organizations, etc.

The correct conceptualization of the decision facing provider organizations right now is a choice between (1) getting started now with ACO-type transformation or (2) waiting until later to decide if such a transformation is necessary.  Physicians and hospitals that are contemplating the formation of ACOs would be wise to invest in the creation of a model to make explicit estimates of all the relevant financial and non-financial outcomes for the available alternatives.  Such a model will, by necessity, include many assumptions not supported by solid data.  That’s not the fault of the model, nor a reason to justify making decisions based only on intuition (what David Eddy calls “global subjective judgement”).  Rather, prudent health care leaders will invest the time to create and use a model to really understand the sensitivity of the results to various assumptions and the dynamics of the outcomes (how outcomes are likely to play out over time).

My prediction is that, when properly conceptualized as a “start transformation now” vs. “put transformation off until later” decision, such a model is likely to show what personal retirement planning models always show — it pays to get started on things that take a long time to achieve.  If you fall too far behind competitors, you may be unable to catch up later.  On the other hand, if provider organizations opt to get started on transformation, obviously there are many smaller decisions that need to be made, such as which care processes to start on, which particular payer-specific deals to cut, which IT investments to prioritize, etc.

One last point:  Although “pay-back period” can sometimes be a useful summary measure of a financial analysis, my advice is to avoid over-simplifying the reporting of model results by reducing them to a single summary measure.  Model authors would serve decision-makers better by presenting a table with their estimates of all the relevant outcomes for all the alternatives being considered, and possibly showing when those results occur over time.  Then decision-makers can understand the drivers of their decisions and subsequently summarize the results in ways that communicate their thinking most effectively, using summary measures such as net present value, return-on-investment, internal rate of return, pay-back period, cost per quality-adjusted life-year, cost-benefit ratio, etc.
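To make the point concrete, here is a sketch computing several of those summary measures from the same hypothetical cash-flow stream. The cash flows are invented for illustration (an up-front investment followed by three years of savings); the IRR routine is a simple bisection, which assumes the conventional pattern of one outflow followed by inflows.

```python
# Hypothetical sketch: one set of cash-flow estimates, summarized several
# ways rather than collapsed into a single measure. Figures are illustrative.

cash_flows = [-1.7, 0.4, 0.7, 1.0]   # $M: up-front investment, then 3 years of net savings

def npv(rate, flows):
    """Net present value of a series of annual cash flows."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows))

def irr(flows, lo=-0.99, hi=1.0, tol=1e-7):
    """Internal rate of return by bisection on NPV (assumes one sign change)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, flows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def payback_period(flows):
    """First year in which cumulative cash flow turns non-negative."""
    total = 0.0
    for t, cf in enumerate(flows):
        total += cf
        if total >= 0:
            return t
    return None   # never pays back within the modeled horizon

roi = sum(cash_flows[1:]) / -cash_flows[0]   # undiscounted return on investment
print(f"NPV @5%: {npv(0.05, cash_flows):.2f} $M")
print(f"IRR: {irr(cash_flows):.1%}")
print(f"ROI (undiscounted): {roi:.0%}")
print(f"Payback year: {payback_period(cash_flows)}")
```

Note how the same stream can look unattractive on one measure (payback falls outside a 3-year window only barely) while looking positive on another (NPV at 5% is above zero), which is precisely why presenting the full table of outcomes over time serves decision-makers better.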


The Smoking Intervention Program, a Provider-based Care Management Process

Smoking cessation is an important public health concern, and has been the subject of a recent Agency for Health Care Policy and Research (AHCPR) guideline, as well as a HEDIS measure.   A point prevalence study conducted with the Henry Ford Health System found a 27.4% prevalence of smoking, and an additional 38.6% former smokers.

The Center for Clinical Effectiveness (CCE) developed a first-generation smoking-dependency clinic staffed by trained non-physician counselors and overseen by a physician medical director. The original intervention was a 50-minute initial evaluation and counseling visit, with nicotine replacement therapy prescribed for all patients with a high level of nicotine dependency. This intervention was subsequently updated to reflect the AHCPR recommendation that, unless contraindicated, all smoking cessation patients be prescribed nicotine replacement therapy.

Because relapse is a normal part of smoking cessation, the intervention was explicitly designed to address relapse. This was done through return visits, an optional support group, and follow-up telephone counseling calls throughout the year, as illustrated in the following figure.

The program was designed to be inexpensive and simple to execute within the clinic. This was accomplished by automating the logistics of both the intervention and the collection of outcomes measures. The Flexi-Scan System, an internally developed computer application that helps automate outcome studies and disease-management interventions, was used to automate (1) data entry through a scanner, (2) prompting of follow-up calls and mailings, and (3) the generation of medical-record notes and letters to the referring physicians. This process also yields a database that can be used for outcomes analyses.

As illustrated on the figure below, this first-generation program achieved a twelve-month quit rate of 25%. Such a quit rate is about twice as high as the rate achieved with brief counseling intervention.

To evaluate the cost-effectiveness of this program, a decision-analytic model was constructed using the Markov method.  Key assumptions of the model include the following:

  • One-year quit rate for usual care (optimistically assumed to consist of brief physician advice): 12.5%
  • Spontaneous quit rate of 1% per year in “out years”
  • Relapse rate for recent quitters: 10%
  • Age and sex distribution based on Smoking Clinic patient demographics
  • Life expectancy of smokers and former smokers by age and sex based on published life tables
  • Cost of clinic intervention: $199
  • Cost of nicotine replacement therapy: $101 for the Smoking Clinic (assuming 0.9 Rx/patient) and $33 for usual care (assuming 0.3 Rx/patient)
  • Future health care costs not considered
  • Annual discount rate of 5%
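A drastically simplified Markov-style cohort calculation using these assumptions might look like the sketch below. This is not the published model: the five-year horizon and the discounted life-years gained per sustained quitter are placeholders I chose for illustration, so the result should not be read as reproducing the published figure.

```python
# Simplified Markov-style cohort sketch (not the published model). Quit,
# relapse, and cost inputs come from the assumptions listed above; the
# horizon and life-year gain per sustained quitter are placeholders.

SPONT_QUIT = 0.01            # annual spontaneous quit rate in out-years
RELAPSE = 0.10               # relapse rate for recent quitters
LY_GAIN_PER_QUITTER = 1.5    # placeholder: discounted life-years gained per sustained quitter

def sustained_quitters(year1_quit_rate, years=5):
    """Fraction of the cohort still off cigarettes after `years`."""
    quitter = year1_quit_rate * (1 - RELAPSE)   # recent quitters may relapse
    smoker = 1.0 - quitter
    for _ in range(years - 1):                  # out-years: small spontaneous quit rate
        newly_quit = smoker * SPONT_QUIT
        smoker -= newly_quit
        quitter += newly_quit
    return quitter

clinic = sustained_quitters(0.25)    # clinic program 12-month quit rate
usual  = sustained_quitters(0.125)   # usual care (brief physician advice)

cost_clinic = 199 + 101   # intervention + nicotine therapy (0.9 Rx/patient)
cost_usual  = 33          # nicotine therapy under usual care (0.3 Rx/patient)

icer = (cost_clinic - cost_usual) / ((clinic - usual) * LY_GAIN_PER_QUITTER)
print(f"Incremental cost per life-year gained: ${icer:,.0f}")
```

The structure (transition each year between “smoker” and “quitter” states, then attach costs and life-years to the resulting cohort fractions) is the essence of the Markov method, even though the real model tracked age- and sex-specific life expectancy from life tables.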

The results of this model were presented at the annual meeting of the Society for Medical Decision-Making.  The model results are presented in the form of a table called a “balance sheet” (a term coined by David Eddy, MD, PhD).  As shown below, the model estimated that the first-generation smoking-dependency clinic cost about $1,600 for each life year gained.

To help evaluate whether this cost-effectiveness ratio is favorable, a league table was constructed (see below).  The league table shows comparable cost-effectiveness ratios for other health care interventions.  The table suggests that the smoking cessation intervention compares very favorably with these other interventions.

League Table

Intervention: Cost per Quality-adjusted Life Year Gained
Smoking Cessation Counseling: $6,400
Surgery for Left Main Coronary Artery Disease for a 55-year-old man: $7,000
Flexible Sigmoidoscopy (every 3 years): $25,000
Renal Dialysis (annual cost): $37,000
Screening for HIV (at a prevalence of 5/1,000): $39,000
Pap Smear (every year): $40,000
Surgery for 3-vessel Coronary Artery Disease for a 55-year-old man: $95,000

Although this first generation program was effective and cost-effective, it was targeted only at the estimated 16,500 smokers in the HFMG patient population who were highly motivated to quit.

The estimated 66,000 other smokers in the HFMG patient population would be unlikely to pursue an intervention that involved visiting a smoking dependency clinic. Even for the smokers who were highly motivated to quit, the smoking cessation clinic had the capacity to provide counseling to about 500 people each year, or about 3% of these highly motivated smokers.

Second Generation Smoking Intervention Program

In response to this problem, the CCE developed a “second generation” Smoking Intervention Program. This program uses a three-tiered approach: (1) a “front-end” process for primary care and specialty clinics to identify smokers and provide brief motivational advice, (2) a centralized telephone-based triage process to conduct assessment and arrange appropriate intervention, and (3) a stepped-care treatment tier.

In the “front-end” process, clinic physicians and support staff were trained to screen their patients for smoking status and readiness to quit, and to provide tailored brief advice. Each participating clinic was provided with a program “kit” including screening forms, patient brochures, and posters to assist in implementing the program. Patients who are interested in further intervention are referred to a centralized triage counselor for further assessment and intervention. These counselors are trained, non-physician care providers. They proactively call each referred patient, assess the patient’s smoking and quitting history, and triage the patient into a stepped-care intervention program.

An important part of this intervention has been providing information to clinicians, including a quarterly report showing the number of patients they have referred to the Smoking Intervention Program, the status of those patients, the type of intervention they are receiving, and the number of patients who report not having smoked in the preceding six months.

The clinician-specific data is presented in comparison to data for the medical group as a whole. These reports have a strong motivational effect on clinicians, as evidenced by a sharp increase in Smoking Intervention Program referrals after each reporting cycle.

As shown above, the second generation program achieved a six month quit rate of about 25%. This rate is virtually identical to the first generation program.  The new program, however, has much larger capacity and lower cost per participant. Patient satisfaction with the Smoking Intervention Program is encouraging, with 85% reporting that they would refer a friend to this program.


Using Eddy’s Explicit Method to Develop Practice Guidelines for Mammography in the 40-49 Age Group

Note that the following write-up is now almost 20 years old!  (How did that happen?)  Amazingly, the debate about the role and frequency of mammography in the 40-49 age group has raged on the entire time since then.  This case study demonstrates the real-world use of the “explicit method” proposed by one of my “health care heroes,” David Eddy.  Eddy’s approach involves using decision-analytic models and cost-effectiveness analysis to interpret evidence and incorporate our values to inform practice guideline development.  This case shows that the explicit method is not just a “purist” methodology.  It is a practical method of achieving agreement among physicians who started the process in angry disagreement.

In the last two decades, our field has largely retreated from difficult discussions about cost-effectiveness and the rational allocation of limited resources — the dreaded “R” word.  In the recent debates about health care reform, those opposing reform talked of “death panels” and berated the U.K.’s NICE program which espouses some of the same principles as we used in this case.  Such harsh talk has replaced the thoughtful, principled discussions we were having at the Henry Ford Health System and in the field in general back then.  True health care reform will require that we go back and cross at this light.

– R. Ward, MD   Jan 27, 2010

Background

As shown in the newspaper clipping below, the role of mammography in the 40-49 age group has long been controversial.

Many organizations recommend mammography during the 40s, including the American Cancer Society, the American College of Radiology, the American College of Obstetricians and Gynecologists, the American Medical Women’s Association, the National Alliance of Breast Cancer Organizations, and the National Breast Cancer Coalition.  Other organizations argue that the benefits of mammography have not been proven, and therefore mammography should not be offered until age 50.  These organizations include the American College of Physicians, the American College of Family Practice, the U.S. Preventive Services Task Force, the National Women’s Health Network, the National Center for Medical Consumers, the Dartmouth Center for the Evaluative Clinical Sciences, the Canadian National Task Force on the Periodic Examination, and the United Kingdom Public Health Policy Board.

Within the Henry Ford Medical Group, this same debate was raging.  In May 1991, the HFMG Operations Committee approved a Consensus Guideline calling for bi-annual screening in the 40-49 age group.  Then, in October 1993, the HFMG Consensus Guideline was updated based on recently published results of the Canadian National Breast Screening Study.  The new consensus group formulated a recommendation that “women age 40-49 not in a high risk group should carefully consider the risks and benefits before scheduling a mammogram.”  This draft was approved by the Clinical Practice Committee, but was subsequently rejected by the Operations Committee.  Strong letters of opposition were sent to HFMG leadership from the Department of Surgery and the Division of Breast Imaging.  This led to a decision by the Clinical Practice Committee to redo the analysis using explicit methods.

Objective

A multi-disciplinary team was commissioned by the HFMG Clinical Practice Committee to use an explicit methodology to conduct a clinical policy analysis and develop specific clinical policy recommendations regarding the role of screening mammography for average risk women age 40-49.

Methods

A multi-disciplinary team was assembled to conduct the analysis and formulate policy recommendations.  This team included the Chairmen of the departments of Radiology, Obstetrics & Gynecology, and Surgery.  It also included the Division Head of Breast Imaging, two general surgeons from the breast clinic, a staff oncologist, a Division Head in Internal Medicine, the Clinical Director of Family Practice, the Section Head of Epidemiology, and the Associate Medical Director of the Health Alliance Plan (HMO).  The team was led by the Director of the Center for Clinical Effectiveness.

Based on information from the medical literature, internal HFMG data, and expert opinion of team members, a mathematical model was developed and refined in order to gain a greater understanding of the implications and shortcomings of existing scientific evidence and to estimate the health and economic outcomes (with ranges of uncertainty) for three alternative plans: (1) do not recommend mammography until age 50, (2) recommend a program of bi-annual mammography during the 40-49 period, and (3) recommend a program of annual mammography.

Results

Results of Mammography Policy Analysis

As shown in the figure above, compared to not recommending mammograms, a program of 5 bi-annual mammograms for the 2,500 HFMG women entering their 40s would add about $1.5 million to the net health care cost for the group (90% range of certainty: $918k – $2.1 million). This program could be expected to save between one and six lives, resulting in a gain of about 141 life-years (43 – 244). This represents an expenditure of $440 thousand per life saved (undiscounted). With discounting of health and economic outcomes, this represents an incremental cost-effectiveness ratio of $34 thousand per life-year gained (15k – 120k). Earlier detection, in addition to saving lives, would permit the use of breast-conserving procedures in about 2 more women, and would permit non-systemic treatment for 4 more women. In addition, such a program of bi-annual mammography could lead to added peace of mind for about 1,700 women receiving all negative screening results. On the down side, 59 more women would suffer the fear, inconvenience and risk associated with a false-positive mammogram leading to a negative biopsy, and an additional 650-950 women would suffer the unneeded worry associated with false-positive mammogram results.

The full model results, presented in a table called a “balance sheet,” also show the calculated health and economic outcomes associated with Plan C, offering annual mammography during the 40-49 age range.

Compared to bi-annual mammography, a program of 10 annual mammograms during the 40s would cost an additional $900 thousand, saving an additional 0-1 lives for an estimated gain of 26 more life-years (1 – 58). This represents an expenditure of $1.4 million per life saved (undiscounted). With discounting, this represents an incremental cost-effectiveness ratio of $108 thousand per life-year gained (42k – 1.8 million).
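The large gap between the undiscounted and discounted ratios comes from timing: screening costs are paid during the 40s, but the life-years are gained decades later, so discounting shrinks the denominator much more than the numerator. The sketch below illustrates the mechanism with the incremental figures from this analysis; the timing assumption (all gains accruing roughly 23 years out, costs treated as paid up front) is my own simplification, not part of the published model.

```python
# Sketch of why discounting raises the cost-effectiveness ratio so sharply.
# Incremental cost and life-years are from the analysis above; the timing
# of the life-year gains is an assumed simplification.

RATE = 0.05                 # annual discount rate

def present_value(amount, years_from_now, rate=RATE):
    """Discount a future quantity back to the present."""
    return amount / (1 + rate) ** years_from_now

extra_cost = 900_000        # incremental cost of annual vs. bi-annual screening ($)
life_years = 26             # incremental life-years gained
YEARS_UNTIL_GAIN = 23       # assumption: gains accrue, on average, ~23 years out

undiscounted_icer = extra_cost / life_years
discounted_icer = extra_cost / present_value(life_years, YEARS_UNTIL_GAIN)

print(f"Undiscounted: ${undiscounted_icer:,.0f} per life-year")
print(f"Discounted:   ${discounted_icer:,.0f} per life-year")
```

At a 5% rate, a life-year gained two decades from now is worth roughly a third of a life-year gained today, which is why the discounted ratio is several times the undiscounted one.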

Conclusions

On the basis of these estimates, the team recommended bi-annual screening mammograms for average-risk women age 40-49. This guideline was intended to serve as a “best-practice,” “minimum practice,” and “maximum practice” guideline, as summarized in the following statement, which was unanimously endorsed by the team: “Unless documented, patient-specific circumstances dictate otherwise, it is important to offer screening mammograms every two years during the 40-49 age period. More frequent mammograms are not routinely needed for average risk women during this age period.” This guideline statement was incorporated into the existing HFMG clinical preventive services guideline for breast cancer screening.

Postlude

Shortly after this guideline was approved, a meta-analysis of mammography trials was published in JAMA. The abstract stated, “The results of our meta-analysis suggest that screening mammography reduced breast cancer mortality by 26% (95% CI 17-34%) in women aged 50-74 years, but does not significantly reduce breast cancer mortality in women aged 40-49 years.” (emphasis added). The body of the manuscript stated, “there were only three clinical trials in which women aged 40-49 years underwent two-view mammography and had 10-12 years of follow-up; in those, the relative risk for reduction in breast cancer mortality . . . was 0.73 (95% CI, 0.54 to 1.0) after 10-12 years of follow-up.” Although the abstract suggested the opposite conclusion from the HFMG clinical policy analysis, apparently based on a confidence interval that touched zero, the point estimate of mammography effectiveness in the 40-49 age group was actually more favorable than the HFMG analysis assumed (27% vs. 25% risk reduction). Because the assumptions and calculated outcomes were explicitly documented in the HFMG analysis, it was possible to take this potentially consensus-breaking piece of new information and put it in its appropriate context. This example also illustrates the decision-making criterion implied by the clinical trial and meta-analysis literature: if an intervention has a positive effect that is 95% or more likely to be greater than zero, it is implicitly recommended, regardless of the magnitude of the benefit in relation to its cost. The explicit method permits policy-makers in health care organizations to use a more sophisticated and philosophically defensible criterion based on an assessment of benefits, costs, and the uncertainties associated with each.


CQI Methods Used Improve the Quality of Cervical Cancer Screening

A baseline evaluation of “Pap” smears done in a large multi-specialty group practice revealed that over 25% of samples were designated “less than optimal” because of the absence of observed endocervical cells, an indicator of sample adequacy. In addition, there was a large variation in the rates of sample adequacy achieved by different physicians and at different clinic sites.

In response to these concerns about sample adequacy, clinical leaders encouraged the formation of a multidisciplinary clinical quality improvement team to work to improve the process by which cervical cytology samples were obtained and assessed for adequacy. This team included cytopathology staff, obstetrician-gynecologists, internists and clinical effectiveness staff.

The team used a quality improvement framework developed at Hospital Corporation of America (HCA), described by the acronym “FOCUS-PDCA.”  The team prepared process flow charts and identified an initial improvement: redesigning the cytology requisition form to provide data required for future analyses of Pap smear adequacy and management of cervical neoplasia.  The team also defined and implemented a new, more reproducible operational definition of the key quality characteristic: the proportion of samples with at least five observed endocervical cells.  Although no direct relationship between observed endocervical cells and decreased cervical cancer mortality has been demonstrated, the team conducted a retrospective analysis showing an increased prevalence of abnormalities in samples containing endocervical cells (14.9% vs. 6.3% overall), both for mild abnormalities (11.9% vs. 5.2%) and for severe abnormalities (3.0% vs. 0.8%).

The team then prepared “run charts,” plotting monthly sample adequacy rates over a two-year period, stratified by clinic location and specialty.  Then, based on a literature review and input from consultants, an “Ishikawa diagram” was prepared, outlining the known factors that could cause inadequate samples.  A study conducted in the Netherlands found that the cytobrush, a plastic sampling tool with a tip which resembles a pipe cleaner, together with a wooden spatula, produced a higher proportion of samples with endocervical cells in the hands of paramedical sample takers when compared to other commonly used tools, including the cotton swab traditionally used within the institution.  The team conducted a retrospective study and a second prospective study which confirmed these findings.

Based on these results, the team prepared a cost-benefit analysis, and drafted a proposal for an institutional clinical practice policy calling for sampling using the cytobrush and wooden spatula for screening Pap smears for non-pregnant women.  The policy was approved, and was communicated to the primary care medical staff through a series of scripted 13-minute slide presentations presented at local staff meetings at clinic sites throughout the institution.

Two approaches were used to assess the success of these staff training efforts. First, the team conducted a survey of clinic nurses and assistants to determine which sampling tools each internist and obstetrician-gynecologist used during each month of the study period. The survey was repeated to update information on two occasions.  These surveys revealed a dramatic transition from traditional sampling methods using a spatula, with or without a cotton swab, to methods using the cytobrush.  Second, the team confirmed this survey data by tracking orders for cytobrushes through the purchasing department, revealing that the overall volume of cytobrushes being ordered was consistent with the volume implied by merging survey data with physician-specific Pap smear volume data.

To assess the impact of changes in methods of sampling, the team used a “run chart” to track the proportion of inadequate Pap smears each month.  The run chart shows that the proportion of inadequate smears plunged from baseline levels of 20-25% per month to less than 10%. Possibly due to a decrease in the number of repeat Pap smears needed, the overall volume of Pap smears dropped by more than 10%. Consequently, the number of women who received a Pap smear report with the “less than optimal” designation was cut by well over 50%.

The estimated economic impact of these changes was favorable. On an annual basis, the additional cost of using a more expensive sampling tool was $15,000. The cost of additional physician sampling time for a “two tool” method added $11,000. The cost of added cytopathologist interpretation due to discovering an additional 1,068 abnormalities added $20,000. However, these costs were far outweighed by a savings of $158,000 from fewer repeat visits and fewer repeat Pap interpretations, leading to a net savings of $112,000 per year.
