Physicians want fewer quality measures, but more might be better if it motivates core process improvement rather than “studying for the test”

In a paper published in the August 23, 2024 issue of JAMA, Claire Boone and Anna Zink, from the University of Chicago Booth School of Business (my MBA alma mater!), collaborated with Bill Wright and Ari Robicsek from the Providence Research Network to disclose something that usually remains a secret. 

They analyzed employment contracts of 809 primary care physicians (PCPs) affiliated with a health system (presumably part of Providence) and the value-based payer contracts relevant to the patients attributed to those PCPs.  The payer contracts included commercial payers, Medicaid, and Medicare.  The authors counted the number of unique quality metrics to which each of the PCPs was being held contractually accountable.  By “unique” they meant that if a particular quality metric appeared in multiple contracts, it was counted once.  They found that the PCPs were held accountable under an average of 7.6 value-based contracts, and the PCPs were accountable for an average of 57 unique quality metrics, with a range from 0 to 103 unique metrics.  Each Medicare contract contained an average of 13.4 metrics, compared to 10.1 for commercial contracts and 5.4 for Medicaid contracts.
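The deduplication logic behind the authors’ “unique metrics” count is straightforward to illustrate. The sketch below uses hypothetical contract data (not the study’s actual dataset); the contract names and metric labels are illustrative only.

```python
# Sketch of the deduplicated metric count described in the study,
# using made-up contract data. Each contract maps to the set of
# quality metrics it includes.
contracts = {
    "medicare_advantage_a": {"HbA1c control", "diabetic eye exam", "statin therapy"},
    "commercial_b": {"HbA1c control", "breast cancer screening"},
    "medicaid_c": {"breast cancer screening", "well-child visits"},
}

# A metric appearing in multiple contracts is counted only once.
unique_metrics = set().union(*contracts.values())
print(len(unique_metrics))  # 5 unique metrics across 3 contracts
```

Summing metrics per contract would give 7 here; the set union correctly reports 5, which is the counting convention the authors describe.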

The apparent purpose of the study was to call attention to what the authors referred to as “saturation of the quality measure environment.” The authors asserted that a clinician would be unlikely to be able to “reasonably optimize” against 50 or more measures at a time.

But, as noted in a published comment I wrote for JAMA, for almost three decades, I have disagreed with the premise that we need fewer measures.  On the contrary, I believe that we actually need a lot more measures to finally convince health care provider organizations to stop trying to “study for the test” and to actually learn the subject and improve the core processes that drive health care quality.

The Outcomes Management Movement

Back in the 1990s, I served as both a member and a technical advisor for the Committee on Performance Measurement (CPM) of the National Committee for Quality Assurance (NCQA), the committee that oversaw the HEDIS standards, which were the dominant quality measures at the time.  I was on the CPM as the representative of the American Group Practice Association (AGPA), which has since been renamed the American Medical Group Association (AMGA).   During our lengthy committee meetings in Washington and elsewhere, we had endless debates about competing priorities regarding the future of the HEDIS standards. 

We were concerned that the dominance of process-of-care metrics could be construed as micromanagement and overreach on the prerogatives of health care professionals and provider organizations to set their own guidelines and policies regarding evidence-based care.  The AGPA was an early advocate for the use of measures of patient experience and outcomes, collaborating with InterStudy and later the Health Outcomes Institute in the development of “TyPE specifications” – condition-specific outcomes measures.  Julie Sanderson-Austin, VP of Quality Management and Operations of AGPA, enlisted many AGPA member medical groups to refine and test the TyPE specifications.  At Henry Ford Health System, we implemented TyPE specifications for hip and knee replacement, cataract surgery, diabetes, and asthma, and we organized the SCORE Consortium to develop and test outcomes measures in six multi-specialty group practices.  The thinking was that providers should be held accountable for outcomes, so we could be free to innovate on different treatment and care management strategies to achieve improved outcomes.

Although I was an early advocate for outcomes measurement, I learned that there were two big problems that would make it infeasible to use outcomes survey data as a basis for accountability. 

  • First, the measures were too delayed.  For example, we measured whether patients experienced a reduction in bodily pain one year after hip replacement surgery. Counting the pre-operative baseline, the one-year follow-up, and the additional time needed to analyze and report the data, the outcomes data were always at least a year and a half out of date.  Five-year cancer survival outcomes are delayed at least five years. 
  • Second, the subjective experience measures had far too much variance. Even for common surgical procedures, the sample size at the level of individual physicians or practice locations was just too small to overcome random noise. 

Long delays and high variance are fine for large multi-center studies to evaluate which treatments work, but they don’t really work for physician-level accountability, no matter how lovely the idea seems on the surface.
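The variance problem is easy to demonstrate with a small simulation. The numbers below are hypothetical (the true improvement rate and per-surgeon volume are assumptions, not figures from our studies), but the point holds generally: even if every surgeon delivers identical true quality, small per-surgeon samples make observed outcome rates diverge by chance alone.

```python
# Minimal simulation of the sample-size problem: ten surgeons with
# IDENTICAL true quality still show a wide spread of observed
# patient-reported improvement rates, purely from random noise.
import random

random.seed(0)
TRUE_IMPROVEMENT_RATE = 0.80   # assumed true rate of pain improvement
PATIENTS_PER_SURGEON = 25      # assumed annual volume for one surgeon

observed = []
for _ in range(10):  # ten surgeons, same underlying quality
    improved = sum(random.random() < TRUE_IMPROVEMENT_RATE
                   for _ in range(PATIENTS_PER_SURGEON))
    observed.append(improved / PATIENTS_PER_SURGEON)

print(min(observed), max(observed))  # spread is noise, not quality
```

Ranking or paying these simulated surgeons on their observed rates would reward and punish nothing but luck, which is exactly the problem with physician-level accountability for high-variance outcome measures.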

There was also great concern among the HEDIS committee members and within the medical group community that quality and outcomes measurement might become too burdensome.  This sentiment, which seems to be shared by Boone et al in the recent JAMA paper, was that we needed to reduce the number of measures to which physicians and physician organizations were held accountable, for two reasons: to reduce the expense of measurement and improvement, and to allow physicians and organizations to focus their efforts on a small number of metrics to increase the chances of “moving the needle” to improve quality. 

The Quality Indicators Movement

But, even back then in the late 1990s, there was a dissenting school of thought, most eloquently articulated by Beth McGlynn, a PhD health services researcher from RAND Corporation who was serving as one of the other technical advisors to the HEDIS committee.  (Beth is currently VP for Research at Kaiser Permanente and executive director of the Kaiser Permanente Center for Effectiveness and Safety Research.)

Long ago, driving through the Arizona desert with Beth during one of our HEDIS committee trips, Beth explained her point of view that we should not think of specific process quality measures as objects of our focus in improvement initiatives.  Rather, we should treat the measures as merely “indicators” of the performance of the underlying care processes.  Beth recognized that health care quality was the sum of doing a bewildering number of small things right.  Yes, we needed to remember to order mammograms and pap smears.  But those are just two of perhaps five dozen clinical preventive services that we needed to deliver more consistently.  Beth’s team at RAND went on to develop the RAND Quality Assessment Tools system, including a whopping 439 “indicators of quality of care” covering 30 acute and chronic conditions as well as preventive care.  If one were to propose a panel of measures that large today, people’s heads would explode. They tested these measures through the Community Quality Index study, which was done as part of the Community Tracking Study conducted by the Center for Studying Health System Change.  Beth and her many collaborators published the results in a famous paper in the New England Journal of Medicine in 2003, in which they reported that 6,712 study subjects from 12 metropolitan areas received only 54.9% of the “recommended care” specified in the quality indicators, a result that was widely referenced as a call to action to improve healthcare quality.

Core Process Focus

Beth’s thinking resonated with my own observations within the Henry Ford Health System and the Health Alliance Plan.  We were one of the pilot sites that tested early versions of HEDIS, including the mammography quality measure.  In the Health Alliance Plan, we tried to improve mammography quality through a program to offer health plan members a HAP-logo gym bag if they got a mammogram and sent us a card saying they had done so.  I wondered if we were going to offer gifts for every preventive service. Perhaps baseball caps for pap smears?  Within the Henry Ford Medical Group, our radiology department tried to improve mammography quality by creating a new “Mammo-Track” computer system.  I wondered if we were going to create separate tracking systems for every radiologic test. What about laboratory tests?  Like Beth, I believed strongly that we were thinking too narrowly – studying for the test, rather than really trying to think through how we could improve the performance of our care processes across the board.   If we focused only on a small number of measures, even if we were successful in achieving dramatic improvements, we would only be addressing a tiny slice of the improvement opportunity space.

From my perspective, the only feasible way to address a sizable portion of the opportunity space is to recognize that there are a smaller number of “core processes” for clinical decision making and care delivery that drive overall quality of care.

Some examples:

  • During every encounter, there is (or should be) a core process to determine which preventive services and treatment monitoring interventions are due and to assure they get added to the care plan.
  • Between encounters, there is a core process to determine when such interventions are due and to initiate outreach if there is not an upcoming encounter scheduled.
  • When any item is added to a care plan (an order for a test, a referral for a consultation or procedure, a prescription, or even “return to clinic in 6 weeks”), there is a core process to track the care plan item to completion, escalate and remediate if it gets stuck, and initiate necessary follow-up plan of care changes.   
  • There are also core processes for ambulatory encounters, abnormal test result follow-up, care setting transitions, and care relationship transitions, among others. 
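The closed-loop tracking process in the third bullet can be sketched as a simple state machine. Everything below is a hypothetical illustration – the class, state names, and escalation rule are my own simplification for this post, not the API of any real system:

```python
# Hypothetical sketch of closed-loop care plan item tracking:
# every item is tracked to completion, and items that get stuck
# past their due date are escalated for outreach and remediation.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class CarePlanItem:
    description: str
    due: date
    status: str = "ordered"   # ordered -> completed, or escalated if stuck
    history: list = field(default_factory=list)

    def check(self, today: date) -> None:
        # Core process step: escalate any item still open past its due date.
        if self.status == "ordered" and today > self.due:
            self.status = "escalated"
            self.history.append(f"{today}: escalated for outreach")

    def complete(self, today: date) -> None:
        self.status = "completed"
        self.history.append(f"{today}: completed")

# Even a soft order like "return to clinic in 6 weeks" gets tracked.
item = CarePlanItem("return to clinic in 6 weeks", due=date(2024, 8, 1))
item.check(today=date(2024, 8, 15))
print(item.status)  # "escalated" -- the stuck item is flagged, not lost
```

The point of the sketch is the design choice: the loop closes on every item of every type, rather than building a separate tracking system (a “Mammo-Track”) for each individual metric.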

Over the years, I’ve developed and patented technology focused on enabling these core processes, and I’ve introduced these concepts to many health care organizations and clinically integrated networks.  The notion that we should focus our improvement efforts and IT investments on fundamentals rings true to many health care leaders, although I’ve also observed that the urge to focus on a small number of narrowly defined opportunities is a “zombie idea” – an idea that will not die, no matter how many times it is disproved.

In this context, it may be the case that contractually holding professionals and provider organizations accountable for a larger number of metrics – Beth McGlynn-style – might actually be helpful to force clinicians to realize that the metrics are merely “indicators” of the performance of the underlying core processes, and that the only way to achieve good overall performance is to establish and continuously improve those core processes.

We need to stop just studying for the test (narrowly focusing on a few things being measured) and start really learning the subject and improving core processes.

Comments

  1. Medscape covered the Boone paper, and also completely fell for the premise that fewer metrics is better. In the Medscape article, Boone is quoted as saying “…there are so many important tasks to do in primary care, and there’s no consensus on which ones should be included in quality-based contracts.” So, she recognized that the number of “important tasks” is large, but rather than concluding we need to focus on improving the core processes that enable performance across many tasks, she implies that there is a need to pick just a few task-specific metrics to be subject to contractual accountability. Also interviewed were Ronald N. Adler, MD (UMass) and Wayne Altman, MD (Tufts), both of whom also implicitly accepted the fewer is better philosophy.

    https://www.medscape.com/viewarticle/primary-care-physicians-track-average-57-quality-measures-2024a1000fqu?form=fpf
