Physicians want fewer quality measures, but more might be better if it motivates core process improvement rather than “studying for the test”

In a paper published in the August 23, 2024 issue of JAMA, Claire Boone and Anna Zink, from the University of Chicago Booth School of Business (my MBA alma mater!), collaborated with Bill Wright and Ari Robicsek from the Providence Research Network to disclose something that usually remains a secret. 

They analyzed employment contracts of 809 primary care physicians (PCPs) affiliated with a health system (presumably part of Providence) and the value-based payer contracts relevant to the patients attributed to those PCPs.  The payer contracts included commercial payers, Medicaid, and Medicare.  The authors counted the number of unique quality metrics to which each of the PCPs was being held contractually accountable.  By “unique” they meant that if a particular quality metric appeared in multiple contracts, it was counted once.  They found that the PCPs were held accountable under an average of 7.6 value-based contracts, and the PCPs were accountable for an average of 57 unique quality metrics, with a range from 0 to 103 unique metrics.  Each Medicare contract contained an average of 13.4 metrics, compared to 10.1 for commercial contracts and 5.4 for Medicaid contracts.
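The deduplication logic behind the authors’ “unique metrics” count is straightforward to illustrate. The sketch below uses hypothetical contract data (not the study’s actual dataset); the contract names and metric labels are illustrative only.

```python
# Sketch of the deduplicated metric count described in the study,
# using made-up contract data. Each contract maps to the set of
# quality metrics it includes.
contracts = {
    "medicare_advantage_a": {"HbA1c control", "diabetic eye exam", "statin therapy"},
    "commercial_b": {"HbA1c control", "breast cancer screening"},
    "medicaid_c": {"breast cancer screening", "well-child visits"},
}

# A metric appearing in multiple contracts is counted only once.
unique_metrics = set().union(*contracts.values())
print(len(unique_metrics))  # 5 unique metrics across 3 contracts
```

Summing metrics per contract would give 7 here; the set union correctly reports 5, which is the counting convention the authors describe.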

The apparent purpose of the study was to call attention to what the authors referred to as “saturation of the quality measure environment.” The authors asserted that a clinician would be unlikely to be able to “reasonably optimize” against 50 or more measures at a time.

But, as noted in a published comment I wrote for JAMA, for almost three decades, I have disagreed with the premise that we need fewer measures.  On the contrary, I believe that we actually need a lot more measures to finally convince health care provider organizations to stop trying to “study for the test” and to actually learn the subject and improve the core processes that drive health care quality.

The Outcomes Management Movement

Back in the 1990s, I served as both a member and a technical advisor for the Committee on Performance Measurement (CPM) of the National Committee for Quality Assurance (NCQA), the committee that oversaw the HEDIS standards, which were the dominant quality measures at the time.  I was on the CPM as the representative of the American Group Practice Association (AGPA), which has since been renamed the American Medical Group Association (AMGA).   During our lengthy committee meetings in Washington and elsewhere, we had endless debates about competing priorities regarding the future of the HEDIS standards. 

We were concerned that the dominance of process-of-care metrics could be construed as micromanagement and overreach on the prerogatives of health care professionals and provider organizations to set their own guidelines and policies regarding evidence-based care.  The AGPA was an early advocate for the use of measures of patient experience and outcomes, collaborating with InterStudy and later the Health Outcomes Institute in the development of “TyPE specifications” – condition-specific outcomes measures.  Julie Sanderson-Austin, VP of Quality Management and Operations of AGPA, enlisted many AGPA member medical groups to refine and test the TyPE specifications.  At Henry Ford Health System, we implemented TyPE specifications for hip and knee replacement, cataract surgery, diabetes, and asthma, and we organized the SCORE Consortium to develop and test outcomes measures in six multi-specialty group practices.  The thinking was that providers should be held accountable for outcomes, so we could be free to innovate on different treatment and care management strategies to achieve improved outcomes.

Although I was an early advocate for outcomes measurement, I learned that there were two big problems that would make it infeasible to use outcomes survey data as a basis for accountability. 

  • First, the measures were too delayed.  For example, we measured whether patients experienced a reduction in bodily pain one year after hip replacement surgery. Counting the pre-operative baseline, the one-year follow-up, and the additional time needed to analyze and report the data, the outcomes data were always at least a year and a half out of date.  Five-year cancer survival outcomes are delayed at least five years. 
  • Second, the subjective experience measures had far too much variance. Even for common surgical procedures, the sample size at the level of individual physicians or practice locations was just too small to overcome random noise. 

Long delays and high variance are fine for large multi-center studies to evaluate which treatments work, but they don’t really work for physician-level accountability, no matter how lovely the idea seems on the surface.
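The variance problem is easy to demonstrate with a small simulation. The numbers below are hypothetical (the true improvement rate and per-surgeon volume are assumptions, not figures from our studies), but the point holds generally: even if every surgeon delivers identical true quality, small per-surgeon samples make observed outcome rates diverge by chance alone.

```python
# Minimal simulation of the sample-size problem: ten surgeons with
# IDENTICAL true quality still show a wide spread of observed
# patient-reported improvement rates, purely from random noise.
import random

random.seed(0)
TRUE_IMPROVEMENT_RATE = 0.80   # assumed true rate of pain improvement
PATIENTS_PER_SURGEON = 25      # assumed annual volume for one surgeon

observed = []
for _ in range(10):  # ten surgeons, same underlying quality
    improved = sum(random.random() < TRUE_IMPROVEMENT_RATE
                   for _ in range(PATIENTS_PER_SURGEON))
    observed.append(improved / PATIENTS_PER_SURGEON)

print(min(observed), max(observed))  # spread is noise, not quality
```

Ranking or paying these simulated surgeons on their observed rates would reward and punish nothing but luck, which is exactly the problem with physician-level accountability for high-variance outcome measures.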

There was also great concern among the HEDIS committee members and within the medical group community that quality and outcomes measurement might become too burdensome.  This sentiment, which seems to be shared by Boone et al in the recent JAMA paper, was that we needed to reduce the number of measures to which physicians and physician organizations were held accountable, for two reasons: to reduce the expense of measurement and improvement, and to allow physicians and organizations to focus their efforts on a small number of metrics to increase the chances of “moving the needle” to improve quality. 

The Quality Indicators Movement

But, even back then in the late 1990s, there was a dissenting school of thought, most eloquently articulated by Beth McGlynn, a PhD health services researcher from RAND Corporation who was serving as one of the other technical advisors to the HEDIS committee.  (Beth is currently VP for Research at Kaiser Permanente and executive director of the Kaiser Permanente Center for Effectiveness and Safety Research.)

Long ago, driving through the Arizona desert with Beth during one of our HEDIS committee trips, Beth explained her point of view that we should not think of specific process quality measures as objects of our focus in improvement initiatives.  Rather, we should treat the measures as merely “indicators” of the performance of the underlying care processes.  Beth recognized that health care quality was the sum of doing a bewildering number of small things right.  Yes, we needed to remember to order mammograms and pap smears.  But those are just two of perhaps five dozen clinical preventive services that we needed to deliver more consistently.  Beth’s team at RAND went on to develop the RAND Quality Assessment Tools system, including a whopping 439 “indicators of quality of care” covering 30 acute and chronic conditions as well as preventive care.  If one were to propose a panel of measures that large today, people’s heads would explode. They tested these measures through the Community Quality Index study, which was done as part of the Community Tracking Study conducted by the Center for Studying Health System Change.  Beth and her many collaborators published the results in a famous paper in the New England Journal of Medicine in 2003, in which they reported that 6,712 study subjects from 12 metropolitan areas received only 54.9% of the “recommended care” specified in the quality indicators, a result that was widely referenced as a call to action to improve healthcare quality.

Core Process Focus

Beth’s thinking resonated with my own observations within the Henry Ford Health System and the Health Alliance Plan.  We were one of the pilot sites that tested early versions of HEDIS, including the mammography quality measure.  In the Health Alliance Plan, we tried to improve mammography quality through a program to offer health plan members a HAP-logo gym bag if they got a mammogram and sent us a card saying they had done so.  I wondered if we were going to offer gifts for every preventive service. Perhaps baseball caps for pap smears?  Within the Henry Ford Medical Group, our radiology department tried to improve mammography quality by creating a new “Mammo-Track” computer system.  I wondered if we were going to create separate tracking systems for every radiologic test. What about laboratory tests?  Like Beth, I believed strongly that we were thinking too narrowly – studying for the test, rather than really trying to think through how we could improve the performance of our care processes across the board.   If we focused only on a small number of measures, even if we were successful in achieving dramatic improvements, we would only be addressing a tiny slice of the improvement opportunity space.

From my perspective, the only feasible way to address a sizable portion of the opportunity space is to recognize that there are a smaller number of “core processes” for clinical decision making and care delivery that drive overall quality of care.

Some examples:

  • During every encounter, there is (or should be) a core process to determine which preventive services and treatment monitoring interventions are due and to assure they get added to the care plan.
  • Between encounters, there is a core process to determine when such interventions are due and to initiate outreach if there is not an upcoming encounter scheduled.
  • When any item is added to a care plan (an order for a test, a referral for a consultation or procedure, a prescription, or even “return to clinic in 6 weeks”), there is a core process to track the care plan item to completion, escalate and remediate if it gets stuck, and initiate necessary follow-up plan of care changes.   
  • There are also core processes for ambulatory encounters, abnormal test result follow-up, care setting transitions, and care relationship transitions, among others. 
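The closed-loop tracking process in the third bullet can be sketched as a simple state machine. Everything below is a hypothetical illustration – the class, state names, and escalation rule are my own simplification for this post, not the API of any real system:

```python
# Hypothetical sketch of closed-loop care plan item tracking:
# every item is tracked to completion, and items that get stuck
# past their due date are escalated for outreach and remediation.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class CarePlanItem:
    description: str
    due: date
    status: str = "ordered"   # ordered -> completed, or escalated if stuck
    history: list = field(default_factory=list)

    def check(self, today: date) -> None:
        # Core process step: escalate any item still open past its due date.
        if self.status == "ordered" and today > self.due:
            self.status = "escalated"
            self.history.append(f"{today}: escalated for outreach")

    def complete(self, today: date) -> None:
        self.status = "completed"
        self.history.append(f"{today}: completed")

# Even a soft order like "return to clinic in 6 weeks" gets tracked.
item = CarePlanItem("return to clinic in 6 weeks", due=date(2024, 8, 1))
item.check(today=date(2024, 8, 15))
print(item.status)  # "escalated" -- the stuck item is flagged, not lost
```

The point of the sketch is the design choice: the loop closes on every item of every type, rather than building a separate tracking system (a “Mammo-Track”) for each individual metric.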

Over the years, I’ve developed and patented technology focused on enabling these core processes, and I’ve introduced these concepts to many health care organizations and clinically integrated networks.  The notion that we should focus our improvement efforts and IT investments on fundamentals rings true to many health care leaders, although I’ve also observed that the urge to focus on a small number of narrowly defined opportunities is a “zombie idea” – an idea that will not die, no matter how many times it is disproved.

In this context, it may be the case that contractually holding professionals and provider organizations accountable for a larger number of metrics – Beth McGlynn-style – might actually be helpful to force clinicians to realize that the metrics are merely “indicators” of the performance of the underlying core processes, and that the only way to achieve good overall performance is to establish and continuously improve those core processes.

We need to stop just studying for the test (narrowly focusing on a few things being measured) and start really learning the subject and improving core processes.

Comments

  1. Medscape covered the Boone paper, and also completely fell for the premise that fewer metrics is better. In the Medscape article, Boone is quoted as saying “…there are so many important tasks to do in primary care, and there’s no consensus on which ones should be included in quality-based contracts.” So, she recognized that the number of “important tasks” is large, but rather than concluding we need to focus on improving the core processes that enable performance across many tasks, she implies that there is a need to pick just a few task-specific metrics to be subject to contractual accountability. Also interviewed were Ronald N. Adler, MD (UMass) and Wayne Altman, MD (Tufts), both of whom also implicitly accepted the fewer is better philosophy.

    https://www.medscape.com/viewarticle/primary-care-physicians-track-average-57-quality-measures-2024a1000fqu?form=fpf
