Elsevier

Journal of Health Economics

Volume 56, December 2017, Pages 397-413
Plan responses to diagnosis-based payment: Evidence from Germany’s morbidity-based risk adjustment

https://doi.org/10.1016/j.jhealeco.2017.03.001

Highlights

  • Many competitive insurance markets use diagnosis information to adjust payments to plans.

  • We examine the response of German plans to such morbidity-based payments.

  • We implement a differencing approach on diagnoses reported to the regulator.

  • The share and the number of “validated” (payment-relevant) diagnoses rise disproportionately.

  • This response is in line with the incentives generated by the payment formula.

Abstract

Many competitive health insurance markets adjust payments to participating health plans according to their enrollees’ risk, including risk measured from diagnostic information. We investigate responses of German health plans to the introduction of morbidity-based risk adjustment in the Statutory Health Insurance in 2009, which triggers payments based on “validated” diagnoses by providers. Using the regulator’s data from office-based physicians, we conduct a difference-in-difference analysis of the change in the share and number of validated diagnoses for ICD codes that are inside or outside the risk adjustment but are otherwise similar. We find a differential increase in the share of validated diagnoses of 2.6 to 3.6 percentage points (3–4%) between 2008 and 2013. This increase appears to originate from both a shift from not-validated toward validated diagnoses and an increase in the number of such diagnoses. Overall, our results indicate that plans were successful in influencing physicians’ coding practices in a way that could lead to higher payments.

Introduction

Competitive health insurance markets generally calibrate per-capita transfers to health plans based on the risk of their covered populations. In order to enhance the accuracy of payments, risk adjustment (RA) systems in the United States and many European countries have evolved from adjusting payments based on demographic factors to also include diagnosis-based morbidity indicators. However, these payment systems create a financial incentive for plans to report diagnoses that are included in the RA and trigger high payments (relative to resource costs). Plans can encourage coding that is appropriate (“right-coding”) or that unduly substitutes more generously-paid codes for less generously-paid ones (“upcoding”). The resulting change in coding patterns can lead to nominal changes in disease profiles (i.e., increased prevalence of certain diagnoses and/or increased severity) that do not reflect changes in actual disease patterns and severity.

A possible increase in nominal coding due to morbidity-based payments raises several concerns (Geruso and Layton, 2015; Kronick and Welch, 2014). First, in settings without an overall budget cap, as in the US Medicare Advantage program and the Health Insurance Exchanges, increases in nominal coding and coding intensity that have no real basis can unduly increase government spending. In the context of Medicare, this concern has triggered repeated legislative adjustments to payments, e.g., via the Deficit Reduction Act of 2005, the Affordable Care Act of 2010, and the American Taxpayer Relief Act of 2012 (Kronick and Welch, 2014). Second, in contexts where RA is used to allocate a fixed budget, as in many European countries such as Germany, the Netherlands, Belgium and Switzerland, this behavior can generate inefficiencies by distorting the allocation of resources between competing health plans. Third, increases in nominal coding can change patient profiles, as codes are assigned to patients who lack an adequate basis for a diagnosis, or as patients with low severity are assigned to higher-severity codes. As a consequence, over time the average costs for the affected diagnoses may fall, pushing down the payment associated with the specific risk adjuster. Because all plans would receive this lower payment, this effect can force all market participants to lower costs or increase revenue, potentially leading to undesirable behavior such as risk selection or reinforcing intensive coding. Finally, this behavior can divert health plans’ attention from organising provision to engaging in rent-seeking. Plans that successfully manipulate coding may then use some of the additional earnings to distort consumer choices of plans, e.g., by offering premium rebates or supplemental benefits (Geruso and Layton, 2015).

In this paper we examine the impacts on coding of office-based physicians from the introduction of the morbidity-based RA in the German Statutory Health Insurance (SHI) in 2009. The “morbi-RA” replaced a more basic system that adjusted for age, sex and disability-to-work status. The new system includes these parameters, as well as morbidity groups for 80 illnesses that are constructed based on ICD-10 diagnosis codes from hospitals and office-based physicians. Unlike in the US, German health plans are generally not allowed to own or operate health care facilities and contracting is mostly done collectively between the plan and provider associations. However, as described below, even within this heavily regulated environment, German plans have several ways to encourage physicians to adopt coding practices that are associated with (higher) payments through the RA.

We focus on a subtle payment-relevant feature of the German RA system, the designation of outpatient diagnoses. The German SHI’s RA scheme only takes into account diagnoses made by office-based physicians if the latter have designated the diagnosis as “validated”. Validation means that the physician is affirming the patient has the respective condition, as opposed to merely suspecting a diagnosis or recording an earlier diagnosis that is no longer relevant. In this paper, we examine changes in the prevalence and count of diagnoses that are “validated” and hence taken into account by the RA scheme. We estimate the impact of the RA on the documentation of these diagnoses in difference-in-difference analyses on a random sample of administrative data used to execute the RA payments for the years 2008–2013. Specifically, we examine the change over time in the share and count of validated diagnoses at the level of an individual ICD code, for codes that were or were not part of the RA scheme. Our analyses are based on diagnoses submitted by office-based physicians, who are not required to report validated diagnoses but who are required to mark each diagnosis as validated or not, and whose individual payments are based on procedures and not diagnosis codes.

Fig. 1 previews our main finding that the average share of validated diagnoses increases faster for codes that are included in the RA scheme than for those that are excluded. The regression estimates indicate that the relative increase for these codes was between 2.6 and 3.6 percentage points between 2008 and 2013. We also find that this effect is driven by both a shift from not-validated toward validated (payment-relevant) diagnoses and an increase in the number of such diagnoses. We further investigate differences in this effect across types of health plans and find that although this effect exists for most plan types, regional health plans may have experienced larger changes in coding than their competitors. This could indicate that the substantial and historical local ties of regional plans provide an effective means to shape physician coding practices, and may act as a substitute for explicit vertical integration in settings like the United States (Geruso and Layton, 2015). Our results are robust to excluding those codes and groups of codes that changed over time because of revisions of the RA system or the ICD catalog. We argue that these effects are likely the consequences of nominal rather than real changes in morbidity, as the latter are unlikely to affect only payment-relevant codes (and should therefore be captured by our control group) and are unlikely to differ across plan types. Finally, we find no clear correlation between payments to plans for specific diagnoses and the change in the coding patterns, possibly because plans are unable to narrowly target specific diagnoses due to practical or legal constraints.

Research on plan responses to coding incentives in the US Medicare program has leveraged the fact that RA is only used for the Medicare Advantage (MA) component and not for the Fee-for-Service (FFS) component. FFS is used as a control group to capture real changes in diagnoses that can be subtracted from the combined nominal and real changes in the MA diagnoses, after accounting for risk selection between MA and FFS. Using this approach, Kronick and Welch (2014) find that in each year between 2004 and 2013, risk scores in MA rose faster than risk scores in FFS. They conclude that this rise in relative risk scores reflects changes in coding intensity rather than real increases in morbidity. Geruso and Layton (2015) investigate differences in coding intensities for FFS and MA, and among types of MA plans. They estimate that the relatively more intensive coding by MA plans generates risk scores that are 6–16% higher than they would have been in FFS. They also find that the risk scores are higher for MA insurers that are vertically integrated with providers, possibly because this makes it easier for insurers to influence providers’ coding behavior.

A related literature on hospitals’ responses to diagnosis-based payments has exploited the introduction of the diagnosis-related group (DRG) payment system or recalibrations in the payments of specific DRGs. Jürges and Köberlein (2015) examine how German hospitals responded to the introduction of DRG payments in 2003 by focusing on sharp thresholds for birth weight in DRG assignments that determine payments for preterm babies. They find that hospitals responded to the introduction of the birth weight thresholds by shifting newborns’ reported birth weights from above to below the relevant thresholds, leading to DRGs with higher payment. Dafny (2005) studies US hospitals’ responses to a recalibration of Medicare’s DRG reimbursements in 1988. She investigates pairs of codes that are clinically similar but are associated with different payment amounts. Her findings suggest that the share of lucrative codes within a pair increased with the pair’s payment gap. She also finds that the response was primarily nominal (via coding practices) rather than real (changes in admission volumes and intensity of care). For the period after the 1988 change, Silverman and Skinner (2004) find a disproportionate increase in the prevalence of the most generous codes for pneumonia and respiratory infections. A similar methodology has been used to document responses to changes in DRG payments by hospitals in Portugal and Norway (Barros and Braun, 2016; Januleviciute et al., 2016). Sacarny (2016) evaluates hospitals’ responses to a 2008 reform that increased Medicare payments for claims that had detailed codes describing the patients’ type of heart failure. He finds that hospitals were aware of the rewards to more detailed coding and responded accordingly. However, he also finds that the more lucrative coding diffused only slowly because physicians lacked incentives to change their coding practices.

Our analysis contributes a different perspective to the literature on plan responses, and in a context that differs from Medicare in important and informative ways. First, we highlight a novel approach to identifying coding responses induced by health plans by exploiting differential incentives of diagnoses that are included or excluded from the RA. Our approach is closely related to that used in studies of hospital coding behavior (Dafny, 2005; Sacarny, 2016) but, to our knowledge, has not yet been applied to health plans that do not directly control coding decisions. Second, we investigate changes at the level of individual ICD codes. This is in contrast to most related work on coding by MA plans, which focuses on aggregate risk scores (an exception is Geruso and Layton (2015), who examine the transition of elderly in Massachusetts into MA or FFS). The analysis at the code level allows for controlling for time-constant factors at the code level, as well as for explicitly accounting for changes in the coding system by excluding affected codes and groups of codes. By focusing on changes in designations (validated vs. others) within ICD codes, we analyze subtler changes than the reclassification of patients to different diagnoses studied in the research on hospital coding. The changes in designations are unlikely to affect treatment and do not directly affect physicians’ pay, but nonetheless can generate substantial increases in payments to plans. Relatedly, we observe changes over the transition from the demographic to the morbidity-based RA, which allows us to examine a major change in addition to later minor revisions within the morbidity-based system. A similar transition in Medicare in 2004 has been studied to examine risk selection by focusing on risk scores rather than diagnoses (Brown et al., 2014; Newhouse et al., 2012), and research on diagnosis coding for existing Medicare beneficiaries has not used data prior to the new RA. Our setting also facilitates a longitudinal difference-in-difference design that leverages data from before the change. In addition, the SHI operates as a single system so that there is no risk of selection between program components in the cross-section, e.g., as between MA and Medicare FFS. Our results also apply to the entire SHI population rather than a subgroup. Finally, we examine plan responses in a regulatory environment that is very different from the US and more similar to that of other European countries. For instance, US plans may be able to address principal-agent problems related to coding behavior through mechanisms not available to German plans, e.g., vertical integration (Geruso and Layton, 2015).

The remainder of the paper is structured as follows. In Section 2, we present additional information on the German health insurance system and describe relevant features of the morbidity-adjusted RA scheme, as well as coding incentives for hospitals and physicians, and how plans can influence them. In Section 3, we describe the administrative data used for our analysis, and in Section 4 we explain the estimation strategy. Results are presented in Section 5. Section 6 concludes with a discussion.

Section snippets

The statutory health insurance

The German SHI covers about 90% of the population. The remainder is mostly covered by a separate private insurance system that restricts enrolment.

Data

We conduct our primary analyses on a 10% random sample of administrative data used by the German insurance regulator (Bundesversicherungsamt) to construct the RA weights and to execute the payments to plans. Our dataset includes the yearly count of diagnoses that outpatient physicians and hospitals reported for the period 2008–2013 at the level of individual ICDs and classified using the 10th revision of the International Statistical Classification of Diseases and Related Health Problems
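The two outcome measures used later (the share and the log number of validated diagnoses per ICD code and year) can be illustrated with a minimal Python sketch. The records, the designation labels ("validated", "suspected") and the function name `outcomes` are hypothetical and are not taken from the regulator's data; the sketch also assumes that the share's denominator is the total number of outpatient diagnoses recorded for a code in a given year.

```python
import math
from collections import defaultdict

# Hypothetical records: (icd_code, year, designation, count).
# Labels and counts are illustrative only, not the regulator's data.
records = [
    ("E11", 2008, "validated", 800), ("E11", 2008, "suspected", 200),
    ("E11", 2013, "validated", 900), ("E11", 2013, "suspected", 100),
    ("J06", 2008, "validated", 600), ("J06", 2008, "suspected", 400),
]


def outcomes(records):
    """Return the share and log count of validated diagnoses per (code, year)."""
    validated = defaultdict(int)
    total = defaultdict(int)
    for code, year, designation, count in records:
        total[(code, year)] += count
        if designation == "validated":
            validated[(code, year)] += count

    result = {}
    for key, n in total.items():
        v = validated[key]
        result[key] = {
            # Assumption: share = validated / all diagnoses for the code-year.
            "share": v / n,
            # The log is undefined for zero counts; None is used as a marker.
            "log_count": math.log(v) if v > 0 else None,
        }
    return result


if __name__ == "__main__":
    for key, values in sorted(outcomes(records).items()):
        print(key, values)
```

How code-years with zero validated diagnoses are treated in the log specification is not stated in this excerpt; returning `None` here is only a placeholder choice.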

Methods

We implement a difference-in-difference analysis to estimate the change, over time, in the share and count of validated diagnoses using ICD codes that are included in the RA scheme as treatment group and those not included as control group. Our main estimation equation is:

$$y_{it} = \beta_0 + \beta_1 \mathit{InRA}_{it} + \delta' \, \mathit{InRA}_{it} \, t + \eta_i + \gamma_t + u_{it}$$

where $y_{it}$ represents our outcome measures, i.e., either the share or the log number of validated diagnoses for ICD code $i$ in year $t$. $\mathit{InRA}_{it}$ takes on the value 1 if code $i$ is included in the
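The logic of the differencing approach can be sketched in a few lines of Python. This is a simplified illustration, not the estimation used in the paper: it assumes a balanced panel and an inclusion status that does not change over time, in which case the year-specific difference-in-difference in group means should coincide with the interaction coefficients of a regression with code and year fixed effects. The panel values, the code labels and the function name `did_by_year` are hypothetical.

```python
from statistics import mean

# Hypothetical balanced panel: share of validated diagnoses by (icd_code, year).
panel = {
    ("A", 2008): 0.80, ("A", 2013): 0.86,
    ("B", 2008): 0.70, ("B", 2013): 0.74,
    ("C", 2008): 0.75, ("C", 2013): 0.77,
    ("D", 2008): 0.65, ("D", 2013): 0.67,
}
# Assumption: inclusion in the RA is fixed per code (time-invariant).
in_ra = {"A": True, "B": True, "C": False, "D": False}


def did_by_year(panel, in_ra, base_year):
    """Difference-in-difference of group means for each year vs. base_year."""
    years = sorted({year for _, year in panel})

    def group_mean(flag, year):
        return mean(
            value
            for (code, y), value in panel.items()
            if y == year and in_ra[code] == flag
        )

    estimates = {}
    for year in years:
        if year == base_year:
            continue
        treated_change = group_mean(True, year) - group_mean(True, base_year)
        control_change = group_mean(False, year) - group_mean(False, base_year)
        estimates[year] = treated_change - control_change
    return estimates


if __name__ == "__main__":
    print(did_by_year(panel, in_ra, base_year=2008))
```

In this toy panel the included codes rise by 5 percentage points and the excluded codes by 2, so the differential increase is about 3 percentage points. The specification in the text indexes inclusion by both code and year, which this sketch does not capture.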

Results

Table 2 reports the estimates from our primary analyses. Columns 1 and 2 report results for the share of validated diagnoses, columns 3 and 4 for the log number of validated diagnoses. Columns 1 and 3 are estimated on all ICD codes, while columns 2 and 4 are estimated on codes with similar chronicity at baseline, i.e., a chronicity between 45 and 55%. As the negative coefficient estimates of the indicator for inclusion in the RA (inRA) in columns 1 and 2 indicate, the codes that were included in

Discussion

Our findings suggest that German health plans were successful in responding to the financial incentives embedded in the 2009 morbidity-based RA by inducing physicians to more often designate as “validated” (and hence payment-relevant) those diagnosis codes that are included in the payment formula. This increase in the share of payment-relevant validated diagnoses is relative to changes in other diagnoses that are not payment-relevant, and is more pronounced for certain plan types. It therefore

Funding sources

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (32)

  • BVA

    Tätigkeitsbericht 2015

    (2015)
  • P. Barros et al.

    Upcoding in a national health service: the evidence from Portugal

    Health Econ.

    (2016)
  • N. Brandt

    Moving Towards more Sustainable Healthcare Financing in Germany. OECD Econ. Dep. Work. Pap. No 612

    (2008)
  • J. Brown et al.

    How does risk selection respond to risk adjustment? New evidence from the Medicare Advantage Program

    Am. Econ. Rev.

    (2014)
  • R. Busse et al.

    Germany health system review

    Health Syst. Transit.

    (2014)
  • DIMDI

    Welche Zusatzkennzeichen gibt es in ICD-10-GM und OPS und wie werden sie angewendet? (FAQ Nr. 1010) [WWW Document]

    (2012)