
You may have heard the terms lumping and splitting applied to medical records coding, or maybe to the analysis of coded data. Whether we’re producing coded data or analyzing it, lumping and splitting describe a basic orientation towards the data: Simply put, lumping means making fewer distinctions in the coded data and splitting means making more distinctions in the coded data. But as in all things healthcare-y, the devil is in the details.

In coding, since we are producing the data that downstream users rely on, we must negotiate the tradeoffs between lumping and splitting and the broader implications for data quality. When we evaluate proposed changes to the ICD-10-CM/PCS classification systems, choosing whether to lump or split has lasting implications for data quality. Official coding advice must also consider the consequences of lumping vs. splitting.

For an example of how lumping vs. splitting manifests in proposed changes to ICD-10, we can look at the diagnosis portion of the ICD-10-CM meeting held this past week on Tuesday and Wednesday, September 12-13. The first code change proposal on the agenda, page 11, is a classic example of splitting. This proposal would add new codes to ICD-10-CM to distinguish the specific anatomic sites of anal and rectal abscesses. Each of these specific anatomic sites has its own treatment regimen and prognosis. Knowing the specific anatomic site in the coded data tells us the depth and therefore the seriousness of the abscess, so that the cost and efficacy of the treatment can be more accurately understood. In other words, in making this change to the classification, we would be splitting the coded data along more precise boundary lines: between patients with deeper, more difficult-to-treat abscesses and patients with more superficial ones. You can check out page 11 of the agenda PDF posted on the CDC website for the details of this proposal.

Over the course of its lifetime, a classification system tends to evolve in the direction of more detail (splitting) rather than less. However, with the ICD-10-PCS update going into effect in a couple of weeks, on October 1, the classification will become less detailed in some areas. In this update, hundreds of anatomically specific codes were deleted in select areas and replaced by less specific codes, i.e., we did some serious lumping. Why? Because these distinctions weren’t clinically useful. The consensus was that they were problematic as coded data, and that the data would be more useful without them.

Here’s an example: Before this latest update to ICD-10-PCS, for nearly all procedures performed on the diaphragm muscle (that big sheet of muscle that expands the thoracic cavity as it contracts, thereby drawing air into the lungs) the available ICD-10-PCS codes required coders to specify the right or left side of the diaphragm. There was no general body part that just said Diaphragm. There were serious problems with forcing this distinction. In three types of situations, the actual procedure site could not be accurately captured in the PCS code: 1) the procedure site is the middle of the diaphragm, 2) the procedure site spans both sides of the diaphragm, and 3) the procedure site is an unspecified area of the diaphragm because the documentation is unclear. In all three types of cases, coders ended up assigning two codes, for both left and right diaphragm, when only a single procedure was performed. Having two distinct codes was pointless as coded data, because the doubled code assignment and the false precision implied by the codes themselves were misleading. Furthermore, knowing left or right or middle or both was not considered particularly valuable information to collect for clinical research, cost, or quality.
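For readers who work with the coded data downstream, here is a minimal sketch of how the forced left/right split can inflate code counts. Everything in it is an assumption made for illustration: the encounter IDs, the documented sites, and the body part labels are made up, and the labels are plain descriptions rather than actual ICD-10-PCS code values.

```python
from collections import Counter

# Hypothetical encounters, one diaphragm procedure each.
# Sites and IDs are illustrative only, not real patient data.
procedures = [
    ("enc-1", "left"),
    ("enc-2", "middle"),
    ("enc-3", "both sides"),
    ("enc-4", "unspecified"),
]


def code_split(site):
    """Split scheme: only left and right body parts are available."""
    if site == "left":
        return ["Diaphragm, Left"]
    if site == "right":
        return ["Diaphragm, Right"]
    # Middle, both sides, or unspecified: the coder ends up assigning both.
    return ["Diaphragm, Left", "Diaphragm, Right"]


def code_lumped(site):
    """Lumped scheme: one generic body part regardless of site."""
    return ["Diaphragm"]


def count_codes(coder):
    """Tally every code assigned across all encounters."""
    return Counter(code for _, site in procedures for code in coder(site))


split_counts = count_codes(code_split)
lumped_counts = count_codes(code_lumped)

print("Procedures performed:", len(procedures))
print("Codes under split scheme:", sum(split_counts.values()), dict(split_counts))
print("Codes under lumped scheme:", sum(lumped_counts.values()), dict(lumped_counts))
```

In this toy data set, four procedures produce seven codes under the split scheme and four under the lumped scheme, so anyone counting codes to estimate procedure volume would overcount with the more "detailed" version.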

The moral of this story is: More detail is not an absolute good. Sometimes lumping gets you better data. In this and similar instances, the classification was improved by making it less detailed. Starting October 1, all procedures performed on the diaphragm will be coded using the single generic body part Diaphragm—no left, no right, no problem.

Official coding advice must also negotiate the tradeoffs between lumping and splitting. One of the most common questions asked is: When is a portion of an operative episode coded as an additional procedure? In other words, the coder is asking, in this case should I lump (assign a single code) or split (assign an additional code)? But rather than lumping that discussion in with this blog (groan), I’ll split it off into my next blog (double groan).

Rhonda Butler is a clinical research manager with 3M Health Information Systems.