Why Duplication Isn't Always the Enemy in Software Development
Every developer has heard the mantra: "Don't Repeat Yourself." The DRY principle is drilled into us from the earliest days of learning to code. We're taught that duplicated code is a smell, a sin, something to be hunted down and eliminated at every opportunity. But what if that instinct — left unchecked — leads us somewhere far worse than a little repetition? What if the abstraction we reach for to eliminate duplication is actually the wrong one?
This is the core insight behind a deceptively simple but profoundly important idea in software design: prefer duplication over the wrong abstraction. Originally articulated by programmer and author Sandi Metz, and discussed widely in the developer community, this principle challenges us to reconsider one of our most deeply held assumptions about writing clean code.
The Hidden Cost of the Wrong Abstraction
Abstractions are powerful. A well-chosen abstraction hides complexity, reduces cognitive load, and makes a system easier to reason about. But an abstraction that doesn't quite fit the problem it's supposed to solve carries a hidden and compounding cost that is easy to underestimate.
Here's how it typically unfolds. A developer notices two pieces of code that look similar. Following DRY instincts, they extract the shared logic into a single function or class. At first, this feels like a win. The codebase is smaller, and there's one place to make changes. But then requirements evolve. The two original use cases start to diverge. The abstraction, which was built around their similarities, must now accommodate their differences.
To handle the new case, a developer adds a parameter. Then another. Then a conditional branch inside the shared function. Then another parameter to control that branch. Before long, the abstraction has become a tangled mess of flags and special-case logic that is far harder to understand than two straightforward, separate implementations would ever have been. And because the code looks "shared," no one feels comfortable untangling it — they just keep adding to it.
This is the wrong abstraction in action. In many ways it is more dangerous than duplication: it obscures intent, it makes testing harder because every boolean flag doubles the number of combinations a caller can trigger, and it discourages future refactoring.
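The progression described above can be sketched in a few lines of Python. Everything here is hypothetical: the function name `build_message`, its flags, and the message wording are illustrative assumptions, not taken from any real codebase. The point is only to show what a shared helper tends to look like after two use cases have drifted apart.

```python
# Hypothetical sketch: a helper once shared by an order-confirmation email
# and a payment reminder, after several rounds of "just add a parameter".
def build_message(user, order, *, is_reminder=False, include_total=True, for_sms=False):
    # Flag 1: the SMS channel needed a shorter greeting.
    greeting = f"Hi {user}" if for_sms else f"Dear {user},"

    # Flag 2: reminders need entirely different wording.
    if is_reminder:
        body = f"Your payment for order {order['id']} is overdue."
    else:
        body = f"Thanks for your order {order['id']}."

    # Flag 3 interacts with flag 1: totals are never shown in SMS.
    if include_total and not for_sms:
        body += f" Total: {order['total']:.2f}"

    return f"{greeting} {body}"
```

Three boolean flags already allow eight combinations, and a reader has to trace all of them to learn which ones any real caller uses.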
What "Prefer Duplication" Actually Means
Preferring duplication over the wrong abstraction does not mean abandoning all principles of code reuse. It is not a license to copy and paste recklessly across an entire codebase. Rather, it is a reminder that duplication is a temporary and recoverable situation, while a bad abstraction can embed itself deeply into the architecture of a system and resist removal for years.
When two pieces of code look similar but represent genuinely different concepts or business rules, they should be separate. Forcing them into a single abstraction couples things that should not be coupled. A change to one use case bleeds into the other, creating subtle bugs and unexpected behavior.
Duplication, by contrast, is transparent. Duplicated code makes no claims about shared meaning. Each copy can evolve independently. When the time comes to refactor, it is straightforward to read two separate blocks of code and understand what each one does before deciding how — or whether — to unify them.
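As a minimal illustration, consider two validators whose bodies happen to be identical today. The names and rules below are assumptions made up for this sketch; what matters is that each rule belongs to a different part of the business and can change for its own reasons.

```python
# Hypothetical account rule: usernames are 3 to 20 alphanumeric characters.
def is_valid_username(value: str) -> bool:
    return 3 <= len(value) <= 20 and value.isalnum()


# Hypothetical marketing rule: coupon codes currently follow the same shape,
# but nothing ties the two rules together. If coupon codes later allow
# hyphens, only this function changes.
def is_valid_coupon_code(value: str) -> bool:
    return 3 <= len(value) <= 20 and value.isalnum()
```

Merging these into one `is_valid_identifier` helper would save a line now, but it would also assert that usernames and coupon codes are the same concept, which the code has no evidence for.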
The Right Time to Abstract
The practical question this principle raises is: when should you abstract? The answer lies not in the visual similarity of code, but in the semantic similarity of the concepts it represents.
- Abstract when things change together. If two pieces of code always need to be updated at the same time for the same reason, they likely represent the same concept and belong in a shared abstraction.
- Wait for the third instance. The "rule of three" suggests that one instance of code is unique, two instances might be coincidence, but three instances are a pattern worth abstracting. Waiting for sufficient evidence before abstracting prevents premature generalization.
- Let the abstraction emerge from real duplication. Genuine patterns reveal themselves over time. Starting with duplication and refactoring later — once the shape of the problem is clear — tends to produce better abstractions than trying to anticipate them upfront.
- Name the abstraction by the concept it represents, not by the mechanics it performs. If you struggle to give an abstraction a meaningful name that captures its purpose, that is often a sign that the abstraction itself is not yet well defined.
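The guidelines above can be sketched with a case where abstraction is justified. In this hypothetical example, three call sites all apply the same tax rule, they would all have to change together if the rate changed, and the shared concept has an obvious name. The rate, the function names, and the rounding choice are assumptions for illustration only.

```python
from decimal import Decimal

# Assumed rate for illustration; a real system would load this from config.
VAT_RATE = Decimal("0.20")


def add_vat(net: Decimal) -> Decimal:
    """One concept, one name: gross amount including VAT, rounded to cents."""
    return (net * (1 + VAT_RATE)).quantize(Decimal("0.01"))


# Three call sites that change together for the same reason.
def invoice_total(line_amounts: list[Decimal]) -> Decimal:
    return add_vat(sum(line_amounts, Decimal("0")))


def cart_preview_total(net: Decimal) -> Decimal:
    return add_vat(net)


def refund_amount(net_paid: Decimal) -> Decimal:
    return add_vat(net_paid)
```

Here the shared function emerged from three real uses of one business rule, not from two snippets that merely looked alike.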
Untangling a Wrong Abstraction: A Practical Approach
If you find yourself faced with a wrong abstraction already embedded in your codebase, the approach Metz recommends is to inline it. Yes: deliberately reintroduce the duplication. Copy the body of the shared function back into each of its call sites, then use the arguments each caller passes to work out which branches it actually executes and delete the rest, so that each call site becomes explicit and self-contained. With the duplication visible and the differences clear, you can refactor each piece on its own terms, free from the constraints of a structure that was never quite right to begin with.
This process feels counterintuitive. It can even feel like regression. But it is often the fastest path to clarity, and clarity is what makes software maintainable over the long term.
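A small before-and-after sketch of the inlining step, under stated assumptions: `render_price` and its `compact` flag are hypothetical, as are the two callers. Each caller receives a copy of the helper's body and keeps only the branch it used to select.

```python
# Before: a hypothetical shared helper that grew a flag.
def render_price(amount: float, currency: str, compact: bool = False) -> str:
    if compact:
        return f"{amount:.0f}{currency}"
    return f"{amount:,.2f} {currency}"


# After inlining: each caller owns the one branch it actually executed.
def render_invoice_price(amount: float, currency: str) -> str:
    # Was: render_price(amount, currency)
    return f"{amount:,.2f} {currency}"


def render_badge_price(amount: float, currency: str) -> str:
    # Was: render_price(amount, currency, compact=True)
    return f"{amount:.0f}{currency}"
```

Once every caller has been migrated, `render_price` can be deleted, and each remaining function can change without any risk to the other.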
A Cultural Shift Worth Making
Embracing this principle requires a cultural shift in how development teams talk about code quality. Duplication should not be treated as automatically shameful. The real questions to ask are: does this duplication represent the same concept, or coincidentally similar implementations of different concepts? Is the pressure to abstract coming from genuine shared meaning, or from discomfort with seeing similar lines of code?
Code reviews that penalize any duplication without asking these questions can push teams toward the very trap this principle warns against. Encouraging developers to tolerate duplication while they wait for the right abstraction to emerge takes confidence and discipline — but it pays dividends in systems that remain readable and adaptable as they grow.
Conclusion: Duplication Is a Tool, Not a Failure
The principle of preferring duplication over the wrong abstraction is ultimately about intellectual honesty. It asks us to resist the urge to tidy code into a shape that feels clean but misrepresents the underlying problem. It asks us to sit with a little redundancy rather than reach prematurely for a solution that creates more complexity than it removes.
Good software design is not measured by the absence of repeated lines. It is measured by how accurately the code reflects the structure of the problem it solves. Sometimes that means duplication. And when it does, duplication is not the problem — it is the honest answer.
