The Wrong Abstraction

Putting statelessness at the top of your coding concerns resonates with me. Hold off on refactoring duplicated code until it is necessary. When is that? Maybe not until the code is, or will be, shared in many places.

Here are comments on the priority of statelessness and decoupling.

Dependencies (coupling) are an important concern to address, but they’re only 1 of 4 criteria that I consider, and not the most important one. I try to optimize my code around reducing state, coupling, complexity and code, in that order. I’m willing to add coupling if it makes my code more stateless. I’m willing to make it more complex if it reduces coupling. And I’m willing to duplicate code if it makes the code less complex. Only if it doesn’t increase state, coupling or complexity do I dedup code.

The reason I put stateless code as the highest priority is it’s the easiest to reason about. Stateless logic functions the same whether run normally, in parallel or distributed. It’s the easiest to test, since it requires very little setup code. And it’s the easiest to scale up, since you just run another copy of it. Once you introduce state, your life gets significantly harder.
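To make that concrete, here is a minimal Python sketch. The names (RunningTotal, price_with_tax) are made up for illustration and are not from the original comment. The stateless function gives the same answers whether it runs in a loop or in a thread pool, while the stateful object gives a different answer for the same call depending on what happened before.

```python
from concurrent.futures import ThreadPoolExecutor


class RunningTotal:
    """Stateful: the result of add() depends on every call made before it."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


def price_with_tax(price_cents, tax_percent):
    """Stateless: the result depends only on the arguments."""
    return price_cents + price_cents * tax_percent // 100


prices = [1000, 2500, 399]

# The stateless function behaves the same sequentially and in parallel.
sequential = [price_with_tax(p, 8) for p in prices]
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(lambda p: price_with_tax(p, 8), prices))

# The stateful object returns different results for the same call.
counter = RunningTotal()
first = counter.add(5)   # 5
second = counter.add(5)  # 10
```

Testing price_with_tax needs no setup at all, whereas testing RunningTotal means first putting it into a known state.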

I think the reason that novice programmers optimize around code reduction is that it’s the easiest of the 4 to spot. The other 3 are much more subtle and subjective and so will require greater experience to spot. But learning those priorities, in that order, has made me a significantly better developer.

There are a few ways in which state vs coupling can play out. Often they’re part of the architecture of a system rather than the low-level functions and types that a developer creates. As an example, should you keep an in-memory queue (state) of jobs coming into your system, or maintain a separate queue component (coupling)? By extracting the state from your component and isolating it in Rabbit or some other dedicated state management piece, you’ve made the job of managing that state easier and more explicit.
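The queue trade-off might look roughly like this in Python. This is only a sketch: StatefulWorker and run_once are hypothetical names, and the standard library's queue.Queue stands in for an external broker so the example stays runnable. A real broker would sit behind a client library instead.

```python
import queue


class StatefulWorker:
    """Keeps pending jobs in its own memory; they vanish if the worker dies."""

    def __init__(self):
        self._pending = []

    def submit(self, job):
        self._pending.append(job)

    def run_once(self):
        return self._pending.pop(0).upper() if self._pending else None


def run_once(jobs):
    """Holds no jobs itself; the queue is a dependency passed in (coupling)."""
    try:
        job = jobs.get_nowait()
    except queue.Empty:
        return None
    return job.upper()


# Assumption: queue.Queue is a stand-in for a dedicated queue component.
broker = queue.Queue()
broker.put("resize image")
first = run_once(broker)   # processes the only job
second = run_once(broker)  # nothing left to do
```

The second version depends on something outside itself, but any number of copies of run_once can pull from the same queue, and none of them has jobs to lose.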

As for complexity, there are many different types. Coupling is a form of complexity, but it’s not the only one. Cyclomatic complexity is another one. Using regular expressions often increases the complexity of code. And one need only look at the spec for any reasonably popular hash function to see a completely different sort of complexity that’s not the result of either coupling or unique paths through code. The composite of all the different forms of complexity is how I’d define it, since they all add to a developer’s cognitive load.

Off the top of my head, there are two reasons people seem to de-duplicate code. One is because two or more things happen to share similar code. Another reason is because two or more things must share similar code.

It seems like you are speaking of the first reason. There is no dependency, and the programmer is creating one. IMHO you should have at least 3 instances before creating an abstraction to reduce your code.
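A made-up Python illustration of the first case: the two functions below happen to have the same body today, but nothing requires them to stay that way. Both names and both formulas are hypothetical.

```python
def shipping_fee_cents(weight_kg):
    """Base fee plus a per-kilogram charge."""
    return 500 + 100 * weight_kg


def late_fee_cents(days_overdue):
    """Base fee plus a per-day charge; same shape only by coincidence."""
    return 500 + 100 * days_overdue
```

Merging them into one "base plus rate" helper would tie shipping policy to late-fee policy, and the first time one of them changes the helper grows a flag or a special case.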

The second reason is different though. By creating the abstraction you are not adding a dependency, you are making an implicit dependency explicit. There is a huge difference. In this instance any duplicate code is bad.
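A small Python sketch of code that must be shared, using hypothetical names: signup and login both have to apply the same username rule, or users get locked out. Pulling the rule into one function doesn't create that dependency, it just writes it down.

```python
def normalize_username(raw):
    """Single source of truth: signup and login must agree on this rule."""
    return raw.strip().lower()


def signup(users, raw_name):
    users.add(normalize_username(raw_name))


def can_login(users, raw_name):
    return normalize_username(raw_name) in users


users = set()
signup(users, "  Alice ")
```

If signup and can_login each carried their own copy of the strip-and-lowercase logic, changing one without the other would silently break logins for existing users.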

Abstraction is primarily about separation of concerns, not about avoiding repetition. Drying out code that’s repeated all over isn’t the same as creating a formal abstraction for some element of the overall logic.

historical comments related to the same topic

I’m reminded of ESR’s “curse of the gifted” email, schooling Linus Torvalds on the lack of modularization and code-sharing in Linux’s driver code. :)

http://lwn.net/2000/0824/a/esr-sharing.php3

amusing quotes

Amusing arguments for statelessness from the ØMQ Guide by Pieter Hintjens [1]:

“If there’s one lesson we’ve learned from 30+ years of concurrent programming, it is: just don’t share state. It’s like two drunkards trying to share a beer. It doesn’t matter if they’re good buddies. Sooner or later, they’re going to get into a fight. And the more drunkards you add to the table, the more they fight each other over the beer.”

“Code that wants to scale without limit does it like the Internet does, by sending messages and sharing nothing except a common contempt for broken programming metaphors.”

[1] http://zguide.zeromq.org/page:all#toc45