
Balancing DRY and Readability

by Ross Patterson last modified Nov 25, 2008 10:31 AM

Some complexity belongs in your editor/IDE

I was writing a combination of setuphandlers and functional test fixtures recently when I found myself doing something noteworthy. I was converting both apparatus from a set of loops over content type names and prefixes to a few utility functions with lots of repetitious calls. Why was I violating DRY? Because I couldn't make sense of things! Oh yeah, DRY should never compromise readability. :)

It's definitely possible to take DRY too far, leading to slap-your-forehead moments like this one. I think that setup code, whether for install or for a test fixture, is particularly prone to this. I've both written a lot of setup code and read a lot of it in the wild where a bunch of complex structure is introduced in the name of DRY.

What I ended up doing was writing a small number of utility functions whose call signatures were intuitive and readable. Then I used a bunch of complex "M-x query-replace-regexp" commands in Emacs to convert the complicated structure into simple, readable, flat, but nonetheless very repetitious function calls. Later, when I needed additional setup, I used the same complex editor commands to accomplish the task.
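As a rough illustration of the trade-off (this is not the actual setup code, which isn't shown here), compare a loop-driven fixture with its flat equivalent. The helper `add_content`, the container name, and the type names and ids are all hypothetical stand-ins.

```python
# Hypothetical stand-in for a setup helper: it only records what it
# was asked to create, so the two styles can be compared.
created = []


def add_content(container, type_name, id_):
    """Record one piece of content created in `container`."""
    created.append((container, type_name, id_))


def setup_with_loops():
    """DRY but opaque: the reader has to run the loops in their head."""
    for prefix in ("foo", "bar"):
        for type_name in ("Document", "Folder"):
            add_content("portal", type_name,
                        "%s-%s" % (prefix, type_name.lower()))


def setup_flat():
    """Repetitious but readable: every created object is visible."""
    add_content("portal", "Document", "foo-document")
    add_content("portal", "Folder", "foo-folder")
    add_content("portal", "Document", "bar-document")
    add_content("portal", "Folder", "bar-folder")
```

Both functions create the same content in the same order. The flat one is what a regexp replace in the editor would generate, and it is the one you can read without simulating anything.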

It occurred to me that what I was doing was taking unreadable complexity out of code where I, let alone someone else, couldn't remember or read what was going on, and I was moving it into my tools. I use my tools every day, so I remember how to manage the complexity there, and I'm not condemning others to learn that complexity when I put it there. I think this makes a simple guideline for balancing DRY with readability: some complexity belongs in your tool usage, not in the code.


Amen!

Posted by http://palladion.com/ at Nov 25, 2008 09:13 PM
DRY is actually an anti-goal for me in some cases:

 - Shared code leads to fragile / expensive / bad unit
   tests[1]

 - The craze for "convention over configuration" puzzles
   the hell out of me: separating policy from mechanism
   is a fundamental cornerstone of building reusable
   software[2], but the "conventional" crowd weld them
   together at the hip.

[1] http://palladion.com/[…]/unit_testing_notes-20080724

[2] http://en.wikipedia.org/[…]/Separation_of_mechanism_and_policy

Amen!

Posted by Ross Patterson at Nov 26, 2008 08:57 AM
Heh, thanks for calling out the "convention over configuration" craze. That's something else I've been wanting to blog about. I don't get it either.

We still agree that explicit is better than implicit magic, right? The idea of code that I never see running around and matching things based on names gives me the willies.

As for having to change things in three different places every time
you make a new whatever, sure sometimes it's annoying when I forget
one and my tests remind me. Just as annoying as the interpreter
telling me my syntax is incorrect. That doesn't make it a bad thing.

I don't see how you get separation of concerns, modular systems, and
good domain specialization *without* having to hook multiple things
together. Isn't that the whole point? What is a system that hides
the multiple points of integration doing for me except helping me to
*not* think about things I probably want to understand and be mindful
of?

Then there's the death-to-XML camp. I don't get that either, but
honestly I don't care. If you give me configuration syntax in Python,
I'm still going to put my configuration in different files because I
want constant reinforcement of the separation between concerns.

Personally, I think XML is crap, which is what makes it a great
configuration language. It keeps you from doing more than you should.
Have you ever marveled at ConfigParser syntax? It's pretty awful: no
consistency, extremely limited support for rich types, etc. That's
what makes it a great configuration language: you only have to glance
at it to understand it because no one would ever use it for anything
complex.
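A minimal sketch of that limitation, using the standard library parser (spelled `configparser` in Python 3). The section and option names are made up for illustration.

```python
import configparser

# Hypothetical configuration; the section and option names are invented.
SAMPLE = """
[server]
host = localhost
port = 8080
debug = yes
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Every value comes back as a plain string...
port_raw = parser.get("server", "port")

# ...unless you explicitly ask for one of the few built-in conversions.
port = parser.getint("server", "port")
debug = parser.getboolean("server", "debug")
```

There are no lists, no nesting, and no expressions, so there is little room to do anything clever in the file.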

Anywho, this is turning into a blog post in itself, so I'll stop here
and think about making an actual post of this later. :)

Amen!

Posted by http://faassen.myopenid.com/ at Nov 26, 2008 01:09 PM
Explicit is not always better than implicit. As is much in programming, it's a matter of trade-offs.

Sometimes convenience beats purity. A fairly innocent example of this is Python's automatic casting of integers to floats whenever you do a calculation that involves both. On the other hand, letting your language add a number to a string and trying to do something sensible (like trying to convert the string to the number) tends to be rather surprising, and failing early would be better.
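Both behaviors can be seen in a few lines of Python. This is only a sketch of the two cases described above, and the variable names are arbitrary.

```python
# Mixing an int and a float: Python quietly converts the int to a float.
total = 1 + 2.5

# Mixing a string and a number: Python refuses to guess and fails early.
try:
    result = "1" + 2
except TypeError as exc:
    result = None
    error_message = str(exc)
```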

We're programmers. We thrive on automation and abstraction. When the automation is bad and the rules are confusing or not well thought out, we call this automation magic and don't like it anymore. That doesn't mean automation or rules are bad.

Having a set of rules on how things get configured can create uniformity that helps readability of code. A good example of this is Python's indentation policy. You can configure which lines of code are part of a single block of code by using indentation. By this measure, you replace freedom to indent with a simple rule where the indentation of your code determines function. Two positive side effects from the indentation convention:

* less extraneous markers in the code where indentation is enough to show block structure anyway.

* Python code written by different authors looks more uniform.

You may of course call this "not really configuration" and I won't debate you. It *is* a convention. Incidentally I think it is interesting to consider whether configuration has levels of abstraction like code does.

We agree that having to change things in three different places can be annoying. So I think you can agree that if indeed you can reduce the amount of places where you have to change things without reducing comprehensibility or flexibility, that's a good thing.

Now let's look at what a system like Grok, which uses convention over configuration, really is doing for you. You need to explicitly signal the system that you want the magic: you actually need to subclass from particular classes to make things that are configured together. It employs some rules to figure out what gets hooked up to what, but if it cannot guess, the configuration system will fail loudly, just like when you try to add a string to an integer in Python.
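A toy sketch of that idea follows. It is not Grok's actual API or implementation: the base classes, the `context` attribute, and the `configure` function are hypothetical, and the only rule modeled is "a view belongs to the single model among the given classes unless it says otherwise".

```python
class ConfigurationError(Exception):
    """Raised when the conventions cannot determine a unique answer."""


class Model:
    """Hypothetical base class: subclassing opts a class in as a model."""


class View:
    """Hypothetical base class: subclassing opts a class in as a view."""
    context = None  # optional explicit directive that overrides the rule


def configure(classes):
    """Map each view to a model by convention, or fail loudly."""
    models = [c for c in classes if issubclass(c, Model) and c is not Model]
    registry = {}
    for cls in classes:
        if not (issubclass(cls, View) and cls is not View):
            continue
        if cls.context is not None:
            registry[cls] = cls.context  # an explicit directive wins
        elif len(models) == 1:
            registry[cls] = models[0]    # convention: the only model
        else:
            raise ConfigurationError(
                "cannot guess the context for %s: %d candidate models"
                % (cls.__name__, len(models)))
    return registry


class Document(Model):
    pass


class Folder(Model):
    pass


class Index(View):
    pass


class FolderListing(View):
    context = Folder
```

With these definitions, `configure([Document, Index])` hooks `Index` up to `Document` without any directive, while `configure([Document, Folder, Index])` raises `ConfigurationError` instead of guessing.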

In my experience, Grok's convention over configuration doesn't take away much power from a system where the configuration is done with ZCML: you retain the ability to override configuration elsewhere - the various mechanisms such as skins and ZCML overrides remain available.

Grok isn't typically *hiding* the points of integration - it's just making them much more succinct. The configuration being nearer to the code makes it a lot easier to read without having to have 2 or 3 files open all the time. You *can't* move your configuration syntax into a separate file because where this configuration syntax is (grok directives, base classes) determines what gets configured in the first place.

"Convention over configuration" is phrased a bit misleadingly, I think. I'd prefer "configuration by convention". "convention" is a bit scary too. You could say "rule-based configuration by code-embedded directives with sensible defaults so you can leave them out" but that gets rather unwieldy. That is what Grok's doing though, which probably means it has a more sophisticated approach towards convention over configuration than most do. This was thanks to having the explicit configuration system available so we could actually think about automating it.

My blog is currently down, but I think I'll turn this into a blog post eventually too. :)

Amen!

Posted by Ross Patterson at Nov 28, 2008 11:39 AM
"""
Explicit is not always better than implicit. As is much in programming, it's a matter of trade-offs.
"""

Of course, definitely moderation in all things. For my money, configuration is a very important part of any modular or component-based system. As such, using some sort of implicit magic for configuration isn't a pragmatic moderation of the notion that explicit is better than implicit; it's an outright exception.

"""
Sometimes convenience beats purity. A fairly innocent example of this is Python's automatic casting of integers to floats whenever you do a calculation that involves both. On the other hand, letting your language add a number to a string and trying to do something sensible (like trying to convert the string to the number) tends to be rather surprising, and failing early would be better.
"""

Comparing such dynamic typing in Python to configuration through conventions seems like comparing apples to oranges. In the case of dynamic typing, it's not just convenience: it makes natural sense and easily becomes something I don't have to think about at all. In the case of configuration through convention, I'm configuring components I've built for purposes that only I know, and so the configuration is something I have to think about no matter what. This is why I prefer explicit configuration to configuration through convention: the latter doesn't save me from thinking about anything I could otherwise skip, and it encourages me to forget about a vital part of the design.

"""
Having a set of rules on how things get configured can create uniformity that helps readability of code. A good example of this is Python's indentation policy. You can configure which lines of code are part of a single block of code by using indentation. By this measure, you replace freedom to indent with a simple rule where the indentation of your code determines function. Two positive side effects from the indentation convention:

* less extraneous markers in the code where indentation is enough to show block structure anyway.

* Python code written by different authors looks more uniform.

You may of course call this "not really configuration" and I won't debate you. It *is* a convention. Incidentally I think it is interesting to consider whether configuration has levels of abstraction like code does.
"""

Oh, I'm a *huge* fan of convention. I'm even a big fan of convention *in* configuration, just not configuration *through* convention. Like implicitness, convention is valuable where it saves me thinking about something I don't need to think about. Configuration is very rarely something I don't need to think about.

"""
We agree that having to change things in three different places can be annoying. So I think you can agree that if indeed you can reduce the amount of places where you have to change things without reducing comprehensibility or flexibility, that's a good thing.
"""

Actually, my point was that it's just as annoying as the compiler reporting syntax errors. Even if the compiler were psychic enough to correctly interpret my flawed syntax, the next developer to read my code may not be. IOW, not everything annoying is wrong or bad. There needs to be one more requirement for a convenience to pass muster in addition to preserving comprehensibility and flexibility: it needs to not hide from me vital parts of the system that I as a developer must be responsible for. Configuration is something the developer must most often be responsible for.