Sunday, January 17, 2010

Beauty and the beast (on elegance versus performance)

When you program, you often have to choose between code elegance (clean code) and code performance.

It starts with the programming language you choose. Pick a low-abstraction language, like C, and you can get a lot of performance. You could go even further and write in assembler, for even more performance. But as a consequence the code will be large and hard to read. The program will tend to focus on the specifics of the machine or language (how to do it) instead of on the specifics of the problem to be solved (what to do). In other words, it will be low in abstractions. On the other hand, if you pick a higher-abstraction language, such as an OOP or FP language, your program will be more succinct and readable, but you may pay a performance penalty. This time your program will be rich in abstractions, but it may suffer in performance.

Even within one programming language you still have to choose between elegance and performance. If you use the simplest and cleanest solution you can come up with, you may get poorer performance than with a more complex algorithm, which in turn is harder to understand. And if you generalize the solution to cover a whole family of problems, in anticipation of program growth, you may run into performance problems as well.
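To make this concrete, here is a small sketch in plain Java. It is not taken from the program described in this post, and the class and method names are made up for illustration. It computes Fibonacci numbers twice: once in the clean way that mirrors the definition, and once in a faster way that is a bit harder to read.

```java
class Fibonacci {
    // Elegant: reads like the mathematical definition,
    // but the number of calls grows exponentially with n.
    static long fibSimple(int n) {
        return n < 2 ? n : fibSimple(n - 1) + fibSimple(n - 2);
    }

    // Faster: a single loop, linear in n, but the definition
    // is now hidden behind bookkeeping variables.
    static long fibFast(int n) {
        long previous = 0;
        long current = 1;
        for (int i = 0; i < n; i++) {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return previous;
    }

    public static void main(String[] args) {
        System.out.println(fibSimple(20));
        System.out.println(fibFast(20));
    }
}
```

Both methods return the same values, but the simple one becomes painfully slow for inputs around 45 or 50, while the loop stays fast.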

I faced this issue recently. I tried to write a program that was as simple and readable as possible. I used Groovy. I used a lot of reflection; the language itself also relies heavily on reflection. I tried to generalize each aspect to cover as many cases as I could think of at the time. I used a lot of dynamic attributes and collections instead of static ones. The program was very domain-centric, and easy to read and understand (I think).
Note: I don't want to seem to be bragging; there were also plenty of hacks and shortcuts, to solve a problem in minutes instead of spending hours doing it the right way.

It all went fine until I tried to use large input data. It took forever to process the data. All the reflection, dynamic features and generalizations came back to bite me, and had a dramatic effect on performance. I tried to improve some critical points of the program, the hot spots, but the overall performance was still poor.
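The kind of cost I mean can be sketched like this. It is plain Java rather than my actual Groovy code, and all the names (`ReflectionCost`, `Point`, `readDynamic`, `readStatic`) are hypothetical. One method reads a property generically through reflection, the other through a direct, compiler-checked call. The timing loop is only a crude illustration, not a proper benchmark, and the JIT may narrow the gap.

```java
import java.lang.reflect.Method;

class ReflectionCost {
    static class Point {
        private final int x;
        Point(int x) { this.x = x; }
        public int getX() { return x; }
    }

    // General: works for any object and any public zero-argument getter,
    // but the method is looked up by name on every call.
    static Object readDynamic(Object target, String getterName) {
        try {
            Method getter = target.getClass().getMethod(getterName);
            return getter.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read " + getterName, e);
        }
    }

    // Specific: works only for Point, but it is a plain method call.
    static int readStatic(Point point) {
        return point.getX();
    }

    public static void main(String[] args) {
        Point point = new Point(7);
        int iterations = 1_000_000;
        long sum = 0;

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += (Integer) readDynamic(point, "getX");
        }
        long dynamicNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += readStatic(point);
        }
        long staticNanos = System.nanoTime() - start;

        System.out.println("reflective: " + dynamicNanos / 1_000_000 + " ms");
        System.out.println("direct:     " + staticNanos / 1_000_000 + " ms");
        System.out.println("checksum:   " + sum);
    }
}
```

The reflective version is the more general one, which is exactly why it is tempting. Multiply that small per-call overhead by a large input and by every layer of generality, and it adds up.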

So in the end you always have to pick: code elegance versus performance. Sometimes the domain will force you to choose one, because of its restrictions. But you will always have to balance these two aspects. If you favor performance, you will have a hard time maintaining the code in the long run. If you favor code elegance, you may have performance issues, like I had. It's a fine balance, and you have to get it right.

What's your opinion: Code elegance or performance? Beauty or the beast?