Monday, February 23, 2009

Make it Maintainable and Readable First

You're probably aware of the Donald Knuth quote "Premature optimization is the root of all evil (or at least most of it) in programming." I reviewed some code recently that reminded me of that quote and why making code readable and maintainable is the first order of business. The code I reviewed reversed the usual order of a null reference check -

if (null != instance) { ...}

I don't like this code because it uses a non-standard idiom and forces me to stop and read the condition from right to left to understand it. I've run across this idiom once before, and the best reason I have gotten for using it is that it is supposed to be faster. Well, I can't imagine why it would be faster, and even if it was, there are certainly better performance optimizations that don't affect code readability. Being curious, I decided to test this optimization and wrote a little Java test program.

public class NullTest {
    public static long testNullFirstAgainstNonNull(int count, boolean display) {
        long start = System.currentTimeMillis();
        Object obj = getObject();
        int x = 0;

        for (int i = 0; i < count; i++) {
            if (null == obj) {
                x++;
            }
        }

        long end = System.currentTimeMillis();
        if (display) System.out.println("if (null == [non-null]) -- " + (end - start));
        return x;
    }

    public static int testNullFirstAgainstNull(int count, boolean display) {
        long start = System.currentTimeMillis();
        Object obj = getNullObject();
        int x = 0;

        for (int i = 0; i < count; i++) {
            if (null != obj) {
                x++;
            }
        }

        long end = System.currentTimeMillis();
        if (display) System.out.println("if (null != [null]) -- " + (end - start));
        return x;
    }

    public static int testNullLastAgainstNonNull(int count, boolean display) {
        long start = System.currentTimeMillis();
        Object obj = getObject();
        int x = 0;

        for (int i = 0; i < count; i++) {
            if (obj == null) {
                x++;
            }
        }

        long end = System.currentTimeMillis();
        if (display) System.out.println("if ([non-null] == null) -- " + (end - start));
        return x;
    }

    public static int testNullLastAgainstNull(int count, boolean display) {
        long start = System.currentTimeMillis();
        Object obj = getNullObject();
        int x = 0;

        for (int i = 0; i < count; i++) {
            if (obj != null) {
                x++;
            }
        }

        long end = System.currentTimeMillis();
        if (display) System.out.println("if ([null] != null) -- " + (end - start));
        return x;
    }

    private static Object getObject() {
        return new Object();
    }

    private static Object getNullObject() {
        return null;
    }

    public static void main(String[] args) {
        int count = 100000;
        testNullFirstAgainstNonNull(count, false);
        testNullFirstAgainstNull(count, false);
        testNullLastAgainstNonNull(count, false);
        testNullLastAgainstNull(count, false);
        testNullFirstAgainstNonNull(count, false);
        testNullFirstAgainstNull(count, false);
        testNullLastAgainstNonNull(count, false);
        testNullLastAgainstNull(count, false);

        count = 100000000;
        System.out.println(" Results from " + count + " executions.");
        System.out.println("============================================================");
        testNullFirstAgainstNonNull(count, true);
        testNullFirstAgainstNull(count, true);
        testNullLastAgainstNonNull(count, true);
        testNullLastAgainstNull(count, true);
        System.out.println();
        testNullFirstAgainstNonNull(count, true);
        testNullFirstAgainstNull(count, true);
        testNullLastAgainstNonNull(count, true);
        testNullLastAgainstNull(count, true);
        System.out.println();
        testNullFirstAgainstNonNull(count, true);
        testNullFirstAgainstNull(count, true);
        testNullLastAgainstNonNull(count, true);
        testNullLastAgainstNull(count, true);
        System.out.println();
        testNullFirstAgainstNonNull(count, true);
        testNullFirstAgainstNull(count, true);
        testNullLastAgainstNonNull(count, true);
        testNullLastAgainstNull(count, true);
        System.out.println();
        testNullFirstAgainstNonNull(count, true);
        testNullFirstAgainstNull(count, true);
        testNullLastAgainstNonNull(count, true);
        testNullLastAgainstNull(count, true);
        System.out.println();
    }
}

The test tries to prime execution and warm up the JVM by running each test before displaying any results. The usual caveats about micro-benchmarks, etc. apply to the results below. I ran this under two Sun JVMs, 1.4 and 6, with similar results that show no performance benefit and just reinforce for me the importance of writing code that is readable, maintainable, and understandable.

$ /c/j2sdk1.4.2_17/bin/java -version
java version "1.4.2_17"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_17-b06)
Java HotSpot(TM) Client VM (build 1.4.2_17-b06, mixed mode)

$ /c/j2sdk1.4.2_17/bin/java NullTest
Results from 100000000 executions.
============================================================
if (null == [non-null]) -- 141
if (null != [null]) -- 156
if ([non-null] == null) -- 141
if ([null] != null) -- 172

if (null == [non-null]) -- 141
if (null != [null]) -- 156
if ([non-null] == null) -- 140
if ([null] != null) -- 172

if (null == [non-null]) -- 141
if (null != [null]) -- 172
if ([non-null] == null) -- 140
if ([null] != null) -- 157

if (null == [non-null]) -- 140
if (null != [null]) -- 172
if ([non-null] == null) -- 141
if ([null] != null) -- 156

if (null == [non-null]) -- 141
if (null != [null]) -- 172
if ([non-null] == null) -- 140
if ([null] != null) -- 172


$ java -version
java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) Client VM (build 10.0-b23, mixed mode, sharing)

$ java NullTest
Results from 100000000 executions.
============================================================
if (null == [non-null]) -- 62
if (null != [null]) -- 63
if ([non-null] == null) -- 62
if ([null] != null) -- 63

if (null == [non-null]) -- 62
if (null != [null]) -- 78
if ([non-null] == null) -- 63
if ([null] != null) -- 62

if (null == [non-null]) -- 63
if (null != [null]) -- 62
if ([non-null] == null) -- 63
if ([null] != null) -- 62

if (null == [non-null]) -- 63
if (null != [null]) -- 62
if ([non-null] == null) -- 63
if ([null] != null) -- 62

if (null == [non-null]) -- 63
if (null != [null]) -- 62
if ([non-null] == null) -- 63
if ([null] != null) -- 62



Saturday, July 26, 2008

Just Say No to Default Configuration

What I'm talking about is when an application sets its configurable properties to default values because the configuration for the environment the application is intended to run in cannot be found. Default configuration can be the result of specific application logic or a feature of the configuration framework or mechanism used. How this usually plays out is that some generic set of values, often the ones used in the "dev" environment, is used instead of the values for a specific environment like "qa", "stage", or "prod". Sometimes a debug or info message or, if you are lucky, a warning message will be logged. But just as often there is no direct evidence that the desired configuration has not been used.

At first blush having a default configuration to fall back on seems like a good idea. It can simplify development, especially unit testing, because things just work. There is no need to understand or be bothered with the details about configuration before you've even started coding. Where this will come back to bite you, and I guarantee that sooner or later it will, is when someone forgets to configure your application before a new production deployment. Everything looks great, the smoke tests pass, and everyone is happy. At least they are happy until, for some unknown reason, the pricing for widget XYZ is wrong or the customer ABC just seems to have disappeared. Since you turned off info and debug logging in production, there is nothing in your logs to indicate a problem. Now you get to spend some time frantically looking through code for the change, that doesn't exist, that is causing the problems.

Using default configuration violates the principle of failing fast. When there is a problem in your application, the application should fail as soon as possible and it should be crystal clear why it failed. You don't want to detect a problem through side effects. When there is a serious error, and what could be more serious than incorrect configuration, I want to see a big red flag waving or at least a big loud error message in the logs! I want my application to fail before it can start up and cause even greater problems by reading or writing the wrong database or processing messages from the wrong message queue.
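To make the idea concrete, here is a minimal sketch of fail-fast configuration loading in Java. The class and property names are hypothetical, invented just for illustration.

```java
import java.util.Properties;

public class AppConfig {
    private final String dbUrl;

    public AppConfig(Properties props) {
        // Fail immediately at startup instead of silently falling back
        // to a default value.
        this.dbUrl = require(props, "db.url");
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                "Missing required configuration property: " + key);
        }
        return value;
    }

    public String getDbUrl() {
        return dbUrl;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        try {
            new AppConfig(props); // no db.url set, so startup must fail
            System.out.println("BUG: startup should have failed");
        } catch (IllegalStateException expected) {
            System.out.println("Startup failed fast: " + expected.getMessage());
        }
    }
}
```

The application refuses to start with incomplete configuration instead of limping along with dev values and failing later through side effects.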

Thursday, July 24, 2008

Bonehead Exception Handling

I found this bit of Java exception handling code in a major open source project -
 
try {
    ....
} catch (ClassNotFoundException e) {
    throw new IntegrationException("Rule Set jar is not correctly formed.");
} catch (IOException e) {
    throw new IntegrationException("Rule Set jar is not correctly formed.");
};

The specifics of the code in the try block are unimportant; what matters is how the exceptions thrown by that code are handled. Besides being completely amateurish, the exception handling is wrong in at least three ways.

First, it completely loses the original exception. Whatever context the original exception had is just gone. Since Java 1.4 it has been possible to include the root cause as a chained exception. Before Java 1.4 the programmer could at least have appended the original error message.

Second, it collapses two completely different types of errors into one. Great, so now I get an error that could have two very different causes and no way to figure out the actual cause.

Third, the error message says nothing useful. There is absolutely nothing in it that helps me understand or fix the problem. What does "not correctly formed" mean, and what in the world does being or not being correctly formed have to do with a ClassNotFoundException or an IOException?

I suppose I should take some solace in the fact that at least it reports an error. But frankly this code is only slightly better than just swallowing the errors.
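For contrast, here is a sketch of how the same handler could preserve context. IntegrationException here is a stand-in for the project's exception class, and loadRuleSetJar is a hypothetical method that fails just to exercise the handler; the point is the distinct message per error type and the chained cause.

```java
import java.io.IOException;

public class RuleSetLoader {
    // Stand-in for the project's exception type, assumed to support
    // cause chaining (available on Throwable since Java 1.4).
    static class IntegrationException extends RuntimeException {
        IntegrationException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical loader that fails, just to exercise the handler.
    static void loadRuleSetJar() throws IOException {
        throw new IOException("rules.jar (No such file or directory)");
    }

    public static void main(String[] args) {
        try {
            try {
                loadRuleSetJar();
            } catch (IOException e) {
                // A specific message for this error type, and the original
                // exception is kept as the cause for the full stack trace.
                throw new IntegrationException(
                    "Rule Set jar could not be read: " + e.getMessage(), e);
            }
        } catch (IntegrationException e) {
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```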

Tuesday, March 25, 2008

10 Principles for Developer Testing

1) Code a Little, Test a Little

Test driven development is all the rage and is a great way to do "code a little, test a little". But it is important to remember that it is the results, not how you achieve them, that matter most. As long as test code is written concurrently with implementation code, it doesn't matter if you follow strict test driven development practices, because when the tests are done you won't be able to tell what method was followed. The only wrong way to write test code is after the implementation code is done.

2) Use Tests to Improve the Design

Testability is an excellent indicator of design quality. If your code is hard to test then the design needs to be fixed. Look for warning signs and use the tests to guide the design. Hard-to-test code includes singletons, static calls, and methods with too many parameters. Warning signs of a design that needs help include making private methods public for testing, complicated test setup, excessive test assertions, and mocking concrete classes.

3) Write as Much Test Code as Implementation Code

On average you should expect to write at least as many lines of test code as implementation code.

4) Test the Right Thing

It's a unit test, so it should be testing one unit. Since most of us are doing object oriented development, the unit is a class. Make sure you are only testing the class under test and not some piece of infrastructure such as a database or web server. Make sure you are only testing the class under test and not other classes it may depend on; the dependent classes should already have been well tested by their own unit tests. Make sure you test the things that are likely to break, and that usually does not include things like simple getters and setters.

5) Test the Error Handling

You need to test more than the expected paths through the code. Make sure to test what the code will do under error conditions. Is it catching the right exceptions? Does it handle errors gracefully and clean up things like database connections?
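A minimal, framework-free sketch of testing an error path follows. The connection-like resource and the class under test are hypothetical; the test verifies both that the error surfaces and that cleanup still happens.

```java
public class ErrorHandlingTest {
    // Hypothetical stand-in for a real database connection.
    static class FakeConnection {
        boolean closed = false;
        void query() { throw new RuntimeException("db down"); }
        void close() { closed = true; }
    }

    // Class under test: must close the connection even when the query fails.
    static class Repository {
        String load(FakeConnection conn) {
            try {
                conn.query();
                return "data";
            } finally {
                conn.close();
            }
        }
    }

    public static void main(String[] args) {
        FakeConnection conn = new FakeConnection();
        boolean threw = false;
        try {
            new Repository().load(conn);
        } catch (RuntimeException expected) {
            threw = true;
        }
        // Assert both the failure and the cleanup.
        if (!threw) throw new AssertionError("expected a RuntimeException");
        if (!conn.closed) throw new AssertionError("connection was not closed");
        System.out.println("error path test passed");
    }
}
```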

6) Always Write a Test Before Fixing a Bug

Whenever you find a bug, write a test that fails because of the bug before you try to fix it. If you don't have a test before you attempt to fix the bug, how exactly will you know the bug is gone? Adding a test for every bug will also help ensure that the same bug does not creep back into the system later.

7) Mock Objects are Your Friends

If you are not using mock objects you need to start. There are numerous excellent and mature open source mock object frameworks for almost any language. The benefits of mock objects are too great to ignore. Mock objects let you write truly isolated unit tests, speed up your unit tests, and test things such as error handling that would probably be impossible to automate without mocks.
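To show the idea without committing to any particular framework, here is a hand-rolled mock in plain Java. The interface and class names are made up; the mock stands in for a slow, external dependency and records how it was used so the test can verify the interaction.

```java
public class MockExample {
    // The dependency's interface (hypothetical).
    interface MailSender {
        void send(String to, String body);
    }

    // Class under test.
    static class Notifier {
        private final MailSender sender;
        Notifier(MailSender sender) { this.sender = sender; }
        void notifyUser(String user) {
            sender.send(user, "Your order shipped.");
        }
    }

    public static void main(String[] args) {
        // The mock records the interaction instead of sending real mail.
        final String[] recorded = new String[2];
        MailSender mock = new MailSender() {
            public void send(String to, String body) {
                recorded[0] = to;
                recorded[1] = body;
            }
        };

        new Notifier(mock).notifyUser("alice@example.com");

        if (!"alice@example.com".equals(recorded[0]))
            throw new AssertionError("wrong recipient: " + recorded[0]);
        System.out.println("mock verified: " + recorded[0]);
    }
}
```

A mock framework does the same thing with far less boilerplate, but the test stays isolated and fast either way.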

8) Use a Code Coverage Analyzer

Test coverage analyzers are wonderful tools that can help find inadequately tested code. But always keep in mind that test coverage is about the journey, not the destination. Your goal is to have better tested code, not necessarily 100% code coverage. There are many excellent open source and commercial coverage tools available, so there is no reason not to use one.

9) Don't be Fooled by High Code Coverage Numbers

Depending on the metric used it is possible to have 100% coverage with garbage unit tests. Statement coverage is the poorest way to measure coverage because it only says which lines were executed and says nothing about which branches or paths in the code were tested. Branch coverage tells you which branches in a control structure such as an if or switch statement were executed, so branch coverage combined with statement coverage is pretty good. A better metric is path coverage, which will tell you which execution paths through the code were covered.
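A small, hypothetical example of why statement coverage can mislead: calling the method below with 10 alone executes every statement (100% statement coverage), but the false branch of the if is never taken, so the no-discount case goes untested.

```java
public class CoverageExample {
    static int discount(int quantity) {
        int percent = 0;
        if (quantity > 5) {
            percent = 10;
        }
        return percent;
    }

    public static void main(String[] args) {
        System.out.println(discount(10)); // alone, this covers every statement
        System.out.println(discount(1));  // needed for full branch coverage
    }
}
```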

10) Know Your Cyclomatic Complexity

Cyclomatic complexity is a measure of how many execution paths there are through a piece of code. A method or function with no control flow, such as an if statement, will have a complexity of one. The complexity goes up for each execution branch, such as an if or else statement. Cyclomatic complexity is a good way to get a feel for how many tests a method needs to cover all execution paths through it.
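As a sketch, the hypothetical method below has a cyclomatic complexity of three (one base path plus two decision points), which suggests three tests, one per execution path:

```java
public class ComplexityExample {
    static String classify(int n) {
        if (n < 0) {          // decision point 1
            return "negative";
        }
        if (n == 0) {         // decision point 2
            return "zero";
        }
        return "positive";
    }

    public static void main(String[] args) {
        // One test per execution path.
        System.out.println(classify(-1));
        System.out.println(classify(0));
        System.out.println(classify(7));
    }
}
```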

Tuesday, February 26, 2008

Behavior Driven Development is not Domain Driven Design?

I had been aware of behavior driven development for a while, but while researching my previous post on Test Driven Development I came across something that really had me scratching my head. The general view of BDD is that it is an evolution or refinement of TDD, with the goal of making TDD more natural and easier to understand. A big part of BDD is a change in mindset from thinking about testing to thinking about behavior and class design. To this end BDD has behavior classes, not test cases. BDD also has methods that specify behavior, with names that clearly describe the expected behavior and begin with "should" instead of "test". All this is well and good and I expect will make things easier to understand, especially for developers new to agile development.

What has me more than a little bothered is the linking of BDD with "domain driven design" and the "ubiquitous language" as described by Eric Evans in his excellent book. The Behaviour-Driven Development web site and Wikipedia entry read as if BDD is somehow a logical extension of domain driven design. I read everything I could find and saw nothing to back up the claim. Maybe the site is incomplete, or perhaps it just shows the weakness of implementing the Behaviour-Driven Development site as a wiki. So far I am not convinced that BDD has the depth to accomplish domain driven design. From Domain-Driven Design -
"Domain-driven design is not a technology or a methodology. It is a way of thinking and a set of priorities, aimed at accelerating software projects that have to deal with complicated domains."
Stories will not produce the rich domain model that is needed. Here is the template for a story -

Title (one line describing the story)

Narrative:
As a [role]
I want [feature]
So that [benefit]

Acceptance Criteria: (presented as Scenarios)
This is just a variation on the agile user story and is still too simple, too low level, and too narrow in focus to result in deep domain knowledge. BDD may give you a well crafted set of classes that implement a feature or story. But how will those classes look in the context of the domain? Will the result be superficial or insightful? Will the object connections and interactions be deep and rich, full of meaning, or shallow and empty?

Will the developers following BDD understand the vision of the domain model? Will they understand that changing the code changes the model and that changing the model requires changing the code? It's an issue of the old saying about not seeing the forest for the trees. How do you see the big picture when the focus is so tight on a story, a feature, and the classes that implement them?

"Getting the Words Right" and method names that start with "should" and read like sentences won't guide anyone toward a ubiquitous language for the domain. Getting the language right requires a thorough understanding the the problem domain and that does not come from coding user stories.

Don't get me wrong, I'm a big believer in agile development and unit testing. I understand that TDD and BDD are as much about design as they are about testing. I've been using JUnit since something like 1999. I know how much better unit tests make my code and how much they can improve the design of the classes being tested. But I don't use tests for domain modeling.

References


DanNorth.net » Introducing BDD
Beyond Test Driven Development: Behaviour Driven Development (Google Video)
In pursuit of code quality: Adventures in behavior-driven development
Test Early » Using BDD to drive development