Are Tests Specs?

I've presented TDD concepts many times to many different audiences. Usually I can answer questions about coding, but sometimes I'm thinking about something else. I tend to think better in front of a keyboard and some code, and sometimes I can't visualize what an audience member is asking. That happened to me last week, when an audience member said you couldn't effectively test blah-blah-blah and have that test work as a "readable specification," because blah-blah-blah.

At the time, the best I could think of to say was that sometimes there are things that just aren't effective to test in TDD. That's true. Some algorithms take way too long to execute with any set of data. Sometimes there are concepts that just don't lend themselves well to being coded as "specs by example." But these examples are rare.

I can always write unit tests that are named well and read well from start to finish. By reading both the test name and the statements that comprise it, for a set of tests, I have a comprehensive understanding of what the class under test is capable of doing. Further, I have examples that show me how to work with the class under test. And if all its unit tests are passing (which they should pretty much always do), I know that the example code will actually work.

In retrospect, I figured out what the audience member was asking. It was as simple as exponentiation (somehow I heard it as something more complex at the time--my failure). The argument was that it would just be simpler to write a single comment that says, "this method raises x to the power y."

That's a deceptive example. The one-line comment isn't a specification, it's a summary description. If you didn't already know what "raising x to the power y" really meant, that comment would be useless to you. But we all think we know what exponentiation is about. So using exponentiation as an example sounds like a good argument against using tests to express specifications. Seemingly, it's simpler to just provide a short comment.

In fact, I doubt most people could recite all the specifics required to completely document exponentiation. Here they are, from Sun's own javadoc for the `Math.pow` function.

Returns the value of the first argument raised to the power of the second argument. Special cases:

* If the second argument is positive or negative zero, then the result is 1.0.
* If the second argument is 1.0, then the result is the same as the first argument.
* If the second argument is NaN, then the result is NaN.
* If the first argument is NaN and the second argument is nonzero, then the result is NaN.
* If
    o the absolute value of the first argument is greater than 1 and the second argument is positive infinity, or
    o the absolute value of the first argument is less than 1 and the second argument is negative infinity,
  then the result is positive infinity.
* If
    o the absolute value of the first argument is greater than 1 and the second argument is negative infinity, or
    o the absolute value of the first argument is less than 1 and the second argument is positive infinity,
  then the result is positive zero.
* If the absolute value of the first argument equals 1 and the second argument is infinite, then the result is NaN.
* If
    o the first argument is positive zero and the second argument is greater than zero, or
    o the first argument is positive infinity and the second argument is less than zero,
  then the result is positive zero.
* If
    o the first argument is positive zero and the second argument is less than zero, or
    o the first argument is positive infinity and the second argument is greater than zero,
  then the result is positive infinity.
* If
    o the first argument is negative zero and the second argument is greater than zero but not a finite odd integer, or
    o the first argument is negative infinity and the second argument is less than zero but not a finite odd integer,
  then the result is positive zero.
* If
    o the first argument is negative zero and the second argument is a positive finite odd integer, or
    o the first argument is negative infinity and the second argument is a negative finite odd integer,
  then the result is negative zero.
* If
    o the first argument is negative zero and the second argument is less than zero but not a finite odd integer, or
    o the first argument is negative infinity and the second argument is greater than zero but not a finite odd integer,
  then the result is positive infinity.
* If
    o the first argument is negative zero and the second argument is a negative finite odd integer, or
    o the first argument is negative infinity and the second argument is a positive finite odd integer,
  then the result is negative infinity.
* If the first argument is finite and less than zero
    o if the second argument is a finite even integer, the result is equal to the result of raising the absolute value of the first argument to the power of the second argument
    o if the second argument is a finite odd integer, the result is equal to the negative of the result of raising the absolute value of the first argument to the power of the second argument
    o if the second argument is finite and not an integer, then the result is NaN.
* If both arguments are integers, then the result is exactly equal to the mathematical result of raising the first argument to the power of the second argument if that result can in fact be represented exactly as a double value.

(In the foregoing descriptions, a floating-point value is considered to be an integer if and only if it is finite and a fixed point of the method ceil or, equivalently, a fixed point of the method floor. A value is a fixed point of a one-argument method if and only if the result of applying the method to the value is equal to the value.)

A result must be within 1 ulp of the correctly rounded result. Results must be semi-monotonic.

Parameters: a - the base. b - the exponent.
Returns: the value a^b.

All that blather, and it's still a poor specification! Why? Because it doesn't define what it means to "raise an argument to the power of a second argument." You have to already know what that means. It's like defining a word by using that word itself in the definition.
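To get a feel for how much the quoted special cases pack in, a handful of them can be spelled out as executable checks against the real `Math.pow`. This is only a sketch: the class and method names are made up for illustration, and it covers four of the cases, not the whole list.

```java
public class PowSpecialCases {

    // "If the second argument is positive or negative zero, then the result is 1.0."
    // This holds even when the base is NaN.
    static boolean zeroExponentGivesOne() {
        return Math.pow(5.0, 0.0) == 1.0
            && Math.pow(5.0, -0.0) == 1.0
            && Math.pow(Double.NaN, 0.0) == 1.0;
    }

    // "If the absolute value of the first argument equals 1 and the second
    // argument is infinite, then the result is NaN."
    static boolean unitBaseToInfinityIsNaN() {
        return Double.isNaN(Math.pow(1.0, Double.POSITIVE_INFINITY))
            && Double.isNaN(Math.pow(-1.0, Double.NEGATIVE_INFINITY));
    }

    // Negative zero raised to a positive finite odd integer is negative zero.
    // Double.compare distinguishes -0.0 from 0.0, which == does not.
    static boolean negativeZeroToOddPowerIsNegativeZero() {
        return Double.compare(Math.pow(-0.0, 3.0), -0.0) == 0;
    }

    // A finite negative base with a finite non-integer exponent gives NaN.
    static boolean negativeBaseToFractionIsNaN() {
        return Double.isNaN(Math.pow(-8.0, 1.0 / 3.0));
    }

    public static void main(String[] args) {
        System.out.println(zeroExponentGivesOne());
        System.out.println(unitBaseToInfinityIsNaN());
        System.out.println(negativeZeroToOddPowerIsNegativeZero());
        System.out.println(negativeBaseToFractionIsNaN());
    }
}
```

Each check is an example of a rule, not a definition of it, which is the same trade-off the rest of this post is about.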

In most code we write, we're not encapsulating a simple math API call or a single call to some already known quantity. We're building new classes and methods that each do very different, very unique things that we probably can't guess from a glib one-line description. The javadoc for `Math.pow` should really say something like: "returns the identity element, 1, multiplied by the base, as many times as indicated by the exponent." That's a mathematically correct definition, at least for non-negative integer exponents (and not counting the exceptional cases).

So I took a few minutes and built a test class that I think acts as a readable specification for how exponentiation works. I chose to support exponentiation for integers, not floating point numbers. As such, I chose to also omit support for negative exponents. Otherwise the return value would need to be either a float or a fractional abstraction. I didn't feel like dealing with that--yet. (Want to see it? Let me know.)

Here are the tests:

    package util;

    import junit.framework.*;

    public class MathUtilTest extends TestCase {
        static final int LARGE_NUMBER = 10000000;

        public void testSquares() {
            for (int i = 1; i < 10; i++)
                assertEquals(i + " squared:", i * i, MathUtil.power(i, 2));
        }

        public void testCubes() {
            for (int i = 1; i < 10; i++)
                assertEquals(i + " cubed:", i * i * i, MathUtil.power(i, 3));
        }

        public void testLargerExponents() {
            assertEquals(16, MathUtil.power(2, 4));
            assertEquals(256, MathUtil.power(2, 8));
            assertEquals(65536, MathUtil.power(2, 16));
        }

        public void testNegativeBases() {
            assertEquals(-2, MathUtil.power(-2, 1));
            assertEquals(4, MathUtil.power(-2, 2));
            assertEquals(-8, MathUtil.power(-2, 3));
            assertEquals(16, MathUtil.power(-2, 4));
        }

        public void testAnythingRaisedToZeroIsAlwaysOne() {
            assertEquals(1, MathUtil.power(-2, 0));
            assertEquals(1, MathUtil.power(-1, 0));
            assertEquals(1, MathUtil.power(0, 0));
            assertEquals(1, MathUtil.power(1, 0));
            assertEquals(1, MathUtil.power(2, 0));
            assertEquals(1, MathUtil.power(LARGE_NUMBER, 0));
        }

        public void testZeroRaisedToAnyPositiveIsAlwaysZero() {
            assertEquals(0, MathUtil.power(0, 1));
            assertEquals(0, MathUtil.power(0, 2));
            assertEquals(0, MathUtil.power(0, LARGE_NUMBER));
        }

        public void testOneRaisedToAnythingIsAlwaysOne() {
            assertEquals(1, MathUtil.power(1, 1));
            assertEquals(1, MathUtil.power(1, 2));
            assertEquals(1, MathUtil.power(1, LARGE_NUMBER));
        }

        public void testNegativeZeroExponentIsOne() {
            assertEquals(1, MathUtil.power(1, -0));
        }

        public void testNegativeExponentsUnsupported() {
            try {
                MathUtil.power(1, -1);
                fail("should not be supported");
            }
            catch (UnsupportedOperationException expected) {
            }
        }

        public void testOverflow() {
            try {
                MathUtil.power(3, 100);
                fail("expected overflow");
            }
            catch (IntegerOverflowException expected) {
            }
        }
    }

(The class IntegerOverflowException is an empty subclass of RuntimeException.)

Here's the code for the resulting power function:

    package util;

    public class MathUtil {
        public static int power(int base, int exponent) {
            if (exponent < 0)
                throw new UnsupportedOperationException();
            if (exponent == 0)
                return 1;

            // accumulate in a long so that int overflow can be detected
            long result = 1;
            for (int i = 0; i < exponent; i++) {
                result *= base;
                // check both bounds: negative bases can overflow downward too
                if (result > Integer.MAX_VALUE || result < Integer.MIN_VALUE)
                    throw new IntegerOverflowException();
            }
            return (int)result;
        }
    }

I built the production code incrementally, in accordance with each new bit of unit test code I wrote.

The tests certainly are not exhaustive. They *are* good enough (a) to give me confidence that the code works, and (b) to describe what exponentiation is all about. I'm sure there's room for improvement in these tests, in how they read and in their completeness.
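One possible addition, sketched here under assumptions, is a check that states the definition itself: anything to the zero power is 1, and each further power is one more multiplication by the base. The class below is hypothetical, and its `power` method is a stand-in (built on `Math.multiplyExact`, Java 8 or later) so that the sketch compiles on its own. It is not the `MathUtil` implementation shown above.

```java
public class PowerRecurrenceSketch {

    // Stand-in for MathUtil.power, included only so this sketch is self-contained.
    static int power(int base, int exponent) {
        if (exponent < 0)
            throw new UnsupportedOperationException();
        int result = 1;
        for (int i = 0; i < exponent; i++)
            result = Math.multiplyExact(result, base); // throws ArithmeticException on overflow
        return result;
    }

    // Checks power(b, 0) == 1 and power(b, n + 1) == b * power(b, n).
    // Callers should pick bounds small enough that no int overflow can occur.
    static boolean recurrenceHolds(int maxAbsBase, int maxExponent) {
        for (int b = -maxAbsBase; b <= maxAbsBase; b++) {
            if (power(b, 0) != 1)
                return false;
            for (int n = 0; n < maxExponent; n++)
                if (power(b, n + 1) != b * power(b, n))
                    return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // bases -5..5, exponents up to 10: 5^10 fits comfortably in an int
        System.out.println(recurrenceHolds(5, 10));
    }
}
```

A check like this reads less like an example and more like a rule, so it would complement the example-based tests above, not replace them.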

Still, these tests have a clear advantage over javadoc: they don't require the reader to interpret a lot of English gobbledygook. Simple code examples tell me far more about what I really want to know. The examples that these tests present tell me how to use the power function, and about the results it produces. That's most of what I need. It's also what most of the real programmers I know would prefer.

Having said all that, more frequently I encounter tests that don't do such a good job of explaining themselves. They contain lots of magic numbers, their names don't tell me what's important, the tests contain complex logic, they run on for several screens, and so on. Ultimately I have to do a lot of reading between the lines in order to figure out what the test is all about.

Are tests specs? Yes, they can be, although in a small number of cases it's probably better to just write a brief English specification. But I'm doing test-driven development anyway, for other benefits that include fewer defects and better design. If I'm going to expend that effort, why shouldn't I also strive to make the tests as useful as they can be?

I often imagine that someone offers me a choice between two systems. The first system has profuse comments, lots of paper specifications, and exhaustive UML models. But it has no tests, or maybe some poorly written tests. This first system is typical of most of the systems out there (except that most systems don't even have good models).

The second system I can choose from has no such documentation. It contains no source code comments, no paper specifications, and no UML models. But it does have comprehensive, well-written unit tests (it was probably built using TDD). I'll take the latter any day, with glee.

Comments:

I once listened to a lecture from a famous late mathematician. He introduced the "proof" that he was going to show to us with the words

"I will show you that the result holds for 3. Then you'll see that it holds for all choices of 3"

This was funny. It was also deep, in that a fully formal proof with "n" in place of 3 would have been more correct, but less clear. And compelling belief in the truth of the theorem is what a proof is all about.

So this is what I would have answered in your place: "I'll test for 0, then for 1, 2, then for 3, and add a comment saying that it is expected to work for all n > 3," or something like that. TDD is not about "trying to break" production code. It's (also) about communicating and understanding. When I see that the production code does not depend on "3" being the input, I'll be confident that 4 and 4000 also work.

Excellent post! I have started doing this wholeheartedly by citing actual test code in the documentation. This gives readers both a narrative (where I can give background on and justifications for the API's design), and hard proof that what I am saying is what the system actually does. And I like the fact that writing both the docs and these use-case tests first forces me to really think things through from the user's point of view.

Another blog poster suggested these tests were just as long as the written spec. Maybe they are. That's missing the point. The idea is that it's possible to use examples to demonstrate the spec--not that the test code could somehow magically compress the amount of specification detail. And yes, it's not truly a specification, but it is far easier for developers to figure out.

>>Math.pow should really say something like: "returns the identity element, 1, multiplied by the base, as many times as indicated by the exponent."

That's actually not correct. E.g. pow(2, 2.5) is not an exceptional case, and you can't multiply something two and a half times.

Regards
