What Exactly Are You Trying To Prove?

Nov 16, 2007 tdd methodologies suggest edit

Frans Bouma wrote an interesting response to my last post, Writing Testable Code Is About Managing Complexity entitled Correctness Provability should be the goal, not Testability.

He states in his post:

When focusing on testability, one can fall into the trap of believing that the tests prove that your code is correct.

God I hope not. Perhaps someone in theory could fall into that trap, but a person could also fall into the trap and buy a modestly priced bridge I have to sell to them in the bay area? This seems like a straw man fallacy to me.

Certainly no major TDD proponent has ever stated that testing provides proof that your code is correct. That would be outlandish.

Instead, what you often hear testing proponents talk about is confidence. For example, in my post Unit Tests cost More To Write I make the following point (emphasis added):

They reduce the true cost of software development by promoting cleaner code with less bugs. They reduce the TCO by documenting how code works and by serving as regression tests, giving maintainers more confidence to make changes to the system.

Frans goes on to say (emphasis mine)…

Proving code to be correct isn’t easy, but it should be your main focus when writing solid software. Your first step should be to prove that your algorithms are correct. If an algorithm fails to be correct, you can save yourself the trouble typing the executable form of it (the code representing the algorithm) as it will never result in solid correct software.

Before I address this, let me tell you a short story from my past. I promise it’ll be brief.

When I was a young bright eyed bushy tailed math major in college, I took a fantastic class called Differential Equations that covered equations which describe continuous phenomena in one or more dimension.

During the section on partial differential equations, we wracked our brains going through crazy mental gymnastics in order to find an explicit formula that solved a set of equations with multiple independent variables. With these techniques, it seemed like we could solve anything. Until of course, near the end of the semester when the cruel joke was finally revealed.

The sets of equations we solved were heavily contrived examples. As difficult as they were to solve, it turns out that only the most trivial sets of differential equations can be solved by an explicit formula. All that mental gymnastics we were doing up until that point was essentially just mental masturbation. Real world phenomena is hardly ever described by sets of equations that lined up so nicely.

Instead, mathematicians use techniques like Numerical Analysis (the Monte Carlo Method is one classic example) to attempt to find approximate solutions with reasonable error bounds.

Disillusioned, I never ended up taking Numerical Analysis (the next class in the series), choosing to try my hand at studying stochastic processes as well as number theory at that point.

The point of this story is that trying to prove the correctness of computer programs is a lot like trying to solve a set of partial differential equations. It works great on small trivial programs, but is incredibly hard and costly on anything resembling a real world software system.

Not only that, what exactly are you trying to prove?

In mathematics, a mathematician will take a set of axioms, a postulate, and then spend years converting caffeine into a long beard (whether you are male or female) and little scribbles on paper (which mathematicians call equations) that hopefully result in a proof that the postulate is true. At that point, the postulate becomes a theorem.

The key here is that the postulate is an unambiguous specification of a truth you wish to prove. To prove the correctness of code, you need to know exactly what correct behavior is for the code, i.e. a complete and unambiguous specification for what the code should do. So tell me dear reader, when was the last time you received an unambiguous fully detailed specification of an application?

If I ever received such a thing, I would simply execute that sucker, because the only unambiguous complete spec for what an application does is code. Even then, you have to ask, how do you prove that the specification does what the customers want?

This is why proving code should not be your main focus, unless, maybe, you write code for the Space Shuttle.

Like differential equations, it’s too costly to explicitly prove code in all but the most trivial cases. If you are an algorithms developer writing the next sort algorithm, perhaps it is worth your time to prove your code because that cost is amortized over the life of such a small reusable unit of code. You have to look at your situation and see if the cost is worth it.

For large real world data driven applications, proving code correctness is just not reasonable because it calls for an extremely costly specification process, whereas tests are very easy to specify and cheap to write and maintain.

This is somewhat more obvious with an example. Suppose I asked you to write a program that could break a CAPTCHA. Writing the program is very time consuming and difficult. But first, before you write the program, what if I asked you to write some tests for the program you will write. That’s trivially easy, isn’t it? You just feed in some CAPTCHA images and then check that the program spits out the correct value. How do you know your tests are correct? You apply the red-green-refactor cycle along with the principle of triangulation. ;)

As we see, testing is easy. So how do you prove its correctness? Is it as easy as testing it?

As I said before, testing doesn’t give you a proof of correctness, but like the approaches of numerical analysis, it can give you an approximate proof with reasonable error bounds, aka, a confidence factor. The more tests, the smaller the error bounds and the better your confidence. This is a way better use of your time than trying to prove everything you write.

Found a typo or mistake in the post? suggest edit

Comments

42 responses

Frans Bouma • November 15th, 2007
I said:
- prove your algorithms are correct. You as a Math major must know that this is doable
- check your implementation of the algoritms by proving you did that correctly.
The point of the article is that focusing on testability is actually not really what you should do: you should focus on provability, and therefore setup your architecture around that instead of making compromises to meet testability goals, simply because testability is a bonus when focusing on provability AND take separation of concerns as a given.
Also, if writing tests isn't meant to prove that your software is correct, why write them in the first place? To see if it FAILS, but even then: if you then fix it, it doesn't mean it's indeed not failing.
If you think a lot of people who write unit-tests have the firm believe that their tests don't prove their software works, you're wrong: a LOT of people think that. They also think that achieving 100% code coverage is actually something worth having.
Frans Bouma • November 15th, 2007
Also, your CAPTCHA example illustrates why focusing on testability is useless: so you are able to write 100 unittests to see if your code works. It still proves nothing if your code can break the captcha in a random image I throw at it. That's why I said: "the set of tests prove that your code is correct in the solution space defined by the tests." You might say "I get more confidence with every test which succeeds", but that's just telling yourself you're doing great without any real reason to base that that on. If your software ends up in a mission critical application, you can't say "I'm pretty confident", you have to say "I KNOW it will work, BECAUSE...".
Take a library which performs o/r mapping features for you. You can write a million unittests for it, and it still can break with the easiest query you can throw at it, simply because the scope of the thing is so incredibly wide. That's thus not going to work, even with 100% code coverage. Tests certainly help, I'm not denying that, they are a useful tool and we use them a lot. The thing is though that by NOT focusing on testability but on provability, you have AND testability (due to the separation of concerns as I discussed as well) AND you can prove your algorithms, the foundation of your software works.
Santiago • November 15th, 2007
You may want to take a look at Spec# maybe, which is an attempt to automatically prove code correctness by adding invariants, pre and post conditions (which *are* an unambiguous spec much simpler than code itself).
However, it doesn't prove the whole correctness of the code, just the properties you specify that you want to check. The more annotations you add to the code, the more properties the automatic demonstrator can assure.
http://research.microsoft.c...
There is a similar attempt for C also, but I have never tried it.
Josh Stodola • November 15th, 2007
Ugh. I swore to myself I would never see the word postulate again upon graduation.
norapinephrine • November 15th, 2007
Unit tests cannot validate the correctness of an application, however, they can definitely be used to validate the algorithmic correctness , reliability, and robustness of a small "chunk" of code. It is system tests that verify the correctness of an application as a whole by exposing weaknesses in everything from the requirements definitions through to software performance.
In response to the comments above, I personally have not met any developers who think that unit tests validate correctness. Rather, I have had the misfortune of working with developers that are so focussed on application correctness that they skip unit testing entirely and rely on system testers to validate their code. This allows some spurious bugs such as memory leaks, address space violations, and interruption handler errors to slip by undetected until the product has landed in the customers' hands.
Scott • November 15th, 2007
Unit test, by definition, NEVER test for correctness. "correctness" is such a broad term. It not only involves unit testing, but also integration testing, functional testing, and finally acceptance testing. Because we all know that the app doesn't do what it is supposed to until the client SAYS that it does everything that it's supposed to. Two of those aspects, functional and acceptance testing, are (or should be) independent of the architecture. (sorry, we're talking CS here. Should I have said "orthogonal"? :) )
I think that some TDD proponents oversell the benefits of TDD and that leads to some misunderstandings.
Justin Etheredge • November 15th, 2007
I was going to put a comment on here, but it ended up so long I just posted it on my blog. If you get the urge please check it out. My blog is linked from this post.
Troy DeMonbreun • November 16th, 2007
I, for one, have met and read (via blogs, mostly) of many silver-bullet seeking developers who think that unit tests validate "code correctness". That may not be true in your particular developer microcosm, but for the multitudes of developers not closely or semi-closely networked with MS, ALT.NET, MVPs, and other high-profile/top-10% talent, I see them mistakenly groking TDD and 100% code coverage as strong proof of a correct application, regardless of the major TDD proponents' intentions or even direct statements to the contrary.
Haacked • November 16th, 2007
- prove your algorithms are correct. You as a Math major must know that this is doable
- check your implementation of the algoritms by proving you did that correctly

It is doable in trivial cases, but such algorithms are a tiny percent of a real application. Imagine trying to prove the correctness of SharePoint.
Not only that, let's say you can prove every algorithm and the composition of algorithms in an application. Your user wants to make a change. Now what? You have to re-prove every area of the code touched by the change? Do you have a proof that only those areas are touched by the change? I'm sure you separated concerns, but did you prove that your concerns are separated well enough? Can you prove all that in a reasonable amount of time? What about decisions made via external systems like a database or a service call?

If you think a lot of people who write unit-tests have the firm believe that their tests don't prove their software works, you're wrong: a LOT of people think that.

And a lot of people send money to spammers in Nigeria. I don't think many developers of reasonable sanity and intelligence believes that. It's still a straw man argument to me.
Feel free to prove me wrong. Can you point me to someone who [preaches|teaches|is an authority] on testing that says that? If so, let's censure them because they're nuts! ;)
Tests (and I'm not limiting myself to unit tests, let's include Integration and System Tests) provide an approximation that an application works.
Even with proofs, you have to make sure your spec covers everything. The Tacoma Narrows bridge collapse is a great example. They most certainly had engineering "proofs" that the bridge would manage the stress. But their "spec" didn't take into consideration harmonic oscillation caused by wind, so the proof wasn't enough.
So while I agree with you that Code Coverage is not enough (as I've written before), I'd say proving correctness is not enough because you can't prove the correctness of the spec in meeting customer needs to begin with.
Chris • November 16th, 2007
I think the dude on the show NUMB3RS would argue that even if the algorithm is wrong, he could discern your intent just be how you wrote the incorrect code, and in doing so, provide you with the code you meant to write. ;)
Robz • November 16th, 2007
Great post. I agree that those that think tests prove their code works are very wrong because you can only approach correctness. 100% code coverage doesn't mean your code is all correctly tested either.
I could believe that there are probably people (although hopefully very few) that do believe that their tests prove their code is right. *shudder*

PS. What's with all of the funny looking pictures? Is this something new for Subtext for those that do not have Gravatars?
Christopher Steen • November 16th, 2007
ASP.NET Posted Updated Samples from My Microsoft ASP.NET Connections Las Vegas Sessions [Via: plitwin...
Haacked • November 16th, 2007
@Robz - They are identicons.
Stephen • November 16th, 2007
I thought TDD was more about the Design and less about the Testing. The fact you get a suite of regression tests was a bonus but not the actual point.
The act of performing TDD makes (nudges, helps, whatever) you write better designed code. Code you can't test tends to be pretty poor code whereas code you can test tends to be better (but not necessarily perfect obviously).
John Stoker • November 16th, 2007
Proving correctness is one of a number of techniques that can be brought to bear. Like testing, it can still be useful even if it is incomplete. But unlike testing, it deals with infinite numbers of cases.

Unit testing advocates often take testing as a religion and a substitute for adequate and ongoing design, to the extent that they cannot accept that techniques that take into account an infinite number of cases are inherently more complete than techniques that only cover a finite number of cases.

Every argument I read above about how hard proving correctness supposedly is applies doubly for unit testing, and those who claim that testing is trivial must be doing trivial testing that is useless in all but the most trivial apps.

The argument of redesigning so you can test better also applies to proving correctness, but, once again, it deals with infinite numbers of cases instead of finite numbers of cases.

Another technique frequently rejected by testing zealots that, unlike testing, deals with infinite possible numbers of cases is assertions, and the list goes on.

Testing has its place, but it is not a legitimate replacement for good ongoing design, assertions, and other techniques that prove (even if incompletely, not as incompletely as testing) that portions of the application are correct.

For the ueberdweebs from extreme programming who claim that designing by test is the right way -- if it passes all the tests it works by definition -- I will write a simple implementation generator that examines the test and derives the implementation from that, covering exactly and only the finite inputs of the test, and claim that this is inherently the simplest most-easily-refactorable and maintainable implementation.
Marc • November 16th, 2007
There have been great strides in proving code correctness and major advances in tools to support doing that. So much so that it's become feasible to prove the correctness of not-so-trivial programs, and still be productive when developing it. Here are some final project stats from the aforementioned link:
"The TIS system’s key statistics are:
- lines of code: 9939
- total effort (days): 260
- productivity (lines of code per day, overall): 38
- defects found post delivery per 1000 lines of code: zero"
Haacked • November 16th, 2007
I will write a simple implementation generator that examines the test and derives the implementation from that, covering exactly and only the finite inputs of the test

That is certainly a way to explicitly find the points in a test suite that fail, but hopefully, in the real world, when you're writing code and tests, you don't work against yourself in such a manner. Of course you'd be working on making sure as you write the code that it is correct.
Also, this ignores the fact that your inputs to the tests generally represent equivalence classes. For example, when you write a method that performs some calculation on an int. You don't have to pass in every possible int to test the calculation thoroughly. You would pass in a set of values that represent equivalence classes. For example:
0 - //often finds divide by 0
1
int.MinValue //Tests for overflow handling
int.MaxValue //Tests for overflow handling
7 - //some random int where you know the expected outcome
There may be other values you'd test based on the specifics of the calculation.
With five or so test cases, depending on the calculation and how well you understand the code to perform that calculation, you as the developer writing that code now have a huge amount of confidence without having to prove the calculation.
Again, I'm not saying you shouldn't spend time in design. I'm saying that provably correct code is *not* a developer's main focus. It's being able to respond to the needs of the user and writing as correct code as possible in a reasonable time frame as determined by the requirements.
For the Space Shuttle, that probably means provably correct code. For most other apps, provably correct code means you never respond to market shifts. Your app comes out too late.
If provably correct code is a developer's main focus, we should all be writing LISP.
John Stoker • November 17th, 2007
You missed the point. I can easily make a program to pass every fixed test you can possibly write -- even write code that will create the program based upon the tests -- and, except in trivial apps, it will not resemble the functionality it should except at the specific points tested. Tests cannot describe general behaviour of beyond the most trivial finite cases. You write 10000 unit tests and I write a lookup table with those 10000 input sets in it that trivially returns correct results for those cases more easily than read your mind for all the cases in the whole continuum that you did not test or specify the behaviour for. Even a woefully incomplete specification is light years ahead of the inherent incompleteness of tests.

The int function example is completely bogus. You make assumptions about the implementation, which are likely to become false if reimplemented. There is no contract at all for the actual desired behaviour.

This sort of idea of unit testing is programming backwards -- reminds me of the movie Amadeus, where Mozart says they are in a new place where people fart backwards -- but incomplete. While there is merit and purpose for testing, it can never serve the primary role some people say it should, which is bass-ackwards to programming. They are equivalent to the required design specification only in a world of make believe.

You write one set of tests focussed on how one implementation was made, and you have no guarantees whatsoever when it is reimplemented, because you only have those points you chose to test based upon assumptions about the original implementation.

Assertions, like proving things about the behaviour, will also catch a far wider set of problems than fixed tests, because they can specify the behavior across a complete continuum.

Tests are spot checks, nothing more. Spot checks are good tripwires to catch certain types of breakage in certain types of environments. They are a very poor substitute for design, type systems, assertions, memory management, and many other things that all boil down to guaranteeing/proving things about an implementation.

This is the horror you get when you promote testers to be programmers. Give me a programmer any day who can design. Making complex things simple is the hardest thing from a design perspective, although plenty will try to foist off simplistic designs as simple when they are not. Design and mathematical reasoning about program behaviour is worth the effort.. With a good design, I can trivially fix any problems to conform to the design. Without a competent design, there is no fixing the behaviour however many tests you may write.

Without good assertions, you are unlikely to discover that your test failed anyway, let alone catch cases that arise that you didn't test for, etc.

Please write me the unit test for computing quadratic formula to double precision without assuming an implementation. If you assumed the implementation, then you may as well have written the solution correctly to begin with.

A test is good as a sanity check. That is all. If you don't think you need anything more to produce quality code, fine, but don't work for me on any project.
Chung-chieh Shan • November 17th, 2007
"If I ever received [an unambiguous fully detailed specification of an application], I would simply execute that sucker, because the only unambiguous complete spec for what an application does is code."

To sort a list L of integers is to produce a nondecreasing list M of integers such that for any integer x, if x appears exactly n times in L, x also appears exactly n times in M.
Haacked • November 17th, 2007
I can easily make a program to pass every fixed test you can possibly write...

But why would you do that? If you are trying to write correct code *and* you are writing tests for that code, what would be the point in doing that?
I get the point. You might somehow accidentally end up with tests and code that only passes those tests. And the Earth might fall into the sun tomorrow. The odds of that are miniscule when applying real red-green factor testing. That situation would only happen if you're trying to sabotage yourself.
Haacked • November 17th, 2007
If you don't think you need anything more to produce quality code, fine, but don't work for me on any project.

I'm not the only one here missing points. I never said testing was enough to produce quality code. The only point I made is that writing provably correct code should not be the developer's main focus. Code proofs only work for trivial cases, not for real world systems.
pel • November 17th, 2007
I'm from Austin and went to UT when Edsger Dijkstra was a lecturer there.
There was a rumor that floated around, that Dijkstra wrote out a C compiler by hand, on paper, and proved it correct.
That rumor may be just an urban legend, but I wouldn't put it past the guy. Surely if it's possible to prove theories about complex calculus, it should be possible to prove common code pattern implementations, and then build on those theories to prove correct a large software product.
Justin Etheredge • November 18th, 2007
@pel This almost certainly an urban legend. And yes, it *may* be possible to prove something more than just a trivial piece of code, it would almost certainly not be a good use of your time. I could go through the pieces of my refrigerator to see if each part is working, but why not just open the door and see if it is cold?
Greg M • November 18th, 2007
I assert that the properties that make your code conducive to correctness proving are the same properties that make it robust and maintainable. Type systems are correctness-provers, but their value (coverage?) depends on their expressiveness. Fortunately languages, type systems and programming practices are getting better and better.
For all these reasons, I'd say the code I write now can in general be _automatically_ proved correct to far more detailed and expressive specifications than I would have dreamed of proving by hand ten years ago.
chuck • November 18th, 2007
To sort a list L of integers is to produce a nondecreasing list M of integers such that for any integer x, if x appears exactly n times in L, x also appears exactly n times in M.

See, even that isn't a perfect description of what you intend: it allows for elements to appear in M that weren't in L. You can't count on people to come up with a perfect description even of trivial things. :D
Scott • November 18th, 2007

To sort a list L of integers is to produce a nondecreasing list M of integers such that for any integer x, if x appears exactly n times in L, x also appears exactly n times in M.

FizzBuzz!
Frans Bouma • November 18th, 2007
"So while I agree with you that Code Coverage is not enough (as I've written before), I'd say proving correctness is not enough because you can't prove the correctness of the spec in meeting customer needs to begin with."
erm... so what do you deliver to said customer then? An empty zipfile? Or what the customer requested?
I'm not debating requirements engineering results here, mind you, I'm arguing if a piece of code is correct or not. A piece of code is correct if the specs it represents in executable form (!) are equal to said executable form.
Ok, let's hammer you once more and then I'll leave it.
- you've to write a piece of code
- you cook up an algorithm A, and you implement A. A is actually subdivided in smaller algo's, A1...An. You all implement them. As you think algorithms are hard to prove correct (did you ever used quicksort? Why did you assume the code worked?), you don't prove A1...An are correct, why bother, right?
- you write 10,000 tests, they all succeed.
- After a while a customer calls, they have a case where your code fails.
- you write the 10,001st unittest with that case. It indeed fails.
What do you conclude there:
a) A suckes, but your implementation of it is correct
b) A is ok, but your implementation of it sucks
c) all of the above.
?
Don't you realize that if you don't know if A is correct, you're DOOMED to patch your code till oblivion ? Because what if your code is actually a) ? Then the poor sods who take over the project from you (or yourself!) are in deep ****, because it's a matter of time before some other customer calls with yet another case.
I.o.w.: if you don't know IF your algorithms work (for the algorithm-impaired: that's not 'code'), you can never deliver a program which is correct, as you can put a lot of work in writing the code, but it is highly unlikely that that code is correctly functioning software. Scrap-book programming.
Oh and Phil... EVERY programmer out there has deep confidence in his/her own work. So if a programmer comes to me and says "I'm confident it works" I don't give a hoot: I'd be confident too if I wrote it, aren't we all? Even if there are thousands of tests to prove the confidence is 'legit'! ;)
Like I said: proving correctness is hard sometimes (read: a lot of work). But no-one said it was easy: developing software IS hard (read: a lot of work).
I also fail to see what Nasa has to do with it btw. As if your customer isn't important. As if THEY don't pay perhaps hundreds of thousands of dollars to get software their company will RELY on!
Think about that for a change: a lot of software has to be reliable, has to work all the time, no edge cases etc. Do you have 'enough confidence' to say that to your customer, based on a set of unittests? Why not take the safe side and be sure?
Frans Bouma • November 18th, 2007
"@pel This almost certainly an urban legend. And yes, it *may* be possible to prove something more than just a trivial piece of code, it would almost certainly not be a good use of your time. I could go through the pieces of my refrigerator to see if each part is working, but why not just open the door and see if it is cold?"
You declare it an 'urban legend' but you don't know for sure?
Ever WRITTEN a compiler, Justin? It's perfectly provable how for example the LR(n) parsers work, how you should generate the action/goto/shift/reduce tables with the transitive closures you can construct from the grammar.
Ugh, that must be hard, right, proving that a compiler is correct! No. It's not. Aho/Sehti/Ullman proved it in their book 'Compilers', a book every software engineer should have read.
What's left is proving your implementation of the algorithm is correct. If you can't do that, you can never find a single bug in a piece of code as that too requires interpreting code.
Chung-chieh Shan • November 18th, 2007
chuck, I should have defined what it means for something to appear exactly a number of times in a list. In fact, as you might have suspected, I should have written the spec in a formal language. Happily, Yves Bertot and Pierre Casteran already did it for me. Too bad you won't get to it.
Brandon Moore • November 19th, 2007
Sure, specs are fuzzy and change. How do you decide what tests to write? Once you know what you are testing for, you can consider trying to prove it instead.
Then it should be just a simple engineering tradeoff - how much confidence can you get for various work with either approach for various properties. Checking that sort always returns a list when given a list is almost certainly best proved (using a type system). Can you really beat a type annotation with any test suite? Checking that it returns an ordered list? Probably easier to test. (hopefully you have something like QuickCheck).
Your argument that proving things about programming is really weird. You give an anecdote where you learn it's hard to give closed form solutions to partial differential equations, then decline to go on to learn the mathematical techniques that you say people actually use in practice to establish useful properties about the behavior of partial differential equations (e.g, the value at time t is within e of the value computed by this algorithm). How does this have anything to do with proving programs? In good cases, proving a program works isn't much harder than writing comments that explain how it works. In bad cases, don't prove it!
A proof is just a formal explanation of why the code works. The difference comes down to weak tools, or fuzziness in your explanation. So subtract the tools. If you are trying to write solid code, when shouldn't one of your goals be writing your code so you can understand and explain why it works?
Haacked • November 19th, 2007
@Frans - Out of curiosity, do you have a proof that LLBLGen Pro is correct? In other words, can you prove that there are no bugs in the code? What tool/tools did you use to do the proof?
If you didn't do an automated proof, how do you know your proof is correct?
Frans Bouma • November 19th, 2007
Critical parts, e.g. the entity management system internally, the save / load mechanism, the inheritance manager etc. indeed they're build with proved algorithms. Done on paper. It's not that hard really, you just have to use logic to prove the algorithm.
Another easy trick is to use proven algorithms which are proven by others. The world is full of algorithms already designed and proven for tasks used every day. Sedgewick's books are filled to the brim with them.
In logic, it's often easier to prove there's no possible way it will go wrong than it will always go right.
We do have bugs once in a while in our code. In far most of the time, they're in code where we didn't use algorithm proving on or we reviewed the code implementation not well enough so we overlooked some mistakes. As I've written in my Linq to LLBLGen Pro series before, we've been bitten by successful tests+sloppy algorithm usage ourselves a couple of years ago: the save train was broken, the algorithms used had flaws: in complex graphs things could go wrong. I then had enough of it and went back to the core and used a set of proven algorithms, and also re-checked the majority of the core code again: algorithms check, code review if the algorithm is implemented correctly, rinse repeat. We have a low bugrate due to that, I'm sure.
Without the algorithm proving and the reviewing of the code to see if we implemented the algorithm correctly, I would have no confidence if our code was OK to be shipped: I mean: on what ground can I conclude it's good enough otherwise?
We're all human, so we make mistakes. One can make a mistake in the proof of an algorithm, or overlook a mistake in the review of an implementation of it. That's also why we still use unit-tests for the integration testing (generated code + runtime lib + use cases => tests), but we don't rely on them for correctness, as said we have other ways to get correct code. I think it's essential in our line of business that we have a way to get correct code in our product, because others rely their software on our code, and also, unittests are pretty meaningless for a library with an infinite scope and an infinite number of input types/series/graphs etc. I mean: what do they tell you except you didn't break anything according to the tests (but you still might have broken something!)
That's IMHO unacceptable for a library which is a foundation of other software. It might take more time to write the code, but it does pay off.
Try it next time you have to write some code: what algorithm or algorithms are you using? Why these and not others? What makes them a perfect fit? Is the scope of the algorithm the same as the scope of the functionality? What are the states of the algorithm, is there a theoretical state possible that something went wrong? If so, how to get there? If it's not possible to reach that state, it's not possible to get into the state where things went wrong (e.g. quicksort which returns with an unsorted set). Etc. As an illustration of this, a mindgame:
http://en.wikipedia.org/wik...
The logical proof that it's impossible to come up with a reliable algorithm/protocol to solve the problem illustrates how one can prove an algorithms correctness as well or when to expect which state. It's likely you already practise this kind of logic in your work in one way or the other.
Troy DeMonbreun • November 19th, 2007
Back to the argument on how Frans' quote:

When focusing on testability, one can fall into the trap of believing that the tests prove that your code is correct.

...is a "straw man" fallacy.
I noticed via the Wikipedia entry for TDD:

Test-driven development can help to build software better and faster. It offers more than just simple validation of correctness...

(Presumably, at least one of the "major TDD proponents" has recently reviewed the Wikipedia article.)
So, Phil, why do find it so hard to believe that there are a significant percentage of developers out there (uninterested in buying a cheap Bay-area bridge) that believe tests prove code correctness?
Haacked • November 19th, 2007
@Frans - Interesting. I think part of our disagreement lies in the fact that the type of software you write is not the necessarily the same as the software many (if not most) developers write.
For example, when I was a consultant writing large scale business apps, the code we wrote was not very "algorithmic". Certainly pulling data out of a database, going through some manipulations of that data, and then presenting the data is an "algorithm" in the strictest sense of the word (well-defined instructions for completing a task), but the "algorithm" is custom to every situation and not something you can look up in a book.
Most algorithms we used were already "proven" and encapsulated in re-usable components (often 3rd party).
So I never had to prove an "entity management system". We used LLBLGen Pro on that project.
I'd say the typical large software project rarely writes heavily algorithmic code (as a percentage of entire code base). Much of the code pulls these components together and is focused on UI interactions.
Did you prove your UI interactions within the GUI tools for LLBLGen Pro?
The other thing is, proving components of your code is not a proof for the whole system. Because then you have to go one step up and prove the interactions of those components and so on.
In any case, perhaps our disagreement also lies in a misunderstanding. When you said a developer's *main* concern is writing provably correct code, I thought you meant that I, as a developer, should focus on writing code that could be proved by a formal proof. As in, you could right now present me a proof (perhaps automated) that LLBLGEN Pro is correct code, from the UI down to the nitty gritty. Certainly such proofs are possible in small programs written in LISP, but it seemed to me unreasonable (today).
As a developer, I always focus on writing correct code and thinking through the steps of what I write. I can walk you through the code and tell you why it will work. In that regards, I can "prove" my code is correct to a reasonable degree (as a human, my proof might end up being suspect).
Personally, I think TDD is a great process for doing that. It's a mental aid for thinking through the design of the code.
I still don't think provably correct code nor testability is our main concern.
I think Reg Braithwraite said it well.

I am not saying that strong typing or provability, iterative development or BDD, or anything else is necessarily better or worse in this respect. I am just pointing out that the surface, buzzwords for these things—fewer bugs, provability, test coverage, separation of concerns—are not the goal, they are ways and means of achieving the goal, which is the discovery of the true requirements and the delivery of working software which meets the true requirements over the project’s complete span.
Haacked • November 19th, 2007
So, Phil, why do find it so hard to believe that there are a significant percentage of developers out there (uninterested in buying a cheap Bay-area bridge) that believe tests prove code correctness?

Damn you Wikipedia!
Troy DeMonbreun • November 19th, 2007
@Phil: LOL!
joe • November 20th, 2007
Many of these comments appear to be from the 1990's, I thought these arguments were long gone.
No one promised TDD would produce faultless software.
It's blatantly obvious a bad developer can write a million test scripts and still release bad code; rather - all things being equal, the code quality improves with TDD.
"Prove your algorithms" is meaningless in business software - even in financial applications it is through testing that the specification evolves – the spec is almost never finished in a real business application.
A method that encourages you to think about the code you are writing, despite the impending deadline, is useful and will lead to better business applications.
Which leads us back to Phil’s MVC - if this helps with TDD then I look forward to it. Stop wasting time replying to Frans and get it released! :)
While I respect the guys who put NUnitAsp together, this isn’t the answer, perhaps MVC is…
Keyvan Nayyeri • November 20th, 2007
Here is just a following to the recent conversation between Phil Haack and Frans Bouma about the unit
Micah Dylan • November 29th, 2007
Nothing to prove here, move along
you've been HAACKED • December 3rd, 2007
Was My Code Provability Post An Inspiration To Joel?
Jeff Handley • December 4th, 2007
TDD - Proof of Incorrectness (via Haacked)
StevenHarman.net • December 20th, 2007
Anonymous Delegates, Events, and Lambda Fun!