Write Readable Code By Making Its Intentions Clear

Apr 19, 2007 code patterns suggest edit

I don’t think it’s too much of a stretch to say that the hardest part of coding is not writing code, but reading it. As Eric Lippert points out, Reading code is hard.

First off, I agree with you that there are very few people who can read code who cannot write code themselves. It’s not like written or spoken natural languages, where understanding what someone else says does not require understanding why they said it that way.

Hmmm, now why did Eric say that in that particular way?

This in part is why reinventing the wheel is so common (apart from the need to prove you can build a better wheel). It’s easier to write new code than try and understand and use existing code.

It is crucial to try and make your code as easy to read as possible. Strive to be the Dr. Seuss of writing code. Making your code easy to read makes it easier to use.

The basics of readable code include the usual advice of following code conventions, formatting code properly, and choosing good names for methods and variables, among other things. This is all included within Code Complete which should be your software development bible.

Aside from all that, a key tactic to improve code readibility and usability is make your code’s intentions crystal clear.

Oftentimes it’s paying attention to the little things that can really help your code along this path. Let’s look at a few examples.

out vs ref

A while ago I encountered some code that looked something like this contrived example:

int y = 7;
//...
bool success = TrySomething(someParam, ref y);

Ignore the terrible names and focus on the parameters. At a glance, what is your initial expectation of this code regarding its parameter?

When I encountered this code, I assumed that that the y parameter value passed in to this method is important somehow and that the method probably changes the value.

I then took a look at the method (keep in mind this is all extremely simplified from the actual code).

public bool TrySomething(object something, ref int y)
{
  try
  {
    y = resultOfCalculation(something);
  }
  catch(SomeException)
  {
    return false;
  }
  return true;
}

Now this annoyed me. Sure, this method is perfectly valid and will compile. But notice that the value of y is never used. It is immediately assigned to something else.

The intention of this method is not clear. It’s intent is not to ever use the value of y, but to merely set it. But since the method uses the ref keyword, you are required to set the value of the parameter before you call it. You can’t do this:

int y;
bool success = TrySomething(someParam, ref y);

In this case, using the out keyword expresses the intentions much better.

public bool TrySomething(object something, out int y)
{
  try
  {
    y = resultOfCalculation(something);
  }
  catch(SomeException)
  {
    return false;
  }
  return true;
}

It’s a really teeny tiny thing, something you might accuse me of being nitpicky even bringing it up, but anything you can do so that the reader of the code doesn’t have to interrupt her train of thought to figure out the meaning of the code will make your code more readable and the API more usable.

Boolean Arguments vs Enums

Brad Abrams touched upon this one a while ago. Let’s look at an example.

BlogPost p = CreatePost(post, true, false);

What exactly is this code doing? Well it’s obvious it creates a blog post. But what is that true indicate? Hard to say. I better pause, look up the method, and then move on. What a pain!

BlogPost p = CreatePost(post
  , PostStatus.Published, CommentStatus.CommentsDisabled);

In the second case, the intentions of the code is much clearer and there is no interruption for the reader to figure out the context of the true or false as in the first method.

Assigning a Value You Don’t Use

Another common example I’ve seen is where the result of a method is assigned to the value of a variable, but the variable is never used. I think this often happens because some developers falsely believe that if a method returns a value, that value has to be assigned to something.

Let’s look at an example that uses the TrySomething method I wrote earlier.

int y;
bool success = TrySomething(something, out y);
/*success is never used again.*/

Fortunately, Resharper makes this sort of thing stick out like a sore thumb. The problem here is that as a code reader, I’m left wondering if you meant to use the variable and forgot, or if this is an unecessary declaration. Do this instead.

int y;
TrySomething(something, out y);

Again, these are very small things, but they make a big difference. Don’t worry about coming across as anal (you will) because the payout is worth it in the end.

What are some examples that you can think of to make code more readable and usable?

UPDATE: Lesson learned. If you oversimplify your code examples, your main point is lost. Especially on the topic of code readability. Touche! I’ve updated the sample code to better illustrate my point. The comments may be out of synch with what you read here as a result.

UPDATE AGAIN: I found another great blog post about writing concise code that adds a lot to this discussion. It is part of the Fail Fast and Return Early school of thought. Short, concise and readable code - invert your logic and stop nesting already!

Found a typo or mistake in the post? suggest edit

Comments

34 responses

Andy • April 19th, 2007
Which is best for readability?
Option a)
if (value)
{
x=true;
y=true;
z=true;
}
else
x=false;
y=false;
z=false;
}

Option b)
x = y= z = value;
Damien Guard • April 19th, 2007
I find the conditional operator an improvement in readability.
x = (a == b) ? 9 : 11;
Which screams "we will be assigning a value to x conditionally." If you're not interested in how x is assigned your eyes can scan to the next line of code.
Other developers I've met prefer the following:
if (a == b)
x = 9;
else
x = 11;
Which puts the condition first and means you have to read through all lines to be confident of the what this will do. It also has the redundancy of having 'x' twice which means should the second assignment be a different variable you're left wondering if this is intentional.
[)amien
Keyvan Nayyeri • April 19th, 2007
Brad's book (Frame Design Guidelines) is full of similar stuff and great tips.
One of most important tips is around using collection types as parameters and return types in public APIs. For myself it has been a pain to find my way in some APIs where designers haven't provided right collection types.
Christopher B. [christb@MSFT] • April 19th, 2007
I know that everyone in the blogosphere is saying this but...
* Explicitly naming variables, members, etc.
* Single Responsibility Principle
Of course, there are lots of little things, but those two are foremost in my mind lately.
I've been writing a rather complex ActionScript (Flash) app the last few months and I keep asking myself, "How will a developer who comes after me understand this code?" Unfortunately, there's not a comparable ReSharper tool for ActionScript, as readability is often a product of refactoring.
Christopher B. [christb@MSFT] • April 19th, 2007

I know that everyone in the blogosphere is saying this but...
* Explicitly naming variables, members, etc.
* Single Responsibility Principle
Of course, there are lots of little things, but those two are foremost in my mind lately.
I've been writing a rather complex ActionScript (Flash) app the last few months and I keep asking myself, "How will a developer who comes after me understand this code?" Unfortunately, there's not a comparable ReSharper tool for ActionScript, as readability is often a product of refactoring.
Eric Lippert • April 19th, 2007
Good tips all.
Normally C# warns on all variables and fields which are never read, never written, etc. But in the case you mention -- local variable assigned a value but never read -- we suppress the warning on purpose.
This is because there is no good way in the Visual Studio debugger to say "show me the return value of the last function call". Though I would agree were you to sensibly point out that the way to solve this is to fix the debugger, given that it is not fixed we need a solution in C#.
Therefore we allow this assignment without producing a warning, and you can then very easily step through the debugger and see what functions returned what values when.
Ron Evans • April 19th, 2007
Good points here. Reminds me of when I first learned that there were people who could read sheet music like they were listening to the score, not just input to a player piano. Code is written to be read by people, not just executed by compilers or interpreters.
I also have noticed the connection between Dr. Seuss and good software practices:
I Speak For The Code
Florian Krüsch • April 19th, 2007
Consider expressing your intent in a domain specific language :)
Florian Krüsch • April 19th, 2007
Use a domain specific languages...
Haacked • April 19th, 2007
Along with the conditional operator, I love the null coalescing operator.
MyClass instance = SomeMethod() ?? GetDefault();
Aaron Davis • April 19th, 2007
+1 for the null coalescing operator. I had never heard of that, which I guess would make that code less readable for me.
I think that is the main problem with code readability where I work: many of the people here don't understand some of C#'s syntax, (even if it is standard in many languages) nor do they want to learn it. In my first code review here, I was all but told not to use the ternary operator because apparently not everyone knew what it was. If I came across some piece of syntax that I didn't recognize, I would try to look it up or ask a coworker about it, just like I would an API call.
The added readability of "syntactic sugar" is useless unless you know what the sugar does.
Eric Lippert • April 19th, 2007
The first two are great tips. As for the last one, as it turns out there is a good reason why people do this in C#.
I'll blog about it on Monday.
Haacked • April 19th, 2007
Looking forward to it.
Dan Monego • April 19th, 2007
I'm hearing what you mean when you say the bad code is bad, but not so much what you're saying as a fix. The reason I'd say that the first block is hard to read is that you're passing a variable when you should be returning one - that's less like the Dr. Seuss of code, and more like the Yoda of code.
The last example is a similar thing, and you get it happening a lot when people are writing side effect functions that return true on success and false on failure. That makes sense if you're writing in Perl or some other scripting language where you want as few lines as possible, but in something like .NET or Java where you have tons of features for handling exceptions, writing a function that behaves like that is crazy. Where's the try-catch block?
Haacked • April 19th, 2007
Well there are cases where it doesn't make sense to absolutely force an exception in a "failure" case. A good example is this:
Say you have a form field in which a user types in a number. You then need to parse that to an int. The user accidentally typing in a letter is not an "exceptional" case. I would expect it to happen. But doing this throws an exception.
int x = int.Parse(formInput.Text);
Completely unnecessary. Instead, I would use this.
int x; if(!int.TryParse(formInput.Text, out x)) { x = -1; //use a constant in the real world. }
int.TryParse returns true or false. If you don't care whether it succeeds, I would call it like so.
int x; int.TryParse(formInput.Text, out x);
Once again, my examples are too simplified and hide the key point I was trying to make.
Haacked • April 19th, 2007
...The point being that if TryParse was implemented using ref instead of out, its intentions are not as clear.
Scott • April 20th, 2007
Anytime I see a method with a void return type and a ref or out argument I wonder "why not just return the value you want from the method?" My second thought is, "Is that out arg a global variable? What's its scope? Why is it getting passed around like a hot potato?"
With bool x = doSomething(); I wonder "when does whatever is being returned by doSomething() fall out of scope and get GC'ed?" That's an easy example though because it's easily refactored to "if(doSomething()) { ... rest of code ... }. A more insidious one is MyStatusCodeEnum x = doSomething(); ( I see some old C/C++ folks returning status codes from their methods sometimes) or MyStruct x = doSomething(); where x is never used.
Dan Monego • April 20th, 2007
That's a really good example of why out is better than ref.
What I was trying to get across is how bad ass the try-catch form is in terms of readability. For example:
try { x = int.parse(userInput); y = pretendFunctionThatAccessesADatabase(x); } catch(FormatException ex) { user.scold(); } catch(NotFoundInPretendDatabaseException ex) { //Give the user a different error } catch(DataException ex) { //oops! }
This is another use that's too trivial to get a good feel for it, but the idea is that you have all of the straightforward logic in the try block and almost all weird outlying error checking is completely separate, hidden away where people don't have to see it unless it happens.
Kevin Dente • April 20th, 2007
A few of my biggest readability peccadilloes are:
a)abbreviated identifiers
Why do some people treat letters like freaking precious resources to be conserved? GetTmplNm (instead of GetTemplateName), srchOpts instead of searchOptions, nmCont instead of namingContainer. It kills me. Especially in this day and age of intellisense, where you rarely type the full identifier name more than once.
b) long methods
Longer than 10-15 lines and I start getting annoyed. Longer than 1-2 pages and you should be smacked. 2-3 thousand lines (which I've seen) and you should be drawn and quartered.
c) complex composite conditional checks
How many times do you need to read stuff like:
if ((val is string) && ((string)val).Length < 100)) || ((string)val).StartsWith("...") && ((string)val).IndexOf("...") > -1)) ...
It's amazing how often you see stuff like that.
Haacked • April 20th, 2007
@Scott: I redid my code samples to clarify my point.
Regarding: bool x = doSomething();
If you never use x, does it matter when it is GC'd? Usually, you care about the value of x. But there are cases where doSomething() performs the action you are interested in, and you don't care about x.
Haacked • April 20th, 2007
@Dan, I agree. I like the fail fast and return early approach to software development. There's a good blog post on this topic. The author focuses on Java, but the same principles apply to C#.
Short, concise and readable code - invert your logic and stop nesting already!
Jennifer Smith • April 20th, 2007
I have GhostDoc installed - it suggests summary comments based on the method name/parameter values you put in.
My thinking goes: if GhostDoc can't read it then future developers won't have a fun time either.
Rob Conery • April 20th, 2007
Dr. Suess! Man you're gonna change your mind about this in 6 months or so when you're trying to read Fox in Socks!
I still can't make my way through it, and I try to hide it from my kid but she still finds it and demands I read it. I don't know what Suess was trying to say with it and man if someone wrote "Fox in Sox Code" I would really have to freak like a geek and throw up in a cup
(Freak Geek in a Cup-thrower-upper)
DotNetKicks.com • April 20th, 2007
You've been kicked (a good thing) - Trackback from DotNetKicks.com
Scott • April 20th, 2007
"If you never use x, does it matter when it is GC'd?"
Only if x is a reference type that contains some references to resources (db connections, other ref types, etc...) that could tie up system resources. Maybe the called method always expected you to manually close a .connection object on x, or commit a transaction.
I've seen some executeSQL type statements that would return objects with properties that the calling method was expected to verify. If they verified, then the calling method had to call a "commit" method on the "x" type object.
That was some bad code.
Haacked • April 20th, 2007
Ah, good point. I should probably qualify my statement that it applies to non-disposable types.
Steven R • April 21st, 2007
I agree that the enum does make the code more readable but what happens when you are expecting a true (an enum with a value of 1) or false (an enum with a value of 2) value and someone sends your method a maybe (an enum with a value of 3)?
Wouldn't it make more sense if the variables fit together nicely to refactor it to a parameter object?
Haacked • April 21st, 2007
I assume you mean passing in a value that is not a defined enum value. For example:
enum MyEnum { Yes = 1 ,No = 2 } void SomeMethod(MyEnum arg) { } //This is legal SomeMethod((MyEnum)3);

Well the recommended guideline (and I subscribe to this) is that if at the time you define SomeMethod and it expects an enum with 2 possible values, but someone passes avalue that is out of range, throw an argument exception.
For example:
void SomeMethod(MyEnum arg) { if(MyEnum.Yes > arg || arg > MyEnum.No) throw new ArgumentException("arg", "Message"); }
Manuel Klimek • April 21st, 2007
Structure is only the first step to readable code. After that you still have to figure out the right level of abstraction. (See my thoughts on this in my article on abstraction)
Blog-HowTo.com • April 21st, 2007
The Top SEO Rule Blogger Need to Know
//engtech • April 22nd, 2007
Tags: 300, Funny, GoogleReader, Wordpress, amazon, amiga, analysis, ant, articles, attention, blog, blogger, blogging, blogs, blogsearch, business, career, cartoon, cat, cellphone, code, coding, comic, community, compsci, content, contests, copyright, ...
Ian Griffiths • April 25th, 2007
Of course your 'improved' TrySomething example would be more compelling if it actually compiled...
As it stands, it fails to compile, because there's a code path in which the function returns without ever assigning a value to 'y'. That's illegal for an 'out' parameter. (It's the case where SomeException gets thrown.)
Out parameters aren't really a good fit with methods that can fail but which don't throw exceptions when they do so.
Haacked • April 25th, 2007
Doh! Good point. I suck at writing sample code in a blog editor.
Jolyon Smith • July 8th, 2008
There's a problem with your analysis of the first example (or perhaps with the simplification of it, but since you have since (LONG since) updated those, that shouldn't be the case any more - but either way...)
public bool TrySomething(object something, ref int y)
{
try
{
y = resultOfCalculation(something);
}
catch(SomeException)
{
return false;
}
return true;
}
Is not guaranteed to set the y param. If, as would appear to be not only possible but expected, an exception occurs in resultOfCalculation, then y will not be set - if the caller has not initialised it then the value of y is potentially undefined.
So, as implemented (to expect and silently handle all exceptions) the y param absolutely should be required to be initialised before calling this routine.

2nd example: Instead of having to pause and figure out what TRUE/FALSE means in your alternate approach the reader now has to wonder what:
PostStatus.Published
CommentStatus.CommentsDisabled
are. Are these functions or properties of the two Status objects? Where does one find these objects? What do these properties/functions yield?
Oh by all means, looking at the parameter declarations will help figure out that they are enum values? But, those same declarations should help answer the "What does TRUE/FALSE mean?" question too. If we're to be forced to use the param declarations to interpret the param values, shouldn't the param values be as simple as possible to avoid the possibility of further confusion?
If I were to do anything in this case it might be to illuminate the param name as a comment:
BlogPost p = CreatePost(post, (*Published*)true,
(*Comments*)false);
Just as revealing, arguably clearer and certainly less verbose.
Possibly dangerous too of course. What if the params were reversed at some future point for some bizarre reason?
So where there are multiple such params, I'd be tempted to collapse them all into a "flags" param, accepting an enum set. Rigidly safe and furthermore easily extendable without impacting on existing code.
The only thing is - I use Delphi, in which set types are very simple, but aiui, C# is somewhat lacking in this fundamental area.
So perhaps the real lesson is, if you want to produce readable code, use a language that intrinsically promotes it without taking away any of the power you need, instead of one that you have to work hard at to produce barely readable code for even the simplest things.
;)