open source

A while back I wrote a riveting 3-part developer’s guide to copyright law and open source licensing for developers.

I’m pretty sure you read every word on the edge of your seat. Who doesn’t love reading about laws, licenses, and copyright!?

Seriously though, I hope some of you found it useful. In this post, I want to talk about some recent developments that should make it easier for developers to license their code.

Choosealicense.com

A couple days ago I published a blog post on the GitHub blog about an effort I’ve been involved with, http://choosealicense.com/. Per the about page:

GitHub wants to help developers choose a license for their source code.

If you already know what you’re doing and have a license you prefer to use, that’s great! We’re not here to change your mind. But if you are bewildered by the large number of OSS license choices, maybe we can help.

I’m interested in helping developers be explicit and clear about their intent for their code. Adding a LICENSE (or an UNLICENSE if that’s your thing) file to the root of your public repository is a good way to state your intent. We even include an option if you really do want to retain all rights to your code, you grinch (I kid! I do not judge.)

But before you can choose a license, you need to be informed about what the license entails. That’s what we hope the site helps with.

Combined with the site, GitHub now has a feature that lets you choose a license when creating a repository on GitHub.

AddALicense.com

That’s great! But what about all your existing projects? Well one of my co-workers, Garen Torikian, has you covered. He built http://addalicense.com/ as a little side project. Note that the project is full of disclaimers:

This site is **not owned by or affiliated with GitHub**. But I work there, and I’m using the API to add each new license file. You’ll be asked to authenticate this app for your public repositories on the next page.

Perhaps in the future, we may integrate this into http://choosealicense.com/.

But in the meanwhile check it out and get those projects licensed!

company culture

A finely honed bullshit detector is a benefit to everyone. Let’s try a hypothetical conversation to test yours!

“Hey, we should release that under a more permissive license for XYZ reasons.”

“We’d like to, but the lawyers won’t let us.”

If it’s not malfunctioning, you should feel your bullshit detector tingling right now.

Yep, it’s a bull. Photo by Graeme Law CC BY 2.0

A lot of folks think that a lawyer’s job is to protect the business at all costs – that their job is to say “no!” Unfortunately, many places do structure it that way. After all, if a lawyer says “go ahead” and you get sued, the lawyer loses. But if the lawyer says “don’t”, there’s no immediate downside for the lawyer. Eventually the business may collapse from inaction, but there’s always teaching at law school as a backup. So why would the lawyer ever say “yes” in such a situation?

One of the best lessons I learned while at Microsoft was from Scott Guthrie when I expressed concern that the legal team wouldn’t let us break new ground with how we shipped open source.

He reminded me that the lawyers work for us. We do not work for the lawyers. If the lawyers had their way, we wouldn’t do anything because that’s the safest option.

You can see why so many people love the red polo.

Many decisions where legal gets involved are business decisions, not legal decisions. Unless the decision is downright illegal, the lawyer’s job is to help figure out how to do what’s best for the business. Along the way, they should make sure we’re aware of the risks, but also find ways to minimize the risks. At least that’s what a good lawyer does and I’ve been fortunate to work with some.

At the end of the day, even if the lawyer is uneasy about a course of action, they do not get to make the business decisions. That’s someone else’s job (unless you happen to work at a law firm I guess). Perhaps it’s your job.

So when someone tells you that “legal won’t let us do XYZ”, unless they follow that with “because it’s illegal and will land us all in jail and that’s no fun”, you should recognize it as a copout.

Sometimes what they mean is “I don’t really know what’s in the best interest of our business (or I’m too busy to care) so I’ll play it safe and blame the lawyers.”

What you hope they mean is “we won’t do this because it is not in the best interest of our business.” Now that is a fair answer. You may disagree, but it serves as a starting point for a more interesting conversation.

blogging, open source

Google is shuttering Google Reader in a little over a day (on July 1st, 2013) as I write this. If you use Google Reader to read my blog, this means you might miss out on my posts and I KNOW YOU DON’T WANT THIS!

Then again, maybe this is finally your chance to make a break, get some fresh air, stop reading blogs and start creating! I won’t hold it against you.

But for the rest of you, it’s a good time to find a replacement. Or at the very least follow me on Twitter since I do tweet when I blog.

There are a lot of Google Reader replacements out there, but only two that I like so far.

Feedly

Feedly is gorgeous. There are apps for many platforms, but the browser works pretty well. Also, you can use Google to log into it and import your Google Reader feeds. I hope Google allows exporting to Feedly and other aggregators after July 1st even as they close down the Google Reader site.

The problem I have with Feedly is that it doesn’t work like Google Reader. It wouldn’t be so bad if it had a better flow for reading items, but I find its interface to be quirky and, in some cases, unintuitive. For example, it seems I have to mark items as read by clicking “mark above articles as read” rather than having it do so automatically, as Reader does after you scroll past them.

This leads me to…

Go Read

Go Read is a late entry into the list, but there are three important things I really like about it:

  1. It is intended to be a clean and simple clone of Google Reader.
  2. It supports Google Reader’s keyboard shortcuts.
  3. It is open source and up on GitHub!

For some more details, check out the announcement blog post by the author, Matt Jibson, a developer at Stack Exchange:

I would like to announce the release of Go Read. It as a Google Reader clone, and designed to be close to its simplicity and cleanliness. I wanted to build something as close to Google Reader as made sense for one person to build in a few months.

It’s basically Google Reader, but without all the cruft and where you can send pull requests to improve things!

In fact, there are already a few pull requests with some nice user interface polish that should hopefully make it into the site soon.

Despite some false starts, I have it up and running on my machine. I sent a few pull requests to update the README to help other clueless folks like me get it set up for hacking on.

So check it out, import your Google Reader feeds, and never miss out on another Haacked.com post EVER!

UPDATE: I forgot to mention what is perhaps the most important reason for me to prefer Go Read. I don’t want to end up in another Google Reader situation again and rely on an RSS Aggregator that isn’t a solid business and might not stick around. At least with an open source option, I have the code running on my own machine as a backup in a pinch.

code

UPDATE: The .NET team removed the platform limitations.

Let me start by giving some kudos to the Microsoft BCL (Base Class Library) team. They’ve been doing a great job of shipping useful libraries lately. Here’s a small sampling on NuGet:

However, one trend I’ve noticed is that the released versions of most of these packages have a platform limitation in the EULA (the pre-release versions have an “eval-only” license, which does not limit platform but does limit deployment for production use). At this point I should remind everyone I’m not a lawyer and this is not legal advice blah blah blah.

Here’s an excerpt from section 2. c. in the released HttpClient license, emphasis mine:

a. Distribution Restrictions. You may not

  • alter any copyright, trademark or patent notice in the Distributable Code;
  • use Microsoft’s trademarks in your programs’ names or in a way that suggests your programs come from or are endorsed by Microsoft;
  • **distribute Distributable Code to run on a platform other than the Windows platform;**

I think this last bullet point is problematic and should be removed.

Why should they?

I recently wrote the following tweet in response to this trend:

Dear Microsoft BCL team. Please remove the platform limitation on your very cool libraries. Love, cross-platform .NET devs.

And Richard Burte tweeted back:

And that pays the rent how exactly?

Great question!

There is this sentiment among many that the only reason to make .NET libraries cross platform or open source is just to appease us long haired open source hippies.

Well first, let me make it crystal clear that I plan to get a haircut very soon. Second, the focus of this particular discussion is the platform limitation on the compiled binaries. I’ll get to the extra hippie open source part later.

There are several reasons why removing the platform limitation benefits Microsoft and the .NET team.

It benefits Microsoft’s enterprise customers

Let’s start with Microsoft’s bread and butter, the enterprise. There’s a growing trend of enterprises that support employees who bring their own devices (BYOD) to work. As Wikipedia points out:

BYOD is making significant inroads in the business world, with about 75% of employees in high growth markets such as Brazil and Russia and 44% in developed markets already using their own technology at work.

Heck, at the time I was an employee, even Microsoft supported employees with iPhones connecting to Exchange to get email. I assume they still do, Ballmer pretending to break an iPhone notwithstanding.

Microsoft’s own software supports cross-platform usage. Keeping platform limitations on their .NET code hamstrings enterprise developers who want to either target the enterprise market or want to make internal tools for their companies that work on all devices.

It’s a long play benefit to Windows 8 Phone and Tablet

While developing Windows 8, Microsoft put a ton of energy and focus into a new HTML and JavaScript based development model for Windows 8 applications, at the cost of focus on .NET and C# in that time period.

The end result? From several sources I’ve heard that something like 85% of apps in the Windows app store are C# apps.

Now, I don’t think we’re going to see a bunch of iOS developers suddenly pick up C# in droves and start porting their apps to work on Windows. But there is the next generation to think of. If Windows 8 devices can get enough share to make it worthwhile, it may be easier to convince this next generation of developers to consider C# for their iOS development and port to Windows cheaply. Already, with Xamarin tools, using C# to target iOS is worlds better than using Objective-C. I believe iOS developers today tolerate Objective-C because it’s been so successful for them and it was the only game in town. As Xamarin tools get more notice, I don’t think the next generation will tolerate the clumsiness of the Objective-C tools.

There’s no good reason not to

Ok, this isn’t strictly speaking a benefit. But it speaks to a benefit.

The benefit here is that when Microsoft restricts developers without good reason, it makes them unhappy.

If you recall, Ballmer is the one who once went on stage to affirm Microsoft’s focus on developers! developers! developers! through interpretive dance.

Unless there’s something I’m missing (and feel free to enlighten me!), there’s no good reason to keep the platform restriction on most of these libraries. In such cases, focus on the developers!

At a recent Outercurve conference, Scott Guthrie, a corporate VP at Microsoft in charge of the Azure Development platform, told the audience that his team’s rule of thumb with new frameworks is to default to open source unless they have a good reason not to.

The Azure team recognizes that a strategy that requires total Windows hegemony will only lead to tears. Microsoft can succeed without having Windows on every machine. Hence Azure supports Linux, and PHP, and other non-Microsoft technologies.

I think the entire .NET team should look to what the Azure team is doing in deciding what their strategy regarding licensing should be moving forward. It makes more developers happy and costs very little to remove that one bullet point from the EULA. I know, I’ve been a part of a team that did it. We worked to destroy that bullet with fire (among others) in every ASP.NET MVC EULA.

Update: It looks like I may have overstated this. Licenses for products are based on templates. Typically a product team’s lawyer will grab a template and then modify it. So with ASP.NET MVC 1 and 2, we removed the platform restriction in the EULA. But it looks like the legal team switched to a different license template in ASP.NET MVC 3 and we forgot to remove the restriction. That was never the intention. Shame on past Phil. Present Phil is disappointed.

At least in this case, the actual source code is licensed under the Apache 2.0 license, so developers have the option to compile and redistribute it, making this a big inconvenience but not a showstopper.

Next Steps

I recently commented on a BCL blog post suggesting that the team remove the platform limitation on a library. Immo Landwerth, a BCL Program Manager, responded with a good clarifying question:

Thanks for sharing your concerns and the candid feedback. You post raised two very different items:

​(1) Windows only restriction on the license of the binaries

​(2) Open sourcing immutable collections

From what I read, it sounds you are more interested in (1), is this correct?

The post he refers to is actually one that Miguel de Icaza wrote, entitled Avoid the Managed Extensibility Framework, when MEF came out with a license that had a platform restriction. Fortunately, that was quickly corrected in that case.

But now we seem to be in a similar situation again.

Here was my response:

@Immo, well I’m interested in both. But I also understand how Microsoft works enough to know that (1) is much easier than (2). :P

So ultimately, I think both would be great, but for now, (1) is a good first step and a minimal requirement for us to use it in ReactiveUI etc.

So while I’d love to see these libraries be open source, I think a minimal next step would be to remove the platform limitation on the compiled library and all future libraries.

And not just to make us long haired (but soon to have a haircut) open source hippies happy, but to make us enterprise developers happy. To make us cross-platform mobile developers happy.

code

One of the side projects I’ve been working on lately is helping to shepherd the Semantic Versioning specification (SemVer) along to its 2.0.0 release. I want to thank everyone who sent pull requests and engaged in thoughtful, critical, spirited feedback about the spec. Your involvement has made it better!

I also want to thank Tom for creating SemVer in the first place and trusting me to help move it along.

I’ve mentioned SemVer in the past as it relates to NuGet. The 2.0.0 release of SemVer addresses some of the issues I raised.

What’s Changed?

Not too much has changed. Most of the changes focus around clarifications.

Build metadata

Perhaps the biggest change is the addition of optional build metadata (what we used to call a build number). This simply allows you to add a bit of metadata to a version in a manner that’s compliant with SemVer.

The metadata does not affect version precedence. It’s analogous to a code comment.

It’s useful for internal package feeds and for being able to tie a specific version to some mechanism that generated it.

For existing package managers that choose to be SemVer 2.0 compliant, the logic change needed is minimal. Instead of reporting an error when encountering a version with build metadata, all they need to do is ignore or strip the build metadata. That’s pretty much it.

Some package managers may choose to do more with it (for internal feeds for example) but that’s up to them.
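To make the “ignore or strip” step concrete, here’s a minimal sketch in C#. It assumes the build metadata is everything after the first plus sign, which is how I read the SemVer 2.0.0 spec. The class and method names are made up for illustration and don’t come from any real package manager.

```csharp
using System;

// Hypothetical helper, not part of NuGet or any other package manager.
public static class SemVerBuildMetadata
{
    // Returns the version with any "+metadata" suffix removed.
    // Assumption: build metadata starts at the first '+' in the string.
    public static string StripBuildMetadata(string version)
    {
        if (version == null) throw new ArgumentNullException(nameof(version));

        int plus = version.IndexOf('+');
        return plus < 0 ? version : version.Substring(0, plus);
    }

    // True when two version strings are identical once metadata is ignored.
    public static bool DifferOnlyInBuildMetadata(string a, string b)
    {
        return string.Equals(
            StripBuildMetadata(a),
            StripBuildMetadata(b),
            StringComparison.Ordinal);
    }
}
```

With this in place, 1.0.0+build.5 and 1.0.0+build.9 are treated as the same version for precedence purposes, which is the whole point of the code comment analogy.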

Pre-release identifiers

Pre-release labels have a little more structure to them now. For example, they can be separated into identifiers using the “.” delimiter and identifiers that only contain digits are compared numerically instead of lexically. That way, 1.0.0-rc.1 < 1.0.0-rc.11 as you might expect. See the specification for full details.
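Here’s a rough sketch of how that comparison might be implemented in C#. It only compares the pre-release label itself (the part after the hyphen), and the PreReleaseComparer name is hypothetical. Two rules in it come from my reading of the spec rather than from the summary above: a digits-only identifier sorts before an alphanumeric one, and when every shared identifier is equal, the label with more identifiers wins.

```csharp
using System;
using System.Linq;
using System.Numerics;

// Hypothetical helper that compares two pre-release labels such as "rc.1" and "rc.11".
public static class PreReleaseComparer
{
    // Returns a negative number, zero, or a positive number,
    // in the same spirit as string.CompareOrdinal.
    public static int Compare(string left, string right)
    {
        string[] a = left.Split('.');
        string[] b = right.Split('.');

        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            int result = CompareIdentifier(a[i], b[i]);
            if (result != 0) return result;
        }

        // All shared identifiers are equal: the longer label has higher precedence.
        return a.Length.CompareTo(b.Length);
    }

    private static int CompareIdentifier(string x, string y)
    {
        bool xNumeric = IsDigitsOnly(x);
        bool yNumeric = IsDigitsOnly(y);

        // Digits-only identifiers are compared as numbers, not as text.
        if (xNumeric && yNumeric)
            return BigInteger.Parse(x).CompareTo(BigInteger.Parse(y));

        // A numeric identifier sorts before an alphanumeric one.
        if (xNumeric) return -1;
        if (yNumeric) return 1;

        // Everything else is compared lexically.
        return string.CompareOrdinal(x, y);
    }

    private static bool IsDigitsOnly(string s)
    {
        return s.Length > 0 && s.All(c => c >= '0' && c <= '9');
    }
}
```

The difference shows up with something like rc.2 and rc.11. PreReleaseComparer.Compare("rc.2", "rc.11") returns a negative number, whereas a plain ordinal string comparison would put rc.11 first.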

Clarifications

The rest of the changes to the specification are concerned with clarifications and resolving ambiguities. For example, we clarified that leading zeroes are not allowed in the Major, Minor, or Patch version nor in pre-release identifiers that only contain digits. This makes a canonical form for a version possible.
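The leading zero rule is easy to check with a regular expression. This is just a sketch under my own assumptions: the NumericIdentifier class is hypothetical, and it validates a single digits-only identifier rather than a whole version string.

```csharp
using System.Text.RegularExpressions;

// Hypothetical validator for one numeric identifier (Major, Minor, Patch,
// or a digits-only pre-release identifier).
public static class NumericIdentifier
{
    // Either a lone "0", or a non-zero digit followed by any number of digits.
    // \A and \z anchor the match to the entire string.
    private static readonly Regex NoLeadingZero =
        new Regex(@"\A(0|[1-9][0-9]*)\z");

    public static bool IsValid(string identifier)
    {
        return identifier != null && NoLeadingZero.IsMatch(identifier);
    }
}
```

So "0", "7", and "10" pass, while "01" and "007" are rejected. That’s what gives each version exactly one way to be written.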

If you find an ambiguity, feel free to report it.

What’s Next?

As SemVer matures, we expect the specification to become a little more formal in nature as a means of removing ambiguities. One such effort underway is to include a BNF grammar for the structure of a version number in the spec. This should hopefully be part of SemVer 2.1.

code

Code is unforgiving. As the reasonable human beings that we are, when we review code we both know what the author intends. But computers can’t wait to Well, Actually all over that code like a lonely Hacker News commenter:

Well Actually, Dave. I’m afraid I can’t do that.

Hal, paraphrased from 2001: A Space Odyssey

As an aside, imagine the post-mortem review of that code!

Code review is a tricky business. Code is full of hidden mines that lie dormant while you test just to explode in a debris of stack trace at the most inopportune time – when it’s in the hands of your users.

The many times I’ve run into such mines just reinforce how important it is to write code that is intention revealing and to make sure assumptions are documented via asserts.

Such devious code is often the most innocuous looking code. Let me give one example I ran into the other day. I was fortunate to defuse this mine while testing.

This example makes use of the Enumerable.ToDictionary method that turns a sequence into a dictionary. You supply an expression to produce a key for each element. In this example, loosely based on the actual code, I am using the CloneUrl property of Repository as the key of the dictionary.

IEnumerable<Repository> repositories = GetRepositories();
repositories.ToDictionary(r => r.CloneUrl);

It’s so easy to gloss over this line during a code review and not think twice about it. But you probably see where this is going.

While I was testing I was lucky to run into the following exception:

System.ArgumentException: 
An item with the same key has already been added.

Doh! There’s an implicit assumption in this code – that two repositories cannot have the same CloneUrl. In retrospect, it’s obvious that’s not the case.

Let’s simplify this example.

var items = new[]
{
    new {Id = 1}, 
    new {Id = 2}, 
    new {Id = 2}, 
    new {Id = 3}
};
items.ToDictionary(item => item.Id);

This example attempts to create a dictionary of anonymous types using the Id property as a key, but we have a duplicate, so we get an exception.

What are our options?

Well, it depends on what you need. Perhaps what you really want is a dictionary where each value contains every item with the given key. Or perhaps you only care about the first value for a given key and want to ignore any others. The Enumerable.GroupBy method comes in handy in both cases.

In the following example, we use this method to group the items by Id. This results in a sequence of IGrouping elements, one for each Id. We can then take advantage of a second parameter of ToDictionary and simply grab the first item in the group.

items.GroupBy(item => item.Id)
  .ToDictionary(group => group.Key, group => group.First());

This feels sloppy to me. There is too much potential for this to cover up a latent bug. Why should the other items be ignored? Perhaps, as in my original example, it’s fully normal to have more than one element for the key and you should handle that properly. Instead of grabbing the first item from the group, we retrieve an array.

items.GroupBy(item => item.Id)
  .ToDictionary(group => group.Key, group => group.ToArray());

In this case, we end up with a dictionary of arrays.

UPDATE: Or, as Matt Ellis points out in the comments, you could use the Enumerable.ToLookup method. I should have known such a thing would exist. It’s exactly what I need for my particular situation here.
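For comparison, here’s a small sketch of the ToLookup approach. The Name property and the LookupSketch class are made up for this illustration; the items in the earlier examples only have an Id.

```csharp
using System.Linq;

// Hypothetical example class; only Enumerable.ToLookup is the real API here.
public static class LookupSketch
{
    public static ILookup<int, string> BuildLookup()
    {
        var items = new[]
        {
            new { Id = 1, Name = "one" },
            new { Id = 2, Name = "two" },
            new { Id = 2, Name = "deux" },   // duplicate key, no exception
            new { Id = 3, Name = "three" }
        };

        // Each key maps to every element that produced it.
        return items.ToLookup(item => item.Id, item => item.Name);
    }
}
```

Indexing the result with 2 yields both “two” and “deux”, and indexing with a key that isn’t there yields an empty sequence instead of throwing. One thing to keep in mind is that a lookup can’t be modified after it’s created, so it fits best when you build it once and only read from it.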

What if having more than one element with the same key is not expected and should throw an exception? Well, you could just use the normal ToDictionary method since it will throw an exception. But that exception is unhelpful. It doesn’t have the information we probably want. For example, you might want to know which key was already added, as the following demonstrates:

items.GroupBy(item => item.Id)
    .ToDictionary(group => group.Key, group =>
    {
        try
        {
            // Single() throws if the group holds more than one item.
            return group.Single();
        }
        catch (InvalidOperationException)
        {
            // Rethrow with the offending key in the message.
            throw new InvalidOperationException(
                "Duplicate item with the key '" + group.Key + "'");
        }
    });

In this example, if a key has more than one element associated with it, we throw a more helpful exception message.

System.InvalidOperationException: Duplicate item with the key '2'

In fact, we can encapsulate this into our own better extension method.

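using System;
using System.Collections.Generic;
using System.Linq;

// Extension methods must live in a static class; the class name is arbitrary.
public static class EnumerableExtensions
{
  public static Dictionary<TKey, TSource>
    ToDictionaryBetter<TSource, TKey>(
      this IEnumerable<TSource> source,
      Func<TSource, TKey> keySelector)
  {
    return source.GroupBy(keySelector)
      .ToDictionary(group => group.Key, group =>
      {
        try
        {
          // Single() throws if the group holds more than one item.
          return group.Single();
        }
        catch (InvalidOperationException)
        {
          throw new InvalidOperationException(
              string.Format("Duplicate item with the key '{0}'", group.Key));
        }
      });
  }
}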

Code mine mitigated!

This is just one example of a potential code mine that might go unnoticed during a code review if you’re not careful.

Now, when I review code and see a call to ToDictionary, I make a mental note to verify the assumption that the key selector must never lead to duplicates.

When I write such code, I’ll use one of the techniques I mentioned above to make my intentions more clear. Or I’ll embed my assumptions into the code with a debug assert that proves that the items cannot have a duplicate key. This makes it clear to the next reviewer that this code will not break for this reason. This code still might not open the hatch, but at least it won’t have a duplicate key exception.
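Here’s one way the debug assert approach might look. Treat it as a sketch: the ToDictionaryAsserted name is hypothetical, and it materializes the sequence into a list first so that it’s only enumerated once.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper; only Debug.Assert and the LINQ calls are real APIs.
public static class AssertedDictionary
{
    public static Dictionary<TKey, TSource> ToDictionaryAsserted<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        // Materialize once so the assert and ToDictionary see the same items.
        List<TSource> items = source.ToList();

        // Documents the assumption: every item produces a distinct key.
        // Debug.Assert calls are compiled out of release builds.
        Debug.Assert(
            items.Select(keySelector).Distinct().Count() == items.Count,
            "Key selector produced duplicate keys.");

        return items.ToDictionary(keySelector);
    }
}
```

The assert doesn’t change what happens in a release build, where ToDictionary still throws on a duplicate. What it does is put the assumption right there in the code where the next reviewer can’t miss it.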

If I search through my code, I will find many other examples of potential code mines. What are some examples that you can think of? What mines do you look for when reviewing code?

personal, empathy, parenting

This post is a departure from my typical software related topics, but I think you’ll find parallels with management and dealing with software developers.

Parenting is a skill like any other – it can be improved (for some more than others, amirite?!).

Look, I’m not trying to claim I’m the world’s greatest dad. But I was given a coffee mug with that claim by my kids. I don’t mean to brag, but I’m pretty sure they did a quantitative exhaustive analysis of all dads before conferring that award to me because that’s just how I raised them. Right kids?!

But I digress.

When my son was still a very young toddler, my wife and I took advantage of a Microsoft benefit that paid for parenting classes (Many Microsoft employees who are parents have no idea this benefit exists). We attended a series on “Reflective Parenting.” It was an amazing learning experience that taught us this idea that parenting is a skill like any other.

It’s a strange conceit of many parents that because they can reproduce, they are suddenly imbued with unassailable parenting skills.

As Richard Stallman once remarked, perhaps callously,

It doesn’t take special talents to reproduce—even plants can do it. On the other hand, contributing to a program like Emacs takes real skill. That is really something to be proud of.

It helps more people, too.

And you never have to clean up poo from an Emacs blowout!

He’s right about one part. Even plants can reproduce. But reproducing is the easy part. Plants can’t parent.

Parenting is a subject that trends towards being heavy on tradition. “If it was good enough for me, it’s good enough for my kids.” But that’s not how progress is made, my friends.

Despite my megalomaniacal tendencies, I like to think I turned out ok so far. My parents did a pretty good job. Does that mean I can’t strive to do even better? It’s worth a try. So in this post, I’ll explore what SCIENCE brings to bear on the subject. It may seem weird to invoke science in a subject as personal and emotional as parenting. But the scientific method is effective, even on a personal scale.

Note that the focus here is on core principles and less on specifics. I’m not going to tell you to spank or not spank your child (because we know that’ll end in a shit storm debate).

This post will focus on principles to consider when making your own decisions about these things. Because in the end, if you are a parent, it is ultimately up to you what you do…within reason.

That’s one reason I try to embrace the “no judging” philosophy towards other parents. Each parent has a different situation and different background. I may offer ideas that I think are helpful, but I won’t judge. Unless you tend to drive off with your five-week-old child on top of your car. I might judge just a teensy weensy bit then.

Lessons from Reflective Parenting

The about page for the Center of Reflective Parenting says it was founded…

..in response to groundbreaking research in child development and the study of the neurobiology of the developing mind showing that the single best way to positively impact the attachment relationship is to increase a parent’s capacity to reflect on their relationship with their child – to think about the meaning that underlies behavior.

There’s a lot of science and research underlying the core precepts of this approach. But when you hear it, it doesn’t sound academic at all. In fact, it sounds a lot like common sense.

There are three core lessons I took from the classes.

Empathy

The first is to work on developing empathy and understanding for your child. We learned a lot about what children are capable of developmentally at certain ages. For example, at very young ages, children aren’t very good at understanding cause and effect.

This allows you to develop more appropriate expectations and responses to the things your child may do. At some ages, you just can’t expect them to respond to reason, for example. (By their teenage years, they can respond to reason, they just choose not to. It’s different.)

Self Control

The second, and perhaps more important lesson for me personally, is that good parenting is more about controlling yourself than your child. This is because children reflect the behavior of their parents.

For example, we’ve all been there, in the car, with the kids loudly misbehaving, when a parent gets fed up, blows up, and screams at the kids. I’ve been there.

In that moment the parent is not disciplining. The parent is only momentarily making him or herself feel better. But this teaches the kids that the best way to handle a stressful situation is to lose your shit. Discipline comes from the calm moments when a parent is very considered and in control of his or her actions. Remember, kids don’t do what you tell them to do. They do as you do.

In such situations, the class taught us to attempt to empathize with what the children might be experiencing and base our actions on that. If we can’t help but to lose our temper, it’s OK to separate ourselves from the situation. For example, in extreme situations, you might pull over, step out of the car, and let the kids scream their heads off while you (out of earshot) calm your nerves.

Repairing

Now this last point is the most important lesson. Parents, we are going to fuck up. We’re going to do it royally. Accept it. Forgive yourself. And then repair the situation.

I’ve lost my shit plenty of times. I’m pretty sure I did it twice this past week. It doesn’t make me a bad parent, though I feel that way at the moment. What would make me a bad parent is if I doubled down on my anger and never apologized to the kids and never tried to repair whatever damage I may have caused.

Some parents believe in the doctrine of parental infallibility. Never let them see you sweat and never admit fault to your children lest they see an opening and walk all over you.

But when you consider the principle that kids do as you do, I don’t think this doctrine stands up to scrutiny. I hope you want your children to be able to admit when they were wrong and know how to deliver a sincere apology. Teach by living the example.

The Economist’s Guide to Parenting

Many years after the reflective parenting class, I listened to this outstanding Freakonomics Podcast episode on parenting.

I know what you’re thinking when you read the title of this podcast. You’re thinking what the **** — economists? What can economists possibly have to say about something as emotional, as nuanced, as humane, as parenting? Well, let me say this: because economists aren’t necessarily emotional (or, for that matter, all that nuanced or humane), maybe they’re exactly the people we need to sort this through. Maybe.

As you might expect, it’s hard to conduct a double blind laboratory study of raising kids. Are you going to separate twins at birth and knowingly give one to shitty parents and another to wonderful parents to examine the effects? Cool idea bro, but…

octopus-nope

But there are such things as natural experiments. There were studies done of large groups of twins separated at birth and raised by different adoptive parents.

The striking result of the studies is that what the parents did had very little influence on how kids ended up. Letting kids watch as much TV as they want? Restricting TV? Helicopter parenting? Piano and violin lessons? Sorry Tiger Mom, it made very little difference.

Over and over again, guess what made the difference? It wasn’t what the parents did, it’s who they were. Educated parents ended up with educated kids. As far as I could tell, the study didn’t really get into cause and effect much. For example, is it because educated parents tend to do the things that lead to educated kids? Inconclusive. But they did find that many of the practices of “helicopter parents” such as music lessons, etc. had very little effect on the future success and happiness of the child.

But the studies revealed one thing parents did that had a strong correlation with how their progeny ended up - how parents treated wait staff. Those who were rude to waiters and waitresses ended up with rude children. Those who were kind and tipped well ended up with kind kids.

See a pattern here?

Kids aren’t affected as much by what you tell them and what you teach them as by what you do. If you’re curious and love learning, your kids are more likely to be infused with a similar passion for learning.

On one level, this is encouraging. You don’t need to schedule every hour of your children’s free time with Latin and theremin lessons for them to turn out well.

On the other hand, it’s also very challenging. If you’re a naturally awful misanthropic person it’s much harder to change yourself than to simply pay for classes.

I can be pretty lazy. But after the Freakonomics podcast, I started making an effort to do one simple thing every morning. Make the bed before I left the room. Honestly, I didn’t care so much if the bed was made, but I did like how a clean room made me feel less disheveled as I started my day.

And here’s the amazing thing. My son, who’s only five, will now sometimes come into the room to make our bed. I never asked him nor told him to. He saw me doing it and he’s reflecting my behavior. It’s really rather striking.

Go Forth And Parent

In the beginning I mentioned how parenting research applies to software developers. I wasn’t making a comparison of software developers to children (though if the description fits…). It’s more a comment on the idea that parenting is a lot like leadership. Like parents, leaders lead by doing, not by telling others what to do.

The good news is what you do as a parent has little effect on how your kids end up. The best thing you can do is focus on being the type of person you want your kids to be.

However, what you do in the interim can affect how well you cope with being a parent and all the travails that come with it. It also may affect what your relationship with your children will look like down the road. It’d kind of suck to raise wonderful successful kids who want nothing to do with you. So don’t be awful to them.

This is why I still think it’s worthwhile working on improving parenting skills. It’s less about affecting your kids’ success as adults and more about building a good lasting relationship with them.

Like any skill, there’s always new evidence coming in that might cause you to reevaluate how you parent. For example, here’s a list of ten things parents are often dead wrong about.

Perhaps you factor those in and maybe improve your technique. Maybe not. The key thing is, don’t sweat it too much. Ultimately, what we all want is for our kids to lead fulfilled and happy lives. This is one reason I optimize for my own happiness so they hopefully reflect that.

Happy parenting!

happiness

code, open source, github

In some recent talks I make a reference to Conway’s Law, named after Melvin Conway (not to be confused with British mathematician John Horton Conway, famous for Conway’s Game of Life, nor with Conway Twitty), which states:

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.

Many interpret this as a cynical jibe at software management dysfunction. But this was not Melvin’s intent. At least it wasn’t his only intent. On his website, he quotes from Wikipedia, emphasis mine:

Conway’s law was not intended as a joke or a Zen koan, but as a valid sociological observation. It is a consequence of the fact that two software modules A and B cannot interface correctly with each other unless the designer and implementer of A communicates with the designer and implementer of B. Thus the interface structure of a software system necessarily will show a congruence with the social structure of the organization that produced it.

I savor Manu Cornet’s visual interpretation of Conway’s Law. I’m not sure how Manu put this together, but it’s not a stretch to suggest that the software architectures these companies produce might lead to these illustrations.

organizational_charts

Having worked at Microsoft, the one that makes me laugh the most is the Microsoft box. Let’s zoom in on that one. Perhaps it’s an exaggerated depiction, but in my experience it’s not without some basis in truth.

ms-org 

The reason I mention Conway’s Law in my talks is to segue to the topic of how GitHub the company is structured. It illustrates why GitHub.com is structured the way it is.

So how is GitHub structured?

Well Zach Holman has written about it in the past where he talks about the distributed and asynchronous nature of GitHub. More recently, Ryan Tomayko gave a great talk (with associated blog post) entitled Your team should work like an open source project.

By far the most important part of the talk — the thing I hope people experiment with in their own organizations — is the idea of borrowing the natural constraints of open source software development when designing internal process and communication.

GitHub in many respects is structured like a set of open source projects. This is why GitHub.com is structured the way it is. It’s by necessity.

Like the typical open source project, we’re not all in the same room. We don’t work the same hours. Heck, many of us are not in the same time zones even. We don’t have top-down hierarchical management. This explains why GitHub.com doesn’t focus on the centralized tools or reports managers often want as a means of controlling workers. It’s a product that is focused more on the needs of the developers than on the needs of executives. It’s a product that allows GitHub itself to continue being productive.

Apply Conway’s Law

So if Conway’s Law is true, how can you make it work to your advantage? Well, by restating it as Jesse Toth does, according to this tweet by Sarah Mei:

Conway’s Law restated by @jesse_toth: we should model our teams and our communication structures after the architecture we want.  #scotruby

Conway’s Law in its initial form is passive. It’s an observation of how software structures tend to follow social structures. So it only makes sense to move from observer to active participant and change the organizational structures to match the architecture you want to produce.

Do you see the effects of Conway’s Law in the software you produce?

code

In a recent post, Test Better, I suggested that developers can and ought to do a better job of testing their own code. If you haven’t read it, I recommend you read that post first. I’m totally not biased in saying this at all. GO DO IT ALREADY!

There was some interesting pushback in the comments. Some took it to mean that we should get rid of all the testers. Whoa whoa whoa there! Slow down folks.

I can see how some might come to that conclusion. I did mention that my colleague Drew wants to destroy the role of QA. But it’s not because we want to just watch it burn.

Rather, we’re interested in something better rising from the ashes. It’s not that there’s no need for testers in a software shop. It’s that what we need is a better idea of what a tester is.

Testers Are Not Second Class Citizens

Perhaps you’ve had different experiences than me with testers. Good for you, here’s a Twinkie. The vast majority of you can probably relate to the following.

At almost every position I’ve been at, developers treated testers like second class citizens in the pecking order of employees. Testers were just a notch above unskilled labor.

Not every tester mind you. There were always standouts. But I can’t tell you how many times developers would joke that testers are just wannabe developers who didn’t make the cut. The general attitude was that you could replace these folks with Amazon’s Mechanical Turk and not know the difference.

Mechanical Turk from Wikimedia Commons. Public Domain image.

And in some cases, it was true. At Microsoft, for example, it’s easier to be hired as a tester than as a developer. And once your foot is in the door, you can eventually make that transition to developer if you’re halfway decent.

This makes it very challenging to hire and retain good testers.

Elevate the Profession

But it shouldn’t be this way. The problem is we need to elevate the profession of tester. Drew and I talk about these things from time to time and he told me to think of testers as folks who provide a service to developers. Developers should test their own code, but testers can provide guidance on how to better test the code, suggest usage scenarios, set up test labs, etc.

Then it hit me. We already do have testers that are well respected by developers and may serve as a model for what we mean by the concept of better testers.

Security Testers

By most accounts, good security testers are well respected and not seen as folks who are developer wannabes. If not respected, they are feared for what they might do to your system should you piss them off. One security expert I know mentioned developers never click on links he sends without setting up a Virtual Machine to try it in first. Either way, it works.

Perhaps by looking at some of the qualities of security testers and how they integrate into the typical development flow, we can tease out some ideas on what better testers look like.

Like regular testers, many security testers test code that’s ready to ship. Many sites hire white hat penetration testers to attempt to locate and exploit vulnerabilities in a site that’s already been deployed. These folks are experts who keep up to date on the latest in security testing. They are not folks you can just replace with a Mechanical Turk.

Of course, smart developers don’t wait till code is deployed to get a security expert involved. That’s way too late. Security testers can help in the early stages of planning. Provide guidance on what patterns to avoid, what to look out for, some good practices to follow. During the coding stages they can provide code reviews with an eye towards security or simply answer questions you may have about tricky situations.

Testing as a Service

There are other testers that also follow a similar model. If you need to target 12 languages, you’ll definitely want to work with a localization/internationalization tester. If you value usability you may want to work with a usability expert. The list goes on.

It’s just not possible for a developer to be an expert in all these possible areas. I’d expect developers to have a basic understanding of these areas. Perhaps be quite knowledgeable in each, but never as knowledgeable as someone who is focused on these areas all the time.

The common theme among these testers is that they are providing a service to developers. They are sought out for their expertise.

General feature and quality testers should be no different. Good testers spend all their time learning and thinking about better and more efficient ways to test products. They are advocates for the end users and just as concerned about shipping software as developers. They are not gatekeepers. They are enablers. They enable developers to ship better code.

This idea of testers as a service is not mine. It’s something Drew told me (he seriously needs to start his blog up again) that struck me.

By necessity, these would be folks who are great developers who have chosen to focus their efforts on the art and science of testing, just as another developer might choose to focus their efforts on the art and science of native clients, or reactive programming.

I love working with someone who knows way more about testing software and building in quality from the start than I do.

This is one of the motivations for me to test my own code better. If I’m going to leverage the skills of a great tester, it’s a matter of pride not to embarrass myself with stupid bugs I should have caught in my own testing. I want to impress these folks with crazy hard bugs I have no idea how to test.

Ok, maybe that last bit didn’t come out the way I intended. The point is when you work with experts, you don’t want them spending all their time with softballs. You want their help with the meaty stuff.

community, open source, personal

Someone recently emailed me to ask if I’m speaking at any upcoming conferences this year. Good question!

I’ve been keeping it pretty light this year since my family and I are doing a bit of travelling ourselves and I like spending time with them.

But I will be hitting up two conferences that I know of.

<anglebrackets> April 8 – 11

Ohmagerd! That’s this week! I better prepare!

I’ll be giving two talks this week. One of them will be a joint talk with the incomparable Scott Hanselman. Usually that means him taking potshots at me for your enjoyment. ARE YOU NOT ENTERTAINED?!

are_you_not_entertained-135569

You will be!

Jazz Up Your Open Source with GitHub

Wednesday April 10 3:30 PM – 4:45 PM - Room 5 (Just Me)

You write some code that handles angle brackets like nobody’s business and you’re ready to share it with the world on GitHub. Great! Now what?

The story doesn’t end there. When the first users and contributors show up at your doorstep, you need to be prepared. Find out some tips for engaging an audience with your open source project and really make your project sing.

Return of the HaaHa Show: How to Open Source

Thursday April 11 8:00 AM (HWHAT!?) – 9:00 AM – Keynote Room 2 – Scott and Phil

They are back. ScottHa and PhilHaa reprise their legendary (OK, not really) HaaHa show that has thrilled audiences on three continents. There will be code. There will be jokes, bad ones. There will be Pull Requests. There will be Markdown. Will there be injuries? Papercuts? Let’s find out as we join Phil Haack and Scott Hanselman as they learn how to open source. We will answer questions like: How do I get involved in open source? How do I clone a repo, branch it, do a pull request and commit to an open source project? Seems kind of hard. Let’s see if it is!

MonkeySpace 2013 July 22-25

The call for proposals for this conference is still open. If you know anyone who might bring a diverse and unique perspective to this conference, please encourage them to submit. We’d really love to get a more diverse speaker cast than is typical for a conference on .NET open source. This conference is no longer just a conference on Mono. Mono figures prominently, but the scope has expanded to the broader topic of .NET open source and cross platform .NET.

Others

I’ll be in Tokyo, Japan in late April. So if you have a user group there that meets on Tuesday 4/30 and want to hear about GitHub, Git, NuGet, or even ASP.NET MVC, let me know. I’d be happy to swing by, but be warned I do not speak Japanese.

There might also be some local upcoming conferences I’ll speak at.

Podcasts

I recently was a guest on Yet Another Podcast with Jesse Liberty where I talked about Git, GitHub, GitHub for Windows, and subverting the oppressive traditional hierarchical organizational structure that serves to keep us down. FIGHT THE POWER!

Check it out.

Tags: speaking, talks, opensource, podcast

code, tdd, github

Developers take pride in speaking their mind and not shying away from touchy subjects. Yet there is one subject that makes many developers uncomfortable.

Testing.

I’m not talking about drug testing, unit testing, or any form of automated testing. After all, while there are still some holdouts, at least these types of tests involve writing code. And we know how much developers love to write code (even though that’s not what we’re really paid to do).

No, I’m talking about the kind of testing where you get your hands dirty actually trying the application. Where you attempt to break the beautifully factored code you may have just written. At the end of this post, I’ll provide a tip using GitHub that’s helped me with this.

TDD isn’t enough

I’m a huge fan of Test Driven Development. I know, I know. TDD isn’t about testing as Uncle Bob sayeth from on high in his book, Agile Software Development, Principles, Patterns, and Practices,

The act of writing a unit test is more an act of design than of verification.

And I agree! TDD is primarily about the design of your code. But notice that Bob doesn’t omit the verification part. He simply provides more emphasis to the act of design.

In my mind it’s like wrapping a steak in bacon. The steak is the primary focus of the meal, but I sure as hell am not going to throw away the bacon! I know, half of you are hitting the reply button to suggest you prefer the bacon. Me too but allow me this analogy.

bacon-wrapped-steakMMMM, gimme dat! Credit: Jason Lam CC-BY-SA-2.0

The problem I’ve found myself running into, despite my own advice to the contrary, is that I start to trust too much in my unit tests. Several times I’ve made changes to my code, crafted beautiful unit tests that provide 100% assurance that the code is correct, only to have customers run into bugs with the code. Apparently my 100% correct code has a margin of error. Perhaps Donald Knuth said it best,

Beware of bugs in the above code; I have only proved it correct, not tried it.

It’s surprisingly easy for this to happen. In one case, we had a UI gesture bound to a method that was very well tested. Our UI was bound to this method. All tests pass. Ship it!

Except when you actually execute the code, you find that there’s a certain situation where an exception might occur that causes the code to attempt to modify the UI on a thread other than the UI thread #sadtrombone. That’s tricky to catch in a unit test.
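
Here’s a rough sketch, in Python rather than the .NET code we actually shipped, of how this kind of bug slips past a unit test. The FakeLabel class and its thread check are hypothetical stand-ins for a UI toolkit that only lets the UI thread touch its controls. The function names are made up for illustration too.

```python
import threading


class FakeLabel:
    """Hypothetical stand-in for a UI control with thread affinity."""

    def __init__(self):
        # Treat whichever thread creates the control as the "UI thread".
        self._owner = threading.get_ident()
        self.text = ""

    def set_text(self, value):
        if threading.get_ident() != self._owner:
            raise RuntimeError("UI modified from a non-UI thread")
        self.text = value


def show_status(label, message):
    """The well-tested method: fine whenever it runs on the UI thread."""
    label.set_text(message)


def run_on_worker(label, message):
    """Simulates an error path that reports back from a background thread."""
    errors = []

    def work():
        try:
            show_status(label, message)
        except RuntimeError as exc:
            errors.append(str(exc))

    worker = threading.Thread(target=work)
    worker.start()
    worker.join()
    return errors


label = FakeLabel()
show_status(label, "Synced")                     # what the unit test exercises
failures = run_on_worker(label, "Sync failed")   # what the running app can do
```

The direct call passes, which is all the unit test ever sees. The same method only blows up when something calls it from another thread, and that tends to happen only when you actually run the app.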

Getting Serious about Testing

When I joined the GitHub for Windows (GHfW) team, we were still in the spiking phase, constantly experimenting with the UI and code. We had very little in the way of proper unit tests. Which worked fine for two people working in the same code in the same room in San Francisco. But here I was, the new guy hundreds of miles away in Bellevue, WA without any of the context they had. So I started to institute more rigor in our unit and integration tests as the product transitioned to a focus on engineering.

But we still lacked rigor in regular non-automated testing. Then along comes my compatriot, Drew Miller. If you recall, he’s the one I cribbed my approach to structuring unit tests from.

Drew really gets testing in all its forms. I first started working with him on the ASP.NET MVC team when he joined as a test lead. He switched disciplines from a developer to become a QA person because he wanted a venue to test his theories on testing and eventually show the world that we don’t need a separate QA person. Yes, he became a tester so he could destroy the role, in order to save the practice.

In fact, he hates the term QA (which stands for Quality Assurance):

The only assurance you will ever have is that code has bugs. Testing is about confidence. It’s about generating confidence that the user’s experience is good enough. And it’s about feedback. It’s about providing feedback to the developer in lieu of a user in the room. Be a tester, don’t be QA.

On the GitHub for Windows team, we don’t have a tester. We’re all responsible for testing. With Drew on board, we’re also getting much better at it.

Testing Your Own Code and Cognitive Bias

There’s this common belief that developers shouldn’t test their own code. Or maybe they should test it, but you absolutely need independent testers to test it as well. I used to fully subscribe to this idea. But Drew has convinced me it’s hogwash.

It’s strange to me how developers will claim they can absolutely architect systems, provide insights into business decisions, write code, and do all sorts of things better than the suits and other co-workers, but when it comes to testing. Oh no no no, I can’t do that!

I think it’s a myth we perpetuate because we don’t like it! Of course we can do it, we’re smart and can do most anything we put our minds to. We just don’t want to so we perpetuate this myth.

There is some truth that developers tend to be bad at testing their own code. For example, the goal of a developer is to write software as bug free as possible. The presence of a bug is a negative. And it’s human nature to try to avoid things that make us sad. It’s very easy to unconsciously ignore code paths we’re unsure of while doing our testing.

A tester’s job, on the other hand, is to find bugs. A bug is a good thing to these folks. Thus they’re well suited to testing software.

But this oversimplifies our real goals as developers and testers. To ship quality software. Our goals are not at odds. This is the mental switch we must make.

And We Can Do It!

After all, you’ve probably heard it said a million times, when you look back on code written several months ago, you tend to cringe. You might not even recognize it. Code in the brain has a short half-life. For me, it only takes a day before code starts to slip my mind. In many respects, when I approach code I wrote yesterday, it’s almost as if I’m someone else approaching the code.

And that’s great for testing it.

When I think I’m done with a feature or a block of code, I pull a mental trick. I mentally envision myself as a tester. My goal now is to find bugs in this code. After all, if I find them and fix them first, nobody else has to know. Whenever a customer finds a bug caused by me, I feel horrible. So I have every incentive to try and break this code.

And I’m not afraid to ask for help when I need it. Sometimes it’s as simple as brainstorming ideas on what to test.

One trick that my team has started doing that I really love is when a feature is about done, we update the Pull Request (remember, a pull request is a conversation about some code and you don’t have to wait for the code to be ready to merge to create a PR) with a test plan using the new Task Lists feature via GitHub Flavored Markdown.

This puts me in a mindset to think about all the possible ways to break the code. Some of these items might get pulled from our master test plan or get added to it.

Here’s an example of a portion of a recent test plan for a major bug fix I worked on.

test-plan-in-pr

The act of writing the test plan really helps me think hard about what could go wrong with the code. Then running through it just requires following the plan and checking off boxes. Sometimes as I’m testing, I’ll think of new cases and I’ll just edit the plan accordingly.

Also, the test plan can serve as an indicator to others that the PR is ready to be merged. When you see everything checked off, then it should be good to go! Or if you want to be more explicit about it, add a “sign-off” checkbox item. Whatever works best for you.
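
If you wanted to automate that “everything is checked off” glance, a minimal sketch might look like the following. It only assumes the task list syntax from GitHub Flavored Markdown (`- [ ]` and `- [x]`). The sample PR body, the function names, and the rule that an empty plan is not ready are all my own assumptions, not anything GitHub provides.

```python
import re

# GitHub Flavored Markdown task list items look like "- [ ] text" or "- [x] text".
TASK_ITEM = re.compile(r"^\s*[-*+]\s+\[( |x|X)\]\s+(.*)$")


def parse_test_plan(markdown):
    """Return a (done, description) tuple for each task list item found."""
    items = []
    for line in markdown.splitlines():
        match = TASK_ITEM.match(line)
        if match:
            items.append((match.group(1) != " ", match.group(2).strip()))
    return items


def ready_to_merge(markdown):
    """True only when the plan has at least one item and every item is checked."""
    items = parse_test_plan(markdown)
    return bool(items) and all(done for done, _ in items)


# Hypothetical pull request description containing a test plan.
pr_body = """\
## Test plan
- [x] Clone a repository over HTTPS
- [x] Sync with no network connection
- [ ] Sign-off
"""

remaining = [text for done, text in parse_test_plan(pr_body) if not done]
```

Here `remaining` holds just the unchecked “Sign-off” item, so `ready_to_merge(pr_body)` stays false until someone ticks that last box.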

The Case for Testers

Please don’t use this post to justify firing your test team. The point I’m trying to make is that developers are capable of and should test their own (and each other’s) code. It should be a badge of pride that testers cannot find bugs in your code. But until you reach that point, you’re probably going to need your test team to stick around.

While my team does not have dedicated testers, we consider each of us to be testers. It’s a frame of mind we can put ourselves into when we need to.

But we’re also not building software for the Space Shuttle so maybe we can get away with this.

I’m still of the mind that many teams can benefit from a dedicated tester. But the role this person has is different from the traditional rote mechanical testing you often find testers lumped into. This person would mentor developers in the testing part of building software. Help them get into that mindset. This person might also work to streamline whatever crap gets in the way so that developers can better test their code. For example, building automation that sets up test labs for various configurations at a moment’s notice. Or helping to verify incoming bug reports from customers.

nuget

How can you trust anything you install from NuGet? It’s a simple question, but the answer is complicated. Trust is not some binary value. There are degrees of trust. I trust my friends to warn me before they contact the authorities and maybe suggest a lawyer, but I trust my wife to help me dispose of the body and uphold the conspiracy of silence (Honey, it was in the fine print of our wedding vows in case you’re wondering).

The following are some ideas I’ve been bouncing around with the NuGet team about trust and security since even before I left NuGet. Hopefully they spark some interesting discussions about how to make NuGet a safer place to install packages.

Establish Identity and Authorship

The question “do I trust this package” is not the best question to ask. The more pertinent question is “do I trust the author of this package?”

NuGet doesn’t change how you go about answering this question yet. Whether you found a zip file on some random website or install it via NuGet, you still have to answer the following questions (perhaps unconsciously):

  1. Who is the author?
  2. Is the author trustworthy?
  3. Do I trust that this software really was written by the author?
  4. Is the author’s means of distributing software tamper resistant and verifiable?

In some cases, the non-NuGet software is signed with a certificate. That helps answer questions 1, 2, and 3. But chances are, you don’t restrict yourself to only using certificate signed libraries. I looked through my own installed Visual Studio Extensions and several were not certificate signed.

NuGet doesn’t yet support package signing, but even if it did, it wouldn’t solve this problem sufficiently. If you want to know more why I think that, read the addendum about package signing at the end of this post.

What most people do in such situations is try to find alternate means to establish identity and authorship:

  1. I look for other sites that link to this package and mention the author.
  2. I look for sites that I already know to be under the control of the author (such as a blog or Twitter account) and look for links to the package.
  3. I look for blog posts and tweets from other people I trust mentioning the package and author.

I think NuGet really needs to focus on making this better.

A Better Approach

There isn’t a single solution that will solve the problem. But I do believe a multipronged approach will make it much easier for people to establish the identity and authorship of a package and make an educated decision on whether or not to install any given package.

Piggyback on other verification systems

This first idea is a no-brainer to me. I’m a lazy bastard. If someone else has done the hard work, I’d like to build on what they’ve done.

This is where social media can come into play and have a useful purpose beyond telling the world what you ate for lunch.

For example, suppose you want to install RouteMagic and you see that the package owner is some user named haacked on NuGet. Who is this joker?

Hey! Maybe you happen to know haacked on GitHub! Is that the same guy as this one? You also know a haacked on Twitter and you trust that guy. Can we tie all these identities together?

Well it’d be easy through OAuth. The NuGet gallery could allow me to verify that I am the same person as haacked on GitHub and Twitter by doing an OAuth exchange with those sites. Only the real haacked on Twitter could authenticate as haacked on Twitter.

The more identities I attach to my NuGet account, the more you can trust that identity. It’s unlikely someone will hack both my GitHub and Twitter accounts.

The NuGet Gallery would need to expose these verifications in the UI anywhere I see a package owner, perhaps with little icons.

With Twitter, you could go even further. Twitter has the concept of verified identities. If we trust their process of verification, we could piggyback on that and show a verified icon next to Twitter verified users, adding more weight to your claimed identity.

This would be so easy and cheap to implement and provide a world of benefit for establishing identity.

Build our own verification system

Eventually, I think NuGet might want to consider having its own verification system and NuGet Verified Accounts™. Doing this right, and not simply favoring corporations over the little guy, is much costlier than my previous suggestion.

Honestly, if we implemented the first idea well, I’m not sure this would ever have to happen anytime soon.

Vouching

This idea is inspired by the concept of a Web of Trust with PGP which provides a decentralized approach to establishing the identity of the owner of a public key.

While the previous ideas help establish identity, we still don’t know if we can trust these people. Chances are, if someone has a well established identity they won’t want to smudge their reputation with malware. But what about folks without well established reputations?

We could implement a system of vouching. For example, suppose you trust me and I vouch for ten people. And they in turn vouch for ten people each. That’s a network of 111 potentially trustworthy people. Of course, each degree you move out, the level of trust declines. You probably trust me more than the people I trust. And those people more than the people they trust. And so on.
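
To make the arithmetic concrete, here’s a small sketch of walking a vouch graph outward from someone you trust. None of this exists in NuGet. The `trust_scores` function, the halving of trust at each degree, and the cutoff at two degrees are all assumptions I picked for illustration.

```python
from collections import deque


def trust_scores(vouches, root, decay=0.5, max_depth=2):
    """Breadth-first walk of the vouch graph starting from someone you trust.

    vouches maps a person to the people they vouch for. A person's score is
    decay ** distance, so it shrinks with every degree of separation.
    """
    scores = {root: 1.0}
    queue = deque([(root, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue
        for friend in vouches.get(person, []):
            if friend not in scores:  # first visit is the shortest path
                scores[friend] = decay ** (depth + 1)
                queue.append((friend, depth + 1))
    return scores


# The example from the text: I vouch for ten people, who each vouch for ten more.
vouches = {"me": [f"p{i}" for i in range(10)]}
for i in range(10):
    vouches[f"p{i}"] = [f"p{i}-{j}" for j in range(10)]

scores = trust_scores(vouches, "me")
```

With this graph, `scores` has 111 entries: me at 1.0, the ten people I vouch for at 0.5, and the hundred people they vouch for at 0.25.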

How do we use this information in NuGet?

It could be as simple as factoring it into sort order. For example, one factor in establishing trust in a package today is looking at the download count of a package. Chances are that a malware library is not going to get ten thousand downloads.

We could also incorporate the level of trust of the package owner into that sort order. For example, show me packages for sending emails in order of trust and download count.
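
As a sketch of what that sort could look like, assume every package owner already has some trust score between 0 and 1. The package ids, owners, numbers, and the `rank_packages` function below are all invented for illustration. The NuGet gallery has no such feature today.

```python
def rank_packages(packages, owner_trust):
    """Order packages by owner trust first, then by download count."""

    def sort_key(package):
        trust = owner_trust.get(package["owner"], 0.0)  # unknown owners score 0
        return (trust, package["downloads"])

    return sorted(packages, key=sort_key, reverse=True)


# Hypothetical search results for "send email".
packages = [
    {"id": "QuickMail", "owner": "mallory", "downloads": 15000},
    {"id": "PostalLite", "owner": "alice", "downloads": 900},
    {"id": "SendIt", "owner": "bob", "downloads": 4200},
    {"id": "MailBag", "owner": "alice", "downloads": 12000},
]
owner_trust = {"alice": 1.0, "bob": 0.5}

ranked = [package["id"] for package in rank_packages(packages, owner_trust)]
```

Note that QuickMail has the most downloads but lands last because nobody vouches for its owner. A real implementation would probably blend the two signals rather than sort strictly on trust first.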

Other attack vectors

So far, I’ve focused on establishing trust in the author of a package. But a package manager system has other attack vectors.

For example, the place where packages are stored could be hacked or the service itself could be hacked.

If Azure Blob storage was hacked, an attacker could swap out packages of trusted authors with untrusted materials. This is a real concern. NuGet.org luckily stores the hash of each package and presents it in the feed. The NuGet client verifies the contents before installing it on the user’s machine.

However, suppose the NuGet.org database was hacked. There is still a level of protection because any hash tampering would be caught by the clients.

An attacker would have to compromise both the Azure Blob Storage and the NuGet.org database.
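
The check itself is conceptually simple. This is a sketch of the idea rather than the NuGet client’s actual code. I’m assuming a base64-encoded SHA-512 digest here, and the function names and sample bytes are made up.

```python
import base64
import hashlib
import hmac


def package_hash(package_bytes):
    """Base64-encoded SHA-512 digest of the package contents."""
    digest = hashlib.sha512(package_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_package(package_bytes, expected_hash):
    """Compare the downloaded bytes against the hash published in the feed."""
    return hmac.compare_digest(package_hash(package_bytes), expected_hash)


original = b"pretend this is the contents of a .nupkg file"
feed_hash = package_hash(original)   # kept in the database, served in the feed

# An attacker swaps the file in blob storage but cannot touch the database.
tampered = original + b" plus something nasty"
```

`verify_package(original, feed_hash)` passes and `verify_package(tampered, feed_hash)` fails. That’s why the attacker needs to get at both stores to go unnoticed.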

Or worse, if the attacker compromises the machine that hosts NuGet, then it’s game over as they could corrupt the hashes and run code to pull packages from another location.

Mitigations of this nightmare scenario include having different credentials for Blobs and the database and constant security reviews of the NuGet code base.

Another thing we should consider is storing package hashes in packages.config so that Package Restore could at least verify packages during a restore in this nightmare scenario. But this wouldn’t solve the issue with installing new packages.
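
To be clear, packages.config has no hash attribute today. The sketch below invents a `sha512` attribute purely to show what a restore-time check could look like. The package ids and the `failed_restores` function are hypothetical as well.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def sha512_base64(data):
    return base64.b64encode(hashlib.sha512(data).digest()).decode("ascii")


def failed_restores(config_xml, downloaded):
    """Return ids of packages whose restored bytes do not match the pinned hash.

    config_xml: packages.config text carrying a HYPOTHETICAL sha512 attribute.
    downloaded: maps (id, version) to the bytes fetched during restore.
    """
    failures = []
    for node in ET.fromstring(config_xml).iter("package"):
        pinned = node.get("sha512")
        data = downloaded.get((node.get("id"), node.get("version")))
        # A missing pin or missing download counts as a failure, not a pass.
        if pinned is None or data is None or sha512_base64(data) != pinned:
            failures.append(node.get("id"))
    return failures


mailer_bytes = b"known-good mailer package"
logger_bytes = b"known-good logger package"

# Hashes recorded at install time, when the packages were first trusted.
config = (
    "<packages>\n"
    f'  <package id="Example.Mailer" version="1.0.0" sha512="{sha512_base64(mailer_bytes)}" />\n'
    f'  <package id="Example.Logger" version="2.1.0" sha512="{sha512_base64(logger_bytes)}" />\n'
    "</packages>"
)

restored = {
    ("Example.Mailer", "1.0.0"): mailer_bytes,
    ("Example.Logger", "2.1.0"): b"something else entirely",  # swapped on the server
}
bad = failed_restores(config, restored)
```

Because the pinned hash lives in your own source control, a compromised server can’t rewrite it. The restore flags Example.Logger even if the feed’s own hash has been corrupted to match.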

PowerShell Scripts

NuGet makes use of PowerShell scripts to perform useful tasks not covered by a typical package.

A lot of folks get worried about this as an attack vector and want a way to disable these scripts. There are definitely bad things that could happen and I’m not opposed to having an option to disable them, but this only gives a false sense of security. It’s security theater.

Why’s that you say? Well a package with only assemblies can still bite you through the use of Module Initializers.

Modules may contain special methods called module initializers to initialize the module itself.

All modules may have a module initializer. This method shall be static, a member of the module, take no parameters, return no value, be marked with rtspecialname and specialname, and be named .cctor.

There are no limitations on what code is permitted in a module initializer. Module initializers are permitted to run and call both managed and unmanaged code.

The module’s initializer method is executed at, or sometime before, first access to any types, methods, or data defined in the module

If you’re installing a package, you’re about to run some code with or without PowerShell scripts. The proper mitigation is to stop running your development environment as an administrator and make sure you trust the package author before you install the package.
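
This isn’t unique to .NET. As a loose analogy, and not a demonstration of module initializers themselves, here’s the same effect in Python, where a module’s top-level code runs the moment it’s first imported. The `harmless_lib` module and its contents are made up for the example.

```python
import importlib
import os
import sys
import tempfile

# A "library" that looks harmless: it only offers a helper function, yet its
# top-level code runs as soon as the module is first imported.
LIBRARY_SOURCE = '''
import os
os.environ["HARMLESS_LIB_LOADED"] = "1"   # stands in for arbitrary code

def send_email(to):
    return f"sent to {to}"
'''

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "harmless_lib.py"), "w") as handle:
        handle.write(LIBRARY_SOURCE)
    sys.path.insert(0, folder)
    importlib.invalidate_caches()
    try:
        lib = importlib.import_module("harmless_lib")  # no script was "run"
    finally:
        sys.path.remove(folder)

side_effect_ran = os.environ.get("HARMLESS_LIB_LOADED") == "1"
```

Nobody ever ran an install script, yet `side_effect_ran` is true. Merely loading the library was enough to run its code.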

At least with NuGet, when you install a package it doesn’t require elevation. If you install an MSI, you’d typically have to elevate privileges.

Addendum: Package Signing is not the answer

Every time I talk about NuGet security, someone gets irate and demands that we implement signing immediately as if it were some magic panacea. I’m definitely not against implementing package signing, but let’s be clear. It is a woefully inadequate solution in and of itself and there are a lot of better things we should do first, as I’ve already outlined in this post.

The Cost and Ubiquity Problem

Very few people will sign their packages. Ruby Gems supports package signing and I’ve been told the number that take advantage of it is nearly zero. Visual Studio Extensions also supports package signing. Quick, go look at your list of installed extensions. Were any unsigned?

The problem is this: if you require certificate signing, you’ve just created too much friction to create a package and the package manager ecosystem will dry up and die. Requiring signing is just not an option.

The reason is that obtaining and properly signing software with a certificate is a costly proposition by its very nature. A certificate implies that some authority has verified your identity. For that verification to have value, it must be somewhat reliable and thorough. It’s not going to be immediate and easy or bad agents could easily do it.

Package signing is only a good solution if you can guarantee near ubiquity. Otherwise you still need alternative solutions.

The User Interface Problem

Once you allow package signing, you then have the user interface problem. Visual Studio Extensions is an interesting example of this conundrum. You only see that a package is digitally signed after you’ve downloaded and decided to install it. At that point, you tend to be committed already.

vs-extension-gallery

Also notice that the message that this package isn’t signed is barely noticeable.

Ok, so it’s not signed. What can I do about it other than probably install it anyway because I really want this software? Whether or not a package is signed doesn’t change my behavior in any way.

Visual Studio could put more dire looking warnings, but it would alienate the community of extension authors by doing so. It could require signing, but that would put onerous restrictions on creating packages and would cause the community of signed packages to wither away, leaving only packages sponsored by corporations.

The point here is that even with signed packages, there’s not much it would do for NuGet. Perhaps we could support a mode where it gave a more dire warning or even disallowed unsigned packages, but that’d just be annoying and most people would never use that mode because the selection of packages would be too small.

The only benefit in this case of signing is that if a package did screw something up, you could probably chase down the author if they signed it. But that’s only a benefit if you never install unsigned packages. Since most people won’t sign them, this isn’t really a viable way to live.

Conclusion

Just to be clear. I’m actually in favor of supporting package signing eventually. But I do not support requiring package signing to make it into the NuGet gallery. And I think there are much better approaches we can take first to mitigate the risk of using NuGet before we get to that point.

I worry that implementing signing just gives a false sense of security and we need to consider all the various ways that people can establish trust in packages and package authors.

git, github, code

The other day I needed a simple JSON parser for a thing I worked on. Sure, I’m familiar with JSON.NET, but I wanted something I could just compile into my project. The reason why is not important for this discussion (but it has to do with world domination, butterflies, and minotaurs).

I found the SimpleJson package which is also on GitHub.

SimpleJson takes advantage of a neat little feature of NuGet that allows you to include source code in a package and have that code transformed into the appropriate namespace for the package target. Oftentimes, this is used to install sample code or the like into a project. But SimpleJson uses it to distribute the entire library.

At first glance, this is a pretty sweet way to distribute a small single source file utility library. It gets compiled into my code. No binding redirects to worry about. No worries about different versions of the same library pulled in by dependencies. In my particular case, it was just what I needed.

But I started to think about the implications of such an approach on a wider scale. What if everybody did this?

The Update Problem

If such a library were used by multiple packages, it actually could limit the consumer’s ability to update the code.

For example, suppose I have a project that installs the SimpleJson package and also the SimpleOtherStuff package, where SimpleOtherStuff has a dependency on SimpleJson 1.0.0 and higher. The following diagram outlines the NuGet package dependency graph. It’s very simple.

nuget-dependency-graph

Now suppose we learn that SimpleJson 1.0.0 has a very bad security issue and we need to upgrade to the just released SimpleJson 1.1.

So we do just that. Everything should be hunky dory as we’re now using SimpleJson 1.1 everywhere. Or are we?

nuget-dependency-graph-2

If all the references to SimpleJson were assembly references, we’d be fine. But recall, it’s a source code package. Even though we upgraded it in our application, SimpleOtherStuff 1.0.0 has SimpleJson 1.0.0 compiled into it.

There’s no way to upgrade SimpleOtherStuff’s reference other than to wait for the package author to do it or to manually recompile it ourselves (assuming the source is available).

You Are in Control

A guiding principle in the design of NuGet is that we try to keep you, the consumer of the packages, in control of things. Want to uninstall a package even though other packages reference it? We’ll prevent it by default but then offer you a -Force flag so you can tell NuGet, “No really, I know what I’m doing here and am ready to face the consequences.”

We don’t do this perfectly in every case. Pre-release packages come to mind. But it’s a principle we try to follow.

Source code packages are interesting in that they give you more control in one area (you have the source), but take it away in another (upgrades are no longer complete).

Note that I’m not picking on SimpleJson. As I said before, I really needed this. In fact, I contributed back with several Pull Requests. I’m just pointing out a caveat to consider when using such packages.

Making it Better

So yeah, be careful. There are caveats. But couldn’t we make this better? Well I have an idea. Ok, it’s not my idea but an idea that some of my coworkers and I have bounced around for a while.

Imagine if you could attach a Git repository to your NuGet package. When you install the package, you could add a flag to install it as a Git Submodule rather than the normal assembly approach. Maybe it’d look like this.

Install-Package SimpleJson -AsSource

What this would do is initialize a submodule, and grab the source from GitHub. Perhaps it goes further and adds the files as linked files into your target project based on a bit of configuration in the source tree.

There are a lot of possibilities here to flesh out. The Update-Package command could simply run a Git submodule update command on these submodules and do a normal update for all the other packages.

Since Microsoft recently made it clear that Git is the future of DVCS as far as Microsoft is concerned, maybe now is the time to think about tighter integration with NuGet. What do you think?

At the very least, perhaps NuGet needs a better extensibility model so we could build this support in outside of NuGet. That’s the more prudent approach of course, but I’m not feeling so prudent today.

code

Today I learned something new and I love that!

I was looking at some code that looked like this:

try
{
    await obj.GetSomeAsync();
    Assert.True(false, "SomeException was not thrown");
}
catch (SomeException)
{
}

That’s odd. We’re using xUnit. Why not use the Assert.Throws method? So I tried with the following naïve code.

Assert.Throws<SomeException>(() => await obj.GetSomeAsync());

Well that didn’t work. I got the following helpful compiler error:

error CS4034: The ‘await’ operator can only be used within an async lambda expression. Consider marking this lambda expression with the ‘async’ modifier.

Oh, I never really thought about applying the async keyword to a lambda expression, but it makes total sense. So I tried this:

Assert.Throws<SomeException>(async () => await obj.GetSomeAsync());

Hey, that worked! I rushed off to tell the internets on Twitter.

But I made a big mistake. That only made the compiler happy. It doesn’t actually work. It turns out that Assert.Throws takes in an Action and thus that expression doesn’t return a Task to be awaited upon. Stephen Toub explains the issue in this helpful blog post, Potential pitfalls to avoid when passing around async lambdas.
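Here’s a minimal sketch of the difference, using plain delegates rather than xUnit. The `AsyncLambdaDemo` class and its `Run`, `RunAsync`, and `Compare` methods are made up for this illustration. The point is that a method taking an Action returns at the lambda’s first await and has nothing to wait on, while a method taking a Func&lt;Task&gt; hands back a Task that can be awaited.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo type; the names here are illustrative only.
public static class AsyncLambdaDemo
{
    // An async lambda passed here becomes "async void":
    // the caller gets nothing back to await.
    public static void Run(Action action)
    {
        action();
    }

    // An async lambda passed here returns a Task the caller can await.
    public static Task RunAsync(Func<Task> func)
    {
        return func();
    }

    public static async Task<(bool AfterAction, bool AfterFunc)> Compare()
    {
        var gate = new TaskCompletionSource<bool>();
        bool actionFinished = false;

        // Returns as soon as the lambda hits its first await.
        Run(async () => { await gate.Task; actionFinished = true; });
        bool afterAction = actionFinished; // lambda has not finished yet

        gate.SetResult(true); // let the async void lambda complete

        bool funcFinished = false;
        // Awaiting the returned Task waits for the whole lambda.
        await RunAsync(async () => { await Task.Delay(10); funcFinished = true; });

        return (afterAction, funcFinished);
    }
}
```

`Compare` reports that the lambda had not finished when `Run` returned, but had finished once `RunAsync` was awaited. An exception thrown after the first await behaves the same way: it never reaches the caller of the Action version.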

Ah, I’m gonna need to write my own method that takes in a Func<Task>. Let’s do this!

I wrote the following:

public async static Task<T> ThrowsAsync<T>(Func<Task> testCode)
      where T : Exception
{
  try
  {
    await testCode();
    Assert.Throws<T>(() => { }); // Use xUnit's default behavior.
  }
  catch (T exception)
  {
    return exception;
  }
  // Never reached. Compiler doesn't know Assert.Throws above always throws.
  return null;
}

Here’s an example of a unit test (using xUnit) that makes use of this method.

[Fact]
public async Task RequiresBasicAuthentication()
{
  await ThrowsAsync<SomeException>(async () => await obj.GetSomeAsync());
}

And that works. I mean it actually works. Let me know if you see any bugs with it.

Note that you have to change the return type of the test method (fact) from void to return Task and mark it with the async keyword as well.

So as I was posting all this to Twitter, I learned that Brendan Forster (aka @ShiftKey) already built a library that has this type of assertion. But it wasn’t on NuGet so he’s dead to me.

But he remedied that five minutes later.

Install-Package AssertEx.

So we’re all good again.

If I were you, I’d probably just go use that. I just thought this was an enlightening look at how await works with lambdas.

personal

Back in March of last year, Stephen Wolfram wrote a blog post, The Personal Analytics of My Life. It’s a fascinating look at the data he’s accumulated over years about his own personal activities and habits such as daily incoming and outgoing email.

Since I read that, I’ve been fascinated about the idea of how personal data analytics might prove useful to me. It turns out I found an application to my health.

In my series on The Real Pain of Software Development (part1 and part2), I talked about my history with pain related to work and the various measures I took to remedy that pain including intense physical and occupational therapy.

What I neglected to mention was how much of a difference a bit of weight loss makes. A single pound reduces the force on your joints, back, and other muscles an immense amount over the course of a day, week, or years. I am now more aware of how the pain I feel ebbs and flows as my weight does.

Sometime last year, GitHub gave all of us a Fitbit. It’s a little device that tracks the number of steps you take during the day. It can also track vertical distance changes and if you’re diligent, how much sleep you get. It posts all the data online so you can take a look at your numbers and compare to friends.

It wasn’t long before my co-workers hooked it up to Hubot (just another example of how chat is important to us and is at the center of how we work). Here’s a screenshot of the /fitbit me command which shows the leaderboards for step counts in the past 7 days. I blocked out names for privacy reasons, though I bet the top four would love to be unredacted.

fitbit-me

What I love about this is that it adds an element of friendly competition to the mix. There’s absolutely NOTHING riding on this other than pride. Yet, it’s amazing how motivating this is. I now take a long walk to get coffee every day because I want to be up there near the top. And there are no downsides to that. Maybe it’s the wrong motivation, but it’s definitely the right result. The evidence for this result is in the following.

I also happened to purchase the FitBit Aria Wi-Fi scale because I love me them stats. As you can see from the graph, the /fitbit me motivation is effective.

2012-now-weight

Unfortunately, I injured my knee snowboarding a couple weeks ago and have been sick the past few days so it’s starting to trend slightly up. But since I’ve had the scale, I’ve lost 6.3 lbs overall. This isn’t only due to the FitBit. We recently started a single subscription to BistroMD to supplement our cooking efforts since we both work and I work from home. I’ve found portion control a lot easier when it’s controlled for you.

FitBit can also track sleep, but this is not as automatic as step counting. You have to remember to wear it at night and put it in sleep mode. It’s a bit high maintenance.

2012-sleep-over-time

Also, it seems to count nights that you forget to set it as a day with 0 hours of sleep, which is not really what I want. Based on this graph you might incorrectly assume I averaged from 3 to 6 hours of sleep a night over the past year. Clearly that’s not correct or I’d be a raging hallucinatory lunatic by now. I know, some of you think I am such a beast, but if so it’s for other reasons, not due to lack of sleep.

I hope to more diligently track my sleep patterns this year to see if there are any interesting correlations between the amount of sleep I get, my fitness level, my mood, and my GitHub contributions graph. Heh.

One thing I wish FitBit did better is provide better graphs for seeing my step counts over time. For example, here’s what I could find to see my step counts over the past 30 days.

steps-over-last-30-days

I’d love to see a graph of my steps over the year in fine detail.

Speaking of step counts, see that big spike in January? That’s the GitHub Winter Summit, our all-company meeting in San Francisco. The way that spikes might give you the impression that we’re all health freaks and the summit is a very physical endeavor. Well, maybe.

The following is a breakdown of a couple of those days.

2013-01-17-steps
January 17, 2013: 23,285 steps (approx. 10.69 miles)

2013-01-18-steps
January 18, 2013: 24,512 steps (approx. 11.35 miles)

Notice the hours of the big spikes. Yeah, we like to dance late at night.

Notice the big midday bump on the second day? That was a scavenger hunt that took us all around the great city of San Francisco.

All in all, I’m pretty happy with the FitBit and like how the data driven lifestyle it encourages has been a net positive for me. Your mileage may vary of course.

One downside to the FitBit is that it’s easy to lose. It’s easy to forget about it and launder it. It’s somewhat easy to break. Mine’s survived so far, but not without a chip or two. Also, it requires charging almost every day or at least every other day. Also, I’m not aware of a way to get the absolute raw data even with the premium account.

I’d be willing to pay money to get the raw data and create my own graphs. FitBit isn’t the only personal fitness tracker out there, but it’s the only one I’ve tried and I’m a big fan. I wouldn’t mind trying others, but much like the network effects of other social networks, the fact that many of my friends and co-workers are all on FitBit will keep me tied to it for now.

UPDATE: Looks like there is a FitBit API. I’ll have to play around with it. Thanks to @geeksmeetgirl for pointing it out to me.

code

I love automation. I’m pretty lazy by nature and the more I can offload to my little programmatic or robotic helpers the better. I’ll be sad the day they become self-aware and decide that it’s payback time and enslave us all.

But until that day, I’ll take advantage of every bit of automation that I can.

the-matrix-humans

For example, I’m a big fan of the Code Analysis tool built into Visual Studio. It’s more commonly known as FxCop, though judging by the language I hear from its users I’d guess its street name is “YOU BIG PILE OF NAGGING SHIT STOP WASTING MY TIME AND REPORTING THAT VIOLATION!”

Sure, it has its peccadilloes, but with the right set of rules, it’s possible to strike a balance between a total nag and a helpful assistant.

As developers, it’s important for us to think hard about our code and take care in its crafting. But we’re all fallible. In the end, I’m just not smart enough to remember ALL the possible pitfalls of coding ALL OF THE TIME, such as avoiding the Turkish I problem when comparing strings. If you are, more power to you!
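In case the Turkish I problem is new to you, here’s a minimal sketch of it. The `TurkishIDemo` class is made up for illustration, and the culture-sensitive result assumes the machine actually has Turkish culture data available (a runtime configured for invariant globalization would behave differently).

```csharp
using System;
using System.Globalization;

// Hypothetical demo type for the Turkish I problem.
public static class TurkishIDemo
{
    // Fragile: the result depends on the casing rules of the culture.
    // Under tr-TR, lowercase "i" uppercases to a dotted capital I,
    // so "file" and "FILE" are not expected to compare equal here.
    public static bool CultureSensitiveEquals(string a, string b, CultureInfo culture)
    {
        return a.ToUpper(culture) == b.ToUpper(culture);
    }

    // Robust for identifiers, file names, keys, and the like:
    // ordinal comparison ignores the current culture entirely.
    public static bool OrdinalEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

Calling `CultureSensitiveEquals("file", "FILE", new CultureInfo("tr-TR"))` is the kind of comparison the rule warns about, while `OrdinalEquals("file", "FILE")` is true no matter where the code runs.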

I try to keep the number of rules I exclude to a minimum. It’s saved my ass many times, but it’s also strung me out in a harried attempt to make it happy. Nothing pleases it. Sure, when it gets ornery, it’s easy to suppress a rule. I try hard to avoid that because suppressing one violation sometimes hides another.

Here’s an example of a case that confounded me today. The following very straightforward looking code ran afoul of a code analysis rule.

public sealed class Owner : IDisposable
{
    Dependency dependency;

    public Owner()
    {
        // This is going to cause some issues.
        this.dependency = new Dependency { SomeProperty = "Blah" };
    }

    public void Dispose()
    {
        this.dependency.Dispose();
    }
}

public sealed class Dependency : IDisposable
{
    public string SomeProperty { get; set; }
        
    public void Dispose()
    {
    }
}

Code Analysis reported the following violation:

CA2000 Dispose objects before losing scope: In method ‘Owner.Owner()’, object ‘<>g__initLocal0’ is not disposed along all exception paths. Call System.IDisposable.Dispose on object ‘<>g__initLocal0’ before all references to it are out of scope.

That’s really odd. As you can see, dependency is disposed when its owner is disposed. So what’s the deal?

Can you see the problem?

A Funny Thing about Object Initializers

A2600_Pitfall

The big clue here is the name of the variable that’s not disposed, <>g__initLocal0. As Phil Karlton once said, emphasis mine,

There are only two hard things in Computer Science: cache invalidation and naming things.

Naming may be hard, but I can do better than that. Clearly the compiler came up with that name, not me. I fired up Reflector to see the generated code. The following is the constructor for Owner.

public Owner()
{
    Dependency <>g__initLocal0 = new Dependency {
        SomeProperty = "Blah"
    };
    this.dependency = <>g__initLocal0;
}

Aha! So we can see that the compiler generated a temporary local variable to hold the initialized object while its properties are set, before assigning it to the member field.

So what’s the problem? Well if for some reason setting SomeProperty throws an exception, <>g__initLocal0 will never be disposed. That’s what Code Analysis is complaining about. Note that if an exception is thrown while setting that property, my member field is also never set to the instance. So it’s a dangling undisposed instance.
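Here’s a minimal sketch that shows the field really is left unassigned. `FussyDependency` and `InitializerDemo` are hypothetical stand-ins for the classes above, with a setter rigged to throw on null.

```csharp
using System;

// Hypothetical variant of Dependency whose setter can throw.
public sealed class FussyDependency : IDisposable
{
    string someProperty;

    public string SomeProperty
    {
        get { return someProperty; }
        set
        {
            if (value == null) throw new ArgumentNullException(nameof(value));
            someProperty = value;
        }
    }

    public void Dispose()
    {
    }
}

public static class InitializerDemo
{
    static FussyDependency field;

    public static bool FieldAssignedAfterThrowingInitializer()
    {
        field = null;
        try
        {
            // The compiler builds the object in a hidden local first and
            // only assigns it to the field after every setter succeeds.
            field = new FussyDependency { SomeProperty = null };
        }
        catch (ArgumentNullException)
        {
            // An instance was created, but we hold no reference to dispose.
        }
        return field != null;
    }
}
```

`FieldAssignedAfterThrowingInitializer` returns false: the instance exists somewhere on the heap, but nothing we can reach points at it.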

So what’s the Plan Stan?

Well the fix to keep code analysis happy is simple in this case.

public Owner()
{
    this.dependency = new Dependency();
    this.dependency.SomeProperty = "Blah";
}

Don’t use the initializer and set the property the old fashioned way.

This shuts up CodeAnalysis, but did it really solve the problem? Not in this specific case because we happen to be inside a constructor. If the Owner constructor throws, nobody is going to dispose of the dependency.

As Greg Beech wrote so long ago,

From this we can ascertain that if the object is not constructed correctly then the reference to the object will not be assigned, which means that no methods can be called on it, so the Dispose method cannot be used to deterministically clean up managed resources. The implication here is that if the constructor creates expensive managed resources which need to be cleaned up at the earliest opportunity then it should do so in an exception handler within the constructor as it will not get another chance.

So a more robust approach would be the following.

public Owner()
{
    this.dependency = new Dependency();
    try
    {
        this.dependency.SomeProperty = "Blah";
    }
    catch (Exception)
    {
        dependency.Dispose();
        throw;
    }       
}

This way, if setting the properties of Dependency throws an exception, we can dispose of it properly.

Why isn’t the compiler smarter?

I’m not the first to run into this pitfall with object initializers and disposable instances. Ayende wrote about a related issue with using blocks and object initializers back in 2009. In that post, he suggests the compiler should generate safe code for this scenario.

It’s an interesting question. Whenever I think of such questions, I put on my Eric Lippert hat and hear his proverbial voice (I’ve never heard his actual voice but I imagine it to be sonorous and profound) in my head saying:

I’m often asked why the compiler does not implement this feature or that feature, and of course the answer is always the same: because no one implemented it. Features start off as unimplemented and only become implemented when people spend effort implementing them: no effort, no feature. This is an unsatisfying answer of course, because usually the person asking the question has made the assumption that the feature is so obviously good that we need to have had a reason to not implement it. I assure you that no, we actually don’t need a reason to not implement any feature no matter how obviously good. But that said, it might be interesting to consider what sort of pros and cons we’d consider if asked to implement the “silently put inferred constraints on class type parameters” feature.

The current implementation of object initializers seems correct for most cases. The only time it breaks down is in the case of disposable types. So let’s think about some possible solutions.

Why the intermediate variable?

First, let’s look at why there’s an intermediate local variable at all. My initial knee-jerk reaction (ever notice how often your knee-jerk reaction makes you sound like a jerk?) was that the intermediate variable is unnecessary. But I thought about it some more and came up with a scenario. Suppose we’re setting a property to the value of an object created via an initializer.

this.SomePropertyWithSideEffects = new Dependency { Prop = 42 };

The way to do this without an intermediate local variable is the following.

this.SomePropertyWithSideEffects = new Dependency();
this.SomePropertyWithSideEffects.Prop = 42;

The first code block only calls the setter of SomePropertyWithSideEffects. But the second code block calls both the getter and setter. That’s pretty different behavior.

Now imagine we’re setting multiple properties or using a collection initializer with multiple items instead. We’d be calling that property getter multiple times. Who knows what awful side-effects that might produce. Sure, side effects in property getters are bad, but as I’ll point out later, there’s another reason this approach is fraught with error.

The intermediate local variable is necessary to ensure the object is only assigned after it’s fully constructed.
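A minimal sketch that counts the property accesses makes the difference visible. `Widget` and `Host` are hypothetical types invented for this illustration.

```csharp
using System;

// Hypothetical types used only to count property accesses.
public sealed class Widget
{
    public int Prop { get; set; }
}

public sealed class Host
{
    Widget widget;

    public int GetCalls;
    public int SetCalls;

    // Stand-in for SomePropertyWithSideEffects: it records every access.
    public Widget Tracked
    {
        get { GetCalls++; return widget; }
        set { SetCalls++; widget = value; }
    }

    // Object initializer: one setter call, no getter calls.
    public void WithInitializer()
    {
        Tracked = new Widget { Prop = 42 };
    }

    // Manual version: the setter runs once, then the getter runs
    // once for each property assigned through it.
    public void WithoutInitializer()
    {
        Tracked = new Widget();
        Tracked.Prop = 42;
    }
}
```

After `WithInitializer`, `GetCalls` is 0 and `SetCalls` is 1. After `WithoutInitializer` on a fresh `Host`, both are 1, and `GetCalls` grows with every additional property you set that way.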

Dispose it for me?

So given that, let’s try implementing the Owner constructor of my previous example the way a compiler might do it.

public Owner()
{
    var <>g__initLocal0 = new Dependency();
    try
    {
        <>g__initLocal0.SomeProperty = "Blah";
    }
    catch (Exception)
    {
        <>g__initLocal0.Dispose();
        throw;
    }
    this.dependency = <>g__initLocal0;
}

That certainly seems much safer, but there’s still a potential flaw. It’s optimistically calling dispose on the object when the exception is thrown. What if I didn’t want to call dispose on it even though it’s disposable? Maybe the Dispose method of this specific object deletes your hard drive and plays Justin Bieber music when invoked. In 99.9 times out of 100, you would want Dispose called in this case. But this is still a change in behavior and I can understand why the compiler might not risk it.

Perhaps the compiler could attempt to figure out if that instance eventually gets disposed and do the right thing. All you have to do is find a flaw in Turing’s proof of the Halting Problem. No problem, right?

Perhaps we could be satisfied with good enough. Dispose it always and just say that’s the behavior of object initializers. It’s probably too late for that change as that’d be a breaking change. It’d be one I could live with honestly.

Let me dispose it

Perhaps the problem isn’t that we want the compiler to automatically dispose of the intermediate object in the case of an exception. What we really want is the assignment to happen no matter what so we can dispose of it in our code if an exception is thrown. Perhaps the compiler can generate code that would allow us to do this in our code.

public Owner()
{
    try
    {
        this.dependency = new Dependency { SomeProperty = "Blah" };
    }
    catch (Exception)
    {
        // Only works if the field was assigned before the setter threw.
        if (this.dependency != null)
            this.dependency.Dispose();
        throw;
    }
}

What might the generated code look like in this case?

public Owner()
{
    var <>g__initLocal0 = new Dependency();
    this.dependency = <>g__initLocal0;
    <>g__initLocal0.SomeProperty = "Blah";
}

That’s not too shabby. We got rid of the try/catch block that we had to introduce previously, and if an exception is thrown in the property setter, we can clean up after it. I’m a genius!

Not so fast Synkowski. There’s a potential problem here. Suppose we’re not inside a constructor, but rather are in a method that’s setting a shared member.

public void DoStuff()
{
    var <>g__initLocal0 = new Dependency();
    this.dependency = <>g__initLocal0;
    <>g__initLocal0.SomeProperty = "Blah";
}

We’ve now introduced a possible race condition if this method is used in an async or multithreaded environment.

Notice that after this.dependency is set to the local incomplete instance, but before the local property is set, there’s room for another thread to modify this.dependency in some way right in that gap leading to indeterminate results. That’s definitely a change you wouldn’t want the compiler doing.

In fact, this same issue affects my earlier proposal of not using an intermediate variable.

So about that Code Analysis

Note that I didn’t specifically address Ayende’s case. In his case, the initializer is in a using block. That seems like a more tractable problem for the compiler to solve, but this post is getting long as it is and it’s time to wrap up. Maybe someone else can analyze that case for shits and giggles.
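For the curious, here’s a minimal sketch of that using block case as I understand it. `TrackedResource` and `UsingInitializerDemo` are hypothetical names, and the setter is rigged to throw on null. The initializer runs before the using statement’s implicit try/finally begins, so a throwing setter skips Dispose entirely.

```csharp
using System;

// Hypothetical disposable that counts how often Dispose is called.
public sealed class TrackedResource : IDisposable
{
    public static int DisposeCount;

    string name;

    public string Name
    {
        get { return name; }
        set
        {
            if (value == null) throw new ArgumentNullException(nameof(value));
            name = value;
        }
    }

    public void Dispose()
    {
        DisposeCount++;
    }
}

public static class UsingInitializerDemo
{
    // The setter throws before the using block's try/finally starts.
    public static int DisposalsWithInitializer()
    {
        TrackedResource.DisposeCount = 0;
        try
        {
            using (var resource = new TrackedResource { Name = null })
            {
            }
        }
        catch (ArgumentNullException)
        {
        }
        return TrackedResource.DisposeCount;
    }

    // The setter throws inside the block, so the finally disposes.
    public static int DisposalsWithSeparateAssignment()
    {
        TrackedResource.DisposeCount = 0;
        try
        {
            using (var resource = new TrackedResource())
            {
                resource.Name = null;
            }
        }
        catch (ArgumentNullException)
        {
        }
        return TrackedResource.DisposeCount;
    }
}
```

`DisposalsWithInitializer` returns 0 and `DisposalsWithSeparateAssignment` returns 1. Moving the property assignment inside the block is the manual workaround.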

In my case, we’re setting a member that we plan to dispose later. That’s a much harder (if not impossible) nut to crack.

And here we get to the moral of the story. I get a lot more work done when I don’t stop every hour to write a blog post about some interesting bug I found in my code.

No wait, that’s not it.

The point here is that code analysis is a very helpful tool for writing more robust and correct code. But it’s just an assistant. It’s not a safety net. It’s more like an air bag. It’ll keep you from splattering your brains on the dashboard, but you can still totally wreck your car and break that nose if you’re not careful at the wheel.

Here’s an example where automated tools can both lead you into an accident and save your butt at the last second.

If you use Resharper (another tool with its own automated analysis) like I do and you write some code in a constructor that doesn’t use an object initializer, you’re very likely to see this (with the default settings).

resharper-nag

See that green squiggly under the new keyword just inviting, no begging, you to hit ALT+ENTER and convert that bad boy into an object initializer? Go ahead, it seems to suggest. What could go wrong? Oh it could cause you to now leak a resource as pointed out earlier.

I often like to hit CTRL E + CTRL C in Resharper to reformat my entire source file to be consistent with my coding standards. Depending on how you set up the reformatting, such an automatic action could easily change this code from working code to subtly broken code.

I still have to pay careful attention to what it’s doing. It’s easy to get lulled into a sense of safety when performing automatic refactorings. But you can’t trust it one hundred percent. You are the one who is responsible, not the tools. You are the one in control.

Fortunately in this case, Code Analysis brought this issue to my attention. And in doing so, pointed out an interesting topic for a blog post. Yay automation!

code

Tony Hoare, the computer scientist who implemented null references in ALGOL, calls it his “billion-dollar mistake.”

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

It may well be that a billion is a vast underestimate. But if you’re going to make a mistake, might as well go big. Respect!

To this day, we pay the price with tons of boilerplate code. For example, it’s generally good practice to add guard clauses for each potentially null parameter to a public method.

public void SomeMethod(object x, object y) {
  // Guard clauses
  if (x == null)
    throw new ArgumentNullException("x");
  if (y == null)
    throw new ArgumentNullException("y");

  // Rest of the method...
}

While it may feel like unnecessary ceremony, Jon Skeet gives some good reasons why guard clauses like this are a good idea in this StackOverflow answer:

Yes, there are good reasons:

  • It identifies exactly what is null, which may not be obvious from a NullReferenceException
  • It makes the code fail on invalid input even if some other condition means that the value isn’t dereferenced
  • It makes the exception occur before the method could have any other side-effects you might reach before the first dereference
  • It means you can be confident that if you pass the parameter into something else, you’re not violating their contract
  • It documents your method’s requirements (using Code Contracts is even better for that of course)

I agree. The guard clauses are needed, but it’s time for some Real Talk™. This is shit work. And I hate shit work.
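To see a couple of those reasons in action, here’s a minimal sketch. The `Guard` helper and the `Greeter` class are hypothetical names made up for this illustration. It shows the exception naming the offending parameter and firing before any side effect happens.

```csharp
using System;

// Hypothetical helper that trims the guard clause to one line.
public static class Guard
{
    public static T NotNull<T>(T value, string parameterName) where T : class
    {
        if (value == null) throw new ArgumentNullException(parameterName);
        return value;
    }
}

// Hypothetical class with a side effect we don't want on bad input.
public sealed class Greeter
{
    public int Calls { get; private set; }

    public string Greet(string name, string greeting)
    {
        // Guard clauses run first and name the offending parameter.
        Guard.NotNull(name, nameof(name));
        Guard.NotNull(greeting, nameof(greeting));

        Calls++; // side effect happens only after validation
        return greeting + ", " + name;
    }
}
```

Calling `Greet("Phil", null)` throws an ArgumentNullException whose ParamName is "greeting", and `Calls` stays at 0. It’s less typing than the full if statements, but I still have to remember to write it for every parameter.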

In this post,

  • I’ll explain the idea of non-nullable parameters and why I didn’t use CodeContracts in the hopes that heads off the first 10 comments asking “why didn’t you use CodeContracts dude?”
  • I’ll cover an approach using PostSharp to automatically validate null arguments.
  • I’ll then explain how I hope to create an even better approach.

Stick with me.

Non Null Parameters

With .NET languages such as C#, there’s no way to prevent a caller of a method from passing in a null value to a reference type argument. Instead, we simply end up having to validate the passed in arguments and ensure they’re not null.

In practice (at least with my code), the number of times I want to allow a null value is far exceeded by the number of times a null value is not valid. What I’d really like to do is invert the model. By default, a parameter cannot be null unless I explicitly say it can. In other words, make allowing null opt-in rather than opt-out as it is today.

I recall that there was some experimentation around this by Microsoft with the Spec# language that introduced a syntax to specify that a value cannot be null. For example…

public void Foo(string! arg);

…defines the argument to the method as a non-nullable string. The idea is this code would not compile if you attempt to pass in a null value for arg. It’s certainly not a trivial change as Craig Gidney writes in this post. He covers many of the challenges in adding a non-nullable syntax and then goes further to provide a proposed solution.

C# doesn’t have such a syntax, but it does have Code Contracts. After reading up on it, I really like the idea, but for me it suffers from one fatal flaw. There’s no way to apply a contract globally and then opt-out of it in specific places. I still have to apply the Contract calls to every potentially null argument of every method. In other words, it doesn’t satisfy my requirement to invert the model and make allowing null opt in rather than opt out. It’s still shit work. It’s also error-prone and I’m too lazy a bastard to get it right in every case.
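To make the repetition concrete, here is a minimal sketch of per-argument contracts. The OrderFormatter class and its method are hypothetical names made up for illustration. Contract.Requires is the real System.Diagnostics.Contracts API, but note that it is only enforced when the Code Contracts tooling is enabled for the build; without it, the calls are compiled away.

```csharp
using System;
using System.Diagnostics.Contracts;

// Hypothetical class, for illustration only.
public class OrderFormatter
{
    public string Describe(string customer, string product)
    {
        // One Contract.Requires call per reference-type argument.
        // Nothing here lets me say "non-null by default" once for the
        // whole class or assembly and then opt out where null is fine.
        Contract.Requires(customer != null);
        Contract.Requires(product != null);

        return customer + " ordered " + product;
    }
}
```

Multiply those two lines by every public method in a code base and you get the ceremony I'm complaining about.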

IL Rewriting to the Rescue

So I figured I’d go off the deep end and experiment with Intermediate Language (IL) weaving with PostSharp to insert guard clauses automatically. Usually, any time I think about rewriting IL, I take a hammer to my head until the idea goes away. A few good whacks is plenty. However in this case, I thought it’d be a fun experiment to try. Not to mention I have a very hard head.

I chose to use PostSharp because it’s easy to get started with and it provides a simple, but powerful, API. It does have a few major downsides for what I want to accomplish that I’ll cover later.

I wrote an aspect, EnsureNonNullAspect, that you apply to a method, a class, or an assembly to inject null checks for all public arguments and return values in your code. You can then opt out of the null checking using the AllowNullAttribute.

Here are some examples of usage:

using NullGuard;

[assembly: EnsureNonNullAspect]

public class Sample 
{
    public void SomeMethod(string arg) {
        // throws ArgumentNullException if arg is null.
    }

    public void AnotherMethod([AllowNull]string arg) {
        // arg may be null here
    }

    public string MethodWithReturn() {
        // Throws InvalidOperationException if return value is null
        // (as it is here).
        return null;
    }
   
    // Null checking works for automatic properties too.
    public string SomeProperty { get; set; }

    [AllowNull] // can be applied to a whole property
    public string NullProperty { get; set; }

    // Renamed so it doesn't clash with NullProperty above.
    public string NullSetterProperty { 
        get; 
        [param: AllowNull] // Or just the setter.
        set; 
    }
}

For more examples, check out the automated tests in the NullGuard GitHub repository.

By default, the attribute only works for public properties, methods, and constructors. It also validates return values, out parameters, and incoming arguments.
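For comparison, these are roughly the checks you'd otherwise write by hand. This is only a sketch, not the code PostSharp actually generates, and the PhoneBook class is hypothetical. The exception types follow the comments in the usage example above: ArgumentNullException for a null argument and InvalidOperationException for a null return value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class showing the hand-written equivalent of the checks.
public class PhoneBook
{
    private readonly Dictionary<string, string> _numbers =
        new Dictionary<string, string>();

    public void Add(string name, string number)
    {
        // Guard every incoming reference-type argument.
        if (name == null) throw new ArgumentNullException("name");
        if (number == null) throw new ArgumentNullException("number");

        _numbers[name] = number;
    }

    public string Lookup(string name)
    {
        if (name == null) throw new ArgumentNullException("name");

        string number;
        _numbers.TryGetValue(name, out number);

        // Guard the return value as well.
        if (number == null)
            throw new InvalidOperationException("Return value of Lookup is null.");

        return number;
    }
}
```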

If you need more fine grained control of what gets validated, the EnsureNonNullAspect accepts a ValidationFlags enum. For example, if you only want to validate arguments and not return values, you can specify: [EnsureNonNullAspect(ValidationFlags.AllPublicArguments)].

Downsides

This approach requires that the NullGuard and PostSharp libraries are redistributed with the application. Also, the generated code is a bit verbose. Here’s an example of the generated code of a previously one line method.

Another downside is that you’ll need to install the PostSharp Visual Studio extension and register for a license before you can fully use my library. The license for the free community edition is free, but it does add a bit of friction just to try this out.

I’d love to see PostSharp add support for generating IL that’s completely free of dependencies on the PostSharp assemblies. Perhaps by injecting just enough types into the rewritten assembly so it’s standalone.

Try it!

To try this out, install the NullGuard.PostSharp package from NuGet. (It’s a pre-release library, so make sure you include prereleases when you install it.)

Install-Package NullGuard.PostSharp -IncludePrerelease

Make sure you also install the PostSharp Visual Studio extension.

When you install the NuGet package into a project, it should modify that project to use PostSharp. If not, you’ll need to add an MSBuild task to run PostSharp against your project. Just look at the Tests.csproj file in the NullGuard repository for an example.

If you just want to see things working, clone the NullGuard repository and run the unit tests.

File an issue if you have ideas on how to improve it or anything that’s wonky.

Alternative Approaches and What’s Next?

NullGuard.PostSharp is really an experiment. It’s my first iteration in solving this problem. I think it’s useful in its current state, but there are much better approaches I want to try.

  • Use Fody to write the guard blocks. Fody is an IL Weaver tool written by Simon Cropp that provides an MSBuild task to rewrite IL. The benefit of this approach is there is no need to redistribute parts of Fody with the application. The downside is Fody is much more daunting to use as compared to PostSharp. It leverages Mono.Cecil and requires a decent understanding of IL. Maybe I can convince Simon to help me out here. In the meanwhile, I better start reading up on IL. I think this will be the next approach I try. UPDATE: Turns out that in response to this blog post, the Fody team wrote NullGuard.Fody that does exactly this!
  • Use T4 to rewrite the source code. Rather than rewrite the IL, another approach would be to rewrite the source code much like T4MVC does with T4 Templates. One benefit of this approach is I could inject code contracts and get all the benefits of having them declared in the source code. The tricky part is doing this in a robust manner that doesn’t mess up the developer’s workflow.
  • Use Roslyn. It seems to me that Roslyn should be great for this. I just need to figure out how exactly I’d incorporate it. Modify source code or update the IL?
  • Beg the Code Contracts team to address this scenario. Like the Temptations, I ain’t too proud to beg.

Yet another alternative is to embrace the shit work, but write an automated test that ensures every argument is properly checked. I started working on a method you could add to any unit test suite that’d verify every method in an assembly, but it’s not done yet. It’s a bit tricky.
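Here's a rough sketch of what such a check could look like using reflection. To be clear, this is not the unfinished method I mentioned. GuardClauseVerifier, FindUnguardedParameters, and the Sample class are hypothetical names, and the sketch takes shortcuts: it only looks at public instance methods declared on one type, skips generic methods and methods with ref/out parameters, and assumes any non-string reference-type parameter has a public parameterless constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class GuardClauseVerifier
{
    // Passes null for each reference-type parameter in turn and reports the
    // ones that did not result in an ArgumentNullException.
    public static IList<string> FindUnguardedParameters(Type type, object instance)
    {
        var failures = new List<string>();
        var methods = type
            .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Where(m => !m.IsSpecialName && !m.ContainsGenericParameters)
            .Where(m => m.GetParameters().All(p => !p.ParameterType.IsByRef));

        foreach (var method in methods)
        {
            var parameters = method.GetParameters();
            for (int i = 0; i < parameters.Length; i++)
            {
                if (parameters[i].ParameterType.IsValueType) continue;

                // Valid dummy values everywhere except the parameter under test.
                var args = parameters.Select(p => CreateDummy(p.ParameterType)).ToArray();
                args[i] = null;

                bool guarded = false;
                try
                {
                    method.Invoke(instance, args);
                }
                catch (TargetInvocationException ex)
                {
                    // Reflection wraps whatever the method threw.
                    guarded = ex.InnerException is ArgumentNullException;
                }

                if (!guarded)
                    failures.Add(method.Name + "(" + parameters[i].Name + ")");
            }
        }
        return failures;
    }

    private static object CreateDummy(Type type)
    {
        if (type == typeof(string)) return "dummy";
        if (type.IsArray) return Array.CreateInstance(type.GetElementType(), 0);
        // Assumption: everything else has a public parameterless constructor.
        return Activator.CreateInstance(type);
    }
}

// Hypothetical class under test.
public class Sample
{
    public void Guarded(string arg)
    {
        if (arg == null) throw new ArgumentNullException("arg");
    }

    public void Unguarded(string arg)
    {
        // No guard clause, so the verifier should flag this parameter.
    }
}
```

Calling GuardClauseVerifier.FindUnguardedParameters(typeof(Sample), new Sample()) should flag only the Unguarded method's arg parameter. Interfaces, abstract types, and methods with real side effects are where this gets tricky.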

If you have a better solution, do let me know. I’d love for this to be handled by the language or Code Contracts, but right now those just don’t cut it yet.

personal

I wasn’t prepared to write an end-of-year blog post given the impending destruction of the world via a Mayan prophesied cataclysmic fury. But since that didn’t pan out I figured I’d better get typing.

reflections

Those of us who are software developers shouldn’t be too surprised that the world didn’t end. After all, how often do projects come in by the estimated date, amirite?! (high five to all of you).

Highlights of 2012

This year has just been a blast. As my kids turn five and three, my wife and I find them so much more fun to hang out with. Also, this year I reached the one year mark at the best place to work ever. Here’s a breakdown of some of the highlights from the year for me.

  • January Twice a year we have an all-company summit where we get together, give talks, plan, and just have a great time together. This was my first one and I loved every moment of it.
  • February The MVP summit was in town. I wasn’t eligible to be an MVP as a recently departed employee, but I was eligible to host my first GitHub Drinkup for all the MVPs and others in town. We had a big crowd and a great time.
  • March I travelled to the home of the Hobbits, New Zealand, to give a keynote at CodeMania.
  • April My family and I visited Richard Campbell and his family in the Vancouver area. I also recorded a Hanselminutes podcast.
  • May We released GitHub for Windows in May. I also visited GitHub HQ this month for a mini-summit with the other native app teams and recorded more podcasts including Herding Code and Deep Fried Bytes.
  • June I spoke at NDC in Oslo, Norway. Had a great conference despite the awkward “Azure Girls” incident.
  • July Gave a last minute talk at ASPConf. The software I used to record it crashed and so there’s no recording of this talk sadly.
  • August Back in San Francisco for the GitHub all company summit. I corner Skrillex and force him to take a photo with me.
  • September Family vacation to Oahu, Hawaii. I also end up giving a talk to a local user group and hosting a drinkup. And my son started Kindergarten.
  • October I spoke at MonkeySpace and got really fired up about the future of Open Source in the .NET world.
  • November At the end of the month I was a guest on the .NET Rocks Roadshow. We had a rollicking good time. I went on a private tour of SpaceX with the fellas. We took the RV to the venue and I got to sample some of the Kentucky Whiskey they collected on their travels before recording a show on Git, GitHub, NuGet, and the non-hierarchical management model we have at GitHub.
  • December This was a quiet month for me. No travels. No talks. Just hacking on code, spending time with the family, and celebrating one year at GitHub. Oh, I also loved watching this Real Life Fruit Ninja to Dubstep video. Perhaps the highlight of 2013.

Top 3 Blog Posts by the numbers

As I did in 2010, I figured I’d post my top three blog posts according to the Ayende Formula.

  • Introducing GitHub for Windows introduces the Git and GitHub client application my team and I worked on this past year (103 comments, 68,672 web views, 25,048 aggregator views).
  • It’s the Little Things About ASP.NET MVC 4 highlights some of the small improvements in ASP.NET MVC 4 that are easy to overlook, but are nice for those that need them (49 comments, 56,900 web views, 26,044 aggregator views).
  • Structuring Unit Tests covers a nice approach to structuring unit tests for C# developers that I learned from others. This post was written in January which might help explain why it’s in the top three (52 comments, 41,852 web views, 26,073 aggregator views).

My Favorite three posts

These are the three posts that I wrote that were my favorites.

  • You Don’t Need a Thick Skin describes the realization that rather than develop a thick skin, I should focus on developing more empathy for folks that use my software.
  • One year at GitHub is a look back at my year at GitHub and how much I’m enjoying working there.
  • How to Talk to Employees argues that the way to talk to employees is simply the way you’d want to be addressed.

You People

Enough about me, let’s talk about you. As I did in my 2010 post, I thought it’d be fun to post some numbers.

According to Google Analytics:

  • Hello Visitors! 1,880,184 absolute unique visitors (up 6.15% from 2011) made 2,784,021 visits (down half a percent) to my blog. You came from 223 countries/territories. Most of you came from the United States (875,837) followed by India (267,164) with the United Kingdom (221,727) in third place.
  • Browser of choice: Just two years ago, most of my visitors used Firefox. Now it’s Google Chrome with 45.84%. In second place at 26.37% is Firefox with IE users at 19.08%. Safari is next at 4% with Opera users still holding on to 2%. I really need to stop making those Opera user jokes. You guys are persistent!
  • Operating System: As I expected, most of you (87.16%) are on Windows, but that number seems to decline every year. 5.71% on a Mac and 2.24% on Linux. The mobile devices are a tiny percentage, but I would imagine that’ll pick up a lot next year.
  • What you read: The blog post most visited in 2012 was written in 2011, C# Razor Syntax Quick Reference with 119,962 page views.
  • How’d you get here: Doesn’t take a genius to guess that most folks come to my blog via Google search results (1,691,540), which probably means most of you aren’t reading this. ;) StackOverflow moves to second place (292,670) followed closely by direct visitors (237,862).

My blog is just a single sample, but it’s interesting to me that these numbers seem to reflect trends I’ve seen elsewhere.

Well that’s all I have for 2012. I’m sure there are highlights I forgot to call out that are more memorable or important than the ones I listed. I’m bad at keeping track of things.

One big highlight for me is all the encouraging feedback, interesting comments, insightful thoughts, etc. that I’ve received from many of you in the past year either through comments on my blog or via Twitter. I appreciate it and I hope many of you have found something useful in something I’ve written on my blog or on Twitter. I’ll work hard to provide even more useful writings in the next year.

Happy New Year and I hope that 2013 is even better for you than 2012!


Merry Christmas!

I’m going to be migrating my comment system to Disqus. You might notice missing comments or other such weirdness. Do not be alarmed.

I’ll try to do this at a time I expect the lowest amount of traffic.

Hope you’re having a great holiday!

code, community, empathy

I have a confession to make.

I sometimes avoid your feedback on the Twitters. It’s nothing personal. I have a saved search for stuff I work on because I want to know what folks have to say. I want to know what problems they might run into or what ideas they have to improve things. Nonetheless, I sometimes just let the unread notifications sit there while I hesitate and cringe at the thought of the vitriol that might be contained within.

I know. I know. That’s terrible. It’s long been conventional wisdom that if you’re going to write software and ship it to other humans, you better develop a thick skin.

Hey, I used to work at Microsoft. People have…strong…opinions about software that Microsoft ships. It’s the type of place you learn to develop a full body callus of a thick skin. So I’m with you.

But even so, when you invest so much of yourself into something you create, it’s hard not to take criticisms personally. The Oatmeal captures this perfectly in this snippet of his brilliant post: Some thoughts on and musing about making things for the web.

oatmeal-on-reading-comments

That is me right there. I’m not going to stop shipping software so the best thing to do is work harder and develop a thicker skin. Right?

Nope.

I strongly believed this for years, but a single blog post changed my mind. This post didn’t say anything I hadn’t heard before. But it was the experience of the author that somehow clicked and caused me to look at things in a new way. In this case, it was Sam Stephenson’s blog post, You are not your code that did it for me.

In his post, he talks about the rise and fall of his creation, Prototype and how he took its failure personally, and the lesson that he learns as a result.

I have learned that in the open-source world, you are not your code. A critique of your project is not tantamount to a personal attack. An alternative take on the problem your software solves is not hostile or divisive. It is simply the result of a regenerative process, driven by an unending desire to improve the status quo.

This sparked an epiphany. Reinforcing a thick skin detaches me from the people using my software. Even worse, it puts me in an adversarial position towards the folks who just want to get something done with the software. This is so wrong. Rather than work on developing a thicker skin, I really should work on developing more empathy.

Show of hands. Have any of you ever been frustrated with a piece of software you’re trying to use? Of course you have! Now put your hand down. You look silly raising your hand for no reason.

How did you feel? I know how I’ve felt. Frustrated. Impotent. Stupid. Angry. Perhaps I said a few words I’m not proud of about how I might inflict bodily harm on the author in anatomically impossible ways should we ever meet in a dark alley.

I certainly didn’t mean those words (except in the case of bundled software written by hardware companies. That shit makes me cray!). I was simply lashing out due to my frustrations.

And it hit me.

The angry tweets calling my work “a piece of crap” are written by folks just like me. Rather than harden my stance in opposition to these folks, I need to be on their side!

I need to remove the adversarial mindset and instead share in their frustration as a fellow human who also understands what it’s like to be angry at software. I no longer need to take this criticism personally. This shift in mindset unblocked me from diving right into all that feedback on Twitter. I started replying to folks with something along the lines of “I’m sorry. That does suck. I know it’s frustrating. I’m going to have a word with the idiot who wrote that (namely me)! Email me with details and I’ll work to get it fixed.”

The end result is I’m able to provide much better support for the software.

By doing this, I’ve also noticed a trend. When you sincerely address people’s frustrations, they tend to respond very warmly. Many of them know what it’s like to be criticized as well. People are quick to forgive if they know you care and will work to make it better.

Sure, there will still be moments where I have a knee jerk reaction and maybe lose my temper for a moment. But I think this framework for how to think about feedback will help me do that much less and preserve my sanity. I am definitely not my code. But I am here to help you with it.