open source, github, code, code review

If I had to pick just one feature that embodies GitHub (besides emoji support, of course), I’d easily choose the Pull Request (aka PR). According to GitHub’s help docs (emphasis mine),

Pull requests let you tell others about changes you’ve pushed to a GitHub repository. Once a pull request is sent, interested parties can review the set of changes, discuss potential modifications, and even push follow-up commits if necessary.

Some folks are confused by the name “pull request.” Just think of it as a request for the maintainer of the project to “pull” your changes into their repository.

Here’s a screenshot of a pull request for GitHub for Windows where Paul Betts patiently explains why my code might result in the total economic collapse of the world economy.

sample code review

A co-worker code review is a good way to avoid the Danger Zone (slightly NSFW).

Code review is at the heart of the GitHub collaboration model. And for good reason! There’s a rich set of research about the efficacy of code reviews.

In one of my favorite software books, Facts and Fallacies of Software Engineering by Robert Glass, Fact 37 points out,

Rigorous inspections can remove up to 90 percent of errors from a software product before the first test case is run.

And the best part is that reviews are cost effective!

Furthermore, the same studies show that the cost of inspections is less than the cost of the testing that would be necessary to find the same errors.

One of my other favorite software books, Code Complete by Steve McConnell, points out that,

the average defect detection rate is only 25 percent for unit testing, 35 percent for function testing, and 45 percent for integration testing. In contrast, the average effectiveness of design and code inspections are 55 and 60 percent.

Note that McConnell is referring to evidence for the average effectiveness while Glass refers to evidence for the peak effectiveness.

The best part though, is that Code Review isn’t just useful for finding defects. It’s a great way to spread information about coding standards and conventions to others as well as a great teaching tool. I learn a lot when my peers review my code and I use it as an opportunity to teach others who submit PRs to my projects.

Effective Code Review

You’ll notice that Glass and McConnell use the term “code inspection” and not “review.” A lot of the time, when we think of code review, we think of simply looking the code up and down a bit, making a few terse comments about obvious glaring errors, and then calling it a day.

I know I’ve been guilty of this “drive-by” code review approach. It’s especially easy to do with pull requests.

But what these gentlemen refer to is a much more thorough and rigorous approach to reviewing code. I’ve found that when I do it well, a proper code review is just as intense and mentally taxing as writing code, if not more so. I usually like to take a nap afterwards.

Here are a few tips I’ve learned over the years for doing code reviews well.

Review a reasonable amount of code at a time

This is one of the hardest tips for me to follow. When I start a review of a pull request, I am so tempted to finish it in one sitting because I’m impatient and want to get back to my own work. Also, I know that others are waiting on the review and I don’t want to hold them up.

But I try and remind myself that code review is my work! Also, a poorly done review is not much better than no review at all. When you realize that code reviews are important, you understand that it’s worth the extra time to do it well.

So I usually stop when I reach that point of review exhaustion and catch myself skipping over code. I just take a break, move on to something else, and return to it later. What better time to catch up on Archer episodes?!

Focus on the code and not the author

This has more to do with the social aspect of code review than defect finding. I try to do my best to focus my comments on the code and not the ability or the mental state of the author. For example, instead of asking “What the hell were you thinking when you wrote this?!” I’ll say, “I’m unclear about what this section of code does. Would you explain it?”.

See? Instead of attacking the author, I’m focusing on the code and my understanding of it.

Of course, it’s possible to follow this advice and still be insulting, “This code makes me want to gouge my eyes out in between my fits of vomiting.” While this sentence focuses on the code and how it makes me feel, it’s still implicitly insulting to the author. Try to avoid that.

Keep a code review checklist

A code review checklist is a really great tool for conducting an effective code review. The checklist should be a gentle reminder of common issues in code you want to review. It shouldn’t represent the only things you review, but a minimal set. You should always be engaging your brain during a review looking for things that might not be on your checklist.

I’ll be honest, as I started writing this post, I only had a mental checklist I ran through. In an effort to avoid being a hypocrite and leveling up my code review, I created a checklist gist.

My checklist includes things like:

  1. Ensure there are unit tests and review those first looking for test gaps. Unit tests are a fantastic way to grasp how code is meant to be used by others and to learn what the expected behavior is.
  2. Review arguments to methods. Make sure arguments to methods make sense and are validated. Consider what happens with boundary conditions.
  3. Look for null reference exceptions. Null references are a bitch and it’s worth looking out for them specifically.
  4. Make sure naming, formatting, etc. follow our conventions and are consistent. I like a codebase that’s fairly consistent so you know what to expect.
  5. Disposable things are disposed. Look for usages of resources that should be disposed but are not.
  6. Security. There is a whole threat and mitigation review process that falls under this bucket. I won’t go into that in this post. But do ask yourself how the code can be exploited.
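To make a couple of these concrete, here’s a minimal sketch of the kind of code I hope to see when reviewing against items 2, 3, and 5. The method and names here are hypothetical, not from any real project:

using System;
using System.IO;

public static class FileInspector
{
    // Reads the first line of a file, written the way a reviewer would
    // hope to see it: arguments validated, boundary conditions
    // considered, and the disposable resource disposed.
    public static string ReadFirstLine(string path)
    {
        if (path == null) throw new ArgumentNullException("path");
        if (path.Length == 0)
            throw new ArgumentException("path must not be empty", "path");

        // 'using' guarantees the reader is disposed, even if ReadLine throws.
        using (var reader = new StreamReader(path))
        {
            // ReadLine returns null for an empty file; guard against
            // handing a surprise null reference back to callers.
            return reader.ReadLine() ?? string.Empty;
        }
    }
}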

I also have separate checklists for different platform specific items. For example, if I’m reviewing a WPF application, I’m looking out for cases where we might update the UI on a non-UI thread. Things like that.

Step Through The Code

You’ll note that I don’t mention making sure the code compiles and that the tests pass. I already know this through the magic of the commit status API which is displayed on our pull requests.
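If you’re curious, the status API itself is small. Here’s a hedged sketch of posting a build status for a commit; the endpoint shape is from GitHub’s API documentation, while the owner, repo, SHA, and token values are placeholders:

using System;
using System.Net;

class CommitStatusSample
{
    static void Main()
    {
        // POST /repos/:owner/:repo/statuses/:sha (placeholder values here).
        var url = "https://api.github.com/repos/owner/repo/statuses/0123abc";

        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.Authorization] = "token YOUR-OAUTH-TOKEN";
            client.Headers[HttpRequestHeader.ContentType] = "application/json";
            client.Headers["User-Agent"] = "commit-status-sample";

            // state may be pending, success, error, or failure.
            var body = "{\"state\": \"success\", \"description\": \"Build passed\"}";
            Console.WriteLine(client.UploadString(url, body));
        }
    }
}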


However, for more involved or more risky code changes, I do think it’s worthwhile to actually try the code and step through it in the debugger. Here, GitHub has your back with a relatively new feature that makes it easy to get the code for a specific pull request down to your machine.

If you have GitHub for Windows or GitHub for Mac installed and you scroll down to the bottom of any pull request, you’ll see a curious new button.

clone-pr-in-desktop

Click on that button and we’ll clone the pull request code to your local machine so you can quickly and easily try it out.

Note that in Git parlance, this is not the original pull request branch, but a reference (usually named something like pr/42, where 42 is the pull request number), so you should treat it as a read-only branch. But you can always create a branch from that reference and push it to GitHub if you need to.

I often like to do this and run ReSharper analysis on the code to highlight things like places where I might want to convert code to use a LINQ expression and things like that.

Sign Off On It

After a few rounds of review, when the code looks good, make sure you let the author know! Praise where praise is due is an important part of code reviews.

At GitHub, when a team is satisfied with a pull request, we tend to comment on it and include the ship it squirrel emoji (:shipit:). That indicates the review is complete, everything looks good, and you are free to ship the changes and merge them to master.

Every team is different, but on the GitHub for Windows team we tend to let the author merge the code into master after someone else signs off on the pull request.

This works well when dealing with pull requests from people who also have commit access. On my open source projects, I tend to post a thumbs up reaction gif to show my immense appreciation for their contribution. I then merge it for them.

Here’s one of my favorites for a very good contribution.

Bruce Lee gives a thumbs up

Be Good To Each Other

Many of my favorite discussions happen around code. There’s something about having working code that focuses a discussion in a way that hypothetical discussions do not.

Of course, even this can break down on occasion. But for the most part, if you go into a code review with the idea of both being taught as well as teaching, good things result.

community, personal

I love a good argument. No really! Even ones online.

The problem is, so few of them are any good. They tend to go nowhere and offer nothing of value. They just consist of one side attempting to browbeat the other into rhetorical submission.

What?! You are not persuaded by my unassailable argument? THEN LET ME MAKE THE SAME POINT WITH ALL CAPS!

ARE YOU NOT CONVINCED?!

red-card

You want to argue? Argue with this card! Image from Wikipedia, CC BY-SA 3.0.

So what makes an argument good? (besides when you agree with me, which is always a good move)

A while back, I read an interesting article about Professor Daniel H. Cohen, a philosopher who specializes in argumentation theory, that tackles this question.

As an aside, I wonder how his kids feel arguing with someone who’s basically a professor of arguing? Must be hard winning that argument about extending that curfew.

The article starts off with a scenario that captures 99.9% of arguments (online or offline) well:

You are having a heated debate with friends about, say, equality of the sexes. You’ve taken a standpoint and you’re sticking with it. Before you know it, you’ve got so worked up that, regardless of whether you believe your argument is the most valid, you simply just want to win, employing tactics and subterfuge to seek victory.

I like to think of myself as a very logical reasonable person. But when I read this scenario, I realized how often I’ve fallen prey to that even in what should be dispassionate technical arguments!

I’m pretty sure I’m not the only one. I’m just willing to admit it.

Cohen points out that the “war metaphor” is at fault for this tendency. Often, it’s the so-called “losers” of an argument who really win:

He explains, “Suppose you and I have an argument. You believe a proposition, P, and I don’t. I’ve objected, I’ve questioned, I’ve raised all sorts of counter-considerations, and in every case you’ve responded to my satisfaction. At the end of the day, I say, ‘You know what? I guess you’re right.’ So I have a new belief. And it’s not just any belief, but it’s a well-articulated, examined and battle-tested belief.” Cohen continues, “So who won that argument? Well, the war metaphor seems to force us into saying you won, even though I’m the only one who made any cognitive gain.”

The point of a good argument isn’t for one person to simply win over the other. It’s ideally for both to come away with cognitive gains.

Even if the goal of an argument is to reach a decision, the goal isn’t to win; it’s to define the parameters for a good decision and then make the best possible decision with those in mind.

I’ve come to believe that when two reasonably smart people disagree on a subject, at the core it is often because of one of the following:

  1. One or both of the participants is missing key information.
  2. One or both of the participants made a logic error that leads to a wrong conclusion.
  3. The participants agree on the facts, but have different values and priorities, leading them to disagree on what conclusion should follow from the facts.

In my mind, a good debate tries to expose missing facts and illogical conclusions so that the two in the debate can get to the real crux of the matter: how their biases, experiences, and values shape their beliefs.

I’m assuming here that both participants are invested in the debate. When one isn’t, it becomes overwhelmingly tempting to resort to any means necessary in order to wipe that smug smirk off your opponent’s face.

Troll_Face

Of course, both sides will believe they’re the one who is drawing conclusions from years of objective rational analysis, but they’re both wrong. In the end, we all succumb to various biases and our values. A good debate can expose those and allow participants to discuss whether those are the right biases and values to have in the first place. That’s where an argument really gets somewhere.

Another philosopher, Daniel Dennett, lays out these rhetorical habits when critiquing or arguing in his book, Intuition Pumps And Other Tools for Thinking:

How to compose a successful critical commentary:

1. Attempt to re-express your target’s position so clearly, vividly and fairly that your target says: “Thanks, I wish I’d thought of putting it that way.”

2. List any points of agreement (especially if they are not matters of general or widespread agreement).

3. Mention anything you have learned from your target.

4. Only then are you permitted to say so much as a word of rebuttal or criticism.

These habits nicely complement the improved metaphor for arguing espoused by Cohen.

So the next time you get into an argument, think about your goals. Are you just trying to win or are you trying to reach mutual understanding? Then try to apply Dennett’s rhetorical habits as you argue. I’ll try to do the same so if we end up in an argument, there’s a better chance it’ll result in a good one.

This will serve you well not only in your work, but in your personal relationships as well.

code, open source

Just shipped a new release of RestSharp to NuGet. For those who don’t know, RestSharp is a simple REST and HTTP API Client for .NET.

This release is primarily a bug fix release with a whole lotta bug fixes. It should be fully compatible with the previous version. If it’s not, I’m sorry.

Some highlights:

  • Added Task<T> Overloads for async requests
  • Serialization bug fixes
  • ClientCertificate bug fix for Mono
  • And many more bug fixes…

Full release notes are up on GitHub. If you’re interested in the nitty gritty, you can see every commit that made it into this release using the GitHub compare view.

I want to send a big thanks to everyone who contributed to this release. You should feel proud of your contribution!

Who are you and what did you do to Sheehan?!

Don’t worry! John Sheehan is safe and sound in an undisclosed location. Ha! I kid. I’m beating him senseless every day.

Seriously though, if you use RestSharp, you should buy John Sheehan a beer. Though get in line as Paul Betts owes him at least five beers.

John started RestSharp four years ago and has shepherded it well for a very long time. But a while back he decided to focus more on other technologies. Even so, he held on for a long time tending to his baby even amidst a lot of frustrations, until he finally stopped contributing and left it to the community to handle.

And the community did. Various other folks started taking stewardship of the project and it continued along. This is the beauty of open source.

We at GitHub use RestSharp for the GitHub for Windows application. A little while back, I noticed people stopped reviewing and accepting my pull requests. Turns out the project was temporarily abandoned. So Sheehan gave me commit access and I took the helm getting our bug fixes in as well as reviewing and accepting the many dormant pull requests. That’s why I’m here.

Why RestSharp when there’s HttpClient?

Very good question! System.Net.HttpClient is only available for .NET 4.5. There’s the Portable Class Library (PCL) version, but that is encumbered by silly platform restrictions. I’ve written before that this harms .NET. I am hopeful they will eventually change it.

RestSharp is unencumbered by platform restrictions - another beautiful thing about open source.

So until Microsoft fixes the licensing on HttpClient, RestSharp is one of the only options for a portable, multi-platform, unencumbered, fully open source HTTP client you can use in all of your applications today. Want to build the next great iOS app using Xamarin tools? Feel free to use RestSharp. Find a bug using it on Mono? Send a pull request.

The Future of RestSharp

I’m not going to lie. I’m just providing a temporary foster home for RestSharp. When the HttpClient licensing is fixed, I may switch to that and stop shepherding RestSharp. I fully expect others will come along and take it to the next level. Of course it really depends on the feature set it supplies and whether they open source it.

As they say, open source is about scratching an itch. Right now, I’m scratching the “we need fixes in RestSharp” itch. When I no longer have that itch, I’ll hand it off to the next person who has the itch.

But while I’m here, I’m going to fix things up and make them better.

code, open source, github

The first GitHub Data Challenge launched in 2012 and asked the following compelling question: what would you do with all this data about our coding habits?

The GitHub public timeline is now easy to query and analyze. With hundreds of thousands of events in the timeline every day, there are countless stories to tell.

Excited to play around with all this data? We’d love to see what you come up with.

It was so successful, we did it again this past April. One of those projects really caught my eye: a site that analyzes Popular Coding Conventions on GitHub. It ended up winning second place.

It analyzes code on GitHub and provides interesting graphs showing which coding conventions are more popular among GitHub users. This lets you fight your ever-present software religious wars with some data.

For example, here’s how the Tabs vs Spaces debate lands among Java developers on GitHub.

java-tabs-vs-spaces

With that, I’m sure nobody ever will argue tabs over spaces again right? RIGHT?!

What about C#?!

UPDATE: JeongHyoon Byun added C# support! Woohoo!

Sadly, there is no support for C# yet. I logged an issue in the repository about that a while back and was asked to provide examples of C# conventions.

I finally got around to it today. I simply converted the Java examples to C# and added one or two that I’ve debated with my co-workers.

However, to get this done faster, perhaps one of you would be willing to add a simple CSharp convention parser to this project. Here’s a list of the current parsers that can be used as the basis for a new one.

Please please please somebody step up and write that parser. That way I can show my co-worker Paul Betts the error of his naming ways.

humor, personal, company culture

I avoid mailing lists the same way I avoid fun activities like meetings and pouring lemon juice into bloody scrapes. Even so, I still somehow end up subscribed to one or two. Even worse, once in a while, despite my better judgment, I send an email to such a list and am quickly punished for my transgression with an onslaught of out of office auto replies. You know the type:

Hey there friend! Thanks for your email! No seriously, I’m thanking you even though I haven’t read it. I’m sure it’s important because I’m important.

Unfortunately (for you), I’m off to some island paradise drinking one too many Mai Tais and probably making an ass of myself.

If you need to reach me, you can’t, LOL! You can contact this list of people you don’t know in my absence. Good luck with that!

punishment

Wait till you see the punishment for sending an email during the holidays! Photo by Tomas Alexsiejunas, license: CC BY 2.0

If you have such a rule set up, let me humbly offer you a friendly constructive tip:

NOBODY FUCKING CARES!

The universe has gone on for around 14 billion years before you were born just fine. And chances are, it’ll continue to survive after your death for another 100 googol years until the entropy death of the last proton, or another universe inflates to take its place. Whichever comes first.

So in the grand scheme of things, nobody cares that you’re out of the office, on vacation, or worse, too busy to respond to email so you automatically send me one as if I have all the time in the world to deal with more email.

Ok, that might have come across as a eensy weensy bit ranty. I’ll try to tone it down and offer something more constructive. After all, I’ve probably been guilty of this in my past and I apologize and have paid my penance (see photo above).

Maybe there’s a tiny bit of a purpose

The first time I experienced widespread use of out-of-office replies was during my time at Microsoft. And to be fair, it does serve a small purpose. While 99.999999% of the world doesn’t care if you’re out of the office (that’s science, folks), sometimes someone has a legitimate need to know who they should contact instead of you. For example, at Microsoft, I had an internal reply set up that directed emails to my manager. The lucky guy.

Fortunately for those using Outlook with Exchange, you can choose a different reply for internal emails than external emails. So definitely go and do that.

The two email rule of out-of-office replies

But what about the rest of us who really don’t care? I offer the following simple idea:

If you must have an out-of-office auto reply, create a rule to only send it when you receive two direct emails without a response. The idea here is that if I send you one email directly, I can probably wait until you’re back for a response. If I can’t, I’ll send you another “hey, bump!” email and then receive the auto notice. After all, if I send you two emails, sending me one is fair game.

Also, make sure you never ever ever send an auto-reply when you are not in the TO list. That rule alone will cut out billions of out-of-office replies to email lists. Ideally, the auto-reply should only occur if you’re the only one in the TO list. Chances are someone else in the TO list will know you’re gone and can reply to the others if necessary. Again, the two email rule could come into play here.
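If an email client ever did implement this, the rule itself is simple. Here’s a toy sketch of the decision logic; the Message type and everything else here is made up for illustration and maps to no real mail API:

using System.Collections.Generic;
using System.Linq;

// Hypothetical message type, purely for illustration.
public class Message
{
    public string From;
    public List<string> To = new List<string>();
}

public static class AutoReplyRule
{
    // The "two email rule": only auto-reply when the sender addressed
    // me directly, alone, and this is at least their second unanswered email.
    public static bool ShouldAutoReply(
        Message incoming, string myAddress, int unansweredEmailsFromSender)
    {
        // Never reply when I'm not the sole direct recipient. This
        // alone would cut out the mailing list auto-reply spam.
        bool soleRecipient = incoming.To.Count == 1
            && incoming.To.Single() == myAddress;

        // unansweredEmailsFromSender counts earlier emails I haven't
        // answered, so >= 1 means this is at least the second one.
        return soleRecipient && unansweredEmailsFromSender >= 1;
    }
}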

In the meanwhile, I think I’m going to switch tactics. Spam me, I spam you. So I may respond to those auto-replies with something like:

Dear So-and-so,

Hey dude, thanks for letting me know that you’re not in your office. I bet you’re on a really important business trip and/or vacation! I bet you have such important things to do there!

Me? Not so much. I wish I was on an important business trip and/or vacation.

It turns out, I have nothing better to do than respond to your automatically generated email to me! Thank you so much for that. The fact that it had your name at the bottom of the email and my email address in the TO: list was a nice personal touch. It really felt like you took the time to lovingly craft that email just for me.

So I thought it would be rude not to respond in kind with a thoughtful response of my own.

Sincerely and without regret,

Phil

p.s. I left you a little surprise in your office, but since you’re not there, I hope it doesn’t die before you get back. If it smells bad when you get back, you’ll know.

Hopefully email clients take this up and just implement it automatically because I don’t expect people to take the time to do this right.

What do you think? Are auto-replies more important than I give credit or do we live in a world with a lot of narcissists who must be stopped? Tell me how I’m right or wrong in the comments. Thanks!

code, community

There’s something about being outdoors in Alaska that inspires poetic thoughts. In my case it’s all bad poetry so I’ll spare you the nausea and just show a photo instead.

This was taken descending from Flattop, a strenuous but easy climb. At the top, we couldn’t see more than ten yards because of the clouds. But as we descended, they suddenly cleared giving us a sliver of a view of the ocean and coastline.

flattop-hike

My first experience of Alaska was when my family moved out here from the tropical island of Guam. Yeah, it was a bit of a change. I went to high school there and returned to Anchorage every Christmas and Summer break during college.

When I lived there, I was completely oblivious to the existence of a software community out here. But around four years ago, I brought my family out to visit my parents and on a whim gave a talk to the local .NET user group.

This has become kind of a thing for me to try and find a local user group to speak at when I go on vacation.

This summer, after subjecting my wife to back to back weeks of travel, I decided to give her a long overdue break from family duties and take my kids to Alaska. One benefit of working at a mostly distributed company is I can pretty much work from any location that has good internet.

My parents were game to watch and entertain the kids while I worked from their house. Since I was up there, I also got in touch with folks about giving another talk.

Unfortunately, the .NET user group had disbanded not long after I left. They assured me it wasn’t my fault. It turns out that it is very difficult to get good speakers up there.

This isn’t too surprising. Alaska is pretty out-of-the way for most people. At the same time, it’s an amazing place to visit during the months of June and July. The days are long and sunny. There’s all sorts of outdoor activities to partake in.

So if you’re a decent speaker passionate about technology and happen to find yourself up there for vacation or otherwise, let me know and I can put you in touch with some people who would love to have you give a talk. You can probably even write off a portion of your vacation if the talk is related to your work (though you should talk to your tax person about that and don’t listen to me).

alaska-github-talk

They put together an ad-hoc event while I was there and we had around twenty-five or so people show up. It doesn’t sound like a lot, but it’s a pretty good number for Anchorage and they really appreciate it. Afterwards, I recommend going out for some Alaskan Amber. It’s quite a good beer! And definitely hike Flattop.

open source

A while back I wrote a riveting 3-part developer’s guide to copyright law and open source licensing for developers.

I’m pretty sure you read every word at the edge of your seat. Who doesn’t love reading about laws, licenses, and copyright!?

Seriously though, I hope some of you found it useful. In this post, I want to talk about some recent developments that should make it easier for developers to license their code.

Choosealicense.com

A couple days ago I published a blog post on the GitHub blog about an effort I’ve been involved with, http://choosealicense.com/. Per the about page:

GitHub wants to help developers choose a license for their source code.

If you already know what you’re doing and have a license you prefer to use, that’s great! We’re not here to change your mind. But if you are bewildered by the large number of OSS license choices, maybe we can help.

I’m interested in helping developers be explicit and clear about their intent for their code. Adding a LICENSE (or an UNLICENSE if that’s your thing) file to the root of your public repository is a good way to state your intent. We even include an option if you really do want to retain all rights to your code, you grinch (I kid! I do not judge.)

But before you can choose a license, you need to be informed about what the license entails. That’s what we hope the site helps with.

Combined with the site, GitHub now has a feature that lets you choose a license when creating a repository on GitHub.

AddALicense.com

That’s great! But what about all your existing projects? Well one of my co-workers, Garen Torikian, has you covered. He built http://addalicense.com/ as a little side project. Note that the project is full of disclaimers:

This site is **not owned by or affiliated with GitHub**. But I work there, and I’m using the API to add each new license file. You’ll be asked to authenticate this app for your public repositories on the next page.

Perhaps in the future, we may integrate this into http://choosealicense.com/.

But in the meanwhile check it out and get those projects licensed!

company culture

A finely honed bullshit detector is a benefit to everyone. Let’s try a hypothetical conversation to test yours!

“Hey, we should release that under a more permissive license for XYZ reasons.”

“We’d like to, but the lawyers won’t let us.”

If it’s not malfunctioning, you should feel your bullshit detector tingling right now.

bull

Yep, it’s a bull. Photo by Graeme Law, CC BY 2.0

A lot of folks think that a lawyer’s job is to protect the business at all costs – that their job is to say “no!” Unfortunately, many places do structure it that way. After all, if a lawyer says “go ahead” and you get sued, the lawyer loses. But if the lawyer says “don’t”, there’s no immediate downside for the lawyer. Eventually the business may collapse from inaction, but there’s always teaching at law school as a backup. So why would the lawyer ever say “yes” in such a situation?

One of the best lessons I learned while at Microsoft was from Scott Guthrie when I expressed concern that the legal team wouldn’t let us break new ground with how we shipped open source.

He reminded me that the lawyers work for us. We do not work for the lawyers. If the lawyers had their way, we wouldn’t do anything because that’s the safest option.

You can see why so many people love the red polo.

Many decisions where legal gets involved are business decisions, not legal decisions. Unless the decision is downright illegal, the lawyer’s job is to help figure out how to do what’s best for the business. Along the way, they should make sure we’re aware of the risks, but also find ways to minimize the risks. At least that’s what a good lawyer does and I’ve been fortunate to work with some.

At the end of the day, even if the lawyer is uneasy about a course of action, they do not get to make the business decisions. That’s someone else’s job (unless you happen to work at a law firm I guess). Perhaps it’s your job.

So when someone tells you that “legal won’t let us do XYZ”, unless they follow that with “because it’s illegal and will land us all in jail and that’s no fun”, you should recognize it as a copout.

Sometimes what they mean is “I don’t really know what’s in the best interest of our business (or I’m too busy to care) so I’ll play it safe and blame the lawyers.”

What you hope they mean is “we won’t do this because it is not in the best interest of our business.” Now that is a fair answer. You may disagree, but it serves as a starting point for a more interesting conversation.

blogging, open source

Google is shuttering Google Reader in a little over a day (on July 1st, 2013) as I write this. If you use Google Reader to read my blog, this means you might miss out on my posts and I KNOW YOU DON’T WANT THIS!

Then again, maybe this is finally your chance to make a break, get some fresh air, stop reading blogs and start creating! I won’t hold it against you.

But for the rest of you, it’s a good time to find a replacement. Or at the very least follow me on Twitter since I do tweet when I blog.

There’s a lot of Google Reader replacements out there, but only two that I like so far.

Feedly

feedly

Feedly is gorgeous. There are apps for many platforms, but the browser works pretty well. Also, you can use Google to log into it and import your Google Reader feeds. I hope Google allows exporting to Feedly and other aggregators after July 1st even as they close down the Google Reader site.

The problem I have with Feedly is that it doesn’t work like Google Reader. It wouldn’t be so bad if it had a better flow for reading items, but I find its interface to be quirky and in some cases, unintuitive. For example, it seems I have to mark items as read by clicking “mark above articles as read” rather than having it do it automatically like Reader does after you scroll past it.

This leads me to…

Go Read

go-read

Go Read is a late entry into the list, but there are three important things I really like about it:

  1. It is intended to be a clean and simple clone of Google Reader.
  2. It supports Google Reader’s keyboard shortcuts.
  3. It is open source and up on GitHub!

For some more details, check out the announcement blog post by the author, Matt Jibson, a developer at Stack Exchange:

I would like to announce the release of Go Read. It is a Google Reader clone, designed to be close to its simplicity and cleanliness. I wanted to build something as close to Google Reader as made sense for one person to build in a few months.

It’s basically Google Reader, but without all the cruft and where you can send pull requests to improve things!

In fact, there’s already a few pull requests with some nice user interface polish that should hopefully make it into the site soon.

Despite some false starts, I have it up and running on my machine. I sent a few pull requests to update the README to help other clueless folks like me get it set up for hacking on.

So check it out, import your Google Reader feeds, and never miss out on another Haacked.com post EVER!

UPDATE: I forgot to mention what is perhaps the most important reason for me to prefer Go Read. I don’t want to end up in another Google Reader situation again and rely on an RSS Aggregator that isn’t a solid business and might not stick around. At least with an open source option, I have the code running on my own machine as a backup in a pinch.

code

UPDATE: The .NET team removed the platform limitations.

Let me start by giving some kudos to the Microsoft BCL (Base Class Library) team. They’ve been doing a great job of shipping useful libraries on NuGet lately, HttpClient and the immutable collections among them.

However, one trend I’ve noticed is that the released versions of most of these packages have a platform limitation in the EULA (the pre-release versions have an “eval-only” license which does not limit platform, but does limit deployment for production use). At this point I should remind everyone I’m not a lawyer and this is not legal advice, blah blah blah.

Here’s an excerpt from section 2. c. in the released HttpClient license, emphasis mine:

a. Distribution Restrictions. You may not

  • alter any copyright, trademark or patent notice in the Distributable Code;
  • use Microsoft’s trademarks in your programs’ names or in a way that suggests your programs come from or are endorsed by Microsoft;
  • distribute Distributable Code to run on a platform other than the Windows platform;

I think this last bullet point is problematic and should be removed.

Why should they?

I recently wrote the following tweet in response to this trend:

Dear Microsoft BCL team. Please remove the platform limitation on your very cool libraries. Love, cross-platform .NET devs.

And Richard Burte tweeted back:

And that pays the rent how exactly?

Great question!

There is this sentiment among many that the only reason to make .NET libraries cross platform or open source is just to appease us long haired open source hippies.

Well first, let me make it crystal clear that I plan to get a haircut very soon. Second, the focus of this particular discussion is the platform limitation on the compiled binaries. I’ll get to the extra hippie open source part later.

There are several reasons why removing the platform limitation benefits Microsoft and the .NET team.

It benefits Microsoft’s enterprise customers

Let’s start with Microsoft’s bread and butter, the enterprise. There’s a growing trend of enterprises that support employees who bring their own devices (BYOD) to work. As Wikipedia points out:

BYOD is making significant inroads in the business world, with about 75% of employees in high growth markets such as Brazil and Russia and 44% in developed markets already using their own technology at work.

Heck, at the time I was an employee, even Microsoft supported employees with iPhones connecting to Exchange to get email. I assume they still do, Ballmer pretending to break an iPhone notwithstanding.

Microsoft’s own software supports cross-platform usage. Keeping platform limitations on their .NET code hamstrings enterprise developers who want to either target the enterprise market or want to make internal tools for their companies that work on all devices.

It’s a long play benefit to Windows 8 Phone and Tablet

While developing Windows 8, Microsoft put a ton of energy and focus into a new HTML and JavaScript based development model for Windows 8 applications, at the cost of focus on .NET and C# in that time period.

The end result? From several sources I’ve heard that something like 85% of apps in the Windows app store are C# apps.

Now, I don’t think we’re going to see a bunch of iOS developers suddenly pick up C# in droves and start porting their apps to work on Windows. But there is the next generation to think of. If Windows 8 devices can get enough share to make it worthwhile, it may be easier to convince this next generation of developers to consider C# for their iOS development and port to Windows cheaply. Already, with Xamarin tools, using C# to target iOS is worlds better than the Objective-C environment. I believe iOS developers today tolerate Objective-C because it’s been so successful for them and it was the only game in town. As Xamarin tools get more notice, I don’t think the next generation will tolerate the clumsiness of the Objective-C tools.

There’s no good reason not to

Ok, this isn’t strictly speaking a benefit. But it speaks to a benefit.

The benefit here is that when Microsoft restricts developers without good reason, it makes them unhappy.

If you recall, Ballmer is the one who once went on stage to affirm Microsoft’s focus on developers! developers! developers! through interpretive dance.

ballmer-developers-dance

Unless there’s something I’m missing (and feel free to enlighten me!), there’s no good reason to keep the platform restriction on most of these libraries. In such cases, focus on the developers!

At a recent Outercurve conference, Scott Guthrie, a corporate VP at Microsoft in charge of the Azure Development platform, told the audience that his team’s rule of thumb with new frameworks is to default to open source unless they have a good reason not to.

The Azure team recognizes that a strategy that requires total Windows hegemony will only lead to tears. Microsoft can succeed without having Windows on every machine. Hence Azure supports Linux, and PHP, and other non-Microsoft technologies.

I think the entire .NET team should look to what the Azure team is doing in deciding what their strategy regarding licensing should be moving forward. It makes more developers happy and costs very little to remove that one bullet point from the EULA. I know, I’ve been a part of a team that did it. We worked to destroy that bullet with fire (among others) in every ASP.NET MVC EULA.

Update: It looks like I may have overstated this. Licenses for products are based on templates. Typically a product team’s lawyer will grab a template and then modify it. So with ASP.NET MVC 1 and 2, we removed the platform restriction in the EULA. But it looks like the legal team switched to a different license template in ASP.NET MVC 3 and we forgot to remove the restriction. That was never the intention. Shame on past Phil. Present Phil is disappointed.

At least in this case, the actual source code is licensed under the Apache 2.0 license, so developers have the option to compile and redistribute it, making this a big inconvenience but not a showstopper.

Next Steps

I recently commented on a BCL blog post suggesting that the team remove the platform limitation on a library. Immo Landwerth, a BCL Program Manager responded with a good clarifying question:

Thanks for sharing your concerns and the candid feedback. Your post raised two very different items:

(1) Windows only restriction on the license of the binaries

(2) Open sourcing immutable collections

From what I read, it sounds you are more interested in (1), is this correct?

The post he refers to is actually one entitled Avoid the Managed Extensibility Framework, which Miguel de Icaza wrote when MEF came out with a license that had a platform restriction. Fortunately, that was quickly corrected in that case.

But now we seem to be in a similar situation again.

Here was my response:

@Immo, well I’m interested in both. But I also understand how Microsoft works enough to know that (1) is much easier than (2). :P

So ultimately, I think both would be great, but for now, (1) is a good first step and a minimal requirement for us to use it in ReactiveUI etc.

So while I’d love to see these libraries be open source, I think a minimal next step would be to remove the platform limitation on the compiled library and all future libraries.

And not just to make us long haired (but soon to have a haircut) open source hippies happy, but to make us enterprise developers happy. To make us cross-platform mobile developers happy.

code

One of the side projects I’ve been working on lately is helping to shepherd the Semantic Versioning specification (SemVer) along to its 2.0.0 release. I want to thank everyone who sent pull requests and engaged in thoughtful, critical, spirited feedback about the spec. Your involvement has made it better!

I also want to thank Tom for creating SemVer in the first place and trusting me to help move it along.

I’ve mentioned SemVer in the past as it relates to NuGet. The 2.0.0 release of SemVer addresses some of the issues I raised.

What’s Changed?

Not too much has changed. Most of the changes focus around clarifications.

Build metadata

Perhaps the biggest change is the addition of optional build metadata (what we used to call a build number). This simply allows you to add a bit of metadata to a version in a manner that’s compliant with SemVer.

The metadata does not affect version precedence. It’s analogous to a code comment.

It’s useful for internal package feeds and for being able to tie a specific version to some mechanism that generated it.

For existing package managers that choose to be SemVer 2.0 compliant, the logic change needed is minimal. Instead of reporting an error when encountering a version with build metadata, all they need to do is ignore or strip the build metadata. That’s pretty much it.
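For instance, here’s a minimal sketch (mine, not from any actual package manager) of what “strip the build metadata” amounts to:

// Build metadata starts at the '+', e.g. "1.0.0+5114f85" -> "1.0.0".
// It has no effect on precedence, so stripping it is safe for comparisons.
static string StripBuildMetadata(string version)
{
    int plus = version.IndexOf('+');
    return plus < 0 ? version : version.Substring(0, plus);
}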

Some package managers may choose to do more with it (for internal feeds for example) but that’s up to them.

Pre-release identifiers

Pre-release labels have a little more structure to them now. For example, they can be separated into identifiers using the “.” delimiter and identifiers that only contain digits are compared numerically instead of lexically. That way, 1.0.0-rc.1 < 1.0.0-rc.11 as you might expect. See the specification for full details.
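To illustrate the rule, here’s a rough sketch of comparing two pre-release labels. It’s my simplification of the spec, not a reference implementation:

using System;

static class PreReleaseComparer
{
    // Compares dot-separated pre-release labels, e.g. "rc.1" vs "rc.11".
    // Returns a negative number if left < right, positive if left > right.
    public static int Compare(string left, string right)
    {
        var a = left.Split('.');
        var b = right.Split('.');
        for (int i = 0; i < Math.Min(a.Length, b.Length); i++)
        {
            int x, y;
            bool aIsNumeric = int.TryParse(a[i], out x);
            bool bIsNumeric = int.TryParse(b[i], out y);

            if (aIsNumeric && bIsNumeric)
            {
                // Numeric identifiers compare numerically, so 1 < 11.
                if (x != y) return x.CompareTo(y);
            }
            else if (aIsNumeric != bIsNumeric)
            {
                // Numeric identifiers have lower precedence than
                // alphanumeric ones.
                return aIsNumeric ? -1 : 1;
            }
            else
            {
                // Alphanumeric identifiers compare lexically.
                int result = string.CompareOrdinal(a[i], b[i]);
                if (result != 0) return result;
            }
        }
        // All shared identifiers are equal; the longer label wins: rc < rc.1.
        return a.Length.CompareTo(b.Length);
    }
}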

Clarifications

The rest of the changes to the specification are concerned with clarifications and resolving ambiguities. For example, we clarified that leading zeroes are not allowed in the Major, Minor, or Patch version nor in pre-release identifiers that only contain digits. This makes a canonical form for a version possible.

If you find an ambiguity, feel free to report it.

What’s Next?

As SemVer matures, we expect the specification to become a little more formal in nature as a means of removing ambiguities. One such effort underway is to include a BNF grammar for the structure of a version number in the spec. This should hopefully be part of SemVer 2.1.

code

Code is unforgiving. As the reasonable human beings that we are, when we review code we both know what the author intends. But computers can’t wait to Well, Actually all over that code like a lonely Hacker News commenter:

Well Actually, Dave. I’m afraid I can’t do that.

HAL, paraphrased from 2001: A Space Odyssey

As an aside, imagine the post-mortem review of that code!

Code review is a tricky business. Code is full of hidden mines that lie dormant while you test, only to explode in a debris of stack trace at the most inopportune time – when it’s in the hands of your users.

The many times I’ve run into such mines just reinforce how important it is to write code that is intention revealing and to make sure assumptions are documented via asserts.

Such devious code is often the most innocuous looking code. Let me give one example I ran into the other day. I was fortunate to defuse this mine while testing.

This example makes use of the Enumerable.ToDictionary method that turns a sequence into a dictionary. You supply an expression to produce a key for each element. In this example, loosely based on the actual code, I am using the CloneUrl property of Repository as the key of the dictionary.

IEnumerable<Repository> repositories = GetRepositories();
repositories.ToDictionary(r => r.CloneUrl);

It’s so easy to gloss over this line during a code review and not think twice about it. But you probably see where this is going.

While I was testing I was lucky to run into the following exception:

System.ArgumentException: 
An item with the same key has already been added.

Doh! There’s an implicit assumption in this code – that two repositories cannot have the same CloneUrl. In retrospect, it’s obvious that’s not the case.

Let’s simplify this example.

var items = new[]
{
    new {Id = 1}, 
    new {Id = 2}, 
    new {Id = 2}, 
    new {Id = 3}
};
items.ToDictionary(item => item.Id);

This example attempts to create a dictionary of anonymous types using the Id property as a key, but we have a duplicate, so we get an exception.

What are our options?

Well, it depends on what you need. Perhaps what you really want is a dictionary where the value contains every item with the given key. The Enumerable.GroupBy method comes in handy here.

Perhaps you only care about the first value for a given key and want to ignore any others. The Enumerable.GroupBy method comes in handy in this case.

In the following example, we use this method to group the items by Id. This results in a sequence of IGrouping elements, one for each Id. We can then take advantage of a second parameter of ToDictionary and simply grab the first item in the group.

items.GroupBy(item => item.Id)
  .ToDictionary(group => group.Key, group => group.First());

This feels sloppy to me. There is too much potential for this to cover up a latent bug. Why should the other items be ignored? Perhaps, as in my original example, it’s fully normal to have more than one element for the key and you should handle that properly. Instead of grabbing the first item from the group, we retrieve an array.

items.GroupBy(item => item.Id)
  .ToDictionary(group => group.Key, group => group.ToArray());

In this case, we end up with a dictionary of arrays.

UPDATE: Or, as Matt Ellis points out in the comments, you could use the Enumerable.ToLookup method. I should have known such a thing would exist. It’s exactly what I need for my particular situation here.
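If you haven’t used it, ToLookup builds a structure made for this situation: each key maps to the sequence of all elements that share it, so duplicates are expected rather than exceptional. A quick sketch using the items array from earlier:

var lookup = items.ToLookup(item => item.Id);

foreach (var item in lookup[2])
{
    // Iterates over both elements with Id == 2. A key that is missing
    // simply yields an empty sequence rather than throwing.
    Console.WriteLine(item);
}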

What if having more than one element with the same key is not expected and should throw an exception? Well, you could just use the normal ToDictionary method since it will throw an exception. But that exception is unhelpful. It doesn’t have the information we probably want. For example, you just might want to know which key was already added, as the following demonstrates:

items.GroupBy(item => item.Id)
    .ToDictionary(group => group.Key, group =>
    {
        try
        {
            return group.Single();
        }
        catch (InvalidOperationException)
        {
            throw new InvalidOperationException(
                "Duplicate item with the key '" + group.First().Id + "'");
        }
    });

In this example, if a key has more than one element associated with it, we throw a more helpful exception message.

System.InvalidOperationException: Duplicate item with the key '2'

In fact, we can encapsulate this into our own better extension method.

public static Dictionary<TKey, TSource> ToDictionaryBetter<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector)
{
    return source.GroupBy(keySelector)
        .ToDictionary(group => group.Key, group =>
        {
            try
            {
                return group.Single();
            }
            catch (InvalidOperationException)
            {
                throw new InvalidOperationException(
                    string.Format("Duplicate item with the key '{0}'",
                        keySelector(group.First())));
            }
        });
}

Code mine mitigated!

This is just one example of a potential code mine that might go unnoticed during a code review if you’re not careful.

Now, when I review code and see a call to ToDictionary, I make a mental note to verify the assumption that the key selector must never lead to duplicates.

When I write such code, I’ll use one of the techniques I mentioned above to make my intentions more clear. Or I’ll embed my assumptions into the code with a debug assert that proves that the items cannot have a duplicate key. This makes it clear to the next reviewer that this code will not break for this reason. This code still might not open the hatch, but at least it won’t have a duplicate key exception.
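Going back to the repositories example, here’s a hedged sketch of what that assert might look like (Debug.Assert lives in System.Diagnostics; the repository names are the hypothetical ones from earlier):

var repositories = GetRepositories().ToList();

// Document the assumption for the next reviewer: clone URLs are unique,
// so the ToDictionary call below cannot throw a duplicate key exception.
Debug.Assert(
    repositories.Select(r => r.CloneUrl).Distinct().Count() == repositories.Count,
    "Repositories must have unique clone URLs.");

var repositoriesByUrl = repositories.ToDictionary(r => r.CloneUrl);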

If I search through my code, I will find many other examples of potential code mines. What are some examples that you can think of? What mines do you look for when reviewing code?

personal, empathy, parenting

This post is a departure from my typical software related topics, but I think you’ll find parallels with management and dealing with software developers.

Parenting is a skill like any other – it can be improved (for some more than others, amirite?!).

Look, I’m not trying to claim I’m the world’s greatest dad. But I was given a coffee mug with that claim by my kids. I don’t mean to brag, but I’m pretty sure they did a quantitative exhaustive analysis of all dads before conferring that award to me because that’s just how I raised them. Right kids?!

But I digress.

When my son was still a very young toddler, my wife and I took advantage of a Microsoft benefit that paid for parenting classes (Many Microsoft employees who are parents have no idea this benefit exists). We attended a series on “Reflective Parenting.” It was an amazing learning experience that taught us this idea that parenting is a skill like any other.

It’s a strange conceit of many parents that because they can reproduce, they suddenly are imbued with unassailable parenting skills.

As Richard Stallman once remarked, perhaps callously,

It doesn’t take special talents to reproduce—even plants can do it. On the other hand, contributing to a program like Emacs takes real skill. That is really something to be proud of.

It helps more people, too.

And you never have to clean up poo from an Emacs blowout!

He’s right about one part. Even plants can reproduce. But reproducing is the easy part. Plants can’t parent.

Parenting is a subject that trends towards being heavy on tradition. “If it was good enough for me, it’s good enough for my kids.” But that’s not how progress is made, my friends.

Despite my megalomaniacal tendencies, I like to think I turned out ok so far. My parents did a pretty good job. Does that mean I can’t strive to do even better? It’s worth a try. So in this post, I’ll explore what SCIENCE brings to bear on the subject. It may seem weird to invoke science in a subject as personal and emotional as parenting. But the scientific method is effective, even on a personal scale.

xkcd-standback

Note that the focus here is on core principles and less on specifics. I’m not going to tell you to spank or not spank your child (because we know that’ll end in a shit storm debate).

This post will focus on principles to consider when making your own decisions about these things. Because in the end, if you are a parent, it is ultimately up to you what you do…within reason.

That’s one reason I try to embrace the “no judging” philosophy towards other parents. Each parent has a different situation and different background. I may offer ideas that I think are helpful, but I won’t judge. Unless you tend to drive off with your five-week-old child on top of your car. I might judge just a teensy weensy bit then.

Lessons from Reflective Parenting

The about page for the Center for Reflective Parenting says it was founded…

…in response to groundbreaking research in child development and the study of the neurobiology of the developing mind showing that the single best way to positively impact the attachment relationship is to increase a parent’s capacity to reflect on their relationship with their child – to think about the meaning that underlies behavior.

There’s a lot of science and research underlying the core precepts of this approach. But when you hear it, it doesn’t sound academic at all. In fact, it sounds a lot like common sense.

There are three core lessons I took from the classes.

Empathy

The first is to work on developing empathy and understanding for your child. We learned a lot about what children are capable of developmentally at certain ages. For example, at very young ages, children aren’t very good at understanding cause and effect.

This allows you to develop more appropriate expectations and responses to the things your child may do. At some ages, you just can’t expect them to respond to reason, for example. (By their teenage years, they can respond to reason, they just choose not to. It’s different.)

Self Control

The second, and perhaps more important lesson for me personally, is that good parenting is more about controlling yourself than your child. This is because children reflect the behavior of their parents.

For example, we’ve all been there, in the car, with the kids loudly misbehaving, when a parent gets fed up, blows up, and screams at the kids. I’ve been there.

In that moment the parent is not disciplining. The parent is only momentarily making him or herself feel better. But this teaches the kids that the best way to handle a stressful situation is to lose your shit. Discipline comes from the calm moments when a parent is very considered and in control of his or her actions. Remember, kids don’t do what you tell them to do. They do as you do.

In such situations, the class taught us to attempt to empathize with what the children might be experiencing and base our actions on that. If we can’t help but to lose our temper, it’s OK to separate ourselves from the situation. For example, in extreme situations, you might pull over, step out of the car, and let the kids scream their heads off while you (out of earshot) calm your nerves.

Repairing

Now this last point is the most important lesson. Parents, we are going to fuck up. We’re going to do it royally. Accept it. Forgive yourself. And then repair the situation.

I’ve lost my shit plenty of times. I’m pretty sure I did it twice this past week. It doesn’t make me a bad parent, though I feel that way at the moment. What would make me a bad parent is if I doubled down on my anger and never apologized to the kids and never tried to repair whatever damage I may have caused.

Some parents believe in the doctrine of parental infallibility. Never let them see you sweat and never admit fault to your children lest they see an opening and walk all over you.

But when you consider the principle that kids do as you do, I don’t think this doctrine stands up to scrutiny. I hope you want your children to be able to admit when they were wrong and know how to deliver a sincere apology. Teach by living the example.

The Economist’s Guide to Parenting

Many years after the reflective parenting class, I listened to this outstanding Freakonomics Podcast episode on parenting.

I know what you’re thinking when you read the title of this podcast. You’re thinking what the **** — economists? What can economists possibly have to say about something as emotional, as nuanced, as humane, as parenting? Well, let me say this: because economists aren’t necessarily emotional (or, for that matter, all that nuanced or humane), maybe they’re exactly the people we need to sort this through. Maybe.

As you might expect, it’s hard to conduct a double blind laboratory study of raising kids. Are you going to separate twins at birth and knowingly give one to shitty parents and another to wonderful parents to examine the effects? Cool idea bro, but…

octopus-nope

But there are such things as natural experiments. There were studies done of large groups of twins separated at birth and raised by different adoptive parents.

The striking result of the studies is that what the parents did had very little influence on how the kids ended up. Letting kids watch as much TV as they want? Restricting TV? Helicopter parenting? Piano and violin lessons? Sorry Tiger Mom, it made very little difference.

Over and over again, guess what made the difference. It wasn’t what the parents did, it’s who they were. Educated parents ended up with educated kids. As far as I could tell, the study didn’t really get into cause and effect much. For example, is it because educated parents tend to do the things that lead to educated kids? Inconclusive. But they did find that many of the practices of “helicopter parents,” such as music lessons, had very little effect on the future success and happiness of the child.

But studies reveal there is one thing parents did that had a strong correlation with how their progeny end up: how the parents treated wait staff. Those who were rude to waiters and waitresses ended up with rude children. Those who were kind and tipped well ended up with kind kids.

See a pattern here?

Kids aren’t affected by what you tell them and what you teach them so much as by what you do. If you’re curious and love learning, your kids are more likely to be infused with a similar passion for learning.

On one level, this is encouraging. You don’t need to schedule every hour of your children’s free time with Latin and theremin lessons for them to turn out well.

On the other hand, it’s also very challenging. If you’re a naturally awful misanthropic person it’s much harder to change yourself than to simply pay for classes.

I can be pretty lazy. But after the Freakonomics podcast, I started making an effort to do one simple thing every morning: make the bed before I leave the room. Honestly, I didn’t care so much whether the bed was made, but I did like how a clean room made me feel less disheveled as I started my day.

And here’s the amazing thing. My son, who’s only five, will now sometimes come into the room to make our bed. I never asked him nor told him to. He saw me doing it and he’s reflecting my behavior. It’s really rather striking.

Go Forth And Parent

In the beginning I mentioned how parenting research applies to software developers. I wasn’t making a comparison of software developers to children (though if the description fits…). It’s more a comment on the idea that parenting is a lot like leadership. Like parents, leaders lead by doing, not by telling others what to do.

The good news is what you do as a parent has little effect on how your kids end up. The best thing you can do is focus on being the type of person you want your kids to be.

However, what you do in the interim can affect how well you cope with being a parent and all the travails that come with it. It may also affect what your relationship with your children will look like down the road. It’d kind of suck to raise wonderful, successful kids who want nothing to do with you. So don’t be awful to them.

This is why I still think it’s worthwhile to work on improving parenting skills. It’s less about affecting your kids’ success as adults and more about building a good, lasting relationship with them.

As with any skill, there’s always new evidence coming in that might cause you to reevaluate how you parent. For example, here’s a list of ten things parents are often dead wrong about.

Perhaps you factor those in and maybe improve your technique. Maybe not. The key thing is, don’t sweat it too much. Ultimately, what we all want is for our kids to lead fulfilled and happy lives. This is one reason I optimize for my own happiness so they hopefully reflect that.

Happy parenting!

happiness

code, open source, github comments edit

In some recent talks I make a reference to Conway’s Law, named after Melvin Conway (not to be confused with the British mathematician John Horton Conway, famous for Conway’s Game of Life, nor with Conway Twitty), which states:

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.

Many interpret this as a cynical jibe at software management dysfunction. But this was not Melvin’s intent. At least it wasn’t his only intent. On his website, he quotes from Wikipedia, emphasis mine:

Conway’s law was not intended as a joke or a Zen koan, but as a valid sociological observation. It is a consequence of the fact that two software modules A and B cannot interface correctly with each other unless the designer and implementer of A communicates with the designer and implementer of B. Thus the interface structure of a software system necessarily will show a congruence with the social structure of the organization that produced it.

I savor Manu Cornet’s visual interpretation of Conway’s Law. I’m not sure how Manu put this together, but it’s not a stretch to suggest that the software architectures these companies produce might lead to these illustrations.

organizational_charts

Having worked at Microsoft, the one that makes me laugh the most is the Microsoft box. Let’s zoom in on that one. Perhaps it’s an exaggerated depiction, but in my experience it’s not without some basis in truth.

ms-org 

The reason I mention Conway’s Law in my talks is to segue to the topic of how GitHub the company is structured. It illustrates why GitHub.com is structured the way it is.

So how is GitHub structured?

Well, Zach Holman has written about it in the past, talking about the distributed and asynchronous nature of GitHub. More recently, Ryan Tomayko gave a great talk (with associated blog post) entitled Your team should work like an open source project.

By far the most important part of the talk — the thing I hope people experiment with in their own organizations — is the idea of borrowing the natural constraints of open source software development when designing internal process and communication.

GitHub in many respects is structured like a set of open source projects. This is why GitHub.com is structured the way it is. It’s by necessity.

Like the typical open source project, we’re not all in the same room. We don’t work the same hours. Heck, many of us are not even in the same time zones. We don’t have top-down hierarchical management. This explains why GitHub.com doesn’t focus on the centralized tools or reports managers often want as a means of controlling workers. It’s a product that is focused more on the needs of the developers than on the needs of executives. It’s a product that allows GitHub itself to continue being productive.

Apply Conway’s Law

So if Conway’s Law is true, how can you make it work to your advantage? Well, by restating it as Jesse Toth does, according to this tweet by Sara Mei:

Conway’s Law restated by @jesse_toth: we should model our teams and our communication structures after the architecture we want.  #scotruby

Conway’s Law in its initial form is passive. It’s an observation of how software structures tend to follow social structures. So it only makes sense to move from observer to active participant and change the organizational structures to match the architecture you want to produce.

Do you see the effects of Conway’s Law in the software you produce?

code comments edit

In a recent post, Test Better, I suggested that developers can and ought to do a better job of testing their own code. If you haven’t read it, I recommend you read that post first. I’m totally not biased in saying this at all. GO DO IT ALREADY!

There was some interesting pushback in the comments. Some took it to mean that we should get rid of all the testers. Whoa whoa whoa there! Slow down folks.

I can see how some might come to that conclusion. I did mention that my colleague Drew wants to destroy the role of QA. But it’s not because we want to just watch it burn.

Rather, we’re interested in something better rising from the ashes. It’s not that there’s no need for testers in a software shop. It’s that what we need is a better idea of what a tester is.

Testers Are Not Second-Class Citizens

Perhaps you’ve had different experiences with testers than I have. Good for you, here’s a Twinkie. But the vast majority of you can probably relate to the following.

At almost every position I’ve held, developers treated testers like second-class citizens in the pecking order of employees. Testers were just a notch above unskilled labor.

Not every tester, mind you. There were always standouts. But I can’t tell you how many times developers would joke that testers are just wannabe developers who didn’t make the cut. The general attitude was that you could replace these folks with Amazon’s Mechanical Turk and not know the difference.

mechanical turk

Mechanical Turk from Wikimedia Commons. Public domain image.

And in some cases, it was true. At Microsoft, for example, it’s easier to be hired as a tester than as a developer. And once your foot is in the door, you can eventually make the transition to developer if you’re halfway decent.

This makes it very challenging to hire and retain good testers.

Elevate the Profession

But it shouldn’t be this way. We need to elevate the profession of tester. Drew and I talk about these things from time to time, and he told me to think of testers as folks who provide a service to developers. Developers should test their own code, but testers can provide guidance on how to better test the code, suggest usage scenarios, set up test labs, etc.

Then it hit me. We already have testers who are well respected by developers and who may serve as a model for what we mean by better testers.

Security Testers

By most accounts, good security testers are well respected and not seen as developer wannabes. If not respected, they are feared for what they might do to your system should you piss them off. One security expert I know mentioned that developers never click links he sends without first setting up a virtual machine to open them in. Either way, it works.

Perhaps by looking at some of the qualities of security testers and how they integrate into the typical development flow, we can tease out some ideas on what better testers look like.

Like regular testers, many security testers test code that’s ready to ship. Many sites hire white hat penetration testers to attempt to locate and exploit vulnerabilities in a site that’s already been deployed. These folks are experts who keep up to date on the latest in security testing. They are not folks you can just replace with a Mechanical Turk.

Of course, smart developers don’t wait until code is deployed to get a security expert involved. That’s way too late. Security testers can help in the early stages of planning by providing guidance on what patterns to avoid, what to look out for, and some good practices to follow. During the coding stages they can provide code reviews with an eye towards security or simply answer questions you may have about tricky situations.

Testing as a Service

There are other testers that also follow a similar model. If you need to target 12 languages, you’ll definitely want to work with a localization/internationalization tester. If you value usability you may want to work with a usability expert. The list goes on.

It’s just not possible for a developer to be an expert in all these areas. I’d expect developers to have a basic understanding of each, perhaps even be quite knowledgeable, but never as knowledgeable as someone who focuses on that area all the time.

The common theme among these testers is that they are providing a service to developers. They are sought out for their expertise.

General feature and quality testers should be no different. Good testers spend all their time learning and thinking about better and more efficient ways to test products. They are advocates for the end users and just as concerned about shipping software as developers are. They are not gatekeepers. They are enablers. They enable developers to ship better code.

This idea of testers as a service is not mine. It’s something Drew told me (he seriously needs to start his blog up again) that struck me.

By necessity, these would be folks who are great developers who have chosen to focus their efforts on the art and science of testing, just as another developer might choose to focus their efforts on the art and science of native clients, or reactive programming.

I love working with someone who knows way more about testing software and building in quality from the start than I do.

This is one of the motivations for me to test my own code better. If I’m going to leverage the skills of a great tester, it’s a matter of pride not to embarrass myself with stupid bugs I should have caught in my own testing. I want to impress these folks with crazy hard bugs I have no idea how to test.

Ok, maybe that last bit didn’t come out the way I intended. The point is when you work with experts, you don’t want them spending all their time with softballs. You want their help with the meaty stuff.

community, open source, personal comments edit

Someone recently emailed me to ask if I’m speaking at any upcoming conferences this year. Good question!

I’ve been keeping it pretty light this year since my family and I are doing a bit of travelling ourselves and I like spending time with them.

But I will be hitting up two conferences that I know of.

<anglebrackets> April 8 – 11

Ohmagerd! That’s this week! I better prepare!

I’ll be giving two talks this week. One of them will be a joint talk with the incomparable Scott Hanselman. Usually that means him taking potshots at me for your enjoyment. ARE YOU NOT ENTERTAINED?!

are_you_not_entertained-135569

You will be!

Jazz Up Your Open Source with GitHub

Wednesday April 10 3:30 PM – 4:45 PM - Room 5 (Just Me)

You write some code that handles angle brackets like nobody’s business and you’re ready to share it with the world on GitHub. Great! Now what?

The story doesn’t end there. When the first users and contributors show up at your doorstep, you need to be prepared. Find out some tips for engaging an audience with your open source project and really make your project sing.

Return of the HaaHa Show: How to Open Source

Thursday April 11 8:00 AM (HWHAT!?) – 9:00 AM – Keynote Room 2 – Scott and Phil

They are back. ScottHa and PhilHaa reprise their legendary (OK, not really) HaaHa show that has thrilled audiences on three continents. There will be code. There will be jokes, bad ones. There will be Pull Requests. There will be Markdown. Will there be injuries? Papercuts? Let’s find out as we join Phil Haack and Scott Hanselman as they learn how to open source. We will answer questions like: How do I get involved in open source? How do I clone a repo, branch it, do a pull request, and commit to an open source project? Seems kind of hard. Let’s see if it is!

MonkeySpace 2013 July 22-25

The call for proposals for this conference is still open. If you know anyone who might bring a diverse and unique perspective to this conference, please encourage them to submit. We’d really love to get a more diverse speaker cast than is typical for a conference on .NET open source. This conference is no longer just a conference on Mono. Mono figures prominently, but the scope has expanded to the broader topic of .NET open source and cross platform .NET.

Others

I’ll be in Tokyo, Japan in late April. So if you have a user group there that meets on Tuesday 4/30 and want to hear about GitHub, Git, NuGet, or even ASP.NET MVC, let me know. I’d be happy to swing by, but be warned: I do not speak Japanese.

There might also be some local upcoming conferences I’ll speak at.

Podcasts

I was recently a guest on Yet Another Podcast with Jesse Liberty where I talked about Git, GitHub, GitHub for Windows, and subverting the oppressive traditional hierarchical organizational structure that serves to keep us down. FIGHT THE POWER!

Check it out.

Tags: speaking, talks, opensource, podcast

code, tdd, github comments edit

Developers take pride in speaking their mind and not shying away from touchy subjects. Yet there is one subject that makes many developers uncomfortable.

Testing.

I’m not talking about drug testing, unit testing, or any form of automated testing. After all, while there are still some holdouts, at least these types of tests involve writing code. And we know how much developers love to write code (even though that’s not what we’re really paid to do).

No, I’m talking about the kind of testing where you get your hands dirty actually trying the application. Where you attempt to break the beautifully factored code you may have just written. At the end of this post, I’ll provide a tip using GitHub that’s helped me with this.

TDD isn’t enough

I’m a huge fan of Test Driven Development. I know, I know. TDD isn’t about testing, as Uncle Bob sayeth from on high in his book, Agile Software Development: Principles, Patterns, and Practices,

The act of writing a unit test is more an act of design than of verification.

And I agree! TDD is primarily about the design of your code. But notice that Bob doesn’t omit the verification part. He simply provides more emphasis to the act of design.

In my mind it’s like wrapping a steak in bacon. The steak is the primary focus of the meal, but I sure as hell am not going to throw away the bacon! I know, half of you are hitting the reply button to suggest you prefer the bacon. Me too, but allow me this analogy.

bacon-wrapped-steak

MMMM, gimme dat! Credit: Jason Lam, CC BY-SA 2.0

The problem I’ve found myself running into, despite my own advice to the contrary, is that I start to trust too much in my unit tests. Several times I’ve made changes to my code, crafted beautiful unit tests that provide 100% assurance that the code is correct, only to have customers run into bugs with the code. Apparently my 100% correct code has a margin of error. Perhaps Donald Knuth said it best,

Beware of bugs in the above code; I have only proved it correct, not tried it.

It’s surprisingly easy for this to happen. In one case, we had a UI gesture bound to a very well tested method. All tests pass. Ship it!

Except when you actually execute the code, you find that there’s a certain situation where an exception might occur that causes the code to attempt to modify the UI on a thread other than the UI thread #sadtrombone. That’s tricky to catch in a unit test.
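
To make this concrete, here’s a minimal WPF-flavored sketch of how that kind of defect sails past unit tests. This is hypothetical code, not the actual GitHub for Windows bug:

using System;
using System.Threading.Tasks;
using System.Windows.Controls;

public class SaveHandler
{
    readonly TextBlock status;

    public SaveHandler(TextBlock status)
    {
        this.status = status;
    }

    // The logic inside DoWorkAsync can be unit tested to death. But the
    // continuation below runs on a thread-pool thread, and WPF throws an
    // InvalidOperationException the moment you touch a UI element from
    // any thread other than the one that owns it.
    public void OnSaveClicked()
    {
        DoWorkAsync().ContinueWith(task =>
        {
            if (task.IsFaulted)
                status.Text = "Save failed"; // boom: wrong thread
        });
    }

    static Task DoWorkAsync()
    {
        return Task.Run(() => { throw new InvalidOperationException("network error"); });
    }
}

The fix is easy once you see it (pass TaskScheduler.FromCurrentSynchronizationContext() to ContinueWith), but no unit test of the underlying method would ever flag it.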

Getting Serious about Testing

When I joined the GitHub for Windows (GHfW) team, we were still in the spiking phase, constantly experimenting with the UI and code. We had very little in the way of proper unit tests. Which worked fine for two people working in the same code in the same room in San Francisco. But here I was, the new guy hundreds of miles away in Bellevue, WA without any of the context they had. So I started to institute more rigor in our unit and integration tests as the product transitioned to a focus on engineering.

But we still lacked rigor in regular non-automated testing. Then along comes my compatriot, Drew Miller. If you recall, he’s the one I cribbed my approach to structuring unit tests from.

Drew really gets testing in all its forms. I first started working with him on the ASP.NET MVC team when he joined as a test lead. He switched disciplines from developer to QA because he wanted a venue to test his theories on testing and eventually show the world that we don’t need a separate QA person. Yes, he became a tester so he could destroy the role, in order to save the practice.

In fact, he hates the term QA (which stands for Quality Assurance):

The only assurance you will ever have is that code has bugs. Testing is about confidence. It’s about generating confidence that the user’s experience is good enough. And it’s about feedback. It’s about providing feedback to the developer in lieu of a user in the room. Be a tester, don’t be QA.

On the GitHub for Windows team, we don’t have a tester. We’re all responsible for testing. With Drew on board, we’re also getting much better at it.

Testing Your Own Code and Cognitive Bias

There’s this common belief that developers shouldn’t test their own code. Or maybe they should test it, but you absolutely need independent testers to also test it as well. I used to fully subscribe to this idea. But Drew has convinced me it’s hogwash.

It’s strange to me how developers will claim they can absolutely architect systems, provide insights into business decisions, write code, and do all sorts of things better than the suits and other co-workers, but when it comes to testing? Oh no no no, I can’t do that!

I think it’s a myth we perpetuate because we don’t like it! Of course we can do it, we’re smart and can do most anything we put our minds to. We just don’t want to so we perpetuate this myth.

There is some truth to the claim that developers tend to be bad at testing their own code. For example, the goal of a developer is to write software that is as bug-free as possible. The presence of a bug is a negative. And it’s human nature to try to avoid things that make us sad. It’s very easy to unconsciously ignore code paths we’re unsure of while doing our testing.

A tester’s job, on the other hand, is to find bugs. A bug is a good thing to these folks. Thus they’re well suited to testing software.

But this oversimplifies our real goal as developers and testers: to ship quality software. Our goals are not at odds. This is the mental switch we must make.

And We Can Do It!

After all, you’ve probably heard it said a million times, when you look back on code written several months ago, you tend to cringe. You might not even recognize it. Code in the brain has a short half-life. For me, it only takes a day before code starts to slip my mind. In many respects, when I approach code I wrote yesterday, it’s almost as if I’m someone else approaching the code.

And that’s great for testing it.

When I think I’m done with a feature or a block of code, I pull a mental trick. I mentally envision myself as a tester. My goal now is to find bugs in this code. After all, if I find them and fix them first, nobody else has to know. Whenever a customer finds a bug caused by me, I feel horrible. So I have every incentive to try and break this code.

And I’m not afraid to ask for help when I need it. Sometimes it’s as simple as brainstorming ideas on what to test.

One trick my team has started doing that I really love: when a feature is about done, we update the pull request (remember, a pull request is a conversation about some code, and you don’t have to wait for the code to be ready to merge to create one) with a test plan using the new Task Lists feature of GitHub Flavored Markdown.

This puts me in a mindset to think about all the possible ways to break the code. Some of these items might get pulled from our master test plan or get added to it.
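
If you haven’t seen Task Lists, the Markdown is just a bulleted list with square brackets, and GitHub renders each item as a live checkbox in the PR description. The items below are invented for illustration:

- [x] Fresh clone over HTTPS
- [x] Fresh clone over SSH
- [ ] Clone a repository that has submodules
- [ ] Cancel a clone halfway through and retry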

Here’s an example of a portion of a recent test plan for a major bug fix I worked on (click on it to see it larger).

test-plan-in-pr

The act of writing the test plan really helps me think hard about what could go wrong with the code. Then running through it just requires following the plan and checking off boxes. Sometimes as I’m testing, I’ll think of new cases and I’ll just edit the plan accordingly.

Also, the test plan can serve as an indicator to others that the PR is ready to be merged. When you see everything checked off, then it should be good to go! Or if you want to be more explicit about it, add a “sign-off” checkbox item. Whatever works best for you.

The Case for Testers

Please don’t use this post to justify firing your test team. The point I’m trying to make is that developers are capable of and should test their own (and each other’s) code. It should be a badge of pride that testers cannot find bugs in your code. But until you reach that point, you’re probably going to need your test team to stick around.

While my team does not have dedicated testers, we consider each of us to be testers. It’s a frame of mind we can shift into when we need to.

But we’re also not building software for the Space Shuttle so maybe we can get away with this.

I’m still of the mind that many teams can benefit from a dedicated tester. But the role this person has is different from the traditional rote mechanical testing that testers often get lumped into. This person would mentor developers in the testing part of building software and help them get into that mindset. This person might also work to streamline whatever crap gets in the way so that developers can better test their code. For example, building automation that sets up test labs for various configurations at a moment’s notice. Or helping to verify incoming bug reports from customers.


nuget comments edit

How can you trust anything you install from NuGet? It’s a simple question, but the answer is complicated. Trust is not some binary value. There are degrees of trust. I trust my friends to warn me before they contact the authorities and maybe suggest a lawyer, but I trust my wife to help me dispose of the body and uphold the conspiracy of silence (Honey, it was in the fine print of our wedding vows in case you’re wondering).

The following are some ideas I’ve been bouncing around with the NuGet team about trust and security since even before I left NuGet. Hopefully they spark some interesting discussions about how to make NuGet a safer place to install packages.

Establish Identity and Authorship

The question “do I trust this package” is not the best question to ask. The more pertinent question is “do I trust the author of this package?”

NuGet doesn’t change how you go about answering this question yet. Whether you found a zip file on some random website or install it via NuGet, you still have to answer the following questions (perhaps unconsciously):

  1. Who is the author?
  2. Is the author trustworthy?
  3. Do I trust that this software really was written by the author?
  4. Is the author’s means of distributing software tamper resistant and verifiable?

In some cases, non-NuGet software is signed with a certificate. That helps answer questions 1, 2, and 3. But chances are, you don’t restrict yourself to only using certificate-signed libraries. I looked through my own installed Visual Studio extensions and several were not certificate-signed.

NuGet doesn’t yet support package signing, but even if it did, it wouldn’t solve this problem sufficiently. If you want to know more why I think that, read the addendum about package signing at the end of this post.

What most people do in such situations is try to find alternate means to establish identity and authorship:

  1. I look for other sites that link to this package and mention the author.
  2. I look for sites that I already know to be in control of the author (such as a blog or Twitter account) and look for links to the package.
  3. I look for blog posts and tweets from other people I trust mentioning the package and author.

I think NuGet really needs to focus on making this better.

A Better Approach

There isn’t a single solution that will solve the problem. But I do believe a multipronged approach will make it much easier for people to establish the identity and authorship of a package and make an educated decision on whether or not to install any given package.

Piggyback on other verification systems

This first idea is a no-brainer to me. I’m a lazy bastard. If someone else has done the hard work, I’d like to build on what they’ve done.

This is where social media can come into play and have a useful purpose beyond telling the world what you ate for lunch.

For example, suppose you want to install RouteMagic and you see that the package owner is some user named haacked on NuGet. Who is this joker?

Hey! Maybe you happen to know haacked on GitHub! Is that the same guy as this one? You also know a haacked on Twitter and you trust that guy. Can we tie all these identities together?

Well, it’d be easy through OAuth. The NuGet gallery could allow me to verify that I am the same person as haacked on GitHub and Twitter by doing an OAuth exchange with those sites. Only the real haacked on Twitter could authenticate as haacked on Twitter.

The more identities I attach to my NuGet account, the more you can trust that identity. It’s unlikely someone will hack both my GitHub and Twitter accounts.

The NuGet Gallery would need to expose these verifications in the UI anywhere I see a package owner, perhaps with little icons.

With Twitter, you could go even further. Twitter has the concept of verified identities. If we trust their process of verification, we could piggyback on that and show a verified icon next to Twitter verified users, adding more weight to your claimed identity.

This would be so easy and cheap to implement and provide a world of benefit for establishing identity.
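
To show how little gallery-side machinery this needs, here’s a rough sketch. Every name below is hypothetical, not actual NuGet Gallery code:

using System;
using System.Collections.Generic;

public class VerifiedIdentity
{
    public string Provider;    // "github" or "twitter"
    public string Username;    // e.g. "haacked"
    public DateTime VerifiedAt;
}

public class IdentityStore
{
    readonly Dictionary<string, List<VerifiedIdentity>> byAccount =
        new Dictionary<string, List<VerifiedIdentity>>();

    // Called from the OAuth callback, only after the provider has
    // authenticated the user. That's the whole point: only the real
    // haacked on Twitter can complete the exchange as haacked on Twitter.
    public void RecordVerification(string nugetAccount, string provider, string username)
    {
        List<VerifiedIdentity> identities;
        if (!byAccount.TryGetValue(nugetAccount, out identities))
        {
            identities = new List<VerifiedIdentity>();
            byAccount[nugetAccount] = identities;
        }
        identities.Add(new VerifiedIdentity
        {
            Provider = provider,
            Username = username,
            VerifiedAt = DateTime.UtcNow
        });
    }
}

The gallery’s job is mere bookkeeping; the hard part, proving you control the external account, is entirely outsourced to the OAuth provider.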

Build our own verification system

Eventually, I think NuGet might want to consider having its own verification system and NuGet Verified Accounts™. Doing this right, without simply favoring corporations over the little guy, is much costlier than my previous suggestion.

Honestly, if we implemented the first idea well, I’m not sure this would need to happen anytime soon.

Vouching

This idea is inspired by the concept of a Web of Trust with PGP which provides a decentralized approach to establishing the identity of the owner of a public key.

While the previous ideas help establish identity, we still don’t know if we can trust these people. Chances are, if someone has a well established identity they won’t want to smudge their reputation with malware. But what about folks without well established reputations?

We could implement a system of vouching. For example, suppose you trust me and I vouch for ten people. And they in turn vouch for ten people each. That’s a network of 111 potentially trustworthy people. Of course, each degree you move out, the level of trust declines. You probably trust me more than the people I trust. And those people more than the people they trust. And so on.

How do we use this information in NuGet?

It could be as simple as factoring it into sort order. For example, one factor in establishing trust in a package today is looking at the download count of a package. Chances are that a malware library is not going to get ten thousand downloads.

We could also incorporate the level of trust of the package owner into that sort order. For example, show me packages for sending emails in order of trust and download count.
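
Here’s a sketch of one naive way to compute that. The names and the decay factor are my own invention, purely to illustrate the shape of the idea:

using System;
using System.Collections.Generic;

static class TrustRanking
{
    // Walk outward from "me" through the vouch graph, breadth-first,
    // halving trust at each degree of separation.
    public static Dictionary<string, double> ComputeTrust(
        string me, Dictionary<string, List<string>> vouches)
    {
        var trust = new Dictionary<string, double>();
        trust[me] = 1.0;
        var queue = new Queue<string>();
        queue.Enqueue(me);
        while (queue.Count > 0)
        {
            string current = queue.Dequeue();
            List<string> vouched;
            if (!vouches.TryGetValue(current, out vouched)) continue;
            foreach (string person in vouched)
            {
                if (trust.ContainsKey(person)) continue;
                trust[person] = trust[current] / 2.0; // decay per degree
                queue.Enqueue(person);
            }
        }
        return trust;
    }

    // Unknown owners get zero trust and sink to the bottom of the results.
    public static double SortKey(string owner, int downloads,
        Dictionary<string, double> trust)
    {
        double ownerTrust;
        trust.TryGetValue(owner, out ownerTrust);
        return ownerTrust * Math.Log(downloads + 1);
    }
}

A real system would need to damp abuse (vouching rings, armies of fake accounts), but the computation itself is this simple.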

Other attack vectors

So far, I’ve focused on establishing trust in the author of a package. But a package manager system has other attack vectors.

For example, the place where packages are stored could be hacked or the service itself could be hacked.

If Azure Blob storage were hacked, an attacker could swap out packages of trusted authors with untrusted materials. This is a real concern. Luckily, NuGet.org stores the hash of each package and presents it in the feed. The NuGet client verifies the contents against that hash before installing it on the user’s machine.
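
The check itself is straightforward. Here’s a minimal sketch, assuming the feed publishes a base64-encoded SHA512 hash for each package; the method name is mine, not NuGet’s actual implementation:

using System;
using System.IO;
using System.Security.Cryptography;

static class PackageVerifier
{
    // Hash the downloaded .nupkg and compare it to the hash the feed
    // advertised. If Blob storage were tampered with but the database
    // weren't, the hashes won't match and the install can be aborted.
    public static bool MatchesFeedHash(string nupkgPath, string expectedHashBase64)
    {
        using (var sha512 = SHA512.Create())
        using (var stream = File.OpenRead(nupkgPath))
        {
            string actual = Convert.ToBase64String(sha512.ComputeHash(stream));
            return actual == expectedHashBase64;
        }
    }
}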

However, suppose the NuGet.org database were hacked. There is still a level of protection because any hash tampering would be caught by the clients.

An attacker would have to compromise both the Azure Blob Storage and the NuGet.org database.

Or worse, if the attacker compromises the machine that hosts NuGet, then it’s game over as they could corrupt the hashes and run code to pull packages from another location.

Mitigations of this nightmare scenario include having different credentials for Blobs and the database and constant security reviews of the NuGet code base.

Another thing we should consider is storing package hashes in packages.config so that Package Restore could at least verify packages during a restore in this nightmare scenario. But this wouldn’t solve the issue with installing new packages.
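
It might look something like this; the hash attribute here is hypothetical, since packages.config doesn’t store one today:

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <!-- hypothetical "hash" attribute for tamper detection during restore -->
  <package id="SimpleJson" version="1.1.0" targetFramework="net45"
           hash="sha512-AbC123...==" />
</packages>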

PowerShell Scripts

NuGet makes use of PowerShell scripts to perform useful tasks not covered by a typical package.

A lot of folks get worried about this as an attack vector and want a way to disable these scripts. There are definitely bad things that could happen and I’m not opposed to having an option to disable them, but this only gives a false sense of security. It’s security theater.

Why’s that, you say? Well, a package with only assemblies can still bite you through the use of Module Initializers.

Modules may contain special methods called module initializers to initialize the module itself.

All modules may have a module initializer. This method shall be static, a member of the module, take no parameters, return no value, be marked with rtspecialname and specialname, and be named .cctor.

There are no limitations on what code is permitted in a module initializer. Module initializers are permitted to run and call both managed and unmanaged code.

The module’s initializer method is executed at, or sometime before, first access to any types, methods, or data defined in the module

If you’re installing a package, you’re about to run some code with or without PowerShell scripts. The proper mitigation is to stop running your development environment as an administrator and make sure you trust the package author before you install the package.
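
For the curious: when this was written you couldn’t declare a module initializer in C# at all; you needed raw IL or an IL-weaving tool. Purely as an illustration, modern C# (9 and later) can declare one directly:

using System;
using System.Runtime.CompilerServices;

public static class NotAtAllEvil
{
    // Runs the first time anything in this module is touched. Merely
    // referencing the assembly and using one of its types executes this,
    // no PowerShell install script required.
    [ModuleInitializer]
    public static void Initialize()
    {
        Console.WriteLine("Hello from a module initializer.");
    }
}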

At least with NuGet, when you install a package it doesn’t require elevation. If you install an MSI, you’d typically have to elevate privileges.

Addendum: Package Signing is not the answer

Every time I talk about NuGet security, someone gets irate and demands that we implement signing immediately, as if it were some magic panacea. I’m definitely not against implementing package signing, but let’s be clear: it is a woefully inadequate solution in and of itself, and there are better things we should do first, as I’ve already outlined in this post.

The Cost and Ubiquity Problem

Very few people will sign their packages. RubyGems supports package signing and I’ve been told the number of gems that take advantage of it is nearly zero. Visual Studio extensions also support signing. Quick, go look at your list of installed extensions. Were any unsigned?

The problem is this: if you require certificate signing, you’ve just created too much friction to create a package, and the package manager ecosystem will dry up and die. Requiring signing is just not an option.

The reason is that obtaining a certificate and properly signing software with it is a costly proposition by its very nature. A certificate implies that some authority has verified your identity. For that verification to have value, it must be somewhat reliable and thorough. It can’t be immediate and easy, or bad actors could obtain certificates just as easily.

Package signing is only a good solution if you can guarantee near ubiquity. Otherwise you still need alternative solutions.

The User Interface Problem

Once you allow package signing, you then have the user interface problem. Visual Studio Extensions is an interesting example of this conundrum. You only see that a package is digitally signed after you’ve downloaded and decided to install it. At that point, you tend to be committed already.

vs-extension-gallery

Also notice that the message that this package isn’t signed is barely noticeable.

Ok, so it’s not signed. What can I do about it, other than probably install it anyway because I really want this software? The fact that a package was signed didn’t change my behavior in any way.

Visual Studio could put up more dire-looking warnings, but it would alienate the community of extension authors by doing so. It could require signing, but that would put onerous restrictions on creating packages and would cause the extension ecosystem to wither away, leaving only packages sponsored by corporations.

The point here is that even with signed packages, there’s not much it would do for NuGet. Perhaps we could support a mode where it gave a more dire warning or even disallowed unsigned packages, but that’d just be annoying and most people would never use that mode because the selection of packages would be too small.

The only benefit in this case of signing is that if a package did screw something up, you could probably chase down the author if they signed it. But that’s only a benefit if you never install unsigned packages. Since most people won’t sign them, this isn’t really a viable way to live.

Conclusion

Just to be clear: I’m actually in favor of supporting package signing eventually. But I do not support requiring package signing to make it into the NuGet gallery. And I think there are much better approaches we can take first to mitigate the risk of using NuGet before we get to that point.

I worry that implementing signing just gives a false sense of security and we need to consider all the various ways that people can establish trust in packages and package authors.

git, github, code comments edit

The other day I needed a simple JSON parser for a thing I worked on. Sure, I’m familiar with JSON.NET, but I wanted something I could just compile into my project. The reason why is not important for this discussion (but it has to do with world domination, butterflies, and minotaurs).

I found the SimpleJson package which is also on GitHub.

SimpleJson takes advantage of a neat little feature of NuGet that allows you to include source code in a package and have that code transformed into the appropriate namespace for the package target. Oftentimes, this is used to install sample code or the like into a project. But SimpleJson uses it to distribute the entire library.
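
The mechanism, briefly: source files in the package’s content folder carry a .pp extension and can contain tokens such as $rootnamespace$ that NuGet substitutes at install time. A simplified illustration (not SimpleJson’s actual file):

// Shipped in the package as content\SimpleJson.cs.pp. On install, NuGet
// strips the .pp extension and replaces $rootnamespace$ with the consuming
// project's root namespace, so the source compiles into your assembly.
namespace $rootnamespace$
{
    public partial class SimpleJson
    {
        // ...the parser lives here...
    }
}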

At first glance, this is a pretty sweet way to distribute a small single source file utility library. It gets compiled into my code. No binding redirects to worry about. No worries about different versions of the same library pulled in by dependencies. In my particular case, it was just what I needed.

But I started to think about the implications of such an approach on a wider scale. What if everybody did this?

The Update Problem

If such a library were used by multiple packages, it actually could limit the consumer’s ability to update the code.

For example, suppose I have a project that installs the SimpleJson package and also the SimpleOtherStuff package, where SimpleOtherStuff has a dependency on SimpleJson 1.0.0 and higher. The following diagram outlines the NuGet package dependency graph. It’s very simple.

nuget-dependency-graph

Now suppose we learn that SimpleJson 1.0.0 has a very bad security issue and we need to upgrade to the just released SimpleJson 1.1.

So we do just that. Everything should be hunky dory as we’re now using SimpleJson 1.1 everywhere. Or are we?

nuget-dependency-graph-2

If all the references to SimpleJson were assembly references, we’d be fine. But recall, it’s a source code package. Even though we upgraded it in our application, SimpleOtherStuff 1.0.0 has SimpleJson 1.0.0 compiled into it.

There’s no way to upgrade SimpleOtherStuff’s reference other than to wait for the package author to do it or to manually recompile it ourselves (assuming the source is available).

You Are in Control

A guiding principle in the design of NuGet is that we try to keep you, the consumer of the packages, in control of things. Want to uninstall a package even though other packages reference it? We’ll prevent it by default but then offer you a -Force flag so you can tell NuGet, “No really, I know what I’m doing here and am ready to face the consequences.”

We don’t do this perfectly in every case. Pre-release packages come to mind. But it’s a principle we try to follow.

Source code packages are interesting in that they give you more control in one area (you have the source), but take it away in another (upgrades are no longer complete).

Note that I’m not picking on SimpleJson. As I said before, I really needed this. In fact, I contributed back with several Pull Requests. I’m just pointing out a caveat to consider when using such packages.

Making it Better

So yeah, be careful. There are caveats. But couldn’t we make this better? Well I have an idea. Ok, it’s not my idea but an idea that some of my coworkers and I have bounced around for a while.

Imagine if you could attach a Git repository to your NuGet package. When you install the package, you could add a flag to install it as a Git Submodule rather than the normal assembly approach. Maybe it’d look like this.

Install-Package SimpleJson -AsSource

What this would do is initialize a submodule, and grab the source from GitHub. Perhaps it goes further and adds the files as linked files into your target project based on a bit of configuration in the source tree.
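
Under the hood, the hypothetical -AsSource switch might boil down to little more than this (the URL and path are made up for illustration):

# Roughly what Install-Package SimpleJson -AsSource could do:
git submodule add https://github.com/example/SimpleJson.git packages-src/SimpleJson
git submodule update --init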

There’s a lot of possibilities here to flesh out. The Update-Package command could simply run a Git submodule update on these submodules and do a normal update for all the other packages.

Since Microsoft recently made it clear that Git is the future of DVCS as far as Microsoft is concerned, maybe now is the time to think about tighter integration with NuGet. What do you think?

At the very least, perhaps NuGet needs a better extensibility model so we could build this support in outside of NuGet. That’s the more prudent approach of course, but I’m not feeling so prudent today.

code comments edit

Today I learned something new and I love that!

I was looking at some code that looked like this:

try
{
    await obj.GetSomeAsync();
    Assert.True(false, "SomeException was not thrown");
}
catch (SomeException)
{
}

That’s odd. We’re using xUnit. Why not use the Assert.Throws method? So I tried with the following naïve code.

Assert.Throws<SomeException>(() => await obj.GetSomeAsync());

Well that didn’t work. I got the following helpful compiler error:

error CS4034: The ‘await’ operator can only be used within an async lambda expression. Consider marking this lambda expression with the ‘async’ modifier.

Oh, I never really thought about applying the async keyword to a lambda expression, but it makes total sense. So I tried this:

Assert.Throws<SomeException>(async () => await obj.GetSomeAsync());

Hey, that worked! I rushed off to tell the internets on Twitter.

But I made a big mistake. That only made the compiler happy. It doesn’t actually work. It turns out that Assert.Throws takes in an Action and thus that expression doesn’t return a Task to be awaited upon. Stephen Toub explains the issue in this helpful blog post, Potential pitfalls to avoid when passing around async lambdas.

Ah, I’m gonna need to write my own method that takes in a Func<Task>. Let’s do this!

I wrote the following:

public static async Task<T> ThrowsAsync<T>(Func<Task> testCode)
    where T : Exception
{
  try
  {
    await testCode();
    Assert.Throws<T>(() => { }); // Use xUnit's default behavior.
  }
  catch (T exception)
  {
    return exception;
  }
  // Never reached. Compiler doesn't know Assert.Throws above always throws.
  return null;
}

Here’s an example of a unit test (using xUnit) that makes use of this method.

[Fact]
public async Task RequiresBasicAuthentication()
{
  await ThrowsAsync<SomeException>(async () => await obj.GetSomeAsync());
}

And that works. I mean it actually works. Let me know if you see any bugs with it.

Note that you have to change the return type of the test method (fact) from void to return Task and mark it with the async keyword as well.

So as I was posting all this to Twitter, I learned that Brendan Forster (aka @ShiftKey) already built a library that has this type of assertion. But it wasn’t on NuGet so he’s dead to me.

But he remedied that five minutes later.

Install-Package AssertEx.

So we’re all good again.

If I were you, I’d probably just go use that. I just thought this was an enlightening look at how await works with lambdas.