asp.net mvc, code comments edit

Sometimes, you need to scan all the types in an assembly for a certain reason. For example, ASP.NET MVC does this to look for potential controllers.

One naïve implementation is to simply call Assembly.GetTypes() and hope for the best. But there’s a problem with this. As Suzanne Cook points out,

If a type can’t be loaded for some reason during a call to Module.GetTypes(), ReflectionTypeLoadException will be thrown. Assembly.GetTypes() also throws this because it calls Module.GetTypes().

In other words, if any type can’t be loaded, the entire method call blows up and you get zilch.

There are multiple reasons why a type can’t be loaded. Here’s one example:

public class Foo : Bar // Bar defined in another unavailable assembly
{
}

The class Foo derives from a class Bar, but Bar is defined in another assembly. Here’s a non-exhaustive list of reasons why loading Foo might fail:

  • The assembly containing Bar does not exist on disk.
  • The current user does not have permission to load the assembly containing Bar.
  • The assembly containing Bar is corrupted and not a valid assembly.

Once again, for more details check out Suzanne’s blog post on Debugging Assembly Loading Failures.

Solution

As you might expect, being able to get a list of types, even if you don’t plan on instantiating them, is a common and important task. Fortunately, the ReflectionTypeLoadException thrown when a type can’t be loaded contains all the information you need. Here’s an example of ASP.NET MVC taking advantage of this within the internal TypeCacheUtil class (there are a lot of other great code nuggets if you look around the source code):

Type[] typesInAsm;
try
{
    typesInAsm = assembly.GetTypes();
}
catch (ReflectionTypeLoadException ex)
{
    typesInAsm = ex.Types;
}

This would be more useful as a generic extension method. Well, the estimable Jon Skeet has you covered in this StackOverflow answer (slightly edited to add parameter validation):

public static IEnumerable<Type> GetLoadableTypes(this Assembly assembly)
{
    if (assembly == null) throw new ArgumentNullException(nameof(assembly));
    try
    {
        return assembly.GetTypes();
    }
    catch (ReflectionTypeLoadException e)
    {
        return e.Types.Where(t => t != null);
    }
}

I’ve found this code to be extremely useful many times.
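To give a feel for how you might use it, here’s a sketch of a type scan in the spirit of MVC’s controller discovery. The `HomeController` class and the name-based filter are illustrative only, not MVC’s actual lookup logic; the extension method is repeated so the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class AssemblyExtensions
{
    // Same extension method as above, included so this sample is self-contained.
    public static IEnumerable<Type> GetLoadableTypes(this Assembly assembly)
    {
        if (assembly == null) throw new ArgumentNullException(nameof(assembly));
        try
        {
            return assembly.GetTypes();
        }
        catch (ReflectionTypeLoadException e)
        {
            // e.Types contains null entries for the types that failed to load.
            return e.Types.Where(t => t != null);
        }
    }
}

public class HomeController { } // hypothetical type so the scan finds something

public class Program
{
    public static void Main()
    {
        // Scan the executing assembly for concrete classes whose name ends in
        // "Controller" -- a simplified stand-in for what MVC does.
        var controllerTypes = Assembly.GetExecutingAssembly()
            .GetLoadableTypes()
            .Where(t => t.IsClass && !t.IsAbstract
                     && t.Name.EndsWith("Controller", StringComparison.Ordinal));

        foreach (var type in controllerTypes)
        {
            Console.WriteLine(type.FullName);
        }
    }
}
```

The key point is that a partially loadable assembly no longer takes the whole scan down with it; you just work with the types that did load.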

code, personal, tech comments edit

As a kid, I was an impatient little brat. On any occasion that required waiting, I became Squirmy Wormy until I pushed my dad to make the demand parents so often make of fidgety kids, “Sit still!”

Recent evidence suggests a rejoinder to kids today in response to this command, “What!? Are you trying to kill me?!”

There is compelling evidence that modern workers’ propensity to sit for prolonged periods every day makes them fat and shortens their lives. Hmmm, you wouldn’t happen to know any professions where sitting limply at a desk for long periods of time is common, would you?

Yeah, me too.

This spurred me to learn more which led me to The Daily Infographic’s great summary of this research. Seriously, click on the image below. I could have stopped there and called it a post. But as always, I don’t know when to stop.

sitting-is-killing-you

Much has been written about the detrimental health effects of inactivity. According to Marc Hamilton, a biomedical researcher, sitting down shuts off your fat burning.

Physiologists analyzing obesity, heart disease, and diabetes found that the act of sitting shuts down the circulation of a fat-absorbing enzyme called lipase.

The same Hamilton goes into more details in this interesting NY Times article on the potential lethality of prolonged sitting,

This is your body on chairs: Electrical activity in the muscles drops — “the muscles go as silent as those of a dead horse,” Hamilton says — leading to a cascade of harmful metabolic effects. Your calorie-burning rate immediately plunges to about one per minute, a third of what it would be if you got up and walked.

In other words, sitting down is the off button.

This LifeHacker article points out that the long-term health effects of sitting multiple hours a day go way beyond weight gain.

After 10-20 Years of Sitting More Than Six Hours a Day

Sitting for over six hours a day for a decade or two can cut away about seven quality adjusted life years (the kind you want). It increases your risk of dying of heart disease by 64 percent and your overall risk of prostate or breast cancer increases 30 percent.

I want all kinds of life years, but the “quality adjusted” variety sounds extra special.

You might think that you’ll be just fine because you exercise the recommended 30 minutes a day, but a study from the British Journal of Sports Medicine notes that’s not the case.

Even if people meet the current recommendation of 30 minutes of physical activity on most days each week, there may be significant adverse metabolic and health effects from prolonged sitting, the activity that dominates most people’s remaining “non-exercise” waking hours.

That’s particularly disheartening. All that other exercise you do might not counteract all the prolonged sitting.

Get up, stand up! Stand up for your code!

With apologies to Bob Marley

So what’s a developer to do? Note that these studies put an emphasis on prolonged. The simple solution is to stop sitting for prolonged periods at a time. Get up at least once an hour and move!

But developers are interesting creatures. We easily get in the zone on a problem and focus so deeply that three hours pass in a blink. Ironically this wasn’t a problem I faced as a Program Manager since I was moving from meeting to meeting nearly every hour.

But in my new job, writing code at home, I knew I needed more than an egg timer to tell me to move every hour. I want to move constantly if I can. For example, the way you do when you stand. So I looked into adjustable desks.

According to Alan Hedge, director of Cornell’s Human Factors and Ergonomics laboratory, workers fare better when using adjustable tables (emphasis mine).

“We found that the computer workers who had access to the adjustable work surfaces also reported significantly less musculoskeletal upper-body discomfort, lower afternoon discomfort scores and significantly more productivity,” said Alan Hedge, professor of design and environmental analysis in the College of Human Ecology at Cornell and director of Cornell’s Human Factors and Ergonomics Laboratory.

So I went on a quest to find the perfect adjustable desk. How did I choose which desk to purchase?

dodecahedron

Critical hit! Photo by disoculated from Flickr

Ok, not quite. I might have put a little more research into it than that.

I asked around the interwebs and received a lot of feedback on various options. I found two desk companies that stood out: Ergotron and GeekDesk.

Initially, I really liked the Ergotron approach. Rather than a motorized system for moving the desk up and down, it has a clever quick release lever system that makes it easy to adjust the desk’s height quickly without requiring any tools or electricity.

For this reason, I initially settled on the Workfit-D Sit Stand Desk. Unfortunately, Ergotron is a victim of its own success in this particular case. They were backordered until our sun grows into a red giant and engulfs the planet and I couldn’t wait that long.

So I ended up ordering the GeekDesk Max. This desk uses a motor to adjust to specific heights, but has four presets. This is important because without the presets, you’re sitting there holding the button until it reaches the height you want. While the motor is slower than the Ergotron approach, with the presets, I can just hit the button and go get a coffee. To be fair, it’s not all that slow. Did I mention I’m impatient?

I’m very happy with this desk. Here’s a photo of my workspace that I sent to CoderWall with the desk in a standing configuration.

my-workspace

If you are looking for a more inexpensive option, I recently learned about this Adjustable Keyboard Podium that seems like a good option. Jarrod, a StackOverflow developer, uses it in his office.

As far as I can find, there’s only one study that points to a potential negative health impact from standing at work. http://www.ncbi.nlm.nih.gov/pubmed/10901115

Significant relationships were found between the amount of standing at work and atherosclerotic progression.

However, as you might expect, this one study is not conclusive and doesn’t focus solely on the work habits of office workers who stand. From what I can tell so far, the health benefits far outweigh the detriments assuming you don’t overdo it.

If you do stand at work, I highly recommend getting some sort of gel or foam padding to stand on. Especially if you’re not wearing shoes. The hard floor might seem fine at first because you’re a tough guy or girl, but over the course of a day, it’ll feel like someone’s taken a bat to your soles.

Also, vary it up throughout the day. Don’t stand all day. Take breaks where you work sitting down and alternate.

Fight malaise!

Not every developer is the same, clearly. Some are fit, but many, well, let’s just say that the health benefits mentioned in this post might not factor into their decision making.

James Levine, a researcher at the Mayo Clinic, had a more philosophical point to make about sitting all day that goes beyond just the physical health benefits. He also sees a mental health benefit.

For all of the hard science against sitting, he admits that his campaign against what he calls “the chair-based lifestyle” is not limited to simply a quest for better physical health. His is a war against inertia itself, which he believes sickens more than just our body. “Go into cubeland in a tightly controlled corporate environment and you immediately sense that there is a malaise about being tied behind a computer screen seated all day,” he said. “The soul of the nation is sapped, and now it’s time for the soul of the nation to rise.”

http://www.nytimes.com/2011/04/17/magazine/mag-17sitting-t.html

In other words, stop sitting and write better code! Go forth and conquer.

code comments edit

Take a look at the following code.

const string input = "interesting";
bool comparison = input.ToUpper() == "INTERESTING";
Console.WriteLine("These things are equal: " + comparison);
Console.ReadLine();

Let’s imagine that input is actually user input or some value we get from an API. That’s going to print out These things are equal: True right? Right?!

Well not if you live in Turkey. Or more accurately, not if the current culture of your operating system is tr-TR (which is likely if you live in Turkey).

To prove this to ourselves, let’s force this application to run using the Turkish locale. Here’s the full source code for a console application that does this.

using System;
using System.Globalization;
using System.Threading;
internal class Program
{
    private static void Main(string[] args)
    {      
        Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
        const string input = "interesting";
        
        bool comparison = input.ToUpper() == "INTERESTING";

        Console.WriteLine("These things are equal: " + comparison);
        Console.ReadLine();
    }
}

Now we’re seeing this print out These things are equal: False.

To understand why this is the case, I recommend reading much more detailed treatments of this topic:

The tl;dr summary is that the uppercase for i in English is I (note the lack of a dot), but in Turkish it’s dotted, İ. So while we have two i’s (upper and lower), they have four.

My app is English only. AMURRICA!

Even if you have no plans to translate your application into other languages, your application can be affected by this. After all, the sample I posted is English only.

Perhaps there aren’t going to be that many Turkish folks using your app, but why subject the ones that do to easily preventable bugs? If you don’t pay attention to this, it’s very easy to end up with a costly security bug as a result.

The solution is simple. In most cases, when you compare strings, you want to compare them using StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase. It just turns out there are so many ways to compare strings. It’s not just String.Equals.
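To see the fix side by side with the bug, here’s a minimal sketch using the same culture and strings as the sample above:

```csharp
using System;
using System.Globalization;
using System.Threading;

internal class Program
{
    private static void Main()
    {
        // Force the Turkish culture, as in the sample above.
        Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
        const string input = "interesting";

        // Culture-sensitive: ToUpper() maps "i" to the dotted "İ" under
        // tr-TR, so "İNTERESTİNG" != "INTERESTING" and this is false.
        bool cultureSensitive = input.ToUpper() == "INTERESTING";

        // Ordinal, case-insensitive: compares character values without
        // culture-specific casing rules, so this is true everywhere.
        bool ordinal = string.Equals(input, "INTERESTING",
            StringComparison.OrdinalIgnoreCase);

        Console.WriteLine("Culture-sensitive: " + cultureSensitive); // False
        Console.WriteLine("Ordinal: " + ordinal);                    // True
    }
}
```

The ordinal comparison gives the same answer regardless of the machine’s culture, which is exactly what you want for internal, non-linguistic string comparisons.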

Code Analysis to the rescue

I’ve always been a fan of FxCop. At times it can seem to be a nagging nanny constantly warning you about crap you don’t care about. But hidden among all those warnings are some important rules that can prevent some of these stupid bugs.

If you have the good fortune to start a project from scratch in Visual Studio 2010 or later, I highly recommend enabling Code Analysis (FxCop has been integrated into Visual Studio and is now called Code Analysis). My recommendation is to pick a set of rules you care about and make sure that the build breaks if any of the rules are broken. Don’t turn them on as warnings because warnings are pointless noise. If it’s not important enough to break the build, it’s not important enough to add it.

Of course, many of us are dealing with existing code bases that haven’t enforced these rules from the start. Adding in code analysis after the fact is a daunting task. Here’s an approach I took recently that helped me retain my sanity. At least what’s left of it.

First, I manually created a file with the following contents:

<?xml version="1.0" encoding="utf-8"?>
<RuleSet Name="PickAName" Description="Important Rules" ToolsVersion="10.0">
  <Rules AnalyzerId="Microsoft.Analyzers.ManagedCodeAnalysis"
      RuleNamespace="Microsoft.Rules.Managed">

    <Rule Id="CA1309" Action="Error" />    
  
  </Rules>
</RuleSet>

You could create one per project, but I decided to create one for my solution. It’s just a pain to maintain multiple rule sets. I named this file SolutionName.ruleset and put it in the root of my solution (the name doesn’t matter; just make the extension .ruleset).

I then configured each project that I cared about in my solution (I ignored the unit test project) to enable code analysis using this ruleset file. Just go to the project properties and select the Code Analysis tab.

CodeAnalysisRuleSet

I changed the selected Configuration to “All Configurations”. I also checked the “Enable Code Analysis…” checkbox. I then clicked “Open” and selected my ruleset file.

At this point, Code Analysis runs only the one rule, CA1309, every time I build. This way, adding more rules becomes manageable. As I fixed warnings, I’d add rules to this file one at a time. I went through the following lists looking for important rules.

I didn’t add every rule from each of these lists, only the ones I thought were important.

At some point, I was including such a large number of rules that it made sense to invert the list: rather than listing all the rules I wanted to include, I listed only the ones I wanted to exclude.

<?xml version="1.0" encoding="utf-8"?>
<RuleSet Name="PickAName" Description="Important Rules" ToolsVersion="10.0">
  <IncludeAll Action="Error" />
  <Rules AnalyzerId="Microsoft.Analyzers.ManagedCodeAnalysis"
      RuleNamespace="Microsoft.Rules.Managed">

    <Rule Id="CA1704" Action="None" />    
  
  </Rules>
</RuleSet>

Notice the IncludeAll element now makes every code analysis warning into an error, but then I turn CA1704 off in the list.

Note that you don’t have to edit this file by hand. If you open the ruleset in Visual Studio it’ll provide a GUI editor. I prefer to simply edit the file.

RuleSetEditor

One other thing I did: for really important rules with too many existing violations to fix in a timely manner, I would simply use Visual Studio to suppress all of them and commit that. At least that ensured no new violations of the rule would be committed, and it allowed me to fix the existing ones at my leisure.
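For anyone who hasn’t used that feature, Visual Studio records project-level suppressions as attributes in a GlobalSuppressions.cs file. The rule shown is the real CA1309, but the Scope/Target values here are a made-up method for illustration; Visual Studio generates the correct ones for you.

```csharp
using System.Diagnostics.CodeAnalysis;

// Each suppression silences one specific occurrence of a rule violation.
// New violations elsewhere in the code still fire the rule and break the build.
[assembly: SuppressMessage("Microsoft.Globalization",
    "CA1309:UseOrdinalStringComparison",
    Scope = "member",
    Target = "MyApp.LegacyHelpers.#CompareNames(System.String,System.String)")]
```

Because suppressions are scoped to specific members, they act as a ratchet: existing debt is quarantined while the rule stays enforced for everything new.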

I’ve found this approach makes using code analysis way more useful and less painful than simply turning on every rule and hoping for the best. Hope you find this helpful as well. May you never ship a bug with the Turkish I problem again!

git, github, nuget, code comments edit

A couple weeks ago I had the great pleasure to speak at the Norwegian Developer’s Conference (NDC). This is my second time speaking at NDC. The first time was back in 2009 and it was a blast!

I gave two talks this year. My slides and a video of each presentation are available as well.

Git and GitHub for Developers on Windows

GitHub.com is the place for open source developers to collaborate on their projects. But there’s a perception that GitHub and Git are the domain of Mac and *nix users. Not so! In this talk, Phil Haack, a GitHub employee, will show how GitHub makes open source collaboration fun and tools and techniques for using Git with GitHub on Windows.

NuGet: Zero to DONE in no time

Developers are known for “scratching their own itch,” producing thousands of libraries to handle every imaginable task. NuGet is an open source package manager from the Outercurve Foundation for .NET (and Windows) developers that brings these libraries together in one gallery and accelerates getting started with a project. It makes it easy to incorporate these libraries in a solution. In this talk, Phil Haack, project coordinator for NuGet, will describe the problem that NuGet solves, how to make effective use of it, and some tips and tricks around the newer features of NuGet.

My “Slides”

Speaking of slides, a lot of people asked me what PowerPoint themes I used for my slides. All of my “slide decks” are actually HTML pages with CSS and JavaScript. I’ve become a bit enamored of the Impress.js library.

If you go this route, it’s a lot more work to create a presentation, but if done well, I think it creates a nice effect of an infinite canvas for your thoughts. But it’s very easy to abuse it, which has caused a bit of a backlash against this approach from some people I know. I’m still experimenting with it because I think with a moderate approach, it provides a nice backdrop to a presentation.

Note that the slides might not work well on IE. Take that up with the impress.js author, not me.

If you want to see my decks for other presentations, they’re all posted on http://talks.haacked.com/.

Oslo, Norway

As always, Norway is beautiful in June. The speakers were treated to a boat ride on the Fjords.

ndc-boat-ride

At the conference, I was flattered by some folks who created some swag based on my blog!

ndc-swag

Collect them all!

And of course, the best part of any conference is the people. Here a large group of us spontaneously gathered at a Jazz bar. The mixture of accents from around the world was music to my ears.

ndc-jazz-bar

Once again, NDC does not disappoint.

The only low point of the conference was the incredibly awkward Windows Azure event punctuated with offensive lyrics. Though I don’t think the conference organizers had anything to do with that, so I don’t hold it against NDC as I do against the local Norwegian Microsoft offices.

github, git comments edit

In my last blog post, I mentioned that GitHub for Windows (GHfW) works with non-GitHub repositories, but I didn’t go into details on how to do that. GHfW is optimized for GitHub.com of course, but using it with non-GitHub repositories is quite easy.

All you need to do is drag and drop the HTTPS clone URL into the dashboard of the application.

For example, suppose you want to work on a project hosted on CodePlex.com. In my case, I’ll choose NuGet. The first thing you need to find is the Clone URL. In CodePlex, click on the Source Code tab and then click on the sidebar Git link to get the remote URL. If there is no Git link, then you are out of luck.

nuget-codeplex-git-url

Next, select the text of the clone URL, then click on it and drag it into the GitHub for Windows dashboard. Pretty easy!

ghfw-drag-drop-repo

You’ll see the repository listed in the list of local repositories. Double click the repository (or click on the blue arrow) to navigate to the repository.

ghfw-nuget-local-repo

The first time you navigate to the repository, GHfW prompts you for your credentials to the Git host, in this case, CodePlex.com. This probably goes without saying, but do not enter your GitHub.com credentials here.

ghfw-codeplex-credentials

GHfW will securely store the credentials for this repository so that you only need to enter them once. GHfW acts as a credential provider for Git, so the credentials you enter here will also work from the command line as long as you launch it from the Git Shell shortcut that GHfW installs. That means you won’t have to enter credentials every time you push or pull commits from the server.

With that, you’re all set. Work on your project, make local commits, and when you’re ready to push your changes to the server, click on the sync button.

ghfw-nuget-sync-codeplex

While we think you’ll have the best experience on GitHub.com, we also think GitHub for Windows is a great client for any Git host.

Tags: git, github, ghfw, gh4w

github, git, code comments edit

For the past several months I’ve been working on a project at GitHub with my amazing cohorts, Paul, Tim, Adam, and Cameron. I’ve had the joy of learning new technologies and digging deep into the inner workings of Git while lovingly crafting code.

But today is a good day. We’ve called the shipit squirrel into action once again! We all know that the stork delivers babies and the squirrel delivers software. In our case, we are shipping GitHub for Windows! Check out the official announcement on the GitHub Blog. GitHub for Windows is the easiest and best way to get Git on your Windows box.

gh4w-app

If you’re not familiar with Git, it’s a distributed version control system created by Linus Torvalds and his merry Linux hacking crew. If you are familiar with Git, you’ll know that Git has historically been a strange and uninviting land for developers on the Windows platform. I call this land, Torvaldsia, replete with strange incantations required to make things work.

Better Git on Windows

In recent history, this has started to change due to the heroic efforts of the MSysGit maintainers who’ve worked hard to provide a distribution of Git that works well on Windows.

GitHub for Windows (or GH4W for short) builds on those efforts to provide a client to Git and GitHub that’s friendly, approachable, and inviting. If you’re a Git noob, this is a good place to start. If you’re a Git expert on Windows, at the very least, GitHub for Windows can still be a useful part of your workflow. Just visit http://windows.github.com/ and click the big green download button.

In this post, I’ll give a brief rundown of what gets installed and how to customize the shell for you advanced users of Git.

As the GitHub blog post shows, you can easily access and clone repositories on GitHub either by clicking the Clone in Windows link from a repository on GitHub.com itself, or by cloning a repository associated with your account directly from the application.

The application allows you to browse, make, revert, and rollback commits. You can also find, create, publish, merge, and delete branches. I’ll go into more details about this sort of thing in future blog posts. In this post, I want to talk about what gets installed and then cover customizing the Git shell we include for you advanced Git users.

Installation

If you’ve ever read the old guide to installing msysgit for Windows on the GitHub help page, you’d know there are a lot of configuration steps involved. We use ClickOnce to install the application and to provide Google Chrome-style silent, automated updates that install in the background to keep it up-to-date.

GH4W is a sandboxed installation of Git and the GitHub application that takes care of all that configuration. Please note, it will not mess with your existing Git environment if you have one. There will be two shortcuts installed on your machine, one for the GH4W application and another labeled “Git Shell”.

The Git Shell shortcut launches the shell of your choice as configured within the GH4W application’s options menu. You can also launch the shell from within the application for any given repository.

gh4w-default-shells

By default, this is PowerShell but you can change it to Bash, Cmd, or even a custom option, which I’ll cover in a second.

Posh-Git and PowerShell

When you launch the shell, you’ll notice that the PowerShell option includes Posh-Git by Keith Dahlby. I’ve written about Posh-Git before and we love it so much we included it in the box. This is an even easier way to get Posh-Git on your machine and stay up to date with the latest version.

You might notice that our PowerShell icon doesn’t execute your existing PowerShell profile. We worried about conflicts with existing Posh-Git installs or whatever you might have. Instead, we execute a custom profile script if it exists, GitHub.PowerShell_profile.ps1.

Just create one in the same directory as your $profile script. In my case, it’s in the C:\Users\Haacked\Documents\WindowsPowerShell directory.

Custom Shell

I’m a huge fan of pimping out my command line shell with Console2. As the previous screenshot shows, you can specify a custom shell like Console2. However, when you launch a custom shell, it won’t load our profile script, and it also won’t load the version of Posh-Git that we include. Fortunately, we added an environment variable you can check within the Microsoft.PowerShell_profile.ps1 script.

# If Posh-Git environment is defined, load it.
if (test-path env:posh_git) {
    . $env:posh_git
}

The benefit here, as I mentioned earlier, is that you won’t have to worry about keeping Posh-Git up-to-date since we’ll do it for you as part of GH4W updates.

What’s Next?

I’ll try and cover a few other topics later. For example, GH4W works with local Git repositories as well as those from other hosts. I’ll also try and cover how I fit GitHub for Windows into my Git workflow developing with Visual Studio. If you have other ideas for topics you’d like me to cover, let me know.

In the meanwhile, try it out!

If you have feedback, mention @github on Twitter (hashtag #gh4w). We make sure to read every mention on Twitter. If you find a bug, submit it to support@github.com. Every email is read by a real person.

But of course, I expect many of you will comment right here and I’ll do my best to keep up with responses because I love you all.

personal, tech, code comments edit

Around eight years ago I wrote a blog post about Repetitive Strain Injury entitled The Real Pain of Software Development [part 1]. I soon learned the lesson that it’s a bad idea to have “Part 1” in any blog post unless you’ve already written part 2. But here I am, eight years later, finally getting around to part 2.

But better late than never!

The original reason that led me to write about this topic was a period of debilitating pain I went through when coding. Too many long hours at the keyboard took their toll on me so that even placing my fingers on the keyboard would cause me pain. I experienced numbness in my fingers, pain in my wrists, back and shoulders, and lots of headaches. In short, I was a mess.

Road to Recovery

Fortunately, my employer at the time was supportive of me filing a Worker’s Compensation claim. I know for some, that has a negative connotation, but keep in mind it’s insurance that you pay in to specifically for cases of injuries. So it makes sense to use it if you’re legitimately injured on the job. Per Wikipedia:

Workers’ compensation is a form of insurance providing wage replacement and medical benefits to employees injured in the course of employment in exchange for mandatory relinquishment of the employee’s right to sue his or her employer for the tort of negligence.

The insurance covered several things for me:

  • Doctor visits
  • Physical Therapy (PT)
  • Occupational Therapy (OT)
  • An ergonomic chair (Neutral Posture)

I am extremely grateful for these measures as they’ve taught me the means to care for myself and deal with ongoing pain in a productive manner. I love to code and the thought of switching careers at the time was depressing.

One thing that’s important to understand is that every person is different. Some folks can work 16 hours a day slouching the whole time and have no problems, while others can work 8 hours a day in perfect posture and have tons of pain. It’s important to listen to what your own body is telling you.

Therapy

Have a fully grown friend lay down on the floor and relax. No funny business here, I promise. Then lift their head up gently with your two hands. Notice how heavy that is? A human head (without hair) weighs around 8 to 12 pounds. And I’ve been told, some noggins are larger than others.

Ok, you can put it down now. Gently! A head is a pretty heavy thing, even when engaging your arms to lift it. Now consider the fact that you have only your neck muscles to hold it up all day.

So unless you’re built like this guy (photo from The NFL’s Widest Necks article on Slate)

big-neck

Holding your head up all day can be a literal pain in the neck. The trick, of course, is to balance the head well so your neck isn’t constantly engaged.

What I learned in PT was how all these systems are connected. Pain in the neck and shoulders can impinge on nerves that run through the arm, elbow, and into your hands.

So a lot of Physical Therapy involved strengthening these muscles to better handle the stresses of the day combined with various massages and stretches to release tension in these muscles.

A lot of Occupational Therapy was focused on habits and behaviors so that these muscles weren’t overused in the first place. No matter how good your posture is, you need to take regular breaks. The body doesn’t respond well to being overly static. Even sitting in place with perfect posture for hours on end takes its toll. The body needs movement.

During my therapy, I bought a foam roller and would bring it to the office. I didn’t care how silly I looked, regular stress breaks with the roller helped me out a lot.

Dvorak Keyboard Layout

Another change I made at the same time was to switch to a Dvorak Simplified Keyboard Layout:

Because the Dvorak layout concentrates the vast majority of key strokes to the home row, the Dvorak layout uses about 63% of the finger motion required by QWERTY, thus making the Dvorak layout more ergonomic. Because the Dvorak layout requires less finger motion from the typist compared to QWERTY, many users with repetitive strain injuries have reported that switching from QWERTY to Dvorak alleviated or even eliminated their repetitive strain injuries.

I hoped that reducing finger motion would result in less strain on my hands over all.

There’s some controversy around whether Dvorak is really better than QWERTY. A reason.com article on QWERTY vs Dvorak pointed out that the idea that QWERTY was designed to slow down typists is a myth. It goes on to provide evidence that there’s no reason to believe Dvorak is superior to QWERTY.

While the part about QWERTY is true, the evidence in the Reason article that QWERTY is superior to Dvorak is also suspect.

The fact is that there’s too little research to make any claims. And all these studies focused on typing speed and not on impact to repetitive stress injuries.

And I’m not sure my experience can lend credence either way because it was not a controlled experiment. I switched to Dvorak while also engaging in new habits meant to improve my condition. So it’s hard to say whether Dvorak helped, but I do subjectively feel that it’s more comfortable given how much my fingers stay on the home row.

The Right Chair

In his blog post, The Programmer’s Bill of Rights, Jeff Atwood calls out the need for a comfortable chair.

Let’s face it. We make our livings largely by sitting on our butts for 8 hours a day. Why not spend that 8 hours in a comfortable, well-designed chair? Give developers chairs that make sitting for 8 hours not just tolerable, but enjoyable. Sure, you hire developers primarily for their giant brains, but don’t forget your developers’ other assets.

He also has a great follow-up blog post, Investing in a Quality Programming Chair.

I mentioned earlier that Workman’s Comp paid for a chair. I also bought another one with my own money so I’d have a good one both at home and at work. It’s that important!

For many, the Herman Miller Aeron chair is synonymous with “ergonomic chair.” But it’s very important to note that, as good as it is, it’s not necessarily the right chair for everybody. I found that for whatever reason, it just wasn’t very comfortable with my body type. I felt the seat pan was too long and pushed against the back of my knees more than I liked.

I tried a bunch of chairs and settled on the Neutral Posture series with a Tempurpedic seat cushion so my ass is cradled like a newborn. Be sure to get a chair that works for you rather than simply picking one because you heard about it.

Today

One thing a doctor told me when I was dealing with this was that it’s very likely I’ll always have pain. The question is how well I’ll deal with it when it happens.

And it’s true. The pain has subsided for the most part, but it’s never totally gone away. The good news is that I’ve been able to have a productive career in software because I took the pain seriously and worked to address it immediately. On days when I do have pain, I deal with it with stretches, exercise, and taking breaks. I also work to reduce my stress level as I’ve found that my pain level seems to be correlated to the amount of stress I feel. I think I tend to carry my stress in my shoulders.

If you’re dealing with pain due to coding, please know that it’s not because you are deficient in some manner. Or because you’re a wimp. There’s really no value judgment to be made. You’re not alone. It’s pretty common. Don’t ignore it! You wouldn’t (or shouldn’t) ignore a searing pain in your abdomen, so why ignore this?

With the right treatment and regimen, it can get better. Good luck!

code, rx comments edit

For a long time, good folks like Matt Podwysocki have extolled the virtues of Reactive Extensions (aka Rx) to me. It piqued my interest enough for me to write a post about it, but that was the extent of it. It sounded interesting, but it didn’t have any relevance to any projects I had at the time.

Fortunately, now that I work at GitHub I have the pleasure to work with an Rx Guru, Paul Betts, on a project that actively uses Rx. And man, is my mind blown by Rx.

Hits Me Like A Hurricane

What really blew me away about Rx is how it allows you to handle complex async interactions declaratively. No need to chain callbacks together or worry about race conditions. With Rx, you can easily compose multiple async operations together. It’s powerful.

The way I describe it to folks is to think of how the IEnumerable and IEnumerator are involved when iterating over an enumerable. Now take those and reverse the polarity. That’s Rx. But with Rx, the IObservable and IObserver interfaces are involved and rather than enumerate over existing sequences, you write queries against sequences of future events.

Hear that? That’s the sound of my head asploding again.
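For reference, here are the two interfaces behind that duality, mirroring how System.IObservable&lt;T&gt; and System.IObserver&lt;T&gt; are defined in the .NET base class library:

```csharp
using System;

// The push-based duals of IEnumerable<T>/IEnumerator<T>.
public interface IObservable<out T>
{
    // Analogous to GetEnumerator(), except the observer has values pushed
    // at it rather than pulling them. Disposing the result unsubscribes.
    IDisposable Subscribe(IObserver<T> observer);
}

public interface IObserver<in T>
{
    void OnNext(T value);          // a value arrived (the dual of Current)
    void OnError(Exception error); // the sequence ended with an exception
    void OnCompleted();            // the sequence ended normally
}
```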

the-future

Rx has a tendency to twist and contort the mind in strange ways. But it’s really not all that complicated. It only hurts the head at first because it’s a new way for many folks to think about async, sequences, and queries.

Here’s a simple example that helps demonstrate the power of Rx. Say you’re writing a client app (such as a WPF application) and want the application to persist its window’s position and size. That way, the next time the app starts, the position is restored.

How you save the position isn’t so important, but if you’re curious, I found this post, Saving window size and location in WPF and WinForms, helpful.

I modified it in two ways for my needs. First, I replaced the Settings object with an asynchronous cache as the storage for the placement info.

I then changed it to save the placement info when the window is resized, rather than when the application exits. That way, if the app crashes, it won’t forget its last position.

Handling Resize Events

So let’s think about this a bit. When you resize a window, the resize event might be fired a large number of times. We probably don’t want to save the position on every one of those calls. It’s not just a performance problem; it could be a data corruption problem if we’re using an async method to save the placement. A later call might complete before an earlier one when so many happen so close together.

What we really want to do is save the setting when there’s a pause during a resize operation. For example, a user starts to resize the window, then stops. Five seconds later, if there’s been no other resize operation, only then do we save the setting.

How would you do this with traditional code? You could probably figure it out, but it’d be ugly. Perhaps have the resize event start a timer for five seconds, if it isn’t started already. Each subsequent event would reset the timer. When the timer finishes, it saves the setting and turns itself off. The code is going to be a bit gnarly and spread all over the place.
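To make the comparison concrete, here’s a rough sketch of that imperative approach (the class and member names are mine, purely for illustration): a one-shot timer that every event restarts.

```csharp
using System;
using System.Timers;

// Sketch of the imperative debounce: each event restarts a one-shot timer,
// and only when five seconds pass with no new event does the save run.
class DebouncedSaver
{
    readonly Timer _timer;

    public DebouncedSaver(double quietPeriodMs, Action save)
    {
        _timer = new Timer(quietPeriodMs) { AutoReset = false };
        _timer.Elapsed += (sender, args) => save();
    }

    // Call this from the resize event handler.
    public void OnEvent()
    {
        _timer.Stop();  // throw away the pending save...
        _timer.Start(); // ...and restart the five second countdown
    }
}
```

Notice the state (the timer) now lives outside the event handler, and Elapsed fires on a thread-pool thread, so marshaling back to the UI thread is your problem too.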

Here’s what it looks like with Rx.

Observable.FromEventPattern<SizeChangedEventHandler, SizeChangedEventArgs>
    (h => SizeChanged += h, h => SizeChanged -= h)
    .Throttle(TimeSpan.FromSeconds(5), RxApp.DeferredScheduler)
    .Subscribe(_ => this.SavePlacement());

That’s it! Nice and self contained in a single expression.

Let’s break it down a bit.

Observable.FromEventPattern<SizeChangedEventHandler, SizeChangedEventArgs>
    (h => SizeChanged += h, h => SizeChanged -= h)

This first part of the expression converts the SizeChangedEvent into an observable. The specific type of this observable is IObservable<EventPattern<SizeChangedEventArgs>>. This is analogous to an IEnumerable<EventPattern<SizeChangedEventArgs>>, but with its polarity reversed. Having an observable will allow us to subscribe to a stream of size changed events. But first:

.Throttle(TimeSpan.FromSeconds(5), RxApp.DeferredScheduler)

This next part of the expression uses the Throttle method to throttle the sequence of events coming from the observable. It will ignore events in the sequence if a newer one arrives within the specified time span. In other words, this observable won’t return any item until there’s a five second lull in events.

The RxApp.DeferredScheduler comes from the ReactiveUI framework and is equivalent to new DispatcherScheduler(Application.Current.Dispatcher). It indicates which scheduler to run the throttle timers on. In this case, we indicate the dispatcher scheduler which runs the throttle timer on the UI thread.

.Subscribe(_ => this.SavePlacement());

And we end with the Subscribe call. This method takes in an Action to run for each item in the observable sequence when it arrives. This is where we do the work to actually save the window placement.

Putting it all together, every time a resize event is succeeded by a five second lull, we save the placement of the window.

But wait, compose more

Ok, that’s pretty cool. But to write imperative code to do this would be slightly ugly and not all that hard. Ok, let’s up the stakes a bit, shall we?

We forgot something. You don’t just want to save the placement of the window when it’s resized. You also want to save it when it’s moved.

So we really need to observe two sequences of events, but still throttle both of them as if they were one sequence. In other words, when either a resize or move event occurs, the timer is restarted. And only when five seconds have passed since either event has occurred, do we save the window placement.

The traditional way to code this is going to be very ugly.

This is where Rx shines. Rx provides ways to compose observables in very interesting ways. In this case we’ll deal with two observables, the one we already created that handles SizeChanged events, and a new one that handles LocationChanged events.

Here’s the code for the LocationChanged observable. I’ll save the observable into an intermediate variable for clarity. It’s exactly what you’d expect.

var locationChanges = Observable.FromEventPattern<EventHandler, EventArgs>
  (h => LocationChanged += h, h => LocationChanged -= h);

I’ll do the same for the SizeChanged event.

var sizeChanges = Observable.FromEventPattern
    <SizeChangedEventHandler, SizeChangedEventArgs>
    (h => SizeChanged += h, h => SizeChanged -= h);

We can use the Observable.Merge method to merge these sequences into a single sequence. But going back to the IEnumerable analogy, these are both sequences of different types. If you had two enumerables of different types and wanted to combine them into a single enumerable, what would you do? You’d apply a transformation with the Select method! And that’s what we do here too.

Since I don’t care what the event arguments are, just when they arrive, I’ll transform each sequence into an IObservable&lt;Unit&gt; by calling Select(_ => Unit.Default) on each observable. Unit is an Rx type that indicates there’s no information. It’s like returning void.

var merged = Observable.Merge(
    sizeChanges.Select(_ => Unit.Default), 
    locationChanges.Select(_ => Unit.Default)
);

Observable.Merge combines the two sequences into a single sequence that produces a value whenever either source event occurs.

Now, with this combined sequence, I can simply apply the same throttle and subscription I did before.

merged
    .Throttle(TimeSpan.FromSeconds(5), RxApp.DeferredScheduler)
    .Subscribe(_ => this.SavePlacement());

Think about that for a second. I was able to compose multiple sequences of events into a single observable, and I didn’t have to change the code that throttles the events or subscribes to them.

As you get more familiar with Rx, the code gets easier to read and you tend to use fewer intermediate variables. Here’s the full, more idiomatic expression.

Observable.Merge(
    Observable.FromEventPattern<SizeChangedEventHandler, SizeChangedEventArgs>
        (h => SizeChanged += h, h => SizeChanged -= h)
        .Select(e => Unit.Default),
    Observable.FromEventPattern<EventHandler, EventArgs>
        (h => LocationChanged += h, h => LocationChanged -= h)
        .Select(e => Unit.Default)
).Throttle(TimeSpan.FromSeconds(5), RxApp.DeferredScheduler)
.Subscribe(_ => this.SavePlacement());

That single declarative expression handles so much crazy logic. Very powerful stuff.

Even if you don’t write WPF apps, there’s still probably something useful here for you. This same powerful approach is also available for JavaScript.

See it in action

I put together a really rough sample app that demonstrates this concept. It’s not using the async cache, but it is using Rx to throttle resize and move events and then save the placement of the window after five seconds.

Just grab the WindowPlacementRxDemo project from my CodeHaacks GitHub repository.

More Info

For more info on Reactive Extensions, I recommend the following:

Tags: Rx, Reactive-Extensions, RxUI, Reactive-UI, WPF

code, open source, asp.net, asp.net mvc comments edit

Changing a big organization is a slow endeavor. But when people are passionate and persistent, change does happen.

Three years ago, the ASP.NET MVC source code was released under an open source license. But at the time, the team could not accept any code contributions. In my blog post talking about that release, I said the following (emphasis added):

Personally (and this is totally my own opinion), I’d like to reach the point where we could accept patches. There are many hurdles in the way, but if you went back in time several years and told people that Microsoft would release several open source projects (Ajax Control Toolkit, MEF, DLR, IronPython and IronRuby, etc….) you’d have been laughed back to the present. Perhaps if we could travel to the future a few years, we’ll see a completely different landscape from today.

Well my friends, we have travelled to the future! Albeit slowly, one day at a time.

As everyone and their mother knows by now, yesterday Scott Guthrie announced that the entire ASP.NET MVC stack is being released under an open source license (Apache v2) and will be developed under an open and collaborative model:

  • ASP.NET MVC 4
  • ASP.NET Web API
  • ASP.NET Web Pages with Razor Syntax

Note that ASP.NET MVC and Web API have been open source for a long time now. The change that Scott announced is that ASP.NET Web Pages and Razor, which until now was not open source, will also be released under an open source license.

Additionally, the entire stack of products will be developed in the open in a Git repository in CodePlex and the team will accept external contributions. This is indeed exciting news!

Hard Work

It’s easy to underestimate the hard work that the ASP.NET MVC team and Web API team did to pull this off. In the middle of an aggressive schedule, they had to completely re-work their build systems, workflow, etc… to move to a new source control system and host. Not to mention integrate two different teams and products together into a single team and product. It’s a real testament to the quality people that work on this stack that this happened so quickly!

I also want to take a moment and credit the lawyers, who are often vilified, for their work in making this happen.

One of my favorite bits of wisdom Scott Guthrie taught me is that the lawyers’ job is to protect the company and reduce risk. If lawyers had their way, we wouldn’t do anything because that’s the safest choice.

But it turns out that the biggest threat to a company’s long term well-being is doing nothing. Or being paralyzed by fear. And fortunately, there are some lawyers at Microsoft who get that. And rather than looking for reasons to say NO, they looked for reasons to say YES! And looked for ways to convince their colleagues.

I spent a lot of time with these lawyers poring over tons of legal documents and such. Learning more about copyright and patent law than I ever wanted to. But united with a goal of making this happen.

These are the type of lawyers you want to work with.

Submitting Contributions

For those of you new to open source, keep in mind that this doesn’t mean open season on contributing to the project. Your chances of having a contribution accepted are only slightly better than before.

Like any good open source project, I expect submissions to be reviewed carefully. To increase the odds of your pull request being accepted, don’t submit unsolicited requests. Read the contributor guidelines (I was happy to see their similarity to the NuGet guidelines) first and start a discussion about the feature. It’s not that an unsolicited pull request won’t ever be accepted, but the more you’re communicating with the team, the more likely it will be.

Although their guidelines don’t state this, I highly recommend you do your work in a feature branch. That way it’s very easy to pull upstream changes into your local master branch without disturbing your feature work.
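A minimal sketch of that workflow (the repo and branch names here are made up): feature work goes on its own branch, so the base branch stays clean for pulling upstream changes.

```shell
set -e
git init demo
git -C demo -c user.email=you@example.com -c user.name=you \
    commit --allow-empty -m "initial commit"
git -C demo checkout -b my-feature                  # feature work lives here
git -C demo -c user.email=you@example.com -c user.name=you \
    commit --allow-empty -m "feature work"
git -C demo checkout -                              # back to the base branch,
git -C demo log --oneline                           # which is undisturbed
```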

Many kudos to the ASP.NET team for this great step forward, as well as to the CodePlex team for adding Git support. I think Git has a bright future for .NET and Windows developers.

code, personal, open source comments edit

Disclaimer: these opinions are my own and don’t necessarily represent the opinion of any person or institution who are not me.

The topic of sexism in the software industry has flared up recently. This post by Katie Cunningham (aka The Real Katie), entitled Lighten Up, caught my attention. As a father of a delightful little girl, I hope someday my daughter feels welcomed as a developer should she choose that profession.

In general, I try to avoid discussions of politics, religion, and racism/sexism on my blog, not because I don’t have strong feelings about these things, but because I doubt I will change anyone’s mind.

If you don’t think there’s an institutionalized subtle sexism problem in our industry, I probably won’t change your mind.

So I won’t try.

Instead, I want to attempt an empirical look at some problems that probably do affect you today that just happen to be related to sexism. Maybe you’ll want to do something about it.

But first, some facts.

The Facts

Whether we agree on the existence of institutional sexism in our industry, I think we can all agree that our industry is overwhelmingly male.

It wasn’t always like this. Ada Lovelace is widely credited as the world’s first programmer. So there was at least a brief time in the 1840s when 100% of developers were women. As late as the 1960s, computing was seen as women’s work, emphasis mine:

“You have to plan ahead and schedule everything so it’s ready when you need it. Programming requires patience and the ability to handle detail. Women are ‘naturals’ at computer programming.

The same site where I found that quote has a link to this great Life Magazine archive photo of IBM computer operators.

ibm-60s

But the percentage of women declined steadily from that point. According to this Girls Go Geek post, in 1987, 42% of software developers were women. But then:

From 1984 to 2006, the number of women majoring in computer science dropped from 37% to 20% — just as the percentages of women were increasing steadily in all other fields of science, technology, engineering, and math, with the possible exception of physics.

The post goes on to state that the number of CS grads at Harvard is on the increase, but overall numbers are still low.

So why is there this decline? That’s not an easy question to answer, but I think we can rule out the idea that women are somehow inherently not suited for software development. History proves that idea wrong.

Ok fine, there are fewer women in software for whatever reasons. Maybe they don’t want to be developers. Hard for me to believe, as I think it’s the best goddamn profession ever. But let’s humor that argument just for a moment. Suppose it were true. Why is that a problem for our industry? I’ll name two reasons.

The OSS Contributor Problem

If you’re involved in an open source project, you’ve probably noticed that it’s really hard to find good contributors. So many projects are solitary labors of love. Well it turns out, according to this post, Sexism: Open Source Software’s Dirty Little Secret:

Asked to guess what percentage of FOSS developers are women, mostly people guess a number between 30-45%. A few, either more observant or anticipating a trick question after hearing the proprietary figure, guess 12-16%. The exact figure, though, is even lower: 1.5%

In other words, women’s participation in FOSS development is over seventeen times lower than it is in proprietary software development.

HWHAT!? That is insane!

From a purely selfish standpoint, that’s a lot of potential developers who could be contributing to your project. Even if you don’t believe there’s rampant institutionalized sexism, why wouldn’t you want to remove barriers and create an environment that makes more contributors feel welcome to your project?

Oh, and just making your logo pink isn’t the way to go about it. Not that I have anything against pink, but simple stereotypical approaches won’t cut it. Really listen to the concerns of folks like Katie and try and address them.

I don’t mean to suggest you will get legions of female contributors overnight. This is a very complex problem and I have no clue how to fix it. I’m probably just as guilty, as I can’t name a single female contributor to any of my projects, though I’ve tried my best to cajole some to contribute (you know who you are!). But a good first step is to remove ignorance and indifference to the topic.

The Employment Problem

We all know how hard it is to find good developers. In fact, while the recession saw high overall unemployment, that time was marked by a labor shortage of developers. So it comes as a surprise to me that employers tolerate a work environment that makes a large percentage of the potential workforce feel unwelcome.

According to this New York Times article written in 2010,

The share of women in the Silicon Valley-based work force was 33 percent, down from 37 percent in 1999.

Note that it’s not just a gender issue.

It’s an issue I’ve covered over the years, so I was interested to see that while the collective work force of those 10 companies grew by 16 percent between 1999 and 2005, the proportion of Hispanic workers declined by 11 percent, to about 2,200; they now make up about 7 percent of the total work force. Black workers declined to 2 percent of the work force, down from 3 percent.

Again, my point here isn’t to say “You should be ashamed of yourself for being sexist and racist!” Though if you are, you should be.

No, the point here is to shift your perspective and look at the reality of the current situation we’re in, regardless of the reasons it is the way it is. For whatever reasons, there’s a lot of people who might be great developers but feel that our industry doesn’t welcome them. That’s a problem! And an opportunity!

It’s an opportunity to improve our industry! If we make the software industry a place where women and minorities want to work, we’ve increased the available pool of software developers. That not only means more quality developers to hire, it also means more diverse perspectives, which is important to creative thought and benefits the bottom line:

So a sociologist called Cedric Herring has just completed a very interesting study that obtained data from 250 representative companies in the United States that looked at both their diversity levels as well as various measures of business performance there. And he finds that with every successive level of increased diversity, companies actually appear to do better on all those measures of business performance.

That’s a pretty compelling argument.

So, what are brogrammers afraid of?

For the uninitiated, “brogrammer” is a recent term for a new breed of frat-boy software developers, representative of those who don’t see the need to attract more women and minorities to our industry.

Given the benefits we enjoy when we attract a more diverse workforce into software development, why is the attitude that we shouldn’t do anything to increase the numbers of women and minorities in our industry still prevalent?

It’s not an easy question to answer, but I did have one idea that came to mind I wanted to bounce off of you. Suppose we were successful at attracting women and minorities in numbers proportional to the make-up of the country. That would increase the pool of available developers. Would that also lower overall salaries? Supply and demand, after all.

I can see how that belief might lead to fear and to the attitude that we’re fine as is; we don’t need more of you.

But when you consider the talent shortage, I don’t believe this for one second. I don’t have any studies to point to (and I’d welcome links to evidence either way), but my intuition tells me an influx would simply shrink the talent shortage; a shortage would still remain.

What would happen is we’d see the shakeout of bad programmers from the ranks.

Let’s face it, because of the talent shortage, there’s a lot of folks who are programmers who probably shouldn’t be. But the majority of developers have nothing to fear. We should welcome the influx of new ideas and the overall improvement of our industry that more developers (and thus more great developers) bring. A rising tide lifts all boats, as they say.

Now, I’m not sure this is the real reason these attitudes prevail. It sure seems awfully calculating. I’m inclined to think it’s simple cluelessness. But it’s possible this is a subconscious factor.

Or perhaps it’s the fear that the influx of people from diverse backgrounds will require that they grow up, leave the trappings of their college behind, and become adults who know how to relate to people different than them.

Conclusion

I know this is a touchy subject. I want to make one thing very clear. My focus in this post was on arguments that don’t require one to believe there’s rampant sexism in the software industry. The arguments were mostly self-interest arguments in favor of changing the status quo.

I don’t claim there isn’t sexism. I believe there is. You can find lots of arguments that make a compelling case that institutionalized sexism exists and that it’s wrong. The point of this post is to provide food for thought for those who don’t believe there’s sexism. If we change the status quo, I believe attitudes will follow. They tend to follow one another with each leading the other at times.

In the end, it’s a complex problem and I certainly don’t claim to have the answers. But I think a good start is leaving behind the fear, acknowledging the issue, recognizing the opportunity to improve, and embracing the concrete benefits that diversification brings.

What do you think?

git, github, code comments edit

I recently gave my first talk on Git and GitHub to the Dot Net Startup Group. I was a little nervous about how I would present Git. At its core, Git is based on a simple structure, but that simplicity is easily lost when you start digging into the myriad of confusing command switches.

I wanted a visual aid that showed off the structure of a git repository in real time while I issued commands against the repository. So I hacked one together in a couple afternoons. SeeGit is an open source instructive visual aid for teaching people about git. Point it to a directory and start issuing git commands, and it automatically updates itself with a nice graph of the git repository.

seegit

During my talk, I docked SeeGit to the right and my Console2 prompt to the left so they were side by side. As I issued git commands, the graph came alive and illustrated changes to my repository. It updates itself when new commits occur, when you switch branches, and when you merge commits.

It doesn’t handle rebases well yet due to a bug, but I’m hoping to add that as well as a lot of other useful features that make it clear what’s going on.

Part of the reason I was able to write a useful, albeit buggy, tool so quickly was due to the fantastic packages available on NuGet such as LibGit2Sharp, GraphSharp, and QuickGraph among others. Installing those got me up and running in no time.

I hope to add a nice visual illustration of a rebase soon as well as the ability to toggle the display of unreachable commits. I hope to use this in many future talks as a nice way of teaching git. Who knows, it might become useful in its own right as a tool for developers using Git on real repositories.

But it’s not quite there yet. If you would like to contribute, I would love to have some help. And let me know if you make use of this!

If you want to try it out and don’t want to deal with downloading the source and compiling it, I put together a zip package with the application. I’ve only tested it on Windows 7 so it might break if you run on XP.

asp.net mvc, asp.net, code comments edit

Conway’s Law states,

…organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.

Up until recently, there was probably no better demonstration of this law than the fact that Microsoft had two ways of shipping angle brackets (and curly braces for that matter) over HTTP – ASP.NET MVC and WCF Web API.

The reorganization of these two teams under Scott Guthrie (aka “The GU”, which I’m contractually bound to tack on) led to an intense effort to consolidate these technologies in a coherent manner. It’s an effort that’s led to what we see in the recently released ASP.NET MVC 4 Beta, which includes ASP.NET Web API.

For this reason, ASP.NET MVC 4 is an exciting release. I can tell you it was not a small effort to get these two teams, with different philosophies and ideas, to come together and start to share a single vision. That vision may take more than one version to realize fully, but ASP.NET MVC 4 Beta is a great start!

For me personally, this is also exciting as this is the last release I had any part in and it’s great to see the effort everyone put in come to light. So many congrats to the team for this release!

Some Small Things

small-things

If you take a look at Jon Galloway’s post on ASP.NET MVC 4, he points to a lot of resources and descriptions of the BIG features in this release. I highly recommend reading that post.

I wanted to take a different approach and highlight some of the small touches that might get missed in the glare of the big features.

Custom IActionInvoker Injection

I’ve written several posts that add interesting cross cutting behavior when calling actions via the IActionInvoker interface.

Ironically, the first two posts are made mostly irrelevant now that ASP.NET MVC 4 includes ASP.NET Web API.

However, the concept is still interesting. Prior to ASP.NET MVC 4, the only way to switch out the action invoker was to write a custom controller factory. In ASP.NET MVC 4, you can now simply inject an IActionInvoker using the dependency resolver.

The same thing applies to the ITempDataProvider interface. There’s almost no need to write a custom IControllerFactory any longer. It’s a minor thing, but it was a friction that’s now been buffed out for those who like to get their hands dirty and extend ASP.NET MVC in deep ways.

Two DependencyResolvers

I’ve been a big fan of using the Ninject.Mvc3 package to inject dependencies into my ASP.NET MVC controllers.

ninject.mvc3

However, your Ninject bindings do not apply to ApiController instances. For example, suppose you have the following binding in the NinjectMVC3.cs file that the Ninject.MVC3 package adds to your project’s App_Start folder.

private static void RegisterServices(IKernel kernel)
{
    kernel.Bind<ISomeService>().To<SomeService>();
}

Now create an ApiController that accepts an ISomeService in its constructor.

public class MyApiController : ApiController
{
  public MyApiController(ISomeService service)
  {
  }
  // Other code...
}

That’s not going to work out of the box. You need to configure a dependency resolver for Web API via a call to GlobalConfiguration.Configuration.ServiceResolver.SetResolver.

However, you can’t pass in the instance of the ASP.NET MVC dependency resolver, because their interfaces are different types, even though the methods on the interfaces look exactly the same.

This is why I wrote a small adapter class and convenient extension method. Took me all of five minutes.
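The adapter itself isn’t shown in the post, but its shape is easy to guess since both resolver interfaces expose the same two methods. Something along these lines, a sketch rather than the author’s actual code (the class name and the Web API namespace are assumed from the MVC 4 Beta):

```csharp
using System;
using System.Collections.Generic;

// Sketch: wrap ASP.NET MVC's resolver so it satisfies the Web API beta's
// resolver interface. The two interfaces have identical method shapes,
// so the adapter simply delegates.
public class ServiceResolverAdapter : System.Web.Http.Services.IDependencyResolver
{
    private readonly System.Web.Mvc.IDependencyResolver _resolver;

    public ServiceResolverAdapter(System.Web.Mvc.IDependencyResolver resolver)
    {
        if (resolver == null) throw new ArgumentNullException("resolver");
        _resolver = resolver;
    }

    public object GetService(Type serviceType)
    {
        return _resolver.GetService(serviceType);
    }

    public IEnumerable<object> GetServices(Type serviceType)
    {
        return _resolver.GetServices(serviceType);
    }
}

public static class DependencyResolverExtensions
{
    // The convenience extension method: resolver.ToServiceResolver()
    public static System.Web.Http.Services.IDependencyResolver ToServiceResolver(
        this System.Web.Mvc.IDependencyResolver resolver)
    {
        return new ServiceResolverAdapter(resolver);
    }
}
```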

In the case of the Ninject.MVC3 package, I added the following line to the Start method.

public static void Start()
{
  // ...Pre-existing lines of code...

  GlobalConfiguration.Configuration.ServiceResolver
  .SetResolver(DependencyResolver.Current.ToServiceResolver());
}

With that in place, the registrations work for both regular controllers and API controllers.

I’ve been pretty busy with my new job, so I haven’t had much time to dig into ASP.NET MVC 4, but at some point I plan to spend more time with it. I figure we may eventually upgrade NuGet.org to run on MVC 4, which will allow me to get my hands really dirty with it.

Have you tried it out yet? What hidden gems have you found?

personal comments edit

Recently I’ve been tweeting photos of my kids playing with a new toy my wife bought them that they are totally enthralled with. It’s called the Bildopolis Big Bilder Kit.

This is a creation of a family friend of ours who used to be an industrial designer at IDEO. He left a while ago to start his own thing and came up with this. We bought a set immediately, in part to support his efforts, but also because it looked cool. We were not disappointed. This thing is fun.

mia-cody-and-bildopolis

The concept is really simple. It’s a set of cardboard squares, triangles, and rectangles that have “female” velcro semicircles attached. Those are the white parts in the photos. The kit also comes with a bag of colorful “male” velcro “dots”.

Apply the dots to attach the cardboard pieces together to make all sorts of interesting structures for your kids to play in. If your kids are older, they’ll probably want to build their own structures.

rocket-ship

I want to make it clear that I don’t get any kickbacks or anything and we paid full price for our set. I’m just pimping this out because I had so much fun building stuff with it and my kids really love it. Maybe yours will too.

tunnels

In one evening, we built about four different structures. They’re quick and flexible to build.

long-house

There are a few downsides though. The Haackerdome below is probably going to be a permanent fixture in our living room, which takes up a bit of space. Also, these structures are meant for playing inside of, not on top of. They wouldn’t be sturdy enough to support the weight of kids climbing on top of them.

haackerdome

In any case, if you want to learn more, check out the Bildopolis website. They have a video showing off the kit. There’s also a gallery where folks have sent in photos of what they’ve built. That’s where I got the idea for the dome. The kit sells for $80 not including shipping.

personal, community, github comments edit

Next week Microsoft hosts its annual MVP Summit. So what better time for me to host my first GitHub Drinkup – MVP Edition at the Tap House Grill!

Not an MVP? Nonsense! You are in my book, so show up! If you are an MVP, you’re still welcome to slum it with the rest of us schlubs.

taphousegrill

All the details are posted over on the GitHub blog.

What is a “Drinkup” you ask?

It’s pretty simple. It’s a meetup where we drink and share stories of valor in the face of code complexity. Or jabber on about whatever else software developers want to geek out about.

Getting together and sharing a brew or two is deeply ingrained in the GitHub culture. After all, GitHub was conceived over a beer at a bar.

If you’re not in the Seattle/Bellevue area, GitHub hosts a monthly (more or less) drinkup in San Francisco where the GitHub HQ is located. We also host the occasional drinkup all over the world when we attend conferences in various locations.

So come on out, unwind, and let me buy you a beer. Perhaps this drinkup will be the one where you’ll meet your next cofounder. At the very least, you’ll enjoy some good drink and good conversation.

.NET Startup Group March 8

And if you can’t come out on Tuesday, I’ll also be speaking at the .NET Startup group on March 8 on GitHub, Git, and making it all work on Windows. I might throw in a dash of NuGet since I can’t help myself. There are still plenty of seats available.

code, open source comments edit

In my previous post, I attempted to make a distinction between Open Source and Open Source Software. Some folks took issue with the post and that’s great! I love a healthy debate. It’s an opportunity to learn. One minor request though. If you disagree with me, I do humbly ask that you read the whole post first before you go and rip me a new one.

It was interesting to me that critics fell into two opposing camps. There were those who felt it was disingenuous for me to use the term “open source software” to describe a Microsoft project that doesn’t accept contributions and is developed under a closed model, even if it is licensed under an open source license. Many of them accepted that, yes, ASP.NET MVC is OSS, but felt I still shouldn’t use the term.

Others felt that the license is the sole determining factor for open source and that I wasn’t helping anybody by trying to expand the definition of “open source.” In my defense, I wasn’t trying to expand it so much as describe how I think a lot of people use the term today, but they have a good point.

Going back to the first camp, a common refrain I heard was that software that meets the Open Source Definition might “meet the letter of the law, but not the spirit of the law” when it comes to open source.

Interesting. But what is the “spirit of open source” that they speak of? What is the essential ingredient?

Looking For The Spirit

I assume they mean that developing in the open and accepting contributions are necessary ingredients for a project to qualify as being in the spirit of open source. So I started to dig into it. I expected we would see references to these things all over the place when looking up the term “open source”.

Oddly enough, Wikipedia doesn’t really talk about those things in its article on Open Source, but hey, it’s Wikipedia.

Nor is there any mention of accepting contributions in the Open Source Definition, or pretty much anywhere on http://opensource.org/ that I could find. It doesn’t really address it.

Neither does any open source license have anything to say about the way the software is developed or whether the project accepts contributions.

Let’s take a look at the Free Software Foundation which is opposed to the term “Open Source” and might have a different definition of free software. I’m going to quote a small portion of this.

Thus, “free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer”.

A program is free software if the program’s users have the four essential freedoms:

  • The freedom to run the program, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

I looked for interviews from the pioneers of open source for more information. Richard Stallman reiterates the same points in this interview. What about Eric Raymond? Well he just links to http://opensource.org/. As you can see, he’s the President Emeritus of the Open Source Initiative (OSI) which created the OSD that I’ve been using as my definition.

I then asked Miguel De Icaza for his thoughts. Miguel is a developer with a long history in open source. He started the GNOME and Mono projects and has more open source experience in his pinky than I have in my entirety. He had some interesting insights.

In general, I am not sure where the idea came from that for something to be open source, the upstream maintainer had to take patches, that has never been the case.   Some maintainers are just too protective (qmail I believe for a long time did not take patches, or even engage in public discussions).   Others are just effectively too hard for average developers to get patches in (Linux kernel, C compilers) that they are effectively closed.

What gives?

Isn’t it odd that these concepts that we hold so dearly to as being part of the spirit of open source aren’t mentioned by these stewards of open source?

Maybe it’s because the essential ingredient to open source, the spirit of open source, is not accepting contributions, it’s freedom.

The freedom to look at, remix, and redistribute source code without fear of recrimination.

Even so, I still think accepting contributions and developing in the open are hugely important to any open source project. If I didn’t believe that, I wouldn’t be working at GitHub.

I started to think that perhaps a more apt term to describe that process is crowd sourcing. Crowd sourcing can provide many benefits, according to the article I linked to:

  • Problems can be explored at comparatively little cost, and often very quickly.
  • Payment is by results or even omitted.
  • The organization can tap a wider range of talent than might be present in its own organization.
  • By listening to the crowd, organizations gain first-hand insight on their customers’ desires.
  • The community may feel a brand-building kinship with the crowdsourcing organization, which is the result of an earned sense of ownership through contribution and collaboration.

But Miguel suggested a better term: “open and collaborative development.” That’s the process so closely associated with developing open source software that it’s become synonymous with open source in the minds of many people. But it’s not the same thing, because it’s possible to conduct open and collaborative development on a non-open source project.

Splitting Hairs?

I know some folks will continue to think I’m splitting hairs, which is an impressive feat if you think about it as I definitely don’t have the hands of a surgeon and hairs are so thin.

One counter argument might be that perhaps the original framing of “open source” was focused on freedoms, but what we refer to as open source today has evolved to include crowd sourcing as an essential component.

I can see that. It seems obvious to me that collaborative development in the open is a huge part of the culture surrounding open source. But being a core part of the culture doesn’t necessarily mean it’s in the spirit. A lot of people feel that drugs are a big part of the Burning Man culture, for example, but I don’t think it’s an essential part of its spirit. It’s the creativity and expression that forms the spirit of Burning Man.

Who Cares About The License?

Another point someone made is that the community of contributors is more beneficial than having an open source license. I addressed that point in my last post, but Miguel had a great take on this, emphasis mine.

The reason why the OSI definition was important is because it provided a foundation that allowed unlimited redistribution and serviceability for the code for the future, with the knowledge that there were no legal restrictions.  This is the foundation for Debian’s policies and in general, the test that must be passed for a project to be adopted.  Serviceability does not require talking to an upstream maintainer, it means having the means and rights to do so, even if the upstream distribution goes away, vanishes, dies, or moves on to greener pastures.

You could build the most open and collaborative project ever, but if the source code isn’t under a license that meets the open source definition, it may be possible for the project to close shop and withdraw all your rights to the code.

With OSS code, that’s just not possible. A copyright holder of the source can close shop and stop giving you access to new additions they write, but they can’t retroactively withdraw the license to code they’ve already released under an OSI-certified license.

And this is a real benefit, even with otherwise “closed” projects that are open source. Miguel gave me one example in relation to Mono.

A perfect example is all the code that we have taken from Microsoft that they open sourced and ran with it. In some cases we modify it/tweak it (DLR, BigInteger), in others we use as-is, but without it being open source, we would be years behind and would have never been able to build a lot of extra features that we now depend on.

Prove me wrong?

By the way, I’m open to be proven wrong and changing my mind. Heck, I’ve done it twice this week as Miguel convinced me that “Open Source” only requires the license. But if you do disagree with me, I’d love to see references that back up your point as opposed to unsubstantiated name calling.

I think this topic is tricky because it’s very easy to discuss whether software is licensed under an open source license. If we agree on the open source definition, or the free software definition, you can easily evaluate that yes, the software gives you the rights and freedoms mentioned earlier.

But it’s trickier to create strict definitions of what encompasses the spirit of anything, because people have different ideas around them. I know some folks that feel that commercial software goes against the spirit of open source. I tend to disagree pointing out that it’s the license that matters, and whether or not you make money from the software is tangential. But I digress.

I know I won’t convince everyone of my points. That’s fine. I enjoy a healthy debate. The only thing I hope to convince you of is that even if you disagree with me, that you can see I’ve provided good reasons for why I believe what I do and it’s not out of being disingenuous.

Collaboration is Good

In the end, I think it’s a huge benefit for any open source project to develop in an open collaborative manner. When I think of open source software, that’s the development model that comes first in my mind. But, if you don’t follow that model (perhaps for good reasons, maybe for bad reasons) but do license the software under an open source license, I will still recognize your project as an open source project.

code, asp.net mvc, open source, community comments edit

UPDATE: I have a follow-up post that addresses a few criticisms of this post.

It all started with an innocent tweet asking whether ASP.NET MVC 3 is “open source” or not. I jumped in with my usual answer, “of course it is!” The source code is released under the Ms-PL, a license the OSI legally reviewed to ensure it meets the Open Source Definition (OSD). The Free Software Foundation (FSF) recognizes it as a “free software license”^1^, making it not only OSS, but FOSS (Free and Open Source Software) by that definition.

Afterwards, a healthy debate ensued on Twitter. No seriously! “Healthy” debates on Twitter do happen sometimes. And many times, I learn something.

Many countered with objections. How can the project be open source if development is done in private and the code is “tossed over the wall” at the end of the product cycle under an open source license? How can it be open source if it doesn’t accept contributions?

This is when it occurred to me:

  1. I’ve been using these terms interchangeably and imprecisely.
  2. Many people have different concepts of what “open source” is.

“Open source” is not the same thing as “open source software”. The first defines an approach to building software. The second is the end product. Saying they are the same thing is like saying that a car that Toyota makes and Kanban are the same thing. Obvious, isn’t it?

The Importance of Definitions

I’m a big fan of open source software and the open source model of developing software. I don’t necessarily think all software should be open source, but I do find a lot of benefits to this approach. I’m also not arguing that writing OSS in a closed off manner is a good or bad thing. What I care about is having clear definitions so we’re talking about the same things when we use these terms. Or at least making it clear what I mean when I use these terms.

Definitions are important! If you allow any code to define itself as “open source”, you get monstrosities like the MS-LPL. If you don’t remember this license, it looked a lot like an open source license, but it had a nasty platform restriction.

Microsoft coined the term “Shared Source” to describe these licenses, which allow you to look at the code but aren’t open source by most common definitions of the term.

Open Source Software

Going back to my realization, when I talk about “open source” I often really mean “open source software”. In my mind, open source software is source code that is licensed under a license that meets the Open Source Definition.

Thus this phrase is completely defined in terms of properties of the source code. More specifically, it’s defined in terms of the source code’s license and what that license allows.

So when we discuss what it means for software to be open source software, I try to frame things in terms of the software and not who or how the software is made.

This is why I believe ASP.NET MVC is OSS despite the fact that the team doesn’t currently accept outside contributions. After all, source code can’t accept contributions. People do. In fact, open source licenses have nothing to say about contributions at all.

What defines OSS is the right to modify and freely redistribute the code. It doesn’t give anyone the right to force the authors to accept contributions.

Open Source

So that’s the definition that’s often in my mind when I’ve used the term “open source” in the past. From now on, I’ll try to use “open source software” for that.

When most people I talk to use the term “open source”, they’re talking about something different. They are usually talking about the culture, process, and philosophy that surrounds building open source products such as open source software. Characteristics typically include:

  • Developed in the open with community involvement
  • The team accepts contributions that meet its standards
  • The end product has an open source license. This encompasses open source software, open source hardware, etc.

There are probably others I’m forgetting, but those three stand out to me. So while I completely believe that a team can develop open source software in private without accepting contributions, I do believe they’re not complying with the culture and spirit of open source. Or as my co-worker Paul puts it (slightly modified by me):

I would put a more positive spin, that they’re losing out on many of the benefits of open source.

And as Bertrand points out in my comments, “open source” applies to many more domains than just software, such as open source hardware, recipes, etc.

Why does it matter?

So am I simply quibbling over semantics? Why does it matter? If a project does it in private and doesn’t accept contributions, why does anyone care if the product is licensed as open source software?

I think it’s very important. In fact, I think the most important part is the freedom that an open source license allows. As important and wonderful as I think developing in the open and accepting contributions is, I think the freedom the license gives you is even more important.

Why? Maybe a contrived, but not too far fetched, example will help illustrate.

Imagine a project building a JavaScript library that develops completely in the open and accepts contributions. They’ve built up a nice healthy ecosystem and community around their project. Some of the code is really whiz bang and would make my website super great. I can submit all the contributions I want. Awesome, right?

But there’s one teensy problem. The license for the code has a platform restriction. Let’s say the code may only run on Windows. That’d be pretty horrible, right? The software is useless to me. I can’t even use a tiny part of it in my own website because I know browsers running on macs will visit my site and thus I’d be distributing it to non-Windows machines in violation of the license.

But if it were a library developed in private, and once in a while they produce an open source licensed release, so many doors are opened. I can choose to fork it and create a separate open community around the fork. I can distribute it via my website however I like. Others can redistribute it.

Again, just to clarify. In the best case, I’d have it both ways. An open source license and the ability to submit contributions and see the software developed in the open. My main point is that the license of a piece of useful software, despite its origins, is very important.

In the end?

So in the end, is ASP.NET MVC 3 “open source”? As Drew Miller aptly wrote on Twitter,

How I like to describe it: ASP.NET MVC 3 is open source software, but not an open source project.

At least not yet anyways.

UPDATE: After some discussion, I’m not so sure even this last statement is correct. Read my follow-up post, What is the spirit of open source?

^1^Although the Free Software Foundation recognizes the MS-PL (and MS-RL) as free software licenses, they don’t recommend them because of their incompatibility with the GNU GPL.

open source, nuget, code comments edit

Recently, the Log4Net team released log4net 1.2.11 (congrats by the way!). The previous version of log4Net was 1.2.10.

Whichever philosophy of versioning you subscribe to, we can all agree that incrementing only the third part of a version number indicates that the new release is a minor update, hopefully with no breaking changes. Perhaps a bug fix release.

This is especially true if you subscribe to Semantic Versioning (SemVer) as NuGet does. As I wrote previously,

SemVer is a convention for versioning your public APIs that gives meaning to the version number. Each version has three parts, Major.Minor.Patch.

In brief, these correspond to:

  • Major: Breaking changes.
  • Minor: New features, but backwards compatible.
  • Patch: Backwards compatible bug fixes only.

Given that the Patch number is supposed to represent bug fixes only, NuGet chooses the minimum Major and Minor version of a package to meet the dependency constraint, but the maximum Patch version. David Ebbo describes the algorithm and rationale in part 2 of his three part series on NuGet Versioning.
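To make that rule concrete, here’s a small self-contained sketch of the selection logic. `Resolve` is a hypothetical helper for illustration, not NuGet’s actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class VersionPicker
{
    // Sketch of the rule: take the lowest Major.Minor that satisfies the
    // constraint, but the highest Patch within that Major.Minor.
    public static Version Resolve(IEnumerable<Version> available, Version minimum)
    {
        return available
            .Where(v => v >= minimum)
            .OrderBy(v => v.Major)
            .ThenBy(v => v.Minor)
            .ThenByDescending(v => v.Build) // Build holds the third (Patch) part
            .First();
    }

    static void Main()
    {
        var available = new[] { "1.2.10", "1.2.11", "1.3.0", "2.0.0" }
            .Select(Version.Parse)
            .ToArray();

        // A dependency on log4net >= 1.2.10 resolves to 1.2.11: the newest
        // patch of the oldest acceptable Major.Minor.
        Console.WriteLine(VersionPicker.Resolve(available, new Version("1.2.10")));
    }
}
```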

Strong Names and Versioning

The consequence of this is as follows. With the new log4Net release, if you have a package that has log4net 1.2.10 or greater as a dependency:

<dependency id="log4net" version="1.2.10" />

Installing that package would give you log4net 1.2.11. In most cases, this is what you want because the newer release might have important bug fixes such as security fixes.

However, in this case, Log4Net changed the strong name for their assembly for 1.2.11. Whatever your feelings about using strong names or not (that’s a separate discussion), the fact is that if you choose to use them, changing the strong name is changing the identity of your assembly. That’s a major breaking change.

And man, were a lot of people affected! We heard from tons of folks who were broken by this and unsure how to fix it.

NuGet does support a workaround so that you can prevent inadvertent upgrades. You can constrain the allowed versions of an installed package by manually modifying packages.config. Sadly, we don’t yet have a UI for this, so it’s a bit of a pain.
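For example, adding an allowedVersions attribute to the log4net entry pins the package so that an update won’t silently pull in 1.2.11. (This is a sketch of a typical packages.config; the attribute takes a NuGet version range.)

```xml
<?xml version="1.0" encoding="utf-8"?>
<packages>
  <!-- [1.2.10] means "exactly 1.2.10"; a range like [1.2,1.3) also works -->
  <package id="log4net" version="1.2.10" allowedVersions="[1.2.10]" />
</packages>
```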

The Solution

Apart from never changing your strong name, the solution in this case is to treat this change as a major breaking change and increment the major version number of the assembly.

I don’t anticipate the Log4Net team will change the version of their assembly, but I reached out to the maintainer of the Log4Net package (no connection to the Log4Net team so please don’t give him grief about this) and he graciously incremented the major version of the Log4Net package to solve the problem.

Just to be clear, the log4net 2.0 NuGet package contains the log4net 1.2.11 assembly.

While it’s generally good form to have the package and assembly version match to avoid confusion, it’s not necessary. This is a good example of a case where they need to differ. I do suggest having the “Title” and “Description” note this fact to help avoid further confusion.

I want to thank Jiri for maintaining the Log4Net package and being responsive to the need out there! It’s much appreciated.

nuget, open source comments edit

I’ve seen a few recent tweets asking about what’s going on with NuGet since I left Microsoft. The fact is that the NuGet team has been hard at work on the release and have been discussing it in various public forums. I think the feeling of “quiet” might be due to the lack of blogging, which I can easily correct right now!

In this post, I want to highlight a few things:

  • What the NuGet team has been working on
  • How you can track what we’re doing
  • And how you can get involved in the discussion

Just to clarify, I have not left the NuGet project. Until my name is removed from this page, I will be involved. However, as I’ve been ramping up in my new job at GitHub (loving it!), I have been less involved than I would like, simply because there’s only so much of me to go around. Once we ship the project I’m working on at work, I hope to divide my time a little better so that I don’t neglect NuGet. But at the same time, everyone else has stepped up so much that I don’t think they’ve missed me much.

The team and I are working through how to best keep me involved and we are starting to improve lines of communication.

NuGet Status Page

The NuGet team is currently working on NuGet 1.7, but in the meantime, we’ve shipped a status page at http://status.nuget.org/.

This status page is designed to provide the community with information about the overall state of NuGet, as well as the future direction and plans of the NuGet team. Notifications about planned maintenance and outages will be posted in the “State of NuGet address” section and can be followed using the RSS feed. During these times, the team will use this section to communicate any pertinent information.

NuGet is more than an add-in to Visual Studio. It’s an important service to multiple different clients and partners and we’re working on ways to improve that communication. The status page is a start, but we’re open to other ideas for improving communication.

NuGet Issue Tracker

As always, if you’re curious about the progress being made towards NuGet 1.7, just visit the issue tracker. The link I just provided shows a filtered view of issues that are still open for NuGet 1.7. You can select the Fixed or Closed status to see what issues have already been implemented for 1.7.

So even if our blogs get a little quiet, the issue tracker is the source of truth about the activity.

NuGet JabbR

The NuGet team is moving more and more of our design discussions to our JabbR room. Don’t know what JabbR is? It’s a real-time chat site built on top of ASP.NET and SignalR with a lot of nice features. It’s similar to CampFire, but has great features such as tab expansions for user names as well as emojis! JabbR itself has a very active community surrounding it and they accept pull requests!

In fact, last night starting at around 11 PM we had a big design discussion around capability filtering. For example, if you are on a client that doesn’t support a feature of the package (such as PowerShell scripts), can we filter that out for you? If you have something to say about this, don’t respond here, go to the JabbR room!

What’s Next?

A big focus for us is getting the community more and more involved in NuGet. We hope the move towards leveraging JabbR more helps in that regard. We’re still hashing out some of the details in how we do this. For example, what should be discussed in JabbR vs our discussions site? I think JabbR is a great place to hash out a design and then perhaps summarize the results in a discussion item for posterity. What do you think?

code comments edit

Back in November, someone asked a question on StackOverflow about converting arbitrary binary data (in the form of a byte array) to a string. I know this because I make it a habit to read randomly selected questions in StackOverflow written in November 2011. Questions about text encodings in particular really turn me on.

In this case, the person posing the question was encrypting data into a byte array and converting that data into a string. The conversion code he used was similar to the following:

string text = System.Text.Encoding.UTF8.GetString(data);

That isn’t exactly their code, but this is a pattern I’ve seen in the past. In fact, I have a story about this I want to tell you in a future blog post. But I digress.

The infamous Jon Skeet answers:

You should absolutely not use an Encoding to convert arbitrary binary data to text. Encoding is for when you’ve got binary data which genuinely is encoded text - this isn’t.

Instead, use Convert.ToBase64String to encode the binary data as text, then decode using Convert.FromBase64String.

Yes! Absolutely. Totally agree. As a general rule of thumb, agreeing with Jon Skeet is a good bet.
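Here’s what Skeet’s suggestion looks like in practice, using a byte that is not valid UTF-8 on its own:

```csharp
using System;

class Base64RoundTrip
{
    static void Main()
    {
        // A single byte that no text encoding is obligated to preserve.
        var data = new byte[] { 128 };

        // Base64 treats the input as opaque bytes, not as encoded text.
        string text = Convert.ToBase64String(data);
        byte[] roundTripped = Convert.FromBase64String(text);

        Console.WriteLine(text);             // gA==
        Console.WriteLine(roundTripped[0]);  // 128 -- nothing lost
    }
}
```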

Not to give you the impression that I’m stalking Skeet, but I did notice that this wasn’t the first time Skeet answered a question about using encodings to convert binary data to text. In response to an earlier question he states:

Basically, treating arbitrary binary data as if it were encoded text is a quick way to lose data. When you need to represent binary data in a string, you should use base64, hex or something similar.

This piqued my curiosity. I’ve always known that if you need to send binary data in text format, base64 encoding is the safe way to do so. But I didn’t really understand why the other encodings were unsafe. What are the cases in which you might lose data?

Round Tripping UTF-8 Encoded Strings

Well let’s look at one example. Imagine you’re receiving a stream of bytes and you store it as a UTF-8 string and pop it in the database. Later on, you need to relay that data so you take it out, encode it back to bytes, and send it on its merry way.

The following code simulates that scenario with a byte array containing a single byte, 128.

var data = new byte[] { 128 };
string text = Encoding.UTF8.GetString(data);
var bytes = Encoding.UTF8.GetBytes(text);

Console.WriteLine("Original:\t" + String.Join(", ", data));
Console.WriteLine("Round Tripped:\t" + String.Join(", ", bytes));

The first line of code creates a byte array with a single byte. The second line converts it to a UTF-8 string. The third line takes the string and converts it back to a byte array.

If you drop that code into the Main method of a Console app, you’ll get the following output.

Original:      128
Round Tripped: 239, 191, 189

WTF?! The data was changed and the original value is lost!

If you try it with 127 or less, it round trips just fine. What’s going on here?

UTF-8 Variable Width Encoding

To understand this, it’s helpful to understand what UTF-8 is in the first place. UTF-8 is a format that encodes each character in a string with one to four bytes. It can represent every Unicode character, but is also backwards compatible with ASCII.

ASCII is an encoding that represents each character with seven bits of a single byte, and thus consists of 128 possible characters. The high order bit in standard ASCII is always zero. Why only 7-bits and not the full eight?

Because seven bits ought to be enough for anybody:

When you counted all possible alphanumeric characters (A to Z, lower and upper case, numeric digits 0 to 9, special characters like “% * / ?” etc.) you ended up a value of 90-something. It was therefore decided to use 7 bits to store the new ASCII code, with the eighth bit being used as a parity bit to detect transmission errors.

UTF-8 takes advantage of this decision to create a scheme that’s both backwards compatible with the ASCII characters and able to represent all Unicode characters by leveraging the high-order bit that ASCII ignores. Going back to Wikipedia:

UTF-8 is a variable-width encoding, with each character represented by one to four bytes. If the character is encoded by just one byte, the high-order bit is 0 and the other bits give the code value (in the range 0..127).

This explains why bytes 0 through 127 all round trip correctly. Those are simply ASCII characters.

But why does 128 expand into multiple bytes when round tripped?

If the character is encoded by a sequence of more than one byte, the first byte has as many leading “1” bits as the total number of bytes in the sequence, followed by a “0” bit, and the succeeding bytes are all marked by a leading “10” bit pattern.

How do you represent 128 in binary? 10000000

Notice that it’s marked with a leading 10 bit pattern which means it’s a continuation character. Continuation of what?

the first byte never has 10 as its two most-significant bits. As a result, it is immediately obvious whether any given byte anywhere in a (valid) UTF‑8 stream represents the first byte of a byte sequence corresponding to a single character, or a continuation byte of such a byte sequence.
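
To see the multi-byte scheme in action with a character that legitimately needs two bytes, consider “é” (U+00E9):

```csharp
using System;
using System.Text;

class Utf8MultiByte
{
    static void Main()
    {
        // "é" (U+00E9) is outside ASCII, so UTF-8 encodes it with two bytes.
        byte[] bytes = Encoding.UTF8.GetBytes("\u00E9");

        Console.WriteLine(String.Join(", ", bytes)); // 195, 169

        // 195 = 11000011: two leading 1 bits mark a two-byte sequence.
        // 169 = 10101001: the leading 10 marks a continuation byte.
        Console.WriteLine(Convert.ToString(bytes[0], 2)); // 11000011
        Console.WriteLine(Convert.ToString(bytes[1], 2)); // 10101001

        // Because this is genuine text, the round trip is lossless.
        Console.WriteLine(Encoding.UTF8.GetString(bytes)); // é
    }
}
```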

So in answer to the question of why 128 expands into multiple bytes when round tripped: a lone byte of 128 isn’t a valid UTF-8 sequence, so the decoder substitutes the Unicode replacement character (U+FFFD), which is used for invalid data and encodes to the bytes 239, 191, 189 (thanks to RichB for the answer in the comments!).

I’ve noticed a lot of invalid UTF-8 values expand into these three bytes. But that’s beside the point. The point is that using UTF-8 encoding to store binary data is a recipe for data loss and heartache.

What about Windows-1252?

Going back to the original question, you’ll note that the code didn’t use UTF-8 encoding. I took some liberties in describing his approach. What he did was use System.Text.Encoding.Default. This can be different on different machines, but on my machine it’s the Windows-1252 character encoding, also known as “Western European Latin”.

This is a single byte encoding and when I ran the same round trip code against this encoding, I could not find a data-loss scenario. Wait, could Jon be wrong?

To prove this to myself, I wrote a little program that cycles through every possible byte and round trips it.

using System;
using System.Linq;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        var encoding = Encoding.GetEncoding(1252);
        for (int b = Byte.MinValue; b <= Byte.MaxValue; b++)
        {
            var data = new[] { (byte)b };
            string text = encoding.GetString(data);
            var roundTripped = encoding.GetBytes(text);

            if (!roundTripped.SequenceEqual(data))
            {
                Console.WriteLine("Round Trip Failed At: " + b);
                return;
            }
        }

        Console.WriteLine("Round trip successful!");
        Console.ReadKey();
    }
}

The output of this program shows that you can encode every byte, then decode it, and get the same result every time.

So in theory, it could be safe to use the Windows-1252 encoding for binary data, despite what Jon said.

But I still wouldn’t do it. Not just because I believe Jon more than my own eyes and code. If it were me, I’d still use Base64 encoding because it’s known to be safe.
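For completeness, a Base64 round trip over every possible byte value never loses data, because Base64 maps arbitrary binary to a safe subset of ASCII (Python sketch):

```python
import base64

# Base64 represents arbitrary bytes using only 64 safe ASCII
# characters (plus '=' padding), so nothing gets reinterpreted
# by a character encoding along the way.
data = bytes(range(256))
encoded = base64.b64encode(data).decode("ascii")
assert base64.b64decode(encoded) == data
print("Base64 round trips all 256 byte values")
```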

There are five unmapped code points in Windows-1252. You never know if those might change in the future. Also, there’s just too much risk of corruption. If you were to store this string in a file that converted its encoding to Unicode or some other encoding, you’d lose data (as we saw earlier).
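Those five unmapped code points (0x81, 0x8D, 0x8F, 0x90, 0x9D) are exactly where implementations disagree. As far as I can tell, .NET quietly maps them to C1 control characters as a best-effort fallback, which is why the round trip above succeeded; stricter codecs treat them as undefined. Python’s cp1252 codec, for example, refuses to decode them:

```python
# The five byte values with no character assigned in Windows-1252.
unmapped = [0x81, 0x8D, 0x8F, 0x90, 0x9D]
for b in unmapped:
    try:
        bytes([b]).decode("cp1252")
        print(f"0x{b:02X} decoded")
    except UnicodeDecodeError:
        print(f"0x{b:02X} is undefined in cp1252")
```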

Or if you were to pass this string to some unmanaged API (perhaps inadvertently) that expected a null terminated string, it’s possible this string would include an embedded null character and be truncated.
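That truncation risk is easy to demonstrate. Every byte value, including zero, decodes to some character under a single-byte encoding, so binary data containing a zero byte produces a string with an embedded NUL. Here’s a Python sketch simulating what a C-style, null-terminated consumer would see:

```python
data = bytes([0x41, 0x00, 0x42])  # 'A', NUL, 'B'
text = data.decode("cp1252")
# A null-terminated string API stops reading at the first NUL.
truncated = text.split("\x00")[0]
print(repr(truncated))  # 'A' -- the trailing 'B' is lost
```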

In other words, the safest bet is to listen to Jon Skeet as I’ve said all along. The next time I see Jon, I’ll have to ask him if there are other reasons not to use Windows-1252 to store binary data other than the ones I mentioned.

personal comments edit

Birthdays are a funny thing, aren’t they? Let’s look at this tweet for example,

It’s @haacked’s birthday. Give him crap about getting old.

No gifts, please. Especially not what Charlie suggests.

Of course I’m getting older. We’re all getting older. Every second of every day and twice on Monday. Every femtosecond even. Perhaps the only time we’re not getting older is the moment within a Planck time interval. But once that interval is up, yep, you’re older.

Yet people apparently live their lives completely oblivious to this fact until their next birthday comes along. As the chronometer slides the next number into place, the realization dawns, “Damn! I’m older!” What? You didn’t know this?!

Feeling Older

The odd thing to me is that I don’t really feel older, mentally. I mean, I consciously know I’m older, but I feel like there’s this smooth continuum from my first memory to now. While the things I spend time thinking about have changed, the way I think about others and about myself feels like it hasn’t changed. I’m the same person then as I am now, and that kind of blows my mind.

For example, I still think fart jokes are funny.

In my mind, old people tell you how they used to walk miles uphill both ways to get to school. But I realize that these days, old people tell you about how they used to have to use their phone to connect online at 1200 baud. And there was no internet!!! OMG! What the hell were we connecting to?

Rather than feeling older, I am observing the evidence that I’m older. For example, I used the word “baud” in this blog post. Another example is how injuries now take much longer to heal. I have two kids, a four year old and a two year old and I’m pretty sure that if I were to slice them clean in half, that’d only put them out of commission for a week. They’d heal up and have no scars! Meanwhile, if I get a paper cut on a finger I can pretty much kiss that finger goodbye. Write it off as a loss and start practicing typing with two bloody stumps for hands.

Getting Experienced

But it’s not just physical. I do notice that while I don’t feel older, I do have the benefit of many more years of experience to draw upon. But more importantly, I’m finally actually paying attention to that. Go figure.

Last week, we had our GitHub summit and Friday was our field trip day to a distillery then a bar. This was the night set aside to party hard. Which is amazing to me because the night before I’m pretty sure we as a company consumed enough alcohol to bring elephants to extinction.

But I drew upon my experience and took it easy because I had a flight early the next morning and I did not want to be sick on an airplane. Contrast this to a few years before at Tech-Ed Hong Kong when I was out with some local friends and at 5:00 AM I had to leave the bar early to catch a flight. For the first time in my life, I contemplated suicide.

Some might call that getting wiser. I call it pain avoidance.

Knowing Less

The other evidence of my getting older is that I know a lot less now than I did when I was younger. Certainly that can’t be true in the absolute sense since I don’t have Alzheimer’s (that I’m aware of anyways). But I remember as a young programmer I knew everything!

I knew the right way to do all things in all situations with absolute conviction. But these days, I’m not so sure. About anything. All I have is the breadth of my experience and pattern matching at my disposal. Each new situation is simply a pattern matching exercise against my database of experience followed by an experiment to see if what I thought I knew produces good results.

The great thing about this approach is when you know everything, you have nothing to learn. But now, I’m constantly learning. Many of my experiments fail because many of my experiences are no longer relevant today. The world changes. Quickly. But each experiment is an opportunity to learn.

Staying Young

So yeah, I’m getting older, but I’ve found a loophole. Remember the kids I mentioned slicing in half? I’m not going to do that because I’m worried I’d end up with four of them then and two are already a handful.

These two do a great job of making me feel young because they will laugh at every fart joke I can come up with.

So thanks for all the birthday wishes on Twitter, Facebook, and elsewhere. Here’s to getting older!