code

Eric Lippert writes one of my all-time favorite tech blogs. Sadly, the purple font he was famous for is no more, but the technical depth is still there.

In a recent post, he asks the question, "What is Duck Typing?" His post provides a thoughtful critique and deconstruction of the Wikipedia entry on the subject. Seriously, go read it, but please come back here afterwards!

For those of you too lazy to read it, I'll try and summarize crudely. He starts off with his definitions of "typing":

A compile-time type system provides three things: first, rules for defining types, like "array of pointers to integer". Second, rules for logically deducing what type is associated with every expression, variable, method, property, and so on, in a program. And third, rules for determining what programs are legal or illegal based on the type of each entity. These types are used by the compiler to determine the legality of the program.

A run-time type system provides three similar things: first, again, rules for defining types. Second, every object and storage location in the system is associated with a type. And third, rules for what sorts of objects can be stored in what sorts of storage locations. These types are used by the runtime to determine how the program executes, including perhaps halting it in the event of a violation of the rules.

He continues with a description of structural typing that sounds like what he always thought "duck typing" referred to, but notes that his idea differs from the Wikipedia definition. As far as he can tell, the Wikipedia definition sounds like it's just describing Late Binding.

But this is not even typing in the first place! We already have a name for this; this is late binding. "Binding" is the association of a particular method, property, variable, and so on, with a particular name in a particular context; if done by the compiler then it is "early binding", and if it is done at runtime then it is "late binding". Why would we even need to invent this misleadingly-named idea of "duck typing" in the first place??? If you mean "late binding" then just say "late binding"!

I agree that the Wikipedia definition is a bit unclear, but I think there's more to it than simple late binding. Also, I think some of the confusion lies in the fact that duck typing isn't so much a type system as it is a fuzzy approach to treating objects as if they are certain types (or close enough) based on their behavior rather than their declared type. This is a subtle distinction from late binding.

To back this up, I looked at the original Google Group post where Alex Martelli first described this concept.

In other words, don't check whether it IS-a duck: check whether it QUACKS-like-a duck, WALKS-like-a duck, etc, etc, depending on exactly what subset of duck-like behaviour you need to play your language-games with.

This was a response to a question that asked (I'm paraphrasing): how do you handle method overloading with a single parameter in a dynamic language? Specifically, the question was in reference to the Python language.

To illustrate, in a statically typed language like C#, you might have the following three methods of a class (forgive me if the example seems contrived; I lack imagination):

public class PetOwner {
  public void TakeCareOf(Duck duck) {...}
  public void TakeCareOf(Robot robot) {...}
  public void TakeCareOf(Car car) {...}
}

In C#, the method that gets called is resolved at compile time depending on the type of the argument passed to it.

var petOwner = new PetOwner();
petOwner.TakeCareOf(new Duck()); // calls first method.
petOwner.TakeCareOf(new Robot()); // calls second method.
petOwner.TakeCareOf(new Car()); // calls third method.

But in a dynamic language, such as Python, you can't have three methods with the same name each with a single argument. Without a type declared for the method argument, there is no way to distinguish between the methods. Instead, you'd need a single method and do something else.

One approach would be to switch on the runtime type of the argument passed in, but Alex points out that would be inappropriate in Python, I assume because it conflicts with Python's dynamic nature. Keep in mind that I'm not a Python programmer, so I'm basing this on my best attempt to interpret Alex's words:

In other words, don't check whether it IS-a duck: check whether it QUACKS-like-a duck, WALKS-like-a duck, etc, etc, depending on exactly what subset of duck-like behaviour you need to play your language-games with.

As I said before, I don't know a lick of Python, so I'll write some pseudocode for what this might look like.

class PetOwner:
    def take_care_of(self, arg):
        if behaves_like_duck(arg):
            pass  # Pout lips and quack quack
        elif behaves_like_robot(arg):
            pass  # Domo arigato Mr. Roboto
        elif behaves_like_car(arg):
            pass  # Vroom vroom vroom farfegnugen

So rather than check if the arg IS A duck, you check if it behaves like a duck. The question is, how do you do that?

Alex notes this could be tricky.

On the other hand, this can be a considerable amount of work, depending on how you go about it (actually, it need not be that bad if you "just go ahead and try", of course catching the likely exceptions if the try does not succeed; but still, greater than 0).

One proposed approach is to simply treat it like a duck, and if it fails, start treating it like a fish. If that fails, try treating it like a dog.

I'd guess that code would look something like:

class PetOwner:
    def take_care_of(self, arg):
        try:
            arg.walk()
            arg.quack()
        except AttributeError:
            try:
                arg.sense_danger_will_robinson()
                arg.dance_in_staccato_manner()
            except AttributeError:
                arg.drive()
                arg.drift()

Note that this is not exactly the same as the late binding Eric describes. Late binding is involved, but that's not the full picture. It's late binding combined with branching on the whole set of methods and properties that makes up "duck typing."

What's interesting is that this was not the only possible solution that Alex proposed. In fact, he concludes it's not the optimal approach.

Besides, "explicit is better than implicit", goes one of Python's mantras. Just let the client-code explicitly TELL you which kind of argument they are passing you (and doing so through a named argument is simple and readable), and your work drops to zero, while removing no useful functionality whatever from the client.

He goes on to state that this implicit duck typing approach to method overloading seems to have dubious benefit.

The "royal-road" alternative route to overloading would, I think, be the use of suitable named-arguments. A rockier road, perhaps preferable in some cases, but more work for dubious benefit, would be the try/except approach to see if an argument supplies the functionalities you require.

The Python approach would be to pass in a discriminator. Even so, the object passed in would have to fulfill the set of requirements for the selected branch of code indicated by the discriminator. With the discriminator, it does feel more like we're just talking about late binding, but applied to a set of methods and properties rather than to each one individually.

One observation I've heard is that "duck typing" sounds kind of like "duct taping." Not sure if there's anything to that, but if you forgive a bit of a stretch, I think it may be an apt analogy.

On the Apollo 13 mission, the crew was faced with a situation where carbon dioxide was rising to dangerous levels in the Lunar Module. They had plenty of filters, but their square filters would not fit in the round barrels that housed the filters. In other words, their square filters were the wrong type (whether dynamic or static). Their solution was to use duct tape to cobble something together that would work. It wasn't the solution intended by the original design, but as long as the final contraption acted like an air filter (duck typing), they would survive. And they did. Like I said, the analogy is a bit of a stretch, but I think it embodies the duck typing approach.

Perhaps a better term is typing by usage. With explicit typing, you explicitly declare an object to be one type or another (whether at compile time or run time). With typing by usage, if an object just happens to meet the needs of the consumer, then hey! It's a duck!
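
As an aside, C# developers can get a taste of typing by usage with the dynamic keyword, which defers member binding to runtime. A minimal sketch of my own (not from Eric's or Alex's posts):

public class PetOwner
{
    // No declared parameter type required: any object that happens to
    // have Walk() and Quack() methods will do. Each member access is
    // bound at runtime and throws a RuntimeBinderException if the
    // member is missing.
    public void TakeCareOf(dynamic pet)
    {
        pet.Walk();
        pet.Quack();
    }
}

Under the hood this is still per-member late binding; the duck typing flavor comes from treating the whole set of members as the requirement the object has to satisfy.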

For statically typed languages, I really like the idea of structural typing. It provides a nice combination of type safety and flexibility. Mark Rendle, in the comments to Eric's blog post, provides this observation:

Structural Typing may be thought of as a kind of compiler-enforced subset of Duck Typing.

In other words, it's duck typing for statically typed languages.

Also in the comments to Eric's post, someone linked to my blog post about duck typing. At the time I wrote that, "structural typing" wasn't in my vocabulary. If it had been, I could have been more precise in my post. For static languages, I find structural typing to be very compelling.

What do you think? Did I nail it? Or did I drop the ball and get something wrong or misrepresent an idea? Let me know in the comments.

UPDATE: Sam Livingston-Gray, also known as @geeksam, notes another key difference between late binding and duck typing that I completely missed:

@haacked method_missing illustrates the disconnect between binding and typing: an obj can choose how and whether to respond to a message

Recall that Eric defines "Late Binding" as:

"Binding" is the association of a particular method, property, variable, and so on, with a particular name in a particular context; if done by the compiler then it is "early binding", and if it is done at runtime then it is "late binding".

You could argue that method_missing is another form of late binding where the name is bound to method_missing because there is no other name to bind to. But conceptually, it feels very different to me. With binding, you usually think of the caller determining which method to call by name. And whether it's bound early or late is no matter; it's still the caller's choice. With method_missing, it's the object in control of whether it's going to respond to the method call (message).
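
For the .NET folks following along, the closest analogue to method_missing I can think of is DynamicObject, where the object itself decides how to handle any member invocation. A rough sketch of my own, not something from Sam or Eric:

using System.Dynamic;

public class Quacker : DynamicObject
{
    // The object, not the caller, decides whether and how to respond
    // to a method call (message), much like Ruby's method_missing.
    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        result = "I don't know how to " + binder.Name + ", so I'll just quack.";
        return true;
    }
}

// dynamic duck = new Quacker();
// Console.WriteLine(duck.FlyToTheMoon()); // the object chose to respond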

oss

Today, I read a comment about a group of people who feel betrayed by the increase in code that Microsoft is releasing under an open source license.

I and my team are really troubled by MS's apparent policy of Open-sourcing code from under our feet.

We are in an industry with rather paranoid clients that contractually bar us from using Open Source software. We have immensely enjoyed using Rx but its open-sourcing threw us into turmoil.

I feel like we have been betrayed, and will think twice before adopting some new Microsoft framework again, wary of it being Open Sourced later on without prior warning.

And I understand this feeling. I really do. Feelings of betrayal are a natural consequence of progress.

By betrayal, I mean the following definition from Webster,

to hurt (someone who trusts you, such as a friend or relative) by not giving help or by doing something morally wrong

The Catholic Church at the time must have felt betrayed by Galileo when he lent his support to the heliocentric model of the universe because it deviated from their orthodoxy.

Factory owners who profited on cheap labor from children must have felt betrayed by the passage of child labor laws.

Racists who held onto the idea that other races were inferior to their own must have felt betrayed by the passage of the Civil Rights Act.

These are grand examples. But often we experience much smaller and simpler betrayals as a result of the tiny footsteps of progress, such as the betrayal some might feel when changing winds in the industry make their antiquated business practices harder to sustain.

It's important to note that betrayal does not imply progress. It can also result from regression. But doing the right thing always leads to some group feeling betrayed.

And what do we do about those who feel betrayed? It's easy to deride them as an anachronistic holdover from a past that no longer has a place in the present. That's the easy way to deal with it.

But for those who've only known the world to be one way all their life, feelings of betrayal are understandable when the world suddenly changes. Hopefully we can reach out and help them adapt to the new way. There's room for everyone. It's a painful path. But it can happen.

As for the rest, who would rather stew in their feelings of betrayal and hold onto their outdated ideas with an iron fist: these are the folks we should never, under any circumstances, let impede progress.

I'm sad to hear that these folks feel betrayed because their clients won't allow them to use Rx now that it is open source. Rx is a powerful and useful library. But when you put it in perspective, clients like these are going extinct. History has never been kind to businesses that cannot adapt to change. Best to try to educate the clients about how their policy is regressive and detrimental to their future health. And if that's not successful, then look for clients who are adaptable and wisely embrace open source as one of many means of ensuring their own survival in the long run.

jekyll

In my last blog post, the 2013 recap, I wrote about the number of steps I recorded with Fitbit last year and the year prior. In case you missed it, they were:

  • 2012 - 3,115,606 steps (Note, I started recording in March)
  • 2013 - 4,577,481 steps

Someone asked me how I got those numbers because the Fitbit dashboard is confusing. Indeed it is. Here's how.

First, when you go to the dashboard, you have to mouse over the section to see the "more info" link.

Fitbit dashboard

Then, click on the "Year" tab. But you'll notice that you still don't see summary data. You have to click on the little back link.

Fitbit activity without totals

There's a brief pause, but the summary totals should show up at the bottom.

Fitbit activity with totals

Unfortunately I can't get the same report for my sleep patterns without paying for the premium account. I might do that if I had tracked my sleep better last year.

Hopefully, if you have a Fitbit, this helps you.

blogging, personal

Another year comes to an end and tradition demands that I write a recap for the year. But it doesn't require that I write a very good one.

I wish I had the time and energy to write one of those recaps that capture the essence of the year in a thoughtful, insightful manner. The best I can muster is "a lot of stuff happened."

Here, look at this picture of my tiny kids playing chess.

chess

Personal

This has been a great year for me. My son started first grade, and much to our relief, he loves it. At home, he started to learn to program. I even had my first conversation with him about refactoring and the DRY principle. Parents, it's never too early to talk to your kids about clean code!

My daughter just gets more and more interesting and fun to be around. She has a big personality and just wins over any room she's in. Sometimes we take walks together and she's now able to walk with me over a mile to the local frozen yogurt place. But she usually makes me carry her part of the way back.

And I finished my second year at GitHub. After a year and a half solely focused on GitHub for Windows, I've been able to bounce around a few other cool projects which keeps me excited every day. I still love working here.

I spoke at a few conferences, but I've certainly ramped that down as travel is tough on the family and I had a tiny bit more work travel this year.

Work

Contribution graphs are not a great way to determine the impact you've had in a year. They don't capture a lot of important work that happens outside of GitHub. Yes, it's true. Productive work does happen that's not captured by a Git commit.

Even so, I find them interesting to look at for some historical perspective. The gaps in a contribution graph tell as much of a story as the areas that are filled in. For example, you can see when I go on vacation based on my graphs, though I'm not very good at staying away from the computer when I do.

Here are two of my contribution graphs. The first one is what I see, as it shows contributions to both public and private repositories.

Haacked Contribution Graph

The second one shows what the public sees. This is perhaps a decent, though not perfect, representation of the work I've done with open source.

Haacked public Contribution Graph

As you can see, after shipping a major release of GitHub for Windows, I shifted my focus to some open source projects like choosealicense.com and octokit.net, making my public contribution graph much greener in the latter half of the year.

What I wrote that people seemed to like

My three most popular posts written in 2013 according to Google Analytics are:

  1. Death to the if statement - more robust code with less control structures with 25,987 page views.
  2. Argue well by losing - You only learn something when you lose an argument with 21,264 views.
  3. Test Better - How developers should become better testers with 15,618 views.

By the way, does anyone know how to easily do a report in Google Analytics for content created in a year? I'd find that useful.

What I've Shipped

This past year, I've had the pleasure to be involved in shipping the following:

  1. GitHub Enterprise support in GitHub for Windows
  2. Octokit.net
  3. ChooseALicense.com
  4. RestSharp - a few releases actually.
  5. According to Fitbit, I had 4,577,481 steps this year. That's approximately 2,099 miles. Compare this to the 3.1 million steps I took the year before. That's a huge improvement!

You People

Yeah, let's talk about you. You people are my favorite. Well, most of you.

  • Visitors: 1,462,003 unique visitors made 2,091,606 visits. Those numbers are down 24.9% and 22.27% respectively from the previous year. I'd like to blame the death of blogging, but I suspect the quality of my writing has declined as I've focused more on other areas of my life.

  • RSS Subscribers: According to FeedBurner, there are still 84,377 subscribers to my RSS feed, which is surprising given the demise of Google Reader. I guess everybody found replacements. Or the stats are jacked.

Next Year

I'm looking forward to 2014. I've started learning F# by reading the Real-World Functional Programming book by Tomas Petricek and Jon Skeet. I'm hoping to incorporate more functional programming into my toolset. And I'm hoping to take even more steps.

Hopefully I can speak at a few conferences again this year. I'd love to speak in some new places. I'm really hoping to get a gig in South Korea this year. It'd be a chance to see how the industry is really growing there and to visit some of my family.

jekyll

Well this is a bit embarrassing.

I recently migrated my blog to Jekyll and subsequently wrote about my painstaking work to preserve my URLs.

But after the migration, despite all my efforts, I faced an onslaught of reports of broken URLs. So what happened?

Broken glass by Tiago Pádua CC-BY-2.0

Well it's silly. The program I wrote to migrate my posts to Jekyll had a subtle flaw. In order to verify that my URL would be correct, it made a web request to my old blog (which was still up at the time) using the generated file name.

This was how I verified that the Jekyll URL would be correct. The problem is that Subtext had this stupid feature where the date part of the URL didn't matter so much. It only cared about the slug at the end of the URL.

Thus requests for the following two URLs would receive the same content:

  • http://haacked.com/archive/0001/01/01/some-post.aspx
  • http://haacked.com/archive/2013/11/21/some-post.aspx

Picard Face Palm

This "feature" masked a timezone bug in my exporter that was causing many posts to generate the wrong date. Unfortunately, my export script had no idea these were bad URLs.

Fixing it!

So how'd I fix it? First, I updated my 404 page with information about the problem and where to report the missing file. You can set a 404 page by adding a 404.html file at the root of your Jekyll repository. GitHub pages will serve this file in the case of a 404 error.

I then panicked and started fixing errors by hand until my helpful colleagues Ben Balter and Joel Glovier reminded me to try Google Analytics and Google Webmaster Tools.

If you haven't set up Google Webmaster Tools for your website, you really should. There are some great tools in there including the ability to export a CSV file containing 404 errors.

So I did that and wrote a new program, Jekyll URL Fixer, to examine the 404s and look for the corresponding Jekyll post files. I then renamed the affected files and updated the YAML front matter with the correct date.
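
The heart of that program is only a few lines. Here's a rough sketch of the approach; the paths and the CSV layout are illustrative assumptions, not the actual Jekyll URL Fixer source:

using System.IO;
using System.Linq;

// For each 404'd URL from the Webmaster Tools export, find the post
// file with a matching slug and rename it so its date matches the URL.
var posts = Directory.GetFiles(@"C:\haacked.com\_posts", "*.markdown");

foreach (var line in File.ReadLines(@"C:\temp\crawl-errors.csv").Skip(1))
{
    // First column is the URL path, e.g. "/archive/2013/11/21/some-post.aspx/"
    var parts = line.Split(',')[0].Trim('/').Split('/');
    if (parts.Length < 5) continue;
    string year = parts[1], month = parts[2], day = parts[3], slug = parts[4];

    // Find the post with this slug, whatever (wrong) date it was given.
    var post = posts.FirstOrDefault(p =>
        Path.GetFileName(p).EndsWith("-" + slug + ".markdown"));
    if (post == null) continue;

    var fixedName = year + "-" + month + "-" + day + "-" + slug + ".markdown";
    File.Move(post, Path.Combine(Path.GetDirectoryName(post), fixedName));
    // The date in the YAML front matter needs the same fix (omitted here).
}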

Hopefully this fixes most of my bad URLs. Of course, if anyone linked to the broken URL in the interim, they're kind of hosed in that regard.

I apologize for the inconvenience if you couldn't find the content you were looking for and am happy to refund anyone's subscription fees to Haacked.com (up to a maximum of $0.00 per person).

jekyll

In my last post, I wrote about preserving URLs when migrating to Jekyll. In this post, I show how to preserve your Disqus comments.

This ended up being a little bit trickier. By default, Disqus stores comments keyed by URL. So if people create Disqus comments at http://example.com/foo.aspx, you need to preserve that exact URL in order for those comments to keep showing up.

In my last post, I showed how to preserve such a URL, but it's not quite exact. With Jekyll, I can get a request to http://example.com/foo.aspx to redirect to http://example.com/foo.aspx/. Note that trailing slash. To Disqus, these are two different URLs and thus my comments for that page would not load anymore.

Fortunately, Disqus allows you to set a Disqus Identifier that it uses to look up a page's comment thread. For example, if you view source on a migrated post of mine, you'll see something like this:

<script type="text/javascript">
  var disqus_shortname = 'haacked';

  var disqus_identifier = '18902';
  var disqus_url = 'http://haacked.com/archive/2013/10/28/code-review-like-you-mean-it.aspx/';

  // ...omitted
</script>

The disqus_identifier can pretty much be any string. Subtext, my old blog engine, set this to the database generated ID of the blog post. So to keep my post comments, I just needed to preserve that as I migrated over to Jekyll.

So what I did was add my own field to my migrated Jekyll posts. You can see an example by clicking edit on one of the older posts. Here's the YAML front matter for that post.

---
layout: post
title: "Code Review Like You Mean It"
date: 2013-10-28 -0800
comments: true
disqus_identifier: 18902
categories: [open source,github,code]
---

This adds a new disqus_identifier field that can be accessed in the Jekyll templates. Unfortunately, the default templates you'll find in the wild (such as the Octopress ones) won't know what to do with this. So I updated the disqus.html Jekyll template include that comes with most templates. You can see the full source in this gist.

But here's the gist of that gist:

var disqus_identifier = '{% if page.disqus_identifier %}{{ page.disqus_identifier}}{% else %}{{ site.url }}{{ page.url }}{% endif %}';
var disqus_url = '{{ site.url }}{{ page.url }}';

If your current blog engine doesn't explicitly set a disqus_identifier, the identifier is the exact URL where the comments are hosted. So you could set the disqus_identifier to that for your old posts and leave it empty for your new ones.

jekyll

In my last post, I wrote about migrating my blog to Jekyll and GitHub Pages. Travis Illig, a long-time Subtext user, asked me the following question:

The only thing I haven't really figured out is how to nicely handle the redirect from old URLs (/archive/blah/something.aspx) to the new ones without extensions (/archive/blah/something/). I've seen some meta redirect stuff combined with JavaScript but... UGH.

UGH Indeed! I decided not to bother with changing my existing URLs to be extensionless. Instead, I focused on preserving my existing permalinks by structuring my posts such that they preserved their existing URLs.

How did I do this? My old URLs have an ASP.NET .aspx extension. Surely, GitHub Pages won't serve up ASPX files. This is true. But what it will serve up is a folder that just happens to have a name that ends with ".aspx".

The trick is in how I named the markdown files for my old posts. For example, check out a recent post: 2013-11-20-declare-dont-tell.aspx.markdown

Jekyll takes the part after the date and before the .markdown extension and uses that as the post's URL slug. In this case, the "slug" is declare-dont-tell.aspx.

The way it handles extensionless URLs is to create a folder with the slug name (in this case, a folder named declare-dont-tell.aspx) and create the blog post as a file named index.html in that folder. Simple.

Thus the URL for that blog post is http://haacked.com/archive/2013/11/20/declare-dont-tell.aspx/. But here's the beautiful part. GitHub Pages doesn't require that trailing slash. So if you make a request for http://haacked.com/archive/2013/11/20/declare-dont-tell.aspx, everything still works! GitHub simply redirects you to the version with the trailing slash.

Meanwhile, all my new posts from this point on will have a nice clean extensionless slug without breaking any permalinks for my old posts.

blogging, jekyll

The older I get, the less I want to worry about hosting my own website. Perhaps this is the real reason for the rise of cloud hosting. All of us old fogeys became too lazy to manage our own infrastructure.

For example, a while back my blog went down and as I frantically tried to fix it, I received this helpful piece of advice from Zach Holman.

@haacked the ops team gets paged when http://zachholman.com is down. You still have a lot to learn, buddy.

Indeed. Always be learning.

What Zach refers to is the fact that his blog is hosted as a GitHub Pages repository. So when his blog goes down (ostensibly because GitHub Pages is down), the superheroes of the GitHub operations team jump into action to save the day. These folks are amazing. Why not benefit from their expertise?

So I did.

One of the beautiful things about GitHub Pages is that it supports Jekyll, a simple, blog-aware static site generator.

If you can see this blog post, then the transition of my blog over to Jekyll is complete and (mostly) successful. The GitHub repository for this blog is located at https://github.com/haacked/haacked.com. Let me know if you find any issues. Or better yet, click that edit button and send me a pull request!

Screen grab from the 1931 movie Dr. Jekyll and Mr. Hyde, public domain

There are two main approaches you can take with Jekyll. In one approach, you can use something like Octopress to generate your site locally and then deploy the locally generated output to a gh-pages branch. Octopress has a nice set of themes (my new design is based off of the Greyshade theme) and plugins you can take advantage of with this approach. The downside of that approach is you can't publish a blog post solely through GitHub.com the website.

Another approach is to use raw Jekyll with GitHub pages and let GitHub Pages generate your site when your content changes. The downside of this approach is that for security reasons, you have a very limited set of Jekyll plugins at your disposal. Even so, there's quite a lot you can do. My blog is using this approach.

This allows me to create and edit blog posts directly from the web interface. For example, every blog post has an "edit" link. If you click on that, it'll fork my blog and take you to an edit page for that blog post. So if you're a kind soul, you could fix a typo and send me a pull request and I can update my blog simply by clicking the Merge button.

Local Jekyll

Even with this latter approach, I found it useful to have Jekyll running locally on my Windows machine in order to test things out. I just followed the helpful instructions on this GitHub Help page. If you are on Windows, you will inevitably run into some weird UTF Encoding issue. The solution is fortunately very easy.

Migrating from Subtext

Previously, I hosted my blog using Subtext, a database driven ASP.NET application. In migrating to Jekyll, I decided to go all out and convert all of my existing blog posts into Markdown. I wrote a hackish ugly console application, Subtext Jekyll Exporter, to grab all the blog post records from my existing blog database.

The app then shells out to Pandoc to convert the HTML for each post into Markdown. This isn't super fast, but it's a one time only operation.
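
Shelling out to Pandoc looks more or less like this; postHtml and markdownPath are stand-ins for the exporter's actual variables, and the exact flags it passes may differ:

using System.Diagnostics;
using System.IO;

// Pipe a post's HTML through Pandoc and capture the Markdown output.
// Assumes pandoc is on the PATH.
var startInfo = new ProcessStartInfo
{
    FileName = "pandoc",
    Arguments = "-f html -t markdown",
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
    UseShellExecute = false
};

using (var pandoc = Process.Start(startInfo))
{
    pandoc.StandardInput.Write(postHtml);      // postHtml: the post's HTML (stand-in)
    pandoc.StandardInput.Close();
    string markdown = pandoc.StandardOutput.ReadToEnd();
    pandoc.WaitForExit();
    File.WriteAllText(markdownPath, markdown); // markdownPath: target file (stand-in)
}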

If you have a blog stored in a database, you can probably modify the Subtext Jekyll Exporter to create the markdown post files for your Jekyll blog. I apologize for the ugliness of the code, but I have no plans to maintain it as it's done its job for me.

The Future of Subtext

It's with heavy heart that I admit publicly what everyone has known for a while. Subtext is done. None of the main contributors, myself included, have made a commit in a long while.

I don't say dead because the source code is available on GitHub under a permissive open source license. So anyone can take the code and continue to work on it if necessary. But the truth is, there are much better blog engines out there.

I started Subtext with high hopes eight years ago. Despite a valiant effort to tame the code, what I learned in that time was that I should have started from scratch.

I was heavily influenced by this blog post from Joel Spolsky, Things You Should Never Do.

Well, yes. They did. They did it by making the single worst strategic mistake that any software company can make:

They decided to rewrite the code from scratch.

Perhaps it is a strategic mistake for a software company, but I'm not so sure the same rules apply to an open source project done in your spare time.

So much time and effort was sacrificed at the altar of backwards compatibility as we moved mountains to make the migration from previous versions to next continue to work while trying to refactor as much as possible. All that time dealing with the past was time not spent on innovative new features. I was proud of the engineering we did to make migrations work as well as they did, but I'm sad I never got to implement some of the big ideas I had.

Despite the crap ton of hours I put into it, so much so that it strained my relationship at times, I don't regret the experience at all. Working on Subtext opened so many doors for me and sparked many lifelong friendships.

So long Subtext. I'll miss that little submarine.

code, rx

Judging by the reaction to my Death to the If statement post, where I talked about the benefits of declarative code and reducing control statements, not everyone is on board with this concept. That’s fine, I don’t lose sleep over people being wrong.

Photo by Grégoire Lannoy CC BY 2.0

My suspicion is that the reason people don’t have the “aha! moment” is because examples of “declarative” code are too simple. This is understandable because we’re trying to get a concept across, not write the War and Peace of code. A large example becomes unwieldy to describe.

A while back, I tried to tackle this with an example using Reactive Extensions. Imagine the code you would write to handle both the resize and relocation of a window, where you want to save the position to disk, but only after a certain interval has passed since the last of either event.

So you resize the window, then before the interval has passed, you move the window. Only after you stop moving and resizing it for the full interval does it save to disk.

Set aside your typical developer bravado and think about what that code looks like in a procedural or object oriented language. You functional reactive programmers can continue to smirk smugly.

The code is going to be a bit gnarly. You will have to write bookkeeping code such as saving the time of the last event so you can check that the duration has passed. This is because you’re telling the computer how to throttle.
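
To make that concrete, here’s roughly what the hand-rolled version looks like in WPF. A timer you restart on every event is one common way to write the bookkeeping (this is a sketch, and only one of several ways to do it):

// Inside a Window subclass. DispatcherTimer lives in System.Windows.Threading.
// Restart a five-second timer every time the window resizes or moves;
// only save once the timer finally gets a chance to fire.
var saveTimer = new DispatcherTimer { Interval = TimeSpan.FromSeconds(5) };
saveTimer.Tick += (sender, args) =>
{
    saveTimer.Stop();
    SavePlacement();
};

SizeChanged += (sender, args) => { saveTimer.Stop(); saveTimer.Start(); };
LocationChanged += (sender, args) => { saveTimer.Stop(); saveTimer.Start(); };

It works, but the timer is now state you own: you have to remember to stop it, restart it, and tear it down when the window closes.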

With declarative code, you more or less declare what you want. “Hey! Give me a throttle please!” (Just because you are declaring, it doesn’t mean you can’t be polite. I like to add a Please suffix to all my methods). And declarations are much easier to compose together.

This example is one I wrote about in my post Make Async Your Buddy with Reactive Extensions. But I made a mistake in the post. Here’s the code I showed as the end result:

Observable.Merge(
    Observable.FromEventPattern
      <SizeChangedEventHandler, SizeChangedEventArgs>
        (h => SizeChanged += h, h => SizeChanged -= h)
        .Select(e => Unit.Default),
    Observable.FromEventPattern<EventHandler, EventArgs>
        (h => LocationChanged += h, h => LocationChanged -= h)
        .Select(e => Unit.Default)
).Throttle(TimeSpan.FromSeconds(5), RxApp.DeferredScheduler)
.Subscribe(_ => this.SavePlacement());

I’ll give you a second to recover from USBS (Ugly Syntax Blindness Syndrome).

The code isn’t incorrect, but there’s a lot of noise in here due to the boilerplate expressions used to convert an event into an observable sequence of events. I think this detracted from my point.

So today, I realized I should add a couple of really simple extension methods that describe what’s going on and hides the boilerplate.

// Returns an observable sequence of a framework element's
// SizeChanged events.
public static IObservable<SizeChangedEventArgs>
    ObserveResize(this FrameworkElement frameworkElement)
{
  return Observable.FromEventPattern
    <SizeChangedEventHandler, SizeChangedEventArgs>(
        h => frameworkElement.SizeChanged += h,
        h => frameworkElement.SizeChanged -= h)
      .Select(ep => ep.EventArgs);
}

// Returns an observable sequence of a window's 
// LocationChanged events.
public static IObservable<EventArgs>
    ObserveLocationChanged(this Window window)
{
  return Observable.FromEventPattern<EventHandler, EventArgs>(
      h => window.LocationChanged += h,
      h => window.LocationChanged -= h)
    .Select(ep => ep.EventArgs);
}

This then allows me to rewrite the original code like so:

this.ObserveResize()
  .Merge(this.ObserveLocationChanged())
  .Throttle(TimeSpan.FromSeconds(5), RxApp.MainThreadScheduler)
  .Subscribe(_ => SavePlacement());

That code is much easier to read and understand, and it avoids the plague of USBS (unless you’re a Ruby developer, in which case you have a high sensitivity to USBS).

The important part is we don’t have to maintain tricky bookkeeping code. There’s no code here that keeps track of the last time we saw one or the other event. Here, we just declare what we want and Reactive Extensions handles the rest.

This is what I mean by declare, don’t tell. We don’t tell the code how to do its job. We just declare what we need done.

UPDATE: ReactiveUI (RxUI) 5.0 has an assembly Reactive.Events that maps every event to an observable for you! For example:

control.Events()
  .Clicked
  .Subscribe(_ => Console.WriteLine("foo"));

That makes things much easier!

code

Not long ago I wrote a blog post about how platform restrictions harm .NET. This led to a lot of discussion online and on Twitter. At some point David Kean suggested a more productive approach would be to create a UserVoice issue. So I did and it quickly gathered a lot of votes.

I’m visiting Toronto right now, so I’ve been off the Internet all day and missed all the hubbub when it happened. I found out about it when I logged into Gmail and saw an email saying that the UserVoice issue I created had been closed. My initial angry knee-jerk reaction was “What?! How could they close this without addressing it?!” as I furiously clicked on the subject to read the email and follow the link to this post.

Bravo!

Serious Kudos to the .NET team for this. It looks like most of the interesting PCL packages are now licensed without platform restrictions. As an example of how this small change sends out ripples of goodness, we can now make Octokit.net depend on portable HttpClient and make Octokit.net itself more cross platform and portable without a huge amount of work.

I’m also excited about the partnership between Microsoft and Xamarin this represents. I do believe C# is a great language for cross-platform development and it’s good to see Microsoft jumping back on board with this. This is a marked change from the situation I wrote about in 2012.

code

Over the past few years I’ve become more and more interested in functional programming concepts and the power, expressiveness, and elegance they hold.

But you don’t have to abandon your language of choice and wander the desert eating moths and preaching the gospel of F#, Haskell, or Clojure to enjoy these benefits today!

In his blog post, Unconditional Programming, Michael Feathers ponders how fewer control structures lead to better code,

Control structures have been around nearly as long as programming but it's hard for me to see them as more than an annoyance. Over and over again, I find that better code has fewer if-statements, fewer switches, and fewer loops. Often this happens because developers are using languages with better abstractions. They aren't consciously trying to avoid control structures but they do.

We don’t need to try and kill every if statement, but perhaps the more we do, the better our code becomes.

Photo from Wikimedia: cover of If by the artist Mindless Self Indulgence

He then provides an example in Ruby of a padded “take” method.

…I needed to write a 'take' function to take elements from the beginning of an array. Ruby already has a take function on Enumerable, but I needed special behavior. If the number of elements I needed was larger than the number of elements in the array, I needed to pad the remaining space in the resulting array with zeros.

I recommend reading his post. It’s quite interesting. At the risk of spoiling the punch line, here’s the before code which makes use of a conditional...

  def padded_take ary, n
    if n <= ary.length
      ary.take(n)
    else
      ary + [0] * (n - ary.length)
    end
  end

… and here is the after code without the conditional. In this case, he pads the source array with just enough elements as needed and then does the take.

  def pad ary, n
    pad_length = [0, n - ary.length].max
    ary + [0] * pad_length
  end

  def padded_take ary, n
    pad(ary, n).take(n)
  end

I thought it would be interesting to translate the after code to C#. One thing to note about the Ruby code is that it always allocates a new array whether it’s needed or not.

Now, I haven’t done any benchmarks on it so I have no idea if that’s bad or not compared to how often the code is called etc. But it occurred to me that we could use lazy evaluation in C# and completely circumvent the need to allocate a new array while still being expressive and elegant.

I decided to write it as an extension method (I guess that’s similar to a Mixin for you Ruby folks?).

public static IEnumerable<T> PaddedTake<T>(
  this IEnumerable<T> source, int count)
{
  return source
    .Concat(Enumerable.Repeat(default(T), count))
    .Take(count);
}

This code takes advantage of some LINQ methods. The important thing to note is that Concat and Repeat are lazily evaluated. That’s why I didn’t need to do any math to figure out the difference in length between the source array and the take count.

I just passed the total count we want to take to Repeat. Since Repeat is lazy, we could pass in int.MaxValue if we wanted to get all crazy up in here. I just passed in count as it will always be enough and I like to play it safe.

Now my Ruby friends at work might scoff at all those angle brackets and parentheses in the code, but you have to admit that it’s an elegant solution to the original problem.

Here is a test to demonstrate usage and show it works.

var items = new[] {1, 2, 3};

var result = items.PaddedTake(5).ToArray();

Assert.Equal(5, result.Length);
Assert.Equal(1, result[0]);
Assert.Equal(2, result[1]);
Assert.Equal(3, result[2]);
Assert.Equal(0, result[3]);
Assert.Equal(0, result[4]);

I also ran some quick perf tests comparing PaddedTake to the built-in Take. PaddedTake is a tiny bit slower, but the difference is like the extra light cast by a firefly at noon on a sunny day. The performance of this method is affected far more by the number of elements in the array and the number of elements you are taking. But in my tests, the performance of PaddedTake stays pretty close to Take as we grow the array and the take.

I think it’d be interesting to have a build task that reported back the number of `if` statements and other control structures per line of code, so you could see if you can bring that number down over time; a crude sketch of the idea follows. In any case, I hope this helps you improve your own code!
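
Here’s what such a crude census might look like; a real build task would want Roslyn so it doesn’t count keywords inside strings and comments, but a regex sweep is enough for a rough trend:

using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

class ControlStructureCensus
{
    static void Main(string[] args)
    {
        // args[0] is the root of the source tree to scan.
        var keyword = new Regex(@"\b(if|switch|for|foreach|while)\s*\(");

        foreach (var file in Directory.EnumerateFiles(
            args[0], "*.cs", SearchOption.AllDirectories))
        {
            var lines = File.ReadAllLines(file);
            int count = lines.Sum(line => keyword.Matches(line).Count);
            Console.WriteLine("{0}: {1} control structures in {2} lines",
                Path.GetFileName(file), count, lines.Length);
        }
    }
}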

open source, code

Octokit.net targets multiple platforms. This involves a large risk to my sanity. You can see the general approach here in the Octokit directory of our project:

 octokit-projects

Mono gets a project file! MonoAndroid gets a project file! MonoTouch gets a project file! Everybody gets a project file!

Each of these projects references the same set of .cs files. When I add a file to Octokit.csproj, I have to remember to add that file to the other four project files. As you can imagine, this is easy to forget.

It’s a real pain. So I opened up a feature request on FAKE, the tool we use for our build (more on that later) and asked them for a task that would fail the build if another project file in the same directory was missing a file from the “source” project file. I figured this would be something easy for F# to handle.

The initial response from the maintainer of FAKE, Steffen Forkman, was this:

What you need is a better project system ;-)

Touché!

This problem (along with so many project file merge conflicts) would almost completely go away with file patterns in project files. I’ve been asking for this for a long time (I asked the Visual Studio team for this the day I joined Microsoft, or maybe it was the first month, I don’t recall). There’s a UserVoice item requesting this, go vote it up! (Also, go vote up this platform restriction issue that’s affecting Octokit.net as well.)

In any case, sorry to say unlimited chocolate fountains don’t exist and I don’t have the project system I want. So let’s deal with it.

A few days later, I got this PR to octokit.net. When I ran the build, I saw the following snippet in the build output.

Running build failed.
Error:
System.Exception: Missing files in  D:\Octokit\Octokit-MonoAndroid.csproj:
Clients\OrganizationMembersClient.cs

That’s telling me that somebody forgot to add the class OrganizationMembersClient.cs to the Octokit-MonoAndroid.csproj. Wow! Isn’t open source grand?
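
If you’re curious what that check boils down to, here’s a rough C# equivalent of the idea (the real implementation is F# inside FAKE; this is just a sketch):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

static class ProjectFileCheck
{
    static readonly XNamespace MsBuild =
        "http://schemas.microsoft.com/developer/msbuild/2003";

    // Pull the <Compile Include="..." /> entries out of a project file.
    static IEnumerable<string> CompiledFiles(string projectPath)
    {
        return XDocument.Load(projectPath)
            .Descendants(MsBuild + "Compile")
            .Select(e => (string)e.Attribute("Include"));
    }

    static void Main()
    {
        var source = new HashSet<string>(CompiledFiles(@"Octokit\Octokit.csproj"));

        // Every sibling project should compile the same set of .cs files.
        foreach (var project in Directory.GetFiles("Octokit", "Octokit-*.csproj"))
        {
            var missing = source.Except(CompiledFiles(project)).ToList();
            if (missing.Any())
                throw new Exception("Missing files in " + project + ":\n" +
                    string.Join("\n", missing));
        }
    }
}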

A big thanks to Steffen and other members of the FAKE community who pitched in to build a small but very useful feature. In a follow-up post, I’ll write a little bit about why we moved to using FAKE to build Octokit.net.

Update

I opened an issue to take this to the next step. Rather than just verify the project files, I want some way to automatically modify or generate them.

Update 2

FAKE just got even better with the new FixProjects task! For now, we’ve added this as an explicit command.

.\build FixProjects

Over time, we may just integrate this into the Octokit.net build directly.

open source, code, github

Most developers are aware of the potential pitfalls of premature optimization and premature generalization. At least I hope they are. But what about premature standardization, a close cousin to premature generalization?

It’s human nature. When patterns emerge, they tempt people to drop everything and put together a standard to codify the pattern. After all, everyone benefits from a standard, right? Isn’t it a great way to ensure interoperability?

Yes, standards can be helpful. But to shoehorn a pattern into a standard prematurely can stifle innovation. New advances are often evolutionary. Multiple ideas compete and the best ones (hopefully) gain acceptance over time while the other ideas die out from lack of interest.

Once standardization is in place, people spend so much energy on abiding by the standard rather than experimenting with alternative ideas. Those who come up with alternative ideas get mocked for not following “the standard.” This is detrimental.

In his Rules of Standardization, Yaron Goland suggests that before we adopt a standard,

The technology must be very old and very well understood

He proposes twenty years as a good rule of thumb. He also suggests that,

Standardizing the technology must provide greater advantage to the software writing community then keeping the technology incompatible

This is a good rule of thumb to contemplate before one proposes a standard.

Social Standards

So far, I’ve focused on software interoperability standards. Software has a tendency to be a real stickler when it comes to data exchange. If even one bit is out of place, software loses its shit.

For example, if my code sends your code a date formatted as ISO 8601, but your code expects a date in Unix Time, stuffs gonna be broke™.

But social standards are different. By a “social standard” I mean a convention of behavior among people. And the thing about people is we’re pretty damn flexible, Hacker News crowd notwithstanding.

Rather than being enforced by software or specifications, social standards tend to be enforced through the use of encouragement, coercion, and shaming.

Good social standards are not declared so much as they emerge based on what people do already. If people converge on a standard, then it becomes the standard. And it’s only the standard so long as people adopt it.

This reminds me of a quote by W.L. Gore & Associates’ CEO, Terri Kelly on leadership at a non-hierarchical company,

If you call a meeting, and no one shows up, you’re probably not a leader, because no one is willing to follow you.

Standard GitHub issue labels?

I wrote a recent tweet to announce a label that the Octokit team uses to denote low hanging fruit for new contributors,

For those looking to get started with .NET OSS and http://Octokit.net, we tag low hanging fruit as "easy-fix". https://github.com/octokit/octokit.net/issues?labels=easy-fix

It was not my intention to create a new social standard.

Someone asked me why we didn’t use the “jump in” label proposed by Nik Molnar,

The idea for a standardized issue label for open source projects came from the two pieces of feedback I consistently hear from would-be contributors:

  1. “I’m not sure where to start with contributing to project X.”
  2. “I’ll try to pick off a bug on the backlog as soon as I’ve acquainted myself enough with the codebase to provide value.”

In the comments to that blog post, Glenn Block notes that the ScriptCS project is using the “YOU TAKE IT” label to accomplish the same thing.

About two and a half years earlier, I blogged about the “UpForGrabs” label the NuGet team was using for the same reason.

As you can see, multiple people over time have had the same idea. So the question was raised to me, would I agree that “standardizing” a label to invite contributors might be a good thing?

To rephrase one of Goland’s rules of standardization,

A social standard must provide greater advantage to the software community than just doing your own thing.

This is a prime example of a social standard and in this case, I don’t think it provides a greater advantage than each project doing its own thing. At least not yet. If one arises naturally because everyone thinks it’s a great idea, then I’m sold! But I don’t think this is something that can just be declared to be a standard. It requires more experimentation.

I think the real problem is that these labels are just not descriptive enough. One issue I have with Up For Grabs, You Take It, and Jump In is that they seem too focused on giving commands to the potential contributor: “HEY! YOU TAKE IT! WE DON’T WANT IT!” They’re focused on the relationship of the core team to the issue. I think the labels should describe the issue and not how the core team wants new contributors to interact with the issue.

What makes an issue appeal to a new contributor is different from contributor to contributor. So rather than a generic “UpForGrabs” label, I think a set of labels that are descriptive of the issue makes sense. People can then self-select the issues that appeal to them.

For many new contributors, an issue labeled as “easy-fix” is going to appeal to their need to dip their toe into OSS. For others, issues labeled as “docs-and-samples” will fit their abilities better.

So far, I’ve been delighted that several brand new OSS contributors sent us pull requests. It far surpassed my expectations. Of course, I don’t have a control Octokit.net project with different labels, so I can’t rightly attribute it to the labels. Science doesn’t work that way. Even if I did, I doubt the labels made much of any difference here.

Again, this is not an attempt to propose a new standard. This is just an approach we’re experimenting with in Octokit.net. If you like this idea, please steal it. If you have a better idea, I’d love to hear it!

github

Today on the GitHub blog, we announced the first release of Octokit.net.

Octokit is a family of client libraries for the GitHub API. Back in May, we released Octokit libraries for Ruby and Objective-C.

Today we're releasing the third member of the Octokit family, Octokit.net, the GitHub API toolkit for .NET developers.

octokit-dotnet

GitHub provides a powerful set of tools for developers who build amazing software together. But these tools extend way beyond the website and Git clients.

The GitHub API provides a rich, web-based way to leverage GitHub.com within your own applications. The Octokit family of libraries makes it easy to call into the API. I can’t wait to see what you build with it.
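
To give you a taste, here’s about the smallest useful thing you can do with it. This is against the current shape of the library, so the exact API surface may shift in future releases:

// Inside an async method.
// Fetch a GitHub user and print a bit about them.
var github = new GitHubClient(new ProductHeaderValue("my-sample-app"));
var user = await github.User.Get("haacked");
Console.WriteLine("{0} has {1} followers!", user.Name, user.Followers);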

The Project

Octokit.net is an open source project on GitHub so feel free to contribute with pull requests, issues, etc. You’ll notice that we call it a 0.1.0 release. As of today, it doesn’t implement every API endpoint that GitHub.com supports.

We wanted to make sure that it was in use by a real application so we focused on the endpoints that GitHub for Windows needs. If there’s an endpoint that is not implemented, please do log an issue. Or even better, send a pull request!

Our approach in implementing this library was to avoid being overly speculative. We tried to implement features as we needed them based on developing a real production application.

But now that it’s in the wild, we’re curious to see what other types of applications will need from the library.

Platform and Licensing Details

Octokit.net is licensed under the MIT license.

As of today, Octokit.net requires .NET 4.5. We also have a WinRT library for .NET 4.5 Core. This is because we build on top of HttpClient, which is not available in .NET 4.0.

There is a Portable HttpClient package that does work for .NET 4.0, but we won’t distribute it because it has platform limitations that are incompatible with our license.

I had hoped that its platform limitations would have been removed by now, but that sadly is not the case. If you’re wondering why that matters, read my post here.

However, if you check the repository out, you’ll notice that there’s a branch named haacked/portable-httpclient. If you only plan to deploy on Windows, you can build that branch yourself and make use of it.

Go Forth And Build!

I’ve had great fun working with my team at GitHub on Octokit.net the past few weeks. I hope you have fun building amazing software that extends GitHub in ways we never imagined. Enjoy!