
Today the soccer team I started playing with had their last game of the season. This was only my second game with them and we were playing the first place team in the league. This team was much slower than our last opponent and not as skillful, but was known for playing very dirty.

Fortunately on this day, the ref ran a tight ship and a nice game of soccer ensued. At least nice for the other team who proceeded to pound us for five goals to our two. We started off strong, but with no subs to speak of, the second half found us weary and unable to keep up.

The highlight for me was putting the ball in the back of the net in my second game with this team. The play involved flicking the ball over the defender and taking a shot off the bounce. They invited me to join them when the season starts in January. Hopefully by then I’ll have some fitness to contribute.


if you’ve watched da show yous probably ave wondered, ow can i attract da wicked bitches dig dat awesome omeboy? da secret is to learn to bang dig im? well in da house’s your chance to learns da ons and out of ali g-speak. respek!

In English, that translates to…

If you’ve watched the show you have probably wondered, how can I attract the lovely ladies like that awesome homeboy? The secret is to learn to speak like him. Well, here’s your chance to learn the ins and outs of Ali G-speak. Respek!

Check it out, the Ali G translator.


I recently answered a question about ASP.NET deployment in a newsgroup, where the person asked which files he should deploy when moving his site to a production server.

As a followup to my answer, Jon Galloway pointed the person to a neat deployment utility called UnleashIt.

UNLEASHit: Ready to deploy, Sir!

UnleashIt integrates with VS.NET 2003 as an add-in. You can create deployment profiles and share them with other team members. I plan to use this for any customizations I make to my .TEXT blog.

So why not just use Visual Studio’s copy project option? I’ve never used it but Jon had this to say:

Visual Studio has a copy project option for web projects, but it depends on your setup and you may miss files (javascript, css, images).

As usual, I have a few minor complaints as I’m just a nitpicky person. The first is that the application is not resizable. The second is that the fonts on the main screen seem smaller than in other applications.

More problematic is that the application doesn’t seem to support adding file masks. Currently the application is missing *.asmx and *.ashx, but more importantly it would be nice to create a deployment profile using this tool that could handle Word docs (for example) if they were a part of the site.


If you haven’t heard, RSS Bandit can synchronize its state (feedlist, read/unread, etc…) across multiple machines. I wrote about it in the RSS Bandit docs.

So far, there are four means for synchronizing feeds: FTP, dasBlog, local or network file, and WebDAV. For the average user, these options might not always be available.

However, using GMail Drive Shell Extension, you can create a local drive letter that maps to your GMail account. Then in RSS Bandit, open up the properties dialog, click on the Remote Storage Tab, choose the File Share protocol and enter the GMail drive in the UNC directory path (it doesn’t have to be UNC). In the screenshot below, I have the e: drive mapped to my GMail account.

Remote Storage Tab

Now you can use your GMail account for synchronizing your RSS Bandit state between multiple machines. Note that this usage of GMail is supported by neither Google nor the developers of RSS Bandit. So if Google suddenly decides to disrupt this usage of GMail, you’ve been warned.

As you can see in the RSS Bandit Roadmap, there will be support for more synchronization sources in the next major release.


There’s a lot of focus these days on SOAP vs REST and the proliferation of WS-* specifications. Sometimes you wonder if WS-* solves problems that aren’t all that common or have already been solved.

For example, some in the REST camp will say HTTP has security built in. It’s called SSL. Why not use it instead of building WS-Security?

Another example is WS-Addressing. This places addressing information within the SOAP envelope so that the message can be delivered via transports other than HTTP. At first glance, I wonder how often this will be useful for web services when HTTP is the predominant mode of transport.

However, Pat Caldwell illustrates a real world scenario in which WS-Addressing solved a real need that REST couldn’t and doesn’t address.

REST has its place, but for some of those nitty gritty situations, SOAP keeps everything clean.


Adam tells me he supports neither comments nor the CommentAPI because he doesn’t want to deal with comment spam. So the day after admonishing him for being anti-social, I get hit by a slew of comment spam pointing to porn sites and selling Ginsu knives. Did you know those things can cut right through a can?

I removed the offensive comments. Don’t worry, as a duty to my readers I checked out the porn sites and you’re not missing anything.

This torrent of comment spam means only one thing. I have arrived!


Pat Gannon (no blog) makes a great point in the comments on my post about using regular expressions to parse HTML. He says:

Just to play devil’s advocate for a minute, it seems like HTML is just too darned close to XML to have to parse this way. Isn’t there a library out there for converting HTML into XHTML? If you can do that, you can just read the file in using XmlDocument::LoadXml(). Once you’ve done that, you can find your tags using an XPath query. Sorry, I just couldn’t let a parsing post go by without tossing in my two cents ;)

In fact, there are two approaches to this. The first recognizes that HTML is really just an application of SGML. Thus if you have an SGML parser, you’re done. So one option is to try Chris Lovett’s SgmlReader.

This is what the current version of RSS Bandit uses for auto-discovery of RSS feeds within HTML content. However, I recently replaced it with regular expressions because of some memory use and performance problems we were having with it. In our case, finding these tags with a regular expression is a lot faster and uses less memory. (Now you see the motivation for the post.)

Another option is to use Simon Mourier’s HTML Agility Pack. He takes an interesting approach in that he provides an HtmlDocument class that implements System.Xml.XPath.IXPathNavigable. Thus his approach provides the same interface as an XmlDocument for querying nodes, but doesn’t change the underlying HTML content as many other approaches would by converting them to XML.
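To make Pat’s suggestion concrete, here is a minimal sketch of the XmlDocument-plus-XPath approach. It assumes the HTML has already been converted to well-formed XHTML with no DOCTYPE and no default namespace (a namespaced document would need an XmlNamespaceManager). The class name, method name, and sample markup are made up for illustration; this is not code from RSS Bandit.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XhtmlXPathDemo
{
    // Returns the href of every <link> element that advertises an RSS feed.
    // Assumption: the input is well-formed XHTML without a default namespace.
    public static List<string> FindFeedLinks(string xhtml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed

        List<string> hrefs = new List<string>();
        // The XPath query selects the href attribute nodes directly.
        foreach (XmlNode href in doc.SelectNodes("//link[@type='application/rss+xml']/@href"))
        {
            hrefs.Add(href.Value);
        }
        return hrefs;
    }

    public static void Main()
    {
        // Hypothetical sample page, already in XHTML form.
        string page =
            "<html><head><title>Test</title>" +
            "<link rel=\"alternate\" type=\"application/rss+xml\" href=\"/rss.xml\" />" +
            "</head><body><p>Hello</p></body></html>";

        foreach (string href in FindFeedLinks(page))
        {
            Console.WriteLine(href);
        }
    }
}
```

The catch, of course, is the LoadXml call: real-world HTML is rarely well-formed, which is exactly why a conversion step or a more forgiving parser is needed first.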

And just to toot Pat’s horn a bit, I used to be his manager at Solien when he was just starting out in his career. Now he works at Univision and has inherited reams of code that parse through Fortran code as well as proprietary database files. He’s also written his own grammar engine and XML syntax for describing computer languages such as C#. So he knows a thing or two about parsing text. He’s become quite a top-notch developer. I’m just waiting for him to get off his arse and start a blog.


I just love regular expressions. I mean look at the sample below.

</?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?>

What’s not to like?

Ok I admit, I was a bit intimidated by regular expressions when I first started off as a developer. All I needed was a Substring method and an IndexOf method and I was set. But after a few projects that required some intense text processing, I realized the power and utility of regular expressions. They should be on the tool belt of every developer. To that end, I recommend Mastering Regular Expressions by Jeffrey Friedl. This is really THE book on Regular Expressions. Reading it will make your Regex-Fu powerful.

So let’s look at a common task of matching HTML tags within the body of some text. When you initially think to parse an HTML tag, it seems quite easy. You might consider the following expression:

</?\w+\s+[^>]*>

Roughly translated, this expression looks for the opening angle bracket and tag name, followed by some white-space and then anything that doesn’t end the tag.

Now this will probably work 99 times out of 100, but there’s a flaw in this expression. Do you see it? What if I asked you to match the following tag?

<img title="displays >" src="big.gif">

Hopefully you see the issue here. The expression will match

<img title="displays >

Unfortunately, this implementation is too naive. We have to consider the fact that the greater-than symbol does not end a tag if it’s within a quoted attribute value. Thus we must correctly match attributes.
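If you want to see the failure for yourself, here is a minimal sketch that runs the naive pattern (with the character class written as `[^>]`) against the troublesome tag. The class and method names are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NaiveTagDemo
{
    // Naive pattern: "<", optional "/", tag name, whitespace,
    // then anything up to the first ">".
    public const string NaivePattern = @"</?\w+\s+[^>]*>";

    // Returns the text of the first match, or an empty string if nothing matches.
    public static string FirstMatch(string html)
    {
        return Regex.Match(html, NaivePattern).Value;
    }

    public static void Main()
    {
        string tag = "<img title=\"displays >\" src=\"big.gif\">";

        // The match stops at the ">" inside the quoted title attribute,
        // so the src attribute is never reached.
        Console.WriteLine(FirstMatch(tag));
    }
}
```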

Now there are four possible formats for an HTML attribute:

name="double quoted value"

name='single quoted value'

name=notquotedvaluewithnowhitespace

name

Each of these cases is quite simple. In the first case, you could do the following:

\w+\s*=\s*"[^"]*"

The portion "[^"]*" matches a double quote, followed by any non double quote characters, followed by a double quote. Another way to express this is to use lazy evaluation as such:

\w+\s*=\s*".*?"

The portion ".*?" uses lazy evaluation (the “lazy star”) to match as few characters as possible. For example, if we had a string like so

<a name="test" value="test2">

evaluating ".*" (aka greedy) would match

"test" value="test2"

However using the lazy evaluation consumes the fewest characters that match the expression, thus the first match using ".*?" would be "test" and the second match is "test2".
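Here is a small sketch of the difference, assuming the sample tag has both attribute values quoted. The helper name AllMatches is just something I made up for the demo.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyVsGreedyDemo
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            values[i] = matches[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        string tag = "<a name=\"test\" value=\"test2\">";

        // Greedy: a single match running from the first quote to the last one.
        foreach (string m in AllMatches(tag, "\".*\""))
        {
            Console.WriteLine("greedy: " + m);
        }

        // Lazy: each quoted value is matched on its own.
        foreach (string m in AllMatches(tag, "\".*?\""))
        {
            Console.WriteLine("lazy:   " + m);
        }
    }
}
```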

The full expression for matching an HTML tag is that lovely mash of characters presented at the very beginning of this post. It’s a modified version of the one presented in Friedl’s book.
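As a quick sanity check, here is a minimal sketch that runs the full expression (with its last alternative written as `[^'">\s]+`) against the tag that tripped up the naive version. The class and method names are hypothetical, and this is only a demo, not the compiled library version.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FullTagDemo
{
    // The full tag pattern. In a C# verbatim string, each double quote
    // inside the pattern is written as two double quotes.
    public const string TagPattern =
        @"</?\w+((\s+\w+(\s*=\s*(?:"".*?""|'.*?'|[^'"">\s]+))?)+\s*|\s*)/?>";

    // Returns the text of the first tag found, or an empty string if there is none.
    public static string FirstMatch(string html)
    {
        return Regex.Match(html, TagPattern).Value;
    }

    public static void Main()
    {
        string tag = "<img title=\"displays >\" src=\"big.gif\">";

        // The quoted ">" is consumed as part of the title attribute,
        // so the whole tag is matched this time.
        Console.WriteLine(FirstMatch(tag));
    }
}
```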

However I wouldn’t recommend you just plunk that down in your code. Rather, you should consider adding it to a regular expression library assembly.

Don’t know how? Well, I’ll show you a code listing for an exe that, when run, builds a fully compiled version of this regular expression into an assembly that you can then reference in any project. In a later installment, I’ll explain in more detail just what the code is doing and how to use the compiled assembly. How irresponsible of me not to do that now. ;)

Source Listing


Weird. I did a Google search for an entry in my blog and one of the results was a Bloglines account that had my blog subscribed. I was basically seeing all the blogs that some Bloglines user was subscribed to. Is that a feature of Bloglines to expose your subscriptions like that? Or is that a privacy flaw?

UPDATE: Never mind. I’m just being paranoid. Bloglines supports public profiles.


Dare Obasanjo, the project lead on the RSS Bandit project (to which I contribute), is leaving his post as a Program Manager on the XML team at Microsoft to work as a Program Manager on the MSN Communication Services Platform team.

When Microsoft revealed a blogging service similar to Blogger, I had a feeling it was only a matter of time before Dare would somehow be involved with it, given his interest in Social Software.

It will be interesting to see the direction Microsoft takes with social software. Although Microsoft perhaps doesn’t see entering the aggregator market as a profit center, I wouldn’t be surprised if that changes in the next year or so.

As aggregation continues to take off, it seems natural to incorporate it into Office. Remember that “Information At Your Fingertips” mantra Mr. Gates touted a while ago? Well I get most of my online information through two sources, Google and RSS Bandit.

In any case, I wish Dare well. Hopefully this is the platform for him to have some of his ideas implemented. I have to admit, I’d love to work on social software such as RSS Bandit and .TEXT full time. But I have a mortgage to pay.


Since I like to stoke the fire of partisanship… This joke was sent to me by my friend Walter.

George Bush meets with the Queen of England. He asks her, “Your Majesty, how do you run such an efficient government? Are there any tips you can give to me?”

“Well,” says the Queen, “the most important thing is to surround yourself with intelligent people.”

Bush frowns. “But how do I know the people around me are really intelligent?”

The Queen takes a sip of tea. “Oh, that’s easy. You just ask them to answer an intelligent riddle.” The Queen pushes a button on her intercom. “Please send Tony Blair in here, would you?”

Tony Blair walks into the room. “Yes, my Queen?”

The Queen smiles. “Answer me this, please, Tony. Your mother and father have a child. It is not your brother and it is not your sister. Who is it?”

Without pausing for a moment, Tony Blair answers, “That would be me.”

“Yes! Very good,” says the Queen.

Bush goes back home to ask Dick Cheney, his vice president, the same question.

“Dick, answer this for me. Your mother and your father have a child. It’s not your brother and it’s not your sister. Who is it?”

“I’m not sure,” says Cheney, “let me get back to you on that one.”

Cheney goes to his advisors and asks every one, but none can give him an answer. Finally, he ends up in the men’s room and recognizes Colin Powell’s shoes in the next stall. Cheney shouts, “Colin! Can you answer this for me? Your mother and father have a child and it’s not your brother or your sister. Who is it?”

Colin Powell yells back, “That’s easy. It’s me!”

Cheney smiles, and says, “Thanks!” Then, Cheney goes back to speak with Bush. “Say, I did some research and I have the answer to that riddle. It’s Colin Powell.”

Bush gets up, stomps over to Cheney and angrily yells into his face, “No, you idiot! It’s Tony Blair!”


Scott Guthrie has returned to blogging with a tremendous piece on his team’s effort towards reaching “ZBB” or Zero Bug Bounce.

I’ve personally never worked on a software project as large as the ASP.NET 2.0 project, so it’s fascinating for me to read Scott’s description of the testing and check-in process. Typically, my check-in process is to get latest on any files I didn’t change, build, and run my unit tests. Assuming everything passes, I check in my files, get latest again, build, and run the unit tests again. If everything still passes, I’m done with the check-in. If all goes smoothly, it’s all done in under half an hour.

For the ASP.NET team, every check-in undergoes peer review and is run through a few hours of check-in test suites. They then run more exhaustive nightly tests over the product to catch issues in the latest builds. That’s pretty impressive.


Jonathan de Halleux, aka Peli, never ceases to impress me with his innovations within MbUnit. In case you’re not familiar with MbUnit, it’s a unit testing framework similar to NUnit.

The difference is that while NUnit seems to have stagnated, Jonathan is constantly innovating new features, test fixtures, etc… for a complete unit testing solution. In fact, he’s even made it so that you can run your NUnit tests within MbUnit without a recompile.

His latest feature is not necessarily a mind blower, but it definitely will save me a lot of time writing the same type of code over and over for testing a range of values. I’ll just show you a code snippet and you can figure out what it’s doing for you.

 

[TestFixture]
public class DivisionFixture
{
    [RowTest]
    [Row(1000,10,100.0000)]
    [Row(-1000,10,-100.0000)]
    [Row(1000,7,142.85715)]
    [Row(1000,0.00001,100000000)]
    [Row(4195835,3145729,1.3338196)]
    public void DivTest(double num, double den, double res)
    {
        Assert.AreEqual(res, num / den, 0.00001 );
    }
}
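In case the attributes aren’t self-explanatory: as I read it, each Row supplies one set of arguments, and the test method is run once per row. Below is a rough hand-rolled equivalent that doesn’t use MbUnit at all, just to show the kind of boilerplate this saves. Everything in it (the class name, the Rows table, RowPasses as a stand-in for the tolerance check in Assert.AreEqual) is my own assumption for illustration, not how MbUnit is implemented.

```csharp
using System;

public static class RowTestSketch
{
    // Each inner array plays the role of one [Row(num, den, res)] attribute.
    static readonly double[][] Rows =
    {
        new double[] { 1000, 10, 100.0000 },
        new double[] { -1000, 10, -100.0000 },
        new double[] { 1000, 7, 142.85715 },
        new double[] { 1000, 0.00001, 100000000 },
        new double[] { 4195835, 3145729, 1.3338196 }
    };

    // Stand-in for Assert.AreEqual(res, num / den, 0.00001):
    // passes when the quotient is within the tolerance of the expected result.
    public static bool RowPasses(double num, double den, double res)
    {
        return Math.Abs(res - num / den) <= 0.00001;
    }

    // Runs every row through the same check and returns the number of failures.
    public static int CountFailures()
    {
        int failures = 0;
        foreach (double[] row in Rows)
        {
            if (!RowPasses(row[0], row[1], row[2]))
            {
                failures++;
            }
        }
        return failures;
    }

    public static void Main()
    {
        Console.WriteLine("failures: " + CountFailures());
    }
}
```

With the row attributes, the loop and the data table disappear and each row shows up as its own test case.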

 

And if you’re anal like me and wondering why I chose “num” instead of “numerator” etc… Purely for blog formatting reasons. ;)

UPDATE: Jonathan points out that negative assertions are also supported. Here’s an illustrative code snippet. I can’t wait to try this out.

 

[RowTest]
[Row(1000,10,100.0000)]
...
[Row(1,0,0, ExpectedException = typeof(ArithmeticException))]
public void DivTest(double num, double den, double res)
{...}


Saw this on Gizmodo. It’s bigger and not as nice looking as an iPod, but it is 100 GB.

The DMC Xclef 500 also supports Ogg Vorbis and even WAV—with a 100GB drive, you could start ripping your CDs with no compression at all. The 100GB version is $450 from DMC’s online store.


My friend Michael, who lives in London for now, sent me this.

Once again, The Washington Post published its yearly contest in which readers are asked to supply alternate meanings for various words (& leave it to the Post to search for new meanings).

And the winners are …

1. Coffee (n.), a person who is coughed upon.

2. Flabbergasted (adj.), appalled over how much weight you have gained.

3. Abdicate (v.), to give up all hope of ever having a flat stomach.

4. Esplanade (v.), to attempt an explanation while drunk.

5. Willy-nilly (adj.), impotent.

6. Negligent (adj.), describes a condition in which you absentmindedly answer the door in your nightgown.

7. Lymph (v.), to walk with a lisp.

8. Gargoyle (n.), an olive-flavored mouthwash.

9. Flatulence (n.), the emergency vehicle that picks you up after you are run over by a steamroller.

10. Balderdash (n.), a rapidly receding hairline.

11. Testicle (n.), a humorous question on an exam.

12. Rectitude (n.), the formal, dignified demeanor assumed by a proctologist immediately before he examines you.

13. Oyster (n.), a person who sprinkles his conversation with Yiddish expressions.

14. Pokemon (n.), a Jamaican proctologist.

15. Frisbeetarianism (n.), the belief that, when you die, your soul goes up on the roof and gets stuck there.

16. Circumvent (n.), the opening in the front of boxer shorts.


I know a lot of people like to post pictures of their workspace online. Not sure why (vanity!), but they just do. So I thought I’d jump on that bandwagon and do the same.

This first picture shows my work office with its nice 17th floor view.

Work Office: Strange green bands take over the screens.

This next one is our home office.

If you look carefully, you’ll notice the hastily minimized porn application.

As you can see, the home setup is much nicer than the work setup, with dual 17” flat panel monitors and a slick looking aluminum Shuttle case. I wish my company would invest in nice monitors. My work monitors flicker, make me cross-eyed and spit in my food. If you look closely at the top picture, the computer case is literally held together with scotch tape on top. The IT department wouldn’t budget duct tape.

The little figurine on top is the “Buddy Christ” from Dogma. You can purchase that on Kevin Smith’s website. My wife painted the red shoe on the left.


Allow users to configure Google Desktop to search their GMail accounts. Most of my personal email isn’t going to be in Outlook. It’ll be in my web-based accounts.


After reading the reaction around the net about Google Desktop (GD for short), one common complaint I noticed is the use of a web browser for local searching. Why use a web browser to search locally and forego all the utility and benefits a rich client can provide?

So I thought I’d start trying out some of the free alternatives to GD. One that is mentioned quite often is Copernic. I decided to uninstall GD and give it a whirl.

So far, I’m not quite satisfied. I left it running overnight and it’s still not done indexing my hard drive. Not only that, while it’s indexing, my computer runs at a snail’s pace at times. I often have to restart it to reclaim my computer’s resources. In comparison, GD finished indexing within a much shorter time span with a nearly imperceptible impact on my computer’s performance.

One area where Copernic shines above GD is the UI. Copernic provides options for refining the search parameters just below the search input. When you search for emails, the search result window breaks down the results by date. You can see them grouped into emails received Today, Yesterday, Last Week, Last Month, This Year, and so on…

One shortcoming that both engines share is the inability to specify file types to search other than the preconfigured ones. For example, I would like to search my C# files that have the .cs extension. No can do.

So the search continues. I could shell out for X1, but I’d like to find a free product I can use at both work and home. I read about another product to try at home, but forgot its name. In any case, I’ll keep you posted.

UPDATE: Whoops! Apparently you can configure Copernic to index arbitrary file types through the advanced options dialog. Thanks to Eric for the tip.