Searching Open Source Code With Open Source
It’s my first week on the job and I’ve already got the keys to the company blog. I just published my first post at koders.com, announcing the latest set of site updates.
One thing that I was surprised to learn this week, though it really shouldn’t surprise me, is that Koders uses an open source search engine to create the full-text index. More specifically, it uses Lucene.NET, a port of the Java Lucene project.
I’m familiar with Lucene.NET because the Subtext and RSS Bandit projects both use it for searching (though I was not the one to implement it in either case). As far as I know, it is pretty much the de facto standard for open source search software on the .NET platform.
Of course Lucene.NET is only part of the Koders code search picture. It provides the full-text indexing, but if you use the Koders search engine, you’ll notice that there is some level of semantic analysis on top of the text index. Otherwise, you wouldn’t be able to search for method names and class names and such, not to mention the syntax highlighting when viewing code.
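To give a rough feel for what "semantic analysis on top of the text index" can mean, here is a minimal Python sketch. It is purely illustrative and is neither Koders' implementation nor the Lucene.NET API: the `CodeIndex` class, the field names (`text`, `class`, `method`), and the regular expressions are all assumptions made up for this example. The idea is simply that the same file gets indexed more than once, as plain tokens and as extracted class and method names, so a query can target one field or another.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately naive patterns for C#-like source.
# A real code search engine would use a proper parser per language.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*")
CLASS_RE = re.compile(r"\bclass\s+([A-Za-z_]\w*)")
METHOD_RE = re.compile(
    r"\b(?:public|private|protected|internal)\s+(?:static\s+)?"
    r"[\w<>\[\]]+\s+([A-Za-z_]\w*)\s*\("
)


class CodeIndex:
    """Toy inverted index with one posting list per (field, term)."""

    def __init__(self):
        # field name -> term -> set of document ids
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, source):
        # Plain full-text field: every identifier-like token.
        for token in TOKEN_RE.findall(source):
            self.postings["text"][token.lower()].add(doc_id)
        # "Semantic" fields: names that appear in a declaration position.
        for name in CLASS_RE.findall(source):
            self.postings["class"][name.lower()].add(doc_id)
        for name in METHOD_RE.findall(source):
            self.postings["method"][name.lower()].add(doc_id)

    def search(self, term, field="text"):
        # Return matching document ids in a stable order.
        return sorted(self.postings[field].get(term.lower(), set()))


index = CodeIndex()
index.add("Parser.cs", "public class Parser { public void Parse(string input) { } }")
index.add("Lexer.cs", "public class Lexer { private Parser parser; public int Next() { return 0; } }")

print(index.search("parser"))                # both files mention the word
print(index.search("parser", field="class")) # only the file that declares the class
```

A plain text search for `parser` matches both files, while restricting the query to the `class` field narrows it to the file that actually declares the class. That distinction is what a text index alone cannot give you.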
I’m still learning about all the layers and extensions Koders built on top of Lucene.NET. As I said in the post, those are probably topics I can’t write about in too much detail.
One thing I should point out is that code search is only part of the picture. Koders also has a pretty sweet code repository browser.
My favorite open source project is now in the index and can be viewed here.
On a side note, I recently talked about Search Driven Development in the theoretical sense, but I have already been able to put it to good use at Koders.com. In the great developer tradition of dogfooding, I needed to look at some code from home before I had my VPN set up. It was nice to be able to log in to Koders’ internal Enterprise Edition and find the snippet of code I needed.