Drinking From an RSS Fire Hose

So now that you’ve subscribed to your 4000 feeds, how do you keep on top of the flood of incoming items? Dare talks about this “attention problem” that faces power users of RSS (and ATOM) aggregators such as RSS Bandit.

Ideally a user should be able to tell a client, “Here are the sites I’m interested in, here are the topics I’m interested in, and now only show me stuff I’d find interesting or important”. This is the next frontier of features for RSS/ATOM aggregators and an area I plan to invest a significant amount of time in for the next version of RSS Bandit.

One way to think of it is that there’s a cacophony of content out there. You want an automated system to filter out the noise and allow through the music.

There are several difficulties inherent in any automated system designed to filter content based on your tastes and preferences. Oftentimes, you don’t really know what you like till you see it. So how does the automated filter know if you’re going to like something or not? Well, you can train it by rating items. You can perhaps incorporate the ratings of others. You can build rules about which content you like and dislike.
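To make "train it by rating items" a bit more concrete, here is a minimal sketch of a naive Bayes filter that learns from thumbs up and thumbs down ratings on item titles. Everything in it (the `RatingFilter` class, `rate`, `probability_liked`, the word-splitting rule) is made up for illustration. It is not how RSS Bandit works, just one plausible shape such a filter could take.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


class RatingFilter:
    """Hypothetical naive Bayes filter trained from like/dislike ratings."""

    def __init__(self):
        self.word_counts = {"like": Counter(), "dislike": Counter()}
        self.item_counts = {"like": 0, "dislike": 0}

    def rate(self, text, liked):
        """Record one rated item (liked=True for thumbs up)."""
        label = "like" if liked else "dislike"
        self.item_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, words, label, vocab_size):
        # Smoothed prior: how often items get this rating at all.
        total_items = sum(self.item_counts.values())
        score = math.log((self.item_counts[label] + 1) / (total_items + 2))
        # Laplace-smoothed likelihood of each word under this rating.
        total_words = sum(self.word_counts[label].values())
        for word in words:
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def probability_liked(self, text):
        """Estimate the chance that the reader would like this text."""
        words = tokenize(text)
        vocab = set(self.word_counts["like"]) | set(self.word_counts["dislike"])
        vocab_size = max(len(vocab), 1)
        like = self._log_score(words, "like", vocab_size)
        dislike = self._log_score(words, "dislike", vocab_size)
        # Turn the two log scores into a probability for "like".
        return 1.0 / (1.0 + math.exp(dislike - like))


if __name__ == "__main__":
    f = RatingFilter()
    f.rate("XML web services in .NET", liked=True)
    f.rate("Celebrity gossip roundup", liked=False)
    print(f.probability_liked("New XML tools"))
```

With only two rated items the estimates barely move away from 50/50, which is a small-scale version of the training set problem: the filter only knows the few words it has seen.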

All of these methods run into the same problem: your likes and dislikes evolve over time as a product of your life’s experience, while automated filters tend to narrow the items they allow through. If you set up hard rules for filtering data, you need to keep changing them over time, which becomes a constant task of tweaking. If you’re using a system that requires you to train it on an initial set of sample data, you have to make sure the training set is not too small or narrow. Otherwise the filter will only bring in items that meet some narrow facet of your personality. Collaborative filters are particularly prone to this problem. Think of how drab a lot of the music on your mega-radio stations is today, a result of a gigantic collaborative filter. I doubt I could train a human to filter items for me, much less an automated system.

This is not to say that filters can’t do a halfway decent job of being a personal editor. They can. The point here is that a really good filter has to allow a bit of noise through. My favorite radio station right now is KCRW, especially the show Metropolis. I don’t necessarily like everything Jason Bentley spins, but I’m constantly being introduced to music I’ve never heard of that I end up really enjoying. I believe that’s a result of a filtering system (a human DJ) that allows some bit of noise through in order to expose listeners to new items.

Some noise is essential to a good content filtration system.
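Here is a rough sketch of what letting noise through could look like in code: keep everything that scores above a threshold, and also let a small random fraction of the rest slip past. The function name `filter_with_noise`, the `noise_rate` parameter and the default values are my own assumptions for illustration, not anything from an existing aggregator.

```python
import random


def filter_with_noise(items, score, threshold=0.5, noise_rate=0.1, rng=None):
    """Keep items scoring at or above threshold, plus a random sample of the rest.

    `score` is any function mapping an item to a number; `noise_rate` is the
    chance (0.0 to 1.0) that a below-threshold item is shown anyway.
    """
    rng = rng or random.Random()
    kept = []
    for item in items:
        if score(item) >= threshold:
            kept.append(item)   # the filter thinks you will like this
        elif rng.random() < noise_rate:
            kept.append(item)   # noise: shown despite a low score
    return kept


if __name__ == "__main__":
    titles = ["XML tricks", "Celebrity gossip", "Gardening tips", "SOAP vs REST"]
    # Toy scoring function: pretend we only like titles mentioning XML or SOAP.
    likes = lambda t: 1.0 if ("XML" in t or "SOAP" in t) else 0.0
    print(filter_with_noise(titles, likes, noise_rate=0.25))
```

Setting `noise_rate` to 0 gives you the strict filter, and turning it up makes the aggregator behave more like a DJ who occasionally plays something you didn’t ask for.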

In any case, the effort to add filtration to RSS Bandit is of particular interest to me. Having studied Bayes’ theorem and read up a small amount on autonomous agent systems, I’m really excited about the potential for intelligent filtering in RSS Bandit. In particular, one method of filtering I like is creating de facto editors via assignment. For example, like Dare, I read everything by Don Box. So I might assign a rule that always includes items by Don. But I might also go a step further and think that anything Don links to will probably be of interest to me, so I might state that everything he links to should be subscribed to automatically (perhaps subject to filtering rules that reflect how much I trust Don). Effectively, Don becomes an editor for my aggregator (without even knowing so). He becomes a content DJ.
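As a sketch of the "content DJ" idea, the snippet below always shows items from authors I trust and collects the URLs they link to as suggested subscriptions, but only when my trust in that author clears a cutoff. The `Item` shape, the `editor_picks` function, the trust numbers and the `min_trust` cutoff are all hypothetical. Real feeds would need link extraction and feed discovery, which I’m leaving out.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    author: str
    url: str
    links: list = field(default_factory=list)  # URLs the item links to


def editor_picks(items, trust, min_trust=0.5):
    """Split items into ones to always show and links worth subscribing to.

    `trust` maps an author name to a weight between 0.0 and 1.0.
    Returns (always_show, suggested), where `suggested` maps each linked URL
    to the highest trust of any editor who linked to it.
    """
    always_show = []
    suggested = {}
    for item in items:
        weight = trust.get(item.author, 0.0)
        if weight <= 0.0:
            continue  # not one of my editors
        always_show.append(item)
        if weight >= min_trust:
            # Trusted enough that their links become candidates too.
            for url in item.links:
                suggested[url] = max(suggested.get(url, 0.0), weight)
    return always_show, suggested


if __name__ == "__main__":
    feed = [
        Item("Don Box", "http://example.com/don/1", ["http://example.com/a"]),
        Item("Someone Else", "http://example.com/x", ["http://example.com/b"]),
    ]
    shown, subs = editor_picks(feed, {"Don Box": 0.9})
    print([i.url for i in shown], subs)
```

Keeping the trust weight attached to each suggested URL leaves room to rank or filter the suggestions later instead of subscribing to everything blindly.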

What others have said

Paul Scott Mar 24, 2005 3:38 PM
# re: Drinking From an RSS Fire Hose
This would be a great improvement. I'd like to see something similar to Tivo's thumbs up and thumbs down feature.
Steve Maine Mar 24, 2005 10:31 PM
# re: Drinking From an RSS Fire Hose
I think you might be on to something with your "content DJ" approach. It sounds very similar to how I've come across some of the most interesting blogs I currently read. Of course, I'm still doing it manually...


Dimitri Glazkov Mar 25, 2005 7:18 AM
# Social Content Management
Dimitri Glazkov Mar 25, 2005 9:04 AM
# Social Content Management
Sharp as a Marble Mar 30, 2005 12:29 PM
# re: Drinking From an RSS Fire Hose
Hell, start simple. Allow blocking of keywords. Right now, if I see the word Schiavo in the title, I automatically mark it read and go to the next story. Automation on this front would be helpful.

The other starting point would be weighting rather than straight filtering. Show me everything, but put the most important on top / mark in different colors / etc.

When I worked on F/A-18s, one of the concerns for the pilots was information overload. The more advanced the aircraft, the more the pilot had to worry about. Seems to be the same issue here.
you've been HAACKED Mar 30, 2005 5:06 PM
# Putting a Crimp in the RSS Fire Hose
you've been HAACKED Mar 30, 2005 5:07 PM
# Putting a Crimp in the RSS Fire Hose
Greg Linden Apr 05, 2005 12:43 PM
# re: Drinking From an RSS Fire Hose
Hi, Phil. Have you tried Findory (http://findory.com)? It's a personalized news aggregator designed to filter content based on your tastes and preferences. Findory learns from the articles you read and builds a personalized front page.

Great point on allowing a little bit of noise through. I'd put it slightly differently though. I think the goal is to make the recommendations a little surprising. They shouldn't be obvious. They need to help you discover things you probably wouldn't have found on your own.
