Making Heads or Tails of Microformats


I had an email discussion with Dimitri about Microformats a little while back, trying to understand their purpose and what problem they are intended to solve.

At the time, the potential benefit I saw was that it might allow CSS writers to share stylesheets for styling certain types of content. For example, suppose we standardize the markup for a calendar event (say, using the hCalendar format). Now if I write some seriously sweet CSS that makes calendar events explode off the page, I could send that CSS to you and it would be immediately useful. No need to rework it to match the structure of the HTML used to render your calendar event, assuming you followed the standard.

At the time, I was focused on the fact that according to the microformats about page, microformats are designed for humans first and machines second.

However, the fact that microformats are machine readable lends itself to other potential uses. For example, the Microformats blog recently highlighted a Greasemonkey script that parses out hCalendar events and provides links to import them into a calendar application.
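To make "machine readable" concrete, here is a rough sketch in Python of the kind of extraction such a script has to do. It is not the Greasemonkey script itself, just a minimal illustration that assumes well-nested markup and looks only at the `vevent`, `summary`, `dtstart` and `location` class names from hCalendar. The names `HCalendarParser` and `parse_hcalendar` and the sample event are made up for the example.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they are kept off the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HCalendarParser(HTMLParser):
    """Collects a few hCalendar fields from elements marked class="vevent".

    Hypothetical helper for illustration; assumes well-nested markup.
    """

    FIELDS = ("summary", "dtstart", "location")

    def __init__(self):
        super().__init__()
        self.events = []     # finished events, one dict each
        self._stack = []     # (role, wants_text) for every open element
        self._event = None   # the event currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        role, wants_text = None, False
        if "vevent" in classes and self._event is None:
            role = "vevent"
            self._event = {}
        elif self._event is not None:
            for field in self.FIELDS:
                if field in classes:
                    role = field
                    # The machine-readable value often lives in title="",
                    # e.g. <abbr class="dtstart" title="2005-07-30">.
                    title = attrs.get("title")
                    self._event[field] = title or ""
                    wants_text = not title
                    break
        self._stack.append((role, wants_text))

    def handle_data(self, data):
        if self._event is None:
            return
        # Text belongs to the innermost open field, if there is one.
        for role, wants_text in reversed(self._stack):
            if role == "vevent":
                return
            if role in self.FIELDS:
                if wants_text:
                    self._event[role] += data
                return

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        role, _ = self._stack.pop()
        if role == "vevent":
            self.events.append({k: v.strip() for k, v in self._event.items()})
            self._event = None


def parse_hcalendar(html):
    """Return a list of event dicts found in the given HTML string."""
    parser = HCalendarParser()
    parser.feed(html)
    parser.close()
    return parser.events


# Made-up sample markup, not taken from any real page.
sample = """
<div class="vevent">
  <span class="summary">Team lunch</span>
  <abbr class="dtstart" title="2005-07-30">July 30th</abbr>
  <span class="location">Cafeteria</span>
</div>
"""
events = parse_hcalendar(sample)
```

Note that the stack bookkeeping only works if every tag is properly closed, which is not something ordinary HTML guarantees.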

Now while I try to keep an open mind, I find it odd that Microformat proponents are attacking the use of XML on the web.

This is where I find the goals of Microformats to be a bit far-reaching. As I understand it, proponents present Microformats as a means to have your website be the API, attempting to make technologies such as RSS obsolete. The problem I have with this idea is that data exchange and presentation are often at odds.

For example, suppose I want my presentation to only display calendar events for the current week, but I want users to be able to import calendar events for the month. However, I never want to display a month calendar, for aesthetic reasons. It seems the Microformats method would be for me to have a month’s worth of calendar events on the page, but use CSS to hide those I don’t want displayed. Or, I could allow a query string parameter to specify how many entries to display, but how would I make that parameter discoverable without messing with my presentation (i.e. without placing a link to it)?

Instead, I might choose a standard XML format for calendar entries and provide an auto-discoverable reference to the calendar entries, much in the same way that HTML pages add auto-discoverable references to RSS feeds. What’s so wrong with that?
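For anyone unfamiliar with auto-discovery, the idea is that the page carries a `<link rel="alternate">` element pointing at the feed, and a client looks for it without anything showing up in the presentation. Here is a minimal Python sketch of the client side. The class and function names are mine, and the `/calendar/month.xml` URL and its `application/xml` type are hypothetical stand-ins for whatever calendar format you would actually pick.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collects (type, href) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        # rel may hold several space-separated values, e.g. "alternate home".
        rel = (attrs.get("rel") or "").lower().split()
        if "alternate" in rel and attrs.get("href"):
            self.links.append((attrs.get("type"), attrs["href"]))


def discover_feeds(html):
    """Return every alternate representation advertised by the page."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links


# Made-up page: only the current week is displayed, but the month's
# worth of events is advertised as a separate, hypothetical XML resource.
page = """
<html><head>
  <title>This week's events</title>
  <link rel="stylesheet" type="text/css" href="/site.css">
  <link rel="alternate" type="application/rss+xml" href="/rss.xml">
  <link rel="alternate" type="application/xml" href="/calendar/month.xml">
</head><body>...</body></html>
"""
feeds = discover_feeds(page)
```

The stylesheet link is ignored, and the two alternates come back in document order, so the monthly calendar is discoverable even though the page never links to it visibly.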

It seems the Microformats user might say that the separate XML feed is not necessary. Why duplicate content? This is a fairly good point worth considering. The goal of Microformats is to provide information in a format that is both machine readable and human readable. Part of fulfilling that goal is to ensure that the presentation degrades well in a normal browser.

For example, a competing approach to avoid duplication of content might be to simply specify a calendar event namespace in an XHTML file and embed that within the markup. The problem with this approach is that many browsers and web authors do not truly support XHTML properly. Thus, tags for alternate namespaces do not show up, violating the Microformats goal of degrading gracefully. Not only that, but most XHTML pages end up being served as tag soup because they are sent using the MIME type text/html. [See Sending XHTML as text/html Considered Harmful].

However, therein lies the problem with Microformats when compared to a non-presentational XML format like RSS. If you recall, RSS stands for Really Simple Syndication. It’s not just that it is simple to syndicate content, but that (in theory) it is simple to parse such a feed since it relies on strict XML. Parsing HTML is much more difficult because of the inconsistencies and all the effort that goes into making sense of malformed HTML. Unfortunately, that is exactly what a consumer of Microformats is essentially forced to do, since Microformats are intended to degrade gracefully. Microformats aren’t limited to XHTML and can be placed in valid HTML documents, making it much more difficult to validate a Microformat snippet.
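A small Python sketch of the difference, using only the standard library. The helper names are mine and the snippets are made up; the point is just that a strict XML parser rejects markup with an unclosed tag outright, while an HTML parser carries on and leaves the consumer to guess at the intended structure.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def parses_as_xml(markup):
    """True if a strict XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records every start tag the tolerant HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_as_html(markup):
    """Return the start tags seen when the markup is read as HTML."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# Made-up snippets: the second one never closes its <span>.
well_formed = '<div class="vevent"><span class="summary">Lunch</span></div>'
tag_soup = '<div class="vevent"><span class="summary">Lunch</div>'

xml_ok = parses_as_xml(well_formed)
xml_soup_ok = parses_as_xml(tag_soup)
html_soup_tags = tags_seen_as_html(tag_soup)
```

With a feed, the consumer can stop at the parse error and blame the publisher. With a Microformat in tag soup, the parser happily reports both tags and it is up to the consumer to decide where the summary was supposed to end.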

In any case, it’ll be interesting to see how the use of Microformats unfolds. As Greasemonkey becomes more prevalent, I imagine the popularity of Microformats might also take off. If I misunderstood microformats, be sure to let me know.


Comments


5 responses

  1. Dimitri Glazkov, July 28th, 2005

    Good post! I really ought to restart my blogging, you know.

  2. Kevin Marks, July 29th, 2005

    One way microformats could help with your (slightly contrived) example would be to use a JavaScript post-processor to compile the presentation of 'current' events from the cleaner XHTML microformat representation, so that you only have one file to maintain.

    See http://web.mit.edu/glasser/www/JSCalendar/ for an example of this approach.

  3. Richard Hsu, July 29th, 2005

    sorry for writing it here, i wasn't able to use your Contact form to send a message to you.

    can you provide an email where i can mail you, regarding SubText or if Contact form is working, then i will try that out again.

  4. Haacked, July 30th, 2005

    Richard, try sending an email to Subtext-devs@lists.sourceforge.net

  5. Bernard Farrell, May 17th, 2007

    I've started looking at Microformats to represent diabetes-related data.
    I plan to work on this using a shared Wiki and I'm looking for others to get involved.
    You can also see the abstract for a paper I wrote about this recently at the Journal for Diabetes Science and Technology site.
    If you'd like to get involved, drop by my blog and leave a note.