Html Encoding Code Blocks With ASP.NET 4

Sep 25, 2009 aspnet code aspnetmvc

This is the first in a three-part series related to HTML encoding blocks, aka the <%: ... %> syntax.

Html Encoding Code Blocks With ASP.NET 4
Html Encoding Nuggets With ASP.NET MVC 2
Using AntiXss as the default encoder for ASP.NET

One great new feature being introduced in ASP.NET 4 is a new code block (often called a Code Nugget by members of the Visual Web Developer team) syntax which provides a convenient means to HTML encode output in an ASPX page or view.

<%: CodeExpression %>

I often tell people it’s <%= but with the = seen from the front.

Let’s look at an example of how this might be used in an ASP.NET MVC view. Suppose you have a form which allows the user to submit their first and last name. After submitting the form, the same view is used to display the submitted values.

First Name: <%: Model.FirstName %>
Last Name: <%: Model.LastName %>

<form method="post">
  <%: Html.TextBox("FirstName") %>
  <%: Html.TextBox("LastName") %>
</form>

By using the new syntax, Model.FirstName and Model.LastName are properly HTML encoded, which helps mitigate Cross Site Scripting (XSS) attacks.
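For comparison, here is roughly what the same display lines look like with the existing syntax next to the new one. This is a sketch that assumes an ASP.NET MVC view, where the Html.Encode helper is available; in a Web Forms page, Server.HtmlEncode would play the same role.

```aspx
<%-- Existing syntax: encoding is opt-in, so every expression needs an explicit call. --%>
First Name: <%= Html.Encode(Model.FirstName) %>
Last Name: <%= Html.Encode(Model.LastName) %>

<%-- New ASP.NET 4 syntax: the code block itself encodes the value. --%>
First Name: <%: Model.FirstName %>
Last Name: <%: Model.LastName %>
```

In both cases, a submitted value such as <script>alert(1)</script> is written to the page as &lt;script&gt;alert(1)&lt;/script&gt;, so the browser displays it as text instead of running it.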

Expressing Intent with the new `IHtmlString` interface

If you’re paying close attention, you might be asking yourself “Html.TextBox is supposed to return HTML that is already sanitized. Wouldn’t using this syntax with Html.TextBox cause double encoding?”

ASP.NET 4 also introduces a new interface, IHtmlString, along with a default implementation, HtmlString. Any value that implements the IHtmlString interface will not get encoded by this new syntax.

In ASP.NET MVC 2, all helpers which return HTML now take advantage of this new interface, which means that when you’re writing a view, you can simply use this new syntax all the time and it will just work. By adopting this habit, you’ve effectively changed the act of HTML encoding from an opt-in model to an opt-out model.
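To make the opt-out idea concrete, here is a minimal sketch of the decision the new code block makes for each value. It is not the actual ASP.NET implementation: IMarkup and Markup are hypothetical stand-ins for IHtmlString and HtmlString so that the example runs without a reference to System.Web, and NuggetSketch.Render is an invented name for what <%: value %> would write.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for System.Web.IHtmlString (an assumption for this sketch).
public interface IMarkup
{
    string ToHtmlString();
}

// Hypothetical stand-in for System.Web.HtmlString: wraps HTML that is already safe.
public sealed class Markup : IMarkup
{
    private readonly string _html;

    public Markup(string html) { _html = html; }

    public string ToHtmlString() { return _html; }
}

public static class NuggetSketch
{
    // Approximates what <%: value %> writes to the response.
    public static string Render(object value)
    {
        if (value == null)
        {
            return string.Empty;
        }

        // Values that declare themselves as HTML are written as-is (the opt-out).
        var markup = value as IMarkup;
        if (markup != null)
        {
            return markup.ToHtmlString();
        }

        // Everything else is HTML encoded by default.
        return WebUtility.HtmlEncode(Convert.ToString(value));
    }

    public static void Main()
    {
        // A plain string is encoded: prints &lt;b&gt;Phil&lt;/b&gt;
        Console.WriteLine(Render("<b>Phil</b>"));

        // A helper result marked as HTML passes through: prints <input name="FirstName" />
        Console.WriteLine(Render(new Markup("<input name=\"FirstName\" />")));
    }
}
```

The point of the design is that the caller never chooses whether to encode. The type of the value carries that intent, which is why the same code block works for both Model.FirstName and Html.TextBox.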

The Goals

There were four primary goals we wanted to satisfy with the new syntax.

1. Obvious at a glance. When you look at a page or a view, it should be immediately obvious which code blocks are HTML encoded and which are not. You shouldn’t have to refer back to flags in web.config or the page directive (which could turn encoding on or off) to figure out whether the code is actually being encoded. Also, it’s not uncommon to review code changes via check-in emails which only show a DIFF. This is one reason we didn’t reuse existing syntax.

Not only that, code review becomes a bit easier with this new syntax. For example, it would be easy to do a global search for <%= in a code base and review those lines with more scrutiny (though we hope there won’t be any to review). Also, when you receive a check-in email which shows a DIFF, you have most of the context you need to review that code.

2. Evokes a similar meaning to <%=. We could have used something entirely new, but we didn’t have the time to drastically change the syntax. We also wanted something that had a similar feel to <%= which evokes the sense that it’s related to output. Yeah, it’s a bit touchy feely and arbitrary, but I think it helps people feel immediately familiar with the syntax.

3. Replaces the old syntax and allows developers to show their intent. One issue with the current implementation of output code blocks is there’s no way for developers to indicate that a method is returning already sanitized HTML. Having this in place helps enable our goal of completely replacing the old syntax with this new syntax in practice.

This also means we need to work hard to make sure all new samples, books, blog posts, etc. eventually use the new syntax when targeting ASP.NET 4.

Hopefully, the next generation of ASP.NET developers will experience this as being the default output code block syntax and <%= will just be a bad memory for us old-timers like punch cards, manual memory allocations, and Do While Not rs.EOF.

4. Make it easy to migrate from ASP.NET 3.5. We strongly considered just changing the existing <%= syntax to encode by default. We eventually decided against this for several reasons, some of which are listed in the above goals. Doing so would make it tricky and painful to upgrade an existing application from earlier versions of ASP.NET.

Also, we didn’t want to impose an additional burden for those who already do practice good encoding. For those who don’t already practice good encoding, this additional burden might prevent them from porting their app and thus they wouldn’t get the benefit anyways.
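The global search for <%= mentioned under the first goal can be done with any text search tool. As one illustration, here is a small console sketch that lists every line still using the old syntax. The class name UnencodedNuggetFinder and the set of file extensions are assumptions made for this example, not part of ASP.NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class UnencodedNuggetFinder
{
    // Returns one "path(line): text" entry for each line that contains <%=.
    public static List<string> Find(string rootDirectory)
    {
        var hits = new List<string>();

        // Assumed set of view file types; adjust for the code base being reviewed.
        foreach (var pattern in new[] { "*.aspx", "*.ascx", "*.master" })
        {
            foreach (var path in Directory.EnumerateFiles(rootDirectory, pattern, SearchOption.AllDirectories))
            {
                int lineNumber = 0;
                foreach (var line in File.ReadLines(path))
                {
                    lineNumber++;
                    if (line.Contains("<%="))
                    {
                        hits.Add(path + "(" + lineNumber + "): " + line.Trim());
                    }
                }
            }
        }

        return hits;
    }

    public static void Main(string[] args)
    {
        // Scan the directory given on the command line, or the current one.
        var root = args.Length > 0 ? args[0] : ".";
        foreach (var hit in Find(root))
        {
            Console.WriteLine(hit);
        }
    }
}
```

Each hit is a candidate for review: either the value is already sanitized HTML, or the line should move to the new syntax.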

When Can I Use This?

This is a new feature of ASP.NET 4. If you’re developing on ASP.NET 3.5, you will have to continue to use the existing <%= syntax and remember to encode the output yourself.

In ASP.NET 4 Beta 2, you will have the ability to try this out yourself with ASP.NET MVC 2 Preview 2. If you’re running on ASP.NET 3.5, you’ll have to use the old syntax.

What about ASP.NET MVC 2?

As mentioned, ASP.NET MVC 2 supports this new syntax in its helpers when running on ASP.NET 4.

In order to make this possible, we are making a breaking change such that the relevant helper methods (ones that return HTML as a string) will return a type that implements IHtmlString.

In a follow-up blog post, I’ll write about the specifics of that change. It was an interesting challenge given that IHtmlString is new to ASP.NET 4, but ASP.NET MVC 2 is actually compiled against ASP.NET 3.5 SP1. :)


Comments

23 responses

Robbie Paplin • September 25th, 2009
Happy to see this come to the Webforms view engine. This is one of the reasons, why I've liked Spark so much. Secure views by default.
Now if I could only get whitespace removal options from my view engines... ;)
Eber Irigoyen • September 25th, 2009
nice little addition
Joe Chung • September 25th, 2009
How about one for HTML attribute encoding? I would recommend <%@ (for the XPath @ attribute syntax), but it's already reserved for directives.
Example:
<%: Model.LinkText %>
If we keep it up, we'll become as "awesome" as Perl and Ruby when it comes to making symbol soup!
Chris • September 25th, 2009
Nice to see this addition, although I'd almost prefer to see something like the view templates in Django: everything is html encoded by default, and you have to manually mark the ones you DON'T want encoded.
Vijay Santhanam • September 25th, 2009
I agree with Joe Chung, what about attribute encoding?
barryd • September 25th, 2009
So a few questions on the implementation ....
1) Is this using the old, slightly buggery, slow to be updated Server.*Encode under the hood, or can we easily slot in AntiXSS, which has a more frequent release schedule?
2) What about attributes? Will the helpers be not just encoding text correctly, but attributes correctly? And URIs?
My concern is that IHtmlString indicates that all folks will worry about is straight forward HTML, when there's risks around all types of output, and the various different ways of encoding they require.
Andrey Shchekin • September 25th, 2009
This is nice, but I think this is the only explicitly HTML-targeted feature in the forms engine (which can actually be used to output XML, JSON, text, etc).
Maybe something like IEscapedString with IsEscapedFor(useCase) would be better? With HtmlString being one implementation and HtmlAttributeString, XmlString, etc -- other variants.
Igor • September 25th, 2009
Does this mean we can safely use <%= to output intentionally not-sanitized HTML? Guess so.
If we implement IHtmlString to our method, it will auto-sanitize and we will not be able to choose whether to sanitize or not, and <%: won't be able to sanitize output of a method that implements IHtmlString?
Will this work in VS2008 MVC2 Preview 2 or do we have to wait for VS10 Beta 2?
Dummy Customer • September 25th, 2009
We strongly considered just changing the existing <%= syntax to encode by default.

That would have been a great solution, but I totally understand that you need to support old code from breaking.
Jonty • September 25th, 2009
Please, please, please, please, please, please, please can you make it easier to capture aspx output in-memory!
Mike • September 26th, 2009
"I often tell people it’s <%= but with the = seen from the front."
lol! Doable with the new WPF based editor in VS2010.
Satheesh • September 26th, 2009
Good thing!!
Jim Manico • September 26th, 2009
What about Javascript, CSS, URL and HTML attribute encoding? Check out : www.owasp.org/.../XSS_%28Cross_Site_Scripting%2... to see other contexts of data display that must be accounted for.
Borek • September 29th, 2009
It would be good to know what's ASP.NET's long-term strategy for context-sensitive escaping. Are we going to have different opening tags for different contexts? Will ASP.NET vNext break backwards compatibility and make the implementation of <%: context-aware?
Many people have raised this issue already, would it be possible to quickly comment on that please? It would be much appreciated.
Thanks,
Borek
paul • September 29th, 2009
Looks very nice, Phil. I can see a path for implementing IHtmlString to pull 'safe tags' & attributes from configuration and use that to emit things like blog content...
Haacked • September 30th, 2009
@Jim we've improved the HTML Encoding such that <%: %> will be safe to use within quoted (single or double) attributes. We don't recommend unquoted attributes.
As for JavaScript, this is not the correct encoding for JavaScript. However, AFAIK, it does encode double quotes which should provide some mitigation if you accidentally use this encoding and meant to use JavaScript encoding.
I believe ASP.NET 4 does include a JavaScript encoding method (I'd have to dig around). We did not create a special syntax for it.
aspUser • October 1st, 2009
Any chance CodeExpressionBuilder:
weblogs.asp.net/.../The-CodeExpressionBuilder.aspx
will be implemented in ASP.NET 4?
Megan • October 11th, 2009
[b] [c=13]You [c=9]Talk[/c=9] it we [c=9]Live[/c=9] it ... Your jelous admit it.. My [c=9]girls[/c=9] I (L) em.[/b] [/c=13] :]
make a website • April 13th, 2010
It really helped me a lot.I would like to thanks that master brain who make all this for the readers like me.I hadn't used Twitter for my professional work, mostly because so few forensic clinicians are using it.
Kirby L. Wallace • May 3rd, 2010
I often tell people it’s <%= but with the = seen from the front.
I looked at that and thought, "what the hell has he been smokin'?", and was just moving on when it hit me. And I couldn't suppress the laugh. hah hah...
But, wouldn't it really look like this from the front:
||:
Or maybe even just
|
... cause the others are behind the "<"?
Adam Tuliper • October 5th, 2010
What about any support for implementing this in <%# %> data binding scenarios so Encode() doesn't have to be called every time?
hkj • November 19th, 2013
thank u
Cornan • December 30th, 2014
Apparently Microsoft never got around to documenting these on MSDN?

If anybody knows of such a link in the MSDN Library or TechNet, I'd like to see it. All I can find are blogs and forums and the like.
