pragmatist
Patrick Joyce

July 11, 2007

Broken URL Encoding in .Net and the AntiXssLibrary

Properly encoding user input is one of the most important security precautions that you can take when writing a web application. Failure to do so can open up your site to Cross Site Scripting, SQL Injection, and directory traversal attacks. Recently I was working on an ASP.Net application and noticed some very strange behavior from the .Net encoding methods.

URL Encoding is defined in RFC 3986 as replacing all characters not in ALPHA / DIGIT / “-” / “.” / “_” / “~” with a percent sign followed by their hexadecimal ASCII representation. Therefore an ampersand is encoded as %26.

The encoded representation of the string "patrick's trials & tribulations.doc" is "patrick%27s%20trials%20%26%20tribulations.doc".
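To make the rule concrete, here is a minimal sketch of it in Python. It is not the .Net implementation; the names `UNRESERVED` and `percent_encode` are made up for illustration, and it assumes the input is encoded as UTF-8 before escaping.

```python
# Characters RFC 3986 calls "unreserved"; everything else gets percent-encoded.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-._~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every UTF-8 byte of `text` that is not unreserved."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)  # safe character, copied through unchanged
        else:
            out.append(f"%{byte:02X}")  # percent sign plus two hex digits
    return "".join(out)


print(percent_encode("patrick's trials & tribulations.doc"))
# prints: patrick%27s%20trials%20%26%20tribulations.doc
```

Note that the single quote and the spaces are escaped simply because they are not in the allowed set; nothing had to be listed as "dangerous".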

Fortunately, .Net provides methods to do encoding for you. Therefore, you would think that you would URL encode a string by calling HttpUtility.UrlEncode(String).

Unfortunately, the .Net methods don’t work the way you would expect them to. HttpUtility.UrlEncode("patrick's trials & tribulations.doc") will return "patrick's+trials+%26+tribulations.doc". Note that spaces are encoded as plus signs. More importantly, note that the single quote isn’t encoded. This means that even if you use HttpUtility.UrlEncode your application will remain open to JavaScript injection.

The problem is that the default UrlEncode method uses a “black list” to only encode certain characters that are deemed “dangerous”. The proper way to do encoding is to use a “white list” of characters that are allowed, and encode everything else.
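The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration: `DENY_LIST` is a hypothetical set chosen for the example, not the list HttpUtility.UrlEncode actually uses, and the function names are invented.

```python
import string

# Hypothetical "black list": only the characters someone remembered to escape.
DENY_LIST = frozenset('&<>"#%?=/ ')

# "White list": the RFC 3986 unreserved characters; everything else is escaped.
ALLOW_LIST = frozenset(string.ascii_letters + string.digits + "-._~")


def _escape(ch: str) -> str:
    """Percent-encode one character via its UTF-8 bytes."""
    return "".join(f"%{b:02X}" for b in ch.encode("utf-8"))


def encode_with_deny_list(text: str) -> str:
    # Escapes only what is on the list; anything forgotten passes through.
    return "".join(_escape(c) if c in DENY_LIST else c for c in text)


def encode_with_allow_list(text: str) -> str:
    # Passes only known-safe characters; anything unexpected is escaped.
    return "".join(c if c in ALLOW_LIST else _escape(c) for c in text)


payload = "';alert(1)//"
print(encode_with_deny_list(payload))   # prints: ';alert(1)%2F%2F
print(encode_with_allow_list(payload))  # prints: %27%3Balert%281%29%2F%2F
```

With the deny list, the quote, semicolon, and parentheses survive because nobody thought to list them. With the allow list there is nothing to forget: a character is escaped unless it is explicitly known to be safe.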

I have no idea why Microsoft took this approach, and if anyone has any ideas I would greatly appreciate hearing them in the comments.

Thankfully Microsoft eventually realized how dangerous their approach was and released the AntiXssLibrary that includes a UrlEncode method that works as expected. The problem and solution are discussed in this article.

I’m also not sure why Microsoft didn’t just fix the HttpUtility.UrlEncode method instead of creating a separate library. I imagine it is for backwards compatibility but, again, I would love to hear any ideas in the comments.

Bottom line: whenever you are URL encoding something in .Net, use AntiXss.UrlEncode instead of HttpUtility.UrlEncode.
