The mysteriously escaped request path in ASP.NET

Posted on Tue 19 November 2013 in Coding

This blog post answers the question “Why does Uri.AbsolutePath sometimes return an escaped path and sometimes an unescaped path for the request URL in an ASP.NET application?”

I spent some time googling for a possible reason, but search terms like URL/URI, request, path, encoding, and escaping are so overloaded that the signal-to-noise ratio is extremely low. My searches turned up nothing.

Some background first! For a web application at work, we use a REST server built with Nancy. After a recent upgrade, we noticed that images (non-static, loaded via the REST API) with pluses in their names stopped working. Instead of getting images back, we got 404 responses.

Checking the logs, we found that by the time Nancy saw the image path parameter of the request, pluses had been converted into spaces. At this point, an alarm bell rang, of course! RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax) defines plus (+) as a reserved character, but does not give it any special meaning in the path component of a URI. Furthermore, RFC 2616 (Hypertext Transfer Protocol - HTTP/1.1) refers to a predecessor of RFC 3986 as defining the meaning of a path component, among other things. Thus, a plus character in a path component is just a plus and should not be translated in any way. Replacing pluses with spaces comes from the form content type application/x-www-form-urlencoded used in HTML. So, this appeared to be a Nancy bug, and we found the issue to have been reported already, albeit in a slightly different context.
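
To make the distinction concrete, here is a minimal sketch that contrasts form-style decoding with plain URI unescaping. It assumes a reference to System.Web.dll for HttpUtility; the class name PlusDecodingDemo is made up for illustration, and HttpUtility.UrlDecode is only a stand-in for form-style decoding in general, not necessarily what Nancy called internally.

```csharp
using System;
using System.Web; // HttpUtility lives in System.Web.dll

public static class PlusDecodingDemo
{
    public static void Main()
    {
        // Form-style (application/x-www-form-urlencoded) decoding
        // treats '+' as an encoded space.
        Console.WriteLine(HttpUtility.UrlDecode("a+b"));   // prints "a b"

        // URI unescaping only resolves percent-escapes,
        // so a literal '+' survives untouched.
        Console.WriteLine(Uri.UnescapeDataString("a+b"));  // prints "a+b"
    }
}
```

Applying the first kind of decoding to a path segment is exactly the sort of mistake that turns a+b into "a b".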

Now to the strange part. On our build server, running Windows Server 2012 and IIS 8.0, we observed the bug in action. However, on my development machine running Windows 8.1 and IIS 8.5, the bug did not show. Why was that? To understand what was happening, I decided to exclude Nancy and write a simple HTTP handler of my own:

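using System.IO;
using System.Web;

public class Handler : IHttpHandler
{
    // Required by IHttpHandler; the handler keeps no per-request state.
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.StatusCode = 200;
        var url = context.Request.Url;
        using (var writer =
            new StreamWriter(context.Response.OutputStream))
        {
            // Print the request path as seen through three different APIs.
            writer.WriteLine("Request.Path = " + context.Request.Path);
            writer.WriteLine("Request.Url.AbsolutePath = " +
                url.AbsolutePath);
            writer.WriteLine("Request.Url.OriginalString = " +
                url.OriginalString);
        }
    }
}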

Deploying my little HTTP handler to our build server and locally, I saw:

$ curl http://remote.server.com/a+b
Request.Path = /a+b
Request.Url.AbsolutePath = /a+b
Request.Url.OriginalString = http://remote.server.com/a+b

$ curl http://localhost:15080/a+b
Request.Path = /a+b
Request.Url.AbsolutePath = /a%2Bb
Request.Url.OriginalString = http://localhost:15080/a%2Bb

Now we’re getting somewhere! Locally, the AbsolutePath property of Request.Url (which is of type System.Uri) contained an escaped path. Remotely, the path was not escaped. As can be seen above, the difference in escaping was also reflected in the value of the OriginalString property, which is the string with which the Uri object was created in the first place! This indicated that either there was a difference in how the .NET framework produced the Uri objects between the machines, or there was a difference in how the IIS servers were configured, possibly because of the version difference.

Since I couldn’t find any differences in the IIS configurations, and no relevant information regarding changes between IIS 8.0 and 8.5 (again, low signal-to-noise ratio in search results), I started digging into the .NET framework code!

After spending more time reading .NET code than I would like to admit, I finally found the culprit: an application setting by the name “aspnet:UseLegacyRequestUrlGeneration”. The documentation for the setting is essentially worthless. It simply states that with the setting set to false, the default value, “the ASP.NET runtime generates a Uri that has better standards compliance than previous versions of ASP.NET.” Some reflections on this:

  • The setting cannot be found on the .NET 4.5 version of the page, indicating that it was introduced with .NET 4.5.1.
  • The default value changes how request URIs are passed to HTTP handlers. That’s right, the behavior changes in a minor update of the framework. Way to go, Microsoft!
  • Which standards are better complied with? Given that pluses in URI paths should not be escaped (at least as far as I can tell), I would say that there is at least one standard being complied with in a worse way.
  • The method responsible for escaping is Uri.EscapeDataString, the documentation of which indicates that reserved characters like plus are escaped.
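
As a quick illustration of the last point, the following sketch shows what Uri.EscapeDataString does to a plus. The class name EscapeDemo is made up, and the snippet only demonstrates the public method; how exactly the ASP.NET runtime applies it to the request path (for example, segment by segment) is not shown here and would be an assumption.

```csharp
using System;

public static class EscapeDemo
{
    public static void Main()
    {
        // A single path segment: the reserved '+' is percent-encoded.
        Console.WriteLine(Uri.EscapeDataString("a+b"));   // prints "a%2Bb"

        // Applied to a whole path, the '/' separator is escaped as well,
        // so the method only makes sense on individual segments.
        Console.WriteLine(Uri.EscapeDataString("/a+b"));  // prints "%2Fa%2Bb"
    }
}
```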

Now, our build server had .NET 4 and my local machine .NET 4.5.1, so this was clearly the reason for the difference in escaping. Also, setting the application setting to true in Web.config changed the behavior as expected:

<configuration>
    <appSettings>
        <add key="aspnet:UseLegacyRequestUrlGeneration" value="true" />
    </appSettings>
</configuration>


$ curl http://localhost:15080/a+b
Request.Path = /a+b
Request.Url.AbsolutePath = /a+b
Request.Url.OriginalString = http://localhost:15080/a+b

How do you check the value of the setting, given that it doesn’t exist prior to .NET 4.5.1? Unfortunately, I didn’t find a decent way to do it, but some reflection hacking does the trick:

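// Requires: using System.Linq; using System.Reflection;
public static bool UseLegacyRequestUrlGeneration()
{
    // AppSettings is internal to System.Web, so locate it via a public
    // type from the same assembly.
    var appSettingsType = typeof (System.Web.Util.HttpEncoder)
        .Assembly
        .GetTypes()
        .FirstOrDefault(t => t.Name == "AppSettings");
    // The property is missing in framework versions without the setting.
    var prop = appSettingsType != null
        ? appSettingsType.GetProperty("UseLegacyRequestUrlGeneration",
            BindingFlags.Static | BindingFlags.NonPublic)
        : null;
    // No property means an older framework, i.e. legacy behavior.
    return prop != null ? (bool) prop.GetValue(null, null) : true;
}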

The method reads the setting from the internal System.Web.Util.AppSettings class, and if the corresponding property does not exist, it returns true since pre-4.5.1 behavior is considered legacy.

Back to Nancy then. Why didn’t I observe the bug locally? Because instead of decoding plus into a space, it now decoded “%2B” into a plus! The issue at hand must clearly be fixed, but for the Nancy.Hosting.Aspnet project, it is important to also consider the value of the “aspnet:UseLegacyRequestUrlGeneration” setting, because the HTTP handler uses Uri.AbsolutePath to construct the request URL passed into the Nancy core.
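
The effect can be sketched as follows. DecodePath is a hypothetical stand-in for the decoding step, implemented here with HttpUtility.UrlDecode (which needs a reference to System.Web.dll); the actual Nancy code may well look different. The two inputs are the AbsolutePath values observed on the build server and on my machine, respectively.

```csharp
using System;
using System.Web; // HttpUtility lives in System.Web.dll

public static class PathDecodingDemo
{
    // Hypothetical stand-in for a form-style decoding step.
    static string DecodePath(string absolutePath)
    {
        return HttpUtility.UrlDecode(absolutePath);
    }

    public static void Main()
    {
        // Legacy URL generation: the plus arrives unescaped and is
        // wrongly turned into a space.
        Console.WriteLine(DecodePath("/a+b"));    // prints "/a b"

        // New URL generation: the plus arrives as %2B, and the same
        // decoder happens to restore the original character.
        Console.WriteLine(DecodePath("/a%2Bb"));  // prints "/a+b"
    }
}
```

In other words, the new escaping masks the bug rather than fixing it.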


Side notes:

  • For IIS to accept a request URL containing a plus in a path segment, you must turn on allowDoubleEscaping in Web.config:

    <configuration>
        <system.webServer>
            <security>
                <requestFiltering allowDoubleEscaping="true" />
            </security>
        </system.webServer>
    </configuration>

  • To use a custom HTTP handler, add it to Web.config like so:

    <configuration>
        <system.webServer>
            <handlers>
                <add name="MyHandler" verb="*"
                     type="SomeAssembly.SomeHandler"
                     path="*" />
            </handlers>
        </system.webServer>
    </configuration>