OK, I feel like an idiot, but I’ve been experimenting with this for an hour now and I cannot for the life of me figure out how to get the encoding right when writing the output of an ASP.NET page to a file. Well, actually I can get it to work by explicitly setting the encoding to Windows-1252, but that is not really what I want…

 

Here’s the setup. I’m using the ASP.NET runtime inside of a desktop app. The ProcessRequest method needs to pass a TextWriter to the ASP.NET runtime, which then renders its output into it. This is the same TextWriter that Page.Render writes into, for example.

 

All works well, except I can’t get the output written to file to look correct. What I want is to get the output written in UTF-8. So I thought I could use:

 

TextWriter Output;

try
{
      // *** Note you have to write the right 'codepage'. If you use the default UTF-8
      // *** everything will be double encoded.
      Output = new StreamWriter(this.OutputFile, false, Encoding.UTF8);
}
catch (Exception ex)
{
      this.Error = true;
      this.ErrorMessage = ex.Message;
      return false;
}

// *** Reset the Response settings
this.ResponseHeaders = null;
this.Cookies = null;
this.ResponseStatusCode = 200;

wwWorkerRequest Request = new wwWorkerRequest(Page, QueryString, Output);
if (this.Context != null)
      Request.Context = this.Context;

Request.PostData = this.PostData;
Request.PostContentType = this.PostContentType;
Request.RequestHeaders = this.RequestHeaders;
Request.PhysicalPath = this.PhysicalDirectory;

try
{
      HttpRuntime.ProcessRequest(Request);
}
catch (Exception ex)
{
      Output.Close();
      this.ResponseStatusCode = 500;
      this.ErrorMessage = ex.Message;
      this.Error = true;
      return false;
}

Output.Close();

this.ResponseHeaders = Request.ResponseHeaders;
this.ResponseStatusCode = Request.ResponseStatusCode;

// *** Capture the Cookies that were set by the server
this.Cookies = Request.Cookies;

if (Request.Context != null)
      this.Context = Request.Context;

return true;

 

The ASP.NET application is set up to encode to UTF-8 in Web.config:

 

 <globalization requestEncoding="utf-8" responseEncoding="utf-8" />

 

So, what happens? Output gets generated, but it actually gets double encoded. I have a string like this embedded in the HTML of the rendered ASPX page:

 

¢ª

 

After running the code above I get (raw output):

 

¢ª

 

Which is some funky double encoded wanna-be UTF-8 output of the above characters.
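To make concrete what I mean by double encoding, here is a minimal sketch of one way it can arise. It assumes the already UTF-8 encoded bytes get misread somewhere as a single-byte codepage and are then encoded as UTF-8 a second time; I have not confirmed that this is what the hosted runtime actually does. The class and method names (DoubleEncodingDemo, DoubleEncode) are made up for illustration, and ISO-8859-1 stands in for Windows-1252 because the two agree for the byte values involved.

```csharp
using System;
using System.Text;

public static class DoubleEncodingDemo
{
    // Hypothetical helper: encodes text as UTF-8, misreads those bytes as a
    // single-byte codepage, then encodes the misread string as UTF-8 again.
    public static byte[] DoubleEncode(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);

        // Assumption: ISO-8859-1 stands in for Windows-1252 here.
        // Both map bytes 0xA0-0xFF to the same characters.
        Encoding singleByte = Encoding.GetEncoding("iso-8859-1");
        string misread = singleByte.GetString(utf8Bytes);

        return Encoding.UTF8.GetBytes(misread);
    }

    public static void Main()
    {
        string text = "\u00A2\u00AA"; // the two sample characters

        // Correct UTF-8: two bytes per character
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(text)));
        // prints C2-A2-C2-AA

        // Double encoded: every original byte has become a character of its own
        Console.WriteLine(BitConverter.ToString(DoubleEncode(text)));
        // prints C3-82-C2-A2-C3-82-C2-AA
    }
}
```

If the raw file shows twice as many bytes as expected for the extended characters, with a C3-82 pair in front of each one, that points to this kind of second pass.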

 

Next, I thought: OK, we’re double encoding, so let’s try Encoding.ASCII on the stream. But that gives me invalid characters (??????), so that’s no good either. Using Encoding.Default produces yet another result:

 

¢ª

 

which is just plain garbage.
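The question marks from the ASCII attempt are at least easy to explain. A minimal sketch, independent of ASP.NET, of what Encoding.ASCII does with characters above 127 (the AsciiDemo class and RoundTripAscii method are hypothetical names used only for this illustration):

```csharp
using System;
using System.Text;

public static class AsciiDemo
{
    // Hypothetical helper: encodes text with Encoding.ASCII and decodes it again.
    // Characters outside the 7-bit ASCII range are replaced with '?' on encoding.
    public static string RoundTripAscii(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripAscii("a\u00A2\u00AAb"));
        // prints a??b
    }
}
```

So ASCII cannot work for this content no matter what the runtime sends: anything beyond plain 7-bit characters is lost as soon as it hits the writer.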

 

I did manage to get this to work by using Encoding.Default (Windows-1252, basically) and then also setting web.config to use Windows-1252 for its encoding, but this is not really what I want. Using a specific encoding gets me through, but it's not a good generic solution. UTF-8 would certainly be a better choice.

 

I don’t really understand what I should be passing in for a TextWriter here when I need to dump to file. Why is this double encoding occurring when I use Encoding.UTF8 on the stream? It seems what I need is a raw binary stream into which the encoding TextWriter writes. But then I’m still missing the byte order mark too…

 

What am I missing here? How do I set up my stream and TextWriter to get ASP.NET to write my output to file as properly encoded UTF-8, including the UTF-8 preamble and properly encoded extended characters?
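For reference, this is the output I am after. The sketch below is a standalone test with no ASP.NET involved, and it assumes the strings reaching the writer are already correct .NET strings, which is exactly the part I cannot vouch for in the hosted scenario. Utf8FileDemo and WriteAndReadBack are hypothetical names for this illustration only.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8FileDemo
{
    // Hypothetical helper: writes text to a file through a StreamWriter that
    // uses Encoding.UTF8, then returns the raw bytes that ended up on disk.
    public static byte[] WriteAndReadBack(string path, string text)
    {
        // Encoding.UTF8 emits the 3-byte preamble (EF BB BF) at the start of the file
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(text);
        }
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            byte[] bytes = WriteAndReadBack(path, "\u00A2\u00AA");
            Console.WriteLine(BitConverter.ToString(bytes));
            // prints EF-BB-BF-C2-A2-C2-AA
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Written this way, the same StreamWriter construction as in my code above produces the preamble and two bytes per extended character, so the writer setup by itself does not seem to be what doubles the encoding.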