Encoding and dealing with data returned over a Web connection is arguably one
of the most confusing subjects I've run into while working in .NET. All
strings in .NET are Unicode (UTF-16, two bytes per code unit), and raw bytes
must be decoded with a specific encoding before they display properly. When
you retrieve data over the Web it arrives as a binary stream, and in order to
use it as a string it must be decoded. Different content might require
different encodings, and you have to control how the bytes are converted. This
basically involves telling the stream reader which code page to convert from.
.NET provides a number of tools to facilitate the encoding process, including the Encoding
class, which allows you to easily switch encoding formats for specific
operations. Many classes and conversion tools then use this Encoding class as
a parameter or property to provide their encoding and decoding.
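As a minimal sketch of an Encoding being handed to a reader as a parameter, the following decodes the same bytes twice, once with the right encoding and once with the wrong one. The names DecodeDemo and ReadAll are made up for this illustration, and a MemoryStream stands in for the network stream.

```csharp
using System;
using System.IO;
using System.Text;

public static class DecodeDemo
{
    // Hypothetical helper: reads the whole stream as text using the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // "café" as UTF-8 bytes: the accented letter takes two bytes.
        byte[] raw = Encoding.UTF8.GetBytes("caf\u00e9");

        string right = ReadAll(new MemoryStream(raw), Encoding.UTF8);
        string wrong = ReadAll(new MemoryStream(raw), Encoding.GetEncoding("iso-8859-1"));

        // Same bytes, different strings: the wrong encoding turns one character into two.
        Console.WriteLine("{0} ({1} chars)", right, right.Length);
        Console.WriteLine("{0} ({1} chars)", wrong, wrong.Length);
    }
}
```

The bytes never change; only the Encoding passed to the StreamReader decides what string comes out.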
As for finding out what encoding is used, the HttpWebResponse
object returns a ContentEncoding property,
but unfortunately very few Web servers return this information in their
headers, so it's difficult to dynamically discover which encoding to use.
Code page 1252 is the best all-around choice for Western
content, and I tend to use that as the default if no ContentEncoding
can be determined. The following code is useful when creating an Encoding
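What follows is a sketch of that fallback idea, not the author's original listing. EncodingPicker and Pick are hypothetical names, and charsetName stands for whatever encoding name the response reported. Main assumes a newer runtime (.NET Core 3.0 or later), where code page 1252 has to be registered first; on the classic .NET Framework that line should not be needed.

```csharp
using System;
using System.Text;

public static class EncodingPicker
{
    // Hypothetical helper: returns the Encoding for a charset name,
    // or the fallback when the name is missing or not recognized.
    public static Encoding Pick(string charsetName, Encoding fallback)
    {
        if (string.IsNullOrEmpty(charsetName))
            return fallback;

        try
        {
            return Encoding.GetEncoding(charsetName);
        }
        catch (ArgumentException)
        {
            // The server sent a name this runtime does not know.
            return fallback;
        }
    }

    public static void Main()
    {
        // Assumption: newer runtimes only expose code page 1252 after this call.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding western = Encoding.GetEncoding(1252);

        Console.WriteLine(Pick("utf-8", western).WebName);   // name was reported
        Console.WriteLine(Pick(null, western).WebName);      // nothing reported
        Console.WriteLine(Pick("bogus-set", western).WebName); // unknown name
    }
}
```

Passing the fallback in as a parameter keeps the choice of default (1252 here) in one visible place.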
If you are returning binary data, store this data in a byte array (byte[])
or stream the data directly to whatever output you need to deal with. For
example, if you download a file, don't store it in a string first; stream it
straight into a file on disk.
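A rough sketch of streaming binary data to disk without going through a string might look like this. BinaryDownload and CopyStream are illustrative names, the 8 KB buffer size is an arbitrary choice, and a MemoryStream stands in for the response stream.

```csharp
using System;
using System.IO;

public static class BinaryDownload
{
    // Hypothetical helper: copies input to output in chunks and returns the byte count.
    public static long CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;

        // Read returns 0 once the end of the stream is reached.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        byte[] payload = { 0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF };
        string path = Path.GetTempFileName();

        using (Stream source = new MemoryStream(payload))
        using (Stream file = File.Create(path))
        {
            Console.WriteLine("Copied {0} bytes", CopyStream(source, file));
        }
        File.Delete(path);
    }
}
```

Only one buffer's worth of data is in memory at a time, and bytes such as 0x00 and 0xFF arrive in the file unchanged.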
If you've worked at
all with .NET you've probably found out about streams by now. Streams are
very flexible abstractions for dealing with blocks of data that are,
well, streaming: built from data that is not necessarily complete by the time
you start reading it. Streams are efficient because they read and write data
sequentially for the most part (some streams, such as files, also support
random access). In most cases streams are mapped to things like files or
network I/O inputs and outputs. Streams can also be applied to strings,
memory-mapped files, and any number of other things that require reading and
writing large blocks of data. Streams manage the underlying access to
ensure the integrity of the data, so you can start reading before all of the
data is available. .NET uses streams for most of its network I/O, so
accessing HTTP, FTP, and even sockets provides a fairly consistent interface
across protocols. In these situations you usually end up with an input stream
and an output stream. Both WebRequest and WebResponse (the base classes of
HttpWebRequest and HttpWebResponse) have methods to
return the respective streams, which you write to and read from.
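To illustrate the consistent interface, here is a small sketch that reads a response stream through the WebRequest and WebResponse base classes. ResponseReader and Fetch are made-up names. So that it runs without a network connection, the demo points at a temporary local file, on the assumption that file URIs are handled through the same classes; an http URI would go through the identical calls. Newer .NET versions mark WebRequest.Create as obsolete, so expect a compiler warning there.

```csharp
using System;
using System.IO;
using System.Net;

public static class ResponseReader
{
    // Hypothetical helper: requests a URI and returns the response body as text.
    public static string Fetch(Uri uri)
    {
        WebRequest request = WebRequest.Create(uri);

        using (WebResponse response = request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (var reader = new StreamReader(body))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // A local file stands in for a Web resource in this demo.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello from a stream");

        Console.WriteLine(Fetch(new Uri(path)));
        File.Delete(path);
    }
}
```

Fetch never mentions a protocol: the concrete request class is chosen from the URI scheme, and the caller only sees a Stream.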
.NET strings are objects and
as such require some overhead when they are created. Common operations such
as lcHtml = lcHtml + "One more"; are
very expensive when performed in tight loops. Strings are immutable, so a new
object is created and the old one discarded on each iteration of the loop.
Creating strings of more than a few kilobytes in this manner gets
slow in a hurry! Realizing that string building is a very common task, the
.NET Framework includes a StringBuilder class that
is optimized for manipulating strings as presized
character buffers that data is appended to rather than creating new objects
every time. For large strings StringBuilder can be
hundreds of times faster than plain string concatenation, and it reduces memory
usage considerably. When running in tight loops you should avoid using the +
operator with strings or any objects getting converted to strings. Instead
you can use the AppendFormat method, which appends data
to the StringBuilder using a string template without the overhead of separate string
concatenations.
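A short sketch of the StringBuilder approach, using Append and AppendFormat inside a loop. HtmlBuilder and BuildList are illustrative names, and the initial capacity of 256 characters is an arbitrary guess.

```csharp
using System;
using System.Text;

public static class HtmlBuilder
{
    // Hypothetical helper: builds an HTML list without creating a new string per item.
    public static string BuildList(string[] items)
    {
        var sb = new StringBuilder(256); // presized buffer; it grows if needed
        sb.Append("<ul>");

        for (int i = 0; i < items.Length; i++)
        {
            // The template is filled in and appended in one step.
            sb.AppendFormat("<li>{0}: {1}</li>", i + 1, items[i]);
        }

        sb.Append("</ul>");
        return sb.ToString(); // one final string is created here
    }

    public static void Main()
    {
        Console.WriteLine(BuildList(new[] { "One", "Two", "Three" }));
    }
}
```

The loop body never uses the + operator, so no intermediate strings pile up for the garbage collector.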
Delegates are an important
concept in .NET. They are used frequently in code that implements event
handling or any sort of dynamic code transfer where a calling routine
provides a callback function for a handler process. You can think of a
delegate as a type-safe function pointer. Delegates are actually objects that
encapsulate the function pointer and give the compiler a function
signature that any method assigned to the delegate must match. If
you're familiar with C++, it's like a pointer to a function plus a typedef wrapped into a single object.
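As a small sketch of the callback idea, the delegate type below fixes a signature and the caller decides which method gets invoked. TextFilter, DelegateDemo, and the methods on it are invented for this example.

```csharp
using System;

// A delegate type: any method taking a string and returning a string matches it.
public delegate string TextFilter(string input);

public static class DelegateDemo
{
    public static string Shout(string input) { return input.ToUpperInvariant(); }
    public static string Tidy(string input) { return input.Trim(); }

    // The caller supplies the callback; Apply does not know which method it gets.
    public static string Apply(string input, TextFilter filter)
    {
        return filter(input);
    }

    public static void Main()
    {
        Console.WriteLine(Apply("hello", new TextFilter(Shout)));
        Console.WriteLine(Apply("  padded  ", new TextFilter(Tidy)));
    }
}
```

A method with a different signature, say one that takes an int, would be rejected by the compiler when wrapped in TextFilter, which is the type safety a raw function pointer lacks.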
The most common use of
delegates is an event handler: the event uses the
delegate to call its subscribers. When the event publisher fires the event, the
delegate that is assigned to handle the event is called, and your event
subscriber object can then simply handle the event by implementing a method
in your class.
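The publisher and subscriber roles can be sketched roughly as follows. Downloader, Subscriber, and the Completed event are hypothetical names; EventHandler is the framework's standard delegate type for events.

```csharp
using System;

public class Downloader
{
    // The event is backed by a delegate of type EventHandler.
    public event EventHandler Completed;

    public void Run()
    {
        // ... the real work would happen here ...

        // Fire the event only if somebody subscribed.
        EventHandler handler = Completed;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

public class Subscriber
{
    public int Calls;

    // Matches the EventHandler signature, so it can be attached to the event.
    public void OnCompleted(object sender, EventArgs e) { Calls++; }
}

public static class EventDemo
{
    public static void Main()
    {
        var downloader = new Downloader();
        var subscriber = new Subscriber();

        downloader.Completed += new EventHandler(subscriber.OnCompleted);
        downloader.Run();

        Console.WriteLine("Handler ran {0} time(s)", subscriber.Calls);
    }
}
```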
In multithreaded scenarios,
delegates are used to call the user's thread entry point code. All of this
greatly simplifies handling function pointers.
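For the threading case, a minimal sketch: the ThreadStart delegate wraps the method that serves as the thread's entry point. Worker, DoWork, and RunOnThread are illustrative names, and the value 42 is just a placeholder result.

```csharp
using System;
using System.Threading;

public class Worker
{
    public int Result;

    // Entry point for the thread: no parameters, no return value,
    // which is the signature ThreadStart requires.
    public void DoWork() { Result = 42; }
}

public static class ThreadDemo
{
    // Hypothetical helper: runs the worker on its own thread and waits for it.
    public static int RunOnThread(Worker worker)
    {
        var thread = new Thread(new ThreadStart(worker.DoWork));
        thread.Start();
        thread.Join(); // block until DoWork has finished
        return worker.Result;
    }

    public static void Main()
    {
        Console.WriteLine(RunOnThread(new Worker()));
    }
}
```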
For more information on object interface development, check out the article .NET
Interface-Based Programming in this issue.