I've been thinking about and messing with HTML editing a lot lately, and it's giving me a headache ...

Basically I've gotten to the point where just about all of my document based UIs use HTML for displaying content. HTML output is relatively easy to generate and it works well. The Message Board here, the Web Log, Help Builder, the Wiki and the Message Board Reader are typical examples of applications that use a rich HTML display surface to show data.

I would also like to edit text in this environment in a rich way. Most Web applications these days provide input in plain text boxes and this is problematic in a number of ways. First, obviously, you can't have rich display attributes. But it's also a problem if you need to post something that has intrinsic formatting such as HTML or XML or even a block of code. Posting those sorts of things in a text box will produce undesirable results, to say the least.

I have in the past resorted to a sort of hack to deal with this by allowing a special edit syntax (double chevrons instead of single chevrons) for HTML markup. That makes it possible to let people post HTML and XML with ordinary single chevrons that will NOT be interpreted but rather be displayed as typed. The idea behind this seemingly backwards approach is that the majority of text we type is, well, text, and markup makes up the minimal bit of it. I then take this 'marked up' plain text and fix it up with basic formatting features like inserting <p> and <br> tags automatically. This works reasonably well if you work with 'programmer' types (well, on reflection even then it gets iffy at times) but for end users that's not a great solution.
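A minimal sketch of the double-chevron idea, in Python for brevity. This is not the code behind the Message Board; the function name plain_to_html is made up for illustration, and it assumes the automatic fix-up means paragraph tags for blank-line-separated blocks and break tags for single line breaks.

```python
import html
import re


def plain_to_html(text: str) -> str:
    """Render 'marked up' plain text as HTML (hypothetical helper).

    Single chevrons are escaped so they display literally; double
    chevrons such as <<b>> ... <</b>> become real markup.
    """
    # Escape &, < and > so ordinary HTML/XML in the post is shown, not interpreted.
    escaped = html.escape(text, quote=False)
    # After escaping, "<<" reads "&lt;&lt;" and ">>" reads "&gt;&gt;".
    escaped = escaped.replace("&lt;&lt;", "<").replace("&gt;&gt;", ">")

    paragraphs = []
    # Assumption: blank lines separate paragraphs, single newlines become <br>.
    for block in re.split(r"\n\s*\n", escaped.strip()):
        paragraphs.append("<p>" + block.replace("\n", "<br>\n") + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(plain_to_html("Type <b> to show a tag.\nType <<b>>bold<</b>> to use one."))
```

One obvious weakness, and part of why this gets iffy even with programmers: a literal "<<" in posted code (a shift operator, say) is indistinguishable from the start of a real tag.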

The next option is to actually do HTML editing on your forms. I've used the IE Web Browser control both in Web apps and in Windows applications for providing HTML editing. In fact, this is what I've been struggling with for countless hours today, once again trying to build a decent two-way editor that works for a fairly dynamic application (Help Builder in this case) that needs the ability to frequently post code, HTML and XML as well as code expressions that evaluate.

The Web Browser's edit features are cool on first test, but incredibly incomplete, poorly documented and buggy once you go beyond the basics. If all you're after is editing relatively simple text with things like bold, italic, hyperlinks and a few other basic tasks, the control works just fine and is very easy to use.

In fact, I'm writing this Web Log entry right now using FreeTextBox, which is an ASP.Net control that's part of .Text and available as a third party utility. The control is quite nice and very easy to simply drop into a project and off you go.

The problems start if you need to do things that are a little more complicated, like editing tables or adding other structures to a document. Code in particular seems to be very difficult to deal with. Most of the HTML editing controls I've seen don't include the ability to mark up code. If you bring in code it will be turned into a plain HTML string which loses all leading spaces. It's relatively easy to fix this, but using <pre> tags in the editing environment brings a few surprises (which is probably why most such tools don't include it on their toolbars).
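For illustration, here is roughly what the "relatively easy" fix for lost indentation can look like: escape the code and wrap it in a pre element before handing it to the editor. This is a sketch in Python with a made-up helper name (code_to_html) and an assumed tab width of 4, not something any of the editing controls provide.

```python
import html


def code_to_html(code: str, tab_width: int = 4) -> str:
    """Wrap a code snippet so its indentation survives HTML rendering."""
    # Expand tabs to spaces so indentation looks the same everywhere
    # (the tab width of 4 is an assumption).
    normalized = code.expandtabs(tab_width).rstrip("\n")
    # Escape &, < and > so markup inside the snippet is displayed, not interpreted.
    return "<pre>" + html.escape(normalized, quote=False) + "</pre>"


if __name__ == "__main__":
    print(code_to_html("if x < 1:\n\treturn"))
```

This only covers getting the code into the document. It does nothing about what the editing control does to the block afterwards.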

For me this means I usually write entries to my Web Log by writing the text in Word and then pasting it into the HTML control. It's very nice that this works so easily, but the code Word pastes is NASTY and bloated. If you ever have to manually edit it - good luck.

Along the same lines, Visual Studio also has some major funkiness when you copy code from it. As you probably know, the syntax coloring is preserved if you paste into Word or into an HTML edit control. But under certain circumstances it's interesting how easily the HTML gets deformed (for example, pasting directly into the HTML edit control will yield mangled code).

So, today I spent a good part of my day working on the edit control in Help Builder that should allow switching between edit and design modes easily. I got 95% of this working smoothly, but there are a number of little bugs that are driving me insane.

The code formatting is among the most problematic. As soon as IE gets a hold of any HTML to render, the HTML gets scrunched together and becomes unreadable. I managed to build a code matcher that will flip between design and HTML modes and remember the location, but this is spotty as it relies on the parent node of the current selection. Apparently not every element in a document has a parent node.

My worst problem comes from the fact that the HTML I'm posting might contain script blocks that the user inserts via some toolbar options. For example, a help topic crosslink gets inserted as a dynamic link that looks something like this:

<%= TopicLink([Test],[Test Topic]) %>

Unfortunately the edit control does not allow pasting this script block HTML into the document. So now I have to translate this stuff into something else first and manage the string during the HTML -> Text -> HTML conversions. Fun stuff... Other fun stuff includes the fact that selection pastes seem to eat spaces on a number of occasions.
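The translation step amounts to something like the following sketch: swap each script block for a harmless token before the text goes into the editor, then swap the originals back in on the way out. It is written in Python for brevity and is not the Help Builder code; the [[SCRIPT:n]] token format and both function names are invented for illustration.

```python
import re

# Matches <%= ... %> expression blocks, including ones that span lines.
SCRIPT_BLOCK = re.compile(r"<%=.*?%>", re.DOTALL)
# Hypothetical placeholder format; anything the editor leaves alone would do.
TOKEN = re.compile(r"\[\[SCRIPT:(\d+)\]\]")


def protect_script_blocks(text):
    """Replace each script block with a numbered token; return text and blocks."""
    blocks = []

    def stash(match):
        blocks.append(match.group(0))
        return f"[[SCRIPT:{len(blocks) - 1}]]"

    return SCRIPT_BLOCK.sub(stash, text), blocks


def restore_script_blocks(text, blocks):
    """Put the original script blocks back in place of their tokens."""

    def unstash(match):
        index = int(match.group(1))
        # Leave unknown tokens untouched rather than failing.
        return blocks[index] if index < len(blocks) else match.group(0)

    return TOKEN.sub(unstash, text)


if __name__ == "__main__":
    source = "See <%= TopicLink([Test],[Test Topic]) %> for details."
    safe, saved = protect_script_blocks(source)
    print(safe)
    print(restore_script_blocks(safe, saved) == source)
```

The catch is that the token has to survive whatever the editor does to the surrounding text, and the user can still delete or edit a token by hand, so the restore step has to tolerate missing or unknown tokens.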

I'm starting to appreciate the reasons that the VS.Net Visual Editor sucks so bad.

Oh yes, there was a point to this post. It seems to me HTML editing - or a unified rich text editing environment or control - is something that is sorely missing in Windows today. Given that we are moving more and more into a document centric environment, the lack of a decent control to do this is a major hindrance to UI development. Many people go down this path over and over again, and to date I have not seen a control that comes even close to addressing these issues.

Given that this is such a common thing it seems like a major omission on Microsoft's part. It's interesting to me that the Web Browser Control and the HTML editing components are both so buggy and badly documented/implemented. .Net in particular has very limited support for this stuff and COM interop with it is crappy at best. Whidbey promises a new managed WebBrowser control, but from what is in the PDC build it doesn't provide nearly enough functionality to be useful (no editing support at all, limited DOM support - no selections for example).