After experimenting with various modes of Word integration, mainly to provide spell checking in Help Builder, I decided to provide a simpler route in addition to the full-fledged Word insertion I talked about a couple of days ago.

 

Doing a spell check in Word is pretty simple if you have plain text to deal with: it only takes a few lines of code to automate Word and pop up the Spelling dialog interactively. If you keep Word invisible, the dialog pops up right on top of your application like this:

[Image: Word's Spelling dialog displayed on top of Help Builder]

In the image above, Help Builder is performing a spell check on the body content, which is in HTML edit mode, meaning the content is not plain text but HTML. Spell checking an HTML document is not a trivial task, so the function below includes some logic to handle HTML as well.

 

************************************************************************
* wwUtils :: SpellCheck
****************************************
***  Function: Uses Word's Spell Check functionality to spell check
***            a string of text interactively
***    Assume: Requires Word 2000 and IE 4 or later COM objects
***      Pass:  lcText    - the text to spell check
***             llIsHtml  - if .t. the text is HTML
***    Return: Spellchecked text
************************************************************************
FUNCTION SpellCheck(lcText,llIsHtml)
LOCAL loWord, loDoc, loIEDoc, loError, lcTText, x, y
LOCAL ARRAY laErrors[1,2]

IF EMPTY(lcText)
   RETURN ""
ENDIF

*** ISCOMOBJECT() is not a native VFP function; it must be
*** available in the calling environment
IF !ISCOMOBJECT("Word.Application")
   RETURN lcText
ENDIF

IF !llIsHtml
   *** Plain Text - simple assign and retrieve
   loWord = CREATEOBJECT("Word.Application")
   loDoc = loWord.Documents.Add(,,1,.T.)

   loDoc.Content.Text = lcText
   loDoc.CheckSpelling()
   lcText = loDoc.Content.Text

   *** Close without saving and release all references
   loWord.Visible = .f.
   loDoc.Close(.f.)
   loDoc = .null.

   loWord.Quit(.f.)
   loWord = .null.

   RETURN lcText
ENDIF

*** HTML - load into IE, retrieve text, replace
***        changed text
LOCAL loIE as InternetExplorer.Application
loIE = CREATEOBJECT("InternetExplorer.Application")
loIE.Navigate("about:blank")

DO WHILE loIE.Busy
   DOEVENTS
ENDDO

loIEDoc = loIE.Document
loIEDoc.Body.innerHtml = lcText

lcTText = loIEDoc.Body.innerText

*** Shut IE down explicitly so no hidden instance keeps running
loIEDoc = .null.
loIE.Quit()
loIE = .null.

loWord = CREATEOBJECT("Word.Application")
*loWord.WindowState= 2  && wdWindowStateMinimize

loDoc = loWord.Documents.Add(,,1,.T.)

loDoc.Content.Text = lcTText

*** Pick up all the error text
x = 0
FOR EACH loError IN loDoc.SpellingErrors
     x = x + 1
     DIMENSION laErrors[x,2]
     laErrors[x,1] = loError        && Range object, updated by Word
     laErrors[x,2] = loError.Text   && Old text
ENDFOR
loError = .null.

IF x > 0
   loDoc.SpellingErrors.Item(1).CheckSpelling()
ENDIF

FOR y = 1 TO x
    *** Use == for an exact comparison: with SET EXACT OFF the # operator
    *** treats "then" and "the" as equal and would skip the replacement
    IF !(laErrors[y,1].Text == laErrors[y,2])
       lcText = STRTRAN(lcText,laErrors[y,2],laErrors[y,1].Text)
    ENDIF
    laErrors[y,1] = .null.
ENDFOR

*** Close without saving and release all references
loWord.Visible = .f.
loDoc.Close(.f.)
loDoc= .null.
loWord.Quit(.f.)
loWord = .null.

RETURN lcText
*  wwUtils :: SpellCheck

 

 

The basic spell check is pretty simple: create a Word instance, add a new document, assign the text to the Content object, run the spell check, then read the text back out and shut down Word. It’s very important to shut Word down properly (close the document, call Quit and release both references), or you’ll get hung references and Word will keep running invisibly. Note also that I don’t make Word visible. The Spelling dialog is independent of the Word document container and pops up on top of the current application. It’s not modal and not tied to your application, so clicking anywhere else causes the dialog to disappear (or rather go to the bottom of the window stack).
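
The shutdown sequence matters most when something fails halfway through, for example if Word raises a COM error while the document is open. As a minimal sketch, assuming VFP 8 or later (for TRY/FINALLY), the plain text branch could be wrapped so the cleanup always runs. The name SpellCheckPlainSafe is hypothetical and not part of wwUtils:

```foxpro
*** Hypothetical helper, not part of wwUtils - assumes VFP 8 or later
FUNCTION SpellCheckPlainSafe(lcText)
LOCAL loWord, loDoc, lcResult, loException
loWord = .NULL.
loDoc = .NULL.
lcResult = lcText

TRY
   loWord = CREATEOBJECT("Word.Application")
   loDoc = loWord.Documents.Add(,,1,.T.)

   loDoc.Content.Text = lcText
   loDoc.CheckSpelling()
   lcResult = loDoc.Content.Text
CATCH TO loException
   *** On any error fall back to the original, unchecked text
   lcResult = lcText
FINALLY
   *** Close the document and quit Word without saving,
   *** even when an error occurred above
   IF VARTYPE(loDoc) = "O"
      loDoc.Close(.F.)
   ENDIF
   loDoc = .NULL.

   IF VARTYPE(loWord) = "O"
      loWord.Quit(.F.)
   ENDIF
   loWord = .NULL.
ENDTRY

RETURN lcResult
ENDFUNC
```

The VARTYPE() checks are there because either object may never have been created if the error happened early. VARTYPE() returns "X" for a .NULL. value, so the cleanup calls are simply skipped in that case.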

 

For HTML the code uses the InternetExplorer.Application object to retrieve the innerText of the HTML document. This is the easiest way to retrieve just the text from an HTML document, although there are other ways, including manual parsing and stripping of tags. The plain text is then passed to Word for spell checking. Before the dialog runs, the code captures each spelling error in an array, storing both the Word range object and its original text. After the spell check is complete, it runs through the array, compares each range’s current text to the original, and replaces every changed word in the original HTML document.
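
To show both modes side by side, here is a short usage sketch. It assumes the function has been made available to the calling code, for example through SET PROCEDURE; the sample strings and variable names are made up for illustration:

```foxpro
*** Assumption: SpellCheck() lives in wwUtils.prg
SET PROCEDURE TO wwUtils ADDITIVE

*** Plain text: the corrected text comes straight back from Word
lcNotes = "Thsi sentence has a speling mistake."
lcNotes = SpellCheck(lcNotes)

*** HTML: pass .T. so only the changed words are replaced
*** and the markup itself is left alone
lcBody = "<p>Thsi is <b>bold</b> text with a speling mistake.</p>"
lcBody = SpellCheck(lcBody, .T.)
```

In both cases the call blocks until the user finishes with or cancels the Spelling dialog, and the text comes back unchanged if Word is not installed.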

 

This works fairly well, but with the HTML portion there are a couple of issues to watch out for:

 

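Replacements replace all instances of the changed text, so it’s possible to replace text that shouldn’t be replaced. STRTRAN() matches plain substrings, so a short misspelled fragment like IN is also replaced wherever it occurs inside other words or inside HTML tags and attributes.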

 

innerText is not always well formed text. For example, text in two adjacent cells runs together without spaces. This might give you a few false positives during spell checking that don’t require any action.
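
The first issue can be reduced by replacing only whole words. The following is a rough sketch rather than what the function above does: it swaps STRTRAN() for the VBScript.RegExp COM object (assumed to be installed, as it is with the Windows Script Host) and uses \b word boundaries. The name ReplaceWholeWord is hypothetical:

```foxpro
*** Hypothetical helper, not part of wwUtils
*** Replaces lcOld with lcNew only where lcOld is a whole word
FUNCTION ReplaceWholeWord(lcText, lcOld, lcNew)
LOCAL loRegEx, lcPattern, lcChar, lnI

*** Escape everything that is not a letter or digit so the
*** misspelled word is matched literally
lcPattern = ""
FOR lnI = 1 TO LEN(lcOld)
   lcChar = SUBSTR(lcOld, lnI, 1)
   IF ISALPHA(lcChar) OR ISDIGIT(lcChar)
      lcPattern = lcPattern + lcChar
   ELSE
      lcPattern = lcPattern + "\" + lcChar
   ENDIF
ENDFOR

loRegEx = CREATEOBJECT("VBScript.RegExp")
loRegEx.Global = .T.        && replace every whole-word match
loRegEx.IgnoreCase = .F.    && case sensitive, like STRTRAN()
loRegEx.Pattern = "\b" + lcPattern + "\b"

*** A $ in the replacement text has special meaning, so double it
RETURN loRegEx.Replace(lcText, STRTRAN(lcNew, "$", "$$"))
ENDFUNC
```

In the replacement loop the STRTRAN() call would then become lcText = ReplaceWholeWord(lcText, laErrors[y,2], laErrors[y,1].Text). This still matches whole words that happen to sit inside tags or attribute values, and \b in VBScript.RegExp only treats ASCII letters, digits and underscores as word characters, so words with accented letters may not match as expected.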

 

It’s not perfect for HTML but workable – certainly better than no spell checking at all.