Rick Strahl's Weblog
Rick Strahl's FoxPro and Web Connection Weblog

VbScript.RegExp and Dynamic Replacements


March 04, 2007

I was just lamenting the limited VBScript.RegExp object that we’re stuck with in VFP and COM compared to the much richer .NET Regex class. It’s not so much the functionality of the parser itself, which is actually OK in performance and compliance, but the fact that you can’t easily do dynamic Replace operations. In .NET you can pass a delegate that is called for each match and lets you modify the matched value, which is very powerful. There’s no such thing in the base VBScript object.

 

However, after giving this some thought, it’s actually pretty easy to accomplish this in FoxPro code, although doing it in FoxPro code is likely not quite as efficient as letting VBScript do the replacement of values internally. VBScript's Replace method does have a mechanism to reference matched values in the replacement string ("$1" for the first submatch, for example), but it's still too limited if you need to massage the retrieved string.
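To make the limitation concrete, here is a minimal sketch of the built-in Replace using a "$1" reference. The sample text and pattern are made up for illustration and are not from the wrapper class:

```foxpro
*** Built-in Replace: "$1" refers to the first parenthesized submatch
loRegEx = CREATEOBJECT("VBScript.RegExp")
loRegEx.Global = .T.
loRegEx.IgnoreCase = .T.
loRegEx.Pattern = "(</h\d>)<br>"

lcText = "Here we go<h3>Header</H3><br>More Text"   && illustrative input

*** Keeps the closing header tag ($1) and drops the <br> that follows it
? loRegEx.Replace(lcText, "$1")
```

The submatch is inserted as-is. There is no hook to run FoxPro code such as UPPER() or STRTRAN() on it before it goes back into the string.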

 

The trick is to forget about Replace in the RegEx parser and use standard matches instead. You can then run through the list of matches and replace them individually, optionally allowing a FoxPro expression to be used for the replacement operation. This effectively gives you delegate functionality similar to what .NET provides with its MatchEvaluator callbacks.

 

What I came up with is pretty useful (to me at least <s>), as in many cases I can throw out a bunch of manual STRTRAN() code I’ve been using and replace it with a single RegEx expression. While it’s probably not faster to do so, it can certainly reduce code significantly.

 

Anyway here’s the function that’s part of a small RegEx wrapper class:

 

*************************************************************
DEFINE CLASS wwRegEx AS Custom
*************************************************************
*: Author: Rick Strahl
*:         (c) West Wind Technologies, 2007
*:Contact: http://www.west-wind.com
*:Created: 03/04/07
*************************************************************
#IF .F.
*:Help Documentation
*:Topic:
Class wwRegEx

*:Description:
RegEx class based on the VBScript RegEx class. Instance
loads up the VBScript RegEx object and keeps it alive in
this instance.

This class matches the RegEx class signature and then adds
a number of useful high level methods.

*:Example:

*:Remarks:

*:SeeAlso:

*:ENDHELP
#ENDIF

RegEx = null

*** Custom Properties
IgnoreCase = .T.
Global = .T.
MultiLine = .F.

Matches = null

FUNCTION IgnoreCase_Assign(value)
this.RegEx.IgnoreCase = value
*** Keep the wrapper property in sync with the COM object
this.IgnoreCase = value
ENDFUNC

FUNCTION Global_Assign(value)
this.RegEx.Global = value
this.Global = value
ENDFUNC

FUNCTION MultiLine_Assign(value)
this.RegEx.MultiLine = value
this.MultiLine = value
ENDFUNC

*** Stock Properties

************************************************************************
* wwRegEx ::  Init
****************************************
***  Function:
***    Assume:
***      Pass:
***    Return:
************************************************************************
FUNCTION Init()

this.RegEx = CREATEOBJECT("VBScript.RegExp")
this.RegEx.Global = .T.
this.RegEx.IgnoreCase = .T.

ENDFUNC
*  wwRegEx ::  Init


*** RegEx base methods omitted

************************************************************************
* wwRegEx ::  Replace
****************************************
***  Function: Replaces any RegEx matches found in a source string
***            with the replace string or expression
***    Assume: NOTE: very different from native Replace method
***      Pass: lcSource
***            lcRegEx
***            lcReplace   -   String or Expression to replace with
***            llIsExpression - if .T. lcReplace is EVAL()'d
***
***            Expression can use a value of lcMatch to get the
***            current matched string value.
***    Return: updated string
************************************************************************
FUNCTION Replace(lcSource,lcRegEx,lcReplace,llIsExpression)
LOCAL loMatches, lnX, loMatch, lcRepl, lnCount

*** lcMatch is PRIVATE rather than LOCAL so that the evaluated
*** expression and any functions it calls can see it
PRIVATE lcMatch

this.RegEx.Pattern = lcRegEx
loMatches = this.RegEx.Execute(lcSource)

lnCount = loMatches.Count

IF lnCount = 0
   RETURN lcSource
ENDIF

lcRepl = lcReplace

*** Note we have to go last to first to not hose
*** relative string indexes of the match
FOR lnX = lnCount - 1 TO 0 STEP -1
   loMatch = loMatches.Item(lnX)
   lcMatch = loMatch.Value
   IF llIsExpression
      *** Evaluate dynamic expression each time
      lcRepl = EVAL( lcReplace )
   ENDIF
   *** FirstIndex is zero-based, STUFF() is one-based
   lcSource = STUFF(lcSource,loMatch.FirstIndex+1,loMatch.Length,lcRepl)
ENDFOR

RETURN lcSource
ENDFUNC
* wwRegEx ::  Replace

ENDDEFINE
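The last-to-first loop matters because every STUFF() call can change the length of the string, which shifts the position of everything after it. A small worked sketch, using a made-up string in which "X" stands in for the two matches (zero-based FirstIndex 1 and 3, so STUFF() positions 2 and 4):

```foxpro
lcText = "aXbXc"

*** Front to back: after the first replacement the second "X" has moved
*** from position 4 to position 6, so the stale index hits the wrong character
? STUFF(STUFF(lcText, 2, 1, "123"), 4, 1, "123")   && a12123bXc

*** Back to front: the earlier match's position is untouched by the later edit
? STUFF(STUFF(lcText, 4, 1, "123"), 2, 1, "123")   && a123b123c
```

Walking the Matches collection in reverse means the FirstIndex values captured by Execute() stay valid for every match that is still to be processed.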

 

 

With this replace function you can do a global replace like this:

 

lcText = "Here we go<h3>Header</H3><br>More Text <h2>Header again</H2><br>asasa"
? lcText

loRegEx = CREATEOBJECT("wwRegEx")
loRegEx.IgnoreCase=.T.

? loRegEx.Replace(lcText,[</h\d><br>],"STRTRAN(lcMatch,[<br>],[])",.T.)

 

Notice that the replacement is passed as an expression (the last parameter is .T.), which can be any FoxPro code that can be evaluated using the matched string value. In this case the pattern matches any string of the form </hX><br>, and I basically want to strip the <br> out of the matched value. So I can use STRTRAN() to strip off the <br> from the match. The “lcMatch” variable is available in any expression passed and holds the matched value, which the expression transforms into the replacement.

 

Now the example here is very simple, but of course you can route the ‘delegate’ to a function of your own, so you could run some very sophisticated logic on the match to produce the replacement value. This opens up RegEx to some powerful dynamic replacement options that make RegEx a hell of a lot more useful in VFP/COM.
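As a rough sketch of routing the replacement to your own function: NormalizeTag() below is a hypothetical helper made up for this example, and the code assumes the wwRegEx class is already loaded (for instance via SET PROCEDURE TO with whatever file you keep it in).

```foxpro
*** Assumes the wwRegEx class shown earlier is in scope
lcText = "Here we go<h3>Header</H3><br>More Text <h2>Header again</H2>"

loRegEx = CREATEOBJECT("wwRegEx")

*** Each opening or closing header tag is handed to NormalizeTag()
? loRegEx.Replace(lcText, [</?h\d>], "NormalizeTag(lcMatch)", .T.)

*** Hypothetical helper: receives the matched text, returns the replacement
FUNCTION NormalizeTag(lcTag)
RETURN LOWER(lcTag)
ENDFUNC
```

Passing lcMatch as a parameter keeps the helper independent of how the variable is scoped inside Replace(). The function can be as elaborate as you need, as long as it returns a string.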

 

Well, I’m off retrofitting some existing code now <s>…
