Note to self: Remember that the COM RegEx parser doesn't deal with the . operator the same way in multi-line content as .NET or most other RegEx parsers do. I've just spent 20 minutes troubleshooting a regular expression that works just fine in RegexBuddy and .NET code, but failed in one of my FoxPro apps here using the COM VBScript.RegExp parser.

The code I was working on required stripping out @Register tags from an ASP.NET-style markup document:

TEXT TO lcText  NoShow
<!-- 
     * Set the name of your class in the ID property
     * Set the GeneratedSourceFile at a PRG file in your FoxPro project directory
     * NOTE: the path is relative to your executing directory (CURDIR())
     * Remove this block of comment text
-->
<%@ Page Language="C#"          
         GeneratedSourceFile="devDemo/BasePage.prg"
         ID="BasePage_Page"
         AuthenticationMode="Basic"         
%>
<%@ Register Assembly="WebConnectionWebControls" 
    Namespace="Westwind.WebConnection.WebControls"
    TagPrefix="ww" %>

<%@ Register Assembly="WebConnectionWebControls" 
    Namespace="Westwind.WebConnection.WebControls.Customization"
    TagPrefix="ww" %>
... more HTML here
<form id="form1" runat="server">           
     
ENDTEXT

LOCAL loRegEx as VBScript.RegExp 
loRegEx = CREATEOBJECT("VBScript.RegExp")
loRegEx.IgnoreCase = .T.
loRegEx.Global = .T.
loRegEx.MultiLine = .T.
loRegEx.Pattern = '<%@\s{0,}Register.*?%>\s{0,}'

? loRegEx.Replace(lcText,"")
RETURN

So I started out with the above expression to match and then remove the entire @Register tags:

loRegEx.Pattern = '<%@\s{0,}Register.*?%>\s{0,}'

using the . to specify any character in a multi-line expression to parse. This doesn't work because apparently the . operator in the VBScript RegEx parser doesn't match the newline character, so the expression effectively only matches within a single line. This is despite the MultiLine option, which only affects how the ^ and $ (beginning and end of line) anchors are handled by the RegEx parser.
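
To see the two behaviors side by side, here is a minimal sketch. The sample string and variable names are made up for illustration, and the values in the comments are what I'd expect based on the behavior described above:

```foxpro
* Sketch: how "." and MultiLine behave in VBScript.RegExp
LOCAL loRegEx AS VBScript.RegExp, lcSample
loRegEx = CREATEOBJECT("VBScript.RegExp")

* Two lines separated by a line feed (made-up sample text)
lcSample = "one" + CHR(10) + "two"

* "." stops at the line feed, so this pattern cannot span both lines
loRegEx.Pattern = "one.two"
? loRegEx.Test(lcSample)      && .F.

* MultiLine only changes the anchors: ^ can now match at the start of line 2
loRegEx.Pattern = "^two"
loRegEx.MultiLine = .F.
? loRegEx.Test(lcSample)      && .F.
loRegEx.MultiLine = .T.
? loRegEx.Test(lcSample)      && .T.

* ...but "." still does not cross the line feed
loRegEx.Pattern = "one.two"
? loRegEx.Test(lcSample)      && .F.
```

So MultiLine is really about ^ and $, not about what . is allowed to match.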

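There are a couple of ways around this. What I used here was to simply replace the . with [\s,\S], which matches essentially every character (any whitespace or non-whitespace character; the comma just adds a redundant literal comma to the set, so [\s\S] is enough):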

loRegEx.Pattern = '<%@\s{0,}Register[\s,\S]*?%>\s{0,}'

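Or to be more explicit, (.|\n) should also provide the same results. Note the parentheses: inside square brackets the . and | are treated as literal characters, so [.|\n] only matches a dot, a pipe or a newline.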
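
For future reference, a small sketch that compares the original pattern with the character class version. The lcSample markup is a shortened, made-up stand-in for the document above, and I've dropped the comma from the class since it isn't needed:

```foxpro
* Sketch: "." versus [\s\S] on a tag that spans two lines
LOCAL loRegEx AS VBScript.RegExp, lcSample
loRegEx = CREATEOBJECT("VBScript.RegExp")
loRegEx.IgnoreCase = .T.
loRegEx.Global = .T.

* Made-up sample: one @Register tag broken across two lines, then a form tag
lcSample = '<%@ Register Assembly="Demo"' + CHR(13) + CHR(10) + ;
           '    TagPrefix="ww" %>' + CHR(13) + CHR(10) + ;
           '<form id="form1">'

* "." cannot cross the line break, so no match and the text comes back unchanged
loRegEx.Pattern = '<%@\s{0,}Register.*?%>\s{0,}'
? loRegEx.Replace(lcSample, "")

* [\s\S] matches any character including line breaks, so the whole tag
* plus the trailing whitespace is removed and only the form tag is left
loRegEx.Pattern = '<%@\s{0,}Register[\s\S]*?%>\s{0,}'
? loRegEx.Replace(lcSample, "")
```

The lazy *? matters here: with more than one @Register tag in the document, a greedy * would swallow everything from the first tag through the last %>.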

My short-term memory is going bad. Just as I got this working I ran into some older code (in the same program file even!) where I had apparently done exactly the same thing previously, using [\s,\S] instead of the . operator. Nothing like solving the same problem twice, eh? Hopefully this time after writing it up I'll remember. <g>

In general I wish I could remember more of the little bit of RegEx work I do. Even better, some of what other people do, he he. I appreciate the power of RegEx, but it seems whenever I do anything with RegEx it takes forever to do even simple things, and once it's done I immediately and completely forget the syntax and process that went into figuring it out. No retention there for me. Case in point here. Next time maybe I'll remember.