Working with Unicode data

Visual FoxPro 9.0

Visual FoxPro 9.0 has a new feature in SYS(987,.T.) which allows automatic mapping of Sql Server Unicode types to the current FoxPro ANSI CodePage. This means starting with VFP 9 it's much easier to access and write fields of nText, nChar or nVarChar.

In the wwSQL Class simply set the EnableUnicodeToAnsiMapping property after instantiating the wwSql class to enable this functionality:

THIS.owwStoreSQL = CREATEOBJECT("wwSQL") this.owwStoreSql.EnableUnicodeToAnsiMapping() IF !THIS.owwStoreSQL.Connect(THIS.oConfig.owwStore.cSQLConnection) WAIT WINDOW "Failed to connect to SQL Server" TIMEOUT 5 RETURN ENDIF ...

From then on in VFP 9.0 will see Unicode text automatically translated back and forth between ANSI and Unicode.

Visual FoxPro 8.0 and Earlier

Visual FoxPro 8.0 and earlier doesn't support Unicode data natively, but SQL Server allows storage of nText and nVarChar fields that can contain Unicode data. Visual FoxPro has no native data types that can map to these fields, so if you want to use this data returned from SQL Server you can only return it as strings (or binary).

The problem is that VFP doesn't support Unicode character sets and must convert strings into an ANSI string with a matching codepage. VFP can manage only one ANSI codepage at a time (the one configured for the system) which is the only codepage it can effectively convert into VFP strings on a given system.

This is a big problem if you need to work with data from multiple codepages (such as Chinese, Korean, Russian and Polish for example).

The partial solution to this problem is to pull data from SQL Server in UTF-8 format. UTF-8 is an Ansi character set representation of Unicode which encodes extended characters as multi-byte combinations. Strings in this format are not intelligable by human readers if extended characters are included (such as Chinese text will consist mostly of upper ASCII character), but the string actually maintains its Unicode signature through this encoding.

While UTF-8 cannot be displayed easily in the VFP IDE, it can be easily displayed in HTML and Web Pages which typically encode pages in UTF-8 anyway. Incoming data from Web pages often is also UTF-8 encoded so for Web application passing UTF-8 through a system is actual efficient.

One important aspect to Unicode compatibility is the fact that COM always uses Unicode. Whenever you call a COM object you actually are passing a VFP ANSI string that gets converted to a Unicode string. By default COM and VFP use the current machine's encoding to convert the string to Unicode. However you can override this behavior by using the SYS(3101) function which allows you to specify a Codepage. You can use 65001 which is the Codepage for UTF-8, which means that you can pass and retrieve strings from a COM object via UTF-8 format.

What does this have to with SQL? Well, ADO is a COM based data interface and thus allows us to pass UTF-8/Unicode data to it via COM.

The problem with ODBC

So what does this mean to your VFP applications when dealing with SQL Server? If you use ODBC to retrieve Unicode data from the Web server, VFP will try to convert the data into the current codepage. And there's really no way to override how this conversion occurs. So when you try to retrieve Chinese Unicode data from an English machine, you will get nothing but ???? ??? data returned to you from ODBC. VFP completely looses the original data and simply tells you it doesn't knwo what to do with it.

With ODBC you can hack around this to some degree by casting the data to Binary:

IF loSQL.Execute([select ID,] +; [CAST( CAST( Descript as nVarChar(4000) ) as VarBinary(8000) ) as Note, ] +; [CAST( CAST( lDescript as nVarChar(4000) ) as VarBinary(8000) ) as Description, ] +; [lDescript2 ] +; [from ForeignData],"TFData") = -1

which returns the data to you in binary format. From there you can get the data into UTF-8 format from within VFP with:

SELECT Id,; PADR(STRCONV(Note,10) as Note , ; PADR(STRCONV(Description,10) as Description ; FROM TFData ; INTO CURSOR TFData2

To get data back into server you can then do the reverse. The following example retrieves some data captured from a UTF-8 encoded Web page (or via COM with SYS(3101,65001) in ASP):

*** Read the raw input data as UTF-8 from Web Form
pcSavedDescription = Request.Form("txtDescription")

*** Convert UTF-8 to Unicode and store as BINARY string!!!
pcSavedDescription = STRCONV(pcSavedDescription,12)

pcSavedTitle = Request.Form("txtTitle")
pcSavedTitle = STRCONV(pcSavedTitle,12)

pcId = SYS(2015)
loSql.Execute(;
[insert into ForeignData (ID,Descript,lDescript) values ] +;
[( ?pcID, CAST(?pcSavedTitle as nVarChar(4000)),CAST(?pcSavedDescription as nVarChar(4000) ) ) ] )

This works, but it's pretty messy code that has many conversions that occur on the server which is both slow and hard to write into SQL Statements. It gets much worse if you need to pass parameters to the Sql Server with Unicode data. You have to CAST() the parameter which means you can't use VFP's SP parameter syntax.

In addition, there's a major limitation here in that you cannot retrieve nText data that exceeds 4000 characters. There is no way to convert nText to binary without the intermediate conversion to nVarChar which is limited to 4000 characters.

Workable in some scenarios where you already have code in place but you need to make a couple of queries/updates to retrieve data, but this is definitely not a soltution for end to end Unicode integration.

ADO to the rescue

ADO works via COM and thus suffers none of the problems that VFP suffers in retrieving Unicode data via ODBC. ADO returns data in COM objects - a RecordSet that has fields - and manages that data internally. Internally the data remains in Unicode until you retrieve the data from these objects.

If you DON'T use SYS(3101) you'll have the same issues mentioned earlier in that VFP converts tot he default codepage. However, by setting SYS(3101,65001) you can tell ADO to return all of your data in UTF-8 format, which means you can run normal SQL statements against ADO that return Unicode data and it gets returned to you in UTF-8 format.

The one drawback here is that if you set the codepage in this fashion you need to make sure you also pass all parameters into ADO with UTF-8.

CursorAdapter to Help

The final piece that makes life a bit easier for VFP developers who want work with Cursor data rather than the ADO object is the CursorAdapter. The CursorAdapter allows connecting to an ADO data source and retrieve data into a cursor. The data can even be optionally updateable.

The CursorAdapter supports a ADOCodePage property which can also be set to 65001 to tell it to retrieve data from ADO into the cursor in UTF-8 format.

The following code retrieves a SQL statement with Unicode data into VFP Cursor containing all string data in UTF-8 format:

pcLang="Korean" && Our query parameter SYS(3101,65001) LOCAL oConn as ADODB.Connection oConn = CREATEOBJECT("ADODB.Connection") oConn.ConnectionString = "Provider=sqloledb;Data Source=(local);Initial Catalog=WestWindAdmin;Integrated Security=SSPI" oConn.Open() LOCAL oCommand as ADODB.Command oCommand = CREATEOBJECT("ADODB.Command") oCommand.ActiveConnection = oConn LOCAL oRS as ADODB.RecordSet oRS = CREATEOBJECT("ADODB.RecordSet") oRS.ActiveConnection = oConn LOCAL oCA as CursorAdapter oCA = CREATEOBJECT("CursorAdapter") oCA.ADOCodePage = 65001 oCA.Alias = "TSqlQUery" oCA.DataSourceType = "ADO" oCA.DataSource = oRS pcLang="Korean" oCA.SelectCmd = "Select * from foreigndata where descript=?pcLang" IF !oCA.CursorFill(,,,oCommand) AERROR(laErrors) ? lAERRORS[1] ? lAERRORs[2] RETURN ENDIF BROWSE *** Make free standing cursor oCA.CursorDetach() RETURN

The nice thing about the CursorAdapter is that it allows you to keep SQL Passthrough style syntax for your SQL statements so you can simply pass in parameters using the ?pcLang syntax shown above that is compatible with existing VFP SP queries.

If you want to update you can also make the CursorAdapter updateable. However, I personally prefer sending direct SQL commands to the backend because in most of my applications I have business objects that perform update logic themselves.

Updates and Non-Queries with plain ADO

* wwSQL::ExecuteNonQuery
LPARAMETER lcSQL, llStoredProc

IF !this.Connect()
RETURN .f.
ENDIF

lnOldCp = SYS(3101)
IF this.ncodepage # 0
SYS(3101,this.nCodePage)
ENDIF

this.cSql = lcSql

LOCAL oCommand as ADODB.Command
oCommand=CREATEOBJECT("ADODB.Command")
oCommand.ActiveConnection=this.oadoconn

*** If 'manual' stored proc
IF LOWER(lcSql) = "exec" OR LOWER(lcSQL) = "call"
llStoredProc = .t.
lcSql = GETWORDNUM(lcSQL,2," (" )
ENDIF

IF llStoredProc
this.nCommandMode = 4 && AdcmdStoredProc
oCommand.CommandType = 4
ELSE
this.nCommandMode = 1 && adCmdText
ENDIF

*** Assign the SQL command Text so parms ALWAYS get parsed
oCommand.CommandText = lcSql

*** Parse parameters from oParameters collection
*** OR using VFP syntax
THIS.AssignCommandParameters(oCommand)

*? oCommand.CommandText

oCommand.Execute(,,128)

*** Loop through and update parameters
IF llStoredProc OR LOWER(lcSQL) = "exec"
FOR EACH Parm IN oCommand.Parameters
IF Parm.Direction = 2 OR Parm.Direction = 3
this.oParameters[Parm.Name].Value = Parm.Value
IF TYPE( Parm.Name ) # "U"
lcName = Parm.Name
&lcName = Parm.Value
ENDIF
ENDIF
ENDFOR
ENDIF

IF this.nCodePage # 0
SYS(3101,lnOldCP)
ENDIF

RETURN !this.lError


  Last Updated: 4/29/2008 | © West Wind Technologies, 2008