Unicode and VBA’s ChrW() and AscW() functions

Spreadsheets have their CHAR() function, and VBA has its Chr() function. Both return the text character for the specified numerical input, 0 to 255. And spreadsheets have their CODE() function, and VBA has its Asc() function. Both of those return the ASCII code for the leading character of a text string. All well-worn stuff.

But what if you want or need to work with Unicode values? All four functions fail you. As an example, assume you want the true prime character (‘, Unicode 2032) in a string. The prime character, technically, is not an italicized apostrophe (), a right single curly quote (‘), or an acute accent (‘).

VBA provides the ChrW() function that does that. ChrW() expects a long as input, but also accepts hexadecimal. Unicode is in hex numbering, so there are two choices: Change U2032 to decimal, or tell ChrW() that the input is in Hex. Since HEX2DEC(2032) is 8242, these two are equivalent:

  • ChrW(8242)
  • ChrW(&H2032)

Both will put ‘ into a string. If ChrW() repeated the same functionality of Chr() below 256, things would be simple. However, the Windows character set deviates from the Unicode character set for ASCII(128) to ASCII(159). In that range, Chr(CharCode) and ChrW(CharCode) produce different results. As WikiPedia says, Windows “coincides with ISO-8859-1 for all codes except the range 128 to 159 (hex 80 to 9F), where the little-used C1 controls are replaced with additional characters.” Not sure what C1 controls (probably a printer), but if we want to get Unicode to the spreadsheet, do we want it to give the functionality of CHAR()/Chr(), or that of ChrW() which is ISO-8859-1 compliant? Or, optionally both. The function CHARW() takes the optional route. If you set Exact_functionality to TRUE, you can put those C1 controls in your spreadsheet. The default is to do otherwise.

Function CHARW(CharCode As Variant, Optional Exact_functionality As Boolean = False) As String
‘Use a Leading “U” or “u” to indicate Unicode values
‘Exact_functionality returns the Unicode characters for Ascii(128) to Ascii(159) rather than
‘the Windows characters

   If UCase(Left$(CharCode, 1)) = “U” Then CharCode = Replace(CharCode, “U”, “&H”, 1, 1, vbTextCompare)
   CharCode = CLng(CharCode)

   If CharCode < 256 Then
      If Exact_functionality Then
         CHARW = ChrW(CharCode)
      Else
         CHARW = Chr(CharCode)
      End If
   Else
      CHARW = ChrW(CharCode)
   End If
End Function

One very nice thing is that you can feed Clng() a hex value, and it will do the HEX2DEC conversion for you.

The VBA function AscW() goes the other way, and has the same ISO problems. It will tell you the decimal code of the first character in a Unicode string, with no regard to the Windows character set. We can make another UDF CODEW() that can optionally specify either the decimal or hex value for the first character is returned, and whether or not to be ISO compliant. The default is to return the HEX unicode (as Uxxxx) and not to comply.

Function CODEW(Character As String, Optional Unicode_value As Boolean = True, _
               Optional Exact_functionality As Boolean = False) As Variant
‘ Exact Functionality returns exact Unicode for characters as AscW() does
‘ rather than Windows characters as Asc() does
  Dim Characters As String
   Dim i       As Long

   If Exact_functionality Then
      CODEW = AscW(Character)
      If Unicode_value Then CODEW = “U” & Hex(CODEW)
      Exit Function
   End If

   For i = 128 To 159 ‘where non-compliant
     Characters = Characters & Chr(i)
   Next i

   If InStr(1, Characters, Left$(Character, 1), vbBinaryCompare) Then
      CODEW = Asc(Character)
   Else
      CODEW = AscW(Character)
   End If
   If Unicode_value Then CODEW = “U” & Hex(CODEW)
End Function

The default will return U2032 when the first character is ‘, and 8242 when Unicode_value is set FALSE. For another example, € is CHAR(128), Chr(128), ChrW(8354), CHARW(128), CHARW(“U80″), CHARW(“U20AC”,TRUE) and CHARW(8364,TRUE).

CODEW(“€”) is “U80″, CODEW(“€”,FALSE) is 128, CODEW(“€”,,TRUE) is “U20AC”, and CODEW(“€”,FALSE,TRUE) is 8354.

To see Unicode characters, the cell’s font has to be set to a Unicode font.

…mrt

5 Comments

  1. AscW returns negative numbers sometimes. This is documented in the MS Excel help files. Does the macro above also work in those cases?

  2. Michael says:

    Stefano -

    There is nothing in the UDF to protect for bad AscW behavior.

    …mrt

  3. Michael says:

    Stefano -

    This would be the modified code:

    Function CODEW(Character As String, Optional Unicode_value As Boolean = True, _
                   Optional Exact_functionality As Boolean = False) As Variant
    ‘ Exact Functionality returns exact Unicode for characters as AscW() does
    ‘ correcting for http://support.microsoft.com/kb/272138 behavior
    ‘ rather than Windows characters as Asc() does
     Dim Characters As String
       Dim i       As Long
     
       If Exact_functionality Then
          CODEW = AscW(Character)
          If CODEW < 0 the CODEW = 65536 + CODEW
          If Unicode_value Then CODEW = “U” & Hex(CODEW)
          Exit Function
       End If

       For i = 128 To 159 ‘where non-compliant
        Characters = Characters & Chr(i)
       Next i

       If InStr(1, Characters, Left$(Character, 1), vbBinaryCompare) Then
          CODEW = Asc(Character)
       Else
          CODEW = AscW(Character)
          If CODEW < 0 the CODEW = 65536 + CODEW
       End If
       If Unicode_value Then CODEW = “U” & Hex(CODEW)
    End Function

    Can’t really call it “Exact Functionality” any more :roll:
    …mrt

Posting code or formulas in your comment? Use <code> tags!

  • <code lang="vb">Block of code goes here</code>
  • <code lang="vb" inline="true">Inline code goes here</code>
  • <code>Formula goes here</code>

Leave a Reply

You must be logged in to post a comment.

Here's how to update your reports of company and nearly any web data: