Revisiting VB6 (32): Issues Specific to the Double-Byte Character Set (DBCS)

Source: MSDN (October 2001): Visual Tools and Languages / Visual Studio 6.0 Documentation / Visual Basic Documentation / Using Visual Basic / Programmer's Guide / Part 2: What Can You Do With Visual Basic / International Issues

1. The concept of DBCS

(1) The double-byte character set (DBCS) was created to handle East Asian languages that use ideographic characters, which require more than the 256 characters supported by ANSI. Characters in DBCS are addressed using a 16-bit notation, using 2 bytes. With 16-bit notation you can represent 65,536 characters, although far fewer characters are defined for the East Asian languages.

(2) In locales where DBCS is used (including China, Japan, and Korea), both single-byte and double-byte characters are included in the character set. The single-byte characters used in these locales conform to the 8-bit national standards for each country and correspond closely to the ASCII character set. Certain ranges of codes in these single-byte character sets (SBCS) are designated as lead bytes for DBCS characters. A consecutive pair made of a lead byte and a trail byte represents one double-byte character. The code range used for the lead byte depends on the locale.

(3) DBCS is a different character set from Unicode. Because Visual Basic represents all strings internally in Unicode format, both ANSI characters and DBCS characters are converted to Unicode, and Unicode characters are converted to ANSI characters or DBCS characters, automatically whenever the conversion is needed. You can also convert between Unicode and ANSI/DBCS characters manually.
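
The lead-byte and trail-byte pairing described in item (2) can be sketched as a simple left-to-right walk over the bytes. The sketch below is written in Python rather than Visual Basic, purely as an illustration. The lead-byte ranges, the `LEAD_BYTE_RANGES` table, and the `is_lead_byte` and `split_dbcs` helpers are assumptions introduced here (the ranges follow the Windows code pages 936 and 932); none of them come from the MSDN text.

```python
# Lead-byte ranges are assumptions based on the Windows code page
# definitions, not on the text above.
LEAD_BYTE_RANGES = {
    "cp936": [(0x81, 0xFE)],                # Simplified Chinese (GBK)
    "cp932": [(0x81, 0x9F), (0xE0, 0xFC)],  # Japanese (Shift-JIS)
}


def is_lead_byte(value, codepage):
    """True if this byte value starts a double-byte character."""
    return any(lo <= value <= hi for lo, hi in LEAD_BYTE_RANGES[codepage])


def split_dbcs(data, codepage):
    """Split a DBCS byte string into 1-byte and 2-byte character units."""
    units = []
    i = 0
    while i < len(data):
        # A lead byte claims the following byte as its trail byte.
        has_trail = is_lead_byte(data[i], codepage) and i + 1 < len(data)
        width = 2 if has_trail else 1
        units.append(data[i:i + width])
        i += width
    return units


raw = "A中文b".encode("cp936")
units = split_dbcs(raw, "cp936")
print(units)
print([u.decode("cp936") for u in units])
```

The walk has to start at the beginning of the string: a trail byte can fall inside the lead-byte range, so a byte cannot be classified by looking at it in isolation.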

2. ANSI, DBCS, and Unicode: Definitions

(1) Because the ANSI standard uses only a single byte to represent each character, it is limited to a maximum of 256 character and punctuation codes. Although this is adequate for English, it doesn't fully support many other languages.

(2) DBCS is used in Microsoft Windows systems that are distributed in most parts of Asia. It provides support for many different East Asian language alphabets, such as Chinese, Japanese, and Korean. DBCS uses the numbers 0 to 127 to represent the ASCII character set. Some numbers greater than 128 function as lead-byte characters, which are not really characters but simply indicators that the next value is a character from a non-Latin character set. In DBCS, ASCII characters are only 1 byte in length, whereas Japanese, Korean, and other East Asian characters are 2 bytes in length.

(3) Unicode is a character-encoding scheme that uses 2 bytes for every character.

The International Organization for Standardization (ISO) defines a number in the range of 0 to 65,535 (2^16 - 1) for just about every character and symbol in every language (plus some empty spaces for future growth).

On all 32-bit versions of Windows, Unicode is used by the Component Object Model (COM), the basis for OLE and ActiveX technologies. Unicode is fully supported by Windows NT. Although both Unicode and DBCS have double-byte characters, the encoding schemes are completely different.

(4) DBCS Sort Order and String Comparison: If you use the Option Compare Text statement, comparisons are made according to the case-insensitive textual sort order determined by the user's system locale. In Chinese, the same character may then have two representations, a narrow-width letter and a wide-width letter, and the two are treated as equal.
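
The width-insensitive comparison in item (4) can be approximated outside Visual Basic. The Python sketch below is only an analogy: it uses Unicode NFKC normalization plus case folding, which is not the locale-driven sort order that Option Compare Text actually applies. The helper names `text_key` and `text_equal` are hypothetical.

```python
import unicodedata


def text_key(s):
    # NFKC folds full-width (wide) forms onto their narrow equivalents;
    # casefold() then removes upper/lower case differences.
    return unicodedata.normalize("NFKC", s).casefold()


def text_equal(a, b):
    """Compare two strings ignoring case and character width."""
    return text_key(a) == text_key(b)


print(text_equal("ＶＢ６", "vb6"))  # wide letters vs. narrow letters
print(text_equal("ＶＢ６", "vb5"))  # genuinely different text
```

A plain binary comparison would treat the wide and narrow spellings as different strings, which is the behaviour to expect when Option Compare Text is not in effect.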

3. DBCS String Manipulation Functions

(1) Although a double-byte character consists of a lead byte and a trail byte and requires two consecutive storage bytes, it must be treated as a single unit in any operation involving characters and strings.

(2) The "B" versions of the string functions (for example LenB and MidB) are intended especially for use with strings of binary data. The "W" versions (for example AscW and ChrW) are intended for use with Unicode strings.

(3) The functions without a "B" or "W" suffix correctly handle DBCS and ANSI characters. In addition to these, the String function handles DBCS characters. This means that all these functions consider a DBCS character as one character even if that character consists of 2 bytes.

In locales using DBCS, the number of characters and the number of bytes are not necessarily the same. Mid counts its start and length arguments in characters, not bytes.

(4) In most cases, use the character-based functions when you handle string data because these functions can properly handle ANSI strings, DBCS strings, and Unicode strings.

When you store characters to a String variable or get characters from a String variable, Visual Basic automatically converts between Unicode and ANSI characters. When you handle binary data, use a Byte array instead of a String variable, together with the byte-based string manipulation functions.

(5) Visual Basic provides several string conversion functions that are useful for DBCS characters: StrConv, UCase, and LCase.

For example, you can convert narrow letters to wide letters by specifying vbWide in the second argument of StrConv.

You can also use the StrConv function to convert Unicode characters to ANSI/DBCS characters, and vice versa.
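
The difference between character counts and byte counts in item (3), and the narrow-to-wide conversion in item (5), can be made concrete with a short sketch. It is written in Python as an illustration only. All four helper names are hypothetical, code page 936 is an assumed locale, and `to_wide` covers printable ASCII only, so it is a rough stand-in for the idea behind `StrConv(..., vbWide)` rather than a reimplementation of it.

```python
def char_count(s):
    """Characters, as a character-based function would count them."""
    return len(s)


def dbcs_byte_count(s, codepage="cp936"):
    """Bytes the same text occupies once converted to a DBCS code page."""
    return len(s.encode(codepage))


def unicode_byte_count(s):
    """Bytes in the 2-bytes-per-character Unicode form (UTF-16)."""
    return len(s.encode("utf-16-le"))


def to_wide(s):
    """Map narrow ASCII characters to their full-width forms."""
    out = []
    for ch in s:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")            # ideographic (wide) space
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + 0xFEE0))  # U+FF01..U+FF5E block
        else:
            out.append(ch)                  # leave everything else alone
    return "".join(out)


text = "VB中文"
print(char_count(text), dbcs_byte_count(text), unicode_byte_count(text))
print(to_wide("VB6"))
```

For the sample string the three counts all differ (4 characters, 6 DBCS bytes, 8 Unicode bytes), which is why a byte count should not be passed to a character-based function.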

4. Font, Display, and Print Considerations in a DBCS Environment

(1) When you use a font designed only for SBCS characters, DBCS characters may not be displayed correctly in the DBCS version of Windows.

Both the font and the font size need to be adjusted. Usually, the text in your application will be displayed best in a 9-point font on most East Asian platforms, whereas an 8-point font is typical on European platforms.

These considerations apply to printing DBCS characters with your application as well.

(2) How to Avoid Changing Font Settings: One option is Font Association, which automatically maps any English fonts in your application to a Korean font (supported for Chinese, not for Japanese).

Another option is to use the System or FixedSys font.

(3) You can also choose the font programmatically, adapting to the locale of the user's machine.

5. Processing Files That Use Double-Byte Characters

(1) In locales where DBCS is used, a file may include both double-byte and single-byte characters. Because a DBCS character is represented by two bytes, your Visual Basic code must avoid splitting it.

(2) When you read a fixed length of bytes from a binary file, use a Byte array instead of a String variable to prevent the ANSI-to-Unicode conversion in Visual Basic.

(3) When you use a String variable with Input or InputB to read bytes from a binary file, Unicode conversion occurs and the result is incorrect.

(4) Keep in mind that the names of files and directories may also include DBCS characters.
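
Items (1) and (2) amount to keeping file data as raw bytes and only converting at a character boundary. A minimal Python sketch of that idea follows. It assumes code page 936, whose lead bytes are taken to be 0x81 to 0xFE, and the helpers `is_lead_byte`, `safe_cut`, and `read_text_chunks` are hypothetical names introduced for illustration.

```python
import io


def is_lead_byte(value):
    # Assumption: code page 936 (GBK), lead bytes 0x81..0xFE.
    return 0x81 <= value <= 0xFE


def safe_cut(data, limit):
    """Largest offset <= limit that does not split a DBCS character."""
    i = 0
    while i < min(limit, len(data)):
        width = 2 if is_lead_byte(data[i]) else 1
        if i + width > limit or i + width > len(data):
            break  # the next character would be cut in half
        i += width
    return i


def read_text_chunks(stream, size, codepage="cp936"):
    """Read fixed-size byte blocks, converting only whole characters."""
    pending = b""  # bytes held back because their trail byte is missing
    while True:
        block = stream.read(size)
        data = pending + block
        if not block:
            if data:  # dangling lead byte at end of file
                yield data.decode(codepage, errors="replace")
            break
        cut = safe_cut(data, len(data))
        if cut:
            yield data[:cut].decode(codepage)
        pending = data[cut:]


source = io.BytesIO("A中文".encode("cp936"))
chunks = list(read_text_chunks(source, 2))
print(chunks)
```

Reading two bytes at a time would otherwise separate the lead byte of the first Chinese character from its trail byte; holding the unfinished byte in `pending` until the next read avoids that.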

6. Others

(1) DBCS characters are not supported in any of the following identifiers: Public procedure names, Public variables, Public constants, Project name, Class names.

(2) The KeyPress event can process a double-byte character code as one event. The higher byte of the KeyAscii argument represents the lead byte of a double-byte character, and the lower byte represents the trail byte.

(3) Many Windows API and DLL functions return size in bytes. This return value represents the size of the returned string. Visual Basic converts the returned string into Unicode even though the return value still represents the size of the ANSI or DBCS string. Therefore, you may not be able to use this returned size as the string's size.
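
Items (2) and (3) both come down to byte arithmetic on DBCS data, which the following Python sketch illustrates. Two assumptions are made here and are not stated in the text above: that the key code arrives as a signed 16-bit value (as a Visual Basic Integer would hold it), and that the locale is code page 936. `fake_api_get_text` is a hypothetical stand-in, not a real Windows API.

```python
def split_key_code(key_ascii):
    """Split a 16-bit key code into (lead byte, trail byte)."""
    code = key_ascii & 0xFFFF  # undo the sign of a 16-bit integer
    return code >> 8, code & 0xFF


def fake_api_get_text(buffer_size=64, codepage="cp936"):
    """Hypothetical API: fills a byte buffer, returns bytes written."""
    payload = "中文abc".encode(codepage)
    buffer = payload.ljust(buffer_size, b"\x00")
    return buffer, len(payload)


# Item (2): high byte is the lead byte, low byte is the trail byte.
lead, trail = split_key_code(-10544)  # same bits as 0xD6D0
print(hex(lead), hex(trail))

# Item (3): the returned size counts bytes, not characters.
buffer, size_in_bytes = fake_api_get_text()
text = buffer[:size_in_bytes].decode("cp936")
print(size_in_bytes, len(text))
```

The returned byte size is used to trim the raw buffer before conversion; after conversion, the length of the resulting string is the value to use for any character-based operation.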

7. Visual Basic Bidirectional Features

(1) Bidirectional refers to the product's ability to manipulate and display text for both left-to-right and right-to-left languages.

(2) Although RightToLeft is a part of every Microsoft Visual Basic installation, it is operational only when Microsoft Visual Basic is installed in a bidirectional 32-bit Microsoft Windows environment, for example an Arabic system.
