3 files changed: +29 -11 lines changed
@@ -72,7 +72,7 @@ SHOW CHARACTER SET;
 +---------+-------------------------------------+-------------------+--------+
 | ascii   | US ASCII                            | ascii_bin         |      1 |
 | binary  | binary                              | binary            |      1 |
-| gbk     | Chinese Internal Code Specification | gbk_bin           |      2 |
+| gbk     | Chinese Internal Code Specification | gbk_chinese_ci    |      2 |
 | latin1  | Latin1                              | latin1_bin        |      1 |
 | utf8    | UTF-8 Unicode                       | utf8_bin          |      3 |
 | utf8mb4 | UTF-8 Unicode                       | utf8mb4_bin       |      4 |
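As a hedged cross-check of the changed row, the default collation of `gbk` could also be read from the information schema rather than from `SHOW CHARACTER SET`. The query below is an illustration only and is not part of the diff; it assumes the `INFORMATION_SCHEMA.CHARACTER_SETS` table is available in your TiDB version.

```sql
-- Illustrative query (not part of the diff): read the default collation
-- of the gbk character set from the information schema.
SELECT CHARACTER_SET_NAME, DEFAULT_COLLATE_NAME, MAXLEN
FROM INFORMATION_SCHEMA.CHARACTER_SETS
WHERE CHARACTER_SET_NAME = 'gbk';
```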
@@ -7,6 +7,7 @@ summary: This document describes TiDB's support for the GBK character set.

 TiDB supports the GBK character set starting from v5.4.0. This document describes TiDB's support for and compatibility with the GBK character set.

+<<<<<<< HEAD
 ```sql
 SHOW CHARACTER SET WHERE CHARSET = 'gbk';
 +---------+-------------------------------------+-------------------+--------+
@@ -36,6 +37,9 @@ The default collation of the MySQL character set is `gbk_chinese_ci`. Unlike MySQL, Ti
 To make TiDB compatible with the collation of the MySQL GBK character set, you need to set the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) to `true` when you first initialize the TiDB cluster, which enables the [new framework for collations](/character-set-and-collation.md#新框架下的排序规则支持).

 After the new framework for collations is enabled, if you check the collations of the GBK character set, you can see that the default collation of TiDB GBK has switched to `gbk_chinese_ci`.
+=======
+Starting from TiDB v6.0.0, the [new framework for collations](/character-set-and-collation.md#新框架下的排序规则支持) is enabled by default, which means that the default collation of the TiDB GBK character set is `gbk_chinese_ci`, consistent with MySQL.
+>>>>>>> 112a825ed1 (update the default collation of GBK from gbk_bin to gbk_chinese_ci (#20234))

 ```sql
 SHOW CHARACTER SET WHERE CHARSET = 'gbk';
@@ -56,6 +60,19 @@ SHOW COLLATION WHERE CHARSET = 'gbk';
 2 rows in set (0.00 sec)
 ```

+## Compatibility with MySQL
+
+This section describes the compatibility of the GBK character set in TiDB with MySQL.
+
+### Collation compatibility
+
+The default collation of the GBK character set in MySQL is `gbk_chinese_ci`. The default collation of the GBK character set in TiDB depends on the value of the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap):
+
+- By default, the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) is `true`, which means that the [new framework for collations](/character-set-and-collation.md#新框架下的排序规则支持) is enabled. The default collation of the GBK character set is `gbk_chinese_ci`.
+- When the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) is `false`, the new framework for collations is disabled, and the default collation of the GBK character set is `gbk_bin`.
+
+In addition, the `gbk_bin` collation supported by TiDB differs from the `gbk_bin` collation supported by MySQL: TiDB converts GBK to `utf8mb4` and then performs a binary sort.
+
 ### Illegal character compatibility

 * When the system variables [`character_set_client`](/system-variables.md#character_set_client) and [`character_set_connection`](/system-variables.md#character_set_connection) are not both set to `gbk`, TiDB handles illegal characters in the same way as MySQL.
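A minimal sketch of how the collation behavior described in this change could be checked on a running cluster. The table name `t_gbk_demo` is hypothetical, and the `mysql.tidb` lookup of `new_collation_enabled` is assumed to be available in your TiDB version. Results are described only as expectations, since they depend on how the cluster was bootstrapped.

```sql
-- Check whether the new framework for collations is enabled on this cluster
-- (assumption: the mysql.tidb table exposes the new_collation_enabled entry).
SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'new_collation_enabled';

-- List the collations available for the GBK character set.
SHOW COLLATION WHERE Charset = 'gbk';

-- Hypothetical demo table: the column takes the default collation of gbk.
CREATE TABLE t_gbk_demo (name VARCHAR(20)) CHARACTER SET gbk;
INSERT INTO t_gbk_demo VALUES ('a');

-- With gbk_chinese_ci (case-insensitive) this comparison is expected to match the row;
-- with gbk_bin (binary comparison) it is expected to return no rows.
SELECT name FROM t_gbk_demo WHERE name = 'A';

DROP TABLE t_gbk_demo;
```

Running the same statements on a cluster bootstrapped with `new_collations_enabled_on_first_bootstrap` set to `false` is one way to compare the two defaults side by side.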
@@ -25,16 +25,17 @@ SHOW CHARACTER SET;
 ```

 ```
-+---------+---------------+-------------------+--------+
-| Charset | Description   | Default collation | Maxlen |
-+---------+---------------+-------------------+--------+
-| utf8    | UTF-8 Unicode | utf8_bin          |      3 |
-| utf8mb4 | UTF-8 Unicode | utf8mb4_bin       |      4 |
-| ascii   | US ASCII      | ascii_bin         |      1 |
-| latin1  | Latin1        | latin1_bin        |      1 |
-| binary  | binary        | binary            |      1 |
-+---------+---------------+-------------------+--------+
-5 rows in set (0.00 sec)
++---------+-------------------------------------+-------------------+--------+
+| Charset | Description                         | Default collation | Maxlen |
++---------+-------------------------------------+-------------------+--------+
+| ascii   | US ASCII                            | ascii_bin         |      1 |
+| binary  | binary                              | binary            |      1 |
+| gbk     | Chinese Internal Code Specification | gbk_chinese_ci    |      2 |
+| latin1  | Latin1                              | latin1_bin        |      1 |
+| utf8    | UTF-8 Unicode                       | utf8_bin          |      3 |
+| utf8mb4 | UTF-8 Unicode                       | utf8mb4_bin       |      4 |
++---------+-------------------------------------+-------------------+--------+
+6 rows in set (0.00 sec)
 ```

 ```sql