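China DOS Union (中国DOS联盟)

-- Unite DOS, promote DOS, develop DOS --

Union domain: www.cn-dos.net  Forum domain: www.cn-dos.net/forum
DOS stands for freedom, openness, and development. Let us work hard, learn from the free and open GNU spirit of FreeDOS and Linux, and together build a better world of freedom and the GNU GPL!

Thread (Blog board): National Standard GB18030-2005 "Information Technology: Chinese Coded Character Set"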

zzz19760225
Super Moderator

Points 3673
Posts 2020
Registered 2016-2-1
Status: offline

『Post #46』: 18030 imitation-test assembly page

0 1 2 3 4 5 6 7 8 9

00 01 02 03 04 05 06 07 08 09

10 11 12 13 14 15 17 18 19

20 21 22 23 24 25 26 27 28 29

30 31 32 33 34 35 36 37 38 39

[ Last edited by zzz19760225 on 2016-11-26 at 21:15 ]

1<word>, 2, 3/paragraph\, 4{section}, 5(chapter).

2016-6-26 19:42

『Post #47』: Reducing names to a single character

Hardware device names
When booting, or when loading directly into memory, the system searches for the names of the storage device spaces. A name is a representative: it stands for the range made up of the things it names.
Device names, partition lists, folders, and files that are unified and easy for Han Chinese readers to recognize would resemble the command hierarchy of an officer corps.
Use as few bytes as possible to express a relatively rich namespace, while leaving room to extend deeper, like an information well. The scale of information will inevitably expand outward over historical cycles as humanity grows.

Adding peripherals
USB TAT SSD A：
USB flash drive, mechanical hard disk, solid-state disk. As names: 优盘: (USB disk), 固盘: (solid-state disk), 机盘: (mechanical disk). Or the trailing colon can be dropped, since the character 盘 (disk) already marks the name as a disk. Express everything in Chinese characters as far as possible. Apart from the symbols of arrays, number groups, and numeric formulas, which are written together in dedicated places for calculation, use Chinese characters everywhere else. An editor for Chinese text then mainly handles Chinese characters and punctuation marks. If punctuation is also turned into Chinese characters, work happens almost entirely in a Chinese-character environment, and the work itself becomes simpler. On the road toward all-Chinese-character text, rows, columns, and punctuation marks are all replaced by Chinese characters, which requires characters dedicated to rows, columns, and punctuation!

Storage types
FAT EXT
One main storage model, plus unknown room for newly made and improved storage in a later era. Storage is mainly about organizing and storing numbers, arrays, and number groups; for this part I need to find some Chinese-language material to work from.
储 (store), 仓 (warehouse),
Entering the binary array gives the Chinese character itself, and the editor likewise processes the content defined for that character.
| Char | GB code (hex) | Binary | No. | Pinyin | Strokes | Wubi |
|------|---------------|--------|-----|--------|---------|------|
| 仓 | B2D6 | 1011001011010110 | 1 | cāng | 4 | wbb |
| 储 | B4A2 | 1011010010100010 | 2 | chǔ | 12 | wyfj |
| 取 | C8A1 | 1100100010100001 | 3 | qǔ | 8 | bcy |
| 算 | CBE3 | 1100101111100011 | 4 | suàn | 14 | thaj |
| 得 | B5C3 | 1011010111000011 | 5 | dé | 11 | tjgf |
| 加 | BCD3 | 1011110011010011 | 6 | jiā | 5 | lkg |
| 减 | BCF5 | 1011110011110101 | 7 | jiǎn | 11 | udgt |
| 乘 | B3CB | 1011001111001011 | 8 | chéng | 10 | tuxv |
| 除 | B3FD | 1011001111111101 | 9 | chú | 9 | bwty |
| 建 | BDA8 | 1011110110101000 | 10 | jiàn | 8 | vfhp |
| 改 | B8C4 | 1011100011000100 | 11 | gǎi | 7 | nty |
| 删 | C9BE | 1100100110111110 | 12 | shān | 7 | mmgj |
| 存 | B4E6 | 1011010011100110 | 13 | cún | 6 | dhbd |
| 上 | C9CF | 1100100111001111 | 14 | shàng | 3 | hhgg |
| 下 | CFC2 | 1100111111000010 | 15 | xià | 3 | ghi |
| 左 | D7F3 | 1101011111110011 | 16 | zuǒ | 5 | daf |
| 右 | D3D2 | 1101001111010010 | 17 | yòu | 5 | dkf |
| 没 | C3BB | 1100001110111011 | 18 | méi | 7 | imcy |
| 空 | BFD5 | 1011111111010101 | 19 | kōng | 8 | pwaf |
| 环 | BBB7 | 1011101110110111 | 20 | huán | 8 | ggiy |
| 隔 | B8F4 | 1011100011110100 | 21 | gé | 12 | bgkh |
| 正 | D5FD | 1101010111111101 | 22 | zhèng | 5 | ghd |
| 负 | B8BA | 1011100010111010 | 23 | fù | 6 | qmu |
| . | 2E | 101110 | 24 | | | |
| - | 2D | 101101 | 25 | | | |
| / | 2F | 101111 | 26 | | | |
| \ | 5C | 1011100 | 27 | | | |
| 运 | D4CB | 1101010011001011 | 28 | yùn | 7 | fcpi |
| 同 | CDAC | 1100110110101100 | 29 | tóng | 6 | mgkd |
| 反 | B7B4 | 1011011110110100 | 30 | fǎn | 4 | rci |
| 异 | D2EC | 1101001011101100 | 31 | yì | 6 | naj |
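The hex and binary columns of the table above can probably be regenerated rather than typed by hand. Below is a minimal Python sketch, assuming the built-in `gb18030` codec is an acceptable stand-in for the code table used in the post; the helper name `gb_row` is made up for illustration and is not from the original.

```python
def gb_row(ch: str) -> tuple:
    """Return (hex code, binary string) for one character under GB 18030.

    Hypothetical helper: it only reproduces the second and third
    columns of the table (code in hex, code in binary).
    """
    raw = ch.encode("gb18030")   # 1 byte for ASCII, 2 bytes for these hanzi
    hex_code = raw.hex().upper()
    # Binary without leading zeros, which is how the table writes 2E as 101110.
    binary = format(int.from_bytes(raw, "big"), "b")
    return hex_code, binary


if __name__ == "__main__":
    for ch in "仓储取.\\":
        print(ch, *gb_row(ch))
```

The pinyin, stroke-count, and Wubi columns are not derivable from the encoding and would still need a separate data source.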
运 (transport) corresponds to mov: a waybill that specifies which warehouse slot the information goods are delivered to. The waybill's slot here overlaps with the process of express delivery to a street address and with the warehouse's own storing and fetching.
下隔, 上隔, 左隔, 右隔 (separator below, above, left, right).
Paint over a book-style copy of the 18030 table with color, gradually reducing it.
隔 (separator) corresponds to a line break.
环 (ring) corresponds to a while loop.
Does the character 存 (store) for data conflict with its use for storing partitions, folders, and files?
没 (none) corresponds to the space bar. The character 空 (empty) is reserved for the Buddhist sense of emptiness, to avoid reusing it, and 无 (nothing) corresponds to the Daoist sense of nothingness; 空 and 无 do not simply mean "not there". Is 空 a good choice? It could be used in the sense it has in the idiom 空穴来风, or a single 无 might be enough; 空 would also work.
For the two characters 存 (store) and 取 (fetch): storing and fetching happen in a 仓 (warehouse), or a 屋 (house), 房 (room), 城 (city), an information mountain, an information forest, vector characters, vector character images, or a vector warehouse. Should characters also be screened by combinations of radical and pinyin?
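To make the character-to-role idea a little more concrete, here is a rough Python sketch of just two of the correspondences noted above: 隔 as a line break and 没 as a space. The table `ROLES` and the function `render` are hypothetical names for illustration; the post does not define any implementation, and the other roles (环 as a while loop, 运 as mov) are left out because they would need a real interpreter.

```python
# Hypothetical role table: only the two whitespace roles from the notes.
ROLES = {
    "隔": "\n",  # separator -> line break
    "没": " ",   # none -> space
}


def render(text: str) -> str:
    """Replace the role characters with the whitespace they stand for."""
    return text.translate(str.maketrans(ROLES))


if __name__ == "__main__":
    print(render("加没减隔乘没除"))
```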

Storage partition setup
C： D： hd0 sda sdb

Folder

File

[ Last edited by zzz19760225 on 2016-12-2 at 11:56 ]

2016-6-26 19:43

『Post #48』:

1

2016-6-26 19:43

『Post #49』:

1

2016-6-26 19:44

『Post #50』:

1

2016-6-26 19:45

『Post #51』:

1

2016-6-26 19:46

『Post #52』:

1

2016-6-26 19:47

『Post #53』:

1

2016-6-26 19:49

『Post #54』:

1

2016-6-26 19:50

『Post #55』:

1

2016-6-26 19:50

『Post #56』:

1

2016-6-26 19:51

『Post #57』:

1

2016-6-26 19:52

『Post #58』:

1

2016-6-26 19:53

『Post #59』:

1

2016-6-26 19:54

『Post #60』:

Welcome to the Chinese Encoding web page

GB 18030 code table

Because GB 18030 is designed to map every Unicode code point, its code table is as large as Unicode's.

Use the drop-down box to choose the codes you want to look up. The first column lists codes in Unicode order, the second column gives the GB 18030 internal code, and a third column adds notes where needed.

Differences between GBK and GB 2312

GB 2312 uses 2-byte codes. The first byte ranges over 0xA1–FE (0xAA–AF and 0xF8–FE are not actually used) and the second byte over 0xA1–FE.
GBK uses 2-byte codes. The first byte ranges over 0x81–FE and the second byte over 0x40–7E and 0x80–FE.
GB 2312 has only 6,763 Chinese characters. GBK includes every character in the basic CJK Unified Ideographs block.
0x8140–A0FE adds 6,080 Chinese characters; 0xAA40–FD9B (excluding the original GB 2312 range) adds 8,059 Chinese characters; 0xFD9C–FE4F adds 21 compatibility ideographs.
GB 2312 has only 682 symbols. Later glyph standards such as GB 5007.1 and GB 6345.1 added six pinyin symbols, ɑ ḿ ń ň ǹ ɡ, at 0xA8BB–A8C0. GBK inherits these symbols.
GBK adds 10 lowercase Roman numerals ⅰ–ⅹ (0xA2A1–A2AA).
GBK adds 29 vertical punctuation marks (0xA6D9–A6F5), taken from the GB 12345 standard.
GBK adds symbols used by Taiwanese computer systems (0xA840–A895, 0xA940–A988, excluding A958, A95B, A95D–A95F).
In practice, however, Taiwanese computer systems do not have 0xA844(―), 0xA891(☉), or 0xA95C(‐).
Conversely, the Big5 codes 0xA145(‧), 0xA15A(╴), 0xA1C2(¯ or ‾), and 0xA1C5(ˍ) do not appear in GBK.
GBK adds the ideographic description characters (0xA989–A995) and the ideographic number zero 〇 (0xA996).
GBK adds, at 0xFE50–FEA0, 52 characters from the "General Table of Simplified Characters" and 28 character components from the "Kangxi Dictionary" and "Cihai" that Unicode had not yet encoded at the time.
Note: standards such as GB 5007.1 and GB 6345.1 add 94 half-width graphic characters (that is, the ASCII symbols) in row 10 (internal codes 0xAAA1–AAFE), and in row 11 (internal codes 0xABA1–ABC0) add 32 half-width characters: the four-tone forms of pinyin a, e, i, o, u, ü plus ê, ɑ, ḿ, ń, ň, ǹ, ɡ. Neither GBK nor GB 18030 follows this.
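As a quick illustration of the two-byte ranges listed above, here is a small Python sketch that sorts a byte pair into "GB 2312 range", "GBK extension range", or neither. The function names are invented for this example. It checks only the byte ranges quoted in the list; it does not know which positions inside those ranges are actually assigned.

```python
def is_gb2312_pair(lead: int, trail: int) -> bool:
    """True if the pair lies in the GB 2312 range: lead 0xA1-FE, trail 0xA1-FE."""
    return 0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE


def is_gbk_pair(lead: int, trail: int) -> bool:
    """True if the pair lies in the GBK range: lead 0x81-FE, trail 0x40-7E or 0x80-FE."""
    return 0x81 <= lead <= 0xFE and (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE)


def classify(pair: bytes) -> str:
    """Classify a 2-byte sequence by range only (no assignment check)."""
    lead, trail = pair
    if is_gb2312_pair(lead, trail):
        return "GB 2312 range"
    if is_gbk_pair(lead, trail):
        return "GBK extension range"
    return "outside GBK"


if __name__ == "__main__":
    for sample in (b"\xb2\xd6", b"\x81\x40", b"\xfe\x50", b"\x81\x7f"):
        print(sample.hex().upper(), classify(sample))
```

GB 2312 is tested first because its range sits entirely inside the GBK range.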

Differences between GBK and Microsoft CP936

Microsoft CP936 adds the euro sign € at 0x80 (the euro did not yet exist when GBK was released in 1995).
Microsoft CP936 lacks 0xA6D9–A6DF, A6EC–A6ED, A6F3, A8BC, A8BF, A989–A995, and FE50–FEA0 (GB 13000.1 / Unicode 1.0 did not have those characters).

Differences between GB 18030-2000 and GBK

GB 18030-2000 adds 4-byte codes: the first byte ranges over 0x81–FE, the second over 0x30–39, the third over 0x81–FE, and the fourth over 0x30–39. It maps every possible Unicode code point to a GB 18030 code.
GB 18030-2000 includes all of CJK Unified Ideographs Extension A.
GB 18030-2000 places the euro sign at 0xA2E3.
Unfortunately, on Microsoft's Simplified Chinese systems 0x80 is still the euro sign, while 0xA2E3 is a second euro sign mapped to the private use code point U+E76C.
Because Unicode 3.0 and later encode the following characters, Appendix E of the official GB 18030-2000 document and Table E.1 in Appendix E of the official GB 18030-2005 document list their positions in the next edition of GB 13000 (note: equivalent to ISO/IEC 10646:2003). In effect, GB 18030-2000 and GB 18030-2005 changed the Unicode code points these codes map to.

GB code  Character  GBK mapping (private use area)  GB 18030 mapping (Unicode)
A8BF ǹ U+E7C8 U+01F9
A989 〾 U+E7E7 U+303E
A98A ⿰ U+E7E8 U+2FF0
A98B ⿱ U+E7E9 U+2FF1
A98C ⿲ U+E7EA U+2FF2
A98D ⿳ U+E7EB U+2FF3
A98E ⿴ U+E7EC U+2FF4
A98F ⿵ U+E7ED U+2FF5
A990 ⿶ U+E7EE U+2FF6
A991 ⿷ U+E7EF U+2FF7
A992 ⿸ U+E7F0 U+2FF8
A993 ⿹ U+E7F1 U+2FF9
A994 ⿺ U+E7F2 U+2FFA
A995 ⿻ U+E7F3 U+2FFB
FE50 ⺁ U+E815 U+2E81
FE54 ⺄ U+E819 U+2E84
FE55 㑳 U+E81A U+3473
FE56 㑇 U+E81B U+3447
FE57 ⺈ U+E81C U+2E88
FE58 ⺋ U+E81D U+2E8B
FE5A 㖞 U+E81F U+359E
FE5B 㘚 U+E820 U+361A
FE5C 㘎 U+E821 U+360E
FE5D ⺌ U+E822 U+2E8C
FE5E ⺗ U+E823 U+2E97
FE5F 㥮 U+E824 U+396E
FE60 㤘 U+E825 U+3918
FE62 㧏 U+E827 U+39CF
FE63 㧟 U+E828 U+39DF
FE64 㩳 U+E829 U+3A73
FE65 㧐 U+E82A U+39D0
FE68 㭎 U+E82D U+3B4E
FE69 㱮 U+E82E U+3C6E
FE6A 㳠 U+E82F U+3CE0
FE6B ⺧ U+E830 U+2EA7
FE6E ⺪ U+E833 U+2EAA
FE6F 䁖 U+E834 U+4056
FE70 䅟 U+E835 U+415F
FE71 ⺮ U+E836 U+2EAE
FE72 䌷 U+E837 U+4337
FE73 ⺳ U+E838 U+2EB3
FE74 ⺶ U+E839 U+2EB6
FE75 ⺷ U+E83A U+2EB7
FE77 䎱 U+E83C U+43B1
FE78 䎬 U+E83D U+43AC
FE79 ⺻ U+E83E U+2EBB
FE7A 䏝 U+E83F U+43DD
FE7B 䓖 U+E840 U+44D6
FE7C 䙡 U+E841 U+4661
FE7D 䙌 U+E842 U+464C
FE80 䜣 U+E844 U+4723
FE81 䜩 U+E845 U+4729
FE82 䝼 U+E846 U+477C
FE83 䞍 U+E847 U+478D
FE84 ⻊ U+E848 U+2ECA
FE85 䥇 U+E849 U+4947
FE86 䥺 U+E84A U+497A
FE87 䥽 U+E84B U+497D
FE88 䦂 U+E84C U+4982
FE89 䦃 U+E84D U+4983
FE8A 䦅 U+E84E U+4985
FE8B 䦆 U+E84F U+4986
FE8C 䦟 U+E850 U+499F
FE8D 䦛 U+E851 U+499B
FE8E 䦷 U+E852 U+49B7
FE8F 䦶 U+E853 U+49B6
FE92 䲣 U+E856 U+4CA3
FE93 䲟 U+E857 U+4C9F
FE94 䲠 U+E858 U+4CA0
FE95 䲡 U+E859 U+4CA1
FE96 䱷 U+E85A U+4C77
FE97 䲢 U+E85B U+4CA2
FE98 䴓 U+E85C U+4D13
FE99 䴔 U+E85D U+4D14
FE9A 䴕 U+E85E U+4D15
FE9B 䴖 U+E85F U+4D16
FE9C 䴗 U+E860 U+4D17
FE9D 䴘 U+E861 U+4D18
FE9E 䴙 U+E862 U+4D19
FE9F 䶮 U+E863 U+4DAE

Differences between GB 18030-2005 and GB 18030-2000

Appends glyph tables for CJK Unified Ideographs Extension B, Korean, Mongolian (including Manchu, Todo, Xibe, and Ali Gali), Dehong Dai, Tibetan, Uyghur/Kazakh/Kyrgyz, and Yi. Korean comprises 3,376 Hangul characters plus 69 letters plus 51 compatibility letters; Mongolian comprises 149 characters; Dai comprises 35 characters; Tibetan comprises 193 characters; Uyghur comprises 49 characters plus 153 letter presentation forms; Yi comprises 1,215 characters (excluding U+A4A2, U+A4A3, U+A4B4, U+A4C1, U+A4C5).
GB 18030-2000 did not map ḿ to a standard Unicode code point. GB 18030-2005 finally corrected this; see Table E.2 in Appendix E of the official document.

GB code  Character  GB 18030-2000 mapping (private use area)  GB 18030-2005 mapping (Unicode)
A8BC ḿ U+E7C7 U+1E3F

Characters not yet corrected in GB 18030

When GB 18030-2000 was released, CJK Unified Ideographs Extension B did not yet exist, so the following characters were mapped to the private use area. By the time GB 18030-2005 was released Unicode had added Extension B, yet the GB 18030-2005 standard still maps these characters to the private use area unchanged. See document WG2 N2773. As a result, GB 18030-2005 encodes each of the following 6 characters twice.

GB code  Character  GB 18030 mapping (private use area)  Unicode ≥3.1  Duplicate GB code
FE51 𠂇 U+E816 U+20087 95329031
FE52 𠂉 U+E817 U+20089 95329033
FE53 𠃌 U+E818 U+200CC 95329730
FE6C 𡗗 U+E831 U+215D7 9536B937
FE76 𢦏 U+E83B U+2298F 9630BA35
FE91 𤇾 U+E855 U+241FE 9635B630
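The 4-byte codes in the last column above can be worked out from the Unicode code points, given the 4-byte layout described earlier (bytes in 0x81–FE, 0x30–39, 0x81–FE, 0x30–39). Below is a minimal Python sketch for supplementary-plane characters only. It assumes the linear mapping in which U+10000 corresponds to the code 90 30 81 30; that starting point comes from the GB 18030 mapping itself and is not stated in the post. The function name is made up for illustration.

```python
def gb18030_supplementary(cp: int) -> bytes:
    """Four-byte GB 18030 code for a code point in U+10000..U+10FFFF.

    Assumption: the supplementary planes are laid out linearly,
    starting with U+10000 at bytes 90 30 81 30.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("expects a code point beyond the BMP")
    index = cp - 0x10000
    index, b4 = divmod(index, 10)    # fourth byte cycles through 0x30-39
    index, b3 = divmod(index, 126)   # third byte cycles through 0x81-FE
    b1, b2 = divmod(index, 10)       # second byte cycles through 0x30-39
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


if __name__ == "__main__":
    for cp in (0x20087, 0x20089, 0x200CC, 0x215D7, 0x2298F, 0x241FE):
        print(f"U+{cp:X}", gb18030_supplementary(cp).hex().upper())
```

The results should agree both with the table and with Python's built-in `gb18030` codec, for example `chr(0x20087).encode("gb18030")`.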

The following characters were already in GB 18030-2000, at a time when Unicode did not yet have them. Although Unicode had added all of them by version 4.1, the GB 18030-2005 standard still maps them to the private use area. See document WG2 N2773.

GB code  Character  GB 18030 mapping (private use area)  Unicode ≥4.1
A6D9 ︐ U+E78D U+FE10
A6DA ︒ U+E78E U+FE12
A6DB ︑ U+E78F U+FE11
A6DC ︓ U+E790 U+FE13
A6DD ︔ U+E791 U+FE14
A6DE ︕ U+E792 U+FE15
A6DF ︖ U+E793 U+FE16
A6EC ︗ U+E794 U+FE17
A6ED ︘ U+E795 U+FE18
A6F3 ︙ U+E796 U+FE19
FE59 龴 U+E81E U+9FB4
FE61 龵 U+E826 U+9FB5
FE66 龶 U+E82B U+9FB6
FE67 龷 U+E82C U+9FB7
FE6D 龸 U+E832 U+9FB8
FE7E 龹 U+E843 U+9FB9
FE90 龺 U+E854 U+9FBA
FEA0 龻 U+E864 U+9FBB
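One practical use of the tables above: text decoded through an older mapping may contain private use code points where a standard Unicode character now exists. The Python sketch below remaps a handful of them. The dictionary holds only a few rows copied from the tables as a sample, and the names `PUA_TO_STANDARD` and `normalize_pua` are hypothetical. Whether such remapping is appropriate depends on which fonts and software the text is headed for, so treat this as an illustration and not a recommendation.

```python
# Sample rows from the tables: private use code point -> standard code point.
PUA_TO_STANDARD = {
    0xE7C7: 0x1E3F,   # A8BC  ḿ
    0xE78D: 0xFE10,   # A6D9  vertical comma
    0xE78E: 0xFE12,   # A6DA  vertical ideographic full stop
    0xE81E: 0x9FB4,   # FE59  龴
    0xE816: 0x20087,  # FE51  (Extension B)
}


def normalize_pua(text: str) -> str:
    """Replace the listed private use code points with their standard ones."""
    return text.translate(PUA_TO_STANDARD)


if __name__ == "__main__":
    sample = "\ue7c7 \ue81e"
    print([f"U+{ord(c):04X}" for c in normalize_pua(sample)])
```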

Saturday, June 25, 2016 9:15:04 pm

http://code.web.idv.hk/gb18030/gb18030.php

[ Last edited by zzz19760225 on 2017-11-28 at 11:24 ]

2016-6-26 19:54