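Welcome to the Chinese Encoding Webpage
GB 18030 Code Table
GB 18030 is designed to map every Unicode code point, so its code table is as large as Unicode's.
Use the drop-down box to select the code range you want to look up. The first column is sorted by Unicode code point, the second column gives the GB 18030 internal code, and a third column adds notes where needed.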
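Differences between GBK and GB 2312
GB 2312 uses 2-byte codes: the first byte ranges over 0xA1–FE (0xAA–AF and 0xF8–FE are not actually used) and the second byte over 0xA1–FE.
GBK uses 2-byte codes: the first byte ranges over 0x81–FE and the second byte over 0x40–7E and 0x80–FE.
GB 2312 contains only 6,763 Chinese characters. GBK contains every character in the basic CJK Unified Ideographs block.
GBK adds 6,080 Chinese characters at 0x8140–A0FE, 8,059 at 0xAA40–FD9B (excluding the original GB 2312 range), and 21 compatibility ideographs at 0xFD9C–FE4F.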
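The three character counts follow directly from the byte ranges given above. The sketch below is only an illustration: the function names are made up for this example, and it counts code positions from the stated ranges rather than consulting any real mapping table.

```python
def is_gbk_trail(byte: int) -> bool:
    """GBK second byte: 0x40-0x7E or 0x80-0xFE (0x7F is skipped)."""
    return 0x40 <= byte <= 0x7E or 0x80 <= byte <= 0xFE


def in_gb2312_area(lead: int, trail: int) -> bool:
    """True when both bytes fall in the original GB 2312 range 0xA1-0xFE."""
    return 0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE


def count_gbk_only(start: int, end: int) -> int:
    """Count valid GBK code positions in [start, end] outside the GB 2312 area."""
    count = 0
    for code in range(start, end + 1):
        lead, trail = code >> 8, code & 0xFF
        if (0x81 <= lead <= 0xFE and is_gbk_trail(trail)
                and not in_gb2312_area(lead, trail)):
            count += 1
    return count


print(count_gbk_only(0x8140, 0xA0FE))  # 6080 = 32 lead bytes x 190 trail bytes
print(count_gbk_only(0xAA40, 0xFD9B))  # 8059
print(count_gbk_only(0xFD9C, 0xFE4F))  # 21
```

Together with the 6,763 characters of GB 2312, these add up to 20,923 positions: the 20,902 ideographs of the basic block plus the 21 compatibility ideographs.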
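GB 2312 contains only 682 symbols. Later font standards such as GB 5007.1 and GB 6345.1 added six pinyin symbols at 0xA8BB–A8C0: ɑ ḿ ń ň ǹ ɡ. GBK inherits these symbols.
GBK adds 10 lowercase Roman numerals ⅰ–ⅹ (0xA2A1–A2AA).
GBK adds 29 vertical-text punctuation marks (0xA6D9–A6F5), taken from the GB 12345 standard.
GBK adds symbols used on Taiwanese computer systems (0xA840–A895 and 0xA940–A988, excluding A958, A95B, and A95D–A95F).
In practice, however, Taiwanese computer systems do not have 0xA844 (―), 0xA891 (☉), or 0xA95C (‐).
Conversely, Big5 codes 0xA145 (‧), 0xA15A (╴), 0xA1C2 (¯ or ‾), and 0xA1C5 (ˍ) do not appear in GBK.
GBK adds the ideographic description characters (0xA989–A995) and the ideographic number zero 〇 (0xA996).
GBK adds 52 characters from the General Table of Simplified Characters (《簡化字總表》) and 28 character components from the Kangxi Dictionary and Cihai that Unicode had not yet encoded at the time (0xFE50–FEA0).
Note: standards such as GB 5007.1 and GB 6345.1 add 94 half-width graphic characters (that is, the ASCII symbols) in row 10 (internal codes 0xAAA1–AAFE).
In row 11 (internal codes 0xABA1–ABC0) they add 32 half-width characters: pinyin a, e, i, o, u, ü in the four tones, plus ê, ɑ, ḿ, ń, ň, ǹ, ɡ.
Neither GBK nor GB 18030 follows these additions.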
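The row numbers and internal codes in the note are related by a fixed offset: in the GB 2312 layout, row r and cell c (each 1 to 94) are stored as the bytes 0xA0 + r and 0xA0 + c. A minimal sketch, with a hypothetical function name chosen for this example:

```python
def row_cell_to_internal(row: int, cell: int) -> bytes:
    """Convert a GB 2312 style row/cell pair (1-94 each) to its 2-byte internal code."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in 1..94")
    return bytes([0xA0 + row, 0xA0 + cell])


print(row_cell_to_internal(10, 1).hex().upper())   # AAA1, first cell of row 10
print(row_cell_to_internal(10, 94).hex().upper())  # AAFE, last cell of row 10
print(row_cell_to_internal(11, 32).hex().upper())  # ABC0, 32nd cell of row 11
```

This reproduces the ranges quoted in the note: 94 cells for row 10 and 32 cells for row 11.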
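Differences between GBK and Microsoft CP936
Microsoft CP936 adds the euro sign € at 0x80 (the euro did not yet exist when GBK was released in 1995).
Microsoft CP936 lacks 0xA6D9–A6DF, A6EC–A6ED, A6F3, A8BC, A8BF, A989–A995, and FE50–FEA0, because GB 13000.1 / Unicode 1.1 did not include those characters.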
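Differences between GB 18030-2000 and GBK
GB 18030-2000 adds 4-byte codes: the first byte ranges over 0x81–FE, the second over 0x30–39, the third over 0x81–FE, and the fourth over 0x30–39. With these, every possible Unicode code point maps to a GB 18030 code.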
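The four byte ranges behave like a mixed-radix number with digits of size 126, 10, 126, and 10, so every 4-byte code has a linear index. The sketch below illustrates that arithmetic only; the names are invented for this example and it does not map indexes to Unicode.

```python
LEAD_COUNT = 0xFE - 0x81 + 1    # 126 values for the first and third bytes
DIGIT_COUNT = 0x39 - 0x30 + 1   # 10 values for the second and fourth bytes
TOTAL_FOUR_BYTE = LEAD_COUNT * DIGIT_COUNT * LEAD_COUNT * DIGIT_COUNT


def four_byte_to_index(code: bytes) -> int:
    """Linear index of a 4-byte code, counting from 0x81308130 as 0."""
    b1, b2, b3, b4 = code
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a well-formed 4-byte code")
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


def index_to_four_byte(index: int) -> bytes:
    """Inverse of four_byte_to_index."""
    if not 0 <= index < TOTAL_FOUR_BYTE:
        raise ValueError("index out of range")
    index, d4 = divmod(index, 10)
    index, d3 = divmod(index, 126)
    d1, d2 = divmod(index, 10)
    return bytes([0x81 + d1, 0x30 + d2, 0x81 + d3, 0x30 + d4])


print(TOTAL_FOUR_BYTE)                                         # 1587600
print(index_to_four_byte(TOTAL_FOUR_BYTE - 1).hex().upper())   # FE39FE39
```

The 1,587,600 four-byte positions exceed the 1,114,112 code points of Unicode (U+0000 to U+10FFFF), which is why the 4-byte area has room for everything the 1-byte and 2-byte areas do not cover.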
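GB 18030-2000 contains every character in CJK Unified Ideographs Extension A.
GB 18030-2000 encodes the euro sign at 0xA2E3.
Unfortunately, on Microsoft Simplified Chinese systems 0x80 is still the euro sign, while 0xA2E3 is a second euro sign mapped to the private use code point U+E76C.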
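The GB 18030 position of the euro sign can be checked with Python's built-in `gb18030` codec. This is only a sketch: Python's `gbk` codec is its own implementation and is not guaranteed to match Microsoft's CP936 tables, so the behaviour of byte 0x80 is printed rather than asserted, and `try_decode` is a helper invented for this example.

```python
from typing import Optional


def try_decode(data: bytes, codec: str) -> Optional[str]:
    """Decode data with the given codec, returning None when the bytes are rejected."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


euro = "\u20ac"

# GB 18030 places the euro sign at the 2-byte code A2E3.
print(euro.encode("gb18030").hex().upper())

# How a given codec treats the single byte 0x80 depends on the implementation,
# so the result is only printed here.
for codec in ("gbk", "gb18030"):
    print(codec, repr(try_decode(b"\x80", codec)))
```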
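Unicode 3.0 and later encode the characters below. Appendix E of the official GB 18030-2000 document and Table E.1 in Appendix E of the official GB 18030-2005 document list their positions in the next edition of GB 13000 (equivalent to ISO/IEC 10646:2003). In practice, GB 18030-2000 and GB 18030-2005 already map them to those Unicode code points.
Columns: GB code, character, GBK mapping (private use area), GB 18030 mapping (Unicode).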
A8BF ǹ U+E7C8 U+01F9
A989 〾 U+E7E7 U+303E
A98A ⿰ U+E7E8 U+2FF0
A98B ⿱ U+E7E9 U+2FF1
A98C ⿲ U+E7EA U+2FF2
A98D ⿳ U+E7EB U+2FF3
A98E ⿴ U+E7EC U+2FF4
A98F ⿵ U+E7ED U+2FF5
A990 ⿶ U+E7EE U+2FF6
A991 ⿷ U+E7EF U+2FF7
A992 ⿸ U+E7F0 U+2FF8
A993 ⿹ U+E7F1 U+2FF9
A994 ⿺ U+E7F2 U+2FFA
A995 ⿻ U+E7F3 U+2FFB
FE50 ⺁ U+E815 U+2E81
FE54 ⺄ U+E819 U+2E84
FE55 㑳 U+E81A U+3473
FE56 㑇 U+E81B U+3447
FE57 ⺈ U+E81C U+2E88
FE58 ⺋ U+E81D U+2E8B
FE5A 㖞 U+E81F U+359E
FE5B 㘚 U+E820 U+361A
FE5C 㘎 U+E821 U+360E
FE5D ⺌ U+E822 U+2E8C
FE5E ⺗ U+E823 U+2E97
FE5F 㥮 U+E824 U+396E
FE60 㤘 U+E825 U+3918
FE62 㧏 U+E827 U+39CF
FE63 㧟 U+E828 U+39DF
FE64 㩳 U+E829 U+3A73
FE65 㧐 U+E82A U+39D0
FE68 㭎 U+E82D U+3B4E
FE69 㱮 U+E82E U+3C6E
FE6A 㳠 U+E82F U+3CE0
FE6B ⺧ U+E830 U+2EA7
FE6E ⺪ U+E833 U+2EAA
FE6F 䁖 U+E834 U+4056
FE70 䅟 U+E835 U+415F
FE71 ⺮ U+E836 U+2EAE
FE72 䌷 U+E837 U+4337
FE73 ⺳ U+E838 U+2EB3
FE74 ⺶ U+E839 U+2EB6
FE75 ⺷ U+E83A U+2EB7
FE77 䎱 U+E83C U+43B1
FE78 䎬 U+E83D U+43AC
FE79 ⺻ U+E83E U+2EBB
FE7A 䏝 U+E83F U+43DD
FE7B 䓖 U+E840 U+44D6
FE7C 䙡 U+E841 U+4661
FE7D 䙌 U+E842 U+464C
FE80 䜣 U+E844 U+4723
FE81 䜩 U+E845 U+4729
FE82 䝼 U+E846 U+477C
FE83 䞍 U+E847 U+478D
FE84 ⻊ U+E848 U+2ECA
FE85 䥇 U+E849 U+4947
FE86 䥺 U+E84A U+497A
FE87 䥽 U+E84B U+497D
FE88 䦂 U+E84C U+4982
FE89 䦃 U+E84D U+4983
FE8A 䦅 U+E84E U+4985
FE8B 䦆 U+E84F U+4986
FE8C 䦟 U+E850 U+499F
FE8D 䦛 U+E851 U+499B
FE8E 䦷 U+E852 U+49B7
FE8F 䦶 U+E853 U+49B6
FE92 䲣 U+E856 U+4CA3
FE93 䲟 U+E857 U+4C9F
FE94 䲠 U+E858 U+4CA0
FE95 䲡 U+E859 U+4CA1
FE96 䱷 U+E85A U+4C77
FE97 䲢 U+E85B U+4CA2
FE98 䴓 U+E85C U+4D13
FE99 䴔 U+E85D U+4D14
FE9A 䴕 U+E85E U+4D15
FE9B 䴖 U+E85F U+4D16
FE9C 䴗 U+E860 U+4D17
FE9D 䴘 U+E861 U+4D18
FE9E 䴙 U+E862 U+4D19
FE9F 䶮 U+E863 U+4DAE
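Text that was decoded with a GBK-era table can therefore contain private use code points where GB 18030 expects real Unicode characters. One possible way to normalise such text is a translation table built from the rows above. The sketch below copies only four rows as a sample; the table and function names are hypothetical.

```python
# Sample rows copied from the table above: GBK private use code point -> Unicode.
# A full table would list every row; only four are shown here.
GBK_PUA_TO_UNICODE = {
    0xE7C8: 0x01F9,  # A8BF  ǹ
    0xE7E7: 0x303E,  # A989  〾
    0xE815: 0x2E81,  # FE50  ⺁
    0xE81A: 0x3473,  # FE55  㑳
}


def normalize_gbk_pua(text: str) -> str:
    """Replace known GBK private use code points with their Unicode equivalents."""
    return text.translate(GBK_PUA_TO_UNICODE)


sample = "\ue7c8 and \ue815"
print([f"U+{ord(ch):04X}" for ch in normalize_gbk_pua(sample) if ch != " "])
```

Characters outside the table pass through unchanged, so the function is safe to apply to ordinary text.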
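Differences between GB 18030-2005 and GB 18030-2000
GB 18030-2005 appends glyph tables for CJK Unified Ideographs Extension B, Korean, Mongolian (including Manchu, Todo, Sibe, and Ali Gali), Dehong Dai, Tibetan, Uyghur/Kazakh/Kyrgyz, and Yi. The Korean table contains 3,376 Hangul characters plus 69 letters and 51 compatibility letters; Mongolian contains 149 characters; Dai contains 35; Tibetan contains 193; Uyghur contains 49 characters plus 153 letter presentation forms; and Yi contains 1,215 characters (excluding U+A4A2, U+A4A3, U+A4B4, U+A4C1, and U+A4C5).
GB 18030-2000 did not map ḿ to its Unicode code point, leaving it in the private use area. GB 18030-2005 finally corrected this; see Table E.2 in Appendix E of the official document.
Columns: GB code, character, GB 18030-2000 mapping (private use area), GB 18030-2005 mapping (Unicode).
A8BC ḿ U+E7C7 U+1E3F
Characters whose mappings GB 18030 has not yet corrected
When GB 18030-2000 was released, CJK Unified Ideographs Extension B did not yet exist, so the characters below were mapped to the private use area. By the time GB 18030-2005 was released Unicode had encoded Extension B, yet GB 18030-2005 still maps these characters to the private use area, unchanged. See WG2 document N2773. As a result, GB 18030-2005 encodes each of the following 6 characters twice.
Columns: GB code, character, GB 18030 mapping (private use area), code point in Unicode 3.1 and later, duplicate GB 18030 4-byte code.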
FE51 𠂇 U+E816 U+20087 95329031
FE52 𠂉 U+E817 U+20089 95329033
FE53 𠃌 U+E818 U+200CC 95329730
FE6C 𡗗 U+E831 U+215D7 9536B937
FE76 𢦏 U+E83B U+2298F 9630BA35
FE91 𤇾 U+E855 U+241FE 9635B630
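The duplicate 4-byte codes in the last column can be reproduced arithmetically. The sketch below assumes the linear rule of the GB 18030 standard (not stated on this page) that U+10000 corresponds to 0x90308130 and that later supplementary code points follow in order. The function name is invented for this example.

```python
def supplementary_to_gb18030(code_point: int) -> bytes:
    """4-byte GB 18030 code of a supplementary-plane code point (U+10000-U+10FFFF)."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    # Linear index of 0x90308130 in the 126 x 10 x 126 x 10 grid, plus the offset.
    index = (0x90 - 0x81) * 10 * 126 * 10 + (code_point - 0x10000)
    index, d4 = divmod(index, 10)
    index, d3 = divmod(index, 126)
    d1, d2 = divmod(index, 10)
    return bytes([0x81 + d1, 0x30 + d2, 0x81 + d3, 0x30 + d4])


# Rows of the table above: Unicode code point -> duplicate 4-byte code.
rows = {
    0x20087: "95329031",
    0x20089: "95329033",
    0x200CC: "95329730",
    0x215D7: "9536B937",
    0x2298F: "9630BA35",
    0x241FE: "9635B630",
}
for code_point, expected in rows.items():
    actual = supplementary_to_gb18030(code_point).hex().upper()
    print(f"U+{code_point:05X} -> {actual} (table: {expected})")
```

Each computed value agrees with the table, which is how the six characters end up with both a 2-byte code (through the private use area) and a 4-byte code (through Extension B).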
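The characters below were already in GB 18030-2000, when Unicode did not yet encode them. Unicode 4.1 added all of them, but GB 18030-2005 still maps them to the private use area. See WG2 document N2773.
Columns: GB code, character, GB 18030 mapping (private use area), code point in Unicode 4.1 and later.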
A6D9 ︐ U+E78D U+FE10
A6DA ︒ U+E78E U+FE12
A6DB ︑ U+E78F U+FE11
A6DC ︓ U+E790 U+FE13
A6DD ︔ U+E791 U+FE14
A6DE ︕ U+E792 U+FE15
A6DF ︖ U+E793 U+FE16
A6EC ︗ U+E794 U+FE17
A6ED ︘ U+E795 U+FE18
A6F3 ︙ U+E796 U+FE19
FE59 龴 U+E81E U+9FB4
FE61 龵 U+E826 U+9FB5
FE66 龶 U+E82B U+9FB6
FE67 龷 U+E82C U+9FB7
FE6D 龸 U+E832 U+9FB8
FE7E 龹 U+E843 U+9FB9
FE90 龺 U+E854 U+9FBA
FEA0 龻 U+E864 U+9FBB
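Because these 18 characters still decode to private use code points under GB 18030-2005, it can be useful to flag them in decoded text. The sketch below builds a lookup from the table above and reports each occurrence with the Unicode 4.1 code point it could be replaced by. The names are hypothetical and nothing here is prescribed by the standard.

```python
# Private use code point -> Unicode 4.1 code point, copied from the table above.
PUA_TO_UNICODE_4_1 = {
    0xE78D: 0xFE10, 0xE78E: 0xFE12, 0xE78F: 0xFE11, 0xE790: 0xFE13,
    0xE791: 0xFE14, 0xE792: 0xFE15, 0xE793: 0xFE16, 0xE794: 0xFE17,
    0xE795: 0xFE18, 0xE796: 0xFE19,
    0xE81E: 0x9FB4, 0xE826: 0x9FB5, 0xE82B: 0x9FB6, 0xE82C: 0x9FB7,
    0xE832: 0x9FB8, 0xE843: 0x9FB9, 0xE854: 0x9FBA, 0xE864: 0x9FBB,
}


def find_pua_characters(text: str) -> list:
    """Return (position, private use code point, suggested Unicode code point) tuples."""
    return [
        (i, f"U+{ord(ch):04X}", f"U+{PUA_TO_UNICODE_4_1[ord(ch)]:04X}")
        for i, ch in enumerate(text)
        if ord(ch) in PUA_TO_UNICODE_4_1
    ]


print(find_pua_characters("a\ue78eb\ue864"))
```

Note that the private use order is not the Unicode order: U+E78E corresponds to U+FE12 and U+E78F to U+FE11, so a simple offset would swap two punctuation marks.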
Saturday, June 25, 2016, 9:15:04 pm
http://code.web.idv.hk/gb18030/gb18030.php
[ Last edited by zzz19760225 on 2017-11-28 at 11:24 ]