ASCII
ASCII (pronunciation: /ˈæski/ ASS-kee, American Standard Code for Information Interchange) is a character encoding based on the Latin alphabet. It is mainly used to represent modern English text. Standard 7-bit ASCII corresponds to the international standard ISO/IEC 646, and its 8-bit extensions (EASCII) can partially support other Western European languages. ASCII was the most widely used character encoding on the World Wide Web until December 2007, when it was overtaken by the Unicode encoding UTF-8.
Chinese Name American Standard Code for Information Interchange
Abbreviation ASCII
Category Coding Standard
English Name American Standard Code for Information Interchange
Other Name ASCII Code
Function Display modern English and other Western European languages
Table of Contents
1 Brief Introduction
2 Historical Evolution
3 Code Generation
4 Standard Code Table
5 Size Rules
6 International Issues
7 Commonly Used on Keyboard
8 Code Algorithm
9 Chinese Character Coding
1 Brief Introduction
ASCII code uses specified 7-bit or 8-bit binary number combinations to represent 128 or 256 possible characters. Standard ASCII code, also known as basic ASCII code, uses 7-bit binary numbers to represent all uppercase and lowercase letters, numbers 0 to 9, punctuation marks, and special control characters used in American English. Among them:
0-31 and 127 (33 in total) are control characters or communication-specific characters; the rest are printable characters. Examples of control characters are LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), and BEL (bell); examples of communication-specific characters are SOH (start of heading), EOT (end of transmission), and ACK (acknowledge). ASCII values 8, 9, 10, and 13 are backspace, tab, line feed, and carriage return respectively. These characters have no graphic form of their own, but they affect how text is displayed in ways that depend on the application.
32-126 (95 in total) are printable characters (32 is the space), among which 48-57 are the ten Arabic numerals 0 to 9, 65-90 are the 26 uppercase English letters, 97-122 are the 26 lowercase English letters, and the rest are punctuation marks, operation symbols, and so on.
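The ranges above can be turned directly into a small classifier. The following is a minimal sketch in C; the enum and the function name `classify_ascii` are made up for this illustration and are not part of any standard library.

```c
/* Categories that follow the numeric ranges listed above.
   All names here are illustrative, not a standard API. */
enum ascii_class {
    ASCII_INVALID,   /* outside 0-127 */
    ASCII_CONTROL,   /* 0-31 and 127 */
    ASCII_SPACE,     /* 32 */
    ASCII_DIGIT,     /* 48-57 */
    ASCII_UPPER,     /* 65-90 */
    ASCII_LOWER,     /* 97-122 */
    ASCII_PUNCT      /* every other printable code */
};

/* Classify a code using only comparisons against the ranges. */
enum ascii_class classify_ascii(int code)
{
    if (code < 0 || code > 127) return ASCII_INVALID;
    if (code <= 31 || code == 127) return ASCII_CONTROL;
    if (code == 32) return ASCII_SPACE;
    if (code >= 48 && code <= 57) return ASCII_DIGIT;
    if (code >= 65 && code <= 90) return ASCII_UPPER;
    if (code >= 97 && code <= 122) return ASCII_LOWER;
    return ASCII_PUNCT;
}
```

For example, `classify_ascii(10)` falls in the control range (line feed), and `classify_ascii(65)` is an uppercase letter.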
Note also that standard ASCII defines only 7 bits, so when a character is stored or transmitted as an 8-bit byte, the highest bit (b7) is often used as a parity bit. Parity checking is a method of detecting errors during transmission, and it comes in two forms: odd parity and even parity. With odd parity, the number of 1s in a correct byte must be odd; if the 7 code bits contain an even number of 1s, b7 is set to 1. With even parity, the number of 1s in a correct byte must be even; if the 7 code bits contain an odd number of 1s, b7 is set to 1.
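The parity rule can be sketched in a few lines of C. This is only an illustration under the assumption that the character occupies the low 7 bits and b7 carries the parity bit; the function names `count_ones7` and `add_parity` are hypothetical.

```c
#include <stdint.h>

/* Count the 1 bits among the low 7 bits of a code. */
static int count_ones7(uint8_t code)
{
    int ones = 0;
    for (int bit = 0; bit < 7; bit++) {
        ones += (code >> bit) & 1;
    }
    return ones;
}

/* Return the 8-bit byte for `code` with b7 chosen so that the total
   number of 1 bits is odd (odd != 0) or even (odd == 0). */
uint8_t add_parity(uint8_t code, int odd)
{
    uint8_t low7 = code & 0x7F;
    int ones = count_ones7(low7);
    /* An even count needs b7 = 1 only for odd parity;
       an odd count needs b7 = 1 only for even parity. */
    int need_bit = (ones % 2 == 0) ? (odd != 0) : (odd == 0);
    return (uint8_t)(low7 | (need_bit ? 0x80 : 0x00));
}
```

For instance, 'A' (0x41) has two 1 bits, so it is sent unchanged under even parity and as 0xC1 under odd parity.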
Codes 128-255 are called extended ASCII codes. Many x86-based systems support extended (or "high") ASCII. Extended ASCII uses the 8th bit of each byte to define an additional 128 special symbols, foreign-language letters, and graphic symbols.
2 Historical Evolution
6000 years ago Hieroglyphics
3000 years ago Alphabet
1821 to 1824 Louis Braille invented Braille, a 6-bit code that encodes letters, common letter combinations, common words, and punctuation. A special escape code indicates that the following character code should be interpreted as uppercase. A special shift code allows the following codes to be interpreted as numbers.
1838 to 1854 Samuel F. B. Morse invented the telegraph; each character of the alphabet corresponds to a series of short and long pulses
1890 Hollerith punched cards, from which early computer character codes evolved, including the 6-bit BCDIC (Binary-Coded Decimal Interchange Code)
1931 CCITT standardized Telex codes, including Baudot #2 codes, which are 5-bit codes for letters and digits.
1960s BCDIC was expanded to the 8-bit EBCDIC, the standard on IBM mainframes
1967 American Standard Code for Information Interchange (ASCII)
There was considerable debate over whether the character length should be 6, 7, or 8 bits. For reliability, shift characters were to be avoided, so ASCII could not be a 6-bit code; for cost reasons the 8-bit proposal was also rejected (storage was still very expensive per bit at the time).
The final character set therefore has 26 lowercase letters, 26 uppercase letters, 10 digits, 32 symbols, 33 control codes, and a space, for a total of 128 character codes.
ASCII is now documented in ANSI X3.4-1986, Coded Character Set - 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII), published by the American National Standards Institute.
3 Code Generation
In a computer, all data is stored and processed as binary numbers (because the hardware uses high and low voltage levels to represent 1 and 0). Letters such as a, b, c, d (and their uppercase forms), digits such as 0 and 1, and common symbols (such as *, #, and @) must therefore also be stored as binary numbers. Which binary number represents which symbol is a matter of convention, called an encoding, and anyone could define their own. But to communicate with one another without confusion, everyone must use the same encoding rules. So the relevant standards organization in the United States introduced the ASCII encoding, which stipulates uniformly which binary numbers represent the common symbols above.
The American Standard Code for Information Interchange was formulated by the American National Standards Institute (ANSI) as a standard single-byte character encoding scheme for text data. Work on it started in the late 1950s and it was finalized in 1967. It was originally an American national standard, meant to give different computers a common Western character encoding when communicating with each other. It has since been adopted by the International Organization for Standardization (ISO) as the international standard ISO 646. It covers the basic Latin letters.
4 Standard Code Table
Bin Dec Hex Abbreviation/Character Explanation
00000000 0 00 NUL (null) Null character
00000001 1 01 SOH (start of heading) Start of heading
00000010 2 02 STX (start of text) Start of text
00000011 3 03 ETX (end of text) End of text
00000100 4 04 EOT (end of transmission) End of transmission
00000101 5 05 ENQ (enquiry) Enquiry
00000110 6 06 ACK (acknowledge) Acknowledge
00000111 7 07 BEL (bell) Bell
00001000 8 08 BS (backspace) Backspace
00001001 9 09 HT (horizontal tab) Horizontal tab
00001010 10 0A LF (NL, line feed, new line) Line feed
00001011 11 0B VT (vertical tab) Vertical tab
00001100 12 0C FF (NP, form feed, new page) Form feed
00001101 13 0D CR (carriage return) Carriage return
00001110 14 0E SO (shift out) Shift out
00001111 15 0F SI (shift in) Shift in
00010000 16 10 DLE (data link escape) Data link escape
00010001 17 11 DC1 (device control 1) Device control 1
00010010 18 12 DC2 (device control 2) Device control 2
00010011 19 13 DC3 (device control 3) Device control 3
00010100 20 14 DC4 (device control 4) Device control 4
00010101 21 15 NAK (negative acknowledge) Negative acknowledge
00010110 22 16 SYN (synchronous idle) Synchronous idle
00010111 23 17 ETB (end of transmission block) End of transmission block
00011000 24 18 CAN (cancel) Cancel
00011001 25 19 EM (end of medium) End of medium
00011010 26 1A SUB (substitute) Substitute
00011011 27 1B ESC (escape) Escape
00011100 28 1C FS (file separator) File separator
00011101 29 1D GS (group separator) Group separator
00011110 30 1E RS (record separator) Record separator
00011111 31 1F US (unit separator) Unit separator
00100000 32 20 (space) Space
00100001 33 21 !
00100010 34 22 "
00100011 35 23 #
00100100 36 24 $
00100101 37 25 %
00100110 38 26 &
00100111 39 27 '
00101000 40 28 (
00101001 41 29 )
00101010 42 2A *
00101011 43 2B +
00101100 44 2C ,
00101101 45 2D -
00101110 46 2E .
00101111 47 2F /
00110000 48 30 0
00110001 49 31 1
00110010 50 32 2
00110011 51 33 3
00110100 52 34 4
00110101 53 35 5
00110110 54 36 6
00110111 55 37 7
00111000 56 38 8
00111001 57 39 9
00111010 58 3A :
00111011 59 3B ;
00111100 60 3C <
00111101 61 3D =
00111110 62 3E >
00111111 63 3F ?
01000000 64 40 @
01000001 65 41 A
01000010 66 42 B
01000011 67 43 C
01000100 68 44 D
01000101 69 45 E
01000110 70 46 F
01000111 71 47 G
01001000 72 48 H
01001001 73 49 I
01001010 74 4A J
01001011 75 4B K
01001100 76 4C L
01001101 77 4D M
01001110 78 4E N
01001111 79 4F O
01010000 80 50 P
01010001 81 51 Q
01010010 82 52 R
01010011 83 53 S
01010100 84 54 T
01010101 85 55 U
01010110 86 56 V
01010111 87 57 W
01011000 88 58 X
01011001 89 59 Y
01011010 90 5A Z
01011011 91 5B [
01011100 92 5C \
01011101 93 5D ]
01011110 94 5E ^
01011111 95 5F _
01100000 96 60 `
01100001 97 61 a
01100010 98 62 b
01100011 99 63 c
01100100 100 64 d
01100101 101 65 e
01100110 102 66 f
01100111 103 67 g
01101000 104 68 h
01101001 105 69 i
01101010 106 6A j
01101011 107 6B k
01101100 108 6C l
01101101 109 6D m
01101110 110 6E n
01101111 111 6F o
01110000 112 70 p
01110001 113 71 q
01110010 114 72 r
01110011 115 73 s
01110100 116 74 t
01110101 117 75 u
01110110 118 76 v
01110111 119 77 w
01111000 120 78 x
01111001 121 79 y
01111010 122 7A z
01111011 123 7B {
01111100 124 7C |
01111101 125 7D }
01111110 126 7E ~
01111111 127 7F DEL,(delete) Delete
Octal Hexadecimal Decimal Character Octal Hexadecimal Decimal Character
0 0 0 nul 100 40 64 @
1 1 1 soh 101 41 65 A
2 2 2 stx 102 42 66 B
3 3 3 etx 103 43 67 C
4 4 4 eot 104 44 68 D
5 5 5 enq 105 45 69 E
6 6 6 ack 106 46 70 F
7 7 7 bel 107 47 71 G
10 8 8 bs 110 48 72 H
11 9 9 ht 111 49 73 I
12 0a 10 nl 112 4a 74 J
13 0b 11 vt 113 4b 75 K
14 0c 12 ff 114 4c 76 L
15 0d 13 cr 115 4d 77 M
16 0e 14 so 116 4e 78 N
17 0f 15 si 117 4f 79 O
20 10 16 dle 120 50 80 P
21 11 17 dc1 121 51 81 Q
22 12 18 dc2 122 52 82 R
23 13 19 dc3 123 53 83 S
24 14 20 dc4 124 54 84 T
25 15 21 nak 125 55 85 U
26 16 22 syn 126 56 86 V
27 17 23 etb 127 57 87 W
30 18 24 can 130 58 88 X
31 19 25 em 131 59 89 Y
32 1a 26 sub 132 5a 90 Z
33 1b 27 esc 133 5b 91 [
34 1c 28 fs 134 5c 92 \
35 1d 29 gs 135 5d 93 ]
36 1e 30 rs 136 5e 94 ^
37 1f 31 us 137 5f 95 _
40 20 32 sp 140 60 96 `
41 21 33 ! 141 61 97 a
42 22 34 " 142 62 98 b
43 23 35 # 143 63 99 c
44 24 36 $ 144 64 100 d
45 25 37 % 145 65 101 e
46 26 38 & 146 66 102 f
47 27 39 ' 147 67 103 g
50 28 40 ( 150 68 104 h
51 29 41 ) 151 69 105 i
52 2a 42 * 152 6a 106 j
53 2b 43 + 153 6b 107 k
54 2c 44 , 154 6c 108 l
55 2d 45 - 155 6d 109 m
56 2e 46 . 156 6e 110 n
57 2f 47 / 157 6f 111 o
60 30 48 0 160 70 112 p
61 31 49 1 161 71 113 q
62 32 50 2 162 72 114 r
63 33 51 3 163 73 115 s
64 34 52 4 164 74 116 t
65 35 53 5 165 75 117 u
66 36 54 6 166 76 118 v
67 37 55 7 167 77 119 w
70 38 56 8 170 78 120 x
71 39 57 9 171 79 121 y
72 3a 58 : 172 7a 122 z
73 3b 59 ; 173 7b 123 {
74 3c 60 < 174 7c 124 |
75 3d 61 = 175 7d 125 }
76 3e 62 > 176 7e 126 ~
77 3f 63 ? 177 7f 127 del
5 Size Rules
1) Numbers 0-9 are smaller than letters. For example, "7" < "F";
2) Number 0 is smaller than number 9, and increases in order from 0 to 9. For example, "3" < "8"
3) Letter A is smaller than letter Z, and increases in order from A to Z. For example, "A" < "Z"
4) The uppercase letter of the same letter is smaller than the lowercase letter. For example, "A" < "a".
Remember the ASCII codes of several common characters:
"Line feed LF" is 0x0A; "Carriage return CR" is 0x0D; Space is 0x20; "0" is 0x30; "A" is 0x41; "a" is 0x61.
There is also a quick way on Windows to look up the character for a given code, including codes 128-255: create a new text document, hold down ALT, type the code value in decimal on the numeric keypad, and release ALT to display the corresponding character. For example, hold down ALT and type 97, then 'a' will be displayed.
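The ordering rules in this section follow directly from the numeric code values, which a short C sketch can make concrete. It assumes the compiler's execution character set is ASCII (true on most current platforms); `ascii_compare` and `ascii_to_lower` are names invented for this example.

```c
/* Compare two characters by their code values.
   Returns a negative number, zero, or a positive number. */
int ascii_compare(char a, char b)
{
    return (int)(unsigned char)a - (int)(unsigned char)b;
}

/* An uppercase letter and its lowercase form differ by exactly 0x20,
   which is why 'A' (0x41) sorts before 'a' (0x61). */
char ascii_to_lower(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c + 0x20) : c;
}
```

With these helpers, `ascii_compare('7', 'F')` is negative (rule 1) and `ascii_to_lower('A')` yields 'a'.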
6 International Issues
ASCII is an American standard, so it does not fully meet the needs even of other English-speaking countries. For example, where is the British pound symbol (£)? It also has no provision for:
Accented letters in languages that use the Latin alphabet
Greek, Hebrew, Arabic, and Russian (which uses the Cyrillic alphabet)
The ideographic characters of Chinese, Japanese, and Korean
In 1967, the International Organization for Standardization (ISO) recommended a variant of ASCII in which codes 0x40, 0x5B, 0x5C, 0x5D, 0x7B, 0x7C, and 0x7D "are reserved for national use", and codes 0x5E, 0x60, and 0x7E "can be used for other graphic symbols when national requirements need 8, 9, or 10 positions". This is obviously not an optimal international solution, because it does not guarantee consistency. But it shows how people tried to encode different languages.
Extended ASCII
1981 The IBM PC was introduced with a 256-character set in ROM, known as the IBM extended character set
November 1985 The Windows character set was called the "ANSI character set"; it follows an ANSI draft and ISO standard (ANSI/ISO 8859-1-1987, "Latin 1" for short)
(Figure: initial version of the ANSI character set)
April 1987 MS-DOS 3.3 introduced code pages, which map character codes to characters; code page 437 is the original IBM character set
Extended ASCII characters are the characters with codes 128 to 255 (0x80-0xFF).
Double-byte character set
Double-byte character sets (DBCS) were developed to encode the ideographic characters of Chinese, Japanese, and Korean while remaining compatible with ASCII.
A DBCS starts with 256 codes, just like ASCII. As with any well-behaved code page, the first 128 codes are ASCII.
However, some of the codes in the upper 128 are always followed by a second byte.
The two bytes together (called the lead byte and the trail byte) define one character, usually a complex ideograph.
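The lead-byte rule means that counting characters in a DBCS string is not the same as counting bytes. The sketch below shows the idea in C. The lead-byte range 0x81-0xFE used here is an assumption for illustration only; every real code page defines its own lead-byte ranges, and the names `is_lead_byte` and `dbcs_char_count` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical lead-byte test: treats 0x81-0xFE as lead bytes.
   Real code pages define their own ranges. */
static int is_lead_byte(unsigned char b)
{
    return b >= 0x81 && b <= 0xFE;
}

/* Count characters (not bytes) in a NUL-terminated DBCS string. */
size_t dbcs_char_count(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;
    while (*p != '\0') {
        if (is_lead_byte(*p) && p[1] != '\0') {
            p += 2;   /* lead byte plus its second byte: one character */
        } else {
            p += 1;   /* single-byte character, or a truncated pair */
        }
        count++;
    }
    return count;
}
```

A four-byte string holding 'a', one double-byte character, and 'b' therefore counts as three characters.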
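7 Commonly Used on Keyboard
The values below are Windows virtual-key codes (decimal). Several coincide with ASCII values, but they identify keys rather than characters.
ESC key: VK_ESCAPE (27)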
Enter key: VK_RETURN (13)
TAB key: VK_TAB (9)
Caps Lock key: VK_CAPITAL (20)
Shift key: VK_SHIFT (16)
Ctrl key: VK_CONTROL (17)
Alt key: VK_MENU (18)
Space bar: VK_SPACE (32)
Backspace key: VK_BACK (8)
Left Windows logo key: VK_LWIN (91)
Right Windows logo key: VK_RWIN (92)
Application (context menu) key: VK_APPS (93)
Insert key: VK_INSERT (45)
Home key: VK_HOME (36)
Page Up: VK_PRIOR (33)
PageDown: VK_NEXT (34)
End key: VK_END (35)
Delete key: VK_DELETE (46)
Direction key (←): VK_LEFT (37)
Direction key (↑): VK_UP (38)
Direction key (→): VK_RIGHT (39)
Direction key (↓): VK_DOWN (40)
F1 key: VK_F1 (112)
F2 key: VK_F2 (113)
F3 key: VK_F3 (114)
F4 key: VK_F4 (115)
F5 key: VK_F5 (116)
F6 key: VK_F6 (117)
F7 key: VK_F7 (118)
F8 key: VK_F8 (119)
F9 key: VK_F9 (120)
F10 key: VK_F10 (121)
F11 key: VK_F11 (122)
F12 key: VK_F12 (123)
Num Lock key: VK_NUMLOCK (144)
Numeric keypad 0: VK_NUMPAD0 (96)
Numeric keypad 1: VK_NUMPAD1 (97)
Numeric keypad 2: VK_NUMPAD2 (98)
Numeric keypad 3: VK_NUMPAD3 (99)
Numeric keypad 4: VK_NUMPAD4 (100)
Numeric keypad 5: VK_NUMPAD5 (101)
Numeric keypad 6: VK_NUMPAD6 (102)
Numeric keypad 7: VK_NUMPAD7 (103)
Numeric keypad 8: VK_NUMPAD8 (104)
Numeric keypad 9: VK_NUMPAD9 (105)
Numeric keypad .: VK_DECIMAL (110)
Numeric keypad *: VK_MULTIPLY (106)
Numeric keypad +: VK_ADD (107)
Numeric keypad -: VK_SUBTRACT (109)
Numeric keypad /: VK_DIVIDE (111)
Pause Break key: VK_PAUSE (19)
Scroll Lock key: VK_SCROLL (145)
8 Code Algorithm
In ASCII, the letter A is defined as 01000001, that is, decimal 65. With this standard, when we type A, the computer knows that the binary code of the character is 01000001. Without such a standard, we would have to work out for ourselves how to tell the computer that we typed an A, and re-encode the text on every other machine. An ASCII code is, strictly speaking, a binary value; decimal is just a more convenient way to write it. For example, the binary code of A is 01000001, which is 65 in decimal and 41H in hexadecimal.
The ASCII code table contains only letters, digits, punctuation marks, and control codes. This is mainly because the computer was invented in the United States, and ASCII is sufficient for English. Chinese characters, however, cannot be represented in ASCII. To input Chinese characters, there must be a standard like ASCII that assigns a code to each Chinese character. This is China's national standard code for Chinese characters, which defines how Chinese characters are represented in the computer. With this standard, when we type a Chinese character, the input code is converted to the character's row-cell code (区位码, quwei code), and this unique code is used to look up the character's glyph data for display. Of course, the row-cell code is also stored in binary in the computer.
1. Conversion of binary number to decimal number
The weight of bit 0 of a binary number is 2^0, the weight of bit 1 is 2^1, and so on.
So, suppose there is a binary number: 0110 0100. Its conversion to decimal in vertical form is:
0110 0100 converted to decimal
Bit 0 0 * 2^0 = 0
Bit 1 0 * 2^1 = 0
Bit 2 1 * 2^2 = 4
Bit 3 0 * 2^3 = 0
Bit 4 0 * 2^4 = 0
Bit 5 1 * 2^5 = 32
Bit 6 1 * 2^6 = 64
Bit 7 0 * 2^7 = 0 +
---------------------------
100
Calculated in horizontal form:
0 * 2^0 + 0 * 2^1 + 1 * 2^2 + 0 * 2^3 + 0 * 2^4 + 1 * 2^5 + 1 * 2^6 + 0 * 2^7 = 100
0 multiplied by anything is 0, so we can also skip the bits with value 0 directly:
1 * 2^2 + 1 * 2^5 + 1 * 2^6 = 100
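The same weighted sum can be written as a short C function. This is a sketch only: the name `binary_to_decimal` is made up for this example, and it assumes the value fits in a `long` (inputs of more than about 30 bits are not handled safely on every platform).

```c
#include <string.h>

/* Convert a string of '0'/'1' characters (spaces allowed for grouping)
   to its value by adding 2^n for every bit n that is 1.
   Returns -1 if the string contains any other character. */
long binary_to_decimal(const char *bits)
{
    long value = 0;
    long weight = 1;                  /* 2^0 for the rightmost bit */
    for (size_t i = strlen(bits); i > 0; i--) {
        char c = bits[i - 1];
        if (c == ' ') continue;       /* allow "0110 0100" */
        if (c != '0' && c != '1') return -1;
        if (c == '1') value += weight;
        weight *= 2;                  /* the next bit to the left weighs twice as much */
    }
    return value;
}
```

Calling `binary_to_decimal("0110 0100")` walks the bits from right to left and adds 4 + 32 + 64, giving 100 as in the hand calculation.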
2. Conversion of octal number to decimal number
Octal is base 8.
Octal numbers use these eight numbers from 0 to 7 to express a number.
The weight of digit 0 of an octal number is 8^0, the weight of digit 1 is 8^1, the weight of digit 2 is 8^2, and so on.
So, suppose there is an octal number: 1507. Its conversion to decimal, expressed in vertical form, is:
1507 converted to decimal.
Bit 0 7 * 8^0 = 7
Bit 1 0 * 8^1 = 0
Bit 2 5 * 8^2 = 320
Bit 3 1 * 8^3 = 512 +
--------------------------
839
Similarly, we can also calculate directly in horizontal form:
7 * 8^0 + 0 * 8^1 + 5 * 8^2 + 1 * 8^3 = 839
The result is that the octal number 1507 is converted to the decimal number 839.
Expression method of octal number
In C and C++, how to express an octal number? If this number is 876, we can conclude that it is not an octal number because no digit greater than 7 in the octal number can appear. But if this number is 123, 567, or 12345670, then it may be an octal number or a decimal number.
Therefore, C and C++ stipulate that if a number is to indicate that it uses octal, a 0 must be added in front of it, such as: 123 is decimal, but 0123 indicates that octal is used. This is the expression method of octal numbers in C and C++.
Older versions of C and C++ provide no way to write binary numbers directly (binary literals with a 0b prefix were only standardized in C++14 and C23), so octal is the second way of writing numbers in C and C++ that we have learned.
Now, for the same number, for example, 100, we can express it in the normal decimal form in the code, for example, when initializing a variable:
int a = 100;
We can also write it like this:
int a = 0144; // 0144 is octal for decimal 100; converting a decimal number to octal is covered later.
Be sure to remember that when expressing in octal, you cannot miss the leading 0. Otherwise, the computer will take it all as decimal. However, there is a place where 0 cannot be added when using octal numbers, that is, the "escape character" expression method we learned earlier.
Use of octal numbers in escape characters
We have learned the method of using the escape character '\' plus a special letter to represent a certain character: '\n' means newline, '\t' means the Tab character, and '\'' means a single quote. There is another way of using the escape character: '\' followed by an octal number represents the character whose ASCII value equals that number.
For example, in the ASCII code table above, the ASCII value of the question mark character (?) is 63. Converting it to octal gives 77, so '\77' represents '?'. Since it is octal, one might expect to write '\077', but C and C++ never treat a backslash followed by digits as a decimal number, so the leading 0 can be omitted.
In fact, we rarely use the escape character plus an octal number to represent a character in actual programming, so this subsection is only for your understanding.
Conversion of hexadecimal number to decimal number
Binary, using two Arabic numerals: 0, 1;
Octal, using eight Arabic numerals: 0, 1, 2, 3, 4, 5, 6, 7;
Decimal, using ten Arabic numerals: 0 to 9;
Hexadecimal, using sixteen Arabic numerals... wait. Did the Arabs or Indians only invent 10 digits?
Hexadecimal is base 16, but we only have the ten digits 0 to 9, so we use the six letters A, B, C, D, E, F to represent 10, 11, 12, 13, 14, 15 respectively. The letters are not case-sensitive.
The weight of the 0th bit of a hexadecimal number is 16 to the power of 0, the weight of the 1st bit is 16 to the power of 1, the weight of the 2nd bit is 16 to the power of 2...
So, on the Nth bit (N starts from 0), if it is a number X (X is greater than or equal to 0, and X is less than or equal to 15, that is, F), the size represented is X * 16 to the power of N.
Suppose there is a hexadecimal number 2AF5, then how to convert it to decimal?
Expressed in vertical form:
2AF5 converted to decimal:
Bit 0: 5 * 16^0 = 5
Bit 1: F * 16^1 = 240
Bit 2: A * 16^2 = 2560
Bit 3: 2 * 16^3 = 8192 +
-------------------------------------
10997
Direct calculation is:
5 * 16^0 + F * 16^1 + A * 16^2 + 2 * 16^3 = 10997
(Don't forget, in the above calculation, A represents 10, and F represents 15)
Now it can be seen that the key to converting any base to decimal is that each base has different weights.
Suppose someone asks you why the decimal number 1234 is one thousand two hundred and thirty-four? You can give him such a formula:
1234 = 1 * 10^3 + 2 * 10^2 + 3 * 10^1 + 4 * 10^0
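The worked examples for binary, octal, and hexadecimal can be checked against the C standard library, whose `strtol` function parses a digit string in a given base. The wrapper below is a small sketch; `parse_in_base` is a name invented here, and it is only meant for non-negative inputs, since -1 doubles as its error value.

```c
#include <stdlib.h>

/* Parse `digits` in the given base (2 to 36) with the standard strtol.
   Returns -1 if the string is empty or contains a digit that is not
   valid in that base. Intended for non-negative inputs only. */
long parse_in_base(const char *digits, int base)
{
    char *end;
    long value = strtol(digits, &end, base);
    if (end == digits || *end != '\0') return -1;
    return value;
}
```

For example, `parse_in_base("1507", 8)` and `parse_in_base("2AF5", 16)` reproduce the results 839 and 10997 computed by hand above.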
Expression method of hexadecimal number
If the special writing form is not used, the hexadecimal number will also be confused with the decimal number. Any number: 9876, it is impossible to see whether it is a hexadecimal number or a decimal number.
C and C++ stipulate that a hexadecimal number must start with 0x. For example, 0x1 represents a hexadecimal number. And 1 represents a decimal number. In addition, such as: 0xff, 0xFF, 0X102A, etc. The x in them is also case-insensitive. (Note: The 0 in 0x is the number 0, not the letter O)
The following are some usage examples:
int a = 0x100F;
int b = 0x70 + a;
So far, we have learned how to write numbers in decimal, octal, and hexadecimal. One last point is important. In C/C++, a literal in any of these bases has no sign of its own; a leading minus is the unary negation operator applied to the literal. So -12 is negative 12, and likewise -0xF2 is negative 242. Note that -078 does not compile at all, because 8 is not an octal digit.
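As a quick check of the three notations, the following sketch writes the same value in decimal, octal, and hexadecimal and evaluates the usage example from above. The function names `literals_agree` and `sum_example` are hypothetical and exist only for this illustration.

```c
/* The same value written in decimal, octal, and hexadecimal. */
int literals_agree(void)
{
    int dec = 100;
    int oct = 0144;   /* leading 0 marks octal: 1*64 + 4*8 + 4 */
    int hex = 0x64;   /* leading 0x marks hexadecimal: 6*16 + 4 */
    return dec == oct && oct == hex;
}

/* The usage example from the text, with the decimal values spelled out. */
int sum_example(void)
{
    int a = 0x100F;     /* 4096 + 15 = 4111 */
    int b = 0x70 + a;   /* 112 + 4111 */
    return b;
}
```

Here `literals_agree()` returns 1, and `sum_example()` returns 4223.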
Use of hexadecimal numbers in escape characters
The escape character can also be followed by a hexadecimal number to represent a character. For example, the '?' character from the octal discussion above can be written in the following ways:
'?' //Directly input the character
'\77' //Use octal, and the 0 at the beginning can be omitted here
'\x3F' //Use hexadecimal: a backslash, then x, then the hexadecimal digits (no leading 0)
Similarly, this section is only for understanding. Except for the null character '\0', which is an octal escape, we rarely use the latter two methods to represent a character.
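A compiler can confirm that these spellings denote the same character. In C and C++ the hexadecimal escape is written as a backslash, the letter x, and then the hexadecimal digits. The sketch below assumes an ASCII execution character set; `escapes_agree` is a made-up name.

```c
/* Three spellings of the question mark character (ASCII 63). */
int escapes_agree(void)
{
    char direct = '?';
    char octal  = '\77';    /* octal escape: 7*8 + 7 = 63 */
    char hex    = '\x3F';   /* hexadecimal escape: 3*16 + 15 = 63 */
    return direct == octal && octal == hex && direct == 63;
}
```

The same rule explains why '\0' is the null character: it is an octal escape with the value 0.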
3. Conversion from decimal number to binary, octal, hexadecimal number
Conversion of decimal number to binary number
Give you a decimal number, for example: 6, how to convert it to binary?
Converting a decimal number to a binary number is a continuous process of dividing by 2:
Divide the number to be converted by 2 to get the quotient and remainder,
Divide the quotient by 2 until the quotient is 0. Finally, reverse the order of all remainders, and the resulting number is the conversion result.
It sounds a bit confusing? Let's explain with an example. For example, to convert 6 to binary.
"Divide the number to be converted by 2 to get the quotient and remainder".
Then:
The number to be converted is 6, 6 ÷ 2, get quotient 3, remainder 0. (Don't tell me you can't calculate 6 ÷ 2!)
"Divide the quotient by 2 until the quotient is 0..."
Now the quotient is 3, which is not 0, so continue dividing by 2.
Then: 3 ÷ 2, get quotient 1, remainder 1.
"Divide the quotient by 2 until the quotient is 0..."
Now the quotient is 1, which is not 0, so continue dividing by 2.
Then: 1 ÷ 2, get quotient 0, remainder 1 (take a pen and paper to calculate, 1 ÷ 2 is quotient 0 and remainder 1!)
"Divide the quotient by 2 until the quotient is 0... Finally, reverse the order of all remainders"
Great! Now the quotient is 0.
We get remainders 0, 1, 1 in three calculations in sequence, and reverse the order of all remainders, which is: 110!
6 converted to binary is 110.
Change the above paragraph to a table, it is as follows:
Dividend Calculation process Quotient Remainder
6 6/2 3 0
3 3/2 1 1
1 1/2 0 1
(In the computer, ÷ is represented by /)
If it is an exam, it will take a bit of time to draw such a table, so the more common conversion process is to use the following figure for consecutive division:
(Figure: 1)
Please compare the figure, table, and text description, and calculate by yourself how to convert 6 to binary.
After talking for a long time, is our conversion result correct? Is binary number 110 equal to 6? You have learned how to convert binary number to decimal number, so please calculate now whether 110 converted to decimal is 6.
Conversion of decimal number to octal and hexadecimal number
Happily, the method of converting a decimal number to octal is the same as the method of converting to binary, with one change: the divisor is 8 instead of 2.
Take an example, how to convert the decimal number 120 to octal.
Expressed in a table:
Dividend Calculation process Quotient Remainder
120 120/8 15 0
15 15/8 1 7
1 1/8 0 1
120 converted to octal is: 170.
Even better, the method of converting a decimal number to hexadecimal is again the same as converting to binary, with one change: the divisor is 16 instead of 2.
Also 120, converted to hexadecimal is:
Dividend Calculation process Quotient Remainder
120 120/16 7 8
7 7/16 0 7
120 converted to hexadecimal is: 78.
Please take a pen and paper, in the form of (Figure: 1), calculate the process of the above two tables.
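The repeated-division procedure is the same for every base, so one function can cover binary, octal, and hexadecimal. The following C sketch collects the remainders and then reverses them; the function name `from_decimal` and its buffer-based interface are choices made for this example only.

```c
#include <stddef.h>

/* Convert n to the given base (2 to 16) by repeated division.
   Writes the digits as a NUL-terminated string into out and returns
   the number of digits, or 0 if the base is invalid or out is too small. */
size_t from_decimal(unsigned long n, unsigned base, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    char reversed[sizeof(unsigned long) * 8];   /* enough even for base 2 */
    size_t count = 0;

    if (base < 2 || base > 16) return 0;

    do {
        reversed[count++] = digits[n % base];   /* remainder: next digit, lowest first */
        n /= base;                              /* quotient feeds the next step */
    } while (n != 0);

    if (count + 1 > out_size) return 0;         /* need room for the terminator */

    for (size_t i = 0; i < count; i++) {
        out[i] = reversed[count - 1 - i];       /* reverse the order of the remainders */
    }
    out[count] = '\0';
    return count;
}
```

With a buffer `char buf[40]`, the call `from_decimal(120, 8, buf, sizeof buf)` leaves "170" in `buf`, and base 16 gives "78", matching the two tables.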
4. Conversion between binary and hexadecimal numbers
The conversion between binary and hexadecimal is relatively important. However, the conversion between the two does not need to be calculated. Every C and C++ programmer can directly convert a binary number to a hexadecimal number, and vice versa.
We are the same, as long as we learn this section, we can do it.
First, let's look at a binary number: 1111, what is it?
You may still calculate like this: 1 * 2^0 + 1 * 2^1 + 1 * 2^2 + 1 * 2^3 = 1 * 1 + 1 * 2 + 1 * 4 + 1 * 8 = 15.
However, since 1111 is only 4 bits, we should simply memorize the weight of each bit, from the high bit to the low bit: 8, 4, 2, 1. That is, the weight of the highest bit is 2^3 = 8, and then 2^2 = 4, 2^1 = 2, 2^0 = 1.
Remember 8421, for any 4-bit binary number, we can quickly calculate its corresponding decimal value.
The following lists all possible values of a 4-bit binary number xxxx (some are omitted in the middle)
4-bit binary number Quick calculation method Decimal value Hexadecimal value
1111 = 8 + 4 + 2 + 1 = 15 F
1110 = 8 + 4 + 2 + 0 = 14 E
1101 = 8 + 4 + 0 + 1 = 13 D
1100 = 8 + 4 + 0 + 0 = 12 C
1011 = 8 + 0 + 2 + 1 = 11 B
1010 = 8 + 0 + 2 + 0 = 10 A
1001 = 8 + 0 + 0 + 1 = 9 9
...
0001 = 0 + 0 + 0 + 1 = 1 1
0000 = 0 + 0 + 0 + 0 = 0 0
To convert a binary number to hexadecimal, it is to divide into 4-bit segments and convert each segment to hexadecimal.
For example (the upper line is the binary number, the following is the corresponding hexadecimal):
1111 1101, 1010 0101, 1001 1011
F D, A 5, 9 B
Conversely, when we see FD, how to quickly convert it to binary?
First convert F:
Seeing F, we need to know it is 15 (maybe you are not yet familiar with the six digits A~F). Then how do we use 8421 to make up 15? It should be 8 + 4 + 2 + 1, so all four bits are 1: 1111.
Then convert D:
See D, know it is 13, how to use 8421 to make up 13? It should be: 8 + 4 + 1, that is: 1101.
So, FD converted to binary is: 1111 1101
Since the conversion from hexadecimal to binary is quite direct, so when we need to convert a decimal number to a binary number, we can also convert it to hexadecimal first, and then convert it to binary.
For example, the decimal number 1234 is converted to binary. If we want to divide by 2 all the time to directly get the binary number, it needs to be calculated many times. So we can first divide by 16 to get the hexadecimal number:
Dividend Calculation process Quotient Remainder
1234 1234/16 77 2
77 77/16 4 13 (D)
4 4/16 0 4
The result hexadecimal is: 0x4D2
Then we can directly write the binary form of 0x4D2: 0100 1101 0010.
The corresponding relationship is:
0100 -- 4
1101 -- D
0010 -- 2
Similarly, if a binary number is very long, and we need to convert it to a decimal number, in addition to the method we learned earlier, we can also first convert this binary number to hexadecimal, and then convert it to decimal.
The following is an example of an int type binary number:
01101101 11100101 10101111 00011011
We convert it to hexadecimal in groups of four bits: 6D E5 AF 1B
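Grouping by four bits is easy to mechanize. The C sketch below reads a binary string one bit at a time and emits one hexadecimal digit for every complete group of four; the name `binary_to_hex` and the requirement that the bit count be a multiple of 4 are assumptions made to keep the example short.

```c
#include <stddef.h>

/* Convert a binary string (spaces ignored) whose bit count is a multiple
   of 4 into hexadecimal digits, one per 4-bit group.
   Returns the number of hex digits written, or 0 on any error. */
size_t binary_to_hex(const char *bits, char *out, size_t out_size)
{
    static const char hex_digits[] = "0123456789ABCDEF";
    size_t written = 0;
    int nibble = 0;      /* value accumulated for the current group */
    int bits_seen = 0;   /* number of bits collected in the current group */

    for (const char *p = bits; *p != '\0'; p++) {
        if (*p == ' ') continue;
        if (*p != '0' && *p != '1') return 0;
        /* Doubling before adding gives the four bits the weights 8, 4, 2, 1. */
        nibble = nibble * 2 + (*p - '0');
        if (++bits_seen == 4) {
            if (written + 1 >= out_size) return 0;   /* keep room for the terminator */
            out[written++] = hex_digits[nibble];
            nibble = 0;
            bits_seen = 0;
        }
    }
    if (bits_seen != 0 || out_size == 0) return 0;   /* incomplete last group */
    out[written] = '\0';
    return written;
}
```

Applied to the 32-bit example above, the function produces the eight digits 6D E5 AF 1B without any arithmetic beyond the 8421 weights.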
9 Chinese Character Coding
0-127 is the range of 7-bit ASCII code, which is an international standard.
For Chinese characters, the range of byte values used differs between character sets. Commonly used Chinese character sets include GB2312-80, GBK, Big5, Unicode, and others. Here the focus is on the GB2312 character set.
GB2312 is a widely used Chinese character encoding standard. The GBK character set used in Windows 95/98/2000 contains, and is compatible with, GB2312. The GB2312 character set contains 6763 simplified Chinese characters and 682 standard Chinese symbols. In this standard, each Chinese character is represented by 2 bytes, and the value of each byte is 161-254 (hexadecimal A1-FE). The first byte corresponds to rows 1-94 of the row-cell code, and the second byte corresponds to cells 1-94.
The range 161-254 is actually easy to remember. In ASCII, the range of printable characters (excluding the space) is 33-126. Add 128 to both ends of this range (or set the highest bit to 1) to get the range of byte values used by Chinese characters.
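The relationship between a row-cell code and the two GB2312 bytes is a fixed offset: each number from 1 to 94 is added to 0xA0 (160), which gives exactly the range 161-254. A minimal C sketch, with the hypothetical function name `quwei_to_bytes`:

```c
/* Convert a GB2312 row number and cell number (each 1-94) into the two
   bytes of the encoded character. Returns 0 on success, or -1 if either
   number is out of range. */
int quwei_to_bytes(int row, int cell, unsigned char out[2])
{
    if (row < 1 || row > 94 || cell < 1 || cell > 94) return -1;
    out[0] = (unsigned char)(0xA0 + row);    /* first byte: 161-254 */
    out[1] = (unsigned char)(0xA0 + cell);   /* second byte: 161-254 */
    return 0;
}
```

For example, the character 啊 has row 16 and cell 01, so its GB2312 bytes are 0xB0 0xA1.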
// GB18030 places the first byte of a two-byte Chinese character in 0x81-0xFE and the second byte in 0x40-0x7E or 0x80-0xFE, so the first byte is always greater than 128.
if ((char_temp >= 0x81) && (char_temp <= 0xFE))   // first-byte range check
{
    if (*len /* the rest of this condition is missing in the source */)
    {
        *len += 1;               // count the accepted byte
        *p_temp++ = char_temp;   // store it in the buffer
        _putch(char_temp);       // echo it to the console
        x++;
    }
}
1 Brief Introduction
ASCII code uses 7-bit or 8-bit binary combinations to represent 128 or 256 possible characters. Standard ASCII code, also known as basic ASCII code, uses 7-bit binary numbers to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English. Among them:
0-31 and 127 (33 in total) are control characters or communication-specific characters; the rest are displayable characters. Control characters include LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), and BEL (bell); communication-specific characters include SOH (start of heading), EOT (end of transmission), and ACK (acknowledge). The ASCII values 8, 9, 10, and 13 correspond to backspace, tab, line feed, and carriage return respectively. These characters have no graphic form of their own, but they affect how text is displayed, in ways that depend on the application.
32-126 (95 in total) are displayable characters (32 is the space). Of these, 48-57 are the ten Arabic numerals 0 to 9,
65-90 are the 26 uppercase English letters, 97-122 are the 26 lowercase English letters, and the rest are punctuation marks, operation symbols, and so on.
Note also that when a 7-bit ASCII code is stored or transmitted in an 8-bit byte, the highest bit (b7) can be used as a parity bit. A parity check is a method for detecting errors during transmission, and it comes in two types: odd parity and even parity. Odd parity: the number of 1s in a correct byte must be odd; if the seven data bits contain an even number of 1s, b7 is set to 1. Even parity: the number of 1s in a correct byte must be even; if the seven data bits contain an odd number of 1s, b7 is set to 1.
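The parity rule can be sketched in C as follows. This is a minimal illustration, assuming the 7-bit code sits in the low seven bits of a byte and b7 is the parity bit; the function names with_even_parity and with_odd_parity are made up for this example.

```c
// Hypothetical helper: returns the 7-bit code with b7 chosen so that the
// whole byte contains an even number of 1 bits.
unsigned char with_even_parity(unsigned char code) {
    unsigned char data = code & 0x7F;  // keep only the seven data bits
    int ones = 0;
    for (int bit = 0; bit < 7; bit++) {
        ones += (data >> bit) & 1;     // count the 1 bits
    }
    // An odd count needs one more 1, so set b7; an even count leaves b7 at 0.
    return (ones % 2 == 0) ? data : (unsigned char)(data | 0x80);
}

// Hypothetical helper: odd parity is even parity with b7 flipped.
unsigned char with_odd_parity(unsigned char code) {
    return (unsigned char)(with_even_parity(code) ^ 0x80);
}
```

For 'A' (0x41, two 1 bits), even parity leaves the byte as 0x41 and odd parity gives 0xC1.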
Codes 128-255 are called extended ASCII codes. Many x86-based systems support extended (or "high") ASCII. Extended ASCII uses the 8th bit of each byte to define an additional 128 special symbols, foreign-language letters, and graphic symbols.
2 Historical Evolution
6000 years ago Hieroglyphics
3000 years ago Alphabet
1838 to 1854 Samuel F. B. Morse invented the telegraph; each character in the alphabet corresponds to a series of short and long pulses
1821 to 1824 Louis Braille invented Braille, a 6-bit code that encodes characters, common letter combinations, common words, and punctuation.
A special escape code indicates that the following character code should be read as uppercase. A special shift code makes the following codes read as numbers.
1931 CCITT standardized Telex codes, including Baudot #2, a 5-bit code covering letters and numbers.
1890 Early computer character codes came from Hollerith cards and led to the 6-bit character code system BCDIC (Binary-Coded Decimal Interchange Code)
1960s BCDIC was expanded to the 8-bit EBCDIC, the standard on IBM mainframes
1967 American Standard Code for Information Interchange (ASCII)
There was a big dispute over whether the character length should be 6, 7, or 8 bits. For reliability reasons, shift characters were to be avoided,
so ASCII could not be a 6-bit encoding; and for cost reasons the 8-bit proposal was also rejected (at that time, every bit of storage was still very expensive).
The final character set thus has 26 lowercase letters, 26 uppercase letters, 10 digits, 32 symbols, 33 control codes, and a space, 128 character codes in total.
ASCII is now recorded in ANSI X3.4-1986, Character Set - 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII), published by the American National Standards Institute.
3 Code Generation
In a computer, all data is stored and processed as binary numbers (a computer represents 1 and 0 by high and low voltage levels). Letters such as a, b, c, d (and their uppercase forms), digits such as 0 and 1, and common symbols (such as *, #, @) must therefore also be stored as binary numbers. Which binary number stands for which symbol is a matter of convention (this is called encoding), and in principle everyone could define their own. But to communicate with one another without confusion, everyone must use the same encoding rules. So the relevant standards organization in the United States introduced the ASCII encoding, which uniformly stipulates which binary numbers represent these common symbols.
The American Standard Code for Information Interchange was formulated by the American National Standards Institute (ANSI) as a standard single-byte character encoding scheme for text data. Work on it began in the late 1950s and it was finalized in 1967. It was originally an American national standard, intended as a common Western character encoding for communication between different computers. It has since been adopted as an international standard by the International Organization for Standardization (ISO), as ISO 646. It covers the basic Latin alphabet.
4 Standard Code Table
Bin Dec Hex Abbreviation/Character Explanation
00000000 0 00 NUL (null) Null character
00000001 1 01 SOH (start of heading) Start of heading
00000010 2 02 STX (start of text) Start of text
00000011 3 03 ETX (end of text) End of text
00000100 4 04 EOT (end of transmission) End of transmission
00000101 5 05 ENQ (enquiry) Enquiry
00000110 6 06 ACK (acknowledge) Acknowledge
00000111 7 07 BEL (bell) Bell
00001000 8 08 BS (backspace) Backspace
00001001 9 09 HT (horizontal tab) Horizontal tab
00001010 10 0A LF (NL line feed, new line) Line feed
00001011 11 0B VT (vertical tab) Vertical tab
00001100 12 0C FF (NP form feed, new page) Form feed
00001101 13 0D CR (carriage return) Carriage return
00001110 14 0E SO (shift out) Shift out
00001111 15 0F SI (shift in) Shift in
00010000 16 10 DLE (data link escape) Data link escape
00010001 17 11 DC1 (device control 1) Device control 1
00010010 18 12 DC2 (device control 2) Device control 2
00010011 19 13 DC3 (device control 3) Device control 3
00010100 20 14 DC4 (device control 4) Device control 4
00010101 21 15 NAK (negative acknowledge) Negative acknowledge
00010110 22 16 SYN (synchronous idle) Synchronous idle
00010111 23 17 ETB (end of trans. block) End of transmission block
00011000 24 18 CAN (cancel) Cancel
00011001 25 19 EM (end of medium) End of medium
00011010 26 1A SUB (substitute) Substitute
00011011 27 1B ESC (escape) Escape
00011100 28 1C FS (file separator) File separator
00011101 29 1D GS (group separator) Group separator
00011110 30 1E RS (record separator) Record separator
00011111 31 1F US (unit separator) Unit separator
00100000 32 20 (space) Space
00100001 33 21 !
00100010 34 22 "
00100011 35 23 #
00100100 36 24 $
00100101 37 25 %
00100110 38 26 &
00100111 39 27 '
00101000 40 28 (
00101001 41 29 )
00101010 42 2A *
00101011 43 2B +
00101100 44 2C ,
00101101 45 2D -
00101110 46 2E .
00101111 47 2F /
00110000 48 30 0
00110001 49 31 1
00110010 50 32 2
00110011 51 33 3
00110100 52 34 4
00110101 53 35 5
00110110 54 36 6
00110111 55 37 7
00111000 56 38 8
00111001 57 39 9
00111010 58 3A :
00111011 59 3B ;
00111100 60 3C <
00111101 61 3D =
00111110 62 3E >
00111111 63 3F ?
01000000 64 40 @
01000001 65 41 A
01000010 66 42 B
01000011 67 43 C
01000100 68 44 D
01000101 69 45 E
01000110 70 46 F
01000111 71 47 G
01001000 72 48 H
01001001 73 49 I
01001010 74 4A J
01001011 75 4B K
01001100 76 4C L
01001101 77 4D M
01001110 78 4E N
01001111 79 4F O
01010000 80 50 P
01010001 81 51 Q
01010010 82 52 R
01010011 83 53 S
01010100 84 54 T
01010101 85 55 U
01010110 86 56 V
01010111 87 57 W
01011000 88 58 X
01011001 89 59 Y
01011010 90 5A Z
01011011 91 5B [
01011100 92 5C \
01011101 93 5D ]
01011110 94 5E ^
01011111 95 5F _
01100000 96 60 `
01100001 97 61 a
01100010 98 62 b
01100011 99 63 c
01100100 100 64 d
01100101 101 65 e
01100110 102 66 f
01100111 103 67 g
01101000 104 68 h
01101001 105 69 i
01101010 106 6A j
01101011 107 6B k
01101100 108 6C l
01101101 109 6D m
01101110 110 6E n
01101111 111 6F o
01110000 112 70 p
01110001 113 71 q
01110010 114 72 r
01110011 115 73 s
01110100 116 74 t
01110101 117 75 u
01110110 118 76 v
01110111 119 77 w
01111000 120 78 x
01111001 121 79 y
01111010 122 7A z
01111011 123 7B {
01111100 124 7C |
01111101 125 7D }
01111110 126 7E ~
01111111 127 7F DEL (delete) Delete
Octal Hexadecimal Decimal Character Octal Hexadecimal Decimal Character
0 0 0 nul 100 40 64 @
1 1 1 soh 101 41 65 A
2 2 2 stx 102 42 66 B
3 3 3 etx 103 43 67 C
4 4 4 eot 104 44 68 D
5 5 5 enq 105 45 69 E
6 6 6 ack 106 46 70 F
7 7 7 bel 107 47 71 G
10 8 8 bs 110 48 72 H
11 9 9 ht 111 49 73 I
12 0a 10 nl 112 4a 74 J
13 0b 11 vt 113 4b 75 K
14 0c 12 ff 114 4c 76 L
15 0d 13 cr 115 4d 77 M
16 0e 14 so 116 4e 78 N
17 0f 15 si 117 4f 79 O
20 10 16 dle 120 50 80 P
21 11 17 dc1 121 51 81 Q
22 12 18 dc2 122 52 82 R
23 13 19 dc3 123 53 83 S
24 14 20 dc4 124 54 84 T
25 15 21 nak 125 55 85 U
26 16 22 syn 126 56 86 V
27 17 23 etb 127 57 87 W
30 18 24 can 130 58 88 X
31 19 25 em 131 59 89 Y
32 1a 26 sub 132 5a 90 Z
33 1b 27 esc 133 5b 91 [
34 1c 28 fs 134 5c 92 \
35 1d 29 gs 135 5d 93 ]
36 1e 30 rs 136 5e 94 ^
37 1f 31 us 137 5f 95 _
40 20 32 sp 140 60 96 `
41 21 33 ! 141 61 97 a
42 22 34 " 142 62 98 b
43 23 35 # 143 63 99 c
44 24 36 $ 144 64 100 d
45 25 37 % 145 65 101 e
46 26 38 & 146 66 102 f
47 27 39 ' 147 67 103 g
50 28 40 ( 150 68 104 h
51 29 41 ) 151 69 105 i
52 2a 42 * 152 6a 106 j
53 2b 43 + 153 6b 107 k
54 2c 44 , 154 6c 108 l
55 2d 45 - 155 6d 109 m
56 2e 46 . 156 6e 110 n
57 2f 47 / 157 6f 111 o
60 30 48 0 160 70 112 p
61 31 49 1 161 71 113 q
62 32 50 2 162 72 114 r
63 33 51 3 163 73 115 s
64 34 52 4 164 74 116 t
65 35 53 5 165 75 117 u
66 36 54 6 166 76 118 v
67 37 55 7 167 77 119 w
70 38 56 8 170 78 120 x
71 39 57 9 171 79 121 y
72 3a 58 : 172 7a 122 z
73 3b 59 ; 173 7b 123 {
74 3c 60 < 174 7c 124 |
75 3d 61 = 175 7d 125 }
76 3e 62 > 176 7e 126 ~
77 3f 63 ? 177 7f 127 del
5 Size Rules
1) Numbers 0-9 are smaller than letters. For example, "7" < "F";
2) Number 0 is smaller than number 9, and increases in order from 0 to 9. For example, "3" < "8"
3) Letter A is smaller than letter Z, and increases in order from A to Z. For example, "A" < "Z"
4) The uppercase letter of the same letter is smaller than the lowercase letter. For example, "A" < "a".
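These ordering rules follow directly from the code values, and they are what makes simple character arithmetic work. The following C sketch illustrates this; it assumes the program runs with an ASCII execution character set, and the helper names ascii_to_lower and digit_value are invented for this example.

```c
// Hypothetical helper: converts an uppercase ASCII letter to lowercase by
// adding the fixed distance between 'A' (0x41) and 'a' (0x61), which is 0x20.
char ascii_to_lower(char c) {
    return (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
}

// Hypothetical helper: returns the numeric value of a digit character,
// or -1 if c is not a digit. Works because '0'..'9' are consecutive codes.
int digit_value(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

Under the same assumption, the comparisons '7' < 'F', '3' < '8', 'A' < 'Z', and 'A' < 'a' all hold when written directly in C.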
Remember the ASCII values of several common characters:
"Line feed LF" is 0x0A; "Carriage return CR" is 0x0D; space is 0x20; "0" is 0x30; "A" is 0x41; "a" is 0x61.
In addition, codes 128-255 can be looked up as well. A handy trick on Windows for finding the character that corresponds to a code: create a new text document, hold down ALT, type the decimal code value on the numeric keypad,
and release ALT to display the corresponding character. For example, hold down ALT and type 97, and 'a' is displayed.
6 International Issues
ASCII is an American standard, so it does not serve even other English-speaking countries well. For example, where is the British pound symbol (£)? It also lacks:
Accented letters of the Latin alphabet
Greek, Hebrew, Arabic, and Russian, which uses the Cyrillic alphabet.
The ideographic characters of the Chinese, Japanese, and Korean writing systems.
In 1967, the International Organization for Standardization (ISO) recommended a variant of ASCII in which
codes 0x40, 0x5B, 0x5C, 0x5D, 0x7B, 0x7C, and 0x7D "are reserved for national use", and codes 0x5E, 0x60, and 0x7E are marked as
"may be used for other graphical symbols when it is necessary to have 8, 9, or 10 positions for national requirements". This is obviously not an optimal international solution,
because it does not guarantee consistency. But it shows how people tried to encode different languages.
Extended ASCII
1981: the IBM PC ROM contains a 256-character set, the IBM extended character set
November 1985: the Windows character set, called the "ANSI character set", follows an ANSI draft and ISO standard (ANSI/ISO 8859-1-1987, "Latin 1" for short).
Initial version of the ANSI character set:
April 1987: code page 437, a character-mapping code, appears in MS-DOS 3.3
Extended ASCII characters are the characters from 128 to 255 (0x80-0xFF).
Double-byte character set
The double-byte character set (DBCS) addresses the compatibility of the ideographic characters of China, Japan, and Korea with ASCII.
DBCS starts with 256 codes, just like ASCII. As in any well-behaved code page, the first 128 codes are ASCII.
However, some of the upper 128 codes are always followed by a second byte.
The two bytes together (called the lead byte and the trail byte) define one character, usually a complex ideograph.
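Because a character may occupy one or two bytes, counting the characters in a DBCS string means walking it byte by byte and skipping the byte that follows each lead byte. The C sketch below illustrates the idea. It assumes, purely for illustration, that every byte in 0x81-0xFE is a lead byte; the real lead-byte ranges differ from one code page to another, and the function name dbcs_char_count is made up.

```c
#include <stddef.h>

// Hypothetical helper: counts characters in a DBCS byte string of length len.
// ASSUMPTION: bytes 0x81-0xFE are lead bytes; real code pages define their
// own lead-byte ranges.
size_t dbcs_char_count(const unsigned char *s, size_t len) {
    size_t count = 0;
    size_t i = 0;
    while (i < len) {
        if (s[i] >= 0x81 && s[i] <= 0xFE && i + 1 < len) {
            i += 2;  // lead byte plus the byte that follows it
        } else {
            i += 1;  // an ASCII byte, or a lead byte cut off at the end
        }
        count++;
    }
    return count;
}
```

For the four bytes 'A', 0xB0, 0xA1, 'B' this counts three characters.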
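7 Commonly Used on Keyboard
The following are Windows virtual-key codes (decimal); several of them, such as 27, 13, 9, 32, and 8, coincide with ASCII values.
ESC key: VK_ESCAPE (27)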
Enter key: VK_RETURN (13)
TAB key: VK_TAB (9)
Caps Lock key: VK_CAPITAL (20)
Shift key: VK_SHIFT (16)
Ctrl key: VK_CONTROL (17)
Alt key: VK_MENU (18)
Space bar: VK_SPACE (32)
Backspace key: VK_BACK (8)
Left Windows logo key: VK_LWIN (91)
Right Windows logo key: VK_RWIN (92)
Application key (right-click menu shortcut): VK_APPS (93)
Insert key: VK_INSERT (45)
Home key: VK_HOME (36)
Page Up: VK_PRIOR (33)
PageDown: VK_NEXT (34)
End key: VK_END (35)
Delete key: VK_DELETE (46)
Direction key (←): VK_LEFT (37)
Direction key (↑): VK_UP (38)
Direction key (→): VK_RIGHT (39)
Direction key (↓): VK_DOWN (40)
F1 key: VK_F1 (112)
F2 key: VK_F2 (113)
F3 key: VK_F3 (114)
F4 key: VK_F4 (115)
F5 key: VK_F5 (116)
F6 key: VK_F6 (117)
F7 key: VK_F7 (118)
F8 key: VK_F8 (119)
F9 key: VK_F9 (120)
F10 key: VK_F10 (121)
F11 key: VK_F11 (122)
F12 key: VK_F12 (123)
Num Lock key: VK_NUMLOCK (144)
Numeric keypad 0: VK_NUMPAD0 (96)
Numeric keypad 1: VK_NUMPAD1 (97)
Numeric keypad 2: VK_NUMPAD2 (98)
Numeric keypad 3: VK_NUMPAD3 (99)
Numeric keypad 4: VK_NUMPAD4 (100)
Numeric keypad 5: VK_NUMPAD5 (101)
Numeric keypad 6: VK_NUMPAD6 (102)
Numeric keypad 7: VK_NUMPAD7 (103)
Numeric keypad 8: VK_NUMPAD8 (104)
Numeric keypad 9: VK_NUMPAD9 (105)
Numeric keypad .: VK_DECIMAL (110)
Numeric keypad *: VK_MULTIPLY (106)
Numeric keypad +: VK_ADD (107)
Numeric keypad -: VK_SUBTRACT (109)
Numeric keypad /: VK_DIVIDE (111)
Pause Break key: VK_PAUSE (19)
Scroll Lock key: VK_SCROLL (145)
8 Code Algorithm
In ASCII, the letter A is defined as 01000001, that is, decimal 65. With this standard, when we input A, the computer knows from the ASCII code that the binary code of the input character is 01000001. Without such a standard, we would have to find our own way to tell the computer that we input an A, and we would have to re-encode on every other machine. An ASCII code is fundamentally a binary value; decimal is just a more convenient way to write it. For example, the binary code of A is 01000001, which is 65 in decimal and 41H in hexadecimal.
The ASCII code table only covers some letters, digits, and punctuation marks. This is mainly because the computer was invented in the United States, and for English the ASCII representation is enough. Chinese characters, however, cannot be represented in ASCII, and they are in common use in China. So to input Chinese characters into a computer, there must be a standard like ASCII that represents each Chinese character. This is China's national standard code for Chinese characters, which defines how Chinese characters are represented in the computer. Under this standard, when we input a Chinese character, the input code is converted to the area-position code (quwei code), and this unique area-position code is used to obtain the font code of the character for display. Of course, the area-position code is also represented in binary inside the computer.
1. Conversion of binary number to decimal number
The weight of the 0th bit of a binary number is 2 to the power of 0, the weight of the 1st bit is 2 to the power of 1, and so on.
So, suppose there is a binary number 0110 0100. Converted to decimal in vertical form:
0110 0100 converted to decimal
Bit 0 0 * 2^0 = 0
Bit 1 0 * 2^1 = 0
Bit 2 1 * 2^2 = 4
Bit 3 0 * 2^3 = 0
Bit 4 0 * 2^4 = 0
Bit 5 1 * 2^5 = 32
Bit 6 1 * 2^6 = 64
Bit 7 0 * 2^7 = 0 +
---------------------------
100
Calculated in horizontal form:
0 * 2^0 + 0 * 2^1 + 1 * 2^2 + 0 * 2^3 + 0 * 2^4 + 1 * 2^5 + 1 * 2^6 + 0 * 2^7 = 100
0 multiplied by anything is 0, so we can also skip the bits with value 0 directly:
1 * 2^2 + 1 * 2^5 + 1 * 2^6 = 100
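The vertical calculation can be written as a short C function that walks the bits from right to left and doubles the weight at each step. This is only a sketch of the method described here; the name binary_to_decimal and the choice to ignore spaces between groups of bits are assumptions made for the example.

```c
#include <string.h>

// Hypothetical helper: converts a string of '0' and '1' characters to its
// value by summing the weights of the 1 bits. Spaces are skipped so that
// grouped input such as "0110 0100" is accepted.
// Returns 0 on success, -1 if any other character is found.
int binary_to_decimal(const char *bits, unsigned long *out) {
    unsigned long weight = 1;  // 2^0, the weight of the rightmost bit
    unsigned long total = 0;
    for (size_t i = strlen(bits); i > 0; i--) {
        char c = bits[i - 1];
        if (c == ' ') {
            continue;          // separator only, not a bit
        }
        if (c != '0' && c != '1') {
            return -1;
        }
        if (c == '1') {
            total += weight;   // bits with value 0 contribute nothing
        }
        weight *= 2;           // the next bit to the left weighs twice as much
    }
    *out = total;
    return 0;
}
```

Passing "0110 0100" stores 100 in the output variable, matching the hand calculation.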
2. Conversion of octal number to decimal number
Octal is base 8.
Octal numbers use the eight digits 0 to 7 to express a number.
The weight of the 0th digit of an octal number is 8 to the power of 0, the weight of the 1st digit is 8 to the power of 1, the weight of the 2nd digit is 8 to the power of 2, and so on.
So, suppose there is an octal number 1507. Converted to decimal in vertical form:
1507 converted to decimal.
Digit 0 7 * 8^0 = 7
Digit 1 0 * 8^1 = 0
Digit 2 5 * 8^2 = 320
Digit 3 1 * 8^3 = 512 +
--------------------------
839
Similarly, we can also calculate directly in horizontal form:
7 * 8^0 + 0 * 8^1 + 5 * 8^2 + 1 * 8^3 = 839
The result is that the octal number 1507 converts to the decimal number 839.
Expression method of octal number
How do we write an octal number in C and C++? If the number is 876, we can tell it is not octal, because no digit greater than 7 can appear in an octal number. But if the number is 123, 567, or 12345670, it could be either octal or decimal.
Therefore, C and C++ stipulate that a number written in octal must have a 0 in front of it: 123 is decimal, but 0123 is octal. This is how octal numbers are written in C and C++.
Since C and C++ traditionally provide no way to write binary numbers (the 0b prefix was only added in C++14 and C23), octal is the second way of writing numbers in C and C++ that we have learned.
Now the same number, for example 100, can be written in the code in the usual decimal form, for example when initializing a variable:
int a = 100;
We can also write it like this:
int a = 0144; //0144 is the octal form of decimal 100; how to convert a decimal number to octal, we will learn later.
Be sure to remember that when writing in octal, you must not leave out the leading 0; otherwise the compiler treats the number as decimal. There is one place, however, where the 0 is not added when using octal numbers: the "escape character" notation we learned earlier.
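The effect of the leading 0 can be checked with a few comparisons. The sketch below is only an illustration; the function names are invented for this example.

```c
// Hypothetical check: returns 1 when decimal 100 and octal 0144 are the
// same value (1 * 64 + 4 * 8 + 4 = 100).
int octal_0144_is_100(void) {
    int a = 100;   // decimal
    int b = 0144;  // octal, because of the leading 0
    return a == b;
}

// Hypothetical check: returns 1 when 010 is read as octal (value 8),
// which shows why a stray leading 0 changes the meaning of a number.
int leading_zero_changes_value(void) {
    return 010 == 8 && 010 != 10;
}
```

Both functions return 1 with a conforming C compiler.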
Use of octal numbers in escape characters
We have learned how to use the escape character '\' plus a special letter to represent a certain character: '\n' means newline, '\t' means the Tab character, and '\'' means a single quote. Here is another use of the escape character: '\' followed by an octal number represents the character whose ASCII value equals that number.
For example, check the ASCII code table in Chapter 5, and we find that the ASCII value of the question mark character (?) is 63. Converting it to octal gives 77, so we can use '\77' to represent '?'. Since the number is octal, it might seem that it should be written '\077', but because C and C++ do not allow a backslash plus a decimal number to represent a character, the 0 here can be omitted.
In fact, we rarely use the escape character plus an octal number to represent a character in actual programming, so the content of section 6.2.4 is for your understanding only.
Conversion of hexadecimal number to decimal number
Binary, using two Arabic numerals: 0, 1;
Octal, using eight Arabic numerals: 0, 1, 2, 3, 4, 5, 6, 7;
Decimal, using ten Arabic numerals: 0 to 9;
Hexadecimal, using sixteen Arabic numerals... wait, didn't the Arabs (or the Indians) invent only ten digits?
Hexadecimal is base 16, but we only have the ten digits 0 to 9, so we use the six letters A, B, C, D, E, F to represent 10, 11, 12, 13, 14, 15 respectively. The letters are not case-sensitive.
The weight of the 0th digit of a hexadecimal number is 16 to the power of 0, the weight of the 1st digit is 16 to the power of 1, the weight of the 2nd digit is 16 to the power of 2, and so on.
So, if the Nth digit (N starts from 0) is X (X is at least 0 and at most 15, that is, F), the value it represents is X * 16 to the power of N.
Suppose there is a hexadecimal number 2AF5. How do we convert it to decimal?
Expressed in vertical form:
2AF5 converted to decimal:
Digit 0: 5 * 16^0 = 5
Digit 1: F * 16^1 = 240
Digit 2: A * 16^2 = 2560
Digit 3: 2 * 16^3 = 8192 +
-------------------------------------
10997
Direct calculation is:
5 * 16^0 + F * 16^1 + A * 16^2 + 2 * 16^3 = 10997
(Don't forget, in the above calculation, A represents 10, and F represents 15)
Now it can be seen that the key to converting any base to decimal is that each base has different weights.
Suppose someone asks you why the decimal number 1234 is one thousand two hundred and thirty-four. You can give them this formula:
1234 = 1 * 10^3 + 2 * 10^2 + 3 * 10^1 + 4 * 10^0
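Since only the weights change from base to base, one function can handle binary, octal, decimal, and hexadecimal alike. The C sketch below multiplies the running total by the base before adding each digit (Horner's rule), which gives the same result as summing digit times weight. The function name to_decimal and the limit of base 2 to 16 are assumptions made for this example.

```c
// Hypothetical helper: converts a digit string in the given base (2-16) to
// its value. Returns 0 on success, -1 on an unsupported base or on a digit
// that is not valid in that base (for example '8' in octal).
int to_decimal(const char *digits, unsigned base, unsigned long *out) {
    unsigned long total = 0;
    if (base < 2 || base > 16) {
        return -1;
    }
    for (; *digits != '\0'; digits++) {
        char c = *digits;
        unsigned d;
        if (c >= '0' && c <= '9') {
            d = (unsigned)(c - '0');
        } else if (c >= 'A' && c <= 'F') {
            d = (unsigned)(c - 'A') + 10;  // A-F stand for 10-15
        } else if (c >= 'a' && c <= 'f') {
            d = (unsigned)(c - 'a') + 10;  // letters are not case-sensitive
        } else {
            return -1;
        }
        if (d >= base) {
            return -1;
        }
        total = total * base + d;  // shift earlier digits one place left
    }
    *out = total;
    return 0;
}
```

With this sketch, "2AF5" in base 16 yields 10997 and "1507" in base 8 yields 839, while "876" in base 8 is rejected.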
Expression method of hexadecimal number
Without a special written form, a hexadecimal number could also be confused with a decimal number. Take the number 9876: there is no way to tell whether it is hexadecimal or decimal.
C and C++ stipulate that a hexadecimal number must start with 0x. For example, 0x1 is a hexadecimal number, while 1 is a decimal number. Other examples: 0xff, 0xFF, 0X102A. The x is also case-insensitive. (Note: The 0 in 0x is the digit 0, not the letter O)
The following are some usage examples:
int a = 0x100F;
int b = 0x70 + a;
So far, we have learned how to write decimal, octal, and hexadecimal numbers. One last point is important. In C/C++, an integer literal in any of these bases is itself never negative; a leading minus sign is a unary operator applied to the literal. So 12 is positive 12 and -12 is the negation of 12, and likewise -014 and -0xF2 are valid and evaluate to -12 and -242. Note that -078 does not compile at all, because 8 is not an octal digit.
Use of hexadecimal numbers in escape characters
The escape character can also be followed by a hexadecimal number to represent a character; the form is '\x' followed by the hexadecimal digits. For example, the '?' character from section 6.2.4 can be written in the following ways:
'?' //Directly input the character
'\77' //Use octal; the leading 0 can be omitted here
'\x3F' //Use hexadecimal
Similarly, this section is for understanding only. Apart from the null character, written with the octal escape '\0', we rarely use the latter two methods to represent a character.
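The three spellings can be compared directly in C. The sketch below assumes an ASCII execution character set, where '?' has the value 63; the function name escapes_agree is made up for this example. Note that the hexadecimal escape is written '\x3F', with no 0 before the x.

```c
// Hypothetical check: returns 1 when the direct, octal, and hexadecimal
// spellings of the question mark all denote the same character.
int escapes_agree(void) {
    char direct = '?';
    char octal = '\77';   // octal 77 = 7 * 8 + 7 = 63
    char hex = '\x3F';    // hexadecimal 3F = 3 * 16 + 15 = 63
    return direct == 63 && octal == direct && hex == direct;
}
```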
3. Conversion from decimal number to binary, octal, hexadecimal number
Conversion of decimal number to binary number
Given a decimal number, for example 6, how do we convert it to binary?
Converting a decimal number to binary is a process of repeated division by 2:
Divide the number to be converted by 2 to get a quotient and a remainder,
then keep dividing the quotient by 2 until the quotient is 0. Finally, write all the remainders in reverse order; the resulting number is the conversion result.
Sounds a bit confusing? Let's walk through an example: converting 6 to binary.
"Divide the number to be converted by 2 to get the quotient and remainder".
Then:
The number to be converted is 6; 6 ÷ 2 gives quotient 3, remainder 0. (Don't tell me you can't calculate 6 ÷ 2!)
"Divide the quotient by 2 until the quotient is 0..."
Now the quotient is 3, which is not 0, so continue dividing by 2.
Then: 3 ÷ 2 gives quotient 1, remainder 1.
"Divide the quotient by 2 until the quotient is 0..."
Now the quotient is 1, which is not 0, so continue dividing by 2.
Then: 1 ÷ 2 gives quotient 0, remainder 1 (take pen and paper and check: 1 ÷ 2 is quotient 0, remainder 1!)
"Divide the quotient by 2 until the quotient is 0... Finally, reverse the order of all remainders"
Great! Now the quotient is 0.
The three calculations gave the remainders 0, 1, 1 in that order; reversing them gives 110!
6 converted to binary is 110.
Written as a table, the steps above look like this:
Dividend Calculation process Quotient Remainder
6 6/2 3 0
3 3/2 1 1
1 1/2 0 1
(In the computer, ÷ is represented by /)
In an exam, drawing such a table takes some time, so the more common way to write out the conversion is the consecutive division shown in the following figure:
(Figure: 1)
Please compare the figure, the table, and the text description, and work out for yourself how to convert 6 to binary.
After all this, is our conversion result correct? Is the binary number 110 equal to 6? You have already learned how to convert a binary number to decimal, so check now that 110 converted to decimal is 6.
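The repeated division can be sketched in C as follows: collect the remainders, then write them out in reverse order. The function name to_binary and its buffer-based interface are assumptions made for this example.

```c
#include <limits.h>
#include <stddef.h>

// Hypothetical helper: writes the binary digits of n into out, a buffer of
// capacity cap that must also hold the terminating '\0'.
// Returns the number of digits written, or 0 if the buffer is too small.
size_t to_binary(unsigned n, char *out, size_t cap) {
    char remainders[sizeof(unsigned) * CHAR_BIT];  // least significant first
    size_t count = 0;
    do {
        remainders[count++] = (char)('0' + n % 2);  // remainder of this step
        n /= 2;                                     // quotient for the next step
    } while (n != 0);                               // stop when the quotient is 0
    if (count + 1 > cap) {
        return 0;
    }
    for (size_t i = 0; i < count; i++) {
        out[i] = remainders[count - 1 - i];         // reverse the remainders
    }
    out[count] = '\0';
    return count;
}
```

Calling it with 6 produces the string 110, the same result as the table.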
6.3.2 Conversion of decimal number to octal and hexadecimal number
Happily, converting a decimal number to octal works just like converting to binary; the only change is that the divisor becomes 8 instead of 2.
As an example, let's convert the decimal number 120 to octal.
Expressed in a table:
Dividend Calculation process Quotient Remainder
120 120/8 15 0
15 15/8 1 7
1 1/8 0 1
120 converted to octal is: 170.
Even more happily, converting a decimal number to hexadecimal also works just like converting to binary; the only change is that the divisor becomes 16 instead of 2.
The same number 120, converted to hexadecimal:
Dividend Calculation process Quotient Remainder
120 120/16 7 8
7 7/16 0 7
120 converted to hexadecimal is: 78.
Please take pen and paper and, in the form of (Figure: 1), work through the process of the above two tables.
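A hand calculation like this can be checked with the C standard library, whose formatted output functions already know the octal (%o) and hexadecimal (%X) forms. The wrapper name format_octal_and_hex below is an assumption made for this example; snprintf itself is standard C.

```c
#include <stdio.h>

// Hypothetical helper: formats value as octal digits into oct and as
// uppercase hexadecimal digits into hex, without any 0 or 0x prefix.
void format_octal_and_hex(unsigned value, char *oct, size_t oct_cap,
                          char *hex, size_t hex_cap) {
    snprintf(oct, oct_cap, "%o", value);  // base 8
    snprintf(hex, hex_cap, "%X", value);  // base 16, letters A-F
}
```

For the value 120 this yields the strings 170 and 78, agreeing with the two tables.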
4. Conversion between binary and hexadecimal numbers
The conversion between binary and hexadecimal is important, yet it needs no calculation: every C and C++ programmer should be able to convert a binary number to hexadecimal, and back, on sight.
After this section, so can we.
First, let's look at a binary number: 1111. What is it?
You might still calculate it like this: 1 * 2^0 + 1 * 2^1 + 1 * 2^2 + 1 * 2^3 = 1 * 1 + 1 * 2 + 1 * 4 + 1 * 8 = 15.
However, since 1111 has only 4 bits, we should simply memorize the weight of each bit, from the high bit to the low bit: 8, 4, 2, 1. That is, the weight of the highest bit is 2^3 = 8, followed by 2^2 = 4, 2^1 = 2, and 2^0 = 1.
Remember 8421, and for any 4-bit binary number we can quickly work out its decimal value.
The following lists the possible values of a 4-bit binary number xxxx (some in the middle are omitted)
4-bit binary number Quick calculation method Decimal value Hexadecimal value
1111 = 8 + 4 + 2 + 1 = 15 F
1110 = 8 + 4 + 2 + 0 = 14 E
1101 = 8 + 4 + 0 + 1 = 13 D
1100 = 8 + 4 + 0 + 0 = 12 C
1011 = 8 + 0 + 2 + 1 = 11 B
1010 = 8 + 0 + 2 + 0 = 10 A
1001 = 8 + 0 + 0 + 1 = 9 9
...
0001 = 0 + 0 + 0 + 1 = 1 1
0000 = 0 + 0 + 0 + 0 = 0 0
To convert a binary number to hexadecimal, it is to divide into 4-bit segments and convert each segment to hexadecimal.
For example (the upper line is the binary number, the following is the corresponding hexadecimal):
1111 1101, 1010 0101, 1001 1011
F D, A 5, 9 B
Conversely, when we see FD, how do we quickly convert it to binary?
First convert F:
Seeing F, we need to know that it is 15 (you may not be familiar with the six digits A to F yet). How do we make 15 out of 8, 4, 2, 1? It is 8 + 4 + 2 + 1, so all four bits are 1: 1111.
Then convert D:
Seeing D, we know it is 13. How do we make 13 out of 8, 4, 2, 1? It is 8 + 4 + 1, that is: 1101.
So, FD converted to binary is: 1111 1101
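Because each hexadecimal digit corresponds to exactly one 4-bit group, the conversion is a table lookup rather than a calculation. The C sketch below shows this; the function name hex_to_binary and its buffer interface are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: expands each hexadecimal digit of hex into its 4-bit
// group and writes the result to out as a NUL-terminated string.
// Returns 0 on success, -1 on a non-hexadecimal character or if out
// (capacity cap) is too small.
int hex_to_binary(const char *hex, char *out, size_t cap) {
    static const char *const groups[16] = {
        "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
        "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
    };
    size_t len = strlen(hex);
    if (cap < len * 4 + 1) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        char c = hex[i];
        int value;
        if (c >= '0' && c <= '9') {
            value = c - '0';
        } else if (c >= 'A' && c <= 'F') {
            value = c - 'A' + 10;
        } else if (c >= 'a' && c <= 'f') {
            value = c - 'a' + 10;
        } else {
            return -1;
        }
        memcpy(out + i * 4, groups[value], 4);  // one digit, one 4-bit group
    }
    out[len * 4] = '\0';
    return 0;
}
```

Given "FD", the sketch writes 11111101, that is, 1111 followed by 1101.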
Since converting hexadecimal to binary is so direct, when we need to convert a decimal number to binary we can also convert it to hexadecimal first and then to binary.
For example, take the decimal number 1234. Dividing by 2 repeatedly to get the binary number directly takes many steps, so we can first divide by 16 to get the hexadecimal number:
Dividend Calculation process Quotient Remainder
1234 1234/16 77 2
77 77/16 4 13 (D)
4 4/16 0 4
The resulting hexadecimal number is: 0x4D2
Then we can directly write the binary form of 0x4D2: 0100 1101 0010.
The correspondence is:
0100 -- 4
1101 -- D
0010 -- 2
