Assembly Language
(Programming Language Oriented to Machines)
This entry is reviewed by the "Science Popularization China" Encyclopedia Science Entry Compilation and Application Project.
Assembly language is a low-level language for electronic computers, microprocessors, microcontrollers, and other programmable devices; it is also known as symbolic language. In assembly language, mnemonics replace the operation codes of machine instructions, and symbols or labels replace the addresses of instructions and operands. Each kind of device has its own assembly language, matching its machine-language instruction set, and source text is converted into machine instructions by the assembly process. In general, a particular assembly language corresponds one-to-one with a particular machine instruction set and cannot be ported directly between platforms.
Many assemblers provide additional mechanisms that support program development, assembly control, and debugging. Assemblers that also provide macros are called macro assemblers.
Assembly language is not as widely used as most other programming languages. In practice today it is mostly applied to low-level code, direct hardware operation, and demanding program optimization. Device drivers, embedded operating systems, and real-time programs typically rely on it.
Chinese Name
Assembly Language
English Name
Assembly Language
Discipline
Software Engineering
Era of Origin
1950s
Compilation Method
Assembly
Table of Contents
1. Development Process
2. Language Characteristics
▪ Overall Characteristics
▪ Advantages
▪ Disadvantages
3. Language Components
▪ Data Transfer Instructions
▪ Integer and Logical Operation Instructions
▪ Shift Instructions
▪ Bit-Operation Instructions
▪ Conditional Setting Instructions
▪ Control Transfer Instructions
▪ String Operation Instructions
▪ Input and Output Instructions
4. Related Technologies
▪ Assembler
▪ Compilation Environment
5. Development Prospect
6. Practical Applications
7. Classic Textbooks
▪ x86 Processors
▪ ARM and Microcontrollers
Development Process
To explain how assembly language came about, we must start with machine language. Machine language is the set of machine instructions, and a machine instruction is a command that a machine can execute directly. The machine instructions of an electronic computer are sequences of binary digits, which the computer converts into sequences of high and low voltage levels that drive its electronic components to perform operations.
The computer described above is a machine that can execute machine instructions and perform operations; this is the early concept of a computer. In the PCs we commonly use, one chip performs these functions: the CPU (Central Processing Unit). Because microprocessors differ in hardware design and internal structure, each needs different voltage pulses to control its operation. Each microprocessor therefore has its own machine instruction set, that is, its own machine language.
Early programs were all written in machine language. Programmers punched the program code, made up of 0s and 1s, onto paper tape or cards (a hole for 1, no hole for 0) and then fed the program into the computer through a paper tape reader or card reader. Machine language of this kind consists purely of 0s and 1s; it is complex, hard to read and modify, and error-prone. Programmers soon found that machine instructions were difficult to tell apart and to remember, which hindered the development of the whole industry. Assembly language emerged in response.
The core of assembly language is the assembly instruction. Assembly instructions and machine instructions differ only in how an instruction is written: an assembly instruction is an easy-to-remember notation for a machine instruction. For example:
Operation: The content of register BX is sent to AX
1000100111011000 Machine instruction
mov ax,bx Assembly instruction
From then on, programmers wrote source programs using assembly instructions. The computer, however, understands only machine instructions, so a translation program is needed to convert assembly instructions into machine instructions. Such a program is called an assembler. Programmers write source programs in assembly language and use an assembler to translate them into machine code, which the computer finally executes.
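The translation step can be illustrated with a very small sketch. The Python function below is a hypothetical toy (the names REG16 and assemble_mov_r16_r16 are made up for this illustration) that handles only register-to-register "mov" between 16-bit registers. It assumes the x86 encoding in which opcode 89h is followed by a ModR/M byte holding the source register in bits 5-3 and the destination register in bits 2-0; a real assembler covers far more cases.

```python
# Register numbers used in the ModR/M byte for the 16-bit general registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def assemble_mov_r16_r16(line):
    """Translate 'mov dst,src' (16-bit registers only) into two machine-code bytes."""
    mnemonic, operands = line.strip().lower().split(None, 1)
    if mnemonic != "mov":
        raise ValueError("only 'mov' is supported in this sketch")
    dst, src = (name.strip() for name in operands.split(","))
    opcode = 0x89  # MOV r/m16, r16
    # mod = 11 (register operand), reg = source, r/m = destination
    modrm = 0b11000000 | (REG16[src] << 3) | REG16[dst]
    return bytes([opcode, modrm])


code = assemble_mov_r16_r16("mov ax,bx")
print("".join(f"{byte:08b}" for byte in code))  # 1000100111011000
```

The printed bit string is the same machine instruction shown in the example above, which is the sense in which an assembly instruction is just another way of writing a machine instruction.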
Language Characteristics
Assembly language is a programming language that directly targets the processor. The processor works under the control of instructions, and each instruction the processor can recognize is called a machine instruction. Each processor has its own set of recognizable instructions, called its instruction set. When the processor executes an instruction, it takes the action that instruction specifies. It can change its own internal state and also control the state of peripheral circuits.
Another feature of assembly language is that it operates not on abstract data but on registers and memory; that is, it deals with registers and memory directly. This is one reason well-written assembly code can run faster than code in other languages, but it also makes programming more complex. Since data is stored in registers or memory, there must be addressing modes, that is, methods for locating the required data. In the example above, for instance, we cannot use data directly as in a high-level language; we must work with the registers AX and BX that hold it. In high-level languages this addressing work is done by the compiler, while in assembly language it is done by the programmer, which increases the complexity of programming and reduces the readability of the program.
Moreover, assembly instructions are a symbolic representation of machine instructions, and different types of CPUs have different machine instruction sets, so they also have different assembly languages. Assembly language programs are therefore closely tied to the machine. Apart from a certain degree of portability between different models of the same CPU series, assembly language programs cannot be ported between different types of CPUs (for example, between minicomputers and microcomputers). In other words, assembly language programs are less general and less portable than high-level language programs.
Because assembly language is machine-dependent, programmers writing in it can arrange the machine's internal resources carefully and keep them in the best possible use. Code written this way is compact and fast. Of all programming languages, assembly language is the most closely and directly related to hardware, and it can be the most efficient in both time and space. It is a required course in computer application programs at colleges and universities, and it is important for training students in program design and in operating and debugging programs on the machine.
Overall Characteristics
1. Machine dependence
Assembly language is a machine-oriented, low-level language, usually designed for a specific computer or family of computers. Because it is a symbolic representation of machine instructions, different machines have different assembly languages. Programming in assembly language makes it possible to exploit the characteristics of the machine fully and obtain high-quality programs.
2. High speed and high efficiency
Assembly language retains the advantages of machine language: it is direct and simple, can access and control the computer's hardware (disks, memory, CPU, I/O ports, and so on), occupies little memory, and executes quickly. It is an efficient programming language.
3. Complexity of writing and debugging
Because assembly language controls hardware directly, and even simple tasks require many statements, the programmer must anticipate every possible problem and allocate software and hardware resources carefully. This inevitably increases the programmer's burden. Likewise, when a program misbehaves during debugging, the fault is very hard to find.
Advantages
1. Because a program written in assembly language is ultimately converted into machine instructions, it stays consistent with machine language: it is direct and simple, and it can access and control the computer's hardware (disks, memory, CPU, I/O ports, and so on) just as machine instructions can. Any software or hardware resource that is accessible at all can be accessed from assembly language.
2. The object code is short, occupies little memory, and executes quickly, so assembly is an efficient programming language. It is often combined with high-level languages to improve a program's speed and efficiency and to make up for the weakness of high-level languages in hardware control, and it is widely applied in this role.
Disadvantages
1. Assembly language is machine-oriented and sits at the bottom of the programming language hierarchy, so it is regarded as a low-level language. It is usually designed for a specific computer or family of computers. Different processors have different assembly syntaxes and assemblers, and an assembled program cannot run on a different processor, so it lacks portability.
2. It is hard to understand the design intent from assembly language code, and maintainability is poor. Even a simple task requires a large amount of assembly code, which easily introduces bugs and is hard to debug.
3. Using assembly language requires a very good understanding of a particular processor, and code can only be optimized for a specific architecture and processor. Development efficiency is very low, the cycle is long, and the work is tedious.
Language Components
Data Transfer Instructions
This group includes the general data transfer instruction MOV; the conditional move instructions CMOVcc; the stack instructions PUSH/PUSHA/PUSHAD/POP/POPA/POPAD; the exchange and translation instructions XCHG/XLAT/BSWAP; and the address and segment-selector load instructions LEA/LDS/LES/LFS/LGS/LSS. Note that CMOVcc is not a single instruction but a family of instructions, each of which performs the move only if certain bits of the EFLAGS register are in the required state.
Integer and Logical Operation Instructions
This group performs arithmetic and logical operations. It includes the addition instructions ADD/ADC, the subtraction instructions SUB/SBB, the increment instruction INC, the decrement instruction DEC, the comparison instruction CMP, the multiplication instructions MUL/IMUL, the division instructions DIV/IDIV, the sign-extension instructions CBW/CWDE/CDQE, the decimal adjustment instructions DAA/DAS/AAA/AAS, and the logical instructions NOT/AND/OR/XOR/TEST.
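The pairing of ADD with ADC (add with carry) is easiest to see in multi-word arithmetic. The Python sketch below is only a model with hypothetical helper names (add16, add32_with_16bit_regs): it assumes 16-bit registers and tracks only the carry flag, adding two 32-bit values as a low-word ADD followed by a high-word ADC.

```python
MASK16 = 0xFFFF


def add16(a, b, carry_in=0):
    """Model ADD (carry_in=0) or ADC (carry_in=CF) on 16-bit operands.

    Returns (result, carry_flag)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def add32_with_16bit_regs(x, y):
    """Add two 32-bit values one 16-bit word at a time."""
    low, cf = add16(x & MASK16, y & MASK16)               # ADD on the low words
    high, cf = add16((x >> 16) & MASK16, (y >> 16) & MASK16, cf)  # ADC on the high words
    return (high << 16) | low, cf


result, carry = add32_with_16bit_regs(0x0001FFFF, 0x00000001)
print(hex(result), carry)  # 0x20000 0
```

SUB and SBB pair up in the same way, with the carry flag acting as a borrow.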
Shift Instructions
This group shifts a register or memory operand by a specified number of bits. It includes the logical left shift SHL, the logical right shift SHR, the arithmetic left shift SAL, the arithmetic right shift SAR, the rotate left ROL, and the rotate right ROR.
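The difference between logical shifts, arithmetic shifts, and rotates shows up clearly on a small example. The following Python sketch models the result value only, on an assumed 8-bit operand with shift counts from 0 to 7; the function names simply mirror the mnemonics, and the flags that the real instructions also update are not modeled.

```python
WIDTH = 8                      # assumed operand size for this sketch
MASK = (1 << WIDTH) - 1


def shl(value, count):
    """Logical left shift (SAL produces the same result): zeros enter from the right."""
    return (value << count) & MASK


def shr(value, count):
    """Logical right shift: zeros enter from the left."""
    return (value & MASK) >> count


def sar(value, count):
    """Arithmetic right shift: copies of the sign bit enter from the left."""
    value &= MASK
    if value & (1 << (WIDTH - 1)):
        value -= 1 << WIDTH    # reinterpret the bit pattern as a negative number
    return (value >> count) & MASK


def rol(value, count):
    """Rotate left: bits leaving on the left re-enter on the right."""
    count %= WIDTH
    value &= MASK
    return ((value << count) | (value >> (WIDTH - count))) & MASK


def ror(value, count):
    """Rotate right: bits leaving on the right re-enter on the left."""
    count %= WIDTH
    value &= MASK
    return ((value >> count) | (value << (WIDTH - count))) & MASK


x = 0b10010110
for op in (shl, shr, sar, rol, ror):
    print(op.__name__, f"{op(x, 1):08b}")
```

For this input, SHR clears the top bit while SAR keeps it set, which is why SAR is the one used to divide signed values by powers of two.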
Bit-Operation Instructions
This group includes the bit test instruction BT, bit test and set BTS, bit test and reset BTR, bit test and complement BTC, bit scan forward BSF, and bit scan reverse BSR.
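As a rough model of what these instructions compute, the Python sketch below works on non-negative integers. The function names mirror the mnemonics; BT-style functions return the old bit value that the processor would place in the carry flag. Returning None from the scan functions for a zero operand is a choice made for this sketch (on the processor the destination is left undefined in that case and the zero flag is set).

```python
def bt(value, index):
    """BT: return the selected bit (the CPU copies it into CF)."""
    return (value >> index) & 1


def bts(value, index):
    """BTS: set the bit; return (new_value, old_bit)."""
    return value | (1 << index), bt(value, index)


def btr(value, index):
    """BTR: clear the bit; return (new_value, old_bit)."""
    return value & ~(1 << index), bt(value, index)


def btc(value, index):
    """BTC: invert the bit; return (new_value, old_bit)."""
    return value ^ (1 << index), bt(value, index)


def bsf(value):
    """BSF: index of the lowest set bit, or None if value is zero."""
    if value == 0:
        return None
    return (value & -value).bit_length() - 1


def bsr(value):
    """BSR: index of the highest set bit, or None if value is zero."""
    if value == 0:
        return None
    return value.bit_length() - 1


print(bts(0b1010, 0), bsf(0b101000), bsr(0b101000))  # (11, 0) 3 5
```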
Conditional Setting Instructions
This is not a single instruction but a family of about 30 instructions, each of which sets an 8-bit register or memory operand to 1 or 0 according to the state of certain bits of the EFLAGS register. Examples are SETE, SETNE, and SETGE.
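To make the dependence on EFLAGS concrete, the Python sketch below first models the flags produced by CMP (a subtraction whose result is discarded) and then a few SETcc conditions. It is a simplified model with an assumed default operand width of 8 bits; cmp_flags and the dictionary layout are hypothetical names for this illustration, and only ZF, CF, SF, and OF are tracked.

```python
def cmp_flags(a, b, width=8):
    """Model the flags left by 'cmp a, b', i.e. by computing a - b."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": int(result == 0),                          # operands are equal
        "CF": int(a < b),                                # unsigned borrow
        "SF": int(bool(result & sign)),                  # sign bit of the result
        "OF": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
    }


def sete(flags):
    return flags["ZF"]


def setne(flags):
    return 1 - flags["ZF"]


def setge(flags):
    """Signed greater-or-equal: SF equals OF."""
    return int(flags["SF"] == flags["OF"])


def setb(flags):
    """Unsigned below: CF is set."""
    return flags["CF"]


flags = cmp_flags(0xFF, 0x01)      # 0xFF is -1 as signed, 255 as unsigned
print(setge(flags), setb(flags))   # 0 0
```

The same bit pattern 0xFF is "less than 1" under the signed condition and "not below 1" under the unsigned one, which is why the family has separate signed and unsigned members.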
Control Transfer Instructions
This group includes the unconditional jump JMP; the conditional jumps Jcc/JCXZ; the loop instructions LOOP/LOOPE/LOOPNE; the procedure call instruction CALL; the procedure return instruction RET; and the interrupt instructions INT n, INT3, INTO, and IRET. Note that Jcc is a family of instructions, each of which jumps or not according to the state of certain bits of the EFLAGS register. INT n is the software interrupt instruction, where n is a number from 0 to 255 that gives the interrupt vector number.
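The LOOP instruction combines a counter update with a conditional jump: it decrements the count register and jumps while the result is non-zero. The Python sketch below models that behaviour for the 16-bit CX register; loop_step and sum_with_loop are hypothetical names, and the comments show the assembly statements each line stands for.

```python
def loop_step(cx):
    """Model LOOP: decrement CX with 16-bit wraparound; the jump is taken if CX != 0."""
    cx = (cx - 1) & 0xFFFF
    return cx, cx != 0


def sum_with_loop(n):
    """Model:      mov cx, n
                   mov ax, 0
            again: add ax, cx
                   loop again
    Returns (ax, number_of_iterations)."""
    ax, cx = 0, n & 0xFFFF
    iterations = 0
    while True:
        ax = (ax + cx) & 0xFFFF        # again: add ax, cx
        iterations += 1
        cx, taken = loop_step(cx)      # loop again
        if not taken:
            break
    return ax, iterations


print(sum_with_loop(5))  # (15, 5)
```

Because the decrement happens before the test, starting with CX equal to 0 makes the body run 65536 times in this model; this is the case that JCXZ is typically used to guard against.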
String Operation Instructions
This group operates on strings of data. It includes the string move instruction MOVS, the string compare instruction CMPS, the string scan instruction SCAS, the string load instruction LODS, and the string store instruction STOS. These instructions can optionally take the prefixes REP, REPE/REPZ, and REPNE/REPNZ to repeat the operation.
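The effect of a repeat prefix can be modeled as a loop controlled by the count register. The Python sketch below is a simplification under stated assumptions: memory is a flat bytearray (segment registers are ignored), the direction flag is clear so the index registers count upward, and the function names rep_movsb and repne_scasb are just labels for the byte-sized forms of the instructions.

```python
def rep_movsb(memory, si, di, cx):
    """REP MOVSB: copy CX bytes from memory[SI] to memory[DI].

    Returns the final (si, di, cx)."""
    while cx != 0:
        memory[di] = memory[si]
        si += 1
        di += 1
        cx -= 1
    return si, di, cx


def repne_scasb(memory, di, cx, al):
    """REPNE SCASB: scan at most CX bytes from memory[DI] for the value in AL.

    Returns the final (di, cx, zf); zf is 1 if the last byte compared matched."""
    zf = 0
    while cx != 0:
        zf = int(memory[di] == al)
        di += 1
        cx -= 1
        if zf:
            break
    return di, cx, zf


memory = bytearray(b"hello\x00\x00\x00\x00\x00")
rep_movsb(memory, 0, 5, 5)
print(bytes(memory))                          # b'hellohello'
print(repne_scasb(memory, 0, 10, ord("l")))   # (3, 7, 1)
```

After a successful scan, DI points one position past the matching byte, so the match itself is at DI - 1.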
Input and Output Instructions
This part of instructions is used to exchange data with peripheral devices, including port input instructions IN/INS, port output instructions OUT/OUTS.
High-Level Language Auxiliary Instructions
This group exists for the convenience of high-level language compilers. It includes ENTER, which creates a stack frame, and LEAVE, which releases one.
Control and Privilege Instructions
This group includes the no-operation instruction NOP, the halt instruction HLT, the wait instructions WAIT/MWAIT, the escape instruction ESC, the bus lock prefix LOCK, the array bounds check instruction BOUND, the global descriptor table instructions LGDT/SGDT, the interrupt descriptor table instructions LIDT/SIDT, the local descriptor table instructions LLDT/SLDT, the segment limit load instruction LSL, the access rights load instruction LAR, the task register instructions LTR/STR, the requested privilege level adjustment instruction ARPL, the task-switched flag clearing instruction CLTS, the MOV forms that transfer data to and from control and debug registers, the cache control instructions INVD/WBINVD/INVLPG, the model-specific register read and write instructions RDMSR/WRMSR, the processor identification instruction CPUID, the timestamp counter read instruction RDTSC, and others.
Floating-Point and Multimedia Instructions
This group accelerates floating-point computation and includes the single-instruction, multiple-data (SIMD) instructions, such as the SSE family of extensions, that accelerate multimedia data processing. The group is too large to list here; see the Intel manuals for details.
Virtual Machine Extension Instructions
This group includes INVEPT/INVVPID/VMCALL/VMCLEAR/VMLAUNCH/VMRESUME/VMPTRLD/VMPTRST/VMREAD/VMWRITE/VMXOFF/VMXON, etc.
Related Technologies
Assembler
A typical modern assembler builds object code by translating the mnemonics of the assembly instruction set into operation codes (opcodes) and resolving symbolic names into memory addresses and other entities. Symbolic references are a key feature of an assembler: they spare the programmer the tedious, time-consuming recalculation of addresses by hand after a program is modified. In essence, assembly language gives machine codes readable letter names, and the assembler replaces those names with the corresponding machine codes at assembly time.
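How symbolic names become addresses can be sketched as a two-pass process. The Python function below is a deliberately simplified, hypothetical model (resolve_labels is not part of any real assembler): it assumes every instruction occupies the same number of bytes and that a label stands alone on a line ending with a colon. The first pass records the address of each label; the second pass substitutes those addresses for label operands.

```python
def resolve_labels(lines, instruction_size=2):
    """Two-pass label resolution, assuming fixed-size instructions.

    Returns (symbol_table, [(address, instruction_text), ...])."""
    symbols = {}
    instructions = []
    address = 0

    # Pass 1: walk the source, assigning addresses and recording labels.
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            symbols[text[:-1]] = address
        else:
            instructions.append((address, text))
            address += instruction_size

    # Pass 2: replace operands that name a label with the label's address.
    resolved = []
    for address, text in instructions:
        mnemonic, _, operand = text.partition(" ")
        operand = operand.strip()
        if operand in symbols:
            text = f"{mnemonic} {symbols[operand]:#06x}"
        resolved.append((address, text))
    return symbols, resolved


program = ["start:", "mov ax,bx", "again:", "dec cx", "jnz again", "jmp start"]
symbols, resolved = resolve_labels(program)
print(symbols)   # {'start': 0, 'again': 2}
for address, text in resolved:
    print(f"{address:04x}  {text}")
```

If an instruction is inserted before "again:", rerunning the function moves the label and updates the jump operand automatically, which is exactly the manual recalculation that symbolic references avoid.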
Compilation Environment
A symbolic program written in a non-machine language such as assembly language is called a source program. The assembler translates the source program into an object program, which is a machine-language program. Once placed at its intended location in memory, the object program can be executed by the computer's CPU.
Generally speaking, there are relatively few debugging environments for assembly language, and few really good assemblers. The choice of assembler depends on the target processor and the system platform. A good tool should be easy to use: for example, it should format code automatically, provide syntax highlighting, and integrate assembling, linking, and debugging.
For the widely used personal computer, the available assemblers include MASM, NASM, TASM, GAS, and FASM, along with development environments such as RadASM, but most of them have no built-in debugging. For learning assembly language, Easy Assembler suits beginners well because it offers a complete integrated environment.
Development Prospect
Assembly language is a mnemonic form of machine language. Compared with raw machine code, it is easier to read, write, debug, and modify. Skilled assembly programmers can also produce code that runs faster and occupies less memory than code compiled from high-level languages. That advantage is relative, however, and has to be earned through careful design; moreover, some high-level languages now compile to very efficient code, so the advantage is gradually weakening. Assembly language also has obvious limitations for writing complex programs, and because it depends on the specific machine model it is neither general nor portable across models.
Calling assembly a low-level language does not mean it should be discarded. On the contrary, it remains a language that programmers doing low-level design for computers (or microcomputers) must understand, and in some industries and fields it is indispensable. The largest field of computing today, however, is IT software, that is, the application software programming we usually talk about. In skilled hands, programs written in assembly language can run more efficiently than programs written in other languages, but at the cost of much longer optimization time. If the programmer's grounding in computer principles and programming is not solid, assembly instead increases development difficulty, and the loss outweighs the gain. By around 2010, software development had become a market-oriented industry. Given the quality and cross-platform support of high-level languages, a company cannot have a team write everything in assembly language at several times, or even dozens of times, the cost in time. It is better to use other languages: as long as the final result is not much worse than an assembly version, the product can be finished first. This is an inevitable result of the market economy.
Even so, no programmer would assert that assembly language is not worth learning. Assembly language is, after all, a machine-oriented programming language, and many highly skilled assembly programmers have moved from software development into industrial electronics programming. In fields where the functionality is small but the hardware constraints are strict, such as 4-bit microcontrollers with limited capacity and computing power, electronic engineers are usually responsible for both circuit design and software control. Their main development language is assembly, and C accounts for only a small part. Such electronic development engineers are hard to find: in some industrial companies a core electronic engineer is paid more than any other employee, and ordinary electronic engineers are said to earn more than ten times what programmers earn. The reason is that, since the start of the 21st century, many people have studied assembly but few have mastered it. It is harder to learn and use than high-level languages, and its scope of application is small. Although its elements are simple, it is very flexible. People who learned a high-level language first find assembly much harder to learn than those who started with assembly, whereas people who know assembly pick up high-level languages easily. Going from the complex to the simple is easy; going from the simple to the complex is hard. For a programmer with a thorough understanding of microcomputer principles, assembly language is a required language.
Practical Applications
As modern software systems have grown more complex, many high-level languages such as C/C++ and Pascal/Object Pascal have emerged. These languages make development simpler and more efficient and let software developers meet the demand for rapid development. The complexity of assembly language, meanwhile, has gradually narrowed the fields in which it is used. But this does not mean that assembly has no place. Because assembly is closer to machine language and can operate on hardware directly, the resulting programs can run faster and occupy less memory than those written in other languages. It is therefore widely used in time-critical programs, in the core modules of many large programs, and in industrial control.
In addition, although there are many programming languages to choose from, assembly is still a required course for computer science students at many universities, because it helps them understand in depth how computers work.
Historically, assembly language was once a very popular programming language. As software grew in scale, and with it the demands on development schedule and efficiency, high-level languages gradually replaced it. Even so, high-level languages cannot completely take over the role of assembly language. Take the Linux kernel as an example: although most of the code is written in C, assembly code is still unavoidable in some key places. This code is so closely tied to the hardware that even C is powerless, while assembly language can play to its strengths and get the most out of the hardware.
First, most assembly language statements correspond directly to machine instructions, so they execute quickly and efficiently and produce compact code. This is useful where memory is limited but fast, real-time response is required, as in instruments and industrial control equipment.
Second, assembly language can be used in the core of system programs and in parts that interact frequently with the system hardware. Examples include the core routines of an operating system, the initialization code for I/O interface circuits, low-level drivers for external devices, frequently called subroutines, dynamic link libraries, and some advanced graphics and video game programs.
Third, assembly language can be used in areas such as software encryption and decryption, the analysis and prevention of computer viruses, and program debugging and error analysis.
Finally, learning assembly language deepens one's understanding of courses such as computer principles and operating systems. By learning and using it, one can perceive, experience, and understand the logical functions of the machine, which lays a theoretical foundation for understanding the principles of software systems above it and a practical foundation for mastering the principles of the hardware system below it.
Classic Textbooks
There are many textbooks on assembly language, covering a variety of processors; there are roughly 100 or more. The more commonly used ones are listed below by category:
x86 Processors
1. "x86 Assembly Language: From Real Mode to Protected Mode", by Li Zhong, Publishing House of Electronics Industry, 2013-1.
Based on the Intel x86 processor, the NASM assembler, and the Bochs virtual machine. Assembly language is the language of the processor; in that sense, learning it means programming the hardware directly rather than relying on opaque DOS interrupts and API calls. This is an interesting book. It does not spend pages on dull arithmetic exercises; instead it teaches how to control hardware directly, for example displaying characters, reading hard disk data, and controlling other hardware without the support of BIOS, DOS, Windows, Linux, or any other software.
32-bit and 64-bit systems are now mainstream, and real mode and the DOS operating system have become history; Linux and Windows both run in protected mode. This book goes from real mode to 32-bit protected mode, with the emphasis on 32-bit protected mode. Reading it is a great help in understanding how modern computers and modern operating systems work.
2. "Assembly Language" (2nd Edition), by Wang Shuang, Tsinghua University Press, 2013-4-1.
An assembly textbook based on the Intel 8086 processor, the MASM assembler, and the DOS platform. It focuses entirely on the real mode of the 8086 and does not cover the common 32-bit and 64-bit modes, but because it is easy to follow, reader feedback is very good.
3. "80X86 Assembly Language Programming Tutorial", compiled by Yang Jiwen et al., Tsinghua University Press, 1999-3-1.
Based on the Intel x86 processor and the MASM and TASM assemblers. It covers both 16-bit real mode and 32-bit protected mode, describing the latter in more detail.
4. "32-Bit Assembly Language Programming", compiled by Qian Xiaojie, China Machine Press, 2011-8-1.
An assembly textbook based on the Intel x86 processor, the MASM assembler, and the Windows platform.
5. "16/32-Bit Microcomputer Principle, Assembly Language and Interface Technology", compiled by Qian Xiaojie and Chen Tao, China Machine Press, 2005-2-1.
Based on the Intel x86 processor. It discusses the basic principles, assembly language, and interface technology of 16-bit microcomputers and introduces the related technologies of 32-bit microcomputer systems.
6. "Intel Assembly Language Programming" (5th Edition), by (US) Irvine, Publishing House of Electronics Industry, 2012-7-1.
An assembly textbook based on the Intel x86 processor, the MASM assembler, and the DOS/Windows platforms, covering both 16-bit real mode and 32-bit protected mode.
7. "The Art of Assembly Language Programming" (2nd Edition), by (US) Hyde, Tsinghua University Press, 2011-12-1.
Based on the Intel x86 processor. It uses the author's own High Level Assembler (HLA) as the teaching tool, which brings in some of the advantages and features of high-level languages.
8. "x86 PC Assembly Language, Design and Interface" (5th Edition), by (US) Mazidi and Kaush, Publishing House of Electronics Industry, 2011-1-1.
Based on the Intel x86 processor. It covers both 16-bit real mode and 32-bit protected mode, and also gives some introduction to 64-bit.
ARM and Microcontrollers
1. "Assembly Language Programming: Based on ARM Architecture" (2nd Edition), edited by Wen Quangang et al., Beijing University of Aeronautics and Astronautics Press, 2010-8-1.
Based on ARM architecture processors; an introductory textbook for learning embedded technology.
2. "Learn AVR Microcontroller from Scratch", compiled by Xu Yimin et al., China Machine Press, 2011-1-1.
Covers an overview of microcontrollers, AVR development tools, C for AVR microcontrollers, the basic structure of the ATmega16 microcontroller, and the AVR instruction set and assembly system.
3. "Practical Tutorial on 51 Microcontroller Simulation Based on Multisim10", edited by Nie Dian and Ding Wei, Publishing House of Electronics Industry, 2010-2-1.
Explains the main functions of NI Multisim 10 for microcontroller simulation.
4. "PIC18 Microcontroller: Architecture, Programming and Interface Design", by (US) Berry, Tsinghua University Press, 2009-4-1.
Microcontrollers are widely used in fields such as automobiles, home appliances, industrial control, and medical equipment. Using Microchip's PIC18 series as the example, this book explains comprehensively how to program microcontrollers in C and assembly language.
5. "CASL Assembly Language Programming", compiled by Zhao Lihui, China Electric Power Press, 2002-10-1.
CASL assembly language is required content for the senior programmer level of China's Computer Software Professional Technology Qualification and Level Examination. This book is a monograph on CASL assembly language programming.
-----------------------------------------------------------------------
http://vdisk.weibo.com/s/BJ8eVgUUujf2u
[Last edited by zzz19760225 on 2017-11-11 at 19:03]