汇编语言
(面向机器的程序设计语言)
本词条由“科普中国”百科科学词条编写与应用工作项目 审核 。
汇编语言(assembly language)是一种用于电子计算机、微处理器、微控制器或其他可编程器件的低级语言,亦称为符号语言。在汇编语言中,用助记符(Mnemonics)代替机器指令的操作码,用地址符号(Symbol)或标号(Label)代替指令或操作数的地址。在不同的设备中,汇编语言对应着不同的机器语言指令集,通过汇编过程转换成机器指令。普遍地说,特定的汇编语言和特定的机器语言指令集是一一对应的,不同平台之间不可直接移植。
许多汇编程序为程序开发、汇编控制、辅助调试提供了额外的支持机制。有的汇编语言编程工具经常会提供宏,它们也被称为宏汇编器。
汇编语言不像其他大多数的程序设计语言一样被广泛用于程序设计。在今天的实际应用中,它通常被应用在底层,硬件操作和高要求的程序优化的场合。驱动程序、嵌入式操作系统和实时运行程序都需要汇编语言。
中文名
汇编语言
外文名
Assembly Language
学 科
软件工程
产生年代
20世纪50年代
编译方式
汇编
目录
1 发展历程
2 语言特点
▪ 总体特点
▪ 优点
▪ 缺点
3 语言组成
▪ 数据传送指令
▪ 整数和逻辑运算指令
▪ 移位指令
▪ 位操作指令
▪ 条件设置指令
▪ 控制转移指令
▪ 串操作指令
▪ 输入输出指令
4 相关技术
▪ 汇编器
▪ 编译环境
5 发展前景
6 实际应用
7 经典教材
▪ x86处理器
▪ ARM及单片机
发展历程
说到汇编语言的产生,首先要讲一下机器语言。机器语言是机器指令的集合。机器指令展开来讲就是一台机器可以正确执行的命令。电子计算机的机器指令是一列二进制数字。计算机将之转变为一列高低电平,以使计算机的电子器件受到驱动,进行运算。
上面所说的计算机指的是可以执行机器指令,进行运算的机器。这是早期计算机的概念。在我们常用的PC机中,有一个芯片来完成上面所说的计算机的功能。这个芯片就是我们常说的CPU(Central Processing Unit,中央处理单元)。每一种微处理器,由于硬件设计和内部结构的不同,就需要用不同的电平脉冲来控制,使它工作。所以每一种微处理器都有自己的机器指令集,也就是机器语言。
早期的程序设计均使用机器语言。程序员们将用0, 1数字编成的程序代码打在纸带或卡片上,1打孔,0不打孔,再将程序通过纸带机或卡片机输入计算机,进行运算。这样的机器语言由纯粹的0和1构成,十分复杂,不方便阅读和修改,也容易产生错误。程序员们很快就发现了使用机器语言带来的麻烦,它们难于辨别和记忆,给整个产业的发展带来了障碍,于是汇编语言产生了。
汇编语言的主体是汇编指令。汇编指令和机器指令的差别在于指令的表示方法上。汇编指令是机器指令便于记忆的书写格式。
操作:寄存器BX的内容送到AX中
1000100111011000 机器指令
mov ax,bx 汇编指令
此后,程序员们就用汇编指令编写源程序。可是,计算机能读懂的只有机器指令,那么如何让计算机执行程序员用汇编指令编写的程序呢?这时,就需要有一个能够将汇编指令转换成机器指令的翻译程序,这样的程序我们称其为汇编器(assembler)。程序员用汇编语言写出源程序,再用汇编器将其汇编为机器码,由计算机最终执行。
[图:工作过程]
语言特点
汇编语言是直接面向处理器(Processor)的程序设计语言。处理器是在指令的控制下工作的,处理器可以识别的每一条指令称为机器指令。每一种处理器都有自己可以识别的一整套指令,称为指令集。处理器执行指令时,根据不同的指令采取不同的动作,完成不同的功能,既可以改变自己内部的工作状态,也能控制其它外围电路的工作状态。
汇编语言的另一个特点就是它所操作的对象不是具体的数据,而是寄存器或者存储器,也就是说它是直接和寄存器和存储器打交道,这也是汇编语言执行速度较快的原因之一,但同时这也使编程更加复杂,因为既然数据是存放在寄存器或存储器中,那么必然就存在着寻址方式,也就是用什么方法找到所需要的数据。例如上面的例子,我们就不能像高级语言一样直接使用数据,而是先要从相应的寄存器AX、BX 中把数据取出。这也就增加了编程的复杂性,因为在高级语言中寻址这部分工作是由编译系统来完成的,而在汇编语言中是由程序员自己来完成的,这无疑增加了编程的复杂程度,降低了程序的可读性。
再者,汇编语言指令是机器指令的一种符号表示,而不同类型的CPU 有不同的机器指令系统,也就有不同的汇编语言,所以,汇编语言程序与机器有着密切的关系。所以,除了同系列、不同型号CPU 之间的汇编语言程序有一定程度的可移植性之外,其它不同类型(如:小型机和微机等)CPU 之间的汇编语言程序是无法移植的,也就是说,汇编语言程序的通用性和可移植性要比高级语言程序低。
正因为汇编语言有“与机器相关性”的特性,程序员用汇编语言编写程序时,可充分对机器内部的各种资源进行合理的安排,让它们始终处于最佳的使用状态。这样编写出来的程序执行代码短、执行速度快。汇编语言是各种编程语言中与硬件关系最密切、最直接的一种,在时间和空间的效率上也最高的一种,它是高等院校计算机应用技术必修的专业课程之一,对于训练学生掌握程序设计技术,熟悉上机操作和程序调试技术有重要作用
总体特点
1.机器相关性
这是一种面向机器的低级语言,通常是为特定的计算机或系列计算机专门设计的。因为是机器指令的符号化表示,故不同的机器就有不同的汇编语言。使用汇编语言能面向机器并较好地发挥机器的特性,得到质量较高的程序。
2.高速度和高效率
汇编语言保持了机器语言的优点,具有直接和简捷的特点,可有效地访问、控制计算机的各种硬件设备,如磁盘、存储器、CPU、I/O端口等,且占用内存少,执行速度快,是高效的程序设计语言。
3.编写和调试的复杂性
由于是直接控制硬件,且简单的任务也需要很多汇编语言语句,因此在进行程序设计时必须面面俱到,需要考虑到一切可能的问题,合理调配和使用各种软、硬件资源。这样,就不可避免地加重了程序员的负担。与此相同,在程序调试时,一旦程序的运行出了问题,就很难发现。
优点
1、因为用汇编语言设计的程序最终被转换成机器指令,故能够保持机器语言的一致性,直接、简捷,并能像机器指令一样访问、控制计算机的各种硬件设备,如磁盘、存储器、CPU、I/O端口等。使用汇编语言,可以访问所有能够被访问的软、硬件资源。
2、目标代码简短,占用内存少,执行速度快,是高效的程序设计语言,经常与高级语言配合使用,以改善程序的执行速度和效率,弥补高级语言在硬件控制方面的不足,应用十分广泛。
缺点
1、汇编语言是面向机器的,处于整个计算机语言层次结构的底层,故被视为一种低级语言,通常是为特定的计算机或系列计算机专门设计的。不同的处理器有不同的汇编语言语法和编译器,编译的程序无法在不同的处理器上执行,缺乏可移植性;
2、难于从汇编语言代码上理解程序设计意图,可维护性差,即使是完成简单的工作也需要大量的汇编语言代码,很容易产生bug,难于调试;
3、使用汇编语言必须对某种处理器非常了解,而且只能针对特定的体系结构和处理器进行优化,开发效率很低,周期长且单调。
语言组成
数据传送指令
这部分指令包括通用数据传送指令MOV、条件传送指令CMOVcc、堆栈操作指令PUSH/PUSHA/PUSHAD/POP/POPA/POPAD、交换指令XCHG/XLAT/BSWAP、地址或段描述符选择子传送指令LEA/LDS/LES/LFS/LGS/LSS等。注意,CMOVcc不是一条具体的指令,而是一个指令簇,包括大量的指令,用于根据EFLAGS寄存器的某些位状态来决定是否执行指定的传送操作。
整数和逻辑运算指令
这部分指令用于执行算术和逻辑运算,包括加法指令ADD/ADC、减法指令SUB/SBB、加一指令INC、减一指令DEC、比较操作指令CMP、乘法指令MUL/IMUL、除法指令DIV/IDIV、符号扩展指令CBW/CWDE/CDQE、十进制调整指令DAA/DAS/AAA/AAS、逻辑运算指令NOT/AND/OR/XOR/TEST等。
移位指令
这部分指令用于将寄存器或内存操作数移动指定的次数。包括逻辑左移指令SHL、逻辑右移指令SHR、算术左移指令SAL、算术右移指令SAR、循环左移指令ROL、循环右移指令ROR等。
位操作指令
这部分指令包括位测试指令BT、位测试并置位指令BTS、位测试并复位指令BTR、位测试并取反指令BTC、位向前扫描指令BSF、位向后扫描指令BSR等。
条件设置指令
这不是一条具体的指令,而是一个指令簇,包括大约30条指令,用于根据EFLAGS寄存器的某些位状态来设置一个8位的寄存器或者内存操作数。比如SETE/SETNE/SETGE等等。
控制转移指令
这部分包括无条件转移指令JMP、条件转移指令Jcc/JCXZ、循环指令LOOP/LOOPE/LOOPNE、过程调用指令CALL、子过程返回指令RET、中断指令INTn、INT3、INTO、IRET等。注意,Jcc是一个指令簇,包含了很多指令,用于根据EFLAGS寄存器的某些位状态来决定是否转移;INT n是软中断指令,n可以是0到255之间的数,用于指示中断向量号。
串操作指令
这部分指令用于对数据串进行操作,包括串传送指令MOVS、串比较指令CMPS、串扫描指令SCAS、串加载指令LODS、串保存指令STOS,这些指令可以有选择地使用REP/REPE/REPZ/REPNE和REPNZ的前缀以连续操作。
输入输出指令
这部分指令用于同外围设备交换数据,包括端口输入指令IN/INS、端口输出指令OUT/OUTS。
高级语言辅助指令
这部分指令为高级语言的编译器提供方便,包括创建栈帧的指令ENTER和释放栈帧的指令LEAVE。
控制和特权指令
这部分包括无操作指令NOP、停机指令HLT、等待指令WAIT/MWAIT、换码指令ESC、总线封锁指令LOCK、内存范围检查指令BOUND、全局描述符表操作指令LGDT/SGDT、中断描述符表操作指令LIDT/SIDT、局部描述符表操作指令LLDT/SLDT、描述符段界限值加载指令LSL、描述符访问权读取指令LAR、任务寄存器操作指令LTR/STR、请求特权级调整指令ARPL、任务切换标志清零指令CLTS、控制寄存器和调试寄存器数据传送指令MOV、高速缓存控制指令INVD/WBINVD/INVLPG、型号相关寄存器读取和写入指令RDMSR/WRMSR、处理器信息获取指令CPUID、时间戳读取指令RDTSC等。
浮点和多媒体指令
这部分指令用于加速浮点数据的运算,以及用于加速多媒体数据处理的单指令多数据(SIMD及其扩展SSEx)指令。这部分指令数据非常庞大,无法一一列举,请自行参考INTEL手册。
虚拟机扩展指令
这部分指令包括INVEPT/INVVPID/VMCALL/VMCLEAR/VMLAUNCH/VMRESUME/VMPTRLD/VMPTRST/VMREAD/VMWRITE/VMXOFF/VMXON等。
相关技术
汇编器
典型的现代汇编器(assembler)建造目标代码,由解译组语指令集的易记码(mnemonics)到操作码(OpCode),并解析符号名称(symbolic names)成为存储器地址以及其它的实体。使用符号参考是汇编器的一个重要特征,它可以节省修改程序后人工转址的乏味耗时计算。基本就是把机器码变成一些字母而已,编译的时候再把输入的指令字母替换成为晦涩难懂机器码。
编译环境
用汇编语言等非机器语言书写好的符号程序称为源程序,汇编语言编译器的作用是将源程序翻译成目标程序。目标程序是机器语言程序,当它被安置在内存的预定位置上后,就能被计算机的CPU处理和执行。
汇编的调试环境总的来说比较少,也很少有非常好的编译器。编译器的选择依赖于目标处理器的类型和具体的系统平台。一般来说,功能良好的编译器用起来应当非常方便,比如,应当可以自动整理格式、语法高亮显示,集编译、链接和调试为一体,方便实用。
对于广泛使用的个人计算机来说,可以自由选择的汇编语言编译器有MASM、NASM、TASM、GAS、FASM、RADASM等,但大都不具备调试功能。如果是为了学习汇编语言,轻松汇编因为拥有一个完善的集成环境,是一款非常适合初学者的汇编编译器。
发展前景
汇编语言是机器语言的助记符,相对于比枯燥的机器代码易于读写、易于调试和修改,同时优秀的汇编语言设计者经过巧妙的设计,使得汇编语言汇编后的代码比高级语言执行速度更快,占内存空间少等优点,但汇编语言的运行速度和空间占用是针对高级语言并且需要巧妙设计,而且部分高级语言在编译后代码执行效率同样很高,所以此优点慢慢弱化。而且在编写复杂程序时具有明显的局限性,汇编语言依赖于具体的机型,不能通用,也不能在不同机型之间移植。常说汇编语言是低级语言,并不是说汇编语言要被弃之,相反,汇编语言仍然是计算机(或微机)底层设计程序员必须了解的语言,在某些行业与领域,汇编是必不可少的,非它不可适用。只是,现在计算机最大的领域为IT软件,也是我们常说的计算机应用软件编程,在熟练的程序员手里,使用汇编语言编写的程序,运行效率与性能比其它语言写的程序相对提高,但是代价是需要更长的时间来优化,如果对计算机原理及编程基础不扎实,反而增加其开发难度,实在是得不偿失,对比2010年前后的软件开发,已经是市场化的软件行业,加上高级语言的优秀与跨平台,一个公司不可以让一个团队使用汇编语言来编写所有的东西,花上几倍甚至几十倍的时间,不如使用其它语言来完成,只要最终结果不比汇编语言编写的差太多,就能抢先一步完成,这是市场经济下的必然结果。
但是,迄今为止,还没有程序员敢断定汇编语言是不需要学的,同时,汇编语言(Assembly Language)是面向机器的程序设计语言,设计精湛的汇编程序员,部分已经脱离软件开发,挤身于工业电子编程中。对于功能相对小巧但硬件对语言设计要求苛刻的行业,如4位单片机,由于其容量及运算,此行业的电子工程师一般负责从开发设计电路及软件控制,主要开发语言就是汇编,c语言使用只占极少部分,而电子开发工程师是千金难求,在一些工业公司,一个核心的电子工程师比其它任何职员待遇都高,对比起来,一般电子工程师待遇是程序员的十倍以上。这种情况是因为21世纪以来,学习汇编的人虽然也不少,但是真正能学到精通的却不多,它相对于高级语言难学,难用,适用范围小,虽然简单,但是过于灵活,学习过高级语言的人去学习汇编比一开始学汇编的人难得多,但是学过汇编的人学习高级语言却很容易,简从繁易,繁从简难。对于一个全面了解微机原理的程序员,汇编语言是必修语言。
实际应用
随着现代软件系统越来越庞大复杂,大量经过了封装的高级语言如C/C++,Pascal/Object Pascal也应运而生。这些新的语言使得程序员在开发过程中能够更简单,更有效率,使软件开发人员得以应付快速的软件开发的要求。而汇编语言由于其复杂性使得其适用领域逐步减小。但这并不意味着汇编已无用武之地。由于汇编更接近机器语言,能够直接对硬件进行操作,生成的程序与其他的语言相比具有更高的运行速度,占用更小的内存,因此在一些对于时效性要求很高的程序、许多大型程序的核心模块以及工业控制方面大量应用。
此外,虽然有众多编程语言可供选择,但汇编依然是各大学计算机科学类专业学生的必修课,以让学生深入了解计算机的运行原理。
历史上,汇编语言曾经是非常流行的程序设计语言之一。随着软件规模的增长,以及随之而来的对软件开发进度和效率的要求,高级语言逐渐取代了汇编语言。但即便如此,高级语言也不可能完全替代汇编语言的作用。就拿Linux内核来讲,虽然绝大部分代码是用C语言编写的,但仍然不可避免地在某些关键地方使用了汇编代码。由于这部分代码与硬件的关系非常密切,即使是C语言也会显得力不从心,而汇编语言则能够很好扬长避短,最大限度地发挥硬件的性能。
首先,汇编语言的大部分语句直接对应着机器指令,执行速度快,效率高,代码体积小,在那些存储器容量有限,但需要快速和实时响应的场合比较有用,比如仪器仪表和工业控制设备中。
其次,在系统程序的核心部分,以及与系统硬件频繁打交道的部分,可以使用汇编语言。比如操作系统的核心程序段、I/O接口电路的初始化程序、外部设备的低层驱动程序,以及频繁调用的子程序、动态连接库、某些高级绘图程序、视频游戏程序等等。
再次,汇编语言可以用于软件的加密和解密、计算机病毒的分析和防治,以及程序的调试和错误分析等各个方面。
最后,通过学习汇编语言,能够加深对计算机原理和操作系统等课程的理解。通过学习和使用汇编语言,能够感知、体会和理解机器的逻辑功能,向上为理解各种软件系统的原理,打下技术理论基础;向下为掌握硬件系统的原理,打下实践应用基础。
经典教材
汇编语言教材很多,各种处理器都有涉及,粗略统计不下百种。在这么多的教材里,用得较多的可以分类列举如下:
x86处理器
1.《x86汇编语言:从实模式到保护模式》,李忠著,电子工业出版社,2013-1 。
基于INTEL x86处理器、NASM编译器和BOCHS虚拟机。汇编语言就是处理器的语言,从这个意义上来说,既然学习汇编语言,就必须直接面向硬件编程,而不是使用莫名其妙的DOS中断和API调用。这是一本有趣的书,它没有把篇幅花在计算一些枯燥的数学题上。相反,它教你如何直接控制硬件,在不借助于BIOS、DOS、Windows、Linux或者任何其他软件支持的情况下来显示字符、读取硬盘数据、控制其他硬件等。
我们知道,32位和64位是主流,实模式和DOS操作系统已经成为历史,Linux和Windows都工作在保护模式下。这本书从实模式讲到32位保护模式,尤其以32位保护模式为重点,阅读本书,对理解现代计算机和现代操作系统的工作原理有非常大的帮助作用。
2.《汇编语言》(第2版),王爽 著,清华大学出版社,2013-4-1
基于INTEL 8086处理器、MASM编译器,以及DOS平台的汇编教材,完全以8086处理器的实模式为主,不涉及常用的32位和64位模式,但因为通俗易懂,读者反映很好。
3.《80X86汇编语言程序设计教程》,杨季文等 编著,清华大学出版社,1999-3-1
基于INTEL x86处理器、MASM和TASM编译器,包含16位实模式和32位保护模式的内容,而且对后者讲述较为详细。
4.《32位汇编语言程序设计》,钱晓捷 编著,机械工业出版社,2011-8-1
基于INTEL x86处理器、MASM编译器,以及WINDOWS平台的汇编教材。
5.《16/32位微机原理汇编语言及接口技术》,钱晓捷,陈涛编著,机械工业出版社,2005-2-1
基于INTEL x86处理器,论述16位微型计算机的基本原理、汇编语言和接口技术,并引出32位微机系统相关技术。
6.《Intel汇编语言程序设计》(第五版),(美)欧文 著,电子工业出版社,2012-7-1
基于INTEL x86处理器、MASM编译器,以及DOS/WINDOWS平台的汇编教材,既有16位实模式的内容,也有32位保护模式的内容。
7.《汇编语言的编程艺术》(第2版),(美)海德 著,清华大学出版社,2011-12-1
基于INTEL x86处理器,使用了作者自制的高级语言汇编器(High Level Assembler,HLA)作为教学工具,以部分地获得高级语言的优势和功能。
8.《x86 PC汇编语言、设计与接口》(第五版),(美)马兹迪,考西著,电子工业出版社,2011-1-1
基于INTEL x86处理器,既讲了16位实模式的内容,也讲了32位保护模式的内容,对64位也有所介绍。
ARM及单片机
1.《汇编语言程序设计--基于ARM体系结构》(第2版),文全刚等主编,北京航空航天大学出版社,2010-8-1
基于ARM体系结构的处理器,是学习嵌入式技术的入门教材。
2.《零基础学AVR单片机》,徐益民等编著,机械工业出版社,2011-1-1
单片机概述、avr单片机的开发工具、avr单片机c语言、atmega16单片机基本结构、avr的指令系统与汇编系统等。
3.《基于Multisim10的51单片机仿真实战教程》,聂典,丁伟主编,电子工业出版社,2010-2-1
阐述了NI Multisim 10在单片机仿真中的各项主要功能。
4.《PIC18微控制器:体系结构、编程与接口设计》,(美)贝里著,清华大学出版社,2009-4-1
微控制器广泛应用于汽车、家电、工业控制、医疗设备等众多领域。本书以Microchip公司的PIC18系列微控制器为例,全面讲解如何使用C语言和汇编语言对微控制器进行编程。
5.《CASL汇编语言程序设计》,赵立辉编著,中国电力出版社,2002-10-1
CASL汇编语言是中国计算机软件专业技术资格和水平考试高级程序员级的必考内容。本书是讲述CASL汇编语言程序设计的专著。
-----------------------------------------------------------------------
http://vdisk.weibo.com/s/BJ8eVgUUujf2u
Last edited by zzz19760225 on 2017-11-11 at 19:03 ]
Assembly Language
(Machine-Oriented Programming Language)
This entry was reviewed by the "Science Popularization China" encyclopedia science-entry compilation and application project.
Assembly language is a low-level language for electronic computers, microprocessors, microcontrollers, and other programmable devices; it is also called symbolic language. In assembly language, mnemonics replace the opcodes of machine instructions, and symbols or labels replace the addresses of instructions and operands. Each kind of device has its own machine instruction set, and an assembly program is converted into those machine instructions by the assembly process. Generally speaking, a given assembly language corresponds one-to-one with a given machine instruction set, so assembly programs cannot be ported directly between platforms.
Many assemblers provide additional mechanisms that support program development, assembly control, and debugging. Assemblers that provide macros are also called macro assemblers.
Assembly language is not as widely used for programming as most other languages. Today it is mainly applied in low-level work: direct hardware manipulation and demanding program optimization. Device drivers, embedded operating systems, and real-time programs commonly rely on it.
Chinese name: 汇编语言
English name: Assembly Language
Discipline: Software engineering
Era of origin: 1950s
Translation method: Assembly
Table of Contents
1. Development Process
2. Language Characteristics
▪ Overall Characteristics
▪ Advantages
▪ Disadvantages
3. Language Components
▪ Data Transfer Instructions
▪ Integer and Logical Operation Instructions
▪ Shift Instructions
▪ Bit-Operation Instructions
▪ Conditional Setting Instructions
▪ Control Transfer Instructions
▪ String Operation Instructions
▪ Input and Output Instructions
4. Related Technologies
▪ Assembler
▪ Compilation Environment
5. Development Prospect
6. Practical Applications
7. Classic Textbooks
▪ x86 Processors
▪ ARM and Microcontrollers
Development Process
To explain how assembly language arose, we first need to talk about machine language. Machine language is the set of machine instructions, and a machine instruction is a command that a machine can execute directly. In an electronic computer, a machine instruction is a string of binary digits, which the computer turns into a sequence of high and low voltage levels that drive its electronic components to carry out the computation.
"Computer" here means a machine that can execute machine instructions and perform computations, which is the early notion of a computer. In the PCs we commonly use, a single chip performs this function: the CPU (Central Processing Unit). Because microprocessors differ in hardware design and internal structure, each needs different electrical pulses to control it. Each kind of microprocessor therefore has its own machine instruction set, that is, its own machine language.
Early programs were all written in machine language. Programmers punched programs made of 0s and 1s into paper tape or cards (a hole for 1, no hole for 0) and fed them into the computer through a tape or card reader. Code made purely of 0s and 1s is hard to read and modify and easy to get wrong. Programmers soon found that machine instructions were difficult to tell apart and to remember, which held back the whole industry, and so assembly language was born.
The core of assembly language is the assembly instruction. An assembly instruction differs from a machine instruction only in how it is written: it is a form of the machine instruction that is easy to remember.
Operation: copy the contents of register BX into AX
1000100111011000 Machine instruction
mov ax,bx Assembly instruction
From then on, programmers wrote their source programs in assembly instructions. But a computer understands only machine instructions, so how can it run a program written in assembly? A translation program is needed to convert assembly instructions into machine instructions; such a program is called an assembler. The programmer writes a source program in assembly language, the assembler translates it into machine code, and the computer executes the result.
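To make the translation step concrete, here is a minimal sketch in Python of how such a translation program could encode the example above. It is an illustration under stated assumptions, not a real tool: it assumes 16-bit mode, handles only register-to-register `mov`, and the name `assemble_mov_r16_r16` is made up for this example.

```python
# Register numbers used in the ModRM byte for 16-bit general-purpose registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def assemble_mov_r16_r16(line: str) -> bytes:
    """Encode 'mov dst,src' (16-bit registers only) with opcode 0x89 (MOV r/m16, r16)."""
    mnemonic, operands = line.strip().lower().split(None, 1)
    if mnemonic != "mov":
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    dst, src = (op.strip() for op in operands.split(","))
    # ModRM byte: mod=11 (register direct), reg=source, r/m=destination.
    modrm = 0b11000000 | (REG16[src] << 3) | REG16[dst]
    return bytes([0x89, modrm])

code = assemble_mov_r16_r16("mov ax,bx")
bits = "".join(f"{b:08b}" for b in code)
print(code.hex(), bits)  # 89d8 1000100111011000
```

The two bytes 89 D8 are exactly the bit string 1000100111011000 quoted above. A real assembler handles many more operand forms, and in 32-bit code the same operation would need an operand-size prefix.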
Language Characteristics
Assembly language is a programming language that addresses the processor directly. A processor works under the control of instructions; each instruction it can recognize is called a machine instruction, and the full set of instructions a processor recognizes is called its instruction set. When executing an instruction, the processor takes the action that instruction specifies, which may change its own internal state or control the state of peripheral circuits.
Another characteristic of assembly language is that it operates not on abstract data but on registers and memory. Working directly with registers and memory is one reason assembly code can run fast, but it also makes programming more complex: because data lives in registers or memory, the programmer must choose an addressing mode, that is, a way of locating the data needed. In the example above, we cannot simply use the data as in a high-level language; we must name the registers AX and BX that hold it. In a high-level language the compiler takes care of addressing, whereas in assembly language the programmer does it by hand, which increases the complexity of programming and reduces the readability of the program.
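As an illustration of what "addressing" involves, the following Python sketch computes where an operand such as `[bx+si+4]` would point. It assumes 8086 real mode, where a 16-bit offset is combined with a segment register to form a 20-bit physical address; the function names and the register values are hypothetical.

```python
def effective_address(base=0, index=0, disp=0):
    """Offset named by an operand such as [bx+si+4], wrapped to 16 bits."""
    return (base + index + disp) & 0xFFFF

def physical_address(segment, offset):
    """8086 real mode: physical address = segment * 16 + offset (20-bit bus)."""
    return ((segment << 4) + offset) & 0xFFFFF

# Assumed register contents for the example.
bx, si, ds = 0x1000, 0x0020, 0x2000
offset = effective_address(base=bx, index=si, disp=4)
address = physical_address(ds, offset)
print(hex(offset), hex(address))  # 0x1024 0x21024
```

In a high-level language this arithmetic is generated by the compiler; in assembly the programmer picks the base, index, and displacement explicitly.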
Furthermore, assembly instructions are a symbolic representation of machine instructions, and different types of CPU have different machine instruction sets and hence different assembly languages, so assembly programs are tightly bound to the machine. Apart from limited portability between different models in the same CPU family, assembly programs cannot be ported between different types of CPU (for example, between minicomputers and microcomputers). In other words, assembly programs are less general and less portable than high-level language programs.
Precisely because assembly language is machine-dependent, a programmer writing in it can plan the use of the machine's internal resources and keep them in the best possible state of use. Programs written this way are compact and fast. Of all programming languages, assembly is the most closely and directly tied to hardware and the most efficient in time and space. It is a required course in university computer application programs and plays an important role in training students in programming techniques, hands-on operation, and debugging.
Overall Characteristics
1. Machine dependence
Assembly language is a machine-oriented low-level language, usually designed for a specific computer or family of computers. Because it is a symbolic form of machine instructions, different machines have different assembly languages. Programming in assembly lets the programmer target the machine directly, exploit its particular features, and produce high-quality programs.
2. High speed and high efficiency
Assembly language keeps the advantages of machine language: it is direct and concise, it can access and control the computer's hardware (disks, memory, the CPU, I/O ports, and so on), and it produces programs that use little memory and run fast.
3. Complexity of writing and debugging
Because assembly controls the hardware directly, and even simple tasks take many statements, the programmer must attend to every detail, anticipate every possible problem, and allocate software and hardware resources carefully. This inevitably adds to the programmer's burden. Likewise, when a program misbehaves, the cause is hard to track down during debugging.
Advantages
1. Because a program written in assembly language is ultimately converted into machine instructions, it stays consistent with machine language: it is direct and concise, and, like machine instructions, it can access and control the computer's hardware (disks, memory, the CPU, I/O ports, and so on). Assembly language can reach every software and hardware resource that is accessible at all.
2. The object code is short, uses little memory, and runs fast. Assembly is often combined with high-level languages to improve a program's speed and efficiency and to make up for their weakness in hardware control, so it is widely applied.
Disadvantages
1. Assembly language is machine-oriented and sits at the bottom of the hierarchy of computer languages, which is why it is regarded as a low-level language. It is usually designed for a specific computer or family of computers. Different processors have different assembly syntax and different assemblers, and a program assembled for one processor cannot run on another, so portability is poor.
2. It is hard to read the design intent out of assembly code, so maintainability is poor. Even a simple task takes a large amount of code, which makes bugs likely and debugging difficult.
3. Writing assembly requires thorough knowledge of a particular processor, and the code can be optimized only for that architecture and processor. Development efficiency is very low, and the work is long and tedious.
Language Components
Data Transfer Instructions
This group includes the general data transfer instruction MOV; the conditional move instructions CMOVcc; the stack instructions PUSH/PUSHA/PUSHAD/POP/POPA/POPAD; the exchange instructions XCHG/XLAT/BSWAP; and the address and segment-selector load instructions LEA/LDS/LES/LFS/LGS/LSS. Note that CMOVcc is not a single instruction but a family of instructions, each of which performs the move only if certain bits of the EFLAGS register are in the required state.
Integer and Logical Operation Instructions
These instructions perform arithmetic and logical operations. They include the addition instructions ADD/ADC, the subtraction instructions SUB/SBB, increment INC, decrement DEC, compare CMP, the multiplication instructions MUL/IMUL, the division instructions DIV/IDIV, the sign-extension instructions CBW/CWDE/CDQE, the decimal and ASCII adjustment instructions DAA/DAS/AAA/AAS, and the logical instructions NOT/AND/OR/XOR/TEST.
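The pairing of ADD with ADC (add with carry) is what lets a 16-bit processor add wider numbers. The Python sketch below simulates that idea; it models only the carry flag, and the helper names `add16` and `add32_with_16bit_regs` are invented for illustration.

```python
MASK16 = 0xFFFF

def add16(a, b, carry_in=0):
    """Return (16-bit sum, carry flag), as ADD (carry_in=0) or ADC would."""
    total = a + b + carry_in
    return total & MASK16, total >> 16

def add32_with_16bit_regs(x, y):
    """Add two 32-bit values the way 16-bit code does: ADD the low words, ADC the high words."""
    lo, cf = add16(x & MASK16, y & MASK16)   # ADD low halves
    hi, cf = add16(x >> 16, y >> 16, cf)     # ADC high halves, consuming the carry
    return (hi << 16) | lo, cf

print(hex(add32_with_16bit_regs(0x0001FFFF, 0x00000001)[0]))  # 0x20000
```

SUB and SBB (subtract with borrow) chain in the same way for multi-word subtraction.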
Shift Instructions
These instructions shift or rotate a register or memory operand by a specified number of bit positions. They include logical shift left SHL, logical shift right SHR, arithmetic shift left SAL (the same operation as SHL), arithmetic shift right SAR, rotate left ROL, and rotate right ROR.
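The difference between logical, arithmetic, and rotating shifts is easiest to see on a small value. This Python sketch models them on 8-bit operands; it ignores the flags the real instructions set and assumes shift counts from 0 to 7.

```python
def shl8(v, n):
    """SHL/SAL: shift left, zeros enter from the right."""
    return (v << n) & 0xFF

def shr8(v, n):
    """SHR: logical shift right, zeros enter from the left."""
    return (v & 0xFF) >> n

def sar8(v, n):
    """SAR: arithmetic shift right, the sign bit is replicated."""
    v &= 0xFF
    signed = v - 0x100 if v & 0x80 else v
    return (signed >> n) & 0xFF

def rol8(v, n):
    """ROL: bits shifted out on the left re-enter on the right."""
    n %= 8
    v &= 0xFF
    return ((v << n) | (v >> (8 - n))) & 0xFF

def ror8(v, n):
    """ROR: bits shifted out on the right re-enter on the left."""
    n %= 8
    v &= 0xFF
    return ((v >> n) | (v << (8 - n))) & 0xFF

print(hex(shr8(0xF0, 2)), hex(sar8(0xF0, 2)))  # 0x3c 0xfc
```

Here 0xF0 is -16 as a signed byte, so SAR by 2 yields 0xFC (-4) while SHR yields 0x3C (60).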
Bit-Operation Instructions
This group includes bit test BT, bit test and set BTS, bit test and reset BTR, bit test and complement BTC, bit scan forward BSF, and bit scan reverse BSR.
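A rough Python model of these operations follows. It is a sketch, not a hardware description: each test-and-modify helper returns the new value together with the old bit (which the processor would place in the carry flag), and the scan helpers return `None` for a zero operand, where the hardware leaves the destination undefined.

```python
def bt(value, i):
    """BT: return bit i of value."""
    return (value >> i) & 1

def bts(value, i):
    """BTS: set bit i; also return its previous state."""
    return value | (1 << i), bt(value, i)

def btr(value, i):
    """BTR: clear bit i; also return its previous state."""
    return value & ~(1 << i), bt(value, i)

def btc(value, i):
    """BTC: invert bit i; also return its previous state."""
    return value ^ (1 << i), bt(value, i)

def bsf(value):
    """BSF: index of the lowest set bit, scanning upward from bit 0."""
    if value == 0:
        return None
    return (value & -value).bit_length() - 1

def bsr(value):
    """BSR: index of the highest set bit, scanning downward from the top."""
    if value == 0:
        return None
    return value.bit_length() - 1

print(bsf(0b101000), bsr(0b101000))  # 3 5
```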
Conditional Setting Instructions
This is not a single instruction but a family of about 30 mnemonics (SETcc) that set an 8-bit register or memory operand to 1 or 0 according to the state of certain bits in the EFLAGS register, for example SETE/SETNE/SETGE.
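The following Python sketch shows how a comparison produces flag bits and how a few of the SETcc conditions read them. It models 8-bit operands only, and the dictionary of flags is a simplification chosen for this example, not how the register is laid out.

```python
def cmp8(a, b):
    """Compute the flags that CMP a, b (8-bit) would leave behind."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    return {
        "ZF": int(result == 0),                  # zero flag
        "SF": result >> 7,                       # sign flag: top bit of result
        "CF": int(a < b),                        # carry (borrow) flag, unsigned view
        # Overflow: operand signs differ and the result's sign differs from a's.
        "OF": int(((a ^ b) & (a ^ result) & 0x80) != 0),
    }

def sete(f):
    return f["ZF"]                    # equal

def setne(f):
    return 1 - f["ZF"]                # not equal

def setge(f):
    return int(f["SF"] == f["OF"])    # signed greater-or-equal

def setb(f):
    return f["CF"]                    # unsigned below

flags = cmp8(0x05, 0xFB)              # 5 vs 251 unsigned, 5 vs -5 signed
print(setge(flags), setb(flags))      # 1 1
```

The same byte pattern compares differently depending on which condition is used: 5 is greater than -5 in the signed view (SETGE gives 1) yet below 251 in the unsigned view (SETB gives 1).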
Control Transfer Instructions
This group includes the unconditional jump JMP; the conditional jumps Jcc/JCXZ; the loop instructions LOOP/LOOPE/LOOPNE; the procedure call CALL; the procedure return RET; and the interrupt instructions INT n, INT3, INTO, and IRET. Note that Jcc is a family of instructions, each of which jumps or not according to certain bits of the EFLAGS register. INT n is the software interrupt instruction; n is the interrupt vector number, from 0 to 255.
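To show how LOOP and JCXZ cooperate, this Python sketch mimics a small counting loop that sums the values n, n-1, ..., 1 in a 16-bit register. The register names are modeled as plain variables and the function name `loop_sum` is hypothetical.

```python
def loop_sum(n):
    """Simulate:  jcxz done / again: add ax, cx / loop again / done:"""
    cx = n & 0xFFFF
    ax = 0
    if cx == 0:               # JCXZ: skip the loop entirely when CX is zero
        return ax
    while True:
        ax = (ax + cx) & 0xFFFF   # add ax, cx
        cx = (cx - 1) & 0xFFFF    # LOOP first decrements CX...
        if cx == 0:               # ...and falls through once CX reaches zero
            break
    return ax

print(loop_sum(4))  # 10
```

Without the JCXZ guard, entering the loop with CX equal to 0 would wrap the counter around and run the body 65536 times, which is why the two instructions are often used together.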
String Operation Instructions
These instructions operate on strings of data. They include string move MOVS, string compare CMPS, string scan SCAS, string load LODS, and string store STOS. Each can optionally take one of the prefixes REP, REPE/REPZ, or REPNE/REPNZ to repeat the operation.
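The effect of a repeat prefix can be sketched in Python as follows. The model treats memory as a flat `bytearray` and ignores segment registers; `rep_movsb` is a made-up helper name, and `df` stands for the direction flag.

```python
def rep_movsb(memory, si, di, cx, df=0):
    """Simulate REP MOVSB: copy CX bytes from [SI] to [DI], stepping both pointers."""
    step = -1 if df else 1        # direction flag set means copy downward
    while cx != 0:
        memory[di] = memory[si]   # MOVSB moves one byte
        si += step
        di += step
        cx -= 1                   # REP decrements CX after each iteration
    return si, di, cx

mem = bytearray(b"hello.....")
print(rep_movsb(mem, si=0, di=5, cx=5), mem)  # (5, 10, 0) bytearray(b'hellohello')
```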
Input and Output Instructions
These instructions exchange data with peripheral devices. They include the port input instructions IN/INS and the port output instructions OUT/OUTS.
High-Level Language Auxiliary Instructions
These instructions make life easier for high-level language compilers. They include ENTER, which creates a stack frame, and LEAVE, which releases it.
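A stack frame is just a saved base pointer plus space reserved for local variables. The Python sketch below models a 16-bit stack to show what the two instructions do; the class `Stack16` is a toy invented for this example, and only the simplest form of ENTER (nesting level 0) is modeled.

```python
class Stack16:
    """A toy 16-bit stack with SP and BP registers (hypothetical helper)."""

    def __init__(self, sp=0x1000, bp=0):
        self.mem = {}
        self.sp = sp
        self.bp = bp

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.mem[self.sp]
        self.sp = (self.sp + 2) & 0xFFFF
        return value

    def enter(self, local_bytes):
        # ENTER local_bytes, 0  ==  push bp / mov bp, sp / sub sp, local_bytes
        self.push(self.bp)
        self.bp = self.sp
        self.sp = (self.sp - local_bytes) & 0xFFFF

    def leave(self):
        # LEAVE  ==  mov sp, bp / pop bp
        self.sp = self.bp
        self.bp = self.pop()

cpu = Stack16(sp=0x1000, bp=0x2000)
cpu.enter(8)                      # reserve 8 bytes of locals
frame = (cpu.sp, cpu.bp)
cpu.leave()                       # discard the frame
restored = (cpu.sp, cpu.bp)
print([hex(v) for v in frame], [hex(v) for v in restored])
```

After `enter(8)` the frame pointer sits just below the saved BP and SP is 8 bytes lower; `leave()` restores both registers to their values before the call.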
Control and Privilege Instructions
This group includes no-operation NOP; halt HLT; the wait instructions WAIT/MWAIT; the escape instruction ESC; the bus lock prefix LOCK; the array bounds check BOUND; the global descriptor table instructions LGDT/SGDT; the interrupt descriptor table instructions LIDT/SIDT; the local descriptor table instructions LLDT/SLDT; load segment limit LSL; load access rights LAR; the task register instructions LTR/STR; adjust requested privilege level ARPL; clear task-switched flag CLTS; MOV to and from control and debug registers; the cache control instructions INVD/WBINVD/INVLPG; the model-specific register instructions RDMSR/WRMSR; processor identification CPUID; and read time-stamp counter RDTSC.
Floating-Point and Multimedia Instructions
These instructions accelerate floating-point arithmetic and, through single-instruction multiple-data (SIMD) operations and the SSEx extensions, multimedia data processing. There are far too many of them to list here; see the Intel manuals.
Virtual Machine Extension Instructions
This group includes INVEPT/INVVPID/VMCALL/VMCLEAR/VMLAUNCH/VMRESUME/VMPTRLD/VMPTRST/VMREAD/VMWRITE/VMXOFF/VMXON.
Related Technologies
Assembler
A typical modern assembler builds object code by translating the mnemonics of the assembly instruction set into opcodes and by resolving symbolic names into memory addresses and other entities. Symbolic references are a key feature of an assembler: they spare the programmer the tedious, time-consuming job of recalculating addresses by hand after every change to the program. In essence, assembly language replaces machine codes with letters, and the assembler replaces those letters with the corresponding, otherwise obscure, machine code.
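Symbol resolution is commonly done in two passes: the first records the address of every label, the second substitutes those addresses into the instructions. The Python sketch below illustrates the idea under a strong simplifying assumption, namely that every instruction occupies 2 bytes; the function name `two_pass` and the sample program are hypothetical.

```python
def two_pass(lines, origin=0, instr_size=2):
    """Resolve labels to addresses, assuming fixed-size instructions."""
    symbols = {}
    instructions = []
    address = origin
    # Pass 1: assign an address to every label and instruction.
    for raw in lines:
        line = raw.split(";", 1)[0].strip()     # drop comments
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address        # a label names the next address
            continue
        instructions.append((address, line))
        address += instr_size
    # Pass 2: replace label operands with the addresses found in pass 1.
    resolved = []
    for addr, text in instructions:
        mnemonic, *operands = text.split()
        operands = [f"0x{symbols[op]:04x}" if op in symbols else op
                    for op in operands]
        resolved.append((addr, " ".join([mnemonic] + operands)))
    return symbols, resolved

source = [
    "start:",
    "    mov ax,bx",
    "    jmp done      ; forward reference",
    "    mov cx,dx",
    "done:",
    "    jmp start",
]
symbols, program = two_pass(source, origin=0x100)
for addr, text in program:
    print(f"{addr:04x}  {text}")
```

Because the first pass sees the whole program before any operand is filled in, the forward reference to `done` resolves without trouble; inserting an instruction simply shifts the addresses on the next run.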
Compilation Environment
A symbolic program written in a non-machine language such as assembly is called a source program, and the job of the assembler is to translate the source program into an object program. The object program is a machine language program; once it has been placed at its intended location in memory, the CPU can process and execute it.
On the whole, there are few debugging environments for assembly and few really good assemblers. The choice of assembler depends on the target processor and the system platform. A well-designed tool should be convenient to use: for example, it should format code automatically, highlight syntax, and integrate assembling, linking, and debugging.
For the widely used personal computer, the available assemblers and tools include MASM, NASM, TASM, GAS, FASM, and RADASM, but most of them have no debugging support. For learning assembly language, Easy Assembler (轻松汇编) is well suited to beginners because it comes with a complete integrated environment.
Development Prospect
Assembly language is a mnemonic form of machine language. Compared with raw machine code it is easier to read, write, debug, and modify. A skilled assembly programmer can, through careful design, produce code that runs faster and uses less memory than code written in a high-level language. That advantage holds only relative to high-level languages and only with careful design, however, and because some high-level languages also compile to very efficient code, it has gradually weakened. Assembly also has clear limitations for complex programs: it depends on the specific machine, is not general, and cannot be ported between machine types.
Calling assembly a low-level language does not mean it should be abandoned. On the contrary, it remains a language that programmers who work at the lowest level of a computer (or microcomputer) must understand, and in some industries and fields it is indispensable. Today, though, the largest field of computing is IT software, that is, application programming. In the hands of a skilled programmer, an assembly program runs somewhat more efficiently than one written in another language, but at the cost of much more time spent on optimization; without a solid grounding in computer principles and programming, assembly only makes development harder, and the effort is not worth it. By around 2010 the software industry had become thoroughly market-driven. Given the quality and cross-platform support of high-level languages, no company can have a team write everything in assembly and spend several times, or even dozens of times, as long. It is better to use another language: as long as the result is not much worse than the assembly version, the product ships first. This is the inevitable outcome of a market economy.
Even so, no programmer has yet dared to assert that assembly language need not be learned. Some highly skilled assembly programmers have left software development and moved into industrial electronics programming. In fields where the functionality is small but the hardware places strict demands on the language, such as 4-bit microcontrollers with their limited capacity and computing power, electronics engineers are typically responsible for everything from circuit design to control software; their main development language is assembly, with C used only rarely. Such engineers are hard to find: in some industrial companies a core electronics engineer is paid more than any other employee, and the entry claims that electronics engineers in general earn more than ten times what programmers do. The reason given is that, since the start of the 21st century, many people have studied assembly but few have truly mastered it. Compared with high-level languages it is harder to learn, harder to use, and narrower in application; it is simple, but almost too flexible. People who learned a high-level language first find assembly much harder than those who started with it, whereas people who know assembly pick up high-level languages easily: moving from the complex to the simple is easy, and moving from the simple to the complex is hard. For a programmer who wants a complete understanding of microcomputer principles, assembly language is required learning.
Practical Applications
With the increasing complexity of modern software systems, a large number of newly emerged high - level languages such as C/C++, Pascal/Object Pascal have also emerged. These new languages enable programmers to be simpler and more efficient in the development process, and enable software developers to meet the requirements of rapid software development. And the complexity of assembly language makes its applicable fields gradually decrease. But this does not mean that assembly has no place. Because assembly is closer to machine language and can directly operate on hardware, the generated programs have higher running speed and occupy smaller memory than other languages. Therefore, it is widely used in some programs with high time - sensitivity requirements, the core modules of many large programs, and industrial control.
In addition, although there are many programming languages to choose from, assembly is still a required course for students majoring in computer science in various universities to enable students to deeply understand the operation principle of computers.
Historically, assembly language was once one of the very popular program design languages. With the growth of software scale and the subsequent requirements for software development progress and efficiency, high - level languages gradually replaced assembly language. But even so, high - level languages cannot completely replace the role of assembly language. Take the Linux kernel as an example. Although most of the code is written in C language, it is still inevitable to use assembly code in some key places. Because this part of the code is very closely related to hardware, even C language will be powerless, and assembly language can give full play to its advantages and maximize the performance of hardware.
First, most statements in assembly language correspond directly to machine instructions, so programs execute quickly and efficiently and the code is small. This is useful where memory is limited but a fast, real-time response is required, for example in instruments and meters and in industrial control equipment.
Second, assembly language can be used in the core parts of system programs and in the parts that interact frequently with the system hardware. Examples include the core segments of an operating system, the initialization code for I/O interface circuits, low-level drivers for external devices, frequently called subroutines, dynamic link libraries, some advanced drawing programs, and video games.
Third, assembly language can be used for software encryption and decryption, for the analysis and prevention of computer viruses, and for program debugging and error analysis.
Finally, learning assembly language deepens one's understanding of courses such as computer principles and operating systems. By learning and using assembly language, one can perceive, experience, and understand the logical functions of the machine. Looking upward, this lays a theoretical foundation for understanding the principles of various software systems; looking downward, it lays a practical foundation for mastering the principles of the hardware system.
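The first and third points above can be made concrete with a small sketch. Python is used here only as convenient notation and is not part of the original entry. The sketch assumes the 16-bit 8086 encoding of a register-to-register mov (opcode 0x89 followed by a ModRM byte) and reproduces the mov ax,bx example given earlier in this entry. The names REG16, encode_mov_reg16 and decode_mov_reg16 are hypothetical helpers; a real assembler handles far more instruction forms and may pick a different, equally valid encoding.

```python
# Hypothetical helper table: 8086 register numbers for the 16-bit registers.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def encode_mov_reg16(dest, src):
    """Assemble `mov dest,src` (16-bit registers only) into two bytes."""
    opcode = 0x89  # MOV r/m16, r16
    # ModRM byte: mod=11 (both operands are registers), reg=source, rm=destination.
    modrm = (0b11 << 6) | (REG16.index(src) << 3) | REG16.index(dest)
    return bytes([opcode, modrm])


def decode_mov_reg16(code):
    """Disassemble the two-byte form produced above back into text."""
    if len(code) != 2 or code[0] != 0x89 or code[1] >> 6 != 0b11:
        raise ValueError("only the register-to-register 0x89 form is handled")
    src = REG16[(code[1] >> 3) & 0b111]
    dest = REG16[code[1] & 0b111]
    return f"mov {dest},{src}"


machine_code = encode_mov_reg16("ax", "bx")
bits = "".join(f"{byte:08b}" for byte in machine_code)
print(bits)                            # 1000100111011000
print(decode_mov_reg16(machine_code))  # mov ax,bx
```

The encoding direction shows one assembly statement becoming exactly one machine instruction. The decoding direction is, in miniature, what a disassembler does when it is used for debugging or for analysing a virus sample: it turns raw bytes back into readable instructions.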
Classic Textbooks
There are many textbooks on assembly language, covering a wide range of processors; there are probably no fewer than 100 of them. The more commonly used ones are listed below by category:
x86 Processors
1. "x86 Assembly Language: From Real Mode to Protected Mode", written by Li Zhong, published by Publishing House of Electronics Industry, 2013-1.
Based on the Intel x86 processor, the NASM assembler, and the Bochs virtual machine. Assembly language is the language of the processor. In that sense, learning assembly language should mean programming the hardware directly rather than relying on opaque DOS interrupts and API calls. This is an interesting book. It does not spend pages on working through dull math problems. Instead, it teaches you how to control hardware directly, such as displaying characters, reading data from a hard disk, and driving other hardware without the support of BIOS, DOS, Windows, Linux, or any other software.
Today 32-bit and 64-bit systems are the mainstream, and real mode and the DOS operating system have become history. Linux and Windows both run in protected mode. This book goes from real mode to 32-bit protected mode, with the emphasis on 32-bit protected mode. Reading it is a great help in understanding how modern computers and modern operating systems work.
2. "Assembly Language" (2nd Edition), written by Wang Shuang, published by Tsinghua University Press, 2013-4-1
An assembly textbook based on the Intel 8086 processor, the MASM assembler, and the DOS platform. It focuses entirely on the real mode of the 8086 processor and does not cover the common 32-bit and 64-bit modes. Because it is easy to understand, however, reader feedback has been very good.
3. "80X86 Assembly Language Programming Tutorial", compiled by Yang Jiwen et al., published by Tsinghua University Press, 1999-3-1
Based on the Intel x86 processor and the MASM and TASM assemblers. It covers both 16-bit real mode and 32-bit protected mode, describing the latter in more detail.
4. "32-Bit Assembly Language Programming", compiled by Qian Xiaojie, published by China Machine Press, 2011-8-1
An assembly textbook based on the Intel x86 processor, the MASM assembler, and the Windows platform.
5. "16/32-Bit Microcomputer Principle, Assembly Language and Interface Technology", compiled by Qian Xiaojie and Chen Tao, published by China Machine Press, 2005-2-1
Based on the Intel x86 processor. It discusses the basic principles, assembly language, and interface technology of 16-bit microcomputers, and introduces the related technologies of 32-bit microcomputer systems.
6. "Intel Assembly Language Programming" (5th Edition), written by (US) Irvine, published by Publishing House of Electronics Industry, 2012-7-1
An assembly textbook based on the Intel x86 processor, the MASM assembler, and the DOS/Windows platforms. It covers both 16-bit real mode and 32-bit protected mode.
7. "The Art of Assembly Language Programming" (2nd Edition), written by (US) Hyde, published by Tsinghua University Press, 2011-12-1
Based on the Intel x86 processor. It uses the author's own High Level Assembler (HLA) as a teaching tool, which brings some of the advantages and features of high-level languages to assembly.
8. "x86 PC Assembly Language, Design and Interface" (5th Edition), written by (US) Mazidi and Causey, published by Publishing House of Electronics Industry, 2011-1-1
Based on the Intel x86 processor. It covers both 16-bit real mode and 32-bit protected mode, and also gives some introduction to 64-bit.
ARM and Microcontrollers
1. "Assembly Language Programming -- Based on ARM Architecture" (2nd Edition), edited by Wen Quangang et al., published by Beijing University of Aeronautics and Astronautics Press, 2010-8-1
Based on processors of the ARM architecture. It is an introductory textbook for learning embedded technology.
2. "Learn AVR Microcontroller from Scratch", compiled by Xu Yimin et al., published by China Machine Press, 2011-1-1
It covers an overview of microcontrollers, AVR development tools, C for AVR microcontrollers, the basic structure of the ATmega16 microcontroller, the AVR instruction set and assembly system, and more.
3. "Practical Tutorial on 51 Microcontroller Simulation Based on Multisim 10", edited by Nie Dian and Ding Wei, published by Publishing House of Electronics Industry, 2010-2-1
It expounds various main functions of NI Multisim 10 in microcontroller simulation.
4. "PIC18 Microcontroller: Architecture, Programming and Interface Design", written by (US) Brey, published by Tsinghua University Press, 2009-4-1
Microcontrollers are widely used in many fields such as automobiles, home appliances, industrial control, and medical equipment. This book takes the PIC18 series of microcontrollers from Microchip as an example and explains comprehensively how to program microcontrollers in C and assembly language.
5. "CASL Assembly Language Programming", compiled by Zhao Lihui, published by China Electric Power Press, 2002-10-1
CASL assembly language is required content for the senior programmer level of China's computer software professional qualification and level examination. This book is a monograph on CASL assembly language programming.
-----------------------------------------------------------------------
http://vdisk.weibo.com/s/BJ8eVgUUujf2u
[ Last edited by zzz19760225 on 2017-11-11 at 19:03 ]