China DOS Union

-- Unite DOS · Advance DOS · Grow DOS --

Union site: www.cn-dos.net Forum site: www.cn-dos.net/forum
DOS stands for freedom, openness and progress. Let us work hard, learn from the openness and GNU spirit of FreeDOS and Linux, and together build and grow a free GNU GPL world!

China DOS Union Forum » Blog » Conversion between number systems
Floor 46 Posted 2016-06-26 19:41 · Haikou, Hainan, China (China Telecom)
Super Moderator
★★★★
Credits 3,673
Posts 2,020
Joined 2016-02-01 00:00
10-year member
UID 181465
Gender Male
Status Offline
Instruction Set
An instruction set is the collection of operations that a CPU implements in hardware; extended instruction sets add operations that let the CPU handle particular workloads more efficiently. Intel's x86 family includes x86, x86-64, MMX, SSE, SSE2, SSE3, SSSE3 (Supplemental SSE3), SSE4.1 and SSE4.2, with EM64T as its 64-bit extension. AMD's main extension is 3DNow!; its enhanced version extends the original set to 52 instructions, including some SSE instructions, and is used mainly in newer AMD CPUs.
Chinese Name: Instruction Set
English Name: Instruction set
Target Object: New AMD CPUs
Meaning: The instruction set is stored inside the CPU
Classification: SSE instruction set
Number of Instructions: 52
Table of Contents
1 Introduction
2 Others
1 Introduction
SSE Instruction Set

Streaming SIMD Extensions

Because MMX did little to improve 3D game performance, Intel introduced the Streaming SIMD Extensions (SSE) with the Pentium III in 1999. SSE is compatible with MMX. Using SIMD (Single Instruction, Multiple Data), it operates on several floating-point values in parallel with a single instruction, which can substantially speed up floating-point work.

MMX borrows the eight registers of the x87 floating-point unit, so MMX code and floating-point code cannot use them at the same time, which slows floating-point work. With SSE, Intel added eight dedicated 128-bit registers to the Pentium III. Because these registers are separate, SSE instructions can run at full speed in parallel with floating-point operations.

SSE2 Instruction Set

With the Pentium 4, Intel introduced SSE2, which has 144 instructions in several groups: floating-point SIMD instructions, integer SIMD instructions, conversions between SIMD floating-point and integer data, and conversions of data in MMX registers. Its most important improvements are new data formats, such as 128-bit SIMD integer operations and 64-bit double-precision floating-point operations. To make better use of the cache, the Pentium 4 also adds a few cache-control instructions that let programmers control which data is cached.

SSE3 Instruction Set

Compared with SSE2, SSE3 adds 13 instructions, originally known as PNI (Prescott New Instructions). One is used for video decoding, two for thread synchronization, and the rest for complex arithmetic, floating-point-to-integer conversion, and SIMD floating-point operations.

SSE4 Instruction Set

SSE4 adds 54 performance-oriented instructions (47 in SSE4.1 and 7 in SSE4.2). They help with compiler-generated code, media processing, character and text processing, and application-targeted acceleration.

The SSE4 instruction set will be part of Intel's future "Significant Video Enhancement" platform. Other video enhancement functions of this platform also include Clear Video Technology (CVT) and Unified Display Interface (UDI) support, etc. Among them, the former is a response to ATi AVIVO technology, supporting advanced decoding, post-processing, and enhanced 3D functions.

2 Others
3D Now! Extended Instruction Set

3DNow! is a multimedia extension instruction set that AMD introduced in 1998, with 21 instructions. It addresses the fact that MMX does nothing to enhance floating-point processing, and mainly improves the 3D graphics performance of AMD's K6-series CPUs. Because the set is small, it is used mainly in 3D games and offers little support for other commercial graphics applications.

X86 Instruction Set

To understand what an instruction set is, start with today's x86 CPUs. Intel developed the x86 instruction set for its first 16-bit CPU, the i8086. The CPU in the first IBM PC, launched in 1981, was the i8088 (a simplified i8086), which also used x86 instructions. The x87 series of math coprocessors, added to improve floating-point performance, used a separate x87 instruction set; later, the x86 and x87 instruction sets came to be referred to together as the x86 instruction set. As CPU technology advanced, Intel developed the newer i80386, i80486, and their successors, but kept the x86 instruction set so that computers could continue to run existing applications and preserve the large body of existing software. Intel's CPUs therefore still belong to the x86 series, and because Intel's x86 CPUs and compatible CPUs from other vendors all use this instruction set, they form today's large x86-compatible lineup.

EM64T Instruction Set

Intel's EM64T (Extended Memory 64 Technology) is a 64-bit memory extension technology. It gives server and workstation applications a larger memory address space and more flexibility, which particularly benefits demanding software such as audio and video editing, CAD, and games.

64-bit x86 was first introduced by AMD; EM64T is Intel's own name for its implementation, corresponding to AMD's 64-bit technology.

RISC Instruction Set

RISC (Reduced Instruction Set Computing) is regarded as the development direction for future high-performance CPUs. It contrasts with traditional CISC (Complex Instruction Set Computing): RISC instructions have a uniform format, there are relatively few instruction types, and there are fewer addressing modes. Architectures that use RISC instruction sets include ARM and MIPS.

3DNow!+ Instruction Set

3DNow!+ extends the original 3DNow! instruction set to 52 instructions, including some SSE instructions, and is used mainly in newer AMD CPUs.

[ Last edited by zzz19760225 on 2016-12-12 at 14:50 ]
1<词>,2,3/段\,4{节},5(章)。
Floor 47 Posted 2016-06-26 19:42 · Haikou, Hainan, China (China Telecom)
MMX Instruction Set
The MMX instruction set contains 57 multimedia instructions that process several data elements at once. With saturating arithmetic, a result that exceeds the representable range is clamped to the limit instead of wrapping, so processing continues normally. With suitable software support, this yields higher performance. One benefit of MMX is that existing operating systems can run MMX programs without modification. The drawback is equally clear: MMX instructions and x87 floating-point instructions cannot execute at the same time, so the processor must switch between the two, and frequent switching degrades overall system performance.

http://blog.csdn.net/arau_sh/article/details/7575043
http://blog.csdn.net/arau_sh/article/details/7575059
http://blog.csdn.net/arau_sh/article/details/7575066


MMX Instruction Set (Detailed Explanation)
Tags: Compiler, Storage, Intel, FPC, Extension
Posted 2012-05-17 09:44 · Category: Asm
Transferred from http://blog.csdn.net/dahan_wangtao/article/details/1944153

EMMS Empty MMX State:
Sets the x87 FPU tag word to empty (all 1s) so that subsequent floating-point instructions can use the floating-point registers; all other MMX instructions automatically set the tag word to valid (all 0s). Use this instruction at the end of every MMX routine, and before calling any routine that may contain floating-point instructions, to clear the MMX state.
MOVD mm, r/m32
MOVD r/m32, mm Transfer 32-bit Data:
Moves 32-bit data from an integer register or memory to an MMX register, or the reverse. MOVD cannot move data between two MMX registers, between two memory locations, or between two integer registers. When the destination operand is an MMX register, the 32-bit source operand is written to the low 32 bits of the destination and the register is zero-extended to 64 bits. When the source operand is an MMX register, its low 32 bits are written to the destination operand.
MOVQ mm, mm/m64
MOVQ mm/m64, mm Transfer 64-bit Data:
Moves 64-bit data between MMX registers, or between an MMX register and memory. The destination and source operands can be MMX registers or 64-bit memory operands, but MOVQ cannot move data from memory to memory.
PACKSSWB mm, mm/m64
PACKSSDW mm, mm/m64
Signed Saturation Data Packing:
Pack the signed word groups in the MMX register and the MMX register/memory unit into the signed byte groups in the MMX register. And pack the signed double word groups in the MMX register and the MMX register/memory unit into the signed word groups in the MMX register. (Note 1)
PACKUSWB mm, mm/m64 Unsigned Saturation Data Packing
Pack the signed word groups in the MMX register and the MMX register/memory unit into the unsigned byte groups in the MMX register. (Note 1)
PADDB mm, mm/m64
PADDW mm, mm/m64
PADDD mm, mm/m64
Wrapping Mode Data Group Addition:
Add the byte groups (word groups, double word groups) in the MMX register/memory unit to the MMX register in wrapping mode. (Note 1)
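A minimal Python sketch of the wrapping addition described above (an illustration added here, not part of the original reference). It models a 64-bit MMX register as a plain integer; the helper names `split_lanes`, `join_lanes` and `padd_wrap` are made up for this example.

```python
def split_lanes(value, lane_bits, total_bits=64):
    """Split a register image into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, total_bits, lane_bits)]


def join_lanes(lanes, lane_bits):
    """Inverse of split_lanes."""
    value = 0
    for index, lane in enumerate(lanes):
        value |= lane << (index * lane_bits)
    return value


def padd_wrap(dest, src, lane_bits):
    """Model PADDB/PADDW/PADDD for lane_bits = 8/16/32 (hypothetical helper)."""
    mask = (1 << lane_bits) - 1
    sums = [(a + b) & mask
            for a, b in zip(split_lanes(dest, lane_bits), split_lanes(src, lane_bits))]
    return join_lanes(sums, lane_bits)


# Lowest word: 0xFFFF + 0x0001 wraps to 0x0000 and the carry is discarded.
# Next word: 0x0001 + 0x0001 = 0x0002, unaffected by its neighbour.
print(hex(padd_wrap(0x0001FFFF, 0x00010001, 16)))  # 0x20000
```

The point of the sketch is that a carry out of one lane never reaches the next lane, which is what distinguishes a packed add from an ordinary 64-bit add.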
PADDSB mm, mm/m64
PADDSW mm, mm/m64
Signed Saturation Data Group Addition:
Add the signed byte groups (word groups) in the MMX register/memory unit to the signed byte groups (word groups) data in the MMX register in saturation mode. (Note 1)
PADDUSB mm, mm/m64
PADDUSW mm, mm/m64
Unsigned Saturation Data Group Addition:
Add the unsigned byte groups (word groups) in the MMX register/memory unit to the unsigned byte groups (word groups) data in the MMX register in saturation mode. (Note 1)
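For comparison with wrapping addition, here is a small Python sketch of saturation for a single lane (an added illustration; the function names are assumptions for this example, not part of any library). Inputs are raw lane bit patterns and are assumed to already fit in `bits` bits.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def add_signed_sat(a, b, bits):
    """One lane of PADDSB/PADDSW: clamp to the signed range."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    total = to_signed(a, bits) + to_signed(b, bits)
    return max(lo, min(hi, total)) & ((1 << bits) - 1)


def add_unsigned_sat(a, b, bits):
    """One lane of PADDUSB/PADDUSW: clamp to the unsigned maximum."""
    return min(a + b, (1 << bits) - 1)


print(hex(add_signed_sat(0x7F, 0x01, 8)))    # 0x7f: 127 + 1 stays at 127
print(hex(add_signed_sat(0x80, 0xFF, 8)))    # 0x80: -128 + -1 stays at -128
print(hex(add_unsigned_sat(0xF0, 0x20, 8)))  # 0xff: 240 + 32 stays at 255
```

Saturation is usually what image and audio code wants: a pixel that overflows should stay white rather than wrap around to black.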
PAND mm, mm/m64
Bitwise Logical AND:
Perform an AND operation on the 64-bit data in the MMX register/memory unit, and store the result in the MMX register.
PANDN mm, mm/m64
Bitwise Logical AND NOT:
Invert the 64-bit value in the MMX register, then perform an AND operation on the inverted MMX register and the 64-bit data in the MMX register/memory unit, and store the result in the MMX register.
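A one-function Python sketch (added for illustration; `pandn` here is a made-up helper name) showing which operand PANDN inverts. It is the destination that is inverted, not the source, which is easy to get backwards.

```python
MASK64 = (1 << 64) - 1


def pandn(dest, src):
    """Model PANDN: result = (NOT dest) AND src, on 64-bit values."""
    return (~dest & MASK64) & src


# The destination acts as a "clear" mask: wherever it has 1 bits, the result is 0.
print(hex(pandn(0x00FF00FF00FF00FF, 0x1234567890ABCDEF)))  # 0x120056009000cd00
```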
PCMPEQB mm, mm/m64
PCMPEQW mm, mm/m64
PCMPEQD mm, mm/m64
Group Data Equality Comparison:
Compare the byte groups (word groups, double word groups) data in the MMX register with the MMX register/memory unit.
This instruction compares the corresponding data elements of the destination operand and the source operand. If they are equal, the corresponding data element of the destination register is set to all 1s; otherwise, it is set to all 0s.
eg: PCMPEQW mm, mm/m64
mm ? ? 0000000000000111 0111000111000111
mm/m64 ? ? 1111111000001100 0111000111000111
Result mm ? ? 0000000000000000 1111111111111111
PCMPGTB mm, mm/m64
PCMPGTW mm, mm/m64
PCMPGTD mm, mm/m64
Group Data Greater Than Comparison:
Compare the byte groups (word groups, double word groups) data in the MMX register with the MMX register/memory unit.
This instruction compares the corresponding data elements of the destination operand and the source operand. If greater, the corresponding data element of the destination register is set to all 1s; otherwise, it is set to all 0s. (Refer to the previous instruction)
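The two comparison entries above can be sketched in Python as follows (an added illustration; lanes are passed as lists of raw bit patterns, lowest lane first, and the function names are assumptions). Note that the greater-than comparison is signed.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def pcmpeq(dest_lanes, src_lanes, bits):
    """Model PCMPEQB/W/D: all 1s where lanes are equal, else all 0s."""
    ones = (1 << bits) - 1
    return [ones if d == s else 0 for d, s in zip(dest_lanes, src_lanes)]


def pcmpgt(dest_lanes, src_lanes, bits):
    """Model PCMPGTB/W/D: all 1s where dest > src as signed values."""
    ones = (1 << bits) - 1
    return [ones if to_signed(d, bits) > to_signed(s, bits) else 0
            for d, s in zip(dest_lanes, src_lanes)]


print([hex(x) for x in pcmpeq([0x71C7, 0x0007], [0x71C7, 0xFE0C], 16)])
print([hex(x) for x in pcmpgt([0x0001, 0x8000], [0xFFFF, 0x7FFF], 16)])
```

Because each result lane is either all 1s or all 0s, it can be used directly as a mask with PAND, PANDN and POR to select between two values without branching.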
PMADDWD mm, mm/m64 Multiply and Add of Data Groups (Word Groups):
Multiplies the signed words in the MMX register by the corresponding signed words in the MMX register/memory operand, then adds adjacent pairs of 32-bit products and stores the sums as double words in the MMX register.
eg: PMADDWD mm, mm/m64
mm ? ? 0111000111000111 0111000111000111
Operation * * * *
mm/m64 ? ? 1000000000000000 0000010000000000
Operation /_____+____/ /______+_____/
mm ? ? 1100100011100011 1001110000000000
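The worked example above can be checked with a short Python sketch (added for illustration; `pmaddwd` is a hypothetical helper that takes lists of 16-bit lanes, lowest lane first). The words are treated as signed, so 1000000000000000 is -32768.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def pmaddwd(dest_words, src_words):
    """Model PMADDWD: signed 16x16 products, adjacent pairs summed to 32 bits."""
    products = [to_signed(d, 16) * to_signed(s, 16)
                for d, s in zip(dest_words, src_words)]
    return [(products[i] + products[i + 1]) & 0xFFFFFFFF
            for i in range(0, len(products), 2)]


# Low two words of the example: 0x71C7 * 0x0400 + 0x71C7 * (-0x8000)
result = pmaddwd([0x71C7, 0x71C7], [0x0400, 0x8000])
print(format(result[0], "032b"))  # 11001000111000111001110000000000
```

The printed 32-bit sum matches the two 16-bit halves shown in the last row of the example.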
PMULHW mm, mm/m64
Multiply and Take High Order of Group Data (Word Groups):
Multiply the signed word group data in the MMX register with the MMX register/memory unit, then store the high 16 bits of the result in the MMX register.
eg: PMULHW mm, mm/m64
mm ? ? 0111000111000111 0111000111000111
Operation * * * *
mm/m64 ? ? 1000000000000000 0000010000000000
Operation High Order High Order High Order High Order
mm ? ? 1100011100011100 0000000111000111
PMULLW mm, mm/m64
Multiply and Take Low Order of Group Data (Word Groups):
Multiply the signed word group data in the MMX register with the MMX register/memory unit, then store the low 16 bits of the result in the MMX register. (Refer to the previous instruction)
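A per-lane Python sketch of the two multiply entries above (added for illustration; function names are assumptions). It reproduces the PMULHW example values and shows what PMULLW keeps from the same products.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def pmulhw(d, s):
    """One lane of PMULHW: high 16 bits of the signed 32-bit product."""
    return ((to_signed(d, 16) * to_signed(s, 16)) >> 16) & 0xFFFF


def pmullw(d, s):
    """One lane of PMULLW: low 16 bits of the product."""
    return (d * s) & 0xFFFF


print(format(pmulhw(0x71C7, 0x8000), "016b"))  # 1100011100011100
print(format(pmulhw(0x71C7, 0x0400), "016b"))  # 0000000111000111
print(hex(pmullw(0x71C7, 0x0400)))             # 0x1c00
```

Running PMULHW and PMULLW on the same inputs and interleaving the results with the unpack instructions is the usual way to obtain full 32-bit products from MMX.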
POR mm, mm/m64 Bitwise Logical OR:
Perform an OR operation on the 64-bit data in the MMX register/memory unit, and store the result in the MMX register.
PSLLW mm, mm/m64
PSLLD mm, mm/m64
PSLLQ mm, mm/m64
PSLLW mm, imm8
PSLLD mm, imm8
PSLLQ mm, imm8
Logical Left Shift of Group Data:
Left shift the word (double word, quad word) data in the MMX register by the number of bits specified by the MMX register/memory unit, and 0s are shifted into the lower bits.
Left shift the word (double word, quad word) data in the MMX register by the number of bits specified by the 8-bit immediate number, and 0s are shifted into the lower bits.
PSRAW mm, mm/m64
PSRAD mm, mm/m64
PSRAW mm, imm8
PSRAD mm, imm8 Arithmetic Right Shift of Group Data:
Right shift the word (double word) data in the MMX register by the number of bits specified by the MMX register/memory unit, and the sign bit is maintained during the shift.
Right shift the word (double word) data in the MMX register by the number of bits specified by the 8-bit immediate number, and the sign bit is maintained during the shift.
PSRLW mm, mm/m64
PSRLD mm, mm/m64
PSRLQ mm, mm/m64
PSRLW mm, imm8
PSRLD mm, imm8
PSRLQ mm, imm8 Logical Right Shift of Group Data:
Right shift the word (double word, quad word) data in the MMX register by the number of bits specified by the MMX register/memory unit, and 0s are shifted into the vacated high-order bits.
Right shift the word (double word, quad word) data in the MMX register by the number of bits specified by the 8-bit immediate number, and 0s are shifted into the vacated high-order bits.
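The three shift families differ only in what fills the vacated bits. A per-lane Python sketch for 16-bit words (added for illustration; helper names are assumptions), including the behavior when the count exceeds 15:

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def psllw(word, count):
    """Logical left shift of one word; counts above 15 give 0."""
    return 0 if count > 15 else (word << count) & 0xFFFF


def psrlw(word, count):
    """Logical right shift of one word; counts above 15 give 0."""
    return 0 if count > 15 else word >> count


def psraw(word, count):
    """Arithmetic right shift; counts above 15 fill the word with the sign bit."""
    return (to_signed(word, 16) >> min(count, 15)) & 0xFFFF


print(hex(psrlw(0x8000, 4)))  # 0x800: zeros shifted in
print(hex(psraw(0x8000, 4)))  # 0xf800: sign bit replicated
```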
PSUBB mm, mm/m64
PSUBW mm, mm/m64
PSUBD mm, mm/m64 Wrapping Mode Group Data Subtraction:
Subtracts the bytes (words, double words) in the MMX register/memory unit from the corresponding bytes (words, double words) in the MMX register, with wraparound. (Note 1)
PSUBSB mm, mm/m64
PSUBSW mm, mm/m64
Signed Saturation Group Data Subtraction:
Subtracts the signed bytes (words) in the MMX register/memory unit from the signed bytes (words) in the MMX register, with signed saturation. (Note 1)
PSUBUSB mm, mm/m64
PSUBUSW mm, mm/m64 Unsigned Saturation Group Data Subtraction:
Subtracts the unsigned bytes (words) in the MMX register/memory unit from the unsigned bytes (words) in the MMX register, with unsigned saturation. (Note 1)
PUNPCKHBW mm, mm/m64
PUNPCKHWD mm, mm/m64
PUNPCKHDQ mm, mm/m64 High Order Group Data Unpacking:
This instruction interleaves the high-order data elements of the destination operand and the source operand and writes them to the destination operand; the low-order data elements are ignored.
eg: PUNPCKHBW mm, mm/m64

PUNPCKLBW mm, mm/m64
PUNPCKLWD mm, mm/m64
PUNPCKLDQ mm, mm/m64 Low Order Group Data Unpacking:
This instruction alternately takes out the low half parts of the data elements of the source operand and the destination operand and writes them into the destination operand, and the high half parts of the data elements are ignored. (Refer to the previous instruction)
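Since the figure for the unpack example did not survive, here is a small Python sketch of the interleaving (added for illustration; `unpack_interleave` is a hypothetical helper working on lists of lanes, lowest lane first). Each output pair takes the destination element first, then the source element.

```python
def unpack_interleave(dest_lanes, src_lanes, high):
    """Model PUNPCKH* (high=True) or PUNPCKL* (high=False) on equal-length lane lists."""
    half = len(dest_lanes) // 2
    start = half if high else 0
    out = []
    for i in range(start, start + half):
        out += [dest_lanes[i], src_lanes[i]]
    return out


dest = [0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17]  # bytes 0..7 of mm
src = [0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27]   # bytes 0..7 of mm/m64

print([hex(b) for b in unpack_interleave(dest, src, high=True)])   # PUNPCKHBW
print([hex(b) for b in unpack_interleave(dest, src, high=False)])  # PUNPCKLBW
```

A common use is to pass an all-zero source: PUNPCKLBW then zero-extends the four low bytes of the destination into four words.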
PXOR mm, mm/m64 Bitwise Logical XOR:
Perform an XOR operation on the 64-bit data in the MMX register/memory unit, and store the result in the MMX register.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Supplementary:
PMULHUW - Packed Unsigned Multiplication Take High Order
Opcode         Instruction                 Description
0F E4 /r       PMULHUW mm1, mm2/m64        Multiply the packed unsigned word integers in mm1 and mm2/m64, and store the high 16 bits of each result in mm1.
66 0F E4 /r    PMULHUW xmm1, xmm2/m128     Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of each result in xmm1.
Description
Perform SIMD multiplication on the packed unsigned word integers in the destination operand (the first operand) and the source operand (the second operand), and store the high 16 bits of each 32-bit intermediate result into the destination operand. (Figure 3-7 shows the situation when using 64-bit operands for this operation). The source operand can be an MMX™ technology register or a 64-bit memory location, or an XMM register or a 128-bit memory location. The destination operand can be an MMX or XMM register.


PMULHUW and PMULHW Instruction Operations
Operation
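PMULHUW instruction with 64-bit operands:
TEMP0[31:0] ← DEST[15:0] * SRC[15:0]; (* Unsigned multiplication *)
TEMP1[31:0] ← DEST[31:16] * SRC[31:16];
TEMP2[31:0] ← DEST[47:32] * SRC[47:32];
TEMP3[31:0] ← DEST[63:48] * SRC[63:48];
DEST[15:0] ← TEMP0[31:16];
DEST[31:16] ← TEMP1[31:16];
DEST[47:32] ← TEMP2[31:16];
DEST[63:48] ← TEMP3[31:16];

PMULHUW instruction with 128-bit operands:
TEMP0[31:0] ← DEST[15:0] * SRC[15:0]; (* Unsigned multiplication *)
TEMP1[31:0] ← DEST[31:16] * SRC[31:16];
TEMP2[31:0] ← DEST[47:32] * SRC[47:32];
TEMP3[31:0] ← DEST[63:48] * SRC[63:48];
TEMP4[31:0] ← DEST[79:64] * SRC[79:64];
TEMP5[31:0] ← DEST[95:80] * SRC[95:80];
TEMP6[31:0] ← DEST[111:96] * SRC[111:96];
TEMP7[31:0] ← DEST[127:112] * SRC[127:112];
DEST[15:0] ← TEMP0[31:16];
DEST[31:16] ← TEMP1[31:16];
DEST[47:32] ← TEMP2[31:16];
DEST[63:48] ← TEMP3[31:16];
DEST[79:64] ← TEMP4[31:16];
DEST[95:80] ← TEMP5[31:16];
DEST[111:96] ← TEMP6[31:16];
DEST[127:112] ← TEMP7[31:16];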
Intel® C++ Compiler Equivalent Intrinsics
PMULHUW __m64 _mm_mulhi_pu16(__m64 a, __m64 b)
PMULHUW __m128i _mm_mulhi_epu16 ( __m128i a, __m128i b)
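To make the difference between PMULHUW and PMULHW concrete, here is a Python sketch of the operation on lists of 16-bit lanes (added for illustration; the function names are assumptions and this is not the intrinsic itself). The same bit patterns give different high words depending on whether they are read as unsigned or signed.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def pmulhuw(dest_words, src_words):
    """High 16 bits of each unsigned 16x16 product."""
    return [(d * s) >> 16 for d, s in zip(dest_words, src_words)]


def pmulhw(dest_words, src_words):
    """High 16 bits of each signed 16x16 product."""
    return [((to_signed(d, 16) * to_signed(s, 16)) >> 16) & 0xFFFF
            for d, s in zip(dest_words, src_words)]


dest = [0xFFFF, 0xFFFF, 0x71C7, 0x0002]
src = [0xFFFF, 0x0002, 0x0400, 0x0003]
print([hex(x) for x in pmulhuw(dest, src)])  # unsigned: 0xFFFF is 65535
print([hex(x) for x in pmulhw(dest, src)])   # signed:   0xFFFF is -1
```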
Affected Flags
None.

Protected Mode Exceptions
#GP(0) - If the effective address of the memory operand is outside the CS, DS, ES, FS, or GS segment limit. (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment.

#SS(0) - If the effective address of the memory operand is beyond the segment limit of SS.

#UD - If EM in CR0 is set to 1. (128-bit operations only) If OSFXSR in CR4 is 0. (128-bit operations only) If the CPUID feature flag SSE2 is 0.

#NM - If TS in CR0 is set to 1.

#MF (Only for 64-bit operations) - If there is a pending x87 FPU exception.

#PF(error code) - If a page fault occurs.

#AC(0) (Only for 64-bit operations) - If alignment checking is enabled and an unaligned memory reference is made when the current privilege level is 3.

Real Address Mode Exceptions
#GP(0) - (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH.

#UD - If EM in CR0 is set to 1. (128-bit operations only) If OSFXSR in CR4 is 0. (128-bit operations only) If the CPUID feature flag SSE2 is 0.

#NM - If TS in CR0 is set to 1.

#MF (Only for 64-bit operations) - If there is a pending x87 FPU exception.

Virtual 8086 Mode Exceptions
Same as exceptions in "Real Address Mode".

#PF(error code) - Page fault.

#AC(0) (Only for 64-bit operations) - If an unaligned memory reference is made when alignment checking is enabled.

Numerical Exceptions
None.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PACKUSWB - Unsigned Saturation Packing
Opcode         Instruction                 Description
0F 67 /r       PACKUSWB mm, mm/m64         Pack 4 signed words from mm and 4 signed words from mm/m64 into 8 unsigned bytes using saturation, and put the result in mm.
66 0F 67 /r    PACKUSWB xmm1, xmm2/m128    Pack the signed words in xmm1 and xmm2/m128 into unsigned bytes using saturation, and put the result in xmm1.
Description
Packs 4 signed words in the destination operand (the first operand) and 4 signed words in the source operand (the second operand) into 8 unsigned bytes using saturation, and puts the result in the destination operand. (Refer to "Figure 3-5"). If the signed value of a word is outside the range of an unsigned byte (that is, greater than FFH or less than 00H), the saturated byte value FFH or 00H, respectively, is stored in the destination operand.

With 64-bit operands, the destination operand must be an MMX™ technology register; the source operand can be an MMX register or a quad-word memory location.

With 128-bit operands, the instruction packs 8 signed words in the destination operand xmm1 and 8 signed words in the source operand xmm2/m128 into 16 unsigned bytes, and puts the result in the destination register xmm1. If the signed value of a word is outside the range of an unsigned byte, it is saturated (FFH for overflow, 00H for underflow). The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory operand.


Figure 3-5. Operation of PACKUSWB Instruction
Operation
PACKUSWB instruction with 64-bit operands:
DEST[7:0] ← SaturateSignedWordToUnsignedByte DEST[15:0];
DEST[15:8] ← SaturateSignedWordToUnsignedByte DEST[31:16];
DEST[23:16] ← SaturateSignedWordToUnsignedByte DEST[47:32];
DEST[31:24] ← SaturateSignedWordToUnsignedByte DEST[63:48];
DEST[39:32] ← SaturateSignedWordToUnsignedByte SRC[15:0];
DEST[47:40] ← SaturateSignedWordToUnsignedByte SRC[31:16];
DEST[55:48] ← SaturateSignedWordToUnsignedByte SRC[47:32];
DEST[63:56] ← SaturateSignedWordToUnsignedByte SRC[63:48];

PACKUSWB instruction with 128-bit operands:
DEST[7:0] ← SaturateSignedWordToUnsignedByte (DEST[15:0]);
DEST[15:8] ← SaturateSignedWordToUnsignedByte (DEST[31:16]);
DEST[23:16] ← SaturateSignedWordToUnsignedByte (DEST[47:32]);
DEST[31:24] ← SaturateSignedWordToUnsignedByte (DEST[63:48]);
DEST[39:32] ← SaturateSignedWordToUnsignedByte (DEST[79:64]);
DEST[47:40] ← SaturateSignedWordToUnsignedByte (DEST[95:80]);
DEST[55:48] ← SaturateSignedWordToUnsignedByte (DEST[111:96]);
DEST[63:56] ← SaturateSignedWordToUnsignedByte (DEST[127:112]);
DEST[71:64] ← SaturateSignedWordToUnsignedByte (SRC[15:0]);
DEST[79:72] ← SaturateSignedWordToUnsignedByte (SRC[31:16]);
DEST[87:80] ← SaturateSignedWordToUnsignedByte (SRC[47:32]);
DEST[95:88] ← SaturateSignedWordToUnsignedByte (SRC[63:48]);
DEST[103:96] ← SaturateSignedWordToUnsignedByte (SRC[79:64]);
DEST[111:104] ← SaturateSignedWordToUnsignedByte (SRC[95:80]);
DEST[119:112] ← SaturateSignedWordToUnsignedByte (SRC[111:96]);
DEST[127:120] ← SaturateSignedWordToUnsignedByte (SRC[127:112]);
Intel® C++ Compiler Equivalent Intrinsics
__m64 _mm_packs_pu16(__m64 m1, __m64 m2)
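A minimal Python sketch of the 64-bit PACKUSWB operation (added for illustration; it models the semantics on lists of lanes, lowest lane first, and does not call the intrinsic; the function names are assumptions). The destination's words fill the low four result bytes and the source's words fill the high four.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def saturate_word_to_unsigned_byte(word):
    """Clamp a signed 16-bit value to the unsigned byte range 00H..FFH."""
    return max(0, min(0xFF, to_signed(word, 16)))


def packuswb(dest_words, src_words):
    """Model PACKUSWB: destination words first, then source words."""
    return [saturate_word_to_unsigned_byte(w) for w in dest_words + src_words]


dest = [0x0010, 0x0100, 0xFFFF, 0x00FF]  # 16, 256, -1, 255
src = [0x7FFF, 0x8000, 0x0000, 0x0080]   # 32767, -32768, 0, 128
print([hex(b) for b in packuswb(dest, src)])
```

Negative words clamp to 00H and words above 255 clamp to FFH, which is why this instruction is a natural final step when converting 16-bit intermediate pixel values back to 8-bit.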
Affected Flags
None.

Protected Mode Exceptions
#GP(0) - If the effective address of the memory operand is outside the CS, DS, ES, FS, or GS segment limit. (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment.

#SS(0) - If the effective address of the memory operand is beyond the segment limit of SS.

#UD - If EM in CR0 is set to 1. (128-bit operations only) If OSFXSR in CR4 is 0. (128-bit operations only) If the CPUID feature flag SSE2 is 0.

#NM - If TS in CR0 is set to 1.

#MF (Only for 64-bit operations) - If there is a pending x87 FPU exception.

#PF(error code) - If a page fault occurs.

#AC(0) (Only for 64-bit operations) - If alignment checking is enabled and an unaligned memory reference is made when the current privilege level is 3.

Real Address Mode Exceptions
#GP(0) - (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH.

#UD - If EM in CR0 is set to 1. (128-bit operations only) If OSFXSR in CR4 is 0. (128-bit operations only) If the CPUID feature flag SSE2 is 0.

#NM - If TS in CR0 is set to 1.

#MF (Only for 64-bit operations) - If there is a pending x87 FPU exception.

Virtual 8086 Mode Exceptions
Same as exceptions in "Real Address Mode".

#PF(error code) - Page fault.

#AC(0) (Only for 64-bit operations) - If an unaligned memory reference is made when alignment checking is enabled.

---------------------------------------------------------------------------------------------------------------------------------------
PACKSSWB/PACKSSDW - Signed Saturation Packing
Opcode         Instruction                 Description
0F 63 /r       PACKSSWB mm1, mm2/m64       Pack 4 signed words in mm1 and 4 signed words in mm2/m64 into 8 signed byte integers using saturation, and put the result in mm1.
66 0F 63 /r    PACKSSWB xmm1, xmm2/m128    Pack 8 signed words in xmm1 and 8 signed words in xmm2/m128 into 16 signed byte integers using saturation, and put the result in xmm1.
0F 6B /r       PACKSSDW mm1, mm2/m64       Pack 2 signed double word integers in mm1 and 2 signed double word integers in mm2/m64 into 4 signed word integers using saturation, and put the result in mm1.
66 0F 6B /r    PACKSSDW xmm1, xmm2/m128    Pack 4 signed double word integers in xmm1 and 4 signed double word integers in xmm2/m128 into 8 signed word integers using saturation, and put the result in xmm1.
Description
Pack signed word integers into signed byte integers using saturation operation (PACKSSWB), or pack signed double word integers into signed word integers (PACKSSDW). The PACKSSWB instruction packs 4 signed words in the destination operand (the first operand) and 4 signed words in the source operand (the second operand) into 8 signed bytes, and the result is put into the destination operand. If the signed value of the word is outside the range of the signed byte (i.e., greater than 7FH or less than 80H), then the saturated byte value 7FH or 80H is stored into the destination operand respectively.

The PACKSSDW instruction packs 2 signed double words in the destination operand (the first operand) and 2 signed double words in the source operand (the second operand) into 4 signed words, and the result is put into the destination operand. (Refer to "Figure 3-4"). If the signed value of a double word is outside the range of a signed word (that is, greater than 7FFFH or less than 8000H), the saturated word value 7FFFH or 8000H, respectively, is stored in the destination operand.

The destination operand of the PACKSSWB and PACKSSDW instructions must be an MMX™ technology register; the source operand can be an MMX register or a quad-word memory location.

Pack signed data elements in the source operand and the destination operand using signed saturation operation, and the result is written into the destination operand. The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory operand.

The PACKSSWB instruction packs 8 signed words in the source operand and 8 signed words in the destination operand into 16 signed bytes, and the result is put into the destination operand. If the signed value of the word is greater than or less than the range of the signed byte, then saturation operation is performed on the value (7FH for overflow, 80H for underflow).

The PACKSSDW instruction packs 4 signed double words in the source operand and 4 signed double words in the destination operand into 8 signed words, and the result is put into the destination register. If the signed value of a double word is outside the range of a signed word, it is saturated (7FFFH for overflow, 8000H for underflow).


Figure 3-4. Operation of PACKSSDW Instruction
Operation
PACKSSWB instruction with 64-bit operands
DEST[7:0] ← SaturateSignedWordToSignedByte DEST[15:0];
DEST[15:8] ← SaturateSignedWordToSignedByte DEST[31:16];
DEST[23:16] ← SaturateSignedWordToSignedByte DEST[47:32];
DEST[31:24] ← SaturateSignedWordToSignedByte DEST[63:48];
DEST[39:32] ← SaturateSignedWordToSignedByte SRC[15:0];
DEST[47:40] ← SaturateSignedWordToSignedByte SRC[31:16];
DEST[55:48] ← SaturateSignedWordToSignedByte SRC[47:32];
DEST[63:56] ← SaturateSignedWordToSignedByte SRC[63:48];

PACKSSDW instruction with 64-bit operands
DEST[15:0] ← SaturateSignedDoublewordToSignedWord DEST[31:0];
DEST[31:16] ← SaturateSignedDoublewordToSignedWord DEST[63:32];
DEST[47:32] ← SaturateSignedDoublewordToSignedWord SRC[31:0];
DEST[63:48] ← SaturateSignedDoublewordToSignedWord SRC[63:32];

PACKSSWB instruction with 128-bit operands
DEST[7:0] ← SaturateSignedWordToSignedByte (DEST[15:0]);
DEST[15:8] ← SaturateSignedWordToSignedByte (DEST[31:16]);
DEST[23:16] ← SaturateSignedWordToSignedByte (DEST[47:32]);
DEST[31:24] ← SaturateSignedWordToSignedByte (DEST[63:48]);
DEST[39:32] ← SaturateSignedWordToSignedByte (DEST[79:64]);
DEST[47:40] ← SaturateSignedWordToSignedByte (DEST[95:80]);
DEST[55:48] ← SaturateSignedWordToSignedByte (DEST[111:96]);
DEST[63:56] ← SaturateSignedWordToSignedByte (DEST[127:112]);
DEST[71:64] ← SaturateSignedWordToSignedByte (SRC[15:0]);
DEST[79:72] ← SaturateSignedWordToSignedByte (SRC[31:16]);
DEST[87:80] ← SaturateSignedWordToSignedByte (SRC[47:32]);
DEST[95:88] ← SaturateSignedWordToSignedByte (SRC[63:48]);
DEST[103:96] ← SaturateSignedWordToSignedByte (SRC[79:64]);
DEST[111:104] ← SaturateSignedWordToSignedByte (SRC[95:80]);
DEST[119:112] ← SaturateSignedWordToSignedByte (SRC[111:96]);
DEST[127:120] ← SaturateSignedWordToSignedByte (SRC[127:112]);

PACKSSDW instruction with 128-bit operands
DEST[15:0] ← SaturateSignedDwordToSignedWord (DEST[31:0]);
DEST[31:16] ← SaturateSignedDwordToSignedWord (DEST[63:32]);
DEST[47:32] ← SaturateSignedDwordToSignedWord (DEST[95:64]);
DEST[63:48] ← SaturateSignedDwordToSignedWord (DEST[127:96]);
DEST[79:64] ← SaturateSignedDwordToSignedWord (SRC[31:0]);
DEST[95:80] ← SaturateSignedDwordToSignedWord (SRC[63:32]);
DEST[111:96] ← SaturateSignedDwordToSignedWord (SRC[95:64]);
DEST[127:112] ← SaturateSignedDwordToSignedWord (SRC[127:96]);
Intel® C++ Compiler Equivalent Intrinsics
__m64 _mm_packs_pi16(__m64 m1, __m64 m2)
__m64 _mm_packs_pi32 (__m64 m1, __m64 m2)
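Both signed pack instructions follow one pattern: halve the element width and clamp to the narrower signed range. A Python sketch under that reading (added for illustration; `packss` and `saturate_signed` are hypothetical helpers, lanes are lists of raw bit patterns with the lowest lane first):

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's-complement signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def saturate_signed(value, in_bits, out_bits):
    """Clamp a signed in_bits value to the signed out_bits range; return its bit pattern."""
    lo, hi = -(1 << (out_bits - 1)), (1 << (out_bits - 1)) - 1
    clamped = max(lo, min(hi, to_signed(value, in_bits)))
    return clamped & ((1 << out_bits) - 1)


def packss(dest_lanes, src_lanes, in_bits):
    """Model PACKSSWB (in_bits=16) or PACKSSDW (in_bits=32)."""
    return [saturate_signed(v, in_bits, in_bits // 2) for v in dest_lanes + src_lanes]


# PACKSSWB: 127, 128, -128, -129 become 7FH, 7FH, 80H, 80H
print([hex(b) for b in packss([0x007F, 0x0080, 0xFF80, 0xFF7F], [0, 0, 0, 0], 16)])
# PACKSSDW: 32768, -32769, 32767, -32768 become 7FFFH, 8000H, 7FFFH, 8000H
print([hex(w) for w in packss([0x00008000, 0xFFFF7FFF], [0x00007FFF, 0xFFFF8000], 32)])
```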
Affected Flags
None.

Protected Mode Exceptions
#GP(0) - If the effective address of the memory operand is outside the CS, DS, ES, FS, or GS segment limit. (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment
1<词>,2,3/段\,4{节},5(章)。