programmers resources
  http://www.intel-assembler.it/  (c)2017 intel-assembler.it   info@intel-assembler.it
Intel MMX Instruction Set +Cyrix extensions

A reference manual for MMX instructions

(by thenet)



A.26 `EMMS': Empty MMX State

EMMS ; 0F 77 [PENT,MMX]

`EMMS' sets the FPU tag word (marking which floating-point registers
are available) to all ones, meaning all registers are available for
the FPU to use. It should be used after executing MMX instructions
and before executing any subsequent floating-point operations.

A.103 `MOVD': Move Doubleword to/from MMX Register

MOVD mmxreg,r/m32 ; 0F 6E /r [PENT,MMX]
MOVD r/m32,mmxreg ; 0F 7E /r [PENT,MMX]

`MOVD' copies 32 bits from its source (second) operand into its
destination (first) operand. When the destination is a 64-bit MMX
register, the top 32 bits are set to zero.

A.104 `MOVQ': Move Quadword to/from MMX Register

MOVQ mmxreg,r/m64 ; 0F 6F /r [PENT,MMX]
MOVQ r/m64,mmxreg ; 0F 7F /r [PENT,MMX]

`MOVQ' copies 64 bits from its source (second) operand into its
destination (first) operand.

A.113 `PACKSSDW', `PACKSSWB', `PACKUSWB': Pack Data

PACKSSDW mmxreg,r/m64 ; 0F 6B /r [PENT,MMX]
PACKSSWB mmxreg,r/m64 ; 0F 63 /r [PENT,MMX]
PACKUSWB mmxreg,r/m64 ; 0F 67 /r [PENT,MMX]

All these instructions start by forming a notional 128-bit word by
placing the source (second) operand on the left of the destination
(first) operand. `PACKSSDW' then splits this 128-bit word into four
doublewords, converts each to a word, and loads them side by side
into the destination register; `PACKSSWB' and `PACKUSWB' both split
the 128-bit word into eight words, convert each to a byte, and
load _those_ side by side into the destination register.

`PACKSSDW' and `PACKSSWB' perform signed saturation when reducing
the length of numbers: if the number is too large to fit into the
reduced space, they replace it by the largest signed number (`7FFFh'
or `7Fh') that _will_ fit, and if it is too small then they replace
it by the smallest signed number (`8000h' or `80h') that will fit.
`PACKUSWB' performs unsigned saturation: it treats each input word
as signed, but clips it to the unsigned byte range, replacing
negative values by `00h' and values above `0FFh' by `0FFh'.
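
The packing rules above can be modelled in a few lines of Python.
This is only a sketch of the semantics, not of any real API: the
helper names (`split_lanes', `to_signed', `pack') are invented for
illustration, registers are plain 64-bit integers, and `PACKUSWB' is
assumed to read its inputs as signed words and clip them to the range
`00h' to `0FFh', as Intel documents it.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pack(dest, src, in_bits, signed_result):
    """Narrow every in_bits-wide lane of src:dest to half its width."""
    out_bits = in_bits // 2
    if signed_result:
        lo, hi = -(1 << (out_bits - 1)), (1 << (out_bits - 1)) - 1
    else:
        lo, hi = 0, (1 << out_bits) - 1
    # Destination lanes are the low half of the notional 128-bit word,
    # source lanes the high half (the source sits "on the left").
    wide = split_lanes(dest, in_bits) + split_lanes(src, in_bits)
    result = 0
    for index, lane in enumerate(wide):
        clipped = max(lo, min(hi, to_signed(lane, in_bits)))
        result |= (clipped & ((1 << out_bits) - 1)) << (index * out_bits)
    return result


def packssdw(dest, src):
    return pack(dest, src, 32, True)


def packsswb(dest, src):
    return pack(dest, src, 16, True)


def packuswb(dest, src):
    return pack(dest, src, 16, False)
```

With the words `0001h', `7FFFh', `8000h' and `0FFFFh' in the
destination, the signed pack yields the bytes `01h', `7Fh', `80h',
`0FFh', while the unsigned pack yields `01h', `0FFh', `00h', `00h'.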

A.114 `PADDxx': MMX Packed Addition

PADDB mmxreg,r/m64 ; 0F FC /r [PENT,MMX]
PADDW mmxreg,r/m64 ; 0F FD /r [PENT,MMX]
PADDD mmxreg,r/m64 ; 0F FE /r [PENT,MMX]

PADDSB mmxreg,r/m64 ; 0F EC /r [PENT,MMX]
PADDSW mmxreg,r/m64 ; 0F ED /r [PENT,MMX]

PADDUSB mmxreg,r/m64 ; 0F DC /r [PENT,MMX]
PADDUSW mmxreg,r/m64 ; 0F DD /r [PENT,MMX]

`PADDxx' all perform packed addition between their two 64-bit
operands, storing the result in the destination (first) operand. The
`PADDxB' forms treat the 64-bit operands as vectors of eight bytes,
and add each byte individually; `PADDxW' treat the operands as
vectors of four words; and `PADDD' treats its operands as vectors of
two doublewords.

`PADDSB' and `PADDSW' perform signed saturation on the sum of each
pair of bytes or words: if the result of an addition is too large or
too small to fit into a signed byte or word result, it is clipped
(saturated) to the largest or smallest value which _will_ fit.
`PADDUSB' and `PADDUSW' similarly perform unsigned saturation,
clipping to `0FFh' or `0FFFFh' if the result is larger than that.
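
The difference between wrapping and saturating addition is easiest to
see on concrete values. The following Python sketch models the three
families on 64-bit integers; the function `padd' and its `mode'
argument are hypothetical names chosen for this illustration and do
not correspond to any assembler or library interface.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def padd(dest, src, lane_bits, mode="wrap"):
    """Model PADDx (wrap), PADDSx (signed) and PADDUSx (unsigned)."""
    mask = (1 << lane_bits) - 1
    result = 0
    lanes = zip(split_lanes(dest, lane_bits), split_lanes(src, lane_bits))
    for index, (a, b) in enumerate(lanes):
        if mode == "wrap":
            total = a + b                      # carry out of the lane is lost
        elif mode == "signed":
            lo, hi = -(1 << (lane_bits - 1)), (1 << (lane_bits - 1)) - 1
            total = to_signed(a, lane_bits) + to_signed(b, lane_bits)
            total = max(lo, min(hi, total))    # clip to signed range
        elif mode == "unsigned":
            total = min(mask, a + b)           # clip to 0FFh / 0FFFFh
        else:
            raise ValueError("unknown mode: " + mode)
        result |= (total & mask) << (index * lane_bits)
    return result
```

For instance, adding `01h' to the byte `7Fh' gives `80h' with `PADDB'
but stays at `7Fh' with `PADDSB'; adding `02h' to `0FFh' gives `01h'
with `PADDB' but `0FFh' with `PADDUSB'.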

A.115 `PADDSIW': MMX Packed Addition to Implicit Destination

PADDSIW mmxreg,r/m64 ; 0F 51 /r [CYRIX,MMX]

`PADDSIW', specific to the Cyrix extensions to the MMX instruction
set, performs the same function as `PADDSW', except that the result
is not placed in the register specified by the first operand, but
instead in the register whose number differs from the first operand
only in the last bit. So `PADDSIW MM0,MM2' would put the result in
`MM1', but `PADDSIW MM1,MM2' would put the result in `MM0'.
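
The rule for finding the implied register amounts to flipping the
lowest bit of the register number. As a minimal Python sketch (the
function name `implied_register' is made up for this example, and
registers are represented simply by their numbers 0 to 7):

```python
def implied_register(first_operand):
    """Return the number of the implied destination register.

    The implied register differs from the first operand only in the
    last bit of its number, so MM0 pairs with MM1, MM2 with MM3, etc.
    """
    if not 0 <= first_operand <= 7:
        raise ValueError("MMX register numbers run from 0 to 7")
    return first_operand ^ 1
```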

A.116 `PAND', `PANDN': MMX Bitwise AND and AND-NOT

PAND mmxreg,r/m64 ; 0F DB /r [PENT,MMX]
PANDN mmxreg,r/m64 ; 0F DF /r [PENT,MMX]

`PAND' performs a bitwise AND operation between its two operands
(i.e. each bit of the result is 1 if and only if the corresponding
bits of the two inputs were both 1), and stores the result in the
destination (first) operand.

`PANDN' performs the same operation, but first takes the one's
complement of the destination (first) operand, so that the result is
(NOT destination) AND source.
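
Because it is the destination, not the source, that gets inverted,
the operand order of `PANDN' is easy to get wrong. A small Python
model, with hypothetical function names and registers represented as
64-bit integers, makes the order explicit:

```python
MASK64 = (1 << 64) - 1


def pand(dest, src):
    """Bitwise AND of the two operands."""
    return dest & src


def pandn(dest, src):
    """Invert the destination first, then AND it with the source."""
    return (~dest & MASK64) & src
```

A typical use is selecting bytes with a mask held in the destination:
`pand' keeps the source bytes where the mask is set, and `pandn' keeps
the source bytes where the mask is clear.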

A.117 `PAVEB': MMX Packed Average

PAVEB mmxreg,r/m64 ; 0F 50 /r [CYRIX,MMX]

`PAVEB', specific to the Cyrix MMX extensions, treats its two
operands as vectors of eight unsigned bytes, and calculates the
average of the corresponding bytes in the operands. The resulting
vector of eight averages is stored in the first operand.

A.118 `PCMPxx': MMX Packed Comparison

PCMPEQB mmxreg,r/m64 ; 0F 74 /r [PENT,MMX]
PCMPEQW mmxreg,r/m64 ; 0F 75 /r [PENT,MMX]
PCMPEQD mmxreg,r/m64 ; 0F 76 /r [PENT,MMX]

PCMPGTB mmxreg,r/m64 ; 0F 64 /r [PENT,MMX]
PCMPGTW mmxreg,r/m64 ; 0F 65 /r [PENT,MMX]
PCMPGTD mmxreg,r/m64 ; 0F 66 /r [PENT,MMX]

The `PCMPxx' instructions all treat their operands as vectors of
bytes, words, or doublewords; corresponding elements of the source
and destination are compared, and the corresponding element of the
destination (first) operand is set to all zeros or all ones
depending on the result of the comparison.

`PCMPxxB' treats the operands as vectors of eight bytes, `PCMPxxW'
treats them as vectors of four words, and `PCMPxxD' as two
doublewords.

`PCMPEQx' sets the corresponding element of the destination operand
to all ones if the two elements compared are equal; `PCMPGTx' sets
the destination element to all ones if the element of the first
(destination) operand is greater (treated as a signed integer) than
that of the second (source) operand.
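
The result of a packed comparison is a mask rather than a flag, which
the following Python sketch tries to show. The function `pcmp' and
its `op' argument are names invented for this illustration; registers
are modelled as 64-bit integers.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pcmp(dest, src, lane_bits, op):
    """Model PCMPEQx (op='eq') and PCMPGTx (op='gt')."""
    ones = (1 << lane_bits) - 1
    result = 0
    lanes = zip(split_lanes(dest, lane_bits), split_lanes(src, lane_bits))
    for index, (a, b) in enumerate(lanes):
        if op == "eq":
            hit = a == b
        elif op == "gt":
            # The greater-than test is signed: destination > source.
            hit = to_signed(a, lane_bits) > to_signed(b, lane_bits)
        else:
            raise ValueError("unknown comparison: " + op)
        if hit:
            result |= ones << (index * lane_bits)
    return result
```

Note that `7Fh' compares greater than `80h' as bytes, because `80h'
is read as -128.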

A.119 `PDISTIB': MMX Packed Distance and Accumulate with Implied Register

PDISTIB mmxreg,mem64 ; 0F 54 /r [CYRIX,MMX]

`PDISTIB', specific to the Cyrix MMX extensions, treats its two
input operands as vectors of eight unsigned bytes. For each byte
position, it finds the absolute difference between the bytes in that
position in the two input operands, and adds that value to the byte
in the same position in the implied output register. The addition is
saturated to an unsigned byte in the same way as `PADDUSB'.

The implied output register is found in the same way as `PADDSIW'
(section A.115).

Note that `PDISTIB' cannot take a register as its second source
operand.
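
As a rough illustration of the accumulate step, here is a Python
model of `PDISTIB' working from the description above. The function
name and its three-argument form are assumptions made for the
example: `implied' stands for the current contents of the implied
output register, and the function returns its new contents.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def pdistib(implied, dest, src):
    """Add |dest - src| per byte to the implied register, saturating."""
    result = 0
    lanes = zip(split_lanes(implied, 8),
                split_lanes(dest, 8),
                split_lanes(src, 8))
    for index, (acc, a, b) in enumerate(lanes):
        distance = abs(a - b)             # bytes are unsigned
        total = min(0xFF, acc + distance)  # saturate like PADDUSB
        result |= total << (index * 8)
    return result
```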

A.120 `PMACHRIW': MMX Packed Multiply and Accumulate with Rounding

PMACHRIW mmxreg,mem64 ; 0F 5E /r [CYRIX,MMX]

`PMACHRIW' acts almost identically to `PMULHRIW' (section A.123),
but instead of _storing_ its result in the implied destination
register, it _adds_ its result, as four packed words, to the implied
destination register. No saturation is done: the addition can wrap
around.

Note that `PMACHRIW' cannot take a register as its second source
operand.

A.121 `PMADDWD': MMX Packed Multiply and Add

PMADDWD mmxreg,r/m64 ; 0F F5 /r [PENT,MMX]

`PMADDWD' treats its two inputs as vectors of four signed words. It
multiplies corresponding elements of the two operands, giving four
signed doubleword results. The top two of these are added and placed
in the top 32 bits of the destination (first) operand; the bottom
two are added and placed in the bottom 32 bits.
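
The multiply-and-add step can be sketched in Python as follows. The
helper names are hypothetical, registers are 64-bit integers, and
each 32-bit sum is assumed to wrap around if it overflows.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pmaddwd(dest, src):
    """Multiply four signed word pairs, then add adjacent products."""
    a = [to_signed(w, 16) for w in split_lanes(dest, 16)]
    b = [to_signed(w, 16) for w in split_lanes(src, 16)]
    products = [x * y for x, y in zip(a, b)]
    low = (products[0] + products[1]) & 0xFFFFFFFF   # bottom two words
    high = (products[2] + products[3]) & 0xFFFFFFFF  # top two words
    return (high << 32) | low
```

With the words 1, 2, 3, 4 (lowest first) in the destination and 5, 6,
7, -1 in the source, the low doubleword becomes 1*5 + 2*6 = 17 and the
high doubleword 3*7 + 4*(-1) = 17.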

A.122 `PMAGW': MMX Packed Magnitude

PMAGW mmxreg,r/m64 ; 0F 52 /r [CYRIX,MMX]

`PMAGW', specific to the Cyrix MMX extensions, treats both its
operands as vectors of four signed words. It compares the absolute
values of the words in corresponding positions, and sets each word
of the destination (first) operand to whichever of the two words in
that position had the larger absolute value.
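
A Python sketch of the magnitude selection follows. The text above
does not say which word is chosen when the two absolute values are
equal; this model keeps the destination word in that case, which is
an assumption made purely for the example. The function names are
likewise invented.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pmagw(dest, src):
    """Keep, per word, whichever operand has the larger absolute value."""
    result = 0
    lanes = zip(split_lanes(dest, 16), split_lanes(src, 16))
    for index, (a, b) in enumerate(lanes):
        # Assumption: on a tie the destination word is left unchanged.
        chosen = b if abs(to_signed(b, 16)) > abs(to_signed(a, 16)) else a
        result |= chosen << (index * 16)
    return result
```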

A.123 `PMULHRW', `PMULHRIW': MMX Packed Multiply High with Rounding

PMULHRW mmxreg,r/m64 ; 0F 59 /r [CYRIX,MMX]
PMULHRIW mmxreg,r/m64 ; 0F 5D /r [CYRIX,MMX]

These instructions, specific to the Cyrix MMX extensions, treat
their operands as vectors of four signed words. Words in
corresponding positions are multiplied, to give a 32-bit value in
which bits 30 and 31 are guaranteed equal. Bits 30 to 15 of this
value (bit mask `0x7FFF8000') are taken and stored in the
corresponding position of the destination operand, after first
rounding the low bit (equivalent to adding `0x4000' before
extracting bits 30 to 15).

For `PMULHRW', the destination operand is the first operand; for
`PMULHRIW' the destination operand is implied by the first operand
in the manner of `PADDSIW' (section A.115).
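
Taking the description above literally (add `0x4000' to the 32-bit
product, then keep bits 30 to 15), the rounding multiply can be
modelled in Python as below. This is a sketch of the text, not a
verified model of the hardware, and the function names are
hypothetical. `PMULHRIW' would compute the same values and differ
only in where they are stored.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pmulhrw(dest, src):
    """Per word: (a * b + 0x4000) >> 15, truncated to 16 bits."""
    result = 0
    lanes = zip(split_lanes(dest, 16), split_lanes(src, 16))
    for index, (a, b) in enumerate(lanes):
        product = to_signed(a, 16) * to_signed(b, 16)
        rounded = (product + 0x4000) >> 15   # arithmetic shift in Python
        result |= (rounded & 0xFFFF) << (index * 16)
    return result
```

Reading the words as 1.15 fixed-point fractions, `4000h' times
`4000h' (0.5 times 0.5) gives `2000h' (0.25), and 3 times `2000h'
gives 1 where a truncating multiply would give 0.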

A.124 `PMULHW', `PMULLW': MMX Packed Multiply

PMULHW mmxreg,r/m64 ; 0F E5 /r [PENT,MMX]
PMULLW mmxreg,r/m64 ; 0F D5 /r [PENT,MMX]

`PMULxW' treats its two inputs as vectors of four signed words. It
multiplies corresponding elements of the two operands, giving four
signed doubleword results.

`PMULHW' then stores the top 16 bits of each doubleword in the
destination (first) operand; `PMULLW' stores the bottom 16 bits of
each doubleword in the destination operand.
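
The two halves of each product can be shown side by side with a small
Python model. The names `pmulhw' and `pmullw' here are ordinary
Python functions written for this illustration, operating on 64-bit
integers.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def _products(dest, src):
    """Four signed 32-bit products, kept as unsigned bit patterns."""
    lanes = zip(split_lanes(dest, 16), split_lanes(src, 16))
    return [(to_signed(a, 16) * to_signed(b, 16)) & 0xFFFFFFFF
            for a, b in lanes]


def pmulhw(dest, src):
    """Store the top 16 bits of each product."""
    result = 0
    for index, product in enumerate(_products(dest, src)):
        result |= (product >> 16) << (index * 16)
    return result


def pmullw(dest, src):
    """Store the bottom 16 bits of each product."""
    result = 0
    for index, product in enumerate(_products(dest, src)):
        result |= (product & 0xFFFF) << (index * 16)
    return result
```

Running both on the same inputs and interleaving the results word by
word recovers the full 32-bit products.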

A.125 `PMVccZB': MMX Packed Conditional Move

PMVZB mmxreg,mem64 ; 0F 58 /r [CYRIX,MMX]
PMVNZB mmxreg,mem64 ; 0F 5A /r [CYRIX,MMX]
PMVLZB mmxreg,mem64 ; 0F 5B /r [CYRIX,MMX]
PMVGEZB mmxreg,mem64 ; 0F 5C /r [CYRIX,MMX]

These instructions, specific to the Cyrix MMX extensions, perform
parallel conditional moves. The two input operands are treated as
vectors of eight bytes. Each byte of the destination (first) operand
is either written from the corresponding byte of the source (second)
operand, or left alone, depending on the value of the byte in the
_implied_ operand (specified in the same way as `PADDSIW', in
section A.115).

`PMVZB' performs each move if the corresponding byte in the implied
operand is zero. `PMVNZB' moves if the byte is non-zero. `PMVLZB'
moves if the byte is less than zero, and `PMVGEZB' moves if the byte
is greater than or equal to zero.

Note that these instructions cannot take a register as their second
source operand.
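
The four conditions can be summarised in a short Python model. It is
a sketch based only on the description above: the function `pmv', its
`cond' argument and the explicit `implied' parameter (the contents of
the implied register) are all names introduced for the example.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


# Each test receives the implied byte as an unsigned value 0..255;
# bytes of 80h and above are negative when read as signed.
CONDITIONS = {
    "z": lambda byte: byte == 0,       # PMVZB
    "nz": lambda byte: byte != 0,      # PMVNZB
    "lz": lambda byte: byte >= 0x80,   # PMVLZB
    "gez": lambda byte: byte < 0x80,   # PMVGEZB
}


def pmv(dest, src, implied, cond):
    """Copy source bytes into dest where the implied byte passes cond."""
    test = CONDITIONS[cond]
    result = 0
    lanes = zip(split_lanes(dest, 8),
                split_lanes(src, 8),
                split_lanes(implied, 8))
    for index, (d, s, i) in enumerate(lanes):
        result |= (s if test(i) else d) << (index * 8)
    return result
```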

A.129 `POR': MMX Bitwise OR

POR mmxreg,r/m64 ; 0F EB /r [PENT,MMX]

`POR' performs a bitwise OR operation between its two operands (i.e.
each bit of the result is 1 if and only if at least one of the
corresponding bits of the two inputs was 1), and stores the result
in the destination (first) operand.

A.130 `PSLLx', `PSRLx', `PSRAx': MMX Bit Shifts

PSLLW mmxreg,r/m64 ; 0F F1 /r [PENT,MMX]
PSLLW mmxreg,imm8 ; 0F 71 /6 ib [PENT,MMX]

PSLLD mmxreg,r/m64 ; 0F F2 /r [PENT,MMX]
PSLLD mmxreg,imm8 ; 0F 72 /6 ib [PENT,MMX]

PSLLQ mmxreg,r/m64 ; 0F F3 /r [PENT,MMX]
PSLLQ mmxreg,imm8 ; 0F 73 /6 ib [PENT,MMX]

PSRAW mmxreg,r/m64 ; 0F E1 /r [PENT,MMX]
PSRAW mmxreg,imm8 ; 0F 71 /4 ib [PENT,MMX]

PSRAD mmxreg,r/m64 ; 0F E2 /r [PENT,MMX]
PSRAD mmxreg,imm8 ; 0F 72 /4 ib [PENT,MMX]

PSRLW mmxreg,r/m64 ; 0F D1 /r [PENT,MMX]
PSRLW mmxreg,imm8 ; 0F 71 /2 ib [PENT,MMX]

PSRLD mmxreg,r/m64 ; 0F D2 /r [PENT,MMX]
PSRLD mmxreg,imm8 ; 0F 72 /2 ib [PENT,MMX]

PSRLQ mmxreg,r/m64 ; 0F D3 /r [PENT,MMX]
PSRLQ mmxreg,imm8 ; 0F 73 /2 ib [PENT,MMX]

`PSxxQ' perform simple bit shifts on the 64-bit MMX registers: the
destination (first) operand is shifted left or right by the number
of bits given in the source (second) operand, and the vacated bits
are filled in with zeros (for a logical shift) or copies of the
original sign bit (for an arithmetic right shift).

`PSxxW' and `PSxxD' perform packed bit shifts: the destination
operand is treated as a vector of four words or two doublewords, and
each element is shifted individually, so bits shifted out of one
element do not interfere with empty bits coming into the next.

`PSLLx' and `PSRLx' perform logical shifts: the vacated bits at one
end of the shifted number are filled with zeros. `PSRAx' performs an
arithmetic right shift: the vacated bits at the top of the shifted
number are filled with copies of the original top (sign) bit.
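
To illustrate how packed shifts keep each element separate, here is a
Python model covering the three shift kinds at all three widths. The
function `shift' and its `kind' argument are hypothetical; registers
are 64-bit integers and the shift count is passed as a plain number.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def shift(dest, count, lane_bits, kind):
    """Model PSLLx ('sll'), PSRLx ('srl') and PSRAx ('sra')."""
    mask = (1 << lane_bits) - 1
    result = 0
    for index, lane in enumerate(split_lanes(dest, lane_bits)):
        if kind == "sll":
            shifted = lane << count                       # zeros enter at the bottom
        elif kind == "srl":
            shifted = lane >> count                       # zeros enter at the top
        elif kind == "sra":
            shifted = to_signed(lane, lane_bits) >> count  # sign bit is copied in
        else:
            raise ValueError("unknown shift kind: " + kind)
        # Masking discards bits shifted out of the lane, so they never
        # reach the neighbouring element.
        result |= (shifted & mask) << (index * lane_bits)
    return result
```

Shifting `0x80000001FFFF0004' right by one as four words gives
`0x400000007FFF0002' (logical) or `0xC0000000FFFF0002' (arithmetic),
whereas shifting it as a single quadword gives `0x40000000FFFF8002'.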

A.131 `PSUBxx': MMX Packed Subtraction

PSUBB mmxreg,r/m64 ; 0F F8 /r [PENT,MMX]
PSUBW mmxreg,r/m64 ; 0F F9 /r [PENT,MMX]
PSUBD mmxreg,r/m64 ; 0F FA /r [PENT,MMX]

PSUBSB mmxreg,r/m64 ; 0F E8 /r [PENT,MMX]
PSUBSW mmxreg,r/m64 ; 0F E9 /r [PENT,MMX]

PSUBUSB mmxreg,r/m64 ; 0F D8 /r [PENT,MMX]
PSUBUSW mmxreg,r/m64 ; 0F D9 /r [PENT,MMX]

`PSUBxx' all perform packed subtraction between their two 64-bit
operands, storing the result in the destination (first) operand. The
`PSUBxB' forms treat the 64-bit operands as vectors of eight bytes,
and subtract each byte individually; `PSUBxW' treat the operands as
vectors of four words; and `PSUBD' treats its operands as vectors of
two doublewords.

In all cases, the elements of the operand on the right are
subtracted from the corresponding elements of the operand on the
left, not the other way round.

`PSUBSB' and `PSUBSW' perform signed saturation on the difference of
each pair of bytes or words: if the result of a subtraction is too
large or too small to fit into a signed byte or word result, it is
clipped (saturated) to the largest or smallest value which _will_
fit. `PSUBUSB' and `PSUBUSW' similarly perform unsigned saturation,
clipping to zero if the result would be negative.
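
A Python sketch of the three subtraction families follows. It assumes
that the unsigned saturating forms clamp at zero when the difference
would be negative, which is how Intel documents `PSUBUSB' and
`PSUBUSW'. The function `psub' and its `mode' argument are names
invented for the example.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def to_signed(value, bits):
    """Reinterpret an unsigned lane as a two's-complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def psub(dest, src, lane_bits, mode="wrap"):
    """Model PSUBx (wrap), PSUBSx (signed) and PSUBUSx (unsigned)."""
    mask = (1 << lane_bits) - 1
    result = 0
    lanes = zip(split_lanes(dest, lane_bits), split_lanes(src, lane_bits))
    for index, (a, b) in enumerate(lanes):
        # The source element is always subtracted from the destination.
        if mode == "wrap":
            diff = a - b
        elif mode == "signed":
            lo, hi = -(1 << (lane_bits - 1)), (1 << (lane_bits - 1)) - 1
            diff = to_signed(a, lane_bits) - to_signed(b, lane_bits)
            diff = max(lo, min(hi, diff))
        elif mode == "unsigned":
            diff = max(0, a - b)   # assumption: clamp at zero
        else:
            raise ValueError("unknown mode: " + mode)
        result |= (diff & mask) << (index * lane_bits)
    return result
```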

A.132 `PSUBSIW': MMX Packed Subtract with Saturation to Implied Destination

PSUBSIW mmxreg,r/m64 ; 0F 55 /r [CYRIX,MMX]

`PSUBSIW', specific to the Cyrix extensions to the MMX instruction
set, performs the same function as `PSUBSW', except that the result
is not placed in the register specified by the first operand, but
instead in the implied destination register, specified as for
`PADDSIW' (section A.115).

A.133 `PUNPCKxxx': Unpack Data

PUNPCKHBW mmxreg,r/m64 ; 0F 68 /r [PENT,MMX]
PUNPCKHWD mmxreg,r/m64 ; 0F 69 /r [PENT,MMX]
PUNPCKHDQ mmxreg,r/m64 ; 0F 6A /r [PENT,MMX]

PUNPCKLBW mmxreg,r/m64 ; 0F 60 /r [PENT,MMX]
PUNPCKLWD mmxreg,r/m64 ; 0F 61 /r [PENT,MMX]
PUNPCKLDQ mmxreg,r/m64 ; 0F 62 /r [PENT,MMX]

`PUNPCKxx' all treat their operands as vectors, and produce a new
vector generated by interleaving elements from the two inputs. The
`PUNPCKHxx' instructions start by throwing away the bottom half of
each input operand, and the `PUNPCKLxx' instructions throw away the
top half.

The remaining elements, totalling 64 bits, are then interleaved into
the destination, alternating elements from the second (source)
operand and the first (destination) operand: so the leftmost element
in the result always comes from the second operand, and the
rightmost from the destination.

`PUNPCKxBW' works a byte at a time, `PUNPCKxWD' a word at a time,
and `PUNPCKxDQ' a doubleword at a time.

So, for example, if the first operand held `0x7A6A5A4A3A2A1A0A' and
the second held `0x7B6B5B4B3B2B1B0B', then:

(*) `PUNPCKHBW' would return `0x7B7A6B6A5B5A4B4A'.

(*) `PUNPCKHWD' would return `0x7B6B7A6A5B4B5A4A'.

(*) `PUNPCKHDQ' would return `0x7B6B5B4B7A6A5A4A'.

(*) `PUNPCKLBW' would return `0x3B3A2B2A1B1A0B0A'.

(*) `PUNPCKLWD' would return `0x3B2B3A2A1B0B1A0A'.

(*) `PUNPCKLDQ' would return `0x3B2B1B0B3A2A1A0A'.
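
The six results above can be reproduced with a short Python model of
the interleaving. The function `punpck' and its arguments are
hypothetical names used only for this sketch; registers are 64-bit
integers.

```python
def split_lanes(value, lane_bits):
    """Split a 64-bit integer into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, lane_bits)]


def punpck(dest, src, lane_bits, high):
    """Interleave the high or low halves of dest and src."""
    d = split_lanes(dest, lane_bits)
    s = split_lanes(src, lane_bits)
    half = len(d) // 2
    d = d[half:] if high else d[:half]
    s = s[half:] if high else s[:half]
    result = 0
    position = 0
    for a, b in zip(d, s):
        # Within each pair the destination element takes the lower
        # position and the source element the higher one.
        result |= a << (position * lane_bits)
        result |= b << ((position + 1) * lane_bits)
        position += 2
    return result
```

For example, `punpck(0x7A6A5A4A3A2A1A0A, 0x7B6B5B4B3B2B1B0B, 8, True)'
corresponds to `PUNPCKHBW' and evaluates to `0x7B7A6B6A5B5A4B4A'.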

A.134 `PUSH': Push Data on Stack

PUSH reg16 ; o16 50+r [8086]
PUSH reg32 ; o32 50+r [386]

PUSH r/m16 ; o16 FF /6 [8086]
PUSH r/m32 ; o32 FF /6 [386]

PUSH CS ; 0E [8086]
PUSH DS ; 1E [8086]
PUSH ES ; 06 [8086]
PUSH SS ; 16 [8086]
PUSH FS ; 0F A0 [386]
PUSH GS ; 0F A8 [386]

PUSH imm8 ; 6A ib [286]
PUSH imm16 ; o16 68 iw [286]
PUSH imm32 ; o32 68 id [386]

`PUSH' decrements the stack pointer (`SP' or `ESP') by 2 or 4, and
then stores the given value at `[SS:SP]' or `[SS:ESP]'.

The address-size attribute of the instruction determines whether
`SP' or `ESP' is used as the stack pointer: to deliberately override
the default given by the `BITS' setting, you can use an `a16' or
`a32' prefix.

The operand-size attribute of the instruction determines whether the
stack pointer is decremented by 2 or 4: this means that segment
register pushes in `BITS 32' mode will push 4 bytes on the stack, of
which the upper two are undefined. If you need to override that, you
can use an `o16' or `o32' prefix.

The above opcode listings give two forms for general-purpose
register push instructions: for example, `PUSH BX' has the two forms
`53' and `FF F3'. NASM will always generate the shorter form when
given `PUSH BX'. NDISASM will disassemble both.

Unlike the undocumented and barely supported `POP CS', `PUSH CS' is
a perfectly valid and sensible instruction, supported on all
processors.

The instruction `PUSH SP' may be used to distinguish an 8086 from
later processors: on an 8086, the value of `SP' stored is the value
it has _after_ the push instruction, whereas on later processors it
is the value _before_ the push instruction.
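
The `PUSH SP' difference can be expressed as a tiny Python model. It
is only an illustration of the behaviour described above: the
function name `push_sp' and the `is_8086' flag are made up, and the
function returns the new stack pointer together with the word that
was stored.

```python
def push_sp(sp, is_8086):
    """Return (new SP, value written to the stack) for PUSH SP."""
    new_sp = (sp - 2) & 0xFFFF   # a 16-bit push decrements SP by 2
    # An 8086 stores the already-decremented value; later processors
    # store the value SP had before the instruction.
    stored = new_sp if is_8086 else sp
    return new_sp, stored


def looks_like_8086(sp, stored):
    """Detection idea: compare the stored word with SP before the push."""
    return stored != sp
```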

A.135 `PUSHAx': Push All General-Purpose Registers

PUSHA ; 60 [186]
PUSHAD ; o32 60 [386]
PUSHAW ; o16 60 [186]

`PUSHAW' pushes, in succession, `AX', `CX', `DX', `BX', `SP', `BP',
`SI' and `DI' on the stack, decrementing the stack pointer by a
total of 16.

`PUSHAD' pushes, in succession, `EAX', `ECX', `EDX', `EBX', `ESP',
`EBP', `ESI' and `EDI' on the stack, decrementing the stack pointer
by a total of 32.

In both cases, the value of `SP' or `ESP' pushed is its _original_
value, as it had before the instruction was executed.

`PUSHA' is an alias mnemonic for either `PUSHAW' or `PUSHAD',
depending on the current `BITS' setting.

Note that the registers are pushed in order of their numeric values
in opcodes (see section A.2.1).
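
The push order and the treatment of `ESP' can be sketched in Python
as follows. The register file is modelled as a dictionary and the
stack as a dictionary from addresses to doublewords; both
representations, and the function name `pushad', are assumptions made
for this example.

```python
PUSH_ORDER = ("EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI")


def pushad(regs, memory):
    """Push all eight registers; ESP is pushed with its original value."""
    original_esp = regs["ESP"]
    esp = original_esp
    for name in PUSH_ORDER:
        value = original_esp if name == "ESP" else regs[name]
        esp -= 4              # each push moves the stack pointer down by 4
        memory[esp] = value
    regs["ESP"] = esp         # 32 bytes below the original value
```

A usage example, starting with `ESP' at `1000h':

```python
regs = {"EAX": 1, "ECX": 2, "EDX": 3, "EBX": 4,
        "ESP": 0x1000, "EBP": 6, "ESI": 7, "EDI": 8}
memory = {}
pushad(regs, memory)
# regs["ESP"] is now 0xFE0, and memory[0xFEC] holds the original 0x1000.
```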

See also `POPA' (section A.127).

A.136 `PUSHFx': Push Flags Register

PUSHF ; 9C [8086]
PUSHFD ; o32 9C [386]
PUSHFW ; o16 9C [8086]

`PUSHFW' pushes the bottom 16 bits of the flags register (or the
whole flags register, on processors below a 386) onto the stack.
`PUSHFD' pushes the entire flags register onto the stack.

`PUSHF' is an alias mnemonic for either `PUSHFW' or `PUSHFD',
depending on the current `BITS' setting.

See also `POPF' (section A.128).

A.137 `PXOR': MMX Bitwise XOR

PXOR mmxreg,r/m64 ; 0F EF /r [PENT,MMX]

`PXOR' performs a bitwise XOR operation between its two operands
(i.e. each bit of the result is 1 if and only if exactly one of the
corresponding bits of the two inputs was 1), and stores the result
in the destination (first) operand.
