Undocumented 3DNow! instructions

All the brand names and product names used here are the property of their respective owners, otherwise the page is
Copyright 1999 by Grzegorz Mazur


Created: 1999-06-23
Last update: 2000-08-31




AMD K6-X CPUs implement three useful but undocumented 3DNow! instructions, and MASM (6.13/6.14) recognizes them!
Do not rely on these instructions too heavily, as their implementation in newer AMD CPUs is slightly different. Remember that they are not documented, so officially they don't exist...
You may also look up my article on these instructions at the DDJ Microprocessor Center website.

Update (08'2000): verified the difference between PF2IW on K6-X and Athlon.
On K6-2 and K6-III the PF2IW, PI2FW and PSWAPW behave exactly as described below.
On K7 CPUs (Athlon and Duron), PF2IW is slightly changed (improved), and PSWAPW is replaced with PSWAPD (useful for complex number arithmetic). MASM 6.15 doesn't recognize the PSWAPW mnemonic. The PSWAPD instruction has exactly the same opcode as the one previously used by PSWAPW.

Update (20.07):
Based on Clive Turvey's speculations we may assume that PF2ID and PI2FD are present in K7 (Athlon), and PSWAPW is replaced in K7 by PSWAPD (swap doublewords within a quadword).

Update (26.06):
Based on a suggestion sent by Jeff Epler, I checked the whole 3DNow! opcode suffix space. It seems that, at least on my K6-2 XT, there are no more valid 3DNow! opcodes. All the other opcodes are equivalent to POR (MMX OR operation).
The undocumented instructions are described below, and their description simply complements the 3DNow! Technology Manual. For more info on 3DNow!, get the Manual (doc #21928) from AMD's website.
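
For readers whose assembler does not know these mnemonics, a minimal sketch of the byte encoding may help. It assumes the standard 3DNow! layout (two 0Fh bytes, a ModR/M byte, then the suffix byte that selects the operation) and covers only the register-to-register form. The function name encode_3dnow_rr and the SUFFIX table are illustrative names, not part of any tool; the suffix values are the ones listed on this page.

```python
# Suffix bytes as listed on this page (the byte after "0Fh 0Fh /").
SUFFIX = {"PI2FW": 0x0C, "PF2IW": 0x1C, "PSWAPW": 0xBB}


def encode_3dnow_rr(mnemonic, dest, src):
    """Return the bytes of '<mnemonic> mm<dest>, mm<src>' (register form only).

    Assumed layout: 0Fh 0Fh, ModR/M, suffix. ModR/M uses mod=11b,
    reg=destination MMX register, r/m=source MMX register.
    """
    if not (0 <= dest <= 7 and 0 <= src <= 7):
        raise ValueError("MMX register numbers must be in 0..7")
    modrm = 0xC0 | (dest << 3) | src
    return bytes([0x0F, 0x0F, modrm, SUFFIX[mnemonic]])


if __name__ == "__main__":
    # Bytes that could be emitted with DB directives for PF2IW mm0, mm1
    print(encode_3dnow_rr("PF2IW", 0, 1).hex(" "))
```

Memory operands need the usual ModR/M, SIB and displacement bytes before the suffix and are left out of this sketch.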


PF2IW mmreg1, mmreg2/mem64

Opcode: 0Fh 0Fh / 1Ch

Converts a packed floating-point operand to packed 16-bit integers. The instruction is similar to PF2ID, but the result qword contains two 16-bit signed integers in bits 47..32 and 15..0. Result bits 63..48 and 31..16 are cleared.

Function:
IF (mmreg2/mem64[31:0] >= 2^15)
    THEN mmreg1[31:0] = 7FFFh
ELSEIF (mmreg2/mem64[31:0] <= -2^15)
    THEN mmreg1[31:0] = 8000h
ELSE mmreg1[31:0] = int(mmreg2/mem64[31:0])

IF (mmreg2/mem64[63:32] >= 2^15)
    THEN mmreg1[63:32] = 7FFFh
ELSEIF (mmreg2/mem64[63:32] <= -2^15)
    THEN mmreg1[63:32] = 8000h
ELSE mmreg1[63:32] = int(mmreg2/mem64[63:32])

Note: On K6-2 and K6-III a negative result is stored in 16 bits only and zero-extended to 32 bits. On Athlon (and presumably K6-X+), the 16-bit value is sign-extended to 32 bits, which really makes sense.
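
The function and the note above can be modelled in software to compare the two behaviours. The following is only a sketch: the names pack_floats and pf2iw and the sign_extend flag are made up for this illustration, int() is assumed to truncate toward zero as PF2ID does, and NaN inputs are not modelled.

```python
import struct


def pack_floats(lo, hi):
    """Build a 64-bit MMX-style value from two single-precision floats."""
    lo_bits, hi_bits = struct.unpack("<II", struct.pack("<ff", lo, hi))
    return (hi_bits << 32) | lo_bits


def _pf2iw_lane(bits, sign_extend):
    # Reinterpret the 32-bit pattern as a single-precision float.
    x = struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]
    if x >= 2.0 ** 15:
        word = 0x7FFF                  # positive saturation
    elif x <= -(2.0 ** 15):
        word = 0x8000                  # negative saturation
    else:
        word = int(x) & 0xFFFF         # assumption: truncate toward zero
    if sign_extend and word & 0x8000:
        return 0xFFFF0000 | word       # Athlon-style sign extension
    return word                        # K6-2/K6-III: upper 16 bits cleared


def pf2iw(src, sign_extend=False):
    """Emulate PF2IW on a 64-bit integer holding two packed floats."""
    lo = _pf2iw_lane(src, sign_extend)
    hi = _pf2iw_lane(src >> 32, sign_extend)
    return (hi << 32) | lo


if __name__ == "__main__":
    v = pack_floats(-2.5, 100000.0)
    print(f"K6-2:   {pf2iw(v):016X}")
    print(f"Athlon: {pf2iw(v, sign_extend=True):016X}")
```

The two results differ only in the upper 16 bits of a lane that holds a negative value, which is exactly the difference the note describes.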


PI2FW mmreg1, mmreg2/mem64

Opcode: 0Fh 0Fh / 0Ch

Packed 16-bit integer to floating-point conversion.

Function:
mmreg1[31:0] = float(mmreg2/mem64[15:0])
mmreg1[63:32] = float(mmreg2/mem64[47:32])

Note: This instruction remains unchanged on Athlon CPUs.
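
As a cross-check of the function above, here is a small software model. It is a sketch under one stated assumption: the 16-bit source words are treated as signed, which matches the signed results produced by PF2IW but is not spelled out in the description. The name pi2fw is chosen only for this illustration.

```python
import struct


def _signed16(value):
    """Interpret the low 16 bits of value as a signed integer (assumption)."""
    word = value & 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def _float_bits(x):
    """Return the 32-bit pattern of x as a single-precision float."""
    return struct.unpack("<I", struct.pack("<f", float(x)))[0]


def pi2fw(src):
    """Emulate PI2FW: convert words 0 and 2 of src to two packed floats."""
    lo = _float_bits(_signed16(src))          # source bits 15..0
    hi = _float_bits(_signed16(src >> 32))    # source bits 47..32
    return (hi << 32) | lo


if __name__ == "__main__":
    # Low word 3, word at bits 47..32 is FFFEh (-2 when signed)
    print(f"{pi2fw(0x0000FFFE00000003):016X}")
```

Source bits 63..48 and 31..16 do not take part in the conversion, so the model simply ignores them.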


PSWAPW mmreg1, mmreg2/mem64

Opcode: 0Fh 0Fh / 0BBh

Reverses the order of the four 16-bit words within a 64-bit MMX operand.

Function:
mmreg1[15..0] = mmreg2/mem64[63..48]
mmreg1[31..16] = mmreg2/mem64[47..32]
mmreg1[47..32] = mmreg2/mem64[31..16]
mmreg1[63..48] = mmreg2/mem64[15..0]

Note: This opcode is interpreted as PSWAPD on Athlon and K6-X+.
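
Because the same opcode gives different results on different CPUs, it may help to see both interpretations side by side. This is a minimal sketch: pswapw follows the function above, pswapd follows the description of PSWAPD as a swap of the two doublewords within a quadword, and both function names are illustrative only.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def pswapw(src):
    """K6-2/K6-III behaviour: reverse the order of the four 16-bit words."""
    src &= MASK64
    result = 0
    for i in range(4):
        word = (src >> (16 * i)) & 0xFFFF
        result |= word << (16 * (3 - i))   # word i moves to position 3 - i
    return result


def pswapd(src):
    """Athlon behaviour: swap the two 32-bit doublewords."""
    src &= MASK64
    return ((src & 0xFFFFFFFF) << 32) | (src >> 32)


if __name__ == "__main__":
    v = 0x1111222233334444
    print(f"PSWAPW: {pswapw(v):016X}")
    print(f"PSWAPD: {pswapd(v):016X}")
```

Code that depends on the word-reversing behaviour will therefore silently compute something else on Athlon, which is one more reason to treat this opcode with care.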