NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: nobody on November 03, 2008, 04:12:51 PM

Title: crc32 instruction
Post by: nobody on November 03, 2008, 04:12:51 PM
I posted the question about call/jmp relative offsets earlier.

I just went through and implemented all the latest SSE instructions in my code, and am comparing the machine code I generate against NASM's.  For crc32, NASM's output differs from what I expect and doesn't seem to follow the Intel manuals.

I think the opcode is supposed to be 0x0F 0x38 0xF0 (8-bit source operand) or 0x0F 0x38 0xF1 (16/32/64-bit source operand), with an 0xF2 prefix.  For every documented operand combination, NASM isn't generating what I would expect.  Below I've listed some of the test instructions I'm generating, along with my (corepy) output and NASM's output for each.

crc32 r12, r12

nasm output:    f24d0f380166e4
corepy output:  f24d0f38f1e4

crc32 r12, qword [r12 + 32]

nasm output:    f24d0f380166642420
corepy output:  f24d0f38f1642420

crc32 r12, byte [rbp + -8]

nasm output:    f24c0f380165f8
corepy output:  f24c0f38f065f8
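
To see where each pair of encodings first diverges, the hex strings above can be compared byte by byte. This is only a small sketch; `first_difference` is a hypothetical helper name, not part of corepy or NASM.

```python
def first_difference(a_hex: str, b_hex: str) -> int:
    """Return the index of the first differing byte, or -1 if identical."""
    a, b = bytes.fromhex(a_hex), bytes.fromhex(b_hex)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string is a prefix of the other (or they are equal).
    return -1 if len(a) == len(b) else min(len(a), len(b))

# (nasm output, corepy output) pairs copied from the post above.
pairs = [
    ("f24d0f380166e4", "f24d0f38f1e4"),          # crc32 r12, r12
    ("f24d0f380166642420", "f24d0f38f1642420"),  # crc32 r12, qword [r12 + 32]
    ("f24c0f380165f8", "f24c0f38f065f8"),        # crc32 r12, byte [rbp + -8]
]
offsets = [first_difference(nasm, corepy) for nasm, corepy in pairs]
print(offsets)  # [4, 4, 4]
```

In all three pairs the outputs agree through the 0xF2 prefix, the REX byte, and the 0x0F 0x38 escape, and first differ at offset 4, the byte where 0xF0 or 0xF1 is expected.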

Which is correct?  If it is NASM, why, and where is the documentation backing it up?

I'd test this on hardware, but I don't have access to any machines with SSE 4.2.  Also I do realize some of the instructions/operands above don't make practical sense -- these are just tests used to verify machine code output.

Thanks!

Title: Re: crc32 instruction
Post by: H. Peter Anvin on December 10, 2008, 06:35:49 PM
The CRC32 instruction was broken; it is fixed in 2.06rc1.
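
For reference, the encoding the original post expects (0xF2 prefix, optional REX, 0x0F 0x38, then 0xF0 for an 8-bit source or 0xF1 otherwise, followed by ModRM) can be sketched in Python. This is a minimal illustration under stated assumptions, not corepy's or NASM's code: the function `encode_crc32` and its parameters are hypothetical, it covers only 64-bit mode with a register source or a `[base + disp8]` source, and it does not handle 16-bit sources (which need a 0x66 prefix) or the spl/bpl/sil/dil byte registers (which need a bare REX).

```python
def encode_crc32(dst, dst_bits, src_bits, rm, disp8=None):
    """Encode crc32 dst, src and return the bytes as a hex string.

    dst, rm   -- register numbers 0-15 (rbp = 5, r12 = 12, ...)
    dst_bits  -- 32 or 64; a 64-bit destination sets REX.W
    src_bits  -- 8, 32 or 64
    disp8     -- None for a register source (mod=11),
                 otherwise the source is [rm + disp8] (mod=01)
    """
    if (dst_bits, src_bits) not in {(32, 8), (32, 32), (64, 8), (64, 64)}:
        raise ValueError("operand size combination not covered by this sketch")

    opcode = 0xF0 if src_bits == 8 else 0xF1
    rex_w = int(dst_bits == 64)
    # REX = 0100 W R X B; R extends the ModRM reg field, B extends r/m or base.
    rex = 0x40 | (rex_w << 3) | ((dst >> 3) << 2) | (rm >> 3)
    mod = 0b11 if disp8 is None else 0b01
    modrm = (mod << 6) | ((dst & 7) << 3) | (rm & 7)

    out = bytearray([0xF2])          # mandatory prefix comes before REX
    if rex != 0x40:
        out.append(rex)
    out += bytes([0x0F, 0x38, opcode, modrm])
    if disp8 is not None:
        if rm & 7 == 4:              # rsp/r12 as base require a SIB byte
            out.append(0x24)         # scale=1, no index, base=100
        out.append(disp8 & 0xFF)
    return out.hex()

print(encode_crc32(12, 64, 64, 12))            # f24d0f38f1e4
print(encode_crc32(12, 64, 64, 12, disp8=32))  # f24d0f38f1642420
print(encode_crc32(12, 64, 8, 5, disp8=-8))    # f24c0f38f065f8
```

Under these assumptions the sketch reproduces the corepy bytes for all three test instructions in the original post.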