Author Topic: crc32 instruction (Read 32140 times)

nobody · « **on:** November 03, 2008, 04:12:51 PM »

I posted the question about call/jmp relative offsets earlier.

I just finished implementing all the latest SSE instructions in my code and am comparing the machine code I generate against NASM's. For crc32 I'm seeing unexpected output from NASM that doesn't seem to follow the Intel manuals.

I think the opcode is supposed to be 0x0F 0x38 0xF0 (8-bit source operand) or 0x0F 0x38 0xF1 (16/32/64-bit source operand), with an 0xF2 prefix. For every documented operand combination, NASM isn't generating what I would expect. Below are a few of the test instructions I'm generating, with NASM's output and my (corepy) output for each.
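To make the expected layout concrete, here is a minimal sketch of the 64-bit register-to-register form, assuming the encoding F2 REX.W 0F 38 F1 /r as I read it in the Intel manuals. The function name `encode_crc32_r64_r64` is made up for illustration and is not part of corepy or NASM; registers are passed as plain numbers 0-15 (r12 is 12).

```python
def encode_crc32_r64_r64(dst: int, src: int) -> bytes:
    """Hypothetical helper: encode `crc32 dst, src` for two 64-bit registers.

    Assumed layout: F2 prefix, REX.W, 0F 38 escape, opcode F1, ModRM.
    """
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be in 0..15")
    # REX = 0100WRXB: W=1 (64-bit), R extends ModRM.reg, B extends ModRM.rm
    rex = 0x48 | ((dst >> 3) << 2) | (src >> 3)
    # ModRM: mod=11 (register-direct), reg=destination, rm=source
    modrm = 0xC0 | ((dst & 7) << 3) | (src & 7)
    return bytes([0xF2, rex, 0x0F, 0x38, 0xF1, modrm])


print(encode_crc32_r64_r64(12, 12).hex())  # crc32 r12, r12
```

For r12, r12 this yields f24d0f38f1e4, which is what I would expect the assembler to emit if my reading of the encoding is right.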

crc32 r12, r12

nasm output: f24d0f380166e4
corepy output: f24d0f38f1e4

crc32 r12, qword [r12 + 32]

nasm output: f24d0f380166642420
corepy output: f24d0f38f1642420

crc32 r12, byte [rbp + -8]

nasm output: f24c0f380165f8
corepy output: f24c0f38f065f8
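For anyone who wants to check the bytes by hand, here is a rough sketch of a decoder for the prefix, REX, opcode, and ModRM fields of the outputs above. It assumes the layout F2, REX, 0F 38, F0/F1, ModRM and ignores any SIB or displacement bytes that follow; `decode_crc32` is a hypothetical helper written for this post, not something from corepy or NASM.

```python
def decode_crc32(hex_bytes: str) -> dict:
    """Hypothetical helper: split a REX-prefixed crc32 encoding into fields."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) < 6 or raw[0] != 0xF2:
        raise ValueError("expected an F2 prefix")
    rex = raw[1]
    if rex & 0xF0 != 0x40:
        raise ValueError("expected a REX prefix after F2")
    if raw[2:4] != b"\x0f\x38":
        raise ValueError("expected the 0F 38 escape bytes")
    opcode, modrm = raw[4], raw[5]
    if opcode not in (0xF0, 0xF1):
        raise ValueError(f"unexpected opcode byte {opcode:#04x}")
    return {
        "rex_w": (rex >> 3) & 1,       # 1 means 64-bit operand size
        "opcode": opcode,              # F0: 8-bit source, F1: wider source
        "mod": modrm >> 6,             # 3 means register-direct
        # REX.R supplies the high bit of the destination register number
        "reg": (((rex >> 2) & 1) << 3) | ((modrm >> 3) & 7),
        # Low three bits only; 4 with mod != 3 means a SIB byte follows
        "rm_low": modrm & 7,
    }


for text in ("f24d0f38f1e4", "f24d0f38f1642420", "f24c0f38f065f8"):
    print(text, decode_crc32(text))
```

Run on the three corepy outputs, every case decodes with reg = 12 (r12) and the F0/F1 opcode matching the source operand size. The NASM outputs are rejected at the opcode check, since the byte after 0F 38 is 01 rather than F0 or F1.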

Which is correct? If it is NASM, why, and where is the documentation backing it up?

I'd test this on hardware, but I don't have access to any machines with SSE 4.2. Also I do realize some of the instructions/operands above don't make practical sense -- these are just tests used to verify machine code output.

Thanks!

H. Peter Anvin · « **Reply #1 on:** December 10, 2008, 06:35:49 PM »

The CRC32 instruction was broken; it is fixed in 2.06rc1.
