NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: Rookie on November 26, 2018, 01:26:28 PM

Title: Support for FISTTP?
Post by: Rookie on November 26, 2018, 01:26:28 PM: Hi!

I am trying to assemble code with FISTTP (integer store truncate and pop). Apparently it was introduced with SSE3.

I am currently hard-coding
FISTTP QWORD[RAX]
as
DW 8DDh
since I get an "error: no instruction for this cpu level" message.

What CPU level do I need?
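
For reference, a minimal sketch of why the word 8DDh can stand in for the instruction. FISTTP with a 64-bit integer operand is encoded as opcode DD followed by a ModRM byte with /1 in the reg field, and `dw` emits its word little-endian. The labels `hardcoded` and `bytewise` are hypothetical names used only for this illustration.

```nasm
bits 64
; fisttp qword [rax] = opcode DDh, then ModRM 08h
;   ModRM 08h = mod 00 (no displacement), reg 001 (/1), rm 000 ([rax])
hardcoded:
        dw      8DDh            ; emitted little-endian: DD 08
bytewise:
        db      0DDh, 08h       ; the same two bytes, spelled out
```

The ModRM byte depends on the addressing mode, so the constant is only valid for a plain [rax] (or [eax] in 32-bit code) operand; an rsp-based address would also need a SIB byte.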
Title: Re: Support for FISTTP?
Post by: dreamCoder on November 26, 2018, 04:38:53 PM: Without code or any background information, I can't tell you much about your problem. But I tested this on an i7 under Win10 and it works.
Code:
extern printf

section .data
x:      dq 18.7653
format: db '%llu',0ah,0

section .text
global main
main:
        sub     rsp,40
        finit
        mov     rax,x
        fld     qword[rax]
        db      0xdd
        db      0x08
        mov     rdx,[x]         ;rsi
        mov     rcx,format      ;rdi
        call    printf
        add     rsp,40
        ret
But I don't understand the need to hex-encode the instruction. If it doesn't run on pre-Prescott CPUs, hard-coding it like that won't run either. Any particular reason? Perhaps you should update your NASM.
Title: Re: Support for FISTTP?
Post by: Rookie on November 26, 2018, 07:20:34 PM: There are two versions of the macro. The first is the ideal, and the second is the work-around going through rax.

Code:
;---------------------------------------------- froundz
; st0 <- st0 rounded towards zero at non-zero st1 intervals
%macro froundz 0                ; x,|n|
        fdiv    st0,st1         ; xi=x/|n|,|n|
        sub     rsp,8
        fisttp  qword[rsp]
        fild    qword[rsp]      ; Trunc(xi),|n|
        fmul                    ; Trunc(xi)|n|
        add     rsp,8
%endmacro
;---------------------------------------------- froundz
; st0 <- st0 rounded towards zero at non-zero st1 intervals
%macro froundz2 0               ; x,|n|
        fdiv    st0,st1         ; xi=x/|n|,|n|
        sub     rsp,8
        mov     rax,rsp
        ;fisttp qword[rax]      ; 0x08DD is the rax opcode
        dw      8DDh            ; |n|
        fild    qword[rax]      ; Trunc(xi),|n|
        fmul                    ; Trunc(xi)|n|
        add     rsp,8
%endmacro
The assembly command gives the following output:

Code:
Assemble.cmd
------------
Copyright (c) 1997 Analytical Logic
All rights reserved
Assembling project: FastMath ...
32-bit Borland Assembly ... Failed!
FastMath.32.asm:618: error: no instruction for this cpu level
FastMath.inc:796: ... from macro `froundz' defined here
32-bit Microsoft Assembly ... Failed!
FastMath.32.asm:618: error: no instruction for this cpu level
FastMath.inc:796: ... from macro `froundz' defined here
64-bit Borland Assembly ... Ok
64-bit Microsoft Assembly ... Ok
64-bit Microsoft DLL ... Ok
Moving objects:
FastMath.x64.obj ... Ok
FastMath.o ... Ok
FastMath.dll ... Ok
C:\Source\Analog\FastMath\Nasm>
Nasm is 2.12.
I omitted to mention that it is only the 32-bit assemble that throws the error and requires the hard-coded opcode.
Title: Re: Support for FISTTP?
Post by: dreamCoder on November 26, 2018, 08:37:07 PM: Rookie, is that 64-bit addressing mode "qword[rax]" being assembled in 32-bit code? Shouldn't it be "qword[eax]" instead?

If FISTTP is not supported, then you could use one of two methods

1) frndint (with the rounding mode set to round toward zero). But I bet you already knew this.
2) fxtract (gets the exponent part in ST1). Then use that information to SHL and SHR the float in question. The exponent part shows the range or offset of your integer bits, starting 12 bits in (EDIT: from the left).
Title: Re: Support for FISTTP?
Post by: Rookie on November 26, 2018, 10:59:17 PM: FastMath.inc is written purely in 64-bit.
FastMath.32.asm handles the 32-bit-specific prologue/epilogue and maps rax -> eax etc (so the code maps to FISTTP QWORD[EAX]).

The library expressly avoids FXTRACT(13), FPREM(16-64) and FSCALE(20-31) because of their dismal performance. FRNDINT(9-20) is out of the question because of the required control word change (and it's slow - worst-case FISTP/FILD with read-after-write penalty is as fast).
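
To illustrate the control-word change mentioned above, here is a minimal sketch (not taken from FastMath) of how truncation is typically done on CPUs without FISTTP. The scratch layout on the stack is an assumption made for this example.

```nasm
bits 32
; Assumed scratch layout: [esp] = 64-bit result,
; [esp+8] = saved control word, [esp+10] = truncating control word
        sub     esp,12
        fnstcw  word [esp+8]    ; save the current x87 control word
        mov     ax,[esp+8]
        or      ax,0C00h        ; RC (bits 10-11) = 11b: round toward zero
        mov     [esp+10],ax
        fldcw   word [esp+10]   ; switch to truncation
        fistp   qword [esp]     ; store integer and pop, now truncating
        fldcw   word [esp+8]    ; restore the original rounding mode
        fild    qword [esp]     ; reload the truncated value
        add     esp,12
```

FISTTP truncates regardless of the rounding mode in the control word, so it needs neither FLDCW.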

The 32-bit libraries work fine with the hard-coded opcode. So the hardware supports the instruction, it's just the 32-bit assembly that rejects it.

SOLVED: I had to specify CPU PRESCOTT instead of CPU 686!
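
A minimal sketch of the fix, assuming the CPU directive sits near the top of the 32-bit source file (where it actually lives in the FastMath sources is not shown in this thread):

```nasm
bits 32
CPU PRESCOTT                    ; was: CPU 686, which rejects SSE3-era instructions

section .text
        sub     esp,8
        fisttp  qword [esp]     ; now assembles without the cpu level error
        fild    qword [esp]
        add     esp,8
```

The CPU directive only restricts which instructions NASM will accept; it does not change the code generated for instructions that do assemble.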
Title: Re: Support for FISTTP?
Post by: dreamCoder on November 26, 2018, 11:45:21 PM: Owh, ok. Glad I mentioned Prescott ;D

But the question still remains. If it is solved by setting the minimum CPU to Prescott, then there's no apparent need to hard-code it anymore, because Prescott onwards does support the FISTTP instruction (I think). But as long as it is solved, I am happy for you.
Title: Re: Support for FISTTP?
Post by: Rookie on November 27, 2018, 12:05:53 AM: Thanks for the help!

CPU PRESCOTT allows the FISTTP QWORD[rsp] (preferred) to assemble so I was able to remove the hard-coded opcode.