NASM - The Netwide Assembler
NASM Forum => Programming with NASM => Topic started by: Rookie on November 26, 2018, 01:26:28 PM
-
Hi!
I am trying to assemble code with FISTTP (integer store truncate and pop). Apparently it was introduced with SSE3.
I am currently hard-coding
FISTTP QWORD[RAX]
as
DW 8DDh
since I get an "error: no instruction for this cpu level" message.
What CPU level do I need?
-
Without code or any background information, I can't tell you much about your problem. But I tested this on an i7 under Win10 and it works.
extern printf
section .data
x: dq 18.7653
format: db '%llu',0ah,0
section .text
global main
main:
sub rsp,40
finit
mov rax,x
fld qword[rax]
db 0xdd
db 0x08
mov rdx,[x] ;rsi
mov rcx,format ;rdi
call printf
add rsp,40
ret
But I don't understand the need to hex-encode the instruction. If the CPU is pre-Prescott, the hard-coded bytes won't run on it either. Any particular reason? Perhaps you should update your NASM.
-
There are two versions of the macro. The first is the ideal, and the second is the work-around going through rax.
;---------------------------------------------- froundz
; st0 <- st0 rounded towards zero at non-zero st1 intervals
%macro froundz 0 ; x,|n|
fdiv st0,st1 ; xi=x/|n|,|n|
sub rsp,8
fisttp qword[rsp]
fild qword[rsp] ; Trunc(xi),|n|
fmul ; Trunc(xi)|n|
add rsp,8
%endmacro
;---------------------------------------------- froundz2
; st0 <- st0 rounded towards zero at non-zero st1 intervals
%macro froundz2 0 ; x,|n|
fdiv st0,st1 ; xi=x/|n|,|n|
sub rsp,8
mov rax,rsp
;fisttp qword[rax] ; encodes as bytes DD 08 (DD /1, ModRM 08h = [rax]), i.e. the word 0x08DD
dw 8DDh ; |n|
fild qword[rax] ; Trunc(xi),|n|
fmul ; Trunc(xi)|n|
add rsp,8
%endmacro
The assembly command gives the following output:
Assemble.cmd
------------
Copyright (c) 1997 Analytical Logic
All rights reserved
Assembling project: FastMath ...
32-bit Borland Assembly ... Failed!
FastMath.32.asm:618: error: no instruction for this cpu level
FastMath.inc:796: ... from macro `froundz' defined here
32-bit Microsoft Assembly ... Failed!
FastMath.32.asm:618: error: no instruction for this cpu level
FastMath.inc:796: ... from macro `froundz' defined here
64-bit Borland Assembly ... Ok
64-bit Microsoft Assembly ... Ok
64-bit Microsoft DLL ... Ok
Moving objects:
FastMath.x64.obj ... Ok
FastMath.o ... Ok
FastMath.dll ... Ok
C:\Source\Analog\FastMath\Nasm>
Nasm is 2.12.
I forgot to mention that only the 32-bit assembly throws the error and requires the hard-coded opcode.
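For readers following along, here is a minimal sketch of how the froundz macro above might be invoked in 64-bit code. The labels x, n and round_x and the interval value 0.25 are made up for illustration and are not from FastMath. The macro expects x in st0 and |n| in st1, and leaves trunc(x/|n|)*|n| in st0 with nothing else left on the FPU stack.

```nasm
bits 64
default rel

%macro froundz 0                ; copy of the macro from the post above
    fdiv    st0, st1            ; st0 = x/|n|, st1 = |n|
    sub     rsp, 8              ; 8 bytes of scratch space
    fisttp  qword [rsp]         ; store trunc(x/|n|) and pop: st0 = |n|
    fild    qword [rsp]         ; st0 = trunc(x/|n|), st1 = |n|
    fmul                        ; no-operand form: multiply and pop
    add     rsp, 8
%endmacro

section .data
x:  dq 18.7653                  ; hypothetical input value
n:  dq 0.25                     ; hypothetical interval |n| (must be non-zero)

section .text
global round_x                  ; hypothetical entry point
round_x:
    fld     qword [n]           ; st0 = |n|
    fld     qword [x]           ; st0 = x, st1 = |n|
    froundz                     ; st0 = trunc(18.7653/0.25)*0.25 = 75*0.25 = 18.75
    fstp    qword [x]           ; write the result back; FPU stack is empty again
    ret
```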
-
Rookie, is that 64-bit addressing mode "qword[rax]" being assembled as 32-bit code? Shouldn't it be "qword[eax]" instead?
If FISTTP is not supported, then you could use one of two methods:
1) FRNDINT (rounding toward zero, which requires setting the rounding mode in the FPU control word to truncate first). But I bet you already knew this.
2) FXTRACT (leaves the exponent in ST1 and the significand in ST0). Then use that exponent to SHL and SHR the float in question. The exponent tells you how many mantissa bits belong to the integer part, counting from the 12th bit (EDIT: from the left).
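As a rough sketch of option 1: FRNDINT rounds according to the rounding-control (RC) field of the x87 control word, so truncation needs that field set to 11b (bits 10-11, mask 0C00h) beforehand and restored afterwards. The routine name trunc_frndint and the use of 4 bytes of stack scratch space are assumptions for illustration only, written here for 32-bit code.

```nasm
bits 32

section .text
global trunc_frndint            ; hypothetical name
trunc_frndint:                  ; in: st0 = value, out: st0 = value truncated toward zero
    sub     esp, 4              ; scratch: [esp] = saved CW, [esp+2] = modified CW
    fnstcw  word [esp]          ; save the current control word
    mov     ax, [esp]
    or      ax, 0C00h           ; RC = 11b: round toward zero (truncate)
    mov     [esp+2], ax
    fldcw   word [esp+2]        ; switch to truncation
    frndint                     ; st0 = integer part of st0
    fldcw   word [esp]          ; restore the caller's rounding mode
    add     esp, 4
    ret
```

The two FLDCW instructions are the extra cost compared with FISTTP, which always truncates regardless of the control word.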
-
FastMath.inc is written purely in 64-bit.
FastMath.32.asm handles the 32-bit-specific prologue/epilogue and maps rax -> eax etc (so the code maps to FISTTP QWORD[EAX]).
The library expressly avoids FXTRACT(13), FPREM(16-64) and FSCALE(20-31) because of their dismal performance. FRNDINT(9-20) is out of the question because of the required control word change (and it's slow - worst-case FISTP/FILD with read-after-write penalty is as fast).
The 32-bit libraries work fine with the hard-coded opcode. So the hardware supports the instruction, it's just the 32-bit assembly that rejects it.
SOLVED: I had to specify CPU PRESCOTT instead of CPU 686!
-
Owh, ok. Glad I mentioned Prescott ;D
But the question still remains. If it is solved by setting the minimum CPU to Prescott, then there's no apparent need to hard-code the opcode anymore, because Prescott onwards does support the FISTTP instruction (I think). But as long as it is solved, I am happy for you.
-
Thanks for the help!
CPU PRESCOTT allows the FISTTP QWORD[rsp] (preferred) to assemble so I was able to remove the hard-coded opcode.
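For anyone landing here with the same error, a minimal sketch of the fix in a 32-bit source file. The label trunc_st0 is made up for illustration, and where exactly the CPU directive lives in the FastMath sources is not shown in this thread. The point is only that the CPU level in effect must be PRESCOTT or higher wherever FISTTP is assembled.

```nasm
bits 32
cpu prescott                    ; with "cpu 686" here, NASM rejects FISTTP:
                                ; "error: no instruction for this cpu level"

section .text
global trunc_st0                ; hypothetical name
trunc_st0:                      ; in: st0 = value, out: st0 = value truncated toward zero
    sub     esp, 8              ; 8 bytes of scratch space
    fisttp  qword [esp]         ; SSE3-era instruction: store truncated integer and pop
    fild    qword [esp]         ; reload it as a float
    add     esp, 8
    ret
```

Note that the CPU directive only controls what the assembler accepts. The resulting code still needs an SSE3-capable processor at run time, exactly as the hard-coded bytes did.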