### Author Topic: 80x87 calculations are more "precise" than SSE/AVX?

#### fredericopissarra

• Full Member
• Posts: 373
##### 80x87 calculations are more "precise" than SSE/AVX?
« on: June 25, 2023, 02:46:08 PM »

The x87 FPU (fp87), by default, works in extended precision, which has 64 bits of precision. IEEE-754 describes three formats that matter here: single precision (24 bits of precision), double precision (53 bits) and extended precision (64 bits in the x87's 80-bit format). SSE/AVX arithmetic deals only with the first two. The x87 works in the third all the time (by default; you can change that through the precision-control field of the FPU control word). When you load a single-precision value, it is converted to extended precision internally. The same goes for double precision. And when you store a value in single or double precision, a conversion (with rounding) happens too.

This is useful. Take a look at these two routines in ASM:
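```nasm
; one.asm
  bits  64
  default rel

  section .text

  global oneF87
  global oneSSE2

  align 4
oneF87:
  fld   qword [tenth]       ; load double 0.1, widened to extended precision
  mov   ecx,9
.loop:
  fadd  qword [tenth]       ; accumulate in extended precision
  dec   ecx
  jnz   .loop
  fstp  qword [rsp-8]       ; round to double; [rsp-8] is in the SysV red zone
  movsd xmm0,[rsp-8]        ; return value goes in xmm0
  ret

oneSSE2:
  movsd xmm0,[tenth]        ; load double 0.1
  mov   ecx,9
.loop:
  addsd xmm0,[tenth]        ; accumulate in double precision
  dec   ecx
  jnz   .loop
  ret

  section .rodata
tenth:
  dq    0.1
```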
```c
// test.c
#include <stdio.h>

extern double oneF87( void );
extern double oneSSE2( void );

int main( void )
{
  static const char *yesno[] = { "no", "yes" };
  double a, b;

  a = oneF87();
  b = oneSSE2();

  printf ( "oneF87()  == 1.0? [%s]\n"
           "oneSSE2() == 1.0? [%s]\n",
           yesno[ a == 1.0 ], yesno[ b == 1.0 ] );
}
```
```
$ nasm -felf64 -o one.o one.asm
$ cc -O2 -c -o test.o test.c
$ cc -s -o test test.o one.o
$ ./test
oneF87()  == 1.0? [yes]
oneSSE2() == 1.0? [no]
```

What's going on? Why does the fp87 code say the sum of ten 0.1 values is exactly 1.0, while the SSE code, which does the same thing, says it isn't!?
That's because the fp87 code does its operations in extended precision. The result isn't 1.0, but when it is converted to double the error is rounded away (by coincidence). SSE2 deals, here, with scalar doubles directly (this is explicit in the 'sd' suffix of the instructions).

Strangely, fp87 offers you a FALSE answer: 0.1 has no exact binary representation, so summing ten double 0.1 values cannot give exactly 1.0.

This doesn't mean you can't take advantage of the greater precision. Rounded directly to extended precision, 0.1 is 0.1000000000000000000013552527156068805425093160010874271392822265625 in decimal; the routine above, however, loads the double 0.1, and that conversion to extended precision is exact. Double precision has roughly 16 significant decimal digits and extended precision about 19, so the sum calculated in extended precision looks like 1.0000000000000000xxx, where xxx is the accumulated error. That error sits beyond what a double can hold, so it is rounded away and the stored double is 1.0. But notice: this isn't the "real" floating-point result of adding 0.1 (double) 10 times in double precision.

The actual final values are: 0.99999999999999988897769753748434595763683319091796875 (double) and 1.000000000000000055511151231257827021181583404541015625 (long double). The latter is exactly 1 + 2^-54, a quarter of a double-precision ulp above 1.0, which is why it rounds to 1.0 when stored as a double.

So, yes... fp87 offers better precision for calculations, but beware! This is not a panacea for better results.
« Last Edit: June 25, 2023, 02:47:42 PM by fredericopissarra »

#### munair

• Jr. Member
• Posts: 37
• SharpBASIC compiler developer
##### Re: 80x87 calculations are more "precise" than SSE/AVX?
« Reply #1 on: July 20, 2023, 07:00:27 AM »
Thank you for sharing! I still have to add support for real numbers to the SharpBASIC compiler and there is a lot to consider such as the default instruction set. Your information is helpful.
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.