I am writing SIMD functions and want each one to use the highest SIMD instruction set available on the host CPU. So at the very beginning of each function I jump to a different block depending on the CPU (.SSE, .SSE2, .AVX, etc.). The problem is that it is hard for me to remember which SSE version introduced a given SIMD instruction, so I frequently use an instruction in a block that does not support it. For example, I may write:
.SSE2:
ADDSUBPD XMM0, XMM1
…
.SSE3:
…
This assembles and links without any problem, but the program will crash on computers that don't have SSE3, because ADDSUBPD is an SSE3 instruction.
To prevent this I use the CPU directive to specify the instruction set for each block:
.SSE2:
CPU Willamette
ADDSUBPD XMM0, XMM1
…
.SSE3:
CPU Prescott
…
This causes NASM to issue an error on ADDSUBPD. Great! But then I have two problems:
1) CPU only accepts CPU IDs up to Prescott. When I try Penryn (SSE4.1), Nehalem (SSE4.2) or SANDYBRIDGE (AVX), they are rejected (Unknown CPU type). In addition, I noticed that the AVX2 instructions are all tagged with AVX2 but have no CPU ID.
2) How can I switch back to allowing all instructions? Neither a bare "CPU" nor "CPU all" works.
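To make the setup concrete, here is a minimal sketch of the layout I am describing. It assumes 32-bit code and a hypothetical cpu_level variable that is filled in elsewhere (for example from CPUID). Both cpu_level and simd_func are names made up for illustration, and the 0/1/2 encoding is an assumption too.

```nasm
; Sketch only: 32-bit NASM, dispatch on a hypothetical cpu_level variable.
; cpu_level and simd_func are illustrative names, not part of any library.
; Assumed encoding: 0 = SSE, 1 = SSE2, 2 or higher = SSE3.
        BITS 32

        SECTION .data
cpu_level:      DD 0            ; assumed to be set once at startup

        SECTION .text
        GLOBAL simd_func
simd_func:
        MOV     EAX, [cpu_level]
        CMP     EAX, 2
        JAE     .SSE3
        CMP     EAX, 1
        JE      .SSE2
        ; otherwise fall through to the SSE block
.SSE:
        CPU Katmai              ; SSE only
        ADDPS   XMM0, XMM1
        RET
.SSE2:
        CPU Willamette          ; adds SSE2
        ADDPD   XMM0, XMM1
        RET
.SSE3:
        CPU Prescott            ; adds SSE3
        ADDSUBPD XMM0, XMM1     ; accepted here, rejected in the blocks above
        RET
; Everything after this point is still limited to Prescott,
; which is why I need a way to lift the restriction again (problem 2).
```

Assembling this with something like nasm -f elf32 should go through as written. Moving ADDSUBPD up into the .SSE2 block should bring back the error I mentioned.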
As a suggestion, wouldn't it be nice to let us set the highest SIMD instruction set, or to turn a particular set of instructions on or off? Like
InstructionSet AVX2 off
InstructionSet AES off
Thank you for any suggestion and help!