Author Topic: Suboptimal code generation?

Offline EAirPeter

  • New Member
  • Posts: 1
Suboptimal code generation?
« on: November 18, 2020, 05:18:37 AM »
I found it a little weird that nasm generates "FF 14 25 xx xx xx xx" for
Code:
call qword [__imp_SomeExternalFunction]
whereas cl (from MSVC) emits "FF 15 xx xx xx xx", which is one byte shorter than what nasm generates and is what I expected.

I looked them up in Intel's manual and found that "14 25" is the "[scaled_index]+disp32" form with scaled_index = 0, while "15" stands for just "+disp32". I believe they are semantically the same. Intuitively, the shorter form should be better, so I am curious why nasm generates the longer one.

Now the question is: is the 1-byte-longer instruction generated intentionally, or is it just a defect?

Another question: how can I force nasm to generate the 2-byte version of the indirect near call?
I know I can do this with "db ff 15 xx xx xx xx", but what I want is a more general solution, such as telling nasm to choose a specific form of an instruction (opcode, addressing mode, etc.).

Note: the example was obtained on Windows 10 x64, with nasm targeting win64.

Offline fredericopissarra

  • Full Member
  • Posts: 102
  • Country: br
Re: Suboptimal code generation?
« Reply #1 on: November 19, 2020, 03:30:14 PM »
Add 'default rel' at the beginning of your code. This makes NASM use RIP-relative addressing (the default for C/C++ compilers in x86-64 mode), and the opcode will be what you expect. Or:

Code:
call qword [rel __imp_SomeExternalFunction]
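
For context, the two encodings are not quite equivalent in 64-bit mode: ModRM byte "15" (mod=00, rm=101) means RIP+disp32, while "14 25" (ModRM plus a SIB byte with no base and no index) means an absolute disp32. A rough sketch showing both forms side by side follows. The label `demo` is made up for illustration, the byte comments assume a win64 target, and `abs` is used only to show how to get the longer form back once 'default rel' is in effect:

```nasm
; Sketch only: assemble with `nasm -f win64 demo.asm`.
; __imp_SomeExternalFunction is the import symbol from the question;
; `demo` is a hypothetical label used just for this example.
default rel                     ; base-less memory operands become RIP-relative

extern __imp_SomeExternalFunction

section .text
global demo
demo:
    sub  rsp, 40                                  ; shadow space plus stack alignment
    call qword [__imp_SomeExternalFunction]       ; FF 15 xx xx xx xx     (RIP-relative, 6 bytes)
    call qword [abs __imp_SomeExternalFunction]   ; FF 14 25 xx xx xx xx  (absolute disp32, 7 bytes)
    add  rsp, 40
    ret
```

Without 'default rel', the first call would assemble to the 7-byte absolute form unless it is written with an explicit `rel`, as in the line above.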