Author Topic: LEA or MOV  (Read 7902 times)

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
LEA or MOV
« on: November 14, 2021, 10:37:04 AM »
I'm fairly new to NASM. I began using it more than a year ago for a 32bits compiler (sharpbasic.com).

Going over some 32 bits code, I found two ways to get the address of a string to be printed, i.e. both give the same output. What I used so far is:

Code: [Select]
  ' address of string buffer
  emittl("mov     ebx, _sb_buf11")
  ' convert integer to string (expect address in ebx)
  emittl("call    _sb_intstr")
  ' print
  emittl("call    _sb_print")

But to obtain the address of the string buffer LEA works as well:

Code: [Select]
emittl("lea     ebx, dword [_sb_buf11]")

Which is the better approach?
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 368
  • Country: br
Re: LEA or MOV
« Reply #1 on: November 14, 2021, 03:39:43 PM »
It depends on the mode. In x86-64 mode LEA is smaller and with less colateral effects then MOV to obtain the address of an object in .data, .bss or .rodata sections because uses RIP relative addresses. This:
Code: [Select]
  mov rax,x   ; puts the address of x in RAX
  lea  rax,[x]  ; same thing...
Both instructions have REX prefix because of their target (RAX), but the linear address of 'x' must be known in runtime and, since it is a 64 bits value, `mov rax,x` is encoded as a 10 bytes instruction (48 B8 for `mov eax`, plus 8 bytes for the address). 'LEA' is different here... '[ x ]' is usually encoded as a RIP relative address, so only 32 bits of the offset is encoded in the instruction and, since the address is relative to the address of the next instruction, no relocation is needed (the 'x' in MOV is encoded as a constant that must be added to the program base address, which is known only after the image is loaded... consider ASLR, for example!). 'LEA', here, is encoded only in 7 bytes, putting less pressure on L1I cache and internal reordering buffer of the processor (and prefetch queue).

So... LEA is preferable in x86-64 mode to get the address of static objects. In i386 mode (32 bits) it makes no difference.
« Last Edit: November 14, 2021, 08:35:12 PM by fredericopissarra »

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #2 on: November 14, 2021, 05:54:40 PM »
Thanks a lot for your explanation fredericopissarra. It is very helpful.

BTW, I realized I started this topic in the wrong place. Any moderator is welcome to put it in the right place. ;)
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 368
  • Country: br
Re: LEA or MOV
« Reply #3 on: November 14, 2021, 11:10:14 PM »
Another tip: In i386 mode avoid using EBP as stack "base" pointer. This practice is common in real mode (16 bits) because only BP or BX can be used as base pointers in an effective address calculation. In 386 protected mode any register (but not EBP ou EFLAGS) can be used, so to access objects on the stack you can use ESP directly.

In i386 mode EAX, EBX, ECX, EDX, ESI, EDI and EBP can be used (depending on the calling convention). If you use EBP as a substitute to ESP than you ended with only 6 GPRs available for your routines, instead of 7.

So, the prologue:
Code: [Select]
  push ebp
  mov ebp,esp
and the epilogue:
Code: [Select]
  pop ebp
  ret
Can be (and should be) avoided.

Let's say you build a function taking two integers. Like this, in C:
Code: [Select]
int f(int a, int b) { return a + b; }Instead of doing:
Code: [Select]
f:
  push ebp
  mov  ebp,esp
  mov  eax,[ebp+8]
  add  eax,[ebp+12]
  pop  ebp
  ret
You can write:
Code: [Select]
f:
  mov eax,[esp+4]
  add eax,[esp+8]
  ret
« Last Edit: November 14, 2021, 11:14:29 PM by fredericopissarra »

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #4 on: November 15, 2021, 06:10:25 AM »
That is a really great tip! And I should take that to the 64 bits version later on as well, I guess (I'm certainly not there yet).
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #5 on: November 15, 2021, 07:50:29 AM »
Instead of doing:
Code: [Select]
f:
  push ebp
  mov  ebp,esp
  mov  eax,[ebp+8]
  add  eax,[ebp+12]
  pop  ebp
  ret
You can write:
Code: [Select]
f:
  mov eax,[esp+4]
  add eax,[esp+8]
  ret

But while the stack base pointer is used to address parameters, the stack pointer is adjusted for local variables:

Code: [Select]
  emittl("push    ebp")
  emittl("mov     ebp, esp")
  if n > 0 then // number of local variables
    emittl("sub     esp, " + str(n * 4))
  end if
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 368
  • Country: br
Re: LEA or MOV
« Reply #6 on: November 15, 2021, 01:58:39 PM »
But while the stack base pointer is used to address parameters, the stack pointer is adjusted for local variables:

Code: [Select]
  emittl("push    ebp")
  emittl("mov     ebp, esp")
  if n > 0 then // number of local variables
    emittl("sub     esp, " + str(n * 4))
  end if
There is no need... You can, still, allocate local objects manipulating ESP directly. Suppose you have a function that needs a local DWORD. You can do:
Code: [Select]
struc fstk
.localvar:  resd 1   ; local var.
.localsize:
.retaddr:  resd 1   ; return address.
.arg1: resd 1    ; First argument.
endstruc

f:
  ; allocate local space on stack.
  sub esp,fstk.localsize

  mov eax,[esp+fstk.arg1]  ; gets first arg
  ...
  mov [esp+fstk.localvar],eax ; store in local var (.localvar is 0).

  ; deallocate local space on stack.
  add esp,fstk.locaosize

  ret

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #7 on: November 15, 2021, 09:53:50 PM »
I understand that there is no need (actually there is, see my next post). BUT, the nice thing IMO about setting up a stack frame is that parameters are EBP+ while local variables are EBP- (with EBP-4 as the function result). For a compiler emitting asm code this looks easier to me and it's also easier address calculation lowering the stack pointer by [the number of local variables] * 4. To illustrate what I mean, have a look at the stack frame on this cheat sheet: https://www.cs.uaf.edu/2006/fall/cs301/support/x86/
« Last Edit: November 17, 2021, 09:12:56 AM by munair »
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 368
  • Country: br
Re: LEA or MOV
« Reply #8 on: November 16, 2021, 09:14:21 PM »
Did you notice using a structure you don't need to remember where is the arguments or local stack allocated objects?

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #9 on: November 17, 2021, 09:11:34 AM »
Did you notice using a structure you don't need to remember where is the arguments or local stack allocated objects?

It's not to remember but to make life simple. Giving your proposal of using the stack pointer without stack frame some thought, I think it is not a good idea for the simple reason that the stack pointer may change during the execution of a function. Suppose the code within a function calls another function and (local) variables are pushed on the stack to pass them as parameters. How do you access local variables after the stack pointer has changed? This is where the base pointer comes in. No matter what happens to the stack pointer, the base pointer makes sure that the offsets to local variables and parameters don't change.
« Last Edit: November 17, 2021, 09:13:13 AM by munair »
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 368
  • Country: br
Re: LEA or MOV
« Reply #10 on: November 17, 2021, 10:36:35 AM »
I think it is not a good idea for the simple reason that the stack pointer may change during the execution of a function. Suppose the code within a function calls another function and (local) variables are pushed on the stack to pass them as parameters. How do you access local variables after the stack pointer has changed?

That's why compilers like C keeps track of (E|R)SP... Supose you have a funcion f() calling a function g(), each one with one argument, using i386 C callng convention:
Code: [Select]
int g(int x) { return x + x; }
int f(int x) { return g(x)+1; }
The generated code is something like this (without using EBP):
Code: [Select]
g:
  mov eax,[esp+4]
  add  eax,eax
  ret

f:
  mov eax,[esp+4]
  push eax
  call g
  add esp,4  ; stack cleanup. keeps in a known position.
  inc eax
  ret

Using the struct approach you'll always sure where arguments and local vars are for a function and don't need to remember the offsets. All you have to do is to pop the pushed argumentos from the stack after the function returns (or before...).

In some compilers (PASCAL, for instance) the responsability for stack cleanup is on the called function ("ret" accepts an argument for that)... The same come, in pascal:
Code: [Select]
g:
  mov eax,[esp+4]
  add  eax,eax
  ret 4

f:
  mov eax,[esp+4]
  push eax
  call g
  inc eax
  ret 4

The point is: Why use prologue/epilogue nowadays? In the old pre-386 processors if you wanted to access data on stack you had two options only: Using POP and using EBP as base pointer in an effective address. It was not possible to use registers other than BP ou BX as base pointer, and other registers than SI or DI as index (and there were no 'scale'), so using BP was mandatory. After 386 this is not the case anymore.
« Last Edit: November 17, 2021, 10:46:00 AM by fredericopissarra »

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #11 on: November 17, 2021, 11:42:14 AM »
What is the difference between the struct approach and the stack frame approach other than keeping R/EBP free? Maybe I misunderstand. I have seen examples of GCC (32bits) output producing stack frames, so it still seems a legitimate technique.

Currently in the SharpBASIC compiler the stack cleanup is done after the function call, as it was also messed up with the push instructions before the call. Seems more logical to me, but it's a matter of opinion. Here is a simple, stupid example I used to test function calls in the expression parser:

Code: [Select]
' SharpBASIC function
' -------------------
incl "lib/sys.sbi";

decl func five(n: int8): int8;

dim sum: int8;

main do
  sum = five(5 * 5) + five(5 + 5);
  print sum;
end;

func five(n: int8): int8
do
  five = n;
end;

Generated asm code (without any code optimizations):

Code: [Select]

SECTION .text

global  _start
global  _end

_start:
movsx   eax, byte [_C3]
push    eax
movsx   eax, byte [_C3]
pop     edx
imul    edx
push    eax

call    _I26

add     esp, 4
push    eax
movsx   eax, byte [_C3]
push    eax
movsx   eax, byte [_C3]
pop     edx
add     eax, edx
push    eax

call    _I26

add     esp, 4
pop     edx
add     eax, edx
; save sum
mov     [_I27], al
; load sum
movsx   eax, byte [_I27]
; print int
mov     ebx, _sb_buf12
call    _sb_intstr
call    _sb_print
call    _sb_printlf

_end:
mov     ebx, 0
mov     eax, 1
int     80h

_I26:
push    ebp
mov     ebp, esp
sub     esp, 4
; init func five
mov     byte [ebp - 4], 0
; load n
movsx   eax, byte [ebp + 8]
; save func result five
mov     [ebp - 4], al
._L0:
; load func result five
movsx   eax, byte [ebp - 4]
mov     esp, ebp
pop     ebp
ret

extern  _sb_intstr
extern  _sb_print
extern  _sb_printlf
extern  _sb_buf12

SECTION .rodata

_C3                 db 5

SECTION .bss

; define sum
_I27                resb 1

When you say Pascal compiler, which compiler do you mean? There are several of them, both commercial and free. Same goes for C compilers.
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 368
  • Country: br
Re: LEA or MOV
« Reply #12 on: November 17, 2021, 12:23:22 PM »
What is the difference between the struct approach and the stack frame approach other than keeping R/EBP free? Maybe I misunderstand. I have seen examples of GCC (32bits) output producing stack frames, so it still seems a legitimate technique.

Smaller and fastest code. Try to use -fomit-frame-pointer and -O2 options on GCC...

Quote
When you say Pascal compiler, which compiler do you mean? There are several of them, both commercial and free. Same goes for C compilers.
Turbo Pascal, Free Pascal and Delphi... And the old "pascal" calling convention used by C compilers back in the 90's, 2000's: Turbo C, Borland C++, MSC6, ...

Offline munair

  • Jr. Member
  • *
  • Posts: 37
  • Country: nl
  • SharpBASIC compiler developer
    • SharpBASIC
Re: LEA or MOV
« Reply #13 on: November 17, 2021, 01:31:38 PM »
When I get to optimization options for the compiler, fomitting the frame pointer will probably be one of them. Thanks again for the suggestion.
SharpBASIC (www.sharpbasic.com) is a compiler in development that uses NASM as backend.