NASM - The Netwide Assembler

NASM Forum => Using NASM => Topic started by: munair on November 14, 2021, 10:37:04 AM

Title: LEA or MOV
Post by: munair on November 14, 2021, 10:37:04 AM: I'm fairly new to NASM. I began using it more than a year ago for a 32-bit compiler (sharpbasic.com).

Going over some 32-bit code, I found two ways to get the address of a string to be printed, i.e. both give the same output. What I have used so far is:

Code: [Select]
' address of string buffer
emittl("mov ebx, _sb_buf11")
' convert integer to string (expect address in ebx)
emittl("call _sb_intstr")
' print
emittl("call _sb_print")
But to obtain the address of the string buffer LEA works as well:

Code: [Select]
emittl("lea ebx, dword [_sb_buf11]")
Which is the better approach?
Title: Re: LEA or MOV
Post by: fredericopissarra on November 14, 2021, 03:39:43 PM: It depends on the mode. In x86-64 mode LEA is smaller and has fewer side effects than MOV for obtaining the address of an object in the .data, .bss or .rodata sections, because it can use RIP-relative addressing. This:
Code: [Select]
mov rax,x ; puts the address of x in RAX
lea rax,[x] ; same thing...
Both instructions have a REX prefix because of their target (RAX), but the linear address of 'x' must be known at run time and, since it is a 64-bit value, `mov rax,x` is encoded as a 10-byte instruction (48 B8 for `mov rax`, plus 8 bytes for the address). LEA is different here: '[x]' is usually encoded as a RIP-relative address (in NASM, `default rel` or an explicit `[rel x]` guarantees this), so only a 32-bit offset is encoded in the instruction and, since that offset is relative to the address of the next instruction, no load-time relocation is needed. The 'x' in MOV, by contrast, is encoded as a constant that must be adjusted by the program base address, which is known only after the image is loaded (consider ASLR, for example!). LEA, here, is encoded in only 7 bytes, putting less pressure on the L1I cache, the prefetch queue and the processor's internal reorder buffer.

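So... LEA is preferable in x86-64 mode to get the address of static objects. In i386 mode (32 bits) it makes practically no difference: both forms embed the absolute 32-bit address, and MOV with an immediate is one byte shorter (5 bytes versus 6).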
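As a rough illustration of the size difference described above, here is a minimal sketch for 64-bit NASM. The symbol `x`, the labels `get_x_mov` and `get_x_lea`, and the file name `addr.asm` are made up for this example; `default rel` is used so that `[x]` is assembled as a RIP-relative operand.

```nasm
; Hypothetical file addr.asm; assemble with: nasm -f elf64 addr.asm
bits 64
default rel                     ; make [x] RIP-relative by default

section .data
x:      dq 0                    ; some static object (name is an assumption)

section .text
global get_x_mov
global get_x_lea

get_x_mov:
        mov     rax, x          ; 48 B8 + imm64: 10 bytes, absolute 64-bit address
        ret

get_x_lea:
        lea     rax, [x]        ; 48 8D 05 + disp32: 7 bytes, RIP-relative
        ret
```

Assembling with a listing file (`nasm -f elf64 -l addr.lst addr.asm`) should show the two encodings side by side.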
Title: Re: LEA or MOV
Post by: munair on November 14, 2021, 05:54:40 PM: Thanks a lot for your explanation fredericopissarra. It is very helpful.

BTW, I realized I started this topic in the wrong place. Any moderator is welcome to put it in the right place. ;)
Title: Re: LEA or MOV
Post by: fredericopissarra on November 14, 2021, 11:10:14 PM: Another tip: In i386 mode avoid using EBP as stack "base" pointer. This practice is common in real mode (16 bits) because only BP or BX can be used as base pointers in an effective address calculation. In 386 protected mode any register (but not EIP or EFLAGS) can be used, so to access objects on the stack you can use ESP directly.

In i386 mode EAX, EBX, ECX, EDX, ESI, EDI and EBP can be used (depending on the calling convention). If you use EBP as a substitute for ESP then you end up with only 6 GPRs available for your routines, instead of 7.

So, the prologue:
Code: [Select]
push ebp
mov ebp,esp
and the epilogue:
Code: [Select]
pop ebp
ret
can be (and should be) avoided.

Let's say you build a function taking two integers. Like this, in C:
Code: [Select]
int f(int a, int b) { return a + b; }
Instead of doing:
Code: [Select]
f:
  push ebp
  mov ebp,esp
  mov eax,[ebp+8]
  add eax,[ebp+12]
  pop ebp
  ret
You can write:
Code: [Select]
f:
  mov eax,[esp+4]
  add eax,[esp+8]
  ret
Title: Re: LEA or MOV
Post by: munair on November 15, 2021, 06:10:25 AM: That is a really great tip! And I should take that to the 64 bits version later on as well, I guess (I'm certainly not there yet).
Title: Re: LEA or MOV
Post by: munair on November 15, 2021, 07:50:29 AM: Quote from: fredericopissarra on November 14, 2021, 11:10:14 PM
Instead of doing:
Code: [Select]
f:
  push ebp
  mov ebp,esp
  mov eax,[ebp+8]
  add eax,[ebp+12]
  pop ebp
  ret
You can write:
Code: [Select]
f:
  mov eax,[esp+4]
  add eax,[esp+8]
  ret

But while the stack base pointer is used to address parameters, the stack pointer is adjusted for local variables:

Code: [Select]
emittl("push ebp")
emittl("mov ebp, esp")
if n > 0 then // number of local variables
  emittl("sub esp, " + str(n * 4))
end if
Title: Re: LEA or MOV
Post by: fredericopissarra on November 15, 2021, 01:58:39 PM: Quote from: munair on November 15, 2021, 07:50:29 AM
But while the stack base pointer is used to address parameters, the stack pointer is adjusted for local variables:

Code: [Select]
emittl("push ebp")
emittl("mov ebp, esp")
if n > 0 then // number of local variables
  emittl("sub esp, " + str(n * 4))
end if
There is no need... You can still allocate local objects by manipulating ESP directly. Suppose you have a function that needs a local DWORD. You can do:
Code: [Select]
struc fstk
  .localvar: resd 1 ; local var.
  .localsize:
  .retaddr: resd 1 ; return address.
  .arg1: resd 1 ; First argument.
endstruc

f:
  ; allocate local space on stack.
  sub esp,fstk.localsize
  mov eax,[esp+fstk.arg1] ; gets first arg
  ...
  mov [esp+fstk.localvar],eax ; store in local var (.localvar is 0).
  ; deallocate local space on stack.
  add esp,fstk.localsize
  ret
Title: Re: LEA or MOV
Post by: munair on November 15, 2021, 09:53:50 PM: I understand that there is no need (actually there is, see my next post). BUT, the nice thing IMO about setting up a stack frame is that parameters are at EBP+ while local variables are at EBP- (with EBP-4 as the function result). For a compiler emitting asm code this looks easier to me, and the address calculation is also easier: the stack pointer is simply lowered by [the number of local variables] * 4. To illustrate what I mean, have a look at the stack frame on this cheat sheet: https://www.cs.uaf.edu/2006/fall/cs301/support/x86/
Title: Re: LEA or MOV
Post by: fredericopissarra on November 16, 2021, 09:14:21 PM: Did you notice that, using a structure, you don't need to remember where the arguments or the locally allocated stack objects are?
Title: Re: LEA or MOV
Post by: munair on November 17, 2021, 09:11:34 AM: Quote from: fredericopissarra on November 16, 2021, 09:14:21 PM
Did you notice that, using a structure, you don't need to remember where the arguments or the locally allocated stack objects are?

It's not to remember but to make life simple. Giving your proposal of using the stack pointer without stack frame some thought, I think it is not a good idea for the simple reason that the stack pointer may change during the execution of a function. Suppose the code within a function calls another function and (local) variables are pushed on the stack to pass them as parameters. How do you access local variables after the stack pointer has changed? This is where the base pointer comes in. No matter what happens to the stack pointer, the base pointer makes sure that the offsets to local variables and parameters don't change.
Title: Re: LEA or MOV
Post by: fredericopissarra on November 17, 2021, 10:36:35 AM: Quote from: munair on November 17, 2021, 09:11:34 AM
I think it is not a good idea for the simple reason that the stack pointer may change during the execution of a function. Suppose the code within a function calls another function and (local) variables are pushed on the stack to pass them as parameters. How do you access local variables after the stack pointer has changed?

That's why compilers, such as C compilers, keep track of (E|R)SP... Suppose you have a function f() calling a function g(), each one with one argument, using the i386 C calling convention:
Code: [Select]
int g(int x) { return x + x; }
int f(int x) { return g(x)+1; }
The generated code is something like this (without using EBP):
Code: [Select]
g:
  mov eax,[esp+4]
  add eax,eax
  ret

f:
  mov eax,[esp+4]
  push eax
  call g
  add esp,4 ; stack cleanup: keeps ESP at a known position.
  inc eax
  ret
Using the struct approach you'll always be sure where the arguments and local vars of a function are, and you don't need to remember the offsets. All you have to do is pop the pushed arguments from the stack after the function returns (or before...).
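To connect this with the earlier struc example, here is a sketch of f() keeping one local variable while calling g(), still without EBP. The struc name `fstk`, its field names and the temporary local are assumptions made for illustration only. The point is that the struc offsets are valid again as soon as the pushed argument has been removed; between the `push` and the cleanup, every ESP-relative offset is simply 4 larger.

```nasm
; Sketch for: nasm -f elf32
bits 32

struc fstk                      ; layout of f's stack after "sub esp"
  .local:    resd 1             ; [esp+0]  local variable (hypothetical)
  .localsize:                   ; number of bytes of locals (4)
  .retaddr:  resd 1             ; [esp+4]  return address
  .x:        resd 1             ; [esp+8]  first argument
endstruc

section .text

g:                              ; int g(int x) { return x + x; }
        mov     eax, [esp+4]
        add     eax, eax
        ret

f:                              ; int f(int x) { int t = g(x); return t + 1; }
        sub     esp, fstk.localsize     ; allocate the local
        mov     eax, [esp+fstk.x]       ; load argument x
        push    eax                     ; ESP is now 4 lower than the struc assumes
        call    g
        add     esp, 4                  ; cleanup: struc offsets are valid again
        mov     [esp+fstk.local], eax   ; t = g(x)
        mov     eax, [esp+fstk.local]
        inc     eax                     ; return t + 1
        add     esp, fstk.localsize     ; release the local
        ret
```

A compiler emitting this kind of code has to track how many bytes are currently pushed and add that amount to each ESP-relative offset it emits.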

In some compilers (Pascal, for instance) the responsibility for stack cleanup lies with the called function ("ret" accepts an operand for that)... The same code, using the pascal convention:
Code: [Select]
g:
  mov eax,[esp+4]
  add eax,eax
  ret 4

f:
  mov eax,[esp+4]
  push eax
  call g
  inc eax
  ret 4
The point is: why use a prologue/epilogue nowadays? On the old pre-386 processors, if you wanted to access data on the stack you had only two options: using POP, or using BP as the base pointer in an effective address. It was not possible to use registers other than BP or BX as base pointer, or registers other than SI or DI as index (and there was no 'scale'), so using BP was mandatory. Since the 386 this is no longer the case.
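The addressing-mode restriction mentioned above can be checked directly with NASM. This is only a small sketch: the file name `modes.asm`, the label names and the offsets are arbitrary, and the commented-out line is one that NASM is expected to reject as an invalid effective address.

```nasm
; Hypothetical file modes.asm; assemble with: nasm -f bin modes.asm
bits 16
real16:
        mov     ax, [bp+4]          ; valid: BP is a legal 16-bit base register
        mov     ax, [bx+si+4]       ; valid: base BX plus index SI, no scale
        ; mov   ax, [sp+4]          ; not encodable: SP cannot be a 16-bit base

bits 32
prot32:
        mov     eax, [esp+4]        ; valid: ESP as base (encoded with a SIB byte)
        mov     eax, [ecx+edx*4+8]  ; valid: any GPR as base, plus a scaled index
```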
Title: Re: LEA or MOV
Post by: munair on November 17, 2021, 11:42:14 AM: What is the difference between the struct approach and the stack frame approach other than keeping R/EBP free? Maybe I misunderstand. I have seen examples of GCC (32bits) output producing stack frames, so it still seems a legitimate technique.

Currently in the SharpBASIC compiler the stack cleanup is done after the function call, since the stack was also changed by the push instructions before the call. That seems more logical to me, but it's a matter of opinion. Here is a simple, stupid example I used to test function calls in the expression parser:

Code: [Select]
' SharpBASIC function
' -------------------
incl "lib/sys.sbi";

decl func five(n: int8): int8;

dim sum: int8;

main do
  sum = five(5 * 5) + five(5 + 5);
  print sum;
end;

func five(n: int8): int8
do
  five = n;
end;
Generated asm code (without any code optimizations):

Code: [Select]
SECTION .text

global _start
global _end

_start:
  movsx eax, byte [_C3]
  push eax
  movsx eax, byte [_C3]
  pop edx
  imul edx
  push eax
  call _I26
  add esp, 4
  push eax
  movsx eax, byte [_C3]
  push eax
  movsx eax, byte [_C3]
  pop edx
  add eax, edx
  push eax
  call _I26
  add esp, 4
  pop edx
  add eax, edx
  ; save sum
  mov [_I27], al
  ; load sum
  movsx eax, byte [_I27]
  ; print int
  mov ebx, _sb_buf12
  call _sb_intstr
  call _sb_print
  call _sb_printlf

_end:
  mov ebx, 0
  mov eax, 1
  int 80h

_I26:
  push ebp
  mov ebp, esp
  sub esp, 4
  ; init func five
  mov byte [ebp - 4], 0
  ; load n
  movsx eax, byte [ebp + 8]
  ; save func result five
  mov [ebp - 4], al
._L0:
  ; load func result five
  movsx eax, byte [ebp - 4]
  mov esp, ebp
  pop ebp
  ret

extern _sb_intstr
extern _sb_print
extern _sb_printlf
extern _sb_buf12

SECTION .rodata
_C3 db 5

SECTION .bss
; define sum
_I27 resb 1
When you say Pascal compiler, which compiler do you mean? There are several of them, both commercial and free. Same goes for C compilers.
Title: Re: LEA or MOV
Post by: fredericopissarra on November 17, 2021, 12:23:22 PM: Quote from: munair on November 17, 2021, 11:42:14 AM
What is the difference between the struct approach and the stack frame approach other than keeping R/EBP free? Maybe I misunderstand. I have seen examples of GCC (32bits) output producing stack frames, so it still seems a legitimate technique.

Smaller and faster code. Try the -fomit-frame-pointer and -O2 options in GCC...

Quote
When you say Pascal compiler, which compiler do you mean? There are several of them, both commercial and free. Same goes for C compilers.
Turbo Pascal, Free Pascal and Delphi... And the old "pascal" calling convention used by C compilers back in the 90's, 2000's: Turbo C, Borland C++, MSC6, ...
Title: Re: LEA or MOV
Post by: munair on November 17, 2021, 01:31:38 PM: When I get to optimization options for the compiler, fomitting the frame pointer will probably be one of them. Thanks again for the suggestion.