Author Topic: Need Information.  (Read 6606 times)

Offline Franciswalser

  • Jr. Member
  • *
  • Posts: 7
Need Information.
« on: February 13, 2023, 10:28:22 AM »
To get an idea of how programs look in assembly language, I've written a few simple programs in VB6 and then disassembled them with OllyDbg. One of the things I notice is that at the end of a subroutine there are typically two types of returns, both with the mnemonic RET, but with different op-codes. The two op-codes I've seen are C2 and C3. C3 takes no parameter. C2 takes a two byte "word" parameter. What I can't figure out is what you are supposed to pass to that parameter. Is it some relative memory location? Is it supposed to be a value that gets put on some register, or somewhere in the stack, for later use by the program? And in what situation would I most often use a C2 return, versus when should I use a C3 return?

I also notice that just prior to the RET, there is usually a LEAVE command. This has the op-code C9. According to this documentation http://ref.x86asm.net/coder32.html a C9 is a "High Level Procedure Exit". What exactly does this mean? In what situations is a LEAVE needed? In cases where LEAVE is not used, the RET is always preceded by a POP EBP. But I can't figure out, in what situations should a POP EBP be used instead of a LEAVE?

I also notice that at the start of any subroutine, without fail, and regardless of what commands are used near the end of the routine, the first command is always PUSH EBP. Why is that used?
« Last Edit: February 14, 2023, 04:49:24 AM by Franciswalser »

Offline alCoPaUL

  • Jr. Member
  • *
  • Posts: 74
  • Country: ph
    • Webpage
Re: Need Information.
« Reply #1 on: February 13, 2023, 08:57:42 PM »
you'll get p-code functions from the vb6 runtime if you use vb6 as your base for learning assembly.

try c programming coz it's sooo close to assembly opcodes when decompiling/disassembling them.

from there, it will funnel to the things that you want to do.

you wanna be a systems/kernel driver developer?
you wanna be a library .dll developer or some API stuff?
or you just wanna play around some concepts and express it to code (no money gotten from here)?

or some haxxor stuff?

your choice.

now, let's say your computer hardware with software (CPU Tower + OS) is traversing space-time and you click ninja.exe,

ninja can enter the space-time with respect to your computer via

sub rsp, 28h

ninja is here

add rsp, 28h

or

push rbx

ninja is here

pop rbx

or

enter 0,0

ninja is here

leave

or

org 100h

ninja is here

int 20h

so, that's basically what happens when you view it on some other pov.

and if the ninja is like a matrioshka doll, she can port 6 more ninjas inside the bounds of her presence inside the space-time of the computer..

or if she can do it to the maximum numbers, the ninja operated like a port that can accommodate infinity - 1.
« Last Edit: February 13, 2023, 09:08:20 PM by alCoPaUL »

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Need Information.
« Reply #2 on: February 13, 2023, 09:24:48 PM »
Quote
To get an idea of how programs look in assembly language, I've written a few simple programs in VB6 and then disassembled them with OllyDbg. One of the things I notice is that at the end of a subroutine there are typically two types of returns, both with the mnemonic RET, but with different op-codes. The two op-codes I've seen are C2 and C3. C3 takes no parameter. C2 takes a two byte "word" parameter. What I can't figure out is what you are supposed to pass to that parameter. Is it some relative memory location? Is it supposed to be a value that gets put on some register, or somewhere in the stack, for later use by the program? And in what situation would I most often use a C2 return, versus when should I use a C3 return?
Hi Francis,
Good question(s)!
It has to do with "calling conventions". You are looking at 32 bit code, where parameters are pushed on the stack. In a "caller cleans up" convention (C, for example) you use a plain "RET". In a "callee cleans up" convention (Windows APIs, for example) you use "RET n", where "n" is the number of bytes (not the number of parameters) to be removed from the stack after the return address has been popped, freeing the caller from having to do it.

Quote
I also notice that just prior to the RET, there is usually a LEAVE command. This has the op-code C9. According to this documentation http://ref.x86asm.net/coder32.html a C9 is a "High Level Procedure Exit". What exactly does this mean? In what situations is a LEAVE needed?
It means what it says. It is a "high level" instruction equivalent to:
Code: [Select]
mov esp, ebp ; restore esp to the value it had before "local variables" were allocated
pop ebp ; restore caller's ebp saved at beginning of function
Calling conventions require that certain registers be preserved (restored to the caller's values) across the function: ebx, esi, edi, and ebp. esp, the stack pointer, must also be back where it was, because the location we RET to is popped off the stack. If this isn't "right" we will surely crash!
Quote
In cases where LEAVE is not used, the RET is always preceded by a POP EBP. But I can't figure out, in what situations should a POP EBP be used instead of a LEAVE? I also notice that at the start of any subroutine, without fail, and regardless of what commands are used near the end of the routine, the first command is always PUSH EBP. Why is that used?

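ebp is one of the caller's registers which must always be preserved, so it is saved first. If esp is already back where it was right after the "push ebp" (no local variables, for example), a plain "pop ebp" is enough; "leave" is needed when esp still has to be restored from ebp.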

There is an "ENTER 0, 0" instruction which more or less "matches" the "LEAVE" instruction. The first 0 is the number of bytes to be subtracted from esp to allocate space for "local" or "automatic" variables. The second 0 is the lexical nesting level of the function. The Pascal language allows nested functions to access the enclosing function's local variables, and this value is required to set that up. "enter" is slower than doing it "by hand", so it isn't often used. "leave" is as fast as doing it "by hand" so it often is used. Neither of them is "necessary".

To take it from the top:
Code: [Select]
push parameter1
push parameter2
; etc. ?
call afunction; a caller cleans up function
add esp, 8 ; remove parameters from stack
;...
afunction:
push ebp ; save caller's reg
mov ebp, esp ; if local variables
sub esp, 8 ; enough space for two local variables
push ebx
push esi
push edi ; if we alter them

; some code
mov eax, [ebp + 12] ; use parameter1 (pushed first); parameter2 is at [ebp + 8]

pop edi
pop esi
pop ebx ; if we altered 'em and had to save 'em

mov esp, ebp ; remove local variables, if any
pop ebp ; restore caller's ebp
; or "leave" will replace these two instructions
ret

or if it's a callee cleans up function

Code: [Select]
push parameter1
push parameter2
call bfunction
; the "ret 8" has removed parameters
; ...
bfunction:
push ebp ; save caller's reg
mov ebp, esp
sub esp, 8 ; space for two local variables
; suppose we don't alter ebx, esi, or edi
;some code

mov esp, ebp
pop ebp ; restore caller's reg
; or "leave"
ret 8 ; return and clean up stack

It may be worth noting that in 16 bit code, [sp] was not valid, so [bp] had to be used. In 32 bit code [esp] is valid...
Well... these are good questions, Francis. If I haven't explained fully, or if I screwed up, feel free to ask more!

Best,
Frank


Offline fredericopissarra

  • Full Member
  • **
  • Posts: 373
  • Country: br
Re: Need Information.
« Reply #3 on: February 13, 2023, 09:29:45 PM »
Quote
One of the things I notice is that at the end of a subroutine there are typically two types of returns, both with the mnemonic RET, but with different op-codes.

There are, actually, 4 different opcodes for RET. 0xC3 is the NEAR return, 0xCB is the FAR return (retf), 0xC2 is the same as the near RET, but adds an immediate value to (R|E)SP after popping the return address. And there is 0xCA, which is the same as a far return, but adds the immediate value as 0xC2 does.

Quote
The two op-codes I've seen are C2 and C3. C3 takes no parameter. C2 takes a two byte "word" parameter. What I can't figure out is what you are supposed to pass to that parameter.

Certain languages, for example PASCAL, pass arguments through the stack and expect the called function to do the cleanup. Others, like C, do the cleanup in the caller. Let's say we have, in PASCAL:
Code: [Select]
function F( x : Integer ) : Integer;
begin
  F := 2 * x;
end;
The pascal compiler will, probably, create something like:
Code: [Select]
F:
  mov eax,[esp+4]
  add eax,eax
  ret 4   ; dispose of the argument on the stack.
But a C compiler with a function like this:
Code: [Select]
int f( int x ) { return x + x; }
will probably create:
Code: [Select]
_f:
  mov eax,[esp+4]
  add eax,eax
  ret     ; DON'T dispose of the argument on the stack,
          ; letting the caller do this
When f() is called, the C compiler does:
Code: [Select]
; Example: y = f(2); -- suppose y is ECX.
  push 2   ; pushes the 2 argument to the stack
  call _f
  add  esp,4 ; clears the stack
  mov  ecx,eax

Quote
I also notice that just prior to the RET, there is usually a LEAVE command.

LEAVE is an old instruction and is not really needed anymore (compilers still emit it, and it does no harm). Essentially it does a pair of instructions at once: `mov esp,ebp/pop ebp`, but a frame pointer isn't necessary anymore... You don't even need EBP as a substitute for ESP because, since the 386, ESP can be used in an effective address (the address between [ and ]). Up to the 286, only BX, BP, SI and DI could be used. The 386 changed this.

Using NASM you can use structures to make sure the arguments are in their proper positions without having to calculate the relative stack position yourself... Let's take a look at f(), above... Before you call f() the 2 is pushed to the stack, so ESP points to the position where this 2 is... The CALL instruction pushes the return address (the EIP of the instruction after the CALL), so RET will know where to return to, then CALL jumps to f. Your stack will be like this after the CALL:
Code: [Select]
  ESP+4 -> 2
  ESP   -> EIP
Notice that we can use a structure like this:
Code: [Select]
struc fstk
     resd  1    ; offset 0, where ESP points to... the return address.
.x:  resd  1    ; offset 4, where our argument is.
endstruc

_f:
  mov eax,[esp + fstk.x]   ; fstk.x is the offset 4
  add eax,eax
  ret

No need to use `EBP` instead of `ESP`.
« Last Edit: February 13, 2023, 10:30:06 PM by fredericopissarra »

Offline fredericopissarra

Re: Need Information.
« Reply #4 on: February 14, 2023, 12:25:47 PM »
Hehee... Frank and I wrote almost the same thing at almost the same time... ;)

Offline Frank Kotler

Re: Need Information.
« Reply #5 on: February 14, 2023, 10:33:53 PM »
Yes. Thank you!

Best,
Frank