Author Topic: How does linker know the location of the function?  (Read 9570 times)

Offline Korybut

How does linker know the location of the function?
« on: February 15, 2021, 07:41:29 PM »
Hello,

I am new to NASM (I used to program in FASM) and am currently at the stage of reading the manual carefully. However, there is one point that is not clear. When creating an application in FASM, I need to tell the assembler which DLL my function is stored in. It doesn't matter whether I created this DLL myself or it is provided by the OS.

On the other hand, looking at examples like https://forum.nasm.us/index.php?topic=2365.15
I've noticed that there is no such information in the code, and no other file where such information might be provided. Perhaps I'm missing something.

How does it work?

Offline debs3759

Re: How does linker know the location of the function?
« Reply #1 on: February 15, 2021, 10:09:53 PM »
I have never created a multi file assembler project, but I can tell you that the assembler only needs to know which routines are in external files, and the linker needs to know which library files to find external routines in. That seems to be what your link shows. I can't speak for how other assemblers declare the same routines.

Offline Korybut

Re: How does linker know the location of the function?
« Reply #2 on: February 16, 2021, 07:33:17 AM »
How about the case when two DLLs have functions with identical names? Is it strictly forbidden?

According to the manual (section 8.4.4) there is an "import" directive which specifies the location of the functions in libraries. It is pretty much the same as what FASM has, except for the order of the arguments.
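
A minimal sketch of what that directive looks like, as far as the manual describes it: `import` only exists for the `obj` (OMF) output format, each imported symbol still needs its own `extern`, and an optional third parameter gives the name the symbol has inside the DLL. That third parameter is one way to tell apart two DLLs that export the same name. `my_beep` is a made-up local name used only for illustration.

```nasm
; Sketch only: `import` is specific to the obj output format (nasm -fobj).
extern  MessageBoxA
import  MessageBoxA user32.dll          ; symbol name, then the DLL it lives in

; Hypothetical local name `my_beep`, bound to MessageBeep inside user32.dll.
extern  my_beep
import  my_beep user32.dll MessageBeep  ; local name, DLL, exported name
```

With the other Windows output formats (win32/win64) there is no such directive, and the DLL name has to reach the linker some other way.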

Offline fredericopissarra

Re: How does linker know the location of the function?
« Reply #3 on: February 16, 2021, 02:54:10 PM »
Each time you assemble source code into an object file, each memory reference in your code is assigned a "virtual address" (a "relative" address). Example:
Code:
bits 32

section .data

x:  dd  0

section .text

  global f
f:
  inc dword [x]
  ret
If you compile this you'll get:
Code:
$ nasm -l test.lst test.asm
$ cat test.lst
 1                          bits 32
 2                         
 3                          section .data
 4                         
 5 00000000 00000000        x: dd 0
 6                         
 7                          section .text
 8                         
 9                            global f
10                          f:
11 00000000 FF05[00000000]    inc dword [x]
12 00000006 C3                ret
Notice that `x` gets offset 0 in the `.data` section and `f` gets offset 0 in the `.text` section (and the offset in the inc instruction is [00000000]).

When the linker is used with multiple modules it assigns different offsets to these references... Let's say you have another module (test2.asm) defining `y` as a DWORD... in that object file `y` will get offset 0 as well, but the linker puts `x` and `y` in the same section, assigning different offsets to these 2 symbols... The same goes for function entry points...
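
A sketch of what such a second module could look like. The file name test2.asm comes from the paragraph above; the function name `g` is made up for illustration. On its own, `y` sits at offset 0 of this module's `.data` and `g` at offset 0 of its `.text`, and the address of `f` is unknown until link time.

```nasm
; test2.asm: hypothetical second module, to be linked together with test.asm
bits 32

section .data

y:  dd  0             ; offset 0 in THIS module's .data, just like x in test.asm

section .text

  extern f            ; defined in test.asm; its address is unknown here
  global g            ; `g` is an illustrative name, not from the thread
g:
  call f              ; the assembler leaves a relocation for the linker to fill in
  inc dword [y]       ; also relocated once the linker has merged both .data sections
  ret
```

Both files would have to be assembled to an object format (for example `nasm -felf32` or `nasm -fwin32`) and handed to the linker together; the linker then places `x` and `y` at different offsets in the merged `.data` and patches both references.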

Notice, also, that CALL/JMP and conditional jumps use relative addressing (relative to EIP or RIP)...
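
A small sketch of what "relative" means here, with the encoding worked out by hand in the comments. The labels `f` and `g` are only illustrative. The displacement stored in the instruction is the target offset minus the offset of the instruction that follows the CALL, so it stays valid wherever the section ends up in memory.

```nasm
bits 32

section .text

f:
  ret             ; offset 0, one byte: C3
g:
  call f          ; offset 1, five bytes: E8 FAFFFFFF
                  ; displacement = target - next instruction = 0 - (1 + 5) = -6
  ret             ; offset 6
```

Because both labels live in the same section, the assembler can compute this displacement itself and the linker has nothing to patch.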

In the case of DLLs, the offset of each exported function is available in the export table of the DLL's PE file once it is loaded. On Windows, you can call GetProcAddress to get this address and assign it to a function pointer for an indirect call (late binding). Or the linker can set this up for you, so that the loader fills in the addresses when your program starts (early binding).

As for DLLs with functions of the same name: with early binding it can be problematic, but with late binding there is no problem, since the name is used only to find the address of the function in one specific DLL...
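
A minimal sketch of late binding on Win64, under a few assumptions: it is built with the mingw toolchain, MessageBeep from user32.dll is picked only as a convenient one-argument function, and the file name late.asm is made up. Only the kernel32 functions are bound early here; MessageBeep is looked up by name at run time, so the DLL to search is chosen explicitly by the code.

```nasm
; late.asm: hypothetical late-binding sketch for x86-64 Windows
;   nasm -fwin64 -o late.o late.asm
;   x86_64-w64-mingw32-ld -s -o late.exe late.o -lkernel32
  bits  64
  default rel

  section .rdata

dllname:  db  "user32.dll",0
procname: db  "MessageBeep",0

  section .text

  extern __imp_LoadLibraryA
  extern __imp_GetProcAddress
  extern __imp_ExitProcess

  global _start

_start:
  sub   rsp,40                  ; 32 bytes of shadow space + 8 to align the stack

  lea   rcx,[dllname]
  call  [__imp_LoadLibraryA]    ; rax = module handle, or 0 on failure
  test  rax,rax
  jz    .done

  mov   rcx,rax                 ; hModule
  lea   rdx,[procname]          ; lpProcName
  call  [__imp_GetProcAddress]  ; rax = address of MessageBeep, or 0 on failure
  test  rax,rax
  jz    .done

  mov   ecx,-1                  ; uType = 0xFFFFFFFF (simple beep)
  call  rax                     ; indirect call through the looked-up address

.done:
  xor   ecx,ecx
  call  [__imp_ExitProcess]
```

If two DLLs export the same name, this approach sidesteps the clash: each GetProcAddress call takes the handle of the one module it should search.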
« Last Edit: February 16, 2021, 03:00:48 PM by fredericopissarra »

Offline Korybut

Re: How does linker know the location of the function?
« Reply #4 on: February 16, 2021, 07:49:07 PM »
Many thanks for the answers. Please correct me if I am wrong: using some sort of PE explorer I can look through all the functions that can be called from a DLL, and the linker goes through this information and replaces the labels in my object file with the actual offsets from the PE headers of the DLL.

Looking at disassembled code, I've never seen "LoadLibrary" and "GetProcAddress" used (except "wglGetProcAddress", but that is a completely different story). Is this because early binding was chosen in the majority of cases?

Offline fredericopissarra

Re: How does linker know the location of the function?
« Reply #5 on: February 17, 2021, 01:01:54 AM »
To the first question, not exactly... take a look at this tiny example:
Code:
; hello.asm for x86-64
  bits  64
  default rel

  section .rodata

msg:
  db  `Hello\n`
msg_len equ $ - msg

  section .text

  extern __imp_GetStdHandle
  extern __imp_WriteConsoleA
  extern __imp_ExitProcess

  global _start

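_start:
  sub   rsp,40    ; 32 bytes of shadow space + 8 for the 5th argument (keeps rsp 16-byte aligned)

  mov   ecx,-11   ; STD_OUTPUT_HANDLE
  call  [__imp_GetStdHandle]

  mov   rcx,rax   ; the handle is 64 bits wide
  lea   rdx,[msg]
  mov   r8d,msg_len
  xor   r9d,r9d   ; lpNumberOfCharsWritten = NULL
  mov   qword [rsp+32],0  ; 5th argument (lpReserved) goes just above the shadow space
  call  [__imp_WriteConsoleA]

  xor   ecx,ecx
  call  [__imp_ExitProcess]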

To compile:
Code:
$ nasm -fwin64 -o hello.o hello.asm
$ x86_64-w64-mingw32-ld -s -o hello.exe hello.o -lkernel32
This was built with mingw64 on Linux (it works with Microsoft's linker as well, with a different command line; _start probably isn't the default entry point with link.exe).

If you search for these 3 function names in your executable you'll find them, as in the picture attached. That's because the loader will early bind them. But notice that the symbols used in the code have the `__imp_` prefix (these are the symbols defined in the import library kernel32.lib, or libkernel32.a in the case of mingw).

To the second question... Yes, you don't see LoadLibrary/GetProcAddress/FreeLibrary because of early binding.

And, please, note: I haven't dealt with Windows since 2007.
« Last Edit: February 17, 2021, 01:10:39 AM by fredericopissarra »

Offline Korybut

Re: How does linker know the location of the function?
« Reply #6 on: February 19, 2021, 02:15:59 PM »
I've found out that (as one would expect) the situation depends on the linker. GoLink actually scans the DLL for the functions. Other linkers may use LIB files.