Author Topic: Quick question about Linux entry point

Offline Devoum

  • Jr. Member
  • *
  • Posts: 13
Quick question about Linux entry point
« on: March 08, 2010, 01:18:51 PM »
Hey,

I'm reading a tutorial (I read Dr. Paul Carter's a few years ago, and might read it again), and both it and Dr. Carter's (as far as I remember) insist that whenever you write an assembly program for Linux, you need to define the following:

Code:
section .text
global _start

_start:
; code

Is it really necessary to do this (the "global _start" and "_start:" part)? In a DOS environment it isn't used, as far as I've seen from some examples here.

Offline Keith Kanios

  • Full Member
  • **
  • Posts: 383
  • Country: us
    • Personal Homepage
Re: Quick question about Linux entry point
« Reply #1 on: March 08, 2010, 03:03:00 PM »
Most DOS examples you'll see are COM files, which are nothing more than BIN files that have an ORG of 0x100, as expected by DOS. DOS will load these and always start execution at 0x100 as the entry point.
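For illustration, a minimal sketch of such a COM file might look like the following. The file name and message are made up for the example. Note that no entry symbol is declared at all, because execution simply begins at the first byte of the image.

```nasm
; hello.asm (hypothetical file name)
; Assemble with: nasm -f bin hello.asm -o hello.com

org 0x100                   ; DOS loads a .COM image at offset 0x100

        mov     ah, 0x09    ; DOS function 09h: print '$'-terminated string at DS:DX
        mov     dx, msg
        int     0x21

        mov     ax, 0x4C00  ; DOS function 4Ch: terminate, exit code 0
        int     0x21

msg     db      'Hello from a .COM file', 13, 10, '$'
```

The data is placed after the code on purpose: since DOS starts at the first byte, whatever comes first in the file gets executed.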

Most linkers, ld included, need a global symbol that marks the program entry point, declared either implicitly or explicitly. Implicitly, in that some linkers look for a predefined symbol name in the supplied objects and use it automatically. Explicitly, in that some linkers require you to tell them which symbol is the program entry point for the supplied objects.

ld in particular checks for an explicit declaration (-e _MyEntrySymbol) and complains if it can't find that symbol. If no explicit -e is supplied, it falls back to an implicit search for _start, and complains if it doesn't find that either.
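As a rough sketch of the implicit case, here is about the smallest 32-bit Linux program that ld will link without complaint. The file name is hypothetical, and the build commands in the comments assume a reasonably recent NASM and GNU ld.

```nasm
; start.asm (hypothetical file name), 32-bit Linux, no C library
; Assemble: nasm -f elf32 start.asm
; Link:     ld -m elf_i386 -o start start.o
;           (-m elf_i386 should only be needed on a 64-bit host)

section .text
global _start               ; export the symbol so that ld can see it

_start:
        mov     eax, 1      ; sys_exit in the 32-bit int 0x80 interface
        mov     ebx, 0      ; exit status 0
        int     0x80
```

If the "global _start" line is removed, the label stays local to the object file, and ld should warn that it cannot find the entry symbol _start.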

As for the reasoning, most executable formats reserve a field in their header for the address of the program entry point. The linker fills in this field with the address of the implicit or explicit entry symbol. It goes deeper than that, but that's the bird's-eye view of things.

Offline Devoum

Re: Quick question about Linux entry point
« Reply #2 on: March 08, 2010, 04:23:46 PM »
Alright, thanks Keith.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Quick question about Linux entry point
« Reply #3 on: March 08, 2010, 05:17:51 PM »
In a DOS environment, you're either writing a .com file, in which case the entrypoint is the first thing in your file, or an MZ .exe file, in "-f obj" output format. The "-f obj" output format knows the special symbol "..start", and knows that it needs to be global... so that the linker can "see" it.
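A sketch of the "-f obj" case, along the lines of the .EXE skeleton in the NASM manual, could look like this. The file name and message are invented, and the choice of 16-bit linker is left open as an assumption.

```nasm
; exehello.asm (hypothetical file name)
; Assemble: nasm -f obj exehello.asm
; Then link exehello.obj with a 16-bit OMF-capable DOS linker.

segment code

..start:                        ; special symbol: becomes the .EXE entry point
        mov     ax, data
        mov     ds, ax          ; DS -> our data segment
        mov     ax, stack
        mov     ss, ax
        mov     sp, stacktop    ; set up our own stack

        mov     dx, hello
        mov     ah, 0x09        ; DOS function 09h: print '$'-terminated string
        int     0x21

        mov     ax, 0x4C00      ; DOS function 4Ch: terminate, exit code 0
        int     0x21

segment data
hello:  db      'hello from an MZ .exe', 13, 10, '$'

segment stack stack
        resb    64
stacktop:
```

Notice there is no "global" line for "..start" here, since the output format takes care of making it visible to the linker.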

In a Linux environment, the linker ld knows "_start" as the default entrypoint. You can use another name... "global frank", "frank:"... and tell ld "-e frank" (or "--entry=frank"). Should work...
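A minimal sketch of the custom-name variant, assuming a 32-bit build with GNU ld (the file name and the exit status 7 are arbitrary choices for the example):

```nasm
; frank.asm (hypothetical file name)
; Assemble: nasm -f elf32 frank.asm
; Link:     ld -m elf_i386 -e frank -o frank frank.o

section .text
global frank                ; still has to be global, whatever it is called

frank:
        mov     eax, 1      ; sys_exit
        mov     ebx, 7      ; arbitrary exit status, easy to spot
        int     0x80
```

Running "./frank; echo $?" afterwards should print 7, which shows that execution really began at "frank".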

But I don't think you saw it in Dr. Carter's examples. He uses gcc to invoke ld, and gcc - unless it's told not to with "-nostartfiles" - links in some "startup code" (crt1.o with glibc, crt0.o on some other systems) which contains the "_start" symbol. This startup code calls "main" (if there's a way to change that name, I don't know it). That's the "entrypoint" as far as we're concerned. But "main" is contained in "driver.c" (->driver.o), which calls "asm_main". So "asm_main" is the real, real entrypoint... for us.
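To make that chain concrete, here is a bare-bones sketch of the assembly side. It assumes a driver.c that does little more than call asm_main() from main() and return its result. The build commands are assumptions for a 32-bit build on a current toolchain.

```nasm
; asm_main.asm (hypothetical file name)
; Assemble: nasm -f elf32 asm_main.asm
; Build:    gcc -m32 -o prog driver.c asm_main.o
;
; driver.c is assumed to contain roughly:
;   int asm_main(void);
;   int main(void) { return asm_main(); }

section .text
global asm_main             ; gcc's startup code supplies _start, so we don't

asm_main:
        mov     eax, 0      ; return value handed back to main
        ret                 ; return to the C driver, not sys_exit
```

Here the routine ends with "ret" rather than an exit system call, because it is an ordinary function called from C and the startup code handles process exit.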

Note that on some platforms (DOS and Windows toolchains, for example) "main" is spelled "_main", and "printf" is spelled "_printf", but not in Linux. Yet "_start" has an underscore on it. I'm sure there's a fascinating reason for this, but it might be a good time to "just do it" rather than "understand". :)

FWIW, a "typical" entrypoint would be about 0x8048080, varying with the size of the header. I've been told that the header is not loaded into memory, but if you examine memory starting at 0x8048000, I think you'll find it. You shouldn't "need to know" these numbers, but... some of us like to know these things... :)
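One way to check this for yourself is a small program that writes the first four bytes at 0x8048000 to stdout. This is only a sketch: it assumes a 32-bit static executable linked by ld at its traditional default base address, and the file name is made up.

```nasm
; peek.asm (hypothetical file name)
; Assemble: nasm -f elf32 peek.asm
; Link:     ld -m elf_i386 -o peek peek.o
; Run:      ./peek | od -c

section .text
global _start

_start:
        mov     eax, 4              ; sys_write
        mov     ebx, 1              ; file descriptor 1 = stdout
        mov     ecx, 0x08048000     ; assumed load address of the file's first byte
        mov     edx, 4              ; just the 4-byte magic number
        int     0x80

        mov     eax, 1              ; sys_exit
        xor     ebx, ebx            ; exit status 0
        int     0x80
```

If the header really is mapped there, the four bytes should be 0x7F followed by "ELF". If that address is not mapped, the write call should simply fail and print nothing.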

Best,
Frank


Offline Devoum

Re: Quick question about Linux entry point
« Reply #4 on: March 08, 2010, 08:49:32 PM »
Quote from: Frank Kotler on March 08, 2010, 05:17:51 PM
FWIW, a "typical" entrypoint would be about 0x8048080, varying with the size of the header. I've been told that the header is not loaded into memory, but if you examine memory starting at 0x8048000, I think you'll find it. You shouldn't "need to know" these numbers, but... some of us like to know these things... :)

I would like to know this =D

But yeah, you didn't need to tell me this was a "just do it" moment. I am already in that state of mind after my last topic =) But thank you, Keith and Frank.