Hello my friends.
First of all thanks, your answers are very clear and complete.
I'm going to keep bothering you with a few more questions that came up while reading this thread.
If I got it right,
DB it's not an instruction, it's a directive
basically means that it isn't translated into an opcode the way instructions are (e.g. mov eax, 1). So whatever I put in a DB directive is stored in memory as is. In this case the bytes are treated as instructions because we're inside the .text segment, and the instructions are decoded from the ASCII values of whatever I stored there.
My question is: where can I find a chart of opcode/ASCII equivalents? I suppose it would be possible to write an entire program with a single "db <nonsense string>" (by "program" I mean a meaningless one made up only of opcodes in the text segment). Such a chart would be of great interest.
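To make the ASCII half of that chart concrete, here is a minimal Python sketch that prints the byte values a db string would place in memory. The helper names db_bytes and hex_row are my own, and I'm assuming plain ASCII text. It only covers the character-to-byte side; the byte-to-opcode side is exactly what I'm asking the chart for.

```python
def db_bytes(text):
    """Bytes that a db '<text>' directive would emit, assuming ASCII input."""
    return text.encode("ascii")


def hex_row(text):
    """Format those bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in db_bytes(text))


# The string I use in the .text section of my test program.
print(hex_row("PROVAAAaW"))  # 50 52 4f 56 41 41 41 61 57
```

Each two-digit value is one byte in memory, so a nine-character string gives nine bytes.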
My main doubt is this: based on what I've read here, nasm interprets the ASCII codes in a db statement like this one as instructions. I thought this happens because I define db 'bla bla' in the .text segment, so nasm tries to convert into opcodes what it assumes to be instructions rather than data declarations.
So I gave it a try and wrote this:
global _start
SECTION .text
_start:
db 'PROVAAAaW'
mov eax, 1
mov ebx, 1
int 80h
SECTION .data
db 'PROVAAAAaW'
mov eax, 1
xor ecx, ecx
int 80h
I didn't expect to find the db statement from section .data right after the .text code in gdb:
(gdb) disass
Dump of assembler code for function _start:
=> 0x08048080 <+0>: push eax
0x08048081 <+1>: push edx
0x08048082 <+2>: dec edi
0x08048083 <+3>: push esi
0x08048084 <+4>: inc ecx
0x08048085 <+5>: inc ecx
0x08048086 <+6>: inc ecx
0x08048087 <+7>: popa
0x08048088 <+8>: push edi
0x08048089 <+9>: mov eax,0x1
0x0804808e <+14>: mov ebx,0x1
0x08048093 <+19>: int 0x80
End of assembler dump.
(gdb) x/25i $eip
=> 0x8048080 <_start>: push eax
0x8048081 <_start+1>: push edx
0x8048082 <_start+2>: dec edi
0x8048083 <_start+3>: push esi
0x8048084 <_start+4>: inc ecx
0x8048085 <_start+5>: inc ecx
0x8048086 <_start+6>: inc ecx
0x8048087 <_start+7>: popa
0x8048088 <_start+8>: push edi
0x8048089 <_start+9>: mov eax,0x1
0x804808e <_start+14>: mov ebx,0x1
0x8048093 <_start+19>: int 0x80
0x8048095: add BYTE PTR [eax],al
0x8048097: add BYTE PTR [eax+0x52],dl
0x804809a: dec edi
0x804809b: push esi
0x804809c: inc ecx
0x804809d: inc ecx
0x804809e: inc ecx
0x804809f: inc ecx
0x80480a0: popa
0x80480a1: push edi
0x80480a2: mov eax,0x1
0x80480a7: xor ecx,ecx
0x80480a9: int 0x80
(gdb)
There is no trace of "db 'PROVAAAAaW'" as a string at $esp (initialized data should be in the stack segment, right?).
As you can see, after $eip+19 those fake instructions show up again, but the first two are decoded differently.
In the .text section:
0x08048080 <+0>: push eax
0x08048081 <+1>: push edx
In the .data section:
0x8048095: add BYTE PTR [eax],al
0x8048097: add BYTE PTR [eax+0x52],dl
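One thing I can check from the listing alone is how many bytes each of those two instructions takes up. The rough Python sketch below uses only the addresses gdb printed. The helper names are mine, and I'm assuming (without having dumped the raw bytes) that the second add is encoded as 00 50 52. If that holds, the 'P' and 'R' bytes would be operand bytes of an add that starts on a zero byte, not instructions of their own.

```python
# Addresses copied from the gdb listing (the lines right after int 0x80).
listing = [
    (0x8048095, "add BYTE PTR [eax],al"),
    (0x8048097, "add BYTE PTR [eax+0x52],dl"),
    (0x804809A, "dec edi"),
]


def lengths(rows):
    """Length of each instruction = distance to the next listed address."""
    return [nxt[0] - cur[0] for cur, nxt in zip(rows, rows[1:])]


def decode_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


print(lengths(listing))        # [2, 3]
print(decode_modrm(ord("P")))  # (1, 2, 0): 8-bit displacement, dl, [eax]
print(hex(ord("R")))           # 0x52, the displacement gdb shows
```

This doesn't tell me where the zero bytes come from or why the data shows up right after the code.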
I thought the different decoding happened because I wasn't storing an actual variable, so I tried a second version with this code:
global _start
SECTION .text
_start:
db 'PROVAAAaW'
mov eax, 1
mov ebx, 1
int 80h
SECTION .data
aw db 'PROVAAAAaW'
mov eax, 1
xor ecx, ecx
int 80h
but I get the same result: no trace of "PROVAAAAaW" as a string, nor of aw (apart from its normal presence in the symbol table of the ELF object file).
utente@utente-virtual-machine:~/Scrivania/programmazione/nasm$ readelf -s prova2
Symbol table '.symtab' contains 10 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 08048080     0 SECTION LOCAL  DEFAULT    1
     2: 08049098     0 SECTION LOCAL  DEFAULT    2
     3: 00000000     0 FILE    LOCAL  DEFAULT  ABS prova2.asm
     4: 08049098     0 NOTYPE  LOCAL  DEFAULT    2 aw
     5: 00000000     0 FILE    LOCAL  DEFAULT  ABS
     6: 08048080     0 NOTYPE  GLOBAL DEFAULT    1 _start
     7: 080490ab     0 NOTYPE  GLOBAL DEFAULT    2 __bss_start
     8: 080490ab     0 NOTYPE  GLOBAL DEFAULT    2 _edata
     9: 080490ac     0 NOTYPE  GLOBAL DEFAULT    2 _end
utente@utente-virtual-machine:~/Scrivania/programmazione/nasm$
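As a sanity check on those symbol values, the Python sketch below compares the distance between aw and _edata with the size of what I put in SECTION .data. I'm assuming aw is the first byte of .data and _edata marks its end. The instruction sizes are read off the address differences in my gdb listing, not from any reference.

```python
# Symbol values copied from the readelf output.
aw = 0x08049098
edata = 0x080490AB

# Sizes of the statements in SECTION .data. Instruction sizes come from
# address differences in the gdb listing (e.g. 0x80480a2 -> 0x80480a7).
parts = {
    "db 'PROVAAAAaW'": len("PROVAAAAaW"),
    "mov eax, 1": 0x80480A7 - 0x80480A2,
    "xor ecx, ecx": 0x80480A9 - 0x80480A7,
    "int 80h": 0x8048095 - 0x8048093,
}

span = edata - aw
total = sum(parts.values())
print(span, total)  # 19 19
```

If my assumptions are right, the whole .data section, string included, sits at 0x08049098, which is not the address range I was disassembling.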
Why does this happen? Thank you again for your patience; I'm trying to understand how nasm itself works rather than actually program something.
Best,
Neuro