Author Topic: Square brackets and how compiler knows what kind of value are to be processed

Offline neurophobos

  • Jr. Member
  • *
  • Posts: 11
  • Country: 00
  • Spengler: "We'll cross the streams"
Hello everyone.
As the title says, I have a question about square brackets. Suppose we have this simple snippet:

Code: [Select]
        data:       dd      1234 ;double word, 32 bit
        mov     ebx, data
        mov     eax, [data]

As far as I know, the first mov stores the address of data in ebx, while the second stores the actual value, so eax will be 1234.
To draw a parallel with C: data is a pointer, and [data] is like *data.

My question now is: what if I then do something stupid like

Code: [Select]
inc eax
Who tells the compiler whether a register contains an address or an actual value?
After this, what will eax hold? An incremented address or the value 1235?
I think it would hold the incremented address, for consistency with the example above. What if I wanted to increment the value?
I've never seen something like

Code: [Select]
inc [eax]
Maybe I could do it with

Code: [Select]
add [eax], 1
? I've never come across it, which is why I'm confused.

Another question... if it's an increment of the address, it will point to the next element in memory, and the compiler has to advance by as many bytes as the preceding operand is made of... what tells the compiler how many bytes? The dd directive? So if it were dw, the compiler would jump 2 bytes and not 4?

Thanks as always!
Best regards
N.

« Last Edit: March 04, 2014, 09:08:13 AM by neurophobos »

Offline encryptor256

  • Full Member
  • **
  • Posts: 250
  • Country: lv
  • Win64 .
    • On Youtube: encryptor256
Hi!

Here are some examples that might help you understand:
Code: [Select]
address: dd 1234
bytes: db 55,12,2,3,4,5,6
words: dw 6666,11,24,3,4,5,6

mov ebx, address    ; -> Assign ebx data address.

mov eax, [address]  ; -> Assign eax a 32 bit value from address, eax now is 1234
inc eax                  ; -> eax now is 1235
mov [address], eax  ; Store eax at address, so after this operation the data would be "address: dd 1235"

mov al, [address] ; -> Assign AL a 8 bit value from address, al now is 5
mov ax, [address] ; -> Assign AL a 16 bit value from address, ax now is 6666

; --------------------------------------------------------------
; Alternatively, starting again from "address: dd 1234":

inc dword [address] ; -> increase 32 bit value at address, so, after this operation data would be
                    ;    "address: dd 1235"


Quote
To draw a parallel with C: data is a pointer, and [data] is like *data.

Yes, data is a memory address, a pointer: the address where the actual data lives.

A number, a memory address, a pointer, a value: they are all the same thing. In a 32-bit environment they are all just 32-bit numbers.

C language has ruined your thinking a bit, but you will get back on track fast. :D

Quote
My question now is: what if I then do something stupid like

Code: [Select]
inc eax

There is nothing stupid about it.
"inc eax" increases eax by 1, and eax is a 32-bit register.

Quote
Who tells the compiler whether a register contains an address or an actual value?

What is an address, and what is an actual value? Both are the same.
A 32-bit register like eax can hold a 32-bit value, and a value is just a number.
It all depends on you and on what this value means to you: is it a memory address or a plain value?
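
To make that concrete, here is a minimal sketch, assuming NASM on 32-bit Linux (the label name, the build commands and the exit call are my own assumptions, not something from the posts above). The CPU never tracks whether eax holds an address or a value; the brackets in each instruction decide how the number is used.

```nasm
; Sketch only. Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
value:  dd 1234             ; hypothetical label for a 32-bit variable

section .text
global _start
_start:
    mov eax, value          ; eax = address of value (just a number)
    mov ebx, [eax]          ; brackets: use eax as an address -> ebx = 1234
    inc eax                 ; no brackets: the number in eax grows by 1
                            ; (eax is now the address of the 2nd byte of value)
    inc ebx                 ; ebx = 1235, the memory at value is untouched

    mov eax, 1              ; sys_exit on 32-bit Linux (assumed environment)
    xor ebx, ebx            ; exit status 0
    int 0x80
```

Nothing stops you from incrementing an address by 1; whether that is useful is up to the programmer.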

Quote
I've never seen something like
Code: [Select]
inc [eax]
Maybe with
Code: [Select]
add [eax], 1
?

Let's say:
Code: [Select]

mynumber: dd 1234

; When inc or add works directly on memory, there is no register
; to tell the assembler the operand size, so you have to specify it:
; byte, word or dword.
; It answers the question: how large is the value that you want to increase?
; Here mynumber is a dword = 32 bit.

inc dword [mynumber] ; -- It increases the 32 bit number at address "mynumber".

; -> Right now mynumber is: "mynumber: dd 1235"

add dword [mynumber],1 ; -- Add value 1 to the 32 bit number at address "mynumber".

; -> Right now mynumber is: "mynumber: dd 1236"



Good questions, once upon a time I had them too.
I hope my answers are good enough. :D

Bye,
Encryptor256.
Encryptor256's Investigation \ Research Department.

Offline neurophobos

Hi encryptor, thank you so much for your suggestions.
I'll try to write down my new questions here...

Quote
mov ebx, address    ; -> Assign ebx data address.
Here ebx stores the 32-bit address of the address variable; let's pretend it's 0x08048084.

Quote
mov eax, [address]  ; -> Assign eax a 32 bit value from address, eax now is 1234
eax stores the value at address 0x08048084, i.e. *0x08048084, which is 1234.

Quote
inc eax                  ; -> eax now is 1235
Here's my doubt. inc eax adds 1 to 1234 (and NOT some offset to 0x08048084, did I get it right?). This makes sense because in inc eax we do not specify whether it's a halfword, a double and so on (the size of the step), so the compiler could not know how many bytes to add to the memory pointer to get to the next valid element (just like array+1 or array+4, depending on whether we have a char or an int array). So in inc eax, even though we don't use [eax], the instruction accesses the value contained at the address, not the address itself. That's why the value contained at the address is updated to 1235 even without the square brackets. (I thought that inc eax would have affected 0x08048084, not *eax. Is typing inc eax instead of inc [eax] just a syntax simplification?)


Quote
mov al, [address] ; -> Assign AL a 8 bit value from address, al now is 5
mov ax, [address] ; -> Assign AL a 16 bit value from address, ax now is 6666
Could you explain how you got those two values? They're not related to the bytes and words declarations at the beginning of your sample, are they?

Quote
inc dword [address] ; -> increase 32 bit value at address, so, after this operation data would be
Here you used [], so it has the same effect as inc address, right? (And do an address and a register have equivalent effects?)


Last question... you used dword: it means that you take all 32 bits and add 1, right?

Let's say we have 255 stored in address,
Code: [Select]
00000000 00000000 00000000 11111111
After inc dword address we would have 256,
Code: [Select]
->00000000 00000000 00000001 00000000<- (all 32 bits)

If we were to use inc word address, we would have had
Code: [Select]
00000000 00000000 00000000 ->00000000<- ? (8 bits)

If we were to use hword address, we would have had
Code: [Select]
00000000 00000000 ->00000001 00000000<- ? (16 bits; correct, but only by chance, since here we do not overflow the 16 bits as we did with the 8 bits).

Basically: know what type of data you're dealing with and choose your dword/word/etc. specifiers wisely, or you'll get wrong results?

Again, thank you!
« Last Edit: March 04, 2014, 11:17:04 AM by neurophobos »

Offline encryptor256

Okay, let's pretend that a 32-bit number is stored at address 0x08048084 and that it has the value 1234.

The data segment contains data.
The code segment contains code.
From the code segment you can access the data segment.

An address is a number which represents an offset within a segment.

Code: [Select]
segment .data

mynumber: dd 1234 ; -> Let's pretend that "mynumber" is at address 0x08048084.

segment .text
;
; Task: How to copy the 32 bit value at address 0x08048084 into the EAX register.
;
mov eax,[0x08048084] ; -> What value does EAX have now? - it's 1234
;
mov eax,[mynumber] ; -> What value does EAX have now? - it's 1234
;
mov eax,0x08048084 ; -> What value does EAX have now? - it's 0x08048084
;

The MOV instruction works like this:
Code: [Select]
mov destination, source

Quote
Quote
mov ebx, address    ; -> Assign ebx data address.
ebx here stores the 32 bit address of address variable, let's pretend it's 0x08048084
- Here we assign 0x08048084 to EBX. So, EBX now is 0x08048084.

Quote
Quote
mov eax, [address]  ; -> Assign eax a 32 bit value from address, eax now is 1234
eax stores the value at that 0x08048084 address, *0x08048084, which is 1234
- Here the 32-bit value at address 0x08048084, which is 1234, is loaded into the EAX register.

Quote
Quote
inc eax                  ; -> eax now is 1235
Here's my doubt. inc eax adds 1 to 1234 (and NOT some offset to 0x08048084, did I get it right?).
Yes, eax was 1234, then add 1, and get 1235.

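Quote
Is typing inc eax instead of inc [eax] just a syntax simplification?
No. inc eax changes the number held in the register itself. inc dword [eax] is a different instruction: it treats the number in eax as an address and changes the 32-bit value stored in memory at that address.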
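
A small sketch of the difference, assuming NASM on 32-bit Linux (the counter label, the build commands and the exit call are hypothetical additions for illustration):

```nasm
; Sketch only. Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
counter: dd 1234            ; hypothetical 32-bit variable

section .text
global _start
_start:
    mov eax, counter        ; eax = address of counter
    inc dword [eax]         ; memory at that address: 1234 -> 1235, eax unchanged
    mov ebx, [eax]          ; ebx = 1235
    inc eax                 ; eax itself: address -> address + 1, memory unchanged
    ; inc [eax]             ; NASM rejects this line: operation size not specified

    mov eax, 1              ; sys_exit on 32-bit Linux (assumed environment)
    xor ebx, ebx            ; exit status 0
    int 0x80
```

The commented-out line is why you rarely see a bare inc [eax]: with a memory operand and no register, the assembler needs byte, word or dword spelled out.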

Quote
Quote
mov al, [address] ; -> Assign AL a 8 bit value from address, al now is 5
mov ax, [address] ; -> Assign AL a 16 bit value from address, ax now is 6666
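Could you explain how you got those two values? They're not related to the bytes and words declarations at the beginning of your sample, are they?
- Sorry, that was my fault: those two values were wrong. With 1235 (0x04D3) stored at address, mov al, [address] loads the low byte, so al is 0xD3 (211), and mov ax, [address] loads the low word, so ax is 1235. Corrected examples are below.

Well, let's learn through examples: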
Code: [Select]
segment .data

mynumber: dd 1234 ; -> Let's pretend that "mynumber" is at address 0x08048084.

bytearray: db 5,22,17,26

wordarray: dw 4423,7724,3356

segment .text

mov al,[bytearray] ; -> AL now is 5
mov al,[bytearray+1] ; -> AL now is 22
mov al,[bytearray+2] ; -> AL now is 17
mov al,[bytearray+3] ; -> AL now is 26

mov al,[bytearray]   ; -> AL now is 5
add al,[bytearray+1] ; -> AL now is 5+22 = 27

inc byte [bytearray]
mov al,[bytearray] ; -> AL now is 6

mov ax,[wordarray] ; -> AX now is 4423
mov ax,[wordarray+2] ; -> AX now is 7724
mov ax,[wordarray+4] ; -> AX now is 3356

dec word [wordarray+2]
dec word [wordarray+2]
dec word [wordarray+2]
mov ax,[wordarray+2] ; -> AX now is 7724-3=7721

mov eax,0x08048084 ; -> EAX now is 0x08048084
mov ebx,[eax] ; -> EBX now is 1234
inc ebx ; -> EBX now is 1235
mov eax,ebx ; -> EAX now is 1235
mov [0x08048084], eax ; -> Save value of EAX at address 0x08048084
mov edx, [0x08048084] ; -> EDX now is 1235
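
On the earlier question of who decides how many bytes to step: nobody does it for you. Offsets inside brackets are always counted in bytes, whether the data was declared with db, dw or dd, so the programmer picks the step. A minimal sketch, assuming NASM on 32-bit Linux (labels, build commands and exit call are hypothetical):

```nasm
; Sketch only. Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
words:  dw 4423, 7724, 3356     ; three 16-bit elements, 2 bytes each
dwords: dd 100, 200, 300        ; three 32-bit elements, 4 bytes each

section .text
global _start
_start:
    mov esi, words
    mov ax, [esi]           ; ax = 4423
    add esi, 2              ; step 2 bytes, because WE know the elements are words
    mov ax, [esi]           ; ax = 7724

    mov edi, dwords
    mov ebx, [edi+4]        ; offset of 4 bytes = second dword -> ebx = 200
    mov ecx, 2              ; element index
    mov edx, [edi+ecx*4]    ; scaled index: third dword -> edx = 300

    mov eax, 1              ; sys_exit on 32-bit Linux (assumed environment)
    xor ebx, ebx            ; exit status 0
    int 0x80
```

Unlike array+1 in C, add esi, 1 here would land in the middle of the first word.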




Quote
Quote
inc dword [address] ; -> increase 32 bit value at address, so, after this operation data would be
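Here you used [], so it has the same effect as inc address, right? (And do an address and a register have equivalent effects?)
inc can increase a value that is stored in a register OR
a value that is stored at a specified memory location.
inc address without the brackets does not assemble, though: a bare label is just a constant number, so there is nowhere to store the result.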

Quote
Last question....you used dword..it means that you take all 32 bits and add 1 right?
Yes, it will add 1 to 32 bit number.

Quote
Let's say, we have 255 stored in address,
Code: [Select]
00000000 00000000 00000000 11111111
After inc dword address we would have 256,
Code: [Select]
->00000000 00000000 00000001 00000000<- (all 32 bits)
Yes, as 32 bits: 00000000 00000000 00000001 00000000, which is 256.

byte is 8 bits:
Quote
If we were to use inc word address, we would have had
Code: [Select]
00000000 00000000 00000000 ->00000000<- ? (8 bits)
Yes, but word is 16 bits, byte is 8 bits.

word is 16 bits:
Quote
If we were to use hword address, we would have had
Code: [Select]
00000000 00000000 ->00000001 00000000<- ? (16 bits; correct, but only by chance, since here we do not overflow the 16 bits as we did with the 8 bits).
Yes. I don't know what hword is; I think you mean word, because that is 16 bits.

So, you were mixing byte and word.
BYTE/WORD/DWORD/QWORD = 8/16/32/64 bits

Quote
Basically: know what type of data you're dealing with and choose your dword/word/etc. specifiers wisely, or you'll get wrong results?
Yes, correct!
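
A quick sketch of the three sizes side by side, assuming NASM on 32-bit Linux (the labels a, b, c, the build commands and the exit call are hypothetical). x86 is little-endian, so the lowest byte of each dword comes first in memory:

```nasm
; Sketch only. Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
a: dd 255                   ; bytes in memory: FF 00 00 00
b: dd 255
c: dd 255

section .text
global _start
_start:
    inc byte  [a]           ; only the lowest byte: FF -> 00, upper bytes untouched
    inc word  [b]           ; lowest 16 bits: 00FF -> 0100
    inc dword [c]           ; all 32 bits: 000000FF -> 00000100

    mov eax, [a]            ; eax = 0
    mov ebx, [b]            ; ebx = 256
    mov ecx, [c]            ; ecx = 256

    mov eax, 1              ; sys_exit on 32-bit Linux (assumed environment)
    xor ebx, ebx            ; exit status 0
    int 0x80
```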

Here is one more example:
Code: [Select]
segment .data

num: db 255   ; db - define byte

segment .text

inc byte [num] ; Increase the byte at address "num". (255+1 wraps around to 0)
mov al,[num]   ; AL is an 8 bit register, so it loads one byte from address "num" (AL is 0)
inc al         ; AL now is 1


And that's it!





Code examples are a good resource for learning,
but not all code examples are always correct. :D

Bye!

Offline neurophobos

Thank you man!

All clear, except that I think I did not explain my last question well.
Quote
So, you were mixing byte and word.
BYTE/WORD/DWORD/QWORD = 8/16/32/64 bits

Let's say I have [eax]:
Code: [Select]
00000000.00000000.00000000.11111111 ; kind of IP address style, just to make it more readable
If I have:
Code: [Select]
inc byte
my result will be:
Code: [Select]
00000000.00000000.00000000.00000000 ; as I consider only a byte, the carry for the ninth bit won't be set (it would be a word) and I'll have a wrong result of 0
If I have:
Code: [Select]
inc word
my result will be:
Code: [Select]
00000000.00000000.00000001.00000000 ; here I consider a word, so the carry will be set, and the correct result will be 256
If I have:
Code: [Select]
inc dword
my result will be:
Code: [Select]
00000000.00000000.00000001.00000000 ; same as above, but just because dword > word. If I had [eax] =
11111111.11111111.11111111.11111111, I'd have got the same (wrong) result as in example #1, so I'd have had to use
Code: [Select]
inc QWORD

Am I right?
« Last Edit: March 04, 2014, 02:59:05 PM by neurophobos »

Offline encryptor256

Quote
If I have:
Code: [Select]
inc byte
my result will be:
Code: [Select]
00000000.00000000.00000000.00000000 ; as I consider only a byte, the carry bit won't be set (it would be a word) and I'll have a wrong result of 0
Yes. Only the lowest byte is touched: it wraps from 255 to 0, and the other three bytes stay as they were.

Quote
If I have:
Code: [Select]
inc word
my result will be:
Code: [Select]
00000000.00000000.00000001.00000000 ; here I consider a word, so the carry will be set, and the correct result will be 256
Yes.

Quote
If I have:
Code: [Select]
inc dword
my result will be:
Code: [Select]
00000000.00000000.00000001.00000000 ; same as above, but just because dword > word. If I had [eax] =
11111111.11111111.11111111.11111111, I'd have got the same (wrong) result as in example #1, so I'd have had to use
Code: [Select]
inc QWORD
Yes, with one caveat: inc qword only exists in 64-bit mode. In 32-bit code a 64-bit value has to be incremented in two steps, one per dword.

Quote
Am I right?
Well, yes, pretty good!

So, YES, you have to know what kind of data types you are dealing with: bytes, words, dwords, qwords or even larger ones, or some scientific data type.
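
For the 0xFFFFFFFF case in 32-bit code, a possible sketch looks like this, assuming NASM on 32-bit Linux (the big label, the build commands and the exit call are hypothetical). It uses add rather than inc for the low half, because inc leaves the carry flag unchanged and adc needs that flag:

```nasm
; Sketch only. Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
big: dd 0xFFFFFFFF, 0       ; 64-bit value 0x00000000FFFFFFFF, low dword first

section .text
global _start
_start:
    add dword [big], 1      ; low dword wraps to 0 and sets the carry flag
    adc dword [big+4], 0    ; high dword = 0 + 0 + carry = 1

    mov eax, [big]          ; eax = 0
    mov edx, [big+4]        ; edx = 1, so the value is now 0x0000000100000000

    mov eax, 1              ; sys_exit on 32-bit Linux (assumed environment)
    xor ebx, ebx            ; exit status 0
    int 0x80
```

Note that the variable has to be declared 8 bytes wide in the first place; a wider instruction on a plain dd would overwrite whatever data follows it.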

Bye.

Offline neurophobos

 :D
Oook, again, thank you for your help!
Have a nice day!
N.

Offline encryptor256

Quote
:D
Oook, again, thank you for your help!
Have a nice day!
N.

You too, bye, Encryptor256! :D

Offline neurophobos

Ok...I know I told you that was my last question...but you know, it was yesterday, today it's another day  ;D
Just kidding, but I realized there's one last doubt I forgot to ask about... it may be easy, and I think I already know the answer,
just posting to be completely sure.

If everything between square brackets (register, label etc.) stands for the value stored at that register's/label's address,
when I see something like

Code: [Select]
mov eax,dword ptr [edi]
here, even though the square brackets would fetch the value that EDI points to, eax will get a pointer to that value, and not the value itself.

And the dword, as always, indicates that it is a 32-bit pointer, so if something like this follows:

Code: [Select]
mov eax,dword ptr [edi+1]
That would copy the address of the next 32-bit element, so after the first mov eax will hold, let's say, the address 0x08048084,
and after the second eax will hold 0x08048088.

Offline encryptor256

Quote
Code: [Select]
mov eax,dword ptr [edi]

I have never used "ptr"; it is not NASM syntax. It is MASM syntax (I have seen it in MASM32).

Quote
when I see something like
Code: [Select]
mov eax,dword ptr [edi]
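Stop seeing that "ptr" thingy. :D
In MASM, "dword ptr [edi]" is only a size specifier: it means the 32-bit value stored at the address in edi, the same as "dword [edi]" in NASM. It does not give you a pointer, and [edi+1] is one byte further on, not the next 32-bit element.

How do you get a pointer to a value?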

Code: [Select]
segment .data

wordarray: dw 1132,5143,2212,3334; - Let's say wordarray is at address 0x08048084.

segment .text

lea eax,[wordarray+2]  ; eax now is 0x08048084 + 2= 0x08048086 ;;; eax now holds a pointer to 5143
mov ax,[eax]              ; ax now is 5143

mov eax,0x08048084 ; eax now is 0x08048084
add eax,2                 ; eax now is 0x08048084 + 2= 0x08048086 ;;; eax now holds a pointer to 5143
mov ax,[eax]              ; ax now is 5143

mov ax,[0x08048084+2] ; ax now is 5143


You don't need that "ptr" thingy to program in assembly language.
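
Here is how the two instructions from the question would look in NASM, as a minimal sketch assuming 32-bit Linux (the table label, its contents, the build commands and the exit call are hypothetical). Brackets always mean "load the value at this address"; only lea hands back the address itself:

```nasm
; Sketch only. Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
table: dd 10, 20, 30        ; bytes in memory: 0A 00 00 00 14 00 00 00 1E 00 00 00

section .text
global _start
_start:
    mov edi, table          ; edi = address of table
    mov eax, dword [edi]    ; NASM spelling of MASM's "dword ptr [edi]": eax = 10 (the value)
    mov eax, [edi+4]        ; the next dword is 4 bytes further: eax = 20
    mov eax, [edi+1]        ; +1 is ONE byte: reads bytes 1..4 -> eax = 0x14000000
    lea eax, [edi+4]        ; address arithmetic only, no memory access: eax = table + 4

    mov eax, 1              ; sys_exit on 32-bit Linux (assumed environment)
    xor ebx, ebx            ; exit status 0
    int 0x80
```

The dword keyword is optional in the first mov, since the 32-bit destination register already fixes the size.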

« Last Edit: March 05, 2014, 09:19:11 AM by encryptor256 »