Author Topic: Accessing different indexes of matrix

PWright · « **on:** June 22, 2010, 08:38:11 PM »

Hello, I've started using NASM and I'm a bit confused about the addressing methods x86 uses. I made a program that was supposed to read 2 numbers, store them in different positions of an allocated matrix, and then show them. But it seems like I'm doing something wrong when accessing those positions.

The program is the following

SECTION .text
global main
main:
mov edx, 0h
call read
mov edx, 4h
call read
mov edx, 0h
call show
mov edx, 4h
call show
end:
mov eax, 0
ret

read:
push elemento1
push Fscanf
call scanf
add esp, 8
mov ebx, dword[elemento1]
mov [matrixA+edx], ebx
jmp show

show:
mov ecx, [matrixA+edx]
push ecx

push Fprintf
call printf
add esp, 8
jmp end
ret

And section .DATA is

Fscanf: db "%d", 0
Fprintf: db ">>> %d", 10, 0
elemento1: times 4 db 0
matrixA: times 128 db 0

Where edx should indicate the desired matrix position. Anyone know what I could be doing wrong? Thanks for your help

Frank Kotler · « **Reply #1 on:** June 22, 2010, 10:29:46 PM »

Well, you've got the right idea. Unfortunately, edx is one of the registers a C function is allowed to alter - it may not have the same value when scanf returns as it had when you called it. By contrast, ebx (and esi and edi... and ebp) is a register C will not alter across a call - and we're not supposed to alter it either! That doesn't mean we can't use it, but we have to put it back the way it was before we return from "main". You use ebx to transfer the value from the temporary variable "elemento1" to your array. Perhaps you don't have to do that! If you handled the setup for the call to "scanf" in the same way you do "printf" - well, not quite the same - I think it might solve your problem... and get rid of ebx, and get rid of the temporary variable "elemento1". "scanf" needs the address of a place to put the "int" (in this case), but it can be "where it goes" in your array as easily as a temporary variable. You know "lea"? "load effective address"?

Code:

lea ecx, [matrixA + edx]
push ecx
push Fscanf
call scanf
add esp, 4 * 2
ret

(I write it as "4 * 2" instead of "8" 'cause it makes it easier to modify if I change the number of parameters)

I *would* expect "ret" there! And I would expect "show" to get to "ret" instead of jumping back to "end". I suspect you had that right, and fiddled with it trying to get it to work.

Try something like that. We can "improve" the code, perhaps - make "read" and "show" more flexible, not limited to using "matrixA" specifically, by doing it earlier, and calling "read" with the address of "where to put it" - "matrixA", "matrixB" or wherever you like. Could even pass the parameter on the stack instead of in edx. But get it working the way you've got it first, and then take it in "small steps".
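A minimal sketch of that "more flexible" version, for 32-bit Linux linked with gcc. The names "read_int", "show_int" and "matrixB" are made up for illustration and are not from the original program; the address of "where to put it" is passed on the stack rather than in edx.

```nasm
; Sketch only: read_int, show_int and matrixB are hypothetical names.
extern scanf, printf
global main

section .data
Fscanf:  db "%d", 0
Fprintf: db ">>> %d", 10, 0

section .bss
matrixA: resd 32                ; 32 dwords = 128 bytes
matrixB: resd 32

section .text
main:
    push matrixA                ; read into the first element of matrixA
    call read_int
    add esp, 4
    push matrixB + 4            ; read into the second element of matrixB
    call read_int
    add esp, 4

    push matrixA
    call show_int
    add esp, 4
    push matrixB + 4
    call show_int
    add esp, 4

    xor eax, eax
    ret

; read_int(dest): dest is the address where scanf should store the int
read_int:
    mov eax, [esp + 4]          ; first parameter sits just above the return address
    push eax
    push Fscanf
    call scanf
    add esp, 4 * 2
    ret

; show_int(src): src is the address of the element to print
show_int:
    mov ecx, [esp + 4]
    push dword [ecx]            ; printf wants the value, not the address
    push Fprintf
    call printf
    add esp, 4 * 2
    ret
```

Since neither routine names a particular array, the same code serves any buffer the caller points it at.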

Best,
Frank

PWright · « **Reply #2 on:** June 23, 2010, 02:05:55 AM »

Thank you Frank, for the very quick answer!

Well, I kinda forgot to say that the code was a prototype of a prototype. It was actually only me messing around with printf and scanf to see if I could understand its logic.

So, it seems like I can't use edx, unless I push it before calling "read", right?

Also, could you explain a little better this "Load effective address" you told me about?

Thanks again!

Frank Kotler · « **Reply #3 on:** June 23, 2010, 12:11:28 PM »

Sure you can use edx. You just can't expect it to survive a call to a C function! If you would like edx preserved across the call(s) to "read", that can easily be arranged:

Code:

read:
        push edx

        push elemento1
        push Fscanf
        call scanf
        add esp, 8
        mov ebx, dword[elemento1]

        pop edx
        mov [matrixA+edx], ebx
        ret

Normally, we'd leave the "pop edx" for last, right before the "ret", but since we need the original value, we do it "early". Since we don't alter it afterwards, edx retains its value after "read" returns. You do "mov edx, 0"..."mov edx, 4", but now that edx is "preserved" you could do "add edx, 4" and loop around filling more array elements until some "max" is reached (don't overflow the buffer!). I assume that's what you're working towards.
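A sketch of such a loop, built around the "read" routine above. MAX_BYTES is a hypothetical limit chosen for illustration (it must not exceed the 128 bytes reserved for matrixA). Note that "read" uses ebx as scratch, so this "main" saves and restores it.

```nasm
; Sketch only: MAX_BYTES is an assumed limit, not part of the original program.
extern scanf
global main

MAX_BYTES equ 4 * 4             ; four dword elements

section .data
Fscanf:    db "%d", 0
elemento1: dd 0
matrixA:   times 128 db 0

section .text
main:
    push ebx                    ; read clobbers ebx, and main must hand it back intact
    xor edx, edx                ; byte offset of the first element
.fill:
    call read                   ; edx survives because read pushes and pops it
    add edx, 4                  ; advance to the next dword
    cmp edx, MAX_BYTES
    jb .fill                    ; stop before running off the end of the buffer
    pop ebx
    xor eax, eax
    ret

read:
    push edx                    ; scanf is allowed to trash edx
    push elemento1
    push Fscanf
    call scanf
    add esp, 4 * 2
    mov ebx, [elemento1]
    pop edx                     ; get the offset back before using it
    mov [matrixA + edx], ebx
    ret
```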

This is part of what I call the "C calling convention" - more correctly known as the "Intel ABI" (Application Binary Interface). I consider that name somewhat misleading, since the hardware doesn't care. It's just a "convention"... but we have to follow it if we expect to interface with C, and we "might as well" follow it in our own code for a consistent interface. With minor variations, this applies to other languages besides C, but we're interested in C right now... (also known as "cdecl", but I don't know how to pronounce that, so I just call it "the C calling convention")

Parameters are passed on the stack, pushed right to left (except the "Pascal" variant, which pushes 'em left to right). That is, the "last" parameter, as read in the C source, is pushed first. I prefer to think of it this way: the parameter "closest" to the name of the function in the source is also the "closest" one, up the stack, when the function is entered.

It is the responsibility of the calling code to "remove" these parameters ("clean up the stack") after the function returns. That's the "add esp, 4", etc. - it can be done by popping a "dummy" register, too, or with "lea"(!)... The "stdcall" (and "Pascal") variant expects the called function to "clean up the stack" - Windows APIs use this! - the called functions end in "ret 4","ret 8", etc. instead of a plain "ret". This isn't suitable for functions that can take a variable number of parameters - like "scanf" and "printf"... but we're interfacing with C anyway...
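To illustrate the push order and the caller's cleanup, here is a small sketch for 32-bit Linux linked with gcc. The format string and the numbers are invented for the example.

```nasm
; Sketch only: fmt and the values 2, 3, 5 are made up for illustration.
extern printf
global main

section .data
fmt: db "%d + %d = %d", 10, 0

section .text
main:
    ; The C source would read: printf(fmt, 2, 3, 5);
    push 5                      ; last parameter is pushed first
    push 3
    push 2
    push fmt                    ; first parameter is pushed last
    call printf
    add esp, 4 * 4              ; caller removes four dword parameters

    push 5
    push 3
    push 2
    push fmt
    call printf
    lea esp, [esp + 4 * 4]      ; same cleanup with lea, which leaves the flags untouched

    xor eax, eax
    ret
```

Popping four times into a register we don't care about (say, "pop ecx") would do the same job as the "add" or the "lea".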

Certain registers are guaranteed to retain their value across a call, and are expected to retain their value across a callback function ("main" is a callback function!). ebx, esi, edi, and ebp need to be preserved - if we alter 'em, we need to restore their value - if we don't use 'em, we don't need to do anything, but it won't hurt to push/pop 'em "to be ready" in case we decide to use them.

Code:

main:
    push ebp
    mov ebp, esp
;   sub esp, ??? ; make space for "local" variables  - we have none
    push ebx
    push esi
    push edi

; our code

    xor eax, eax ; claim "no error"

    pop edi
    pop esi
    pop ebx
    mov esp, ebp
    pop ebp
    ret

The "result" is returned in eax, or in the case of a 64-bit return value in edx:eax (rare, in my experience). There's no guarantee that a particular implementation of a C function *will* trash ecx... and edx... but it's "allowed", so we can't count on them surviving a call. The "return value" from scanf (in eax) is the number of items it successfully read and stored; from printf it is the number of characters printed. It may be worth checking the return value from scanf to make sure the pesky user hasn't entered garbage (and clean up the mess if they have). The return value from printf probably isn't interesting, but it's in eax...
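A minimal sketch of checking scanf's return value, again assuming 32-bit Linux linked with gcc. The "number" variable and the "Fbad" message are hypothetical, and this version only reports the problem; it does not clean leftover characters out of the input.

```nasm
; Sketch only: number and Fbad are invented names for this example.
extern scanf, printf
global main

section .data
Fscanf:  db "%d", 0
Fprintf: db ">>> %d", 10, 0
Fbad:    db "not a number", 10, 0

section .bss
number:  resd 1

section .text
main:
    push number
    push Fscanf
    call scanf
    add esp, 4 * 2
    cmp eax, 1                  ; we asked for one item; did we get one?
    jne .bad                    ; 0 = no match, -1 = end of input

    push dword [number]
    push Fprintf
    call printf
    add esp, 4 * 2
    xor eax, eax                ; claim "no error"
    ret

.bad:
    push Fbad                   ; contains no % directives, so it is safe as a format
    call printf
    add esp, 4
    mov eax, 1                  ; report failure to the caller
    ret
```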

I think I'll leave "lea" for another post. I think I need a nap.

Later,
Frank

PWright · « **Reply #4 on:** June 23, 2010, 06:00:27 PM »

Hello Frank,

Well, I did a little research on LEA, and, if I understood it correctly, instead of loading to the register what's in the address, it loads the number of the address itself. I've tried the way you wrote, using LEA, and it worked perfectly!

Quote

main:
mov ebx, 0h
call read
add ebx, 4h
call read
sub ebx, 4h
call show
add ebx, 4h
call show
end:
mov eax, 0
ret

read:
lea ecx, [matrixA+ebx]
push ecx
push Fscanf
call scanf
add esp, 4 * 2
ret

show:
mov ecx, dword[matrixA+ebx]
push ecx
push Fprintf
call printf
add esp, 4 * 2
ret

I also tried the same code, but using edx instead of ebx, and pushing / popping it, and it also worked flawlessly!

Since you've been so helpful, I think you deserve to know what the final program should be. It's a matrix multiplication exercise, where I need to read 4 numbers (rows of matrixA, columns of A, rows of B, columns of B), test if the matrices can be multiplied, and, if so, read the matrices, multiply them and show the result to the user.

If you don't mind, I'd like to ask one more question. Is there any way to debug my code by running it line by line, setting breakpoints, and inspecting memory / registers? It sure would help a lot!

Thanks again for your support!

Frank Kotler · « **Reply #5 on:** June 23, 2010, 07:59:01 PM »

Sorry for the delay. You got "lea" (and the difference from "mov") exactly right. Notice that it doesn't touch memory, but just does "arithmetic". Here's a cute "off label" use of lea - it isn't really an address at all, but has to have the "form" of a valid effective address - you can't do arbitrary arithmetic with it. A compiler generated this code (the lea part), and I "stole" it.

Code:

global _start

section .data
    number_string db '123', 10

section .text
_start:
    nop
    
    push number_string
    call atoi
    add esp, byte 4

    mov ebx, eax
    mov eax, 1
    int 80h


atoi:
    mov edx, [esp + 4]  ; pointer to string
    xor eax, eax        ; clear "result"
.top:
    movzx ecx, byte [edx]
    inc edx
    cmp ecx, byte '0'
    jb .done
    cmp ecx, byte '9'
    ja .done
    
    ; we have a valid character - multiply
    ; result-so-far by 10, subtract '0'
    ; from the character to convert it to
    ; a number, and add it to result.
    
    lea eax, [eax + eax * 4]
    lea eax, [eax * 2 + ecx - 48]

    jmp short .top
.done:
    ret

Note that this should be called "atou", not "atoi", since it doesn't do "signed" numbers (I'm sloppy that way).
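Tracing the two lea instructions by hand on the string '123' may make the trick clearer. The fragment below is only an illustration: the table is worked out from the code above, and the imul/sub/add sequence is an equivalent written with ordinary arithmetic instructions, not something from the compiler output.

```nasm
; Each pass computes eax = eax * 10 + (ecx - '0') in two steps.
;
;   character   first lea (eax * 5)   second lea (eax * 2 + ecx - 48)
;   '1' (49)    0  * 5 = 0            0  * 2 + 49 - 48 = 1
;   '2' (50)    1  * 5 = 5            5  * 2 + 50 - 48 = 12
;   '3' (51)    12 * 5 = 60           60 * 2 + 51 - 48 = 123
;
; The same step without lea (this one clobbers ecx and the flags):
    imul eax, eax, 10           ; result-so-far * 10
    sub ecx, '0'                ; character -> digit value
    add eax, ecx                ; add the digit
```

The lea form gets the same result in two instructions and leaves both ecx and the flags alone.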

As for a debugger... it depends "what OS". Since you spell "main", "main" and not "_main" ("printf" not "_printf", etc.), I'm guessing Linux. The native debugger is gdb. I consider it a rather unfriendly beast, but it's very powerful (I guess...). It will help if you've got "symbolic debug info" in your program. The "-g" switch on the command line to both Nasm and gcc would do it. Nasm knows two formats of debug info "stabs" and "dwarf". "stabs" is default for Nasm, but "dwarf" is default for gdb (I'm told). Enable it with "-F dwarf" on the Nasm command line. I honestly don't see much difference, but Chuck Crayne contributed "dwarf" to Nasm shortly before his death, so use it in his memory (RIP, Chuck). Then, after "gdb myfile", try "break main", "run"... and then you should be able to "step" through your code. When you come to "call scanf" etc., you may want "next"(?) to skip over it - or maybe you *want* to step through it (good luck!).

I'll attach a "gdbinit.txt" I got from Jeff Owens (I think), which if saved as ".gdbinit" (in ~ or current directory, I think) will perhaps make gdb more "asm friendly"(?).

I'm not really that familiar with gdb - I prefer (usually) to use Patrick Alken's "ALD" (Assembly Language Debugger) - http://ald.sf.net (I think - if not, sf.net/projects/ald), or better yet(?) a fork of that program which borrows some code from Nasm to implement the "a(ssemble)" option. Terry Loveall's "coughing up a furball" page, where I found it, seems to be dead, so I put it here: http://home.myfairpoint.net/fbkotler/debug-0.0.21.tgz - if anyone wants to try it.

Best,
Frank
