NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: ashwin on February 19, 2012, 05:06:14 AM

Title: problem with linear search
Post by: ashwin on February 19, 2012, 05:06:14 AM
so i have written this program for linear search that first takes the number of numbers of array as input. then the user inputs the numbers and finally the number to be searched.it then searches from the last element and prints the position of the first match it finds and if no match is found, it prints "element not found".now the programs works fine as long as all the numbers of the array are below 256 but even if one of the inputs is above or equal to 256, the program doesnt work properly and i have no idea why.

Code: [Select]
section .data

msg1 : db "enter the number of elements of array : ", 10, 13
len1 : equ $-msg1
msg2 : db "enter the elements : ", 10, 13
len2 : equ $-msg2
msg3 : db "enter the element to be searched : ", 10, 13
len3 : equ $-msg3
msg4: db "element found at position : ", 10, 13
len4: equ $-msg4
msg5: db "element not found !! ", 10, 13
len5: equ $-msg5



section .bss

no: resd 1
no2: resd 1
divide: resd 1
dispcount: resd 1
no_of_elements: resd 1
array: resd 50
temp: resd 1
element: resd 1
pos: resd 1



section .text
global _start

_start:
mov eax, 4
mov ebx, 1
mov ecx, msg1
mov edx, len1
int 80h

call getnumber

mov eax, [no]
mov [no_of_elements], eax

mov eax, 4
mov ebx, 1
mov ecx, msg2
mov edx, len2
int 80h

mov eax, [no_of_elements]
mov [temp], eax


getarr:
cmp dword [temp], 0
je donegetno

call getnumber

mov eax, [no]
push eax

dec dword [temp]
jmp getarr

donegetno:

mov eax, 4
mov ebx, 1
mov ecx, msg3
mov edx, len3
int 80h

call getnumber

mov eax, [no]
mov [element], eax

mov ebx, [no_of_elements]
mov [temp], ebx


search:
pop eax

mov [no], eax

cmp [element], eax
je found

dec dword [temp]

cmp dword [temp], 0
jne search

jmp notfound

found :
add dword [temp], 30h

mov eax, 4
mov ebx, 1
mov ecx, msg4
mov edx, len4
int 80h

mov eax, 4
mov ebx, 1
mov ecx, temp
mov edx, 1
int 80h

mov dword [no], 10

mov eax, 4
mov ebx, 1
mov ecx, no
mov edx, 1
int 80h

jmp exit

notfound :
mov eax, 4
mov ebx, 1
mov ecx, msg5
mov edx, len5
int 80h

exit:
mov eax, 1
mov ebx, 0
int 80h


getnumber:
mov eax, 3
mov ebx, 0
mov ecx, no
mov edx, 1
int 80h

sub dword [no], 30h

l1: mov eax, 3
mov ebx, 0
mov ecx, no2
mov edx, 1
int 80h

cmp dword [no2], 10d
je done

sub dword [no2], 30h

mov eax, dword [no]
mov ecx, 10
mul ecx

mov edx, 0

add eax, dword [no2]
mov dword [no], eax

jmp l1

done:
ret

Title: Re: problem with linear search
Post by: Frank Kotler on February 19, 2012, 12:54:22 PM
Well, that's... interesting! Your "getnumber" routine is... unusual... but it works... almost.

Code: [Select]
getnumber:
mov eax, 3
mov ebx, 0
mov ecx, no
mov edx, 1
int 80h

At this point, you put one and only one byte into [no]. This will overwrite any "previous garbage", unless the "previous garbage" exceeds one byte - 255. If the previous number was bigger than 255, there's some "crud" left in [no] which doesn't get handled properly! This results in the final number being not what you intended - not even close.

We can (apparently) fix this be clearing out [no] before starting each "getnumber"....

Code: [Select]
getnumber:
        mov dword [no], 0
mov eax, 3
mov ebx, 0
mov ecx, no
mov edx, 1
int 80h

This is still going to have problems if the pesky user enters a non-digit - they'll enter any old thing that comes into their head, y'know - but it seems to solve your immediate puzzle. (not tested extensively, but "seems to work")

Best,
Frank

Title: Re: problem with linear search
Post by: ashwin on February 19, 2012, 04:56:22 PM
Thanks for the prompt reply Frank. and yes, that does solve my problem.  :D

but you did say my method version of getnumber was quite unusual. i'm just a novice assembly language programmer and this is just some code i thought up by myself. what would be a better method to get integer inputs??  ???
Title: Re: problem with linear search
Post by: Frank Kotler on February 20, 2012, 07:43:47 AM
There are different ways of doing it. Your way isn't bad. The general idea is:

; zero out the "result so far"
top:
; get a character
; make sure it's a valid decimal digit
; if so, multiply the "result so far" by ten
; add in the "new" character, with '0' subtracted to "convert" from a character to the number it represents
; do more

That's what you did, except that you used "[no]" both as a buffer for the first character and as "result so far", and it only got cleared if it was less than 256, resulting in that weird behavior.

You could also figure out in advance how many characters you had, and multiply by 10000, 1000, 1000, 10 as required (subtracting '0' first), but multiplying "result so far" by 10 each time does the same thing without having to know in advance how many characters we've got. Does a useless "0 x 10" the first time through, but that does no harm...

I like to read the whole "string" before starting, rather than reading one character at a time. As you've probably figured out, sys_read on stdin doesn't actually return until "enter" is hit. You get edx characters, and any excess stays in the OS's i/o buffer, to be read next time. This was throwing me for a loop trying to run your program in a debugger. I'd type "123", the sys_read would get the "1" and the "23" would show up at the debugger's prompt, causing an error!

You can demonstrate this... sys_read say 4 characters, and then exit. Then type "xxxxls" and you wind up with a directory listing after the program exits! This could potentially do harm, but as long as you keep reading until the linefeed (10) shows up, as you did, it works okay. By rights, we really ought to flush the buffer, removing any "excess" beyond the length specified in edx, but... "close enough for government work". :)

This is what I've been using lately to convert string to number, mostly because I think it's "cute" to do the "multiply result so far by ten and add in the new character with '0' subtracted" with just two "lea" instructions. It was shown to me by Herbert Kleebauer - output from a C compiler, supposedly.

Code: [Select]
;--------------------
atoi:
    mov edx, [esp + 4]  ; pointer to string
    xor eax, eax        ; clear "result"
.top:
    movzx ecx, byte [edx]
    inc edx
    cmp ecx, byte '0'
    jb .done
    cmp ecx, byte '9'
    ja .done
   
    ; we have a valid character - multiply
    ; result-so-far by 10, subtract '0'
    ; from the character to convert it to
    ; a number, and add it to result.
   
    lea eax, [eax + eax * 4]
    lea eax, [eax * 2 + ecx - '0']

    jmp short .top
.done:
    ret
;------------------------

Really should be named "atou" not "atoi", since it won't do negative numbers. That would require some extra processing - check for a minus sign first, and negate eax at the end, if so. I don't do anything different for "invalid character" - any non-digit ends the processing... so if the user types "12xyz34", we get only 12. That could perhaps be improved (but that's the way C does it). A disadvantage to using "lea" is that there's no indication of overflow, it just "rolls over". If we used "mul" and "add", we could check for overflow (both places) and avoid this. Good enough for "toy programs".

Here's how it could be integrated into your program:
Code: [Select]
section .data

msg1 : db "enter the number of elements of array : ", 10, 13
len1 : equ $-msg1
msg2 : db "enter the elements : ", 10, 13
len2 : equ $-msg2
msg3 : db "enter the element to be searched : ", 10, 13
len3 : equ $-msg3
msg4: db "element found at position : ", 10, 13
len4: equ $-msg4
msg5: db "element not found !! ", 10, 13
len5: equ $-msg5



section .bss

inbuf resb 12

no: resd 1
no2: resd 1
divide: resd 1
dispcount: resd 1
no_of_elements: resd 1
array: resd 50
temp: resd 1
element: resd 1
pos: resd 1



section .text
global _start

_start:
mov eax, 4
mov ebx, 1
mov ecx, msg1
mov edx, len1
int 80h

call getnumber

mov eax, [no]
mov [no_of_elements], eax

mov eax, 4
mov ebx, 1
mov ecx, msg2
mov edx, len2
int 80h

mov eax, [no_of_elements]
mov [temp], eax


getarr:
cmp dword [temp], 0
je donegetno

call getnumber

mov eax, [no]
push eax

dec dword [temp]
jmp getarr

donegetno:

mov eax, 4
mov ebx, 1
mov ecx, msg3
mov edx, len3
int 80h

call getnumber

mov eax, [no]
mov [element], eax

mov ebx, [no_of_elements]
mov [temp], ebx


search:
pop eax

mov [no], eax

cmp [element], eax
je found

dec dword [temp]

cmp dword [temp], 0
jne search

jmp notfound

found :
add dword [temp], 30h

mov eax, 4
mov ebx, 1
mov ecx, msg4
mov edx, len4
int 80h

mov eax, 4
mov ebx, 1
mov ecx, temp
mov edx, 1
int 80h

mov dword [no], 10

mov eax, 4
mov ebx, 1
mov ecx, no
mov edx, 1
int 80h

jmp exit

notfound :
mov eax, 4
mov ebx, 1
mov ecx, msg5
mov edx, len5
int 80h

exit:
mov eax, 1
mov ebx, 0
int 80h


getnumber:

mov eax, 3
mov ebx, 0
mov ecx, inbuf
mov edx, 11
int 80h

push inbuf
call atoi
add esp, 4

mov dword [no], eax ; or could just return result in eax
ret

;--------------------
atoi:
    mov edx, [esp + 4]  ; pointer to string
    xor eax, eax        ; clear "result"
.top:
    movzx ecx, byte [edx]
    inc edx
    cmp ecx, byte '0'
    jb .done
    cmp ecx, byte '9'
    ja .done
   
    ; we have a valid character - multiply
    ; result-so-far by 10, subtract '0'
    ; from the character to convert it to
    ; a number, and add it to result.
   
    lea eax, [eax + eax * 4]
    lea eax, [eax * 2 + ecx - '0']

    jmp short .top
.done:
    ret
;------------------------

Note that this is "C compatible" - expects the parameter (address of the string) on the stack, and we have to "clean up the stack" on return (add esp, 4). This could be modified to pass the parameter in a register, if you want. The registers used, ecx and edx, are allowed to be "trashed" in C. You can't access 'em in C, but actually ecx (just cl, really) contains the "invalid digit" that ended processing, and edx points to the next character. This could be useful in evaluating a "dotted quad" IP address like "123.456.78.1", for example...

Not necessarily the "ideal" method, but I think it's "cute".

Best,
Frank

Title: Re: problem with linear search
Post by: ashwin on February 21, 2012, 04:47:42 PM
I used this method only because i didnt know that sys_read exits as soon as "enter" key is pressed. i thought we had to get edx characters, no matter what, or else it would result in an error. Either way it works and that's all that matters for me!!

anyhow thanks for helping me out Frank ;D