Author Topic: problem with linear search (Read 18387 times)

ashwin · « **on:** February 19, 2012, 05:06:14 AM »

so i have written this program for linear search that first takes the number of numbers of array as input. then the user inputs the numbers and finally the number to be searched.it then searches from the last element and prints the position of the first match it finds and if no match is found, it prints "element not found".now the programs works fine as long as all the numbers of the array are below 256 but even if one of the inputs is above or equal to 256, the program doesnt work properly and i have no idea why.

Code: [Select]

section .data

msg1 : db "enter the number of elements of array : ", 10, 13
len1 : equ $-msg1
msg2 : db "enter the elements : ", 10, 13
len2 : equ $-msg2
msg3 : db "enter the element to be searched : ", 10, 13
len3 : equ $-msg3
msg4: db "element found at position : ", 10, 13
len4: equ $-msg4
msg5: db "element not found !! ", 10, 13
len5: equ $-msg5



section .bss

no: resd 1
no2: resd 1
divide: resd 1
dispcount: resd 1
no_of_elements: resd 1
array: resd 50
temp: resd 1
element: resd 1
pos: resd 1



section .text
	global _start
	
_start:
	mov eax, 4
	mov ebx, 1
	mov ecx, msg1
	mov edx, len1
	int 80h

call getnumber

	mov eax, [no]
	mov [no_of_elements], eax
	
	mov eax, 4
	mov ebx, 1
	mov ecx, msg2
	mov edx, len2
	int 80h

	mov eax, [no_of_elements]
	mov [temp], eax


getarr:
	cmp dword [temp], 0
	je donegetno

	call getnumber	

	mov eax, [no]
	push eax
	
	dec dword [temp]
	jmp getarr

donegetno:

	mov eax, 4
	mov ebx, 1
	mov ecx, msg3
	mov edx, len3
	int 80h

	call getnumber

	mov eax, [no]
	mov [element], eax

	mov ebx, [no_of_elements]
	mov [temp], ebx

	
search:
	pop eax

	mov [no], eax
		
	cmp [element], eax
	je found

	dec dword [temp]

	cmp dword [temp], 0
	jne search

	jmp notfound

found : 
	add dword [temp], 30h

	mov eax, 4
	mov ebx, 1
	mov ecx, msg4
	mov edx, len4
	int 80h

	mov eax, 4
	mov ebx, 1
	mov ecx, temp
	mov edx, 1
	int 80h

	mov dword [no], 10

	mov eax, 4
	mov ebx, 1
	mov ecx, no
	mov edx, 1
	int 80h
	
	jmp exit

notfound :
	mov eax, 4
	mov ebx, 1
	mov ecx, msg5
	mov edx, len5
	int 80h

exit:
	mov eax, 1
	mov ebx, 0
	int 80h


getnumber:
	mov eax, 3
	mov ebx, 0
	mov ecx, no
	mov edx, 1
	int 80h

	sub dword [no], 30h

l1:	mov eax, 3
	mov ebx, 0
	mov ecx, no2
	mov edx, 1
	int 80h

	cmp dword [no2], 10d
	je done
	
	sub dword [no2], 30h
	
	mov eax, dword [no]
	mov ecx, 10
	mul ecx
	
	mov edx, 0
	
	add eax, dword [no2]
	mov dword [no], eax
	
	jmp l1

done:
	ret

Frank Kotler · « **Reply #1 on:** February 19, 2012, 12:54:22 PM »

Well, that's... interesting! Your "getnumber" routine is... unusual... but it works... almost.

Code: [Select]

getnumber:
	mov eax, 3
	mov ebx, 0
	mov ecx, no
	mov edx, 1
	int 80h

At this point, you put one and only one byte into [no]. This will overwrite any "previous garbage", unless the "previous garbage" exceeds one byte - 255. If the previous number was bigger than 255, there's some "crud" left in [no] which doesn't get handled properly! This results in the final number being not what you intended - not even close.

We can (apparently) fix this be clearing out [no] before starting each "getnumber"....

Code: [Select]

getnumber:
        mov dword [no], 0
	mov eax, 3
	mov ebx, 0
	mov ecx, no
	mov edx, 1
	int 80h

This is still going to have problems if the pesky user enters a non-digit - they'll enter any old thing that comes into their head, y'know - but it seems to solve your immediate puzzle. (not tested extensively, but "seems to work")

Best,
Frank

ashwin · « **Reply #2 on:** February 19, 2012, 04:56:22 PM »

Thanks for the prompt reply Frank. and yes, that does solve my problem.

but you did say my method version of getnumber was quite unusual. i'm just a novice assembly language programmer and this is just some code i thought up by myself. what would be a better method to get integer inputs??

Frank Kotler · « **Reply #3 on:** February 20, 2012, 07:43:47 AM »

There are different ways of doing it. Your way isn't bad. The general idea is:

; zero out the "result so far"
top:
; get a character
; make sure it's a valid decimal digit
; if so, multiply the "result so far" by ten
; add in the "new" character, with '0' subtracted to "convert" from a character to the number it represents
; do more

That's what you did, except that you used "[no]" both as a buffer for the first character and as "result so far", and it only got cleared if it was less than 256, resulting in that weird behavior.

You could also figure out in advance how many characters you had, and multiply by 10000, 1000, 1000, 10 as required (subtracting '0' first), but multiplying "result so far" by 10 each time does the same thing without having to know in advance how many characters we've got. Does a useless "0 x 10" the first time through, but that does no harm...

I like to read the whole "string" before starting, rather than reading one character at a time. As you've probably figured out, sys_read on stdin doesn't actually return until "enter" is hit. You get edx characters, and any excess stays in the OS's i/o buffer, to be read next time. This was throwing me for a loop trying to run your program in a debugger. I'd type "123", the sys_read would get the "1" and the "23" would show up at the debugger's prompt, causing an error!

You can demonstrate this... sys_read say 4 characters, and then exit. Then type "xxxxls" and you wind up with a directory listing after the program exits! This could potentially do harm, but as long as you keep reading until the linefeed (10) shows up, as you did, it works okay. By rights, we really ought to flush the buffer, removing any "excess" beyond the length specified in edx, but... "close enough for government work".

This is what I've been using lately to convert string to number, mostly because I think it's "cute" to do the "multiply result so far by ten and add in the new character with '0' subtracted" with just two "lea" instructions. It was shown to me by Herbert Kleebauer - output from a C compiler, supposedly.

Code: [Select]

;--------------------
atoi:
    mov edx, [esp + 4]  ; pointer to string
    xor eax, eax        ; clear "result"
.top:
    movzx ecx, byte [edx]
    inc edx
    cmp ecx, byte '0'
    jb .done
    cmp ecx, byte '9'
    ja .done
    
    ; we have a valid character - multiply
    ; result-so-far by 10, subtract '0'
    ; from the character to convert it to
    ; a number, and add it to result.
    
    lea eax, [eax + eax * 4]
    lea eax, [eax * 2 + ecx - '0']

    jmp short .top
.done:
    ret
;------------------------

Really should be named "atou" not "atoi", since it won't do negative numbers. That would require some extra processing - check for a minus sign first, and negate eax at the end, if so. I don't do anything different for "invalid character" - any non-digit ends the processing... so if the user types "12xyz34", we get only 12. That could perhaps be improved (but that's the way C does it). A disadvantage to using "lea" is that there's no indication of overflow, it just "rolls over". If we used "mul" and "add", we could check for overflow (both places) and avoid this. Good enough for "toy programs".

Here's how it could be integrated into your program:

Code: [Select]

section .data

msg1 : db "enter the number of elements of array : ", 10, 13
len1 : equ $-msg1
msg2 : db "enter the elements : ", 10, 13
len2 : equ $-msg2
msg3 : db "enter the element to be searched : ", 10, 13
len3 : equ $-msg3
msg4: db "element found at position : ", 10, 13
len4: equ $-msg4
msg5: db "element not found !! ", 10, 13
len5: equ $-msg5



section .bss

inbuf resb 12

no: resd 1
no2: resd 1
divide: resd 1
dispcount: resd 1
no_of_elements: resd 1
array: resd 50
temp: resd 1
element: resd 1
pos: resd 1



section .text
	global _start
	
_start:
	mov eax, 4
	mov ebx, 1
	mov ecx, msg1
	mov edx, len1
	int 80h

call getnumber

	mov eax, [no]
	mov [no_of_elements], eax
	
	mov eax, 4
	mov ebx, 1
	mov ecx, msg2
	mov edx, len2
	int 80h

	mov eax, [no_of_elements]
	mov [temp], eax


getarr:
	cmp dword [temp], 0
	je donegetno

	call getnumber	

	mov eax, [no]
	push eax
	
	dec dword [temp]
	jmp getarr

donegetno:

	mov eax, 4
	mov ebx, 1
	mov ecx, msg3
	mov edx, len3
	int 80h

	call getnumber

	mov eax, [no]
	mov [element], eax

	mov ebx, [no_of_elements]
	mov [temp], ebx

	
search:
	pop eax

	mov [no], eax
		
	cmp [element], eax
	je found

	dec dword [temp]

	cmp dword [temp], 0
	jne search

	jmp notfound

found : 
	add dword [temp], 30h

	mov eax, 4
	mov ebx, 1
	mov ecx, msg4
	mov edx, len4
	int 80h

	mov eax, 4
	mov ebx, 1
	mov ecx, temp
	mov edx, 1
	int 80h

	mov dword [no], 10

	mov eax, 4
	mov ebx, 1
	mov ecx, no
	mov edx, 1
	int 80h
	
	jmp exit

notfound :
	mov eax, 4
	mov ebx, 1
	mov ecx, msg5
	mov edx, len5
	int 80h

exit:
	mov eax, 1
	mov ebx, 0
	int 80h


getnumber:

	mov eax, 3
	mov ebx, 0
	mov ecx, inbuf
	mov edx, 11
	int 80h

	push inbuf
	call atoi
	add esp, 4

	mov dword [no], eax ; or could just return result in eax
	ret

;--------------------
atoi:
    mov edx, [esp + 4]  ; pointer to string
    xor eax, eax        ; clear "result"
.top:
    movzx ecx, byte [edx]
    inc edx
    cmp ecx, byte '0'
    jb .done
    cmp ecx, byte '9'
    ja .done
    
    ; we have a valid character - multiply
    ; result-so-far by 10, subtract '0'
    ; from the character to convert it to
    ; a number, and add it to result.
    
    lea eax, [eax + eax * 4]
    lea eax, [eax * 2 + ecx - '0']

    jmp short .top
.done:
    ret
;------------------------

Note that this is "C compatible" - expects the parameter (address of the string) on the stack, and we have to "clean up the stack" on return (add esp, 4). This could be modified to pass the parameter in a register, if you want. The registers used, ecx and edx, are allowed to be "trashed" in C. You can't access 'em in C, but actually ecx (just cl, really) contains the "invalid digit" that ended processing, and edx points to the next character. This could be useful in evaluating a "dotted quad" IP address like "123.456.78.1", for example...

Not necessarily the "ideal" method, but I think it's "cute".

Best,
Frank

ashwin · « **Reply #4 on:** February 21, 2012, 04:47:42 PM »

I used this method only because i didnt know that sys_read exits as soon as "enter" key is pressed. i thought we had to get edx characters, no matter what, or else it would result in an error. Either way it works and that's all that matters for me!!

anyhow thanks for helping me out Frank

NASM - The Netwide Assembler

News:

Author Topic: problem with linear search (Read 18387 times)

ashwin

problem with linear search

Frank Kotler

Re: problem with linear search

ashwin

Re: problem with linear search

Frank Kotler

Re: problem with linear search

ashwin

Re: problem with linear search