### Author Topic: problem with linear search  (Read 9069 times)

#### ashwin

• Jr. Member
• Posts: 3
##### problem with linear search
« on: February 19, 2012, 05:06:14 AM »
so i have written this program for linear search that first takes the number of numbers of array as input. then the user inputs the numbers and finally the number to be searched.it then searches from the last element and prints the position of the first match it finds and if no match is found, it prints "element not found".now the programs works fine as long as all the numbers of the array are below 256 but even if one of the inputs is above or equal to 256, the program doesnt work properly and i have no idea why.

Code: [Select]
`section .datamsg1 : db "enter the number of elements of array : ", 10, 13len1 : equ \$-msg1msg2 : db "enter the elements : ", 10, 13len2 : equ \$-msg2msg3 : db "enter the element to be searched : ", 10, 13len3 : equ \$-msg3msg4: db "element found at position : ", 10, 13len4: equ \$-msg4msg5: db "element not found !! ", 10, 13len5: equ \$-msg5section .bssno: resd 1no2: resd 1divide: resd 1dispcount: resd 1no_of_elements: resd 1array: resd 50temp: resd 1element: resd 1pos: resd 1section .text global _start _start: mov eax, 4 mov ebx, 1 mov ecx, msg1 mov edx, len1 int 80hcall getnumber mov eax, [no] mov [no_of_elements], eax mov eax, 4 mov ebx, 1 mov ecx, msg2 mov edx, len2 int 80h mov eax, [no_of_elements] mov [temp], eaxgetarr: cmp dword [temp], 0 je donegetno call getnumber mov eax, [no] push eax dec dword [temp] jmp getarrdonegetno: mov eax, 4 mov ebx, 1 mov ecx, msg3 mov edx, len3 int 80h call getnumber mov eax, [no] mov [element], eax mov ebx, [no_of_elements] mov [temp], ebx search: pop eax mov [no], eax cmp [element], eax je found dec dword [temp] cmp dword [temp], 0 jne search jmp notfoundfound : add dword [temp], 30h mov eax, 4 mov ebx, 1 mov ecx, msg4 mov edx, len4 int 80h mov eax, 4 mov ebx, 1 mov ecx, temp mov edx, 1 int 80h mov dword [no], 10 mov eax, 4 mov ebx, 1 mov ecx, no mov edx, 1 int 80h jmp exitnotfound : mov eax, 4 mov ebx, 1 mov ecx, msg5 mov edx, len5 int 80hexit: mov eax, 1 mov ebx, 0 int 80hgetnumber: mov eax, 3 mov ebx, 0 mov ecx, no mov edx, 1 int 80h sub dword [no], 30hl1: mov eax, 3 mov ebx, 0 mov ecx, no2 mov edx, 1 int 80h cmp dword [no2], 10d je done sub dword [no2], 30h mov eax, dword [no] mov ecx, 10 mul ecx mov edx, 0 add eax, dword [no2] mov dword [no], eax jmp l1done: ret`

#### Frank Kotler

• NASM Developer
• Hero Member
• Posts: 2667
• Country:
##### Re: problem with linear search
« Reply #1 on: February 19, 2012, 12:54:22 PM »
Well, that's... interesting! Your "getnumber" routine is... unusual... but it works... almost.

Code: [Select]
`getnumber: mov eax, 3 mov ebx, 0 mov ecx, no mov edx, 1 int 80h`
At this point, you put one and only one byte into [no]. This will overwrite any "previous garbage", unless the "previous garbage" exceeds one byte - 255. If the previous number was bigger than 255, there's some "crud" left in [no] which doesn't get handled properly! This results in the final number being not what you intended - not even close.

We can (apparently) fix this be clearing out [no] before starting each "getnumber"....

Code: [Select]
`getnumber:        mov dword [no], 0 mov eax, 3 mov ebx, 0 mov ecx, no mov edx, 1 int 80h`
This is still going to have problems if the pesky user enters a non-digit - they'll enter any old thing that comes into their head, y'know - but it seems to solve your immediate puzzle. (not tested extensively, but "seems to work")

Best,
Frank

#### ashwin

• Jr. Member
• Posts: 3
##### Re: problem with linear search
« Reply #2 on: February 19, 2012, 04:56:22 PM »
Thanks for the prompt reply Frank. and yes, that does solve my problem.

but you did say my method version of getnumber was quite unusual. i'm just a novice assembly language programmer and this is just some code i thought up by myself. what would be a better method to get integer inputs??

#### Frank Kotler

• NASM Developer
• Hero Member
• Posts: 2667
• Country:
##### Re: problem with linear search
« Reply #3 on: February 20, 2012, 07:43:47 AM »
There are different ways of doing it. Your way isn't bad. The general idea is:

; zero out the "result so far"
top:
; get a character
; make sure it's a valid decimal digit
; if so, multiply the "result so far" by ten
; add in the "new" character, with '0' subtracted to "convert" from a character to the number it represents
; do more

That's what you did, except that you used "[no]" both as a buffer for the first character and as "result so far", and it only got cleared if it was less than 256, resulting in that weird behavior.

You could also figure out in advance how many characters you had, and multiply by 10000, 1000, 1000, 10 as required (subtracting '0' first), but multiplying "result so far" by 10 each time does the same thing without having to know in advance how many characters we've got. Does a useless "0 x 10" the first time through, but that does no harm...

I like to read the whole "string" before starting, rather than reading one character at a time. As you've probably figured out, sys_read on stdin doesn't actually return until "enter" is hit. You get edx characters, and any excess stays in the OS's i/o buffer, to be read next time. This was throwing me for a loop trying to run your program in a debugger. I'd type "123", the sys_read would get the "1" and the "23" would show up at the debugger's prompt, causing an error!

You can demonstrate this... sys_read say 4 characters, and then exit. Then type "xxxxls" and you wind up with a directory listing after the program exits! This could potentially do harm, but as long as you keep reading until the linefeed (10) shows up, as you did, it works okay. By rights, we really ought to flush the buffer, removing any "excess" beyond the length specified in edx, but... "close enough for government work".

This is what I've been using lately to convert string to number, mostly because I think it's "cute" to do the "multiply result so far by ten and add in the new character with '0' subtracted" with just two "lea" instructions. It was shown to me by Herbert Kleebauer - output from a C compiler, supposedly.

Code: [Select]
`;--------------------atoi:    mov edx, [esp + 4]  ; pointer to string    xor eax, eax        ; clear "result".top:    movzx ecx, byte [edx]    inc edx    cmp ecx, byte '0'    jb .done    cmp ecx, byte '9'    ja .done        ; we have a valid character - multiply    ; result-so-far by 10, subtract '0'    ; from the character to convert it to    ; a number, and add it to result.        lea eax, [eax + eax * 4]    lea eax, [eax * 2 + ecx - '0']    jmp short .top.done:    ret;------------------------`
Really should be named "atou" not "atoi", since it won't do negative numbers. That would require some extra processing - check for a minus sign first, and negate eax at the end, if so. I don't do anything different for "invalid character" - any non-digit ends the processing... so if the user types "12xyz34", we get only 12. That could perhaps be improved (but that's the way C does it). A disadvantage to using "lea" is that there's no indication of overflow, it just "rolls over". If we used "mul" and "add", we could check for overflow (both places) and avoid this. Good enough for "toy programs".

Here's how it could be integrated into your program:
Code: [Select]
`section .datamsg1 : db "enter the number of elements of array : ", 10, 13len1 : equ \$-msg1msg2 : db "enter the elements : ", 10, 13len2 : equ \$-msg2msg3 : db "enter the element to be searched : ", 10, 13len3 : equ \$-msg3msg4: db "element found at position : ", 10, 13len4: equ \$-msg4msg5: db "element not found !! ", 10, 13len5: equ \$-msg5section .bssinbuf resb 12no: resd 1no2: resd 1divide: resd 1dispcount: resd 1no_of_elements: resd 1array: resd 50temp: resd 1element: resd 1pos: resd 1section .text global _start _start: mov eax, 4 mov ebx, 1 mov ecx, msg1 mov edx, len1 int 80hcall getnumber mov eax, [no] mov [no_of_elements], eax mov eax, 4 mov ebx, 1 mov ecx, msg2 mov edx, len2 int 80h mov eax, [no_of_elements] mov [temp], eaxgetarr: cmp dword [temp], 0 je donegetno call getnumber mov eax, [no] push eax dec dword [temp] jmp getarrdonegetno: mov eax, 4 mov ebx, 1 mov ecx, msg3 mov edx, len3 int 80h call getnumber mov eax, [no] mov [element], eax mov ebx, [no_of_elements] mov [temp], ebx search: pop eax mov [no], eax cmp [element], eax je found dec dword [temp] cmp dword [temp], 0 jne search jmp notfoundfound : add dword [temp], 30h mov eax, 4 mov ebx, 1 mov ecx, msg4 mov edx, len4 int 80h mov eax, 4 mov ebx, 1 mov ecx, temp mov edx, 1 int 80h mov dword [no], 10 mov eax, 4 mov ebx, 1 mov ecx, no mov edx, 1 int 80h jmp exitnotfound : mov eax, 4 mov ebx, 1 mov ecx, msg5 mov edx, len5 int 80hexit: mov eax, 1 mov ebx, 0 int 80hgetnumber: mov eax, 3 mov ebx, 0 mov ecx, inbuf mov edx, 11 int 80h push inbuf call atoi add esp, 4 mov dword [no], eax ; or could just return result in eax ret;--------------------atoi:    mov edx, [esp + 4]  ; pointer to string    xor eax, eax        ; clear "result".top:    movzx ecx, byte [edx]    inc edx    cmp ecx, byte '0'    jb .done    cmp ecx, byte '9'    ja .done        ; we have a valid character - multiply    ; result-so-far by 10, subtract '0'    ; from the character to convert it to    ; a number, and add it to result.        lea eax, [eax + eax * 4]    lea eax, [eax * 2 + ecx - '0']    jmp short .top.done:    ret;------------------------`
Note that this is "C compatible" - expects the parameter (address of the string) on the stack, and we have to "clean up the stack" on return (add esp, 4). This could be modified to pass the parameter in a register, if you want. The registers used, ecx and edx, are allowed to be "trashed" in C. You can't access 'em in C, but actually ecx (just cl, really) contains the "invalid digit" that ended processing, and edx points to the next character. This could be useful in evaluating a "dotted quad" IP address like "123.456.78.1", for example...

Not necessarily the "ideal" method, but I think it's "cute".

Best,
Frank

#### ashwin

• Jr. Member
• Posts: 3
##### Re: problem with linear search
« Reply #4 on: February 21, 2012, 04:47:42 PM »
I used this method only because i didnt know that sys_read exits as soon as "enter" key is pressed. i thought we had to get edx characters, no matter what, or else it would result in an error. Either way it works and that's all that matters for me!!

anyhow thanks for helping me out Frank