Topic: str replace

CrawlingCondor
str replace
« on: August 08, 2011, 10:08:30 AM »
Hey

I'm new to ASM, and especially to the Netwide Assembler. I'm trying to write my own str replace function, but I keep getting segmentation faults, and I don't know how to use gdb with pure asm code (dunno how to set breakpoints), so I can't locate the failing function.

I do know that the fault must be in replace_str; the other three 'functions' work like a charm...

Can you see the fault?

Code:
;coded by Lovly_Spid3r
;published on back2hack.cc

section .data
hello     db "Enter your text naaaaaauw",10   
helloLen  equ $-hello
replace_with db "test"
needle db "huhu"
MAX_STR_LEN equ 1000h
section .bss
input_text: resb 128
str_length: resd 1
str_len_needle: resd 1
section .text
global _start

_start:
;main logicz
push hello
push helloLen
call print_str
add esp, 8

push input_text
push 128
call read_str
add esp, 8

push input_text
push needle
push replace_with
call replace_str
add esp, 12

push input_text
push 128
call print_str
add esp, 8

mov eax,1
mov ebx,0
int 80h

;cdecl-style calling convention (the caller cleans up the stack with add esp, N)
;1 param: address of buffer to store the read text
;2 param: length of buf
read_str:
push ebp
mov ebp, esp

mov eax, 3
mov ebx, 1
mov ecx, [ebp+12]
mov edx, [ebp+8]

int 80h

pop ebp
ret

print_str:
push ebp
mov ebp, esp

mov eax, 4
mov ebx, 1
mov ecx, [ebp+12]
mov edx, [ebp+8]

int 80h

pop ebp
ret

;1 param: adress of string to measure
;strlen is stored in eax
str_len:
push ebp
mov ebp, esp

mov ecx, MAX_STR_LEN
loop_len:
cmp byte [ebp+8+ecx], 0h
jz found
loop loop_len
found:
mov eax, ecx

pop ebp
ret

;1 param: address of text
;2: address of needle
;3: address of replacement string
replace_str:
push ebp
mov ebp, esp

mov ebx, dword[ebp+12]
push ebx
call str_len
add esp, 4
mov [str_length], eax

mov ebx, dword[ebp+8]
push ebx
call str_len
add esp, 4
mov [str_len_needle], eax

;uses simple search algorithm
mov ecx, str_len
mov edx, 0h ;counts successfull collisions
;as long we cant find not the needle:
find_loop:
mov ah, byte[ebp+12+ecx];move the value in the char array at the current position into eax
cmp ah, 0h ;did we reached the end of the string?
jz end_algo
mov ah, byte[ebp+12+ecx]
cmp byte[ebp+8+ecx], ah;are the char values equal?
jz needle_success
jmp continue
needle_success:
mov eax, [str_len_needle]
cmp eax, edx
jz end_algo;then ecx-edx points to the offset in string where needle starts
inc edx
continue:
loop find_loop
end_algo:
mov eax, ecx
sub eax, edx;calculate needle offset within the char array
;now the replacin stuff :(
mov ecx, 0
replace_loop:
xchg byte[ebp+16+ecx],bh
mov byte[ebp+12+ecx],bh
cmp ecx, [str_len_needle]
jz replace_end
loop replace_loop
replace_end:
;we really made it, although, assembly is a ruuuuuuude way to play with computers

pop ebp
ret

Well, after reading up on how to debug with gdb using the DWARF and STABS debug formats, I can solve my problem myself. Sorry for posting before reading...
Have a nice day :)
« Last Edit: August 08, 2011, 01:56:45 PM by CrawlingCondor »

Frank Kotler (NASM Developer)
Re: str replace
« Reply #1 on: August 08, 2011, 02:50:22 PM »
Hi CrawlingCondor,

Good to hear you've got gdb working. That will help.

The "biggest" problem I see is that you're searching the stack. Only the address of the string is on the stack, not the string itself. Another fairly "big" problem is that you're looking for zeros, but you don't have any zero-terminated strings!
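
To make both points concrete, here is a minimal sketch, not a drop-in fix. The strings carry an explicit terminating zero, and the length routine first loads the string's address from the stack, then indexes the string through that register. The name str_len_z, the file name demo.asm, and the use of the exit status to report the result are assumptions made for illustration; the labels needle and replace_with are borrowed from the code above.

```nasm
; Sketch only: str_len_z is a hypothetical name, not code from this thread.
; Build (32-bit Linux assumed):
;   nasm -f elf32 demo.asm
;   ld -m elf_i386 -o demo demo.o

section .data
needle       db "huhu", 0         ; note the explicit terminating zero
replace_with db "test", 0

section .text
global _start

_start:
    push needle                   ; pass the ADDRESS of the string
    call str_len_z
    add  esp, 4                   ; caller removes the argument
    mov  ebx, eax                 ; exit status = length of needle
    mov  eax, 1                   ; sys_exit
    int  80h

; str_len_z: length of a zero-terminated string
;   [ebp+8] = address of the string
;   returns the length in eax (terminator not counted)
str_len_z:
    push ebp
    mov  ebp, esp
    mov  edx, [ebp+8]             ; fetch the string's address from the stack...
    xor  eax, eax                 ; eax = index / running length
.next:
    cmp  byte [edx+eax], 0        ; ...and index the string itself, not the stack
    je   .done
    inc  eax
    jmp  .next
.done:
    pop  ebp
    ret
```

Running ./demo and then echo $? should print 4, the length of "huhu". Compare this with cmp byte [ebp+8+ecx], 0h in the original, which walks over the stack frame rather than the string.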

If you can't get it working even with gdb's help, post again and we'll discuss it further...

Best,
Frank


avcaballero
Re: str replace
« Reply #2 on: August 09, 2011, 08:23:42 AM »
Interesting exercise  :)

I reused code that I wrote long ago to find one string inside another. It replaces all occurrences of "mayo" ("May") with "enero" ("January") in an excerpt from the Romance del prisionero (the "Romance of the Prisoner"). It can surely be improved, but it should work. It's in Spanish, sorry. Greetings.
« Last Edit: August 09, 2011, 08:26:25 AM by avcaballero »

avcaballero
Re: str replace
« Reply #3 on: August 09, 2011, 08:33:13 AM »
Oops. At the last moment I added one more parameter, so you have to change "ret 6 * 2" to "ret 7 * 2" for it to work properly ;)

Bryant Keller (Forum Moderator)
Re: str replace
« Reply #4 on: August 10, 2011, 02:32:06 AM »
Quote: (dunno how to set breakpoints)

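Breakpoints can be hard-coded using the 'INT3' instruction in your code. (Strictly speaking this is a software breakpoint; true hardware breakpoints use the CPU's debug registers.) You should definitely make that into a macro: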

Code:
%ifndef NDEBUG
%macro HW_BREAK 0
int3
%endmacro
%else
%macro HW_BREAK 0
; does nothing.
%endmacro
%endif

This way you can just insert HW_BREAK on any line where you want to force a break while debugging. Then, when you build your release code, just pass -DNDEBUG on the NASM command line.

If you want to set breakpoints in GDB, try using the tbreak (temporary breakpoint) and break commands.
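
A minimal sketch of how the macro and the GDB commands might fit together, assuming a 32-bit Linux build like the one used elsewhere in this thread. The file name demo.asm and the value 42 are placeholders chosen for illustration.

```nasm
; Sketch only: a tiny program that traps into the debugger before exiting.
; Debug build (32-bit Linux assumed; link WITHOUT -s so symbols survive):
;   nasm -f elf32 -g -F dwarf demo.asm
;   ld -m elf_i386 -o demo demo.o
; Release build: add -DNDEBUG to the nasm command line.
;
; In gdb:
;   (gdb) run                  ; stops with SIGTRAP at the int3
;   (gdb) info registers ebx   ; inspect a register
;   (gdb) continue
; Without editing the source, "break _start" or "tbreak _start"
; followed by "run" stops at the entry point instead.

%ifndef NDEBUG
%macro HW_BREAK 0
    int3
%endmacro
%else
%macro HW_BREAK 0
    ; does nothing.
%endmacro
%endif

section .text
global _start

_start:
    mov  ebx, 42                  ; a value worth inspecting
    HW_BREAK                      ; debugger stops here in a debug build
    mov  eax, 1                   ; sys_exit, exit status taken from ebx
    int  80h
```

Note that a debug build run outside a debugger is killed by the SIGTRAP that int3 raises, which is one more reason to assemble release builds with -DNDEBUG.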

HtH,
~Bryant


CrawlingCondor
Re: str replace
« Reply #5 on: August 17, 2011, 12:36:48 AM »
Thanks for all your replies :)
My first code wouldn't work, but after debugging with edb I finally succeeded, and I'm really looking forward to creating and developing new programs :)
My next little project will be a simple AI for an even simpler game :)
Anyway, here is the properly working code:

Code:
;coded by Lovly_Spid3r
;published on back2hack.cc
;that was pretty hard work oO
;nasm -f elf32 -g -F dwarf -l filename.lst  filename.asm
;ld -m elf_i386 -s -o a filename.o 2> log.txt
;usage: program string needle replace
;example usage ---> ./a thistextisthebase text f***
;output ---> thisf***isthebase

section .data
    MAX_STR_LEN equ 1000h
    needle: times 20 db 0x0
    replace: times 20 db 0x0

section .bss
    input_text: resb 128
    str_length: resd 1
    str_len_needle: resd 1
section .text
    global _start

_start:
    pop ecx;argc
    cmp ecx, 4
    jz con
    jmp quit

con:
    pop ebx;programname
    pop ebx;first param :)
    mov [input_text], ebx
    pop ebx
    mov [needle], ebx
    pop ebx
    mov [replace], ebx
   
   
    push input_text
    push needle
    push replace
    call replace_str
    add esp, 12
   
    mov eax, [input_text]
    push eax
    push dword[str_length]
    call print_str
    add esp, 8

quit:
    mov eax,1
    mov ebx,0
    int 80h

;cdecl-style calling convention (the caller cleans up the stack with add esp, N)
;1 param: address of buffer to store the read text
;2 param: length of buf
read_str:
    push ebp
    mov ebp, esp

    mov eax, 3
    mov ebx, 1
    mov ecx, [ebp+12]
    mov edx, [ebp+8]

    int 80h
   
    pop ebp
    ret
   
print_str:
    push ebp
    mov ebp, esp

    mov eax, 4
    mov ebx, 1
    mov ecx, [ebp+12]
    mov edx, [ebp+8]

    int 80h

    pop ebp
    ret

;1 param: adress of string to measure
;strlen is stored in eax
str_len:
    push ebp
    mov ebp, esp
   
    mov edx, 0 ;counter
    mov ebx,  [ebp+8]
    mov ebx, [ebx];is there any better solution??? feeling dumb now :(
    loop_len:
        mov ah,  byte [ebx+edx]
        inc edx
        cmp ah, 0h
        jz found
        jmp loop_len
    found:
        mov eax, edx
       
    pop ebp
    ret

;1 param: address of text
;2: address of needle
;3: address of replacement string
replace_str:
    push ebp
    mov ebp, esp
   
    mov ebx, dword[ebp+16]
    push ebx
    call str_len
    add esp, 4
    mov [str_length], eax
   
    mov ebx, dword[ebp+12]
    push ebx
    call str_len
    add esp, 4
    mov [str_len_needle], eax
   
    ;uses simple search algorithm
    mov edx, 0h ;counts successfull collisions
    mov ecx, 0h ;eatz catz
    ;again here: feeling not comfortable with this solution :(
    mov esi, [ebp+12]
    mov esi, [esi]
    mov edi, [ebp+16]
    mov edi, [edi]
    mov ebx, 1h ;determines wheter last char was goodboy
    ;as long we cant find not the needle:
    find_loop:
        mov eax, [str_len_needle]
        push ecx;save ecx on the stack
        mov ecx, edx
        inc ecx
        cmp eax, ecx
        pop ecx;restore ecx
        jz end_algo;then ecx-edx points to the offset in string where needle starts
        cmp ecx, dword[str_length] ;did we reached the end of the string?
        jz end_algo
        mov ah, byte[esi+edx] ;lil trick here :
        cmp byte[edi+ecx], ah;are the char values equal?
        jz needle_success
        xor ebx, ebx
        xor edx, edx
        jmp continue
        needle_success:
            cmp edx, 1
            jge check           
            inc edx
            add ebx, 1
            jmp continue
            check:
                cmp ebx, 0
                jnz continue_add_edx
                xor edx, edx
                jmp continue
        continue_add_edx:
            inc edx
        continue:
        inc ecx
        jmp find_loop
    end_algo:
    mov eax, ecx
    sub eax, edx;calculate needle offset within the char array
    ;now the replacin stuff :(
    mov ecx, 0
    mov esi, [ebp+8]
    mov esi, [esi]
    replace_loop:
        xchg byte[esi+ecx],bh
        mov byte[edi+eax],bh
        inc eax
        inc ecx
        cmp ecx, [str_len_needle]
        jz replace_end
        jmp replace_loop
    replace_end:
    ;we really made it, although, assembly is a ruuuuuuude way to play with computers
   
    pop ebp
    ret

Frank Kotler
Re: str replace
« Reply #6 on: August 17, 2011, 01:43:51 AM »
Maybe it's just me, but this thing seems to be dropping the character after the replacement, no? Behavior is somewhat erratic if needle and replace texts are not the same length. Might need some more work before you declare it "working properly". Good start, though - much improved from your last example!
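
One way to sidestep both symptoms while experimenting is to restrict the routine to the simple case first. The sketch below is an assumption-laden illustration, not a repair of the posted code: it replaces only the first occurrence, expects all three strings to be zero-terminated, and never writes more bytes than the needle occupies, so the characters after the match are left alone. The names replace_first, haystack and repl, and the sample strings, are made up for the example.

```nasm
; Sketch only: replace_first, haystack and repl are hypothetical names.
; Build (32-bit Linux assumed):
;   nasm -f elf32 demo.asm
;   ld -m elf_i386 -o demo demo.o

section .data
haystack     db "thistextisthebase", 10, 0
haystack_len equ $ - haystack - 1   ; bytes to print, terminator excluded
needle       db "text", 0
repl         db "fish", 0           ; same length as needle (assumed)

section .text
global _start

_start:
    push repl
    push needle
    push haystack
    call replace_first
    add  esp, 12                  ; caller removes the three arguments

    mov  eax, 4                   ; sys_write
    mov  ebx, 1                   ; stdout
    mov  ecx, haystack
    mov  edx, haystack_len
    int  80h

    mov  eax, 1                   ; sys_exit
    xor  ebx, ebx
    int  80h

; replace_first: [ebp+8] = text, [ebp+12] = needle, [ebp+16] = replacement
;   returns eax = 1 if a match was overwritten, 0 if needle was not found
replace_first:
    push ebp
    mov  ebp, esp
    push esi
    push edi
    push ebx
    mov  edi, [ebp+8]             ; edi = candidate start position in text
.scan:
    mov  esi, [ebp+12]            ; esi = needle
    xor  ecx, ecx                 ; ecx = index within needle
.cmp:
    mov  al, [esi+ecx]
    test al, al
    jz   .match                   ; end of needle reached: match starts at edi
    mov  bl, [edi+ecx]
    test bl, bl
    jz   .none                    ; text ended first: no match is possible
    cmp  al, bl
    jne  .advance
    inc  ecx
    jmp  .cmp
.advance:
    inc  edi                      ; retry one character further along
    jmp  .scan
.match:
    mov  edx, ecx                 ; edx = needle length
    mov  esi, [ebp+16]            ; esi = replacement
    xor  ecx, ecx
.copy:
    cmp  ecx, edx                 ; never write past the matched region
    jae  .replaced
    mov  al, [esi+ecx]
    test al, al
    jz   .replaced                ; replacement shorter: stop early
    mov  [edi+ecx], al
    inc  ecx
    jmp  .copy
.replaced:
    mov  eax, 1
    jmp  .exit
.none:
    xor  eax, eax
.exit:
    pop  ebx
    pop  edi
    pop  esi
    pop  ebp
    ret
```

With the sample data this should print thisfishisthebase. Handling a replacement that is longer or shorter than the needle means shifting the tail of the text (and having a buffer big enough to hold the result), which this sketch deliberately leaves out.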

If you don't want to have to "dereference the pointer" (mov esi, [esi] etc.) then don't pass a pointer. You've popped the address of a zero-terminated string from the stack, and stored it in "input_text" (etc.). Then you "push input_text" before calling your function(s). You've passed your function an address where the address of your string can be found (a pointer). Nothing wrong with this. If you don't want to have to "dereference the pointer", just "push dword [input_text]" (etc.) instead.
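
As a rough illustration of the difference, the following sketch stores the address of argv[1] in a variable and then passes the stored value rather than the variable's address, so the callee needs no extra dereference. The names argv1_ptr and str_len_z, the file name demo.asm, and the use of the exit status to report the length are assumptions made for the example.

```nasm
; Sketch only: argv1_ptr and str_len_z are hypothetical names.
; Build (32-bit Linux assumed):
;   nasm -f elf32 demo.asm
;   ld -m elf_i386 -o demo demo.o

section .bss
argv1_ptr: resd 1                 ; holds the ADDRESS of argv[1]

section .text
global _start

_start:
    pop  ecx                      ; argc
    cmp  ecx, 2
    jne  .usage
    pop  ebx                      ; argv[0], the program name
    pop  ebx                      ; argv[1], address of a zero-terminated string
    mov  [argv1_ptr], ebx

    ; "push argv1_ptr" would pass a pointer to the pointer, and the callee
    ; would need "mov ebx, [ebx]". Pushing the stored value instead passes
    ; the string's address directly.
    push dword [argv1_ptr]
    call str_len_z
    add  esp, 4

    mov  ebx, eax                 ; exit status = length of argv[1] (mod 256)
    mov  eax, 1                   ; sys_exit
    int  80h
.usage:
    mov  ebx, 255                 ; arbitrary "wrong argument count" status
    mov  eax, 1
    int  80h

; str_len_z: [ebp+8] = address of a zero-terminated string
;   returns the length in eax (terminator not counted)
str_len_z:
    push ebp
    mov  ebp, esp
    mov  edx, [ebp+8]             ; already the string's address
    xor  eax, eax
.next:
    cmp  byte [edx+eax], 0
    je   .done
    inc  eax
    jmp  .next
.done:
    pop  ebp
    ret
```

Running ./demo hello and then echo $? should print 5.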

I hope you get that AI thing working - the regular kind doesn't work worth squat! :)

Best,
Frank