Author Topic: What's wrong in this code please?  (Read 8478 times)

Offline prdas31

  • Jr. Member
  • *
  • Posts: 7
What's wrong in this code please?
« on: March 22, 2018, 05:50:20 AM »
Request someone to point out the errors in this code please. [g++, nasm64, windows 10, fastcall]

C++ caller:

Code: [Select]
extern "C" char* ConCatProc(char*, char*);


int main() {
char str1[] = "Test string from C++";
char str2[] = "Test string from NASM64";

char* rax = ConCatProc(str1, str2);
cout << "Concat: " <<rax<< endl;

return 0;
}

The called Nasm routine:

Code: [Select]
segment .data

concbuf times 2 dq " ",'0'

segment .text

global ConCatProc

ConCatProc:
pushfq

mov [concbuf], rcx
mov [concbuf + 1], rdx

mov rax, [concbuf]


popfq
ret

Appreciate if someone can help correcting the problem.
Thanks.

Offline stressful

  • Full Member
  • **
  • Posts: 105
  • Country: 00
    • CPU2.0
Re: What's wrong in this code please?
« Reply #1 on: March 22, 2018, 01:42:42 PM »
Maybe what you want is a correct buffer setup with large enough size to hold both strings.

Code: [Select]
section .bss
conBuff: resb 100

Then the next logical steps would be;

0. Make RDI point to the buffer.
1. Find the length of first string.
2. Copy the 1st string to the buffer, according to the length
3. Find the length of second string
4. Copy the 2nd string to the buffer, according to the length
5. 0-ended the buffer
6. Return the address of the buffer in RAX

Need to use lots of string instructions (stosb, scasb, rep, repne etc).  I think you should be aware what you are dealing with. Calling C's "strcat" is probably not a bad idea if you're new.

Offline prdas31

  • Jr. Member
  • *
  • Posts: 7
Re: What's wrong in this code please?
« Reply #2 on: March 22, 2018, 04:10:00 PM »
Maybe what you want is a correct buffer setup with large enough size to hold both strings.

Code: [Select]
section .bss
conBuff: resb 100

Then the next logical steps would be;

0. Make RDI point to the buffer.
1. Find the length of first string.
2. Copy the 1st string to the buffer, according to the length
3. Find the length of second string
4. Copy the 2nd string to the buffer, according to the length
5. 0-ended the buffer
6. Return the address of the buffer in RAX

Need to use lots of string instructions (stosb, scasb, rep, repne etc).  I think you should be aware what you are dealing with. Calling C's "strcat" is probably not a bad idea if you're new.

Thanks for your reply.
I was also thinking in similar lines and so followed your advice to write this implementation below.
But it still does not work, and returns only the first string.

Could you point out the problems please?

Code: [Select]
segment .data

segment .bss
concbuf: resb 100

segment .text

global ConCatProc


;Calculate length of string in RCX
LengthOfStringProc:
push rdi

mov rdi, rcx
xor rax, rax

_strlen_next:
    cmp         byte [rdi],  0
    jz    _strlen_null_or_done
    inc rdi
    inc         rax
    jne         _strlen_next

_strlen_null_or_done:

mov rdi, rcx

    ; return length of string in RAX

    pop rdi
ret



;Proc called from C++
ConCatProc:

mov r9, rcx
call LengthOfStringProc
mov r10, rax ;length of str 1

mov r11, rdx
mov rcx, rdx
call LengthOfStringProc
mov r12, rax ;length of str 1

push rdi

lea rdi, [concbuf] ;Make RDI point to the buffer.

;Copy the 1st string to the buffer, according to the length
cld
mov rax, r9
mov rcx, r10
rep stosq

;Copy the 2nd string to the buffer, according to the length
lea rdi, [concbuf + r10+1]
cld
mov rax, r11
mov rcx, r12
rep stosq

;0-ended the buffer
lea rdi, [concbuf + r10+r12+1]
mov rdi, '0'

mov rax, [concbuf] ;should return the concatenated buffer? NOT WORKING, returns only the first string

pop rdi

ret


EDIT: I think I cannot use the QWORD variant of STOS. Have to use the STOSB. Is that correct?
Thanks a lot.
« Last Edit: March 22, 2018, 04:35:05 PM by prdas31 »

Offline stressful

  • Full Member
  • **
  • Posts: 105
  • Country: 00
    • CPU2.0
Re: What's wrong in this code please?
« Reply #3 on: March 22, 2018, 04:46:40 PM »
You're close. Only that I expected you use MOVSB instead of STOSQ. I suspect you're dealing with UTF characters?

Here a quick code, dealing with ASCII text, that is close enough to your interpretation and my suggested algorithm above...

Code: [Select]
        global ConCatProc

        section .bss
conbuf: resb 100

        section .text
ConCatProc:
        pushfq
        cld                ;all-forward operation
        mov     rdi,conbuf ;pointer to buffer
        mov     r8,rcx     ;first string
        mov     r9,rdx     ;second string

        mov     rsi,r8     ;find string1 length
        call    str_length
        mov     rcx,rax
        rep     movsb      ;copy to buffer. RSI--> RDI

        mov     rsi,r9     ;find string2 length
        call    str_length
        mov     rcx,rax
        rep     movsb      ;copy to buffer. RSI--> RDI

        xor     al,al
        stosb              ;0-ended the buffer
        mov     rax,conbuf ;return buffer's address
        popfq
        ret

;Find length of a C string
str_length:
        push    rdi
        mov     rdi,rsi
        mov     al,0
        mov     rcx,-1
        repne   scasb
        mov     rax,-2
        sub     rax,rcx
        pop     rdi
        ret

The key here is to not manually modify RDI as the buffer's pointer. Let it increments naturally in MOVSB instructions.


Offline prdas31

  • Jr. Member
  • *
  • Posts: 7
Re: What's wrong in this code please?
« Reply #4 on: March 23, 2018, 03:50:50 AM »
You're close. Only that I expected you use MOVSB instead of STOSQ. I suspect you're dealing with UTF characters?

Here a quick code, dealing with ASCII text, that is close enough to your interpretation and my suggested algorithm above...

Code: [Select]
        global ConCatProc

        section .bss
conbuf: resb 100

        section .text
ConCatProc:
        pushfq
        cld                ;all-forward operation
        mov     rdi,conbuf ;pointer to buffer
        mov     r8,rcx     ;first string
        mov     r9,rdx     ;second string

        mov     rsi,r8     ;find string1 length
        call    str_length
        mov     rcx,rax
        rep     movsb      ;copy to buffer. RSI--> RDI

        mov     rsi,r9     ;find string2 length
        call    str_length
        mov     rcx,rax
        rep     movsb      ;copy to buffer. RSI--> RDI

        xor     al,al
        stosb              ;0-ended the buffer
        mov     rax,conbuf ;return buffer's address
        popfq
        ret

;Find length of a C string
str_length:
        push    rdi
        mov     rdi,rsi
        mov     al,0
        mov     rcx,-1
        repne   scasb
        mov     rax,-2
        sub     rax,rcx
        pop     rdi
        ret

The key here is to not manually modify RDI as the buffer's pointer. Let it increments naturally in MOVSB instructions.

Thanks a lot!

I tried with MOBSB earlier,  but somehow could not make it work correctly. :-[

Now thanks to you I've re-adjusted my code accordingly and it works great.

The fact is that the second MOVSB can also be replaced by a MOVSQ, like in the following code.
Also, since I have used LEA DI, buffer to load the address instead of MOV, I think I can get rid of CLD:

Code: [Select]
  global ConCatProc, LengthOfStringProc

        section .bss
concbuf: resb 100

        section .text
ConCatProc:
        pushfq
        ;;;;;;;;;;;;;cld                ;all-forward operation

        mov r9, rcx
call LengthOfStringProc
mov r10, rax ;length of str 1

mov r11, rdx
mov rcx, rdx
call LengthOfStringProc
mov r12, rax ;length of str 2

lea rdi, [concbuf]  ;Make RDI point to the buffer.

        mov     rsi,r9
        mov     rcx,r10
        rep     movsb      ;copy to buffer. RSI--> RDI


        mov     rsi,r11
        mov     rcx,r12
        rep     movsq      ;copy to buffer. RSI--> RDI

        xor     al,al
        stosb               ;0-ended the buffer
        mov     rax,concbuf ;return buffer's address

        popfq
        ret

;Find length of a C string
LengthOfStringProc:
push rdi

mov rdi, rcx
xor rax, rax

_strlen_next:
    cmp         byte [rdi],  0
    jz    _strlen_null_or_done
    inc rdi
    inc         rax
    jne         _strlen_next

_strlen_null_or_done:

mov rdi, rcx

    ; return length of string in RAX

    pop rdi
ret



Nonetheless, I think your implementation of string length is far better optimized than mine.
Anyway, it works now, and thanks again for your help and time!

« Last Edit: March 23, 2018, 03:54:08 AM by prdas31 »

Offline prdas31

  • Jr. Member
  • *
  • Posts: 7
Re: What's wrong in this code please?
« Reply #5 on: March 24, 2018, 06:04:27 AM »
You're close. Only that I expected you use MOVSB instead of STOSQ. I suspect you're dealing with UTF characters?

Here a quick code, dealing with ASCII text, that is close enough to your interpretation and my suggested algorithm above...

Code: [Select]
        global ConCatProc

        section .bss
conbuf: resb 100

        section .text
ConCatProc:
        pushfq
        cld                ;all-forward operation
        mov     rdi,conbuf ;pointer to buffer
        mov     r8,rcx     ;first string
        mov     r9,rdx     ;second string

        mov     rsi,r8     ;find string1 length
        call    str_length
        mov     rcx,rax
        rep     movsb      ;copy to buffer. RSI--> RDI

        mov     rsi,r9     ;find string2 length
        call    str_length
        mov     rcx,rax
        rep     movsb      ;copy to buffer. RSI--> RDI

        xor     al,al
        stosb              ;0-ended the buffer
        mov     rax,conbuf ;return buffer's address
        popfq
        ret

;Find length of a C string
str_length:
        push    rdi
        mov     rdi,rsi
        mov     al,0
        mov     rcx,-1
        repne   scasb
        mov     rax,-2
        sub     rax,rcx
        pop     rdi
        ret

The key here is to not manually modify RDI as the buffer's pointer. Let it increments naturally in MOVSB instructions.

Thanks a lot!

I tried with MOBSB earlier,  but somehow could not make it work correctly. :-[

Now thanks to you I've re-adjusted my code accordingly and it works great.

The fact is that the second MOVSB can also be replaced by a MOVSQ, like in the following code.
Also, since I have used LEA DI, buffer to load the address instead of MOV, I think I can get rid of CLD:

Code: [Select]
  global ConCatProc, LengthOfStringProc

        section .bss
concbuf: resb 100

        section .text
ConCatProc:
        pushfq
        ;;;;;;;;;;;;;cld                ;all-forward operation

        mov r9, rcx
call LengthOfStringProc
mov r10, rax ;length of str 1

mov r11, rdx
mov rcx, rdx
call LengthOfStringProc
mov r12, rax ;length of str 2

lea rdi, [concbuf]  ;Make RDI point to the buffer.

        mov     rsi,r9
        mov     rcx,r10
        rep     movsb      ;copy to buffer. RSI--> RDI


        mov     rsi,r11
        mov     rcx,r12
        rep     movsq      ;copy to buffer. RSI--> RDI

        xor     al,al
        stosb               ;0-ended the buffer
        mov     rax,concbuf ;return buffer's address

        popfq
        ret

;Find length of a C string
LengthOfStringProc:
push rdi

mov rdi, rcx
xor rax, rax

_strlen_next:
    cmp         byte [rdi],  0
    jz    _strlen_null_or_done
    inc rdi
    inc         rax
    jne         _strlen_next

_strlen_null_or_done:

mov rdi, rcx

    ; return length of string in RAX

    pop rdi
ret



Nonetheless, I think your implementation of string length is far better optimized than mine.
Anyway, it works now, and thanks again for your help and time!

One thing that I noticed about overflown stack is when I build this code (same code, just ported) in VStudio with MASM64. With the /GS option (canary check) the C++ compiler detected a stack overflow and I had to change the second MOVSQ to MOBSB to rectify this.

I've experimented this in GCC with these options, but none detected any error or thrown any warnings:

Code: [Select]
-Wformat-overflow=2 -Wformat-truncation=2 -D_FORTIFY_SOURCE=1 -fstack-check -fstack-protector-strong -fstack-protector-explicit

However G++ detected some problemwhen I used the -fstack-protector in the compiler, which then throws this following error:
Code: [Select]
undefined reference to `__stack_chk_fail'
But now the problem is that it throws this error even after I changed  second MOVSQ to MOBSB !!

Any ideas please?

Thanks.






Offline dreamCoder

  • Full Member
  • **
  • Posts: 107
Re: What's wrong in this code please?
« Reply #6 on: March 24, 2018, 11:33:09 AM »
movsq is really not necessary and potentially erronous. movsq will transfer 8-bytes at a time and updates RSI/RDI to 8 every time. In a tight heap allocation policy, RDI/RSI from movsq will likely to point to other area, breaking some section/memory barrier, hence "overflow" after addressing the last data, forcing you to use all g++ unnecessary "stack" switches. So you should stick to movsb.

Refer to movs/b/q/d/q here

Offline dreamCoder

  • Full Member
  • **
  • Posts: 107
Re: What's wrong in this code please?
« Reply #7 on: March 24, 2018, 11:45:53 AM »
Quote
Also, since I have used LEA DI, buffer to load the address instead of MOV, I think I can get rid of CLD

CLD has nothing to do with LEA. Setting the CLD (DF=0) is important in string operations so that RDI/RSI will be incremented towards higher memory address after every string operations (movs, stos, scas, cmps + rep, repne, repe). If, the caller to your function set the string direction backward instead, via STD, then your function will give you funny results due to the strings being saved/read from backwards. So using CLD is a wise precaution measure. stressful's code has lots of educational materials in there. So you should stick to it until you know some more stuff.