Topic: Segmentation fault error!

maria03
« on: October 06, 2014, 01:47:18 PM »
Hello,
I have a university project to do: in the first step I have to program an algorithm in C, and in the second step I have to rewrite some of the functions in NASM to improve performance.
The first function I decided to write in NASM is totalCost (it sums the elements of an array).
The NASM function is called from the C program and takes three parameters: cost (the array), n (the length of the array), and sum (the result).
Code: [Select]
global costTot


cost equ 8
n equ 12
sum equ 16



costTot:

push ebp
mov ebp, esp
push ebx
push esi
push edi

mov edi, [ebp+cost] ;
mov   esi, [ebp+n]
xorps xmm2,xmm2

mov ecx,0 ;index of the loop

movups xmm0,[edi]

mov edx,16

;if an element is less than 0 it is not added

.loop xorps xmm1,xmm1
cmpltps xmm1,xmm0
printregps xmm0
haddps xmm0,xmm0
haddps xmm0,xmm0


addss xmm2,xmm0
printregps xmm2
add ecx,4 ; single precision, so each register holds 4 numbers


movups  xmm0,[edi+edx] ;
add edx,16 ; advance 16 bytes to the next four numbers

cmp ecx,esi
jb .loop
movups [ebp+sum],xmm2


pop edi ; restore the callee-saved registers
pop esi
pop ebx
mov esp, ebp ; restore the stack pointer
pop ebp ; restore the base pointer
ret ; return to the calling C function
When I launch the program I get a segmentation fault and I don't know why.
The code runs in VirtualBox with Ubuntu.
Can someone help me?!
Thanks

NB: I'm sorry for the bad english  :'(

Rob Neff (Forum Moderator)
« Reply #1 on: October 06, 2014, 04:16:35 PM »
Code: [Select]
printregps xmm0

The call to that function ( I'm assuming a macro invocation call ) will probably destroy the contents in ecx and edx which are being used in your loop.
Without seeing how you've implemented that function I can't really state much more.

maria03
« Reply #2 on: October 06, 2014, 05:30:32 PM »
Sorry, you're right!
The professor gave us a file, ssutils, with some macros to help us with the programming; for example, printregps prints the four elements of a register to the console. I'm posting the whole macro file because printregps uses other macros.
Code: [Select]
extern printf

section .bss

dbuf: resq 1

section .data

imask: db 'Stampa: %d',13,10,0
dmask: db '%f ',0
cr: db 10,0
br1: db '( ',0
br2: db ')',10,0
align 16
xmmtemp: dd 0.0, 0.0, 0.0, 0.0


%macro start 0
push ebp
mov ebp, esp
pushad
%endmacro

%macro stop 0
popad
mov esp, ebp
pop ebp
ret
%endmacro

%macro prints 1
pushad
push %1
call printf
add esp, 4
popad
%endmacro

%macro printreg32 1
pushad
push %1
push imask
call printf
add esp, 8
popad
%endmacro

%macro printregps 1
vpushax
        movaps [xmmtemp], %1
        printps xmmtemp, 1
vpopax
%endmacro


%macro dprint 1
pushad
mov eax,[%1+4]
push eax
mov eax,[%1]
push eax
push dmask
call printf
add esp, 12
popad
%endmacro

%macro sprint 1
finit
fld dword [%1]
fst qword [dbuf]
dprint dbuf
%endmacro

%macro printss 1
sprint %1
prints cr
%endmacro

%macro printps 2
prints br1
push edx
push ecx
mov edx, %1
mov ecx, %2
%%loopps:
sprint edx
sprint edx+4
sprint edx+8
sprint edx+12
add edx, 16
dec ecx
jnz %%loopps
pop ecx
pop edx
prints br2
%endmacro

%macro vprintps 2
prints br1
push edx
push ecx
mov edx, %1
mov ecx, %2
%%loopps:
sprint edx
sprint edx+4
sprint edx+8
sprint edx+12
sprint edx+16
sprint edx+20
sprint edx+24
sprint edx+28
add edx, 32
dec ecx
jnz %%loopps
pop ecx
pop edx
prints br2
%endmacro

%macro printsd 1
dprint %1
prints cr
%endmacro

%macro printpd 2
prints br1
push edx
push ecx
mov edx, %1
mov ecx, %2
%%looppd:
dprint edx
dprint edx+8
add edx, 16
dec ecx
jnz %%looppd
pop ecx
pop edx
prints br2
%endmacro

%macro vprintpd 2
prints br1
push edx
push ecx
mov edx, %1
mov ecx, %2
%%looppd:
dprint edx
dprint edx+8
dprint edx+16
dprint edx+24
add edx, 32
dec ecx
jnz %%looppd
pop ecx
pop edx
prints br2
%endmacro
%macro vpush 1
sub esp, 16
movups [esp], %1
%endmacro

%macro vpop 1
movups %1, [esp]
add esp, 16
%endmacro

%macro vpushax 0
vpush xmm0
vpush xmm1
vpush xmm2
vpush xmm3
vpush xmm4
vpush xmm5
vpush xmm6
vpush xmm7

%endmacro

%macro vpopax 0

vpop xmm7
vpop xmm6
vpop xmm5
vpop xmm4
vpop xmm3
vpop xmm2
vpop xmm1
vpop xmm0
%endmacro
I'm so sorry, but I only started programming with NASM a few days ago.
 :'( :'( :'( :'(

Frank Kotler (NASM Developer)
« Reply #3 on: October 06, 2014, 05:42:44 PM »
Thanks Rob. Good catch. I noticed that was a "mystery", but it hadn't occurred to me that it might trash registers (causing your loop not to end and run off into memory you don't "own"... causing a segfault).

This is mostly over my head - I still haven't learned those xmm instructions. I know that some of 'em expect aligned data, and will segfault otherwise. You don't seem to have used those. Messing up the stack so you miss the return address will segfault. You don't seem to have done that. Rob may have caught it.

Something that caught my eye was the way you return the result:
Quote
it has the parameters:cost (the array), n (length of the array), sum (the result)
Code: [Select]
movups [ebp+sum],xmm2
That will put the result on the stack, but will not put it into the C variable, which I assume you want. I think what you want to pass as a parameter is the address of the C variable (&sum ?). In that case, you'd want something like:
Code: [Select]
    mov ecx, [ebp + sum]
    movups [ecx], xmm2
That wouldn't cause a segfault (I don't think), but might cause your code not to work as expected.
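
To illustrate the point about passing the address of the C variable, here is a minimal C sketch of the out-parameter pattern. It assumes the third parameter is declared as a float pointer; the name "cost_total_ref" and the plain loop are hypothetical stand-ins for the assembly routine, not the poster's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C stand-in for the assembly routine. The third argument is
 * the ADDRESS of the caller's variable, so the result is written through
 * the pointer instead of into the argument slot on the stack. */
void cost_total_ref(const float *cost, int n, float *sum)
{
    assert(cost != NULL && sum != NULL);

    float total = 0.0f;
    for (int i = 0; i < n; i++)
        total += cost[i];

    /* Same idea as "mov ecx, [ebp + sum]" followed by a store to [ecx]. */
    *sum = total;
}
```

A caller would declare "float sum;" and pass "&sum" as the third argument, which is what makes the dereference in the last line valid.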

Please remember that this is beyond my skill level and I could be completely off base. Good luck with it!

Best,
Frank


Rob Neff
« Reply #4 on: October 07, 2014, 02:15:28 AM »
As Frank noted the way you currently attempt to return the results will cause a GPF.
Also, depending on the address of the cost array you most certainly can GPF due to your use of edx.
Finally, make sure that the address in your C code that will contain the results is at least 16 bytes of storage or you will overwrite the stack and/or potentially GPF.
I've inserted comments into the relevant areas of your code below:

Code: [Select]
global costTot


cost equ 8
n equ 12
sum equ 16



costTot:

push ebp
mov ebp, esp
push ebx
push esi
push edi

mov edi, [ebp+cost] ;
mov   esi, [ebp+n]
xorps xmm2,xmm2

mov ecx,0 ;index of the loop

movups xmm0,[edi]

mov edx,16   ;; <-- You initialize edx to 16

;if an element is less than 0 it is not added

.loop xorps xmm1,xmm1
cmpltps xmm1,xmm0
printregps xmm0
haddps xmm0,xmm0
haddps xmm0,xmm0


addss xmm2,xmm0
printregps xmm2
add ecx,4 ; single precision, so each register holds 4 numbers


movups  xmm0,[edi+edx] ;  <-- Get the next four numbers
                       ;      This load will eventually read past the end of the array!
add edx,16 ; advance 16 bytes to the next four numbers

cmp ecx,esi
jb .loop
movups [ebp+sum],xmm2  ; <-- This is certainly not correct!


pop edi ; restore the callee-saved registers
pop esi
pop ebx
mov esp, ebp ; restore the stack pointer
pop ebp ; restore the base pointer
ret ; return to the calling C function

I made some very slight adjustments.  Note that the code is still a far cry from "optimized".  We're more concerned with getting you past the GPF at the moment.

Code: [Select]
costTot:

push ebp
mov ebp, esp
push ebx
push esi
push edi

xorps xmm2,xmm2   ; initialize the return value

mov edi, [ebp+cost] ;
mov   esi, [ebp+n]
cmp   esi, 0
jle  .done    ; prevent a zero or negative length array

mov ecx,0 ;index of the loop

xor  edx, edx    ; offset into array

;if an element is less than 0 it is not added

.loop
movups xmm0,[edi + edx]

        xorps xmm1,xmm1
cmpltps xmm1,xmm0
printregps xmm0
haddps xmm0,xmm0
haddps xmm0,xmm0


addss xmm2,xmm0
printregps xmm2

add ecx,4 ; single precision, so each register holds 4 numbers

add edx,16 ; advance 16 bytes to the next four numbers

cmp ecx,esi
jb .loop

.done:

mov  ecx, [ebp+sum]
movups [ecx], xmm2   ; Note: make sure that the address in your C code
                     ; that will contain the results is at least 16 bytes!

pop edi ; restore the callee-saved registers
pop esi
pop ebx
mov esp, ebp ; restore the stack pointer
pop ebp ; restore the base pointer
ret ; return to the calling C function

That code is untested so give it a try and report back your results.
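
For checking the assembly's output, a plain C reference of what the loop appears to be meant to compute can help. This is a sketch under assumptions: the name "cost_total_blocks" is hypothetical, it returns the sum instead of writing it through a pointer, and it applies the "skip elements below zero" rule from the code comment. In the posted assembly, the mask that cmpltps builds in xmm1 is never applied to xmm0, so negative elements would still be added there. The sketch also handles a length that is not a multiple of four, which the 16-byte loads in the assembly do not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference: sums cost[0..n), skipping any element below
 * zero, working in groups of four like the SSE loop does. */
float cost_total_blocks(const float *cost, int n)
{
    assert(cost != NULL || n <= 0);

    float total = 0.0f;
    int i = 0;

    /* Full groups of four floats (one 16-byte load each in the assembly). */
    for (; i + 4 <= n; i += 4) {
        float group = 0.0f;
        for (int lane = 0; lane < 4; lane++)
            if (cost[i + lane] >= 0.0f)      /* the masking step */
                group += cost[i + lane];
        total += group;                       /* horizontal add, then accumulate */
    }

    /* Leftover elements when n is not a multiple of four. */
    for (; i < n; i++)
        if (cost[i] >= 0.0f)
            total += cost[i];

    return total;
}
```

Comparing this value with what the assembly prints for the same input is a quick way to tell a wrong result apart from a crash.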

maria03
« Reply #5 on: October 07, 2014, 08:35:47 AM »
I tried these changes but nothing changed. :'( :'(
I also tried commenting out the printregps macro, but the problem is still there.
Each number is a float, so it is 4 bytes. So I think the result is smaller than 16 bytes.
I can't understand why the code ran fine the first time, and the next day when I tried to run it again I got a segmentation fault. :'( :'( :'(

maria03
« Reply #6 on: October 07, 2014, 02:33:49 PM »
Maybe another thing to test is aligning the address, but I don't know how to do it! Yes, I know that I must use the "aps" instructions, but I don't know what I must do before that.

Rob Neff
« Reply #7 on: October 07, 2014, 03:26:26 PM »
Post your C code that is calling your assembly function so we can examine it as well. 
We will find your bug.

Frank Kotler
« Reply #8 on: October 07, 2014, 04:24:33 PM »
Quote
Each number is a float, and so it is 4 byte. So I think that the result is smaller than 16 bytes.
I'm not sure that this is correct. As I understand it, you want just one single-precision float - four bytes - the sum of your array. But unless I'm mistaken (entirely possible) an xmm register holds four such floats - sixteen bytes - and you need to provide storage for that. I don't know how you find the one you want...
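
On finding "the one you want": since xmm2 starts out zeroed and addss only changes the lowest lane, the running sum ends up in the lowest four bytes of the register. The following C sketch models that with a hypothetical "xmm_model" struct (it is a toy model, not real SSE code): a movss-style store copies only the lowest lane, while a movups-style store copies all sixteen bytes and therefore needs a sixteen-byte destination.

```c
#include <assert.h>
#include <string.h>

/* Toy model of an XMM register: four single-precision lanes, 16 bytes. */
typedef struct { float lane[4]; } xmm_model;

/* Model of "movss [dst], xmm": copies only lane 0 (4 bytes). */
void store_low_float(float *dst, const xmm_model *reg)
{
    assert(dst != NULL && reg != NULL);
    memcpy(dst, &reg->lane[0], sizeof(float));
}

/* Model of "movups [dst], xmm": copies all 16 bytes, so dst must point
 * at storage for at least four floats. */
void store_all_lanes(float dst[4], const xmm_model *reg)
{
    assert(dst != NULL && reg != NULL);
    memcpy(dst, reg->lane, sizeof reg->lane);
}
```

With a destination that holds a single float, only the first form stays inside the allocation.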

Now that you've provided the macros, I may attempt to assemble your code and fake up a caller to make something I can run (if I can get into the mood). I'll never learn it any younger.

If I were you, I'd stick with the "ups" instructions until you get that working, then advance to the "aps" instructions. Take it one step at a time, would be my advice.
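
For the later step of moving to the "aps" instructions, the data has to sit on a 16-byte boundary, which plain malloc does not promise on every 32-bit system. A minimal sketch of one way to get such memory from C, assuming a C11 library that provides aligned_alloc; the helper name "alloc_floats_aligned16" is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for `count` floats on a 16-byte
 * boundary, rounded up to whole 16-byte blocks. Release with free(). */
float *alloc_floats_aligned16(size_t count)
{
    size_t bytes = count * sizeof(float);
    size_t rounded = (bytes + 15u) & ~(size_t)15u;   /* multiple of 16 */
    if (rounded == 0)
        rounded = 16;

    float *p = aligned_alloc(16, rounded);           /* C11, <stdlib.h> */
    assert(p == NULL || ((uintptr_t)p % 16u) == 0);
    return p;                                        /* NULL on failure */
}
```

Rounding the size up to a multiple of 16 also means a full 16-byte load or store on the last block stays inside the allocation.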

If I may be permitted a totally off-topic comment... I've got some friends visiting Italy and sending back pictures.  What a beautiful country!

Best,
Frank


Frank Kotler
« Reply #9 on: October 07, 2014, 05:43:59 PM »
Well... no luck. I knew that I wasn't familiar with "haddps", but apparently my CPU is unfamiliar with it too. Crashes with "illegal instruction" after printing one line of zeros. I really must upgrade this beast Real Soon Now.

After commenting out those two lines, it prints two lines of zeros (surrounded by parentheses). I don't think this is correct - shouldn't I see some numbers from my array? It at least exits without a segfault...

Best,
Frank

Code: [Select]
; nasm -f elf32 maria.asm
; ld -o maria maria.o -I/lib/ld-linux.so.2 -lc -melf_i386


%include "ssutils.inc"

global _start

section .data
    carray dd 1.0, 2.0, 3.0, 36.0

section .bss
    csum reso 1

section .text
_start:
    nop
; fake caller   
    push carray
    push 4
    push csum
    call costTot
    add esp, 12

exit:
    mov ebx, eax
    mov eax, 1 ; sys_exit
    int 80h

global costTot


cost equ 8
n equ 12
sum equ 16



costTot:

push ebp
mov ebp, esp
push ebx
push esi
push edi

mov edi, [ebp+cost] ;
mov   esi, [ebp+n]
xorps xmm2,xmm2

mov ecx,0 ;index of the loop

movups xmm0,[edi]

mov edx,16

;if an element is less than 0 it is not added

.loop xorps xmm1,xmm1
cmpltps xmm1,xmm0
printregps xmm0

; this instruction crashes my machine
; obviously, your algorithm won't work without it
; but "just to try"... comment it out
; haddps xmm0,xmm0
; haddps xmm0,xmm0


addss xmm2,xmm0
printregps xmm2
add ecx,4 ; single precision, so each register holds 4 numbers


movups  xmm0,[edi+edx] ;
add edx,16 ; advance 16 bytes to the next four numbers

cmp ecx,esi
jb .loop
; movups [ebp+sum],xmm2
    mov ecx, [ebp + sum]
    movups [ecx], xmm2


pop edi ; restore the callee-saved registers
pop esi
pop ebx
mov esp, ebp ; restore the stack pointer
pop ebp ; restore the base pointer
ret ; return to the calling C function


maria03
« Reply #10 on: October 07, 2014, 09:40:08 PM »
I tried commenting out the haddps and I had the same problem. I am so desperate that I tried commenting out ALL the code except the first part, where I save the register values on the stack, and the last part, where I pop them back off the stack. Even leaving only the instruction xorps xmm2,xmm2 in there, I get the segmentation fault. How is that possible?!
I can't understand. Maybe there is some problem with VirtualBox?! I can't think of anything else.
 >:(

Frank Kotler
« Reply #11 on: October 08, 2014, 01:35:40 AM »
Anything is possible. Unlikely that it's a problem with Virtualbox. Unlikely that it's a problem with "xorps". But see point 1.

You need a debugger. If you were running your code in a debugger, you'd know where the segfault occurs, at least... so I assume you're not. Learning to run a debugger will take a while, but it's the "right" way to do it...

Here's a "trick": deliberately put the program in an infinite loop...
Code: [Select]
hang: jmp hang
I'd start with that just before the "ret". If it segfaults  before it hangs, move it up earlier in the code, but I don't think it will. Kill your program with "control-c", and don't forget to remove that code when you're done with it!

ASSuming that it hangs before it segfaults, we know the problem's on the stack, and/or back in your C code. Rob suggested that you post the C code that calls your little gem. If you don't want to post the whole thing, at least show us the line that calls "costTot", and how the parameters it uses are declared.

If you're still doing:
Code: [Select]
movups [ebp + sum], xmm2
That messes with the stack, and could be a problem - change it to what Rob suggested.

As Betov (a rude Frenchman, author of Rosasm) used to say, "Courage!"

Best,
Frank


maria03
« Reply #12 on: October 08, 2014, 10:09:09 AM »
Sorry, I forgot yesterday to show the C code. I'm afraid I haven't translated the names of the functions, but I added some comments to help you understand the code. The matrix is represented as one block of memory for efficiency:
Code: [Select]
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include <xmmintrin.h>

void costoTotaleN(float* costi, int n,float* somma); // The nasm Function
float* creaMatrice(int n,int m); //create a new matrix with n rows and m columns
void riempiMatrice(float* matrice,int n, int m); // fill in the elements of the matrix from the keyboard

main(int argc,char* argv[]){
float* costi;
float* somma;
int n=8;
costi=creaMatrice(n,1);
somma=creaMatrice(1,1);
riempiMatrice(costi,n,1);
costoTotaleN(costi,n,somma);
printf("costo totale %f\n ",somma[0]);
}


float* creaMatrice(int n,int m){
float* matrice;
matrice=(float*)malloc(n*m*sizeof(float));

}

void riempiMatrice(float* matrice,int n, int m){
int i,j;
int unused __attribute__((unused));
for (i=0;i<n;i++){
for(j=0;j<m;j++){
printf("Inserisci elemento matrice[%d][%d]= ",i,j);
unused=scanf("%f",&matrice[i*m+j]);
}

}

 }




« Last Edit: October 08, 2014, 10:13:09 AM by maria03 »

Frank Kotler
« Reply #13 on: October 08, 2014, 02:15:13 PM »
Well that's interesting. The "interesting" part is that it doesn't segfault for me! I don't get the correct answer. That's expected, since I had to comment out the "haddps" instructions. My machine is too old and tired to know that one. I sympathize. :) I also don't have xmmintrin.h. Since you don't seem to use it in this version of the C code, I commented that out, too. That was the only change I made to your C code.

This is exactly the .asm file I used:
Code: [Select]
; nasm -f elf32 maria.asm
; ld -o maria maria.o -I/lib/ld-linux.so.2 -lc -melf_i386


%include "ssutils.inc"

%define costTot costoTotaleN

%ifdef TESTMAIN

global _start

section .data
    carray dd 1.0, 2.0, 3.0, 36.0

section .bss
    csum reso 1

section .text
_start:
    nop
; fake caller   
    push carray
    push 4
    push csum
    call costTot
    add esp, 12

exit:
    mov ebx, eax
    mov eax, 1 ; sys_exit
    int 80h

%endif

global costTot


cost equ 8
n equ 12
sum equ 16



costTot:

push ebp
mov ebp, esp
push ebx
push esi
push edi

mov edi, [ebp+cost] ;
mov   esi, [ebp+n]
xorps xmm2,xmm2

mov ecx,0 ;index of the loop

movups xmm0,[edi]

mov edx,16

;if an element is less than 0 it is not added

.loop xorps xmm1,xmm1
cmpltps xmm1,xmm0
printregps xmm0

; this instruction crashes my machine
; obviously, your algorithm won't work without it
; but "just to try"... comment it out
; haddps xmm0,xmm0
; haddps xmm0,xmm0


addss xmm2,xmm0
printregps xmm2
add ecx,4 ; single precision, so each register holds 4 numbers


movups  xmm0,[edi+edx] ;
add edx,16 ; advance 16 bytes to the next four numbers

cmp ecx,esi
jb .loop
; movups [ebp+sum],xmm2
    mov ecx, [ebp + sum]
    movups [ecx], xmm2


pop edi ; restore the callee-saved registers
pop esi
pop ebx
mov esp, ebp ; restore the stack pointer
pop ebp ; restore the base pointer
ret ; return to the calling C function
As you can see, I "defined out" the fake caller and "translated" costTot to the beautiful language. If you're going to try it, you probably want to uncomment the haddpses (try it both ways, perhaps).

There are a couple issues that confuse me. When I run it with my fake caller, I get a couple of lines of zeros printed. When I run it with your C caller, I see the expected numbers printed. (I just entered 1, 2, 3, 4, 5, 6, 7, 8 ) This is apparently an error in my fake caller, but I don't see it. I'm new to this xmm stuff...
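
One possible explanation for the zeros from the fake caller, offered as a guess rather than something confirmed in the thread: a C compiler using the 32-bit cdecl convention pushes arguments right to left, so the first argument ends up at [ebp+8]. The fake caller pushes carray, then 4, then csum, which would put csum (zero-filled .bss) at [ebp+cost] and carray at [ebp+sum]. The toy model below, with hypothetical names throughout, just tracks which value lands at which offset for a given push order.

```c
#include <assert.h>
#include <string.h>

enum { STACK_SLOTS = 16 };

/* Toy model of a 32-bit stack: one slot per dword, growing downward. */
typedef struct {
    const char *slot[STACK_SLOTS];
    int esp;                        /* index of the most recently pushed slot */
} toy_stack;

void toy_push(toy_stack *s, const char *what)
{
    assert(s->esp > 0);
    s->slot[--s->esp] = what;
}

/* Pushes three arguments in the given order, then models "call"
 * (return address) and the prologue's "push ebp / mov ebp, esp". */
void toy_call(toy_stack *s, const char *first, const char *second,
              const char *third)
{
    s->esp = STACK_SLOTS;
    toy_push(s, first);
    toy_push(s, second);
    toy_push(s, third);
    toy_push(s, "return address");
    toy_push(s, "saved ebp");       /* ebp now equals esp */
}

/* What the callee sees at [ebp + offset]. */
const char *toy_at_ebp(const toy_stack *s, int offset)
{
    int index = s->esp + offset / 4;
    assert(offset % 4 == 0 && index >= s->esp && index < STACK_SLOTS);
    return s->slot[index];
}

/* True if the slot at [ebp + offset] holds the named value. */
int toy_slot_is(const toy_stack *s, int offset, const char *name)
{
    return strcmp(toy_at_ebp(s, offset), name) == 0;
}
```

In this model, pushing in the order carray, 4, csum leaves csum at offset 8, while pushing csum, 4, carray matches the "cost equ 8, n equ 12, sum equ 16" layout the routine expects.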

What I "thought" when I saw your C code was that "somma" isn't big enough. Try mallocing that to 4 * 1 (or 4 * 0 ?) and see if it makes any difference.

I can't explain why this should segfault for you and not for me. Maybe a problem with Virtualbox, but I doubt it very much. If I feel ambitious I'll fool with it some more, but for right now I'm just puzzled!

Best,
Frank


encryptor256
« Reply #14 on: October 08, 2014, 05:13:33 PM »
Hi Folks!

I think it faults because of a memory access somewhere.

MOVUPS
Move four unaligned packed single-precision floating-point values between XMM registers or
between an XMM register and memory


So, whatever you do, the movups instruction will LOAD or STORE 16 bytes FROM or TO memory.
Dare to access unallocated space and you're done. :D

Other thing:
Code: [Select]
costi=creaMatrice(n,1);
somma=creaMatrice(1,1);

...

float* creaMatrice(int n,int m){
float* matrice;
matrice=(float*)malloc(n*m*sizeof(float));

}

I don't see where the function creaMatrice returns a value. Where is the return? "return matrice;" is missing.
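
A minimal sketch of creaMatrice with the missing return added, keeping the name and parameters from the posted C code; the assert on the dimensions is an addition for illustration only. Using the result of a non-void function that falls off its end is undefined behavior in C: it can appear to work when the return register happens to still hold malloc's result, and crash on another build or another day, which would fit the symptoms described earlier.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of creaMatrice with the missing return added: allocates an
 * n-by-m matrix of floats as one block and hands the pointer back. */
float *creaMatrice(int n, int m)
{
    assert(n > 0 && m > 0);   /* illustrative check, not in the original */

    float *matrice = malloc((size_t)n * (size_t)m * sizeof(float));
    return matrice;           /* NULL if the allocation failed */
}
```

Note that creaMatrice(1,1) still provides only four bytes for somma, so the assembly should either store a single float there or the caller should allocate room for four.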


EDIT:
This code:
Quote
movups    [ebp+sum],xmm2
will give seg fault, because your sum is only four bytes.

At least that's what I think, so far!

...
All Rights and Lefts Reserved, I keep rights to modify this post for one reason or another, later or never!  ;D

EDIT 2: Here I come again:

Heads up, this might cheer you up: https://www.youtube.com/watch?v=lzUat_wC6Ko

:D

EDIT 3:

How about this instruction, like MOVD.
The MOVD (Move 32 Bits) instruction transfers 32 bits of packed data from memory to an MMX register and vice
versa; or from a general-purpose register to an MMX register and vice versa.


This code looks permitted on both x32 and x64:
Code: [Select]
movd ecx,xmm0

... or....

mov ecx,[ebp+sum] ; sum holds the address of the C variable
movd [ecx],xmm0

So, that code should store only the 32-bit float into the variable. Should work, theoretically.
« Last Edit: October 08, 2014, 05:52:59 PM by encryptor256 »