NASM - The Netwide Assembler
NASM Forum => Programming with NASM => Topic started by: azagaros on May 26, 2019, 08:17:51 PM
-
This is on 64-bit Linux. I am getting a segmentation fault while trying to read the command-line arguments and print them out. I am not sure where my logic error is or which boundary I am crossing. As far as I know, my management of the stack is working.
section .data
strNewLine db 10
section .text
global _start
_start:
pop rcx ; pop the count of arguments
.argloop:
pop rax ; pop the ptr of the arg.
call _printstr ; print the string
call _printnl ; print a new line character
dec rcx
cmp rcx, 0
jne .argloop
call _ExitProg
ret
;rax ptr to string to count
;rbx count of string
_printstr:
push rcx
push rax
mov rbx,0
.lenloop:
inc rax
inc rbx
mov cl, [rax]
cmp cl, 0
jne .lenloop
mov rax, 1
mov rdi, 1
pop rsi ;pop rax to rsi
mov rdx, rbx
syscall
pop rcx
ret
_printnl:
mov rax, 1
mov rdi, 1
mov rsi, strNewLine
mov rdx, 1
syscall
ret
_ExitProg:
xor rdi, rdi
mov rax, 60
syscall
-
I am not good with 64-bit code. It looks like rcx is being trashed in one of your routines (the syscall instruction itself overwrites rcx and r11). Either save rcx across the two routines, or test rax for zero instead of counting on argc.
I think that'll work...
Best,
Frank
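Frank's second suggestion works because the argument pointers on the initial stack are followed by a null pointer, just as argv[argc] is NULL in C. A minimal C sketch of looping on that terminator instead of on the count; count_args is a hypothetical helper name, not something from this thread:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the entries of a NULL-terminated argument
 * vector without ever consulting argc. */
size_t count_args(char *const argv[])
{
    size_t n = 0;

    assert(argv != NULL);       /* caller must pass a valid vector */
    while (argv[n] != NULL)     /* stop at the terminating null pointer */
        n++;
    return n;
}
```

Called as count_args(argv) from main, this should agree with argc, which is why the assembly loop can drop the counter register entirely.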
-
Maybe this could help:
; test.asm
;
; compile with:
; $ nasm -felf64 -o test.o test.asm
; $ ld -o test test.o
;
bits 64
default rel ; x86-64 default addressing is RIP relative!
; Moved to .rodata.
section .rodata
strNewLine: db `\n`
section .text
global _start
_start:
; argc is in [rsp]
; argv[n] are in [rsp+8*n+8]
;mov ecx,[rsp] ; argc (not used here!)
xor ebx,ebx ; RBX is saved by syscalls (SysV ABI).
; Used as index...
.loop:
mov rdi,[rsp+rbx*8+8]
test rdi,rdi ; argv[rbx] == NULL?
jz .no_more_args
call printstr ; print the string
call printnl ; print a new line character
add ebx,1 ; ADD instead of INC (INC/DEC leave CF untouched, which can cost cycles on some CPUs).
jmp .loop
.no_more_args:
mov eax,60 ; syscall_exit
xor edi,edi ; exit(0).
syscall
; rdi = string ptr.
strlen:
xor eax,eax
.loop:
cmp byte [rdi+rax],0
jz .strlen_end
add eax,1
jmp .loop
.strlen_end:
ret
;rdi = string ptr
printstr:
call strlen
mov edx,eax ; string length.
mov eax,1 ; syscall_write
mov rsi,rdi ; string ptr.
mov edi,eax ; stdout
syscall
ret
printnl:
mov eax,1 ; syscall_write
mov edi,eax ; stdout
mov edx,eax ; 1 char to print
lea rsi,[strNewLine] ; points to '\n'.
syscall
ret
-
Thank you, Fredericopissarra. You got me past one stumbling block. Just trying to understand how the registers and the stack are handled on Linux is proving challenging. A small expansion of the code has created a new segmentation fault.
bits 64
default rel ; x86-64 default addressing is RIP Relative
; section .data
section .rodata
strNewLine db 10
strSpace db 32
strFileName db 'is a file name.',0
strOption db 'is an option flag.',0
section .text
global _start
_start:
xor rbx, rbx ;zero the rbx- will not get trashed by sys v abi (the system calls)
; rsp is the **argv
.argloop:
mov rdi, [rsp+rbx*8+8]
test rdi, rdi ; testing for null
jz .noMoreArg ; jump on zero
cmp rbx, 0 ; consuming the first arg.
je .skipProcess
call printstr ; print the string
call printSpc ; print a space
; process the arg
call processArg
call printnl ; print a new line character
.skipProcess:
add rbx, 1 ; increment the index.
jmp .argloop
.noMoreArg:
call ExitProg
ret
; rdi = string ptr
strlen:
xor rax, rax ; zero rax register
.loop:
cmp byte[rdi+rax],0 ;check for null termination of string
jz .strlenend
add rax,1
jmp .loop
.strlenend:
ret
; rdi = str ptr
printstr:
call strlen
mov rdx, rax ; store the length rdx
mov rax, 1 ; syscall write
mov rsi, rdi ; str ptr
mov rdi, rax ; stdout
syscall
ret
; rdi = str ptr
printnl:
mov rax, 1
mov rdi, rax
mov rdx, rax
lea rsi, [strNewLine]
syscall
ret
;rdi = str ptr
printSpc:
mov rax, 1
mov rdi, rax
mov rdx, rax
lea rsi, [strSpace]
syscall
ret
; rdi = str ptr to process
processArg:
xor rax, rax ;string index position
cmp byte[rdi+rax],'-'
je .argOption
; is the stack getting trashed? stack alignment? (system calls use the local stack?)
push rdi ; save the current rdi (should not change)
push rax ; save the working rax (It is getting trashed?) (it gets trashed in the string length function)
mov rdi, strFileName ; set up a different rdi...
call printstr
pop rax
pop rdi
;Assumed file concept (can have more than one)
;allocate a file information block
;copy the file name to the file name of the block
.argOption:
add rax, 1 ;consume the option flag
; the test for the list of options.
push rax
push rdi
mov rdi, strOption
call printstr
pop rdi
pop rax
ret
ExitProg:
xor rdi, rdi
mov rax, 60
syscall
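Independently of the crash, the file-name path in processArg has no jump or ret before the .argOption label, so it would fall through and print both messages. A minimal C sketch of the branching, assuming the intent is exactly one message per argument; classify_arg, describe_arg and the enum are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of one command-line argument. */
enum arg_kind { ARG_FILE_NAME, ARG_OPTION };

/* An argument whose first character is '-' is treated as an option flag;
 * anything else is assumed to be a file name. */
enum arg_kind classify_arg(const char *arg)
{
    assert(arg != NULL);
    return (arg[0] == '-') ? ARG_OPTION : ARG_FILE_NAME;
}

/* Pick exactly one description. Each branch returns, so the file-name
 * case can never run on into the option case. */
const char *describe_arg(const char *arg)
{
    if (classify_arg(arg) == ARG_OPTION)
        return "is an option flag.";
    return "is a file name.";
}
```

In the assembly, the equivalent would be a ret (or a jump past the option code) at the end of the file-name path.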
As for you, Kolter, as a NASM developer, some food for thought:
https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
Everything I am fighting seems to be in the code generation of NASM, or FASM for that matter.
-
SysV ABI amd64 calling convention:
The integer (or pointer) arguments are passed to a function in this order: RDI, RSI, RDX, RCX, R8 and R9. From the 7th argument on, they are stored on the stack, cdecl style. For syscalls, R10 is used instead of RCX, because the SYSCALL instruction itself overwrites RCX and R11.
For floating point, XMM0 to XMM7 are used for arguments.
Example: int f( int a, float b, int c ); // EDI=a, XMM0=b, ESI=c
Registers RSP, RBP, RBX and R12 to R15 must be preserved between calls. Any other register can be changed inside a function.
The return value is in RAX.
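The register assignment described above can be modelled in a few lines of C. This is a simplified sketch that only handles scalar integer/pointer and float/double parameters (no structs, no varargs, no long double); assign_arg_registers and the 'i'/'f' encoding are hypothetical, invented here for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Integer/pointer argument registers, in order of use. */
static const char *const INT_ARG_REGS[] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
/* Floating-point argument registers, in order of use. */
static const char *const FP_ARG_REGS[] = { "XMM0", "XMM1", "XMM2", "XMM3",
                                           "XMM4", "XMM5", "XMM6", "XMM7" };

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* kinds holds one character per parameter: 'f' for float/double, anything
 * else for integer/pointer. out[i] receives the register name for the
 * i-th parameter, or "stack" once that class has run out of registers.
 * Returns the number of parameters processed. */
size_t assign_arg_registers(const char *kinds, const char **out, size_t out_len)
{
    size_t n_int = 0, n_fp = 0, i;

    assert(kinds != NULL && out != NULL);
    for (i = 0; kinds[i] != '\0' && i < out_len; i++) {
        if (kinds[i] == 'f')
            out[i] = (n_fp < COUNT_OF(FP_ARG_REGS)) ? FP_ARG_REGS[n_fp++] : "stack";
        else
            out[i] = (n_int < COUNT_OF(INT_ARG_REGS)) ? INT_ARG_REGS[n_int++] : "stack";
    }
    return i;
}
```

For the example prototype, the kinds string "ifi" yields RDI, XMM0, RSI: the integer and floating-point sequences advance independently, so c lands in the second integer register even though it is the third parameter.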
-
First off, tell me where I am touching anything but the system calls. I am not making the function calls. It appears the registers get trashed in a Linux task switch. Assembly has always been an argument over the registers. I save registers if I do not care what a routine returns in them. I am being specific about the calls and when I make them. Most of the registers I have not touched. I have read the ABI backwards and forwards more times than I care to count. Something is interfering with a position that, under most addressing models, should not move, so when I move a number from x to y, that number should not change. If it gets relocated by the OS, the OS should clean up the pointers. I am crossing a boundary because a number was changed by some abstraction. My assembly does not change any numbers, and I expect the entry and exit of any function to change only certain numbers. Which stupidity am I really following doing simple assembly? I have been chasing bugs like this every time I code something on 64-bit Linux, and I have been doing assembly for 30+ years.
-
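Take a closer look at your printstr function... Which registers does it change? It returns with RAX, RDX, RSI and RDI changed (and the syscall also overwrites RCX and R11), so RDI no longer points to the argument when processArg is called.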
-
Another thing: avoid using R?? registers if you can. If you need a simple counter, 32 bits will suffice and you should use E??. Using R?? for anything other than pointers adds a REX prefix to the instruction and, for some forms such as MOV with a large constant, a 64-bit immediate.
XOR EAX,EAX clears the entire RAX and is a shorter instruction. Every instruction that writes an E?? register automatically zeroes the upper 32 bits of the corresponding R?? register.
To load an effective address (pointer), you should use the LEA instruction to take advantage of RIP-relative addressing. This:
MOV RDI,strNewLine
Is better defined as
LEA RDI,[strNewLine]
Because it is wise to prefer position-independent executables (PIE) in x86-64 mode.
-
Oh wow. Talk about stupidity... rax is 64-bit, eax is 32-bit, ax is 16-bit, and al or ah is 8-bit, all of the same register. If I read another post correctly, NASM through optimization selects the smallest form. 64-bit addresses are 3 or 4 bytes long, 32-bit are 2. That is code generation; it is in the prefixes on the byte sequence that denote whether a 16-, 32- or 64-bit number follows.
It could be your fallacy: if you forget there is a full 64-bit number in a register when you only zero out 32 bits of the register, it does not end up 0.
-
So, why not read what Intel (https://software.intel.com/en-us/articles/intel-sdm) and AMD (https://developer.amd.com/resources/developer-guides-manuals/) say themselves? READ the development manuals and see for yourself.
-
A simple test:
; asm.asm
bits 64
default rel
section .text
global f
; unsigned long long f( unsigned long long );
; Entry: RDI
; Output: RAX
f:
mov rax,rdi
mov eax,1 ; it will zero the upper 32 bits?
ret
/* test.c */
#include <stdio.h>
extern unsigned long long f( unsigned long long );
int main( void )
{
unsigned long long x;
x = f(0xffffffffffffffffULL); // all bits set.
// Maybe it will print 0xffffffff00000001?! Nah!
printf( "%#016llx\n", x );
}
# Makefile
test: test.o asm.o
	$(CC) -o $@ $^
test.o: test.c
asm.o: asm.asm
	nasm -felf64 -o $@ $<
Just:
$ make
$ ./test
-
Gee your stupidity again. I have 5 programming languages under my belt. I have read the books you dumb a**. From every piece of documentation of registers:
| 32 | 16 | 16 | rax is the full 64 bits, eax is the bottom 32, and ax is the last set of 16. rax is 8 bytes, eax is 4. Adding two numbers should look and act like this: the furthest-right bit goes first and the carry cycles to the left. If it overflows the 64 bits, it will set the carry flag. Endianness just changes the direction the cycling goes. This has not changed in 30 years of x86-class processors. I think ARM processors do it the same way. It is the way the transistors are put in sequence. Clearing the register is left up to the programmer, and most BIOSes leave the registers empty when they start an OS. Did you not learn how to add two numbers early in your life? Two's complement follows the same logic, if you have ever done the math by hand. You start at the zero bit and go to the max bit, and big and little endian determine where the zero bit is.
I know that implying a register in NASM does not guarantee the size of the register. If you want the smallest code, it is in the preamble on any integer or pointer, in the sequence of any command. In code generation, by both the AMD and the Intel documentation, you have the byte code of the command, then, depending on the command, the scaler on a number and the bytes that make up the number. All address space is pointers, which are all integers. NASM could take a 64-bit pointer and inadvertently make it a 16-bit pointer if the value was under 2^16. A 16-bit number has no scaler and is only 2 bytes for the number, a 32-bit number has a 2-byte scaler with 4 bytes for the number, and a 64-bit number has a 3-byte scaler and 8 bytes for the number. There is a lot of space saving in this concept. It also argues for the byte concept of the command.
One thing that is in AS, gcc's assembler: it has a notation to say which form of a command you use. For example movb, movw, movl and movq: movb is the 8-bit form, movw is the 16-bit form, movl is the 32-bit form and movq is the 64-bit form.
-
azagaros, please do not insult someone who is trying to help you. You may know several programming languages, but that does not mean you know all languages. Higher-level languages do much of the work for you, so you don't need to know as much in order to use them. I have never tried programming in 64-bit code, but fredericopissarra appears to have done so. As he says, the Intel/AMD documentation gives much more information than those who work on NASM can; they used that information to write an application that follows Intel assembly coding conventions. Our documentation will never be as complete as what you can get (or may already have) from Intel.
-
Intel IA-32 and Intel64 Software Development Manual - Vol I - Chapter 3, Topic 3.4.1.1 (General Purpose Registers in 64 bits mode):
"When in 64-bit mode, operand size determines the number of valid bits in the destination general-purpose
register:
...
* 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose
register."
AMD64 Architecture Programmer’s Manual Volume 1 - Application Programming - Chapter 3, Topic 3.1.2 (64-Bit Mode Registers), Figure 3-3 and 3-4... Topic 3.1.2.3 explains...
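The rule quoted above can be restated as a small C model of what happens to a 64-bit register when its 32-bit, 16-bit or low 8-bit sub-register is written. This is only an illustration of the documented behaviour, not code that touches real registers, and the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit write (e.g. MOV EAX,imm32): the result is zero-extended, so the
 * old upper 32 bits do not survive. */
void write_low32(uint64_t *reg, uint32_t value)
{
    assert(reg != NULL);
    *reg = (uint64_t)value;
}

/* 16-bit write (e.g. MOV AX,imm16): the upper 48 bits are preserved. */
void write_low16(uint64_t *reg, uint16_t value)
{
    assert(reg != NULL);
    *reg = (*reg & ~UINT64_C(0xFFFF)) | value;
}

/* Low 8-bit write (e.g. MOV AL,imm8): the upper 56 bits are preserved. */
void write_low8(uint64_t *reg, uint8_t value)
{
    assert(reg != NULL);
    *reg = (*reg & ~UINT64_C(0xFF)) | value;
}
```

Starting from a register with all bits set, write_low32 with 1 leaves exactly 1, while write_low16 with 1 leaves 0xFFFFFFFFFFFF0001. Only the 32-bit case clears the upper half, which is why XOR EAX,EAX is enough to zero RAX.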
-
By the way... I have almost 40 years of software development AND electronics experience, not only with assembly (for more than a dozen processors), but with lots of high-level languages as well...
You're welcome.
-
Debs, all languages boil down to the machine language of the processor. Five of them are the ones I have programmed in most over 30 years. All other languages have the same basic logic to them, and if one can pick up the syntax of a language, one can pick out the structure common to all of them because of the machine-language concept. It is like how I can read Lisp but cannot program in it. The concepts from the 8086, the 6502 or the 68000, from various companies, follow similar machine-language constructs. Even the current forms of ARM V follow simple abstractions of those earlier ones. For example, the simple move from memory to register is common to all of them.