NASM - The Netwide Assembler
NASM Forum => Using NASM => Topic started by: pikestar on June 12, 2011, 04:52:59 PM
-
Hi, just playing around with assembly for the first time. At the moment I'm trying to combine the hello world program in an assembly guide (http://asm.sourceforge.net/intro/hello.html) with Jonathan Leto's Writing A Useful Program With NASM (http://leto.net/writing/nasm.php). I want to get the program to echo the first parameter passed to it. The trouble is I'm not sure how to work out the length of the parameter, which is needed in register edx for the sys_write system call (at the moment I've just hard-coded 3).
section .data ;store all program data
section .text
global _start ;tells linker where to start
_start: ;program starts here
pop ebx
pop ebx ;remove argc and program name from stack
pop ecx ;put argument into ecx ready to output
mov eax,4 ;the system call for write
mov ebx,1 ;file descriptor for std output
mov edx,3 ;need length of parameter for edx
int 80h ;call kernel
mov eax,1 ;sys call exit
mov ebx,0 ;return 0 i.e. no error
int 80h ;call kernel
Any help appreciated.
-
ecx will point to a zero-terminated string, so find the zero...
...
pop ebx
; better make sure we've got one!
cmp ebx, 2
jb usage ; print a usage message and exit?
pop ebx ;remove program name from stack (argc was already popped above)
pop ecx ;put argument into ecx ready to output
; get length in edx
xor edx, edx
getlen:
cmp byte [ecx + edx], 0
jz gotlen
inc edx
jmp getlen
gotlen:
mov eax,4 ;the system call for write
mov ebx,1 ;file descriptor for std output
int 80h ;call kernel
...
Or some such...
Best,
Frank
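Putting the fragments above together, a complete program might look roughly like the sketch below. It is only one way to fill in the elided parts: the file name echo1.asm, the labels done, usage_msg and usage_len, the usage text and the build commands are all assumptions, and it assumes 32-bit Linux (or a 64-bit kernel with 32-bit int 80h emulation enabled).

```nasm
; assumed build: nasm -f elf32 echo1.asm && ld -m elf_i386 echo1.o -o echo1
bits 32

section .rodata
usage_msg:  db "usage: echo1 <argument>", 10
usage_len   equ $ - usage_msg

section .text
global _start
_start:
    pop ebx                     ; ebx = argc
    cmp ebx, 2
    jb usage                    ; no argument given
    pop ebx                     ; discard pointer to program name
    pop ecx                     ; ecx -> first argument (zero-terminated)

    xor edx, edx                ; edx = 0, will hold the length
getlen:
    cmp byte [ecx + edx], 0     ; reached the terminating zero?
    jz gotlen
    inc edx
    jmp getlen
gotlen:
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout
    int 80h                     ; write(1, ecx, edx)
    xor ebx, ebx                ; exit status 0
    jmp done

usage:
    mov eax, 4                  ; sys_write
    mov ebx, 2                  ; stderr
    mov ecx, usage_msg
    mov edx, usage_len
    int 80h
    mov ebx, 1                  ; exit status 1

done:
    mov eax, 1                  ; sys_exit, status already in ebx
    int 80h
```

Note that the argument is written without a trailing newline, since the zero terminator is not part of the counted length.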
-
Got it. Simple when you think about it. I suppose I'm still feeling a bit cautious of assembly, so I'm not sparking as I should! I need to get out of the habit of relying on ready-made functions. :D
-
It seems ironic to me that UNIX is based on null terminated strings everywhere except where they would actually be nice to use, like when calling their kernel functions. Why should we have to pass a length? Go look for the null, ya buttheads! :D :o >:(
-
Quick question looking at the code.
; get length in edx
xor edx, edx
getlen:
cmp byte [ecx + edx], 0
jz gotlen
inc edx
jmp getlen
gotlen:
Is the XOR just there to ensure edx is set to zero (i.e. the start of the string), or is it for something else? If it is to set it to zero, why use it instead of mov edx,0?
Sorry for the stupid questions!
-
I read in Duntemann's book xor used to be faster than mov with an immediate operand. He implies things changed but who knows. There are a few other ways to zero a register like subtracting it from itself (don't know if that's a good one or even exists in x86) etc.
-
xor edx, edx
actually takes fewer bytes of machine code (2 bytes)
as opposed to mov edx, 0 (5 bytes in 32-bit code).
So the executable size will be smaller.
(On current x86 processors xor reg,reg is recognized as a zeroing idiom, so it is not slower than mov; unlike mov, it does change the flags. I too use xor reg,reg to initialize a register to zero :)
It's a general practice, as you'll know if you have stumbled across some size optimization tutorials.)
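For anyone who would rather check the sizes than guess, here is a small sketch that can be assembled with a listing file. The file name zero.asm and the command line are just assumptions; the bytes in the comments are the standard 32-bit encodings of these three instructions.

```nasm
; assumed command: nasm -f bin -l zero.lst zero.asm
; (the listing file zero.lst shows the generated bytes next to each line)
bits 32

    xor edx, edx        ; 31 D2           2 bytes, updates the flags
    sub edx, edx        ; 29 D2           2 bytes, updates the flags
    mov edx, 0          ; BA 00 00 00 00  5 bytes, leaves the flags alone
```

The flags difference matters only when a zeroing instruction sits between a compare and the jump that depends on it; in that spot mov is the safe choice.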
-
Yeah, "xor edx, edx" is just to zero edx. Use "mov edx, 0" if it's clearer to you - "sub edx, edx" would also work. Programmer's choice! (isn't it nice being the programmer?)
Why should we have to pass a length? Go look for the null, ya buttheads!
Why should we have to go looking for the null, when the length is known in most cases (not this one)? Zero-terminated strings are actually kind of a dumb data structure (IMO). Length-prefixed strings (a la Pascal) are usually more efficient! (programmer's choice again, except where we have to interface with C).
Best,
Frank
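As a rough illustration of the length-prefixed idea, the sketch below stores a one-byte length in front of the text, so no scan for a terminator is needed before calling sys_write. The labels pstr, pstr_text and pstr_len are made up for this example, and a one-byte prefix limits the string to 255 bytes; a real design might well use a wider length field.

```nasm
bits 32

section .rodata
pstr:       db pstr_len             ; length prefix (one byte)
pstr_text:  db "Hello, World!", 10  ; the characters, no zero terminator
pstr_len    equ $ - pstr_text       ; number of bytes in the text

section .text
global _start
_start:
    movzx edx, byte [pstr]          ; edx = length read from the prefix
    mov ecx, pstr_text              ; ecx -> first character
    mov ebx, 1                      ; stdout
    mov eax, 4                      ; sys_write
    int 0x80

    xor ebx, ebx                    ; exit status 0
    mov eax, 1                      ; sys_exit
    int 0x80
```

The trade-off is the one mentioned above: the length is available immediately, but the string cannot be handed to a C function as-is.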
-
Yeah, "xor edx, edx" is just to zero edx. Use "mov edx, 0" if it's clearer to you - "sub edx, edx" would also work. Programmer's choice! (isn't it nice being the programmer?)
We use a logical (unsigned) subtract where I come from. Is there one of those on x86 and is it faster than a regular sub?
Why should we have to go looking for the null, when the length is known in most cases (not this one)? Zero-terminated strings are actually kind of a dumb data structure (IMO). Length-prefixed strings (a la Pascal) are usually more efficient! (programmer's choice again, except where we have to interface with C).
That's what I was saying, the irony. It seems ironic to me that UNIX is based on null terminated strings everywhere except where they would actually be nice to use, like when calling their kernel functions. Virtually everywhere in *NIX they expect null terminated strings because of C. The one time it would have helped, so we didn't have to figure out a length, they pull the rug out from under us and expect us to give them a length. Why I oughta!
PL/I had length prefixed strings way before PASCAL. It still does and the string code is real fast because of it, no scanning and no buffer overflows because the compiler knows the limits and will not move more than a maximal number of bytes to a target. I believe COBOL does the same thing, but I will have to double check later. I am talking about the IBM mainframe versions, I have no idea what other implementations do.
-
It seems ironic to me that UNIX is based on null terminated strings everywhere except where they would actually be nice to use, like when calling their kernel functions. Why should we have to pass a length? Go look for the null, ya buttheads! :D :o >:(
Actually, sys_write and sys_read are BINARY input/output routines. Yes, they are used to work with text on the console via the stdin, stdout and stderr file descriptors, but keep in mind that you might not always be handling text. Say, for example, you want to read the contents of a bitmap image into memory. In that case you really don't want sys_read to stop at the first null byte. The same goes for when you want to dump that bitmap image back to disk.
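To make that concrete, here is a small sketch that writes a buffer containing zero bytes. The data in blob is invented purely for illustration (it is not a valid bitmap), and blob and blob_len are assumed names. Because the length is passed explicitly, all eight bytes go out, zeros included.

```nasm
bits 32

section .rodata
blob:       db 'B', 'M', 0, 0, 1, 0, 'x', 10   ; made-up binary data with embedded zeros
blob_len    equ $ - blob                        ; 8 bytes

section .text
global _start
_start:
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout
    mov ecx, blob
    mov edx, blob_len           ; explicit length, so the zeros are written too
    int 0x80

    mov eax, 1                  ; sys_exit
    xor ebx, ebx                ; exit status 0
    int 0x80
```

Piping the output through a hex dump tool should show the zero bytes in the stream; a write that stopped at the first null would have emitted only "BM".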
Why should we have to go looking for the null, when the length is known in most cases (not this one)? Zero-terminated strings are actually kind of a dumb data structure (IMO). Length-prefixed strings (a la Pascal) are usually more efficient! (programmer's choice again, except where we have to interface with C).
I completely agree about having the length of the string with the string itself. That is one of the few places where I tend to be "wasteful" and actually deal with string references rather than strings themselves.
; build with : nasm -f elf strdemo.asm && gcc -nostartfiles -nostdlib strdemo.o -o strdemo
bits 32
__NR_exit equ 1
__NR_write equ 4
STDOUT equ 1
section .rodata
; Our DB arrays and size equates
dbaHello: DB "Hello, World!", 10, 0
dbaHello_size EQU ($-dbaHello-1)
dbaBye: DB "Goodbye, World!", 10, 0
dbaBye_size EQU ($-dbaBye-1)
section .data
; "String" structures
Hello:
.length DD dbaHello_size
.string DD dbaHello
Bye:
.length DD dbaBye_size
.string DD dbaBye
section .text
global _start
_start:
mov edx, [Hello.length]
mov ecx, [Hello.string]
mov ebx, STDOUT
mov eax, __NR_write
int 0x80
mov edx, [Bye.length]
mov ecx, [Bye.string]
mov ebx, STDOUT
mov eax, __NR_write
int 0x80
xor ebx, ebx
mov eax, __NR_exit
int 0x80
In the above example I use $-label-1 because the length is 1 byte (the null terminator) less than the full ASCIIZ string. I still put the null terminator in strings simply because I might someday use that string with a C function, and as a good forward-thinking programmer I don't want the possible C functions to puke on me. :D
I started doing this when I was coding under Windows and wanted to keep all my strings themselves in the CONST section; later I just adapted the same practice to the .rodata section. It's mostly useful for dynamic strings where you might need to resize the allocated space, and you can set .length to contain the max space allocated for that string (if reached, just get more memory). Personally, though, I think it's a good trade-off: yeah, I may be using up more space, but I'll never have to do a null byte search. :P
-
Oops, yeah, I was thinking only of text. You are right, that won't work for binary data. I have to keep telling myself, in *NIX everything is a file.
Thanks :-)