NASM - The Netwide Assembler
NASM Forum => Programming with NASM => Topic started by: Luns on April 03, 2011, 06:22:16 PM
-
Sorry if this isn't specifically NASM-related, but I'm not sure of another board where I could ask this - how exactly do syscalls work? For example, if I'm using Linux and call sys_write (with int 0x80 and 4 in eax), what happens to print out a message? I know the syscall is defined in read_write.c in the Linux source, and if you go there then it seems to call do_sendfile (), which then calls fput_light (), which then calls fput () and then __fput ()... I lost it a few functions after that. At the root of it all, is there something calling a BIOS interrupt to print out the message? Is there any assembly along the way or is it all implemented in C?
Thanks, just trying to wrap my head around this...
e: New question below.
e2: Another new question!
-
You've done well to track it down that far! We do have the source code. Given the skill and persistence, we should be able to figure out exactly how it all works. I find it heavy going! I suspect that if you spend enough time with kernel code, and get "oriented", it gets easier. I'm not there yet, and I'm not working too hard at it...
For general "syscalls" processing, this seems to be relevant:
/usr/src/linux/arch/i386/kernel/entry.S
I see, in /usr/src/linux/arch/i386/boot/video.S, we do use BIOS int 10h calls (and some "outw"s to ports, as well). If I were writing an OS, I've often thought, I wouldn't be in any hurry to switch to pmode, but would interrogate the BIOS mercilessly while still in rmode, and save what I found. Maybe even switch to a video mode, while BIOS was still available. It looks to me like this file is doing something like that(?).
It would be possible to use BIOS ints after switching to pmode, using the vm86 sys_call, but I doubt (guessing) that it is used here. I suspect that if sys_write is to STDOUT, and if STDOUT is not redirected - we're really printing to the screen - that we write to 0xB8000 (or possibly 0xA0000 if we're not in "text mode" - say a "linear frame buffer" has been specified), or probably a second buffer (per process) which gets copied to 0xB8000 when it's "our turn"... I haven't found the exact code that does this - I suspect it's done in C, not asm - but I'm guessing that's what happens.
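For what it's worth, here is a minimal sketch of the user-side half of the story on 32-bit Linux, as far as I understand it: the syscall number goes in eax, the first three arguments in ebx, ecx and edx, and after int 0x80 the result comes back in eax (a negative value means an error code). The labels msg and msg_len are made up for the example, and the exit statuses 0 and 1 are just my choice.

```nasm
; Sketch only: the int 0x80 convention with a return-value check.
BITS 32
SECTION .data
msg:     DB "hello", 10
msg_len  EQU $ - msg            ; hypothetical names, not from the kernel

SECTION .text
GLOBAL _start
_start:
    mov eax, 4                  ; syscall number: SYS_write
    mov ebx, 1                  ; arg 1: file descriptor (STDOUT)
    mov ecx, msg                ; arg 2: pointer to the buffer
    mov edx, msg_len            ; arg 3: number of bytes
    int 0x80                    ; trap into the kernel
    ; eax now holds the number of bytes written, or a negative errno
    xor ebx, ebx                ; assume success: exit status 0
    test eax, eax
    jns .exit                   ; non-negative result: the write worked
    mov ebx, 1                  ; negative result: exit status 1
.exit:
    mov eax, 1                  ; SYS_exit
    int 0x80
```

Assuming a 32-bit target, something like nasm -f elf32 demo.asm followed by ld -m elf_i386 -o demo demo.o should build it (demo is a placeholder name). Everything after the int 0x80 instruction happens on the kernel side until control returns with eax set.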
As to where else you might ask these questions, the kernel developers' list probably doesn't welcome such "newbie questions". I'd poke around here:
http://kernelnewbies.org/
I suspect they aren't much into assembly language. There's a "linux-assembly" list that might be helpful. Subscribe by sending:
subscribe linux-assembly
in the message body to:
majordomo@vger.kernel.org
Of course, if you're molesting these sys_calls using Nasm, it's on topic here. :)
Best,
Frank
-
Ah, that all makes a lot of sense. Thanks, that's cleared up some stuff for me.
-
Sorry to bump this up, but I thought it would be better just to post again here rather than make another topic...
Now I have an array of characters, which I'd like to print (through stdout) with a Linux syscall. I can loop through the array and get each value fine, but my problem is the syscall wants a pointer to the value to be in ecx, not the value itself. Is it possible to put a pointer to a specific array value into a register?
-
If you have an array of characters, it would be easier (and better all around) to just print the whole string.
BITS 32
CPU 486
SECTION .data
strHey: DB "Hello, World!", 10
strHey_size EQU ($-strHey)
SECTION .text
GLOBAL _start
_start:
    mov edx, strHey_size ; strlen = sizeof("Hello, World!")
    mov ecx, strHey ; pstr = address of strHey
    mov ebx, 1 ; fileno = STDOUT
    mov eax, 4 ; SYS_write
    int 0x80
    mov ebx, 0 ; errcode = 0
    mov eax, 1 ; SYS_exit
    int 0x80
However, if you have your heart set on doing it one character at a time, you could try something like this:
BITS 32
CPU 486
SECTION .data
strHey: DB "Hello, World!",10
strHey_size EQU ($-strHey)
SECTION .text
GLOBAL _start
_start:
    mov esi, strHey ;; address of array
    lea edi, [strHey + strHey_size] ;; address of end of array
.nextchar:
    pushad
    mov edx, 1 ; strlen = 1
    mov ecx, esi ; pstr = address_of(strHey[index])
    mov ebx, 1 ; fileno = STDOUT
    mov eax, 4 ; SYS_write
    int 0x80
    popad
    inc esi ;; move to next character
    cmp esi, edi ;; are we at the end of the array
    jne .nextchar
    mov ebx, 0 ; errcode = 0
    mov eax, 1 ; SYS_exit
    int 0x80
This is, of course, untested code. Unfortunately with me, untested code can (and very often does) fail, so good luck building it. The idea is there though: just increment through until you reach the end and use SYS_write with a length value of 1.
[EDIT: Keith tested it for me. It's good.]
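To answer the pointer question directly: yes, lea will compute the address of one element without loading it. Here is a small sketch, assuming you keep an index (rather than a running pointer) in esi; the index value 7 is just picked for illustration.

```nasm
; Sketch only: print a single element of the array by index.
BITS 32
SECTION .data
strHey: DB "Hello, World!", 10

SECTION .text
GLOBAL _start
_start:
    mov esi, 7                  ; index of the character we want ('W')
    lea ecx, [strHey + esi]     ; ecx = address of strHey[7], not its value
    mov edx, 1                  ; length = 1 byte
    mov ebx, 1                  ; fileno = STDOUT
    mov eax, 4                  ; SYS_write
    int 0x80
    mov ebx, 0                  ; errcode = 0
    mov eax, 1                  ; SYS_exit
    int 0x80
```

This should print a single W (no newline). The brackets in lea only describe the address calculation; no memory is read until the kernel follows the pointer in ecx.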
-
Yeah, I think I wanted to loop through two strings and compare them character by character. Thanks though, I'm gonna go play around with this code.
And thanks for testing it Keith :)
-
Sorry to ask such a simple question, but I've been trying to get this to work all afternoon. All I'm trying to do is take two strings, convert them to numbers, divide them and return the result. Here's the code...
section .data
a db "3",0
b db "44",0
section .text
global _start
_start:
    sub esp, 4
    mov dword [esp], a
    call stoi
    add esp,4
    mov ebx, eax
    push ebx ;ebx will be used in stoi, so store current value on stack
    sub esp, 4
    mov dword [esp], b
    call stoi
    add esp, 4
    pop ebx
    xor edx, edx
    div ebx ;divide eax (44) by ebx (3)
    mov ebx, eax ;move result into ebx
    mov eax, 1
    int 0x80
stoi:
    push ebp
    mov ebp, esp
    mov esi, [ebp+8] ;*string
    mov edi, 1 ;multiplier
    mov ebx, esi
    xor eax, eax
    xor ecx, ecx
.getSize:
    inc ecx
    inc esi
    cmp byte [esi], 0
    jne .getSize
    mov esi, ebx
.convertToInt:
    mov ebx, [esi+ecx-1]
    sub ebx, 0x30
    imul ebx, edi
    imul edi, dword 10
    add eax, ebx
    loop .convertToInt
    mov esp, ebp
    pop ebp
    ret
I'm reasonably sure that my stoi function works - because if I try to multiply, add, or subtract the two numbers I do get the correct result. But when I try dividing, it returns 0. Also, when I divide it seems to think the remainder is whatever the value of 'b' was. I.e., if I try dividing 44 by 3, eax will contain 0 but edx will contain 44.
Is there something else I should be doing to divide? Or is my stoi function not actually returning the correct numbers, do you think? Any advice is much appreciated :)
-
I've just run your program through a debugger and the stoi function doesn't seem to work.
Try this (if you're using Linux):
Make the first instruction a nop (it's for the debugger):
_start: nop
Assemble it with nasm -f elf -g -F dwarf program.asm, then link it with ld program.o -o program
Install Data Display Debugger and run it with ddd program. Step through your program one instruction at a time and you'll see eax doesn't contain the integer version of the string after running through stoi.
PS: you'll need to set a breakpoint at the first instruction after the nop, and select Status->Registers so you can see the value of eax as you step through stoi.
I'll be back tomorrow; if you're still having problems I'll post a working ASCII-to-dword routine.
-
Yeah, that's what I was worried the problem might be. I went through it with gdb, and as far as I can tell if I put the string "4" (ascii = 0x34) into stoi it returns '4' (ascii = '0x04'). So does that mean that it's still just returning a character representation of four, rather than an integer? How would I return the integer? I searched around and I thought all you had to do to convert a character to its equivalent number is subtract 0x30 from the character - but is that not actually the case?
-
stoi:
    push ebp
    mov ebp, esp
    mov esi, [ebp+8] ;*string
    mov edi, 1 ;multiplier
    mov ebx, esi
    xor eax, eax
    xor ecx, ecx
.getSize:
    inc ecx
    inc esi
    cmp byte [esi], 0
    jne .getSize
    mov esi, ebx
.convertToInt:
    mov ebx, [esi+ecx-1]
At this point, you're getting multiple bytes into ebx: a dword load picks up the character plus the three bytes that follow it in memory. You only want one (at a time). The low byte still comes out right, which is probably why add, subtract and multiply looked correct (the exit status only shows the low 8 bits), but div sees the whole 32-bit value.
    movzx ebx, byte [esi + ecx - 1]
Will do it, or you could do the same thing with:
    xor ebx, ebx ; make sure upper bytes are clear
    mov bl, [esi + ecx - 1]
    sub ebx, 0x30
    imul ebx, edi
    imul edi, dword 10
    add eax, ebx
    loop .convertToInt
    mov esp, ebp
    pop ebp
    ret
When your debugger shows you the number 4, it has converted it to the ascii character '4' for you. Don't let that confuse you - you're on the right track!
There is a somewhat simpler way to do it... if you multiply "result so far" by 10 before adding in the "new digit" each time, you don't need to change the multiplier for each digit, and you can work through the string "frontwards", without even knowing how long it is!
get a character from the string
make sure it's a valid decimal digit character
if it's zero, we're done
if it's "something else"... programmer's choice
C just figures we're done for any non-digit...
you may want to do something for "invalid digit"...
subtract '0' to "convert" it to a number
multiply "result so far" by 10
add in the new digit
until done
This is something Herbert Kleebauer showed me. It was originally generated by a compiler(!). Uses "lea" twice to multiply result so far by 10 and add the "new digit"... converted to a number!
;--------------------
; this has the "special property"
; that it "returns" the invalid
; character in cl,
; and the next position in edx
atoi:
    mov edx, [esp + 4] ; pointer to string
    xor eax, eax ; clear "result"
.top:
    movzx ecx, byte [edx]
    inc edx
    cmp ecx, byte '0'
    jb .done
    cmp ecx, byte '9'
    ja .done
    ; we have a valid character - multiply
    ; result-so-far by 10, subtract '0'
    ; from the character to convert it to
    ; a number, and add it to result.
    lea eax, [eax + eax * 4]
    lea eax, [eax * 2 + ecx - '0']
    jmp short .top
.done:
    ret
;------------------------
I thought that was pretty cute!
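In case it helps, here is a rough sketch of how that routine might slot into your divide program. It assumes the same a and b strings from your post and passes the pointer on the stack, since atoi reads it from [esp + 4]. The divisor is parked on the stack between calls because atoi overwrites ecx and edx.

```nasm
; Sketch only: parse "44" and "3", divide, return the quotient as exit status.
BITS 32
SECTION .data
a:  DB "3", 0
b:  DB "44", 0

SECTION .text
GLOBAL _start
_start:
    push dword a                ; argument: pointer to "3"
    call atoi                   ; eax = 3
    add esp, 4                  ; caller removes the argument
    push eax                    ; save the divisor across the next call
    push dword b                ; argument: pointer to "44"
    call atoi                   ; eax = 44
    add esp, 4
    pop ebx                     ; ebx = 3
    xor edx, edx                ; div divides edx:eax, so clear the high half
    div ebx                     ; eax = quotient (14), edx = remainder (2)
    mov ebx, eax                ; exit status = quotient
    mov eax, 1                  ; SYS_exit
    int 0x80

atoi:
    mov edx, [esp + 4]          ; pointer to string
    xor eax, eax                ; clear "result"
.top:
    movzx ecx, byte [edx]       ; fetch one character, zero-extended
    inc edx
    cmp ecx, byte '0'
    jb .done                    ; below '0' (including the terminating 0): stop
    cmp ecx, byte '9'
    ja .done                    ; above '9': stop
    lea eax, [eax + eax * 4]    ; eax = result * 5
    lea eax, [eax * 2 + ecx - '0'] ; eax = result * 10 + digit
    jmp short .top
.done:
    ret
```

After running it, echo $? in the shell should show 14. Clearing edx before div matters: div treats edx:eax as one 64-bit dividend, so leftover bits in edx would change the result or raise a divide error.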
Best,
Frank
-
As I promised :) here's a working example that you can read. It produces no output though, so make sure you run it through a debugger.
SECTION .data
str1 db "-123", 0
str2 db "+456", 0
str3 db "72 years old", 10, 0
SECTION .text
GLOBAL _start
_start:
    nop
    mov eax, str1
    call atodw ;EAX=-123
    mov eax, str2
    call atodw ;EAX=456
    mov eax, str3
    call atodw ;EAX=72
    mov eax, 1
    mov ebx, 0
    int 80h
;---------------------------------------------------------
; convert an ascii string containing a number to a dword
; input: EAX=pointer to the string containing the number
; returns: EAX=returned dword
;---------------------------------------------------------
atodw:
    push ebx
    push ecx
    push esi
    xor ecx, ecx ;ECX is used as a flag. 0=positive number, 1=negative number
    mov esi, eax ;copy the string pointer to ESI
    xor eax, eax ;EAX is now used for returning the value to the user
.SignTest:
    movzx ebx, byte[esi] ;get the first character from the string
    cmp ebx, '-' ;and test if it is a minus sign
    jne .PosSignCheck
    mov ecx, 1 ;set ECX=1 (number is negative)
    inc esi
    jmp .DoConversion
.PosSignCheck:
    cmp ebx, '+' ;numbers that start with '+' aren't used often (but they are legal)
    jne .DoConversion
    inc esi
.DoConversion:
    movzx ebx, byte[esi] ;get an ascii digit
    sub ebx, '0' ;convert ascii digit to decimal value
    cmp ebx, 9 ;if the value in EBX is above 9 then the character
    ;pointed to by ESI wasn't a valid digit (between '0'-'9')
    ja .WasNumNeg ;this jump is taken when the conversion is completed
    ;ie we reached the null char or we encountered an invalid digit
    ;these 2 instructions multiply EAX by 10 and add the digit that we've retrieved to EAX
    lea eax, [eax+eax*4]
    lea eax, [ebx+eax*2]
    inc esi
    jmp .DoConversion
.WasNumNeg:
    ;we check the flag in ECX to find out if the string representation of the number
    ;was negative or positive. If it was negative we have to two's complement EAX
    or ecx, ecx
    jz .Finished
    neg eax ;perform two's complement
.Finished:
    pop esi
    pop ecx
    pop ebx
    ret
-
Thank you both so much! I now have a working string to integer function ;D
So 'movzx eax ...' automatically clears the upper bits of eax, right?
-
So 'movzx eax ...' automatically clears the upper bits of eax, right?
Yes, you are right. And for a byte load it is one byte shorter than:
    xor eax, eax
    mov al, [...]