NASM - The Netwide Assembler
NASM Forum => Programming with NASM => Topic started by: turtle13 on September 04, 2017, 06:50:15 AM
-
For a class assignment I must write custom versions of C type functions such as:
strlen, strcmp, gets, puts, write, open, close, exit
From what I understand, this requires using the cdecl calling convention so I will be preserving and restoring the ebx, edi, esi, and ebp registers, and the caller will clean the stack. eax holds the return value.
So far I have some skeleton that I have begun for the strlen (called l_strlen) type function (which returns the int. value of the number of characters in a given string):
bits 32
section .data
; variables go here:
; var_name db values
string1 db 'string', 0 ; null terminated string
string1_len equ $ - string1 ; length of string1
section .text
global l_strlen
l_strlen:
xor eax, eax ; zero eax
push eax ; preserve eax
push ebx ; preserve ebx
push edi ; preserve edi
push esi ; preserve esi
push ebp ; prologue: set up stack frame
mov ebp, esp
.char_loop
; while the byte (char) being compared is not "0"
; add one to ecx
; jmp .char_loop
; if the byte (char) is "0" and no characters remaining (meaning null terminated)
; jmp .end_loop
.end_loop
mov esp, ebp ; epilogue: restore caller's frame pointer
pop ebp
ret
pop eax ; this is where the final return value is located
pop esi ; restore esi
pop edi ; restore edi
pop ebx ; restore ebx
Questions about my code:
- for the .char_loop I have pseudocode, I'm trying to figure out exactly how to accomplish this task (or if the task is even appropriate?)
- How do I manipulate the code so that the string being measured is not statically declared like I did with variable 'string1' (such that 'l_strlen(any_string)') ?
- anything else that seems off to you (or better yet, is anything even correct?)
-
It's pretty much correct, but you've got some things out of order.
You don't need to "preserve" registers that you don't use. For this simple task, we can get by with registers that we're allowed to alter. That simplifies things. The "prologue", as the name suggests, wants to be the first thing in your function...
bits 32
section .data
; variables go here:
; var_name db values
string1 db 'string', 0 ; null terminated string
string1_len equ $ - string1 ; length of string1
section .text
;-----------------------------------------------
; this is a "test main" it should not be in your final code
global _start
_start:
push string1 ; address of string1
call l_strlen
add esp, 4 ; "remove" parameter
; length is returned in eax
; make it our exit code
mov ebx, eax
mov eax, 1 ; sys_exit
int 0x80
; end of "test main"
;-------------------------------------------
global l_strlen
l_strlen:
; xor eax, eax ; zero eax
; push eax ; preserve eax
; push ebx ; preserve ebx
; push edi ; preserve edi
; push esi ; preserve esi
; this part we do need:
push ebp ; prologue: set up stack frame
mov ebp, esp
; if we needed to preserve registers, do it here
xor eax, eax ; since we want the result in eax
mov ecx, [ebp + 8] ; first (only) parameter
.char_loop
; while the byte (char) being compared is not "0"
; for clarity: the byte we're looking for is the number zero
; not the character "0". They're not the same thing!
mov dl, [ecx]
cmp dl, byte 0
jz .end_loop
inc eax ; increase counter
inc ecx ; move to next character
; add one to ecx
; jmp .char_loop
jmp .char_loop
; if the byte (char) is "0" and no characters remaining (meaning null terminated)
; jmp .end_loop
.end_loop
; if we had preserved registers, pop 'em here
mov esp, ebp ; epilogue: restore caller's frame pointer
pop ebp
ret
; this stuff after "ret" would never be reached anyway
; pop eax ; this is where the final return value is located
; pop esi ; restore esi
; pop edi ; restore edi
; pop ebx ; restore ebx
That's untested. I should know better than to post untested code, but it's late here...
As you can see, I've added a "test main" so that you can assemble and link the code and run it. As you probably know, we can see the exit code by typing "echo $?". Only one byte is valid, but that should be enough for short strings. I think I've got it right, but no promises...
Best,
Frank
-
Frank your advice worked perfectly, I compiled the program with the short "main" function and it is returning the length of "string1" as exit code!
Now I'm assuming that I don't need to leave in the string1 db 'string', 0 ; null terminated string
string1_len equ $ - string1 ; length of string1
part of the code because this function should be used to examine any length string. Should I just delete those two lines or is anything else required to make this happen?
Why is "dl" used in 'mov dl, [ecx]' ? If I understand it, dl is the low order byte of the edx register, but how does edx and dl come into play here?
Thanks again!
-
Moving on to the next C function "strcmp"
Instructions:
int l_strcmp(char *str1, char *str2);
return 0 if str1 and str2 are equal, return 1 if they are not. Note that this is not the same definition as the C standard library function strcmp.
Here is my code so far:
bits 32
section .data
string1 db 'hello', 0
string2 db 'hello', 0
string3 db 'Hello!', 0
section .text
global l_strcmp
l_strcmp:
push ebp ; prologue: set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for storing result (0= equal, 1= not equal)
mov ecx, [ebp + 8] ; first parameter (string1) stored in ecx
mov edx, [ebp + 12] ; second parameter (string2) stored in edx
.char_loop:
; code to compare every character in both strings
mov cl, [ecx] ; move the current character into the cl segment of ecx
mov dl, [edx] ; move the current character into the dl segment of edx
cmp dl, cl
jne .done_1 ; if char in string1 != string2, exit with result 1
; how to examine if the null terminator has been met and both strings match?
jmp .char_loop ; continue examining characters
.done_1:
mov eax, 1 ; returns 1 when strings do not match
mov esp, ebp ; epilogue: restore caller's frame pointer
pop ebp
ret
.done_0:
mov eax, 0 ; returns 0 when strings do match
mov esp, ebp
pop ebp
ret
A bit of misunderstanding on what "cl" and "dl" are doing.. need some clarification on that, as well as if the loops appear to operate properly.
-
Good. What I found when I tried it was that Nasm burped up a couple of warnings about the lack of colons on a couple of labels. Just put colons on 'em, or that warning can be turned off.
The "string" can be considered part of the "test main". You don't need it in the final code.
The "dl" register was just someplace to put the single byte we're looking at. You don't need that, either. Could be done as:
cmp [ecx], byte 0
I was just trying to implement "something" from your pseudo-code. edx, and its "parts" dl and dh, are "volatile" according to the cdecl calling convention. We don't have to preserve it... so I used it.
We can get by without the stack frame, too. If we don't meddle with ebp, the first parameter is at [esp + 4]. We probably "should" use a stack frame, though - it allows a debugger to do a "back trace" to see where we were called from.
If you'll step into the museum for a moment... In 16-bit code, only bx and bp could be used for "base" registers. [sp + ?] was not a valid addressing mode. We had no choice but to...
push bp
mov bp, sp
mov ax, [bp + 4]
or whatever. 32-bit addressing modes are much more flexible - any register can be a "base" register, so we can use [esp + 4]. etc. Still, it is common to set up a stack frame...
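For illustration, here is a rough sketch of the frameless variant mentioned above. It assumes a 32-bit cdecl caller, and the name l_strlen_noframe is made up for this example. Only eax and ecx are touched, and both are registers the callee is allowed to alter.

```nasm
bits 32
section .text
global l_strlen_noframe          ; hypothetical name, not part of the assignment

; int l_strlen_noframe(char *str)
l_strlen_noframe:
    mov ecx, [esp + 4]           ; first parameter; [esp] holds the return address
    xor eax, eax                 ; length counter, also the return value
.next:
    cmp byte [ecx + eax], 0      ; reached the terminating zero byte?
    je .done
    inc eax
    jmp .next
.done:
    ret                          ; caller removes the parameter (cdecl)
```

The trade-off is the one already noted: without a saved ebp, a debugger has a harder time producing a back trace through this function.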
If I'm feeling ambitious, I may work up a "super short" version of this. Probably not...
Best,
Frank
-
You're getting ahead of me... you Hare! :)
A bit of misunderstanding on what "cl" and "dl" are doing..
Since cl and dl are parts of ecx and edx, they're trashing ecx and edx so they no longer will point to your strings. Use some other 8-bit registers - al and ah, perhaps.
Best,
Frank
-
^ I was thinking that as I was doing it, that edx and ecx would get messed up somehow. This assembly stuff is so strict but at the same time the wild wild west in how you want to handle data and instructions
So here is my final version of the strcmp:
bits 32
section .data
string1 db 'hello', 0
string2 db 'hello', 0
string3 db 'Hello!', 0
section .text
global l_strcmp
l_strcmp:
push ebp ; prologue: set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for storing result (0= equal, 1= not equal)
mov ecx, [ebp + 8] ; first parameter (string1) stored in ecx
mov edx, [ebp + 12] ; second parameter (string2) stored in edx
.char_loop:
; code to compare every character in both strings
cmp [ecx], [edx] ; compare the characters in the ecx, edx registers
jne .done_1 ; if char in string1 != string2, exit with result 1
cmp ecx, byte 0 ; tests for null terminator
je .done_0 ; jump to done if null terminator
jmp .char_loop ; continue examining characters
.done_1:
mov eax, 1 ; returns 1 when strings do not match
mov esp, ebp ; epilogue: restore caller's frame pointer
pop ebp
ret
.done_0:
mov eax, 0 ; returns 0 when strings do match
mov esp, ebp
pop ebp
ret
I feel I've built a suspension bridge on quicksand with this one
*OK so I just realize I forgot to increment ecx and edx. I would add:
inc ecx
inc edx
in the .char_loop between je .done_0 and jmp .char_loop
-
I don't think that'll even assemble, will it?
Best,
Frank
-
nope, giving me an error with line "cmp [ecx], [edx]"
*just did it!! Returns 1 when strings are different, 0 when strings are the same!
bits 32
section .data
string1 db 'hello', 0
string2 db 'hello', 0
string3 db 'Hello!', 0
section .text
;-----------------------------------------------
; this is a "test main" it should not be in your final code
global _start
_start:
push string1 ; address of string1
push string3
call l_strcmp
add esp, 8 ; "remove" parameters
; result (0 or 1) is returned in eax
; make it our exit code
mov ebx, eax
mov eax, 1 ; sys_exit
int 0x80
; end of "test main"
;-------------------------------------------
global l_strcmp
l_strcmp:
push ebp ; prologue: set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for storing result (0= equal, 1= not equal)
mov ecx, [ebp + 8] ; first parameter (string1) stored in ecx
mov edx, [ebp + 12] ; second parameter (string2) stored in edx
.char_loop:
; code to compare every character in both strings
mov al, [ecx]
mov ah, [edx]
cmp al, ah ; compare the characters that ecx and edx point to
jne .done_1 ; if char in string1 != string2, exit with result 1
cmp al, byte 0 ; tests for null terminator
je .done_0 ; jump to done if null terminator
inc ecx
inc edx
jmp .char_loop ; continue examining characters
.done_1:
mov eax, 1 ; returns 1 when strings do not match
mov esp, ebp ; epilogue: restore caller's frame pointer
pop ebp
ret
.done_0:
mov eax, 0 ; returns 0 when strings do match
mov esp, ebp
pop ebp
ret
-
There ya go!
Now... the real C "gets()" is notoriously unsafe. Some versions of gcc will warn if you try to use it. Instead, make the caller tell you how big the buffer is, and don't "get" any more than that. Please!
Best,
Frank
-
Now on to the l_gets:
instructions:
int l_gets(int fd, char *buf, int len);
read at most len bytes from file fd, placing them into buffer buf. Terminate early if a newline character ('\n', 0x0A) is read. If a newline character is encountered, it should be stored in the output buffer and counted in the total number of bytes read. Return the total number of bytes read (which may be zero if end of file is reached or an error occurs). This function does not place a null terminator after the last character read; that is the responsibility of the caller.
Here is some code I have so far, I just want to make sure I am setting it up correctly:
bits 32
section .data
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for storing return result
mov ecx, [ebp + 8] ; third parameter (int len) stored into ecx
mov edx, [ebp + 12] ; second parameter (char *buf) stored into edx
mov esi, [ebp + 16] ; first parameter (int fd) stored into esi
^since parameters for cdecl are stored right to left, that is why I am adding to the stack like that. Not sure if this is correct.
I'm lost as to where to/ how to store the buffer data. I should get the value for len, and loop that many times while writing the data to the buffer (which would be edx according to my code above)?
-
Looks remarkably like sys_read, does it not?
bits 32
section .data
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for storing return result
; going to need it for the system call number, no?
; going to need to preserve ebx
push ebx
mov ecx, [ebp + 8] ; third parameter (int len) stored into ecx
; fd - going to want it in ebx
mov edx, [ebp + 12] ; second parameter (char *buf) stored into edx
; going to want it in ecx
mov esi, [ebp + 16] ; first parameter (int fd) stored into esi
; max length - going to want it in edx
; now do your sys_read
; if error (eax negative) we want it to be zero
; that's what it says...
; otherwise number of characters - like sys_read
pop ebx
; epilogue...
That's how I understand it, anyway...
You may want to "flush" any excess the pesky user types...
Best,
Frank
-
Here is what I came up with so far:
bits 32
section .data
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for syscall #
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
mov eax, 3 ; sys call for read
int 0x80
; read data onto stack:
.buf_loop:
; read each character one at a time, increment counter (in eax), when counter matches len, jump out of loop
xor eax, eax ; zero eax to be used for counter
push ebx ; push the character onto stack
inc ebx ; advance to next character
inc eax ; advance the counter
cmp edx, eax
je .done
.loop1:
cmp register, byte 0x0A ; check for newline, exit loop if true
je .done
.done:
Hopefully the comments are enough to tell you about what I am trying to do here..
-
Well... no...
bits 32
section .data
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for syscall #
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
mov eax, 3 ; sys call for read
int 0x80
Up to here, I follow you. In fact, it looks like you're about done...
; read data onto stack:
.buf_loop:
; read each character one at a time, increment counter (in eax), when counter matches len, jump out of loop
xor eax, eax ; zero eax to be used for counter
If you zero eax in the loop, it's going to run for a long time!
push ebx ; push the character onto stack
inc ebx ; advance to next character
Last I knew, ebx was your file descriptor...
inc eax ; advance the counter
cmp edx, eax
je .done
Fair enough... if you don't zero eax in the loop...
.loop1:
cmp register, byte 0x0A ; check for newline, exit loop if true
je .done
.done:
I don't see where we "loop", and to where... The last part of it won't even assemble!
After the sys_read, your data's in the buffer that the caller specified, and eax holds bytes read, including the linefeed that ends input. At least that's true if we're reading from stdin. I'm less sure of how sys_read will behave on a "real file" (or, for that matter, if stdin is redirected). If it's a "text file", okay, but what if it's a "binary file"? Are we expected to stop at any number 10 we encounter? I think of "gets()" as being exclusively for stdin, but your assigned "l_gets" is apparently different. I may have to experiment and see what happens on a "real file"...
I mentioned up above that you might want to "flush" any excess. That would apply only to stdin.
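A minimal sketch of the "plain sys_read" approach, assuming the assignment's rule that an error should come back as zero. The name l_gets_sketch is made up so it is not mistaken for the finished l_gets. On 32-bit Linux, int 0x80 returns a negative errno value in eax on failure, so a sign test is enough to tell the two cases apart.

```nasm
bits 32
section .text
global l_gets_sketch             ; hypothetical name, not the assigned l_gets

; int l_gets_sketch(int fd, char *buf, int len)
l_gets_sketch:
    push ebp
    mov ebp, esp
    push ebx                     ; ebx is callee-saved under cdecl

    mov ebx, [ebp + 8]           ; fd
    mov ecx, [ebp + 12]          ; buf
    mov edx, [ebp + 16]          ; len
    xor eax, eax                 ; default return value: 0 bytes
    cmp edx, 0
    jle .ok                      ; len <= 0: nothing to read

    mov eax, 3                   ; sys_read
    int 0x80                     ; eax = bytes read, or a negative errno value
    test eax, eax
    jns .ok                      ; zero or positive: return it as is
    xor eax, eax                 ; error: report 0 bytes instead
.ok:
    pop ebx
    mov esp, ebp
    pop ebp
    ret
```

This does not yet stop at a linefeed when the descriptor is a file or socket; it only shows the register setup and the return-value handling.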
Later,
Frank
-
The point of this assignment is to use these functions for our next assignment, which makes a socket call to a web server and downloads a .html or .txt file, so it is supposed to be reading plain text (no binary).
I played with it a little more.. I would like to use the stack as the buffer and push each byte onto the stack, and use esp as the pointer to the buffer.
bits 32
section .data
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for syscall #
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
add esp, 12 ; will use esp for the pointer to the buffer, the bytes to be read will be pushed onto stack
cmp edx, 0 ; if len is zero or less, exit program
jle .done
; read data onto stack:
.buf_loop:
mov eax, 3 ; sys call for read, to begin reading bytes
int 0x80
; read each character one at a time, increment counter (in eax), when counter matches len, jump out of loop
push esp
add esp, 4 ; advance to next character
inc eax ; advance the counter
cmp edx, eax
je .done
; ignore stuff below this for now
.loop1:
cmp register, byte 0x0A ; check for newline, exit loop if true
je .done
.done:
is this making sense?
-
Well, no. This may be my fault. Shortly after my last post, I lost my computer, and it's taken me 'til just now to beat it into rebooting. I'm pretty frazzled, and haven't looked at any of the stuff I said I'd look at.
You're pretty far from a socket call. You can read from a socket with sys_read, but recv is more common. I don't see where this is going at all.
You can use the stack for a buffer. You want to subtract something from esp, not add it. You almost certainly do not want to use esp as a pointer into it.
push esp
add esp, 4
gets you right back where you were before pushing esp. It may be my confused mental state, but I don't see what you're trying to accomplish here. You can not push a byte. You can push a word, but you don't usually want to.
You can read bytes one at a time with sys_read. It'll be slower than a gut-shot wolf bitch with nine suckling pups dragging a number nine trap uphill in a snowstorm, but you can do it. I don't see the point, when sys_read does exactly what your assignment describes.
Let me try to get myself organized and see if I can get back into this.
Best,
Frank
-
Frank. you are killing me here with your colloquialisms ;D ;D I needed the laugh after the stress this class is putting me through
So I cleaned up the code a bit and I think got it closer to where I need to be:
bits 32
section .data
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
xor eax, eax ; zero eax to prepare for syscall #
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
cmp edx, 0 ; if len is zero or less, exit program
jle .done
; read data onto stack:
.buf_loop:
push esp
cmp esp, byte 0x0A ; if a newline character is encountered, exit
je .done
sub esp, 4 ; will use esp for the pointer to the buffer, the bytes to be read will be pushed onto stack
mov eax, 3 ; sys call for read, to begin reading bytes- for sys read, ebx= int (fd), ecx= char, edx= size_t (len)
int 0x80
; read each character one at a time, increment counter (in eax), when counter matches len, jump out of loop
inc eax ; advance the counter
cmp edx, eax
je .done
jmp .buf_loop ; continue
.done:
mov eax, 1 ; sys exit
mov ebx, edx ; return value is number of bytes written (len)
int 0x80
Maybe you could help me write a test main like you did before so I can test this one out? It is at least "compiling" (or do we say assembling?) fine.
-
We say "assembling". It is correct to call it "compiling" (an assembler is a compiler for assembly language)... but the asm-heads will think you're a newbie if you do.
This is a bare minimum. I commented out a lot of your code. You do a bunch of stuff with esp before you even read anything. I moved the exit code up into the "test main" where it belongs. Your caller is not likely to be pleased if you exit in the middle of the function. The function wants an epilogue and return (with the count in eax... where sys_read puts it).
It only tests reading from stdin. In order to test with a file, we'd need a file descriptor. We could put a sys_open in the test main, or... you're going to write one, right?
You seem to be very concerned with finding that linefeed. It's at [ecx + eax -1] - sys_read stops when it sees it. If it's not there, the pesky user has typed more than we had room for... and we probably should "flush" that. If we don't, it'll screw up the next read... which may be after the program has exited! Try it! Type 10 characters and "ls" before you hit "enter" and see what happens. "ls" is harmless, but "rm ." is not!
Reading from a real file won't do that. It will stop after edx bytes and there's no "keyboard buffer". It "may" stop earlier if it sees a linefeed. I kinda don't think so, but I need to try it. If you're going to be reading from a socket, what you'll actually see is carriage return linefeed pairs. http is fussy about that! It would make a lot of sense to keep reading past them until we've got the whole file or a full buffer. But the assignment seems to say it wants us to stop...
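The "flush" idea could be sketched roughly as follows. The helper name l_flush_line is invented for this example, and it assumes the caller has already decided that flushing is needed (the read filled the whole buffer and the last byte is not a linefeed). It reads one byte at a time into a scratch byte on the stack and throws the bytes away until it sees the linefeed.

```nasm
bits 32
section .text
global l_flush_line              ; hypothetical helper: void l_flush_line(int fd)

l_flush_line:
    push ebp
    mov ebp, esp
    sub esp, 4                   ; one scratch byte, padded to 4 for alignment
    push ebx                     ; callee-saved, used for the fd

    mov ebx, [ebp + 8]           ; fd
    lea ecx, [ebp - 4]           ; address of the scratch byte
    mov edx, 1                   ; read a single byte per call
.drain:
    mov eax, 3                   ; sys_read
    int 0x80
    cmp eax, 1
    jne .done                    ; end of file or error: nothing left to drain
    cmp byte [ecx], 0x0A
    jne .drain                   ; keep discarding until the linefeed
.done:
    pop ebx
    mov esp, ebp
    pop ebp
    ret
```

Calling this when the linefeed has already been read would block waiting for another line, so the check before the call matters.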
; nasm -f elf32 l_gets.asm -d TESTMAIN
; ld -o l_gets l_gets.o
bits 32
%ifdef TESTMAIN
section .data
section .bss
buf resb 10
section .text
global _start
_start:
; test a call to it
push 10 ; length
push buf
push 0 ; stdin
call l_gets
add esp, 4 * 3
;print what we l_getsed
mov eax, 4 ;sys_write
mov ebx, 1 ; stdout
mov ecx, buf
mov edx, 10
int 80h
exit:
mov eax, 1 ; sys exit
mov ebx, edx ; return value is number of bytes written (len)
int 0x80
%endif
;-----------------------------
section .text
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
; xor eax, eax ; zero eax to prepare for syscall #
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
cmp edx, 0 ; if len is zero or less, exit program
jle .done
; read data onto stack:
.buf_loop:
; push esp
; cmp esp, byte 0x0A ; if a newline character is encountered, exit
; je .done
; sub esp, 4 ; will use esp for the pointer to the buffer, the bytes to be read will be pushed onto stack
mov eax, 3 ; sys call for read, to begin reading bytes- for sys read, ebx= int (fd), ecx= char, edx= size_t (len)
int 0x80
; read each character one at a time, increment counter (in eax), when counter matches len, jump out of loop
; inc eax ; advance the counter
; cmp edx, eax
; je .done
; jmp .buf_loop ; continue
.done:
pop ebx ; restore caller's reg
; epilogue
mov esp, ebp
pop ebp
ret
Tested (a little) but incomplete...
Best,
Frank
-
Should I not leave in the last part of the .buf_loop:
inc eax ; advance the counter
cmp edx, eax
je .done
jmp .buf_loop ; continue
-
I don't see why. After your sys_read, eax is the number of bytes actually read. edx is the maximum number to read. You're counting the number of bytes not read? There's something you're trying to do with this ".buf_loop" that I'm not getting. Quite possibly my fault...
Put it back in and see what it does, if you like...
Best,
Frank
-
- How do you declare a variable that takes in a dynamic sized amount (for buf).. because right now buf is only big enough for 10 characters
- The program requirement is for it to stop running when it reaches a new line feed, I think this is for parsing in subsequent networking assignment. Why is the newline character at [ecx + eax -1]?
- Why are we only preserving and restoring ebx and not ecx or edx?
-
You can get more memory with sys_brk... but you shouldn't have to worry about that. It's the caller's responsibility to provide the buffer and tell you how big it is. To get more buffer in "test main" just make it bigger.
The linefeed's at [ecx + eax -1] 'cause that's where it is. sys_read returns the number of bytes read in eax, and it stops when (doesn't stop until) it gets the linefeed. ecx, of course, is the beginning of the buffer.
As we've discussed, this is (probably) going to be different with a real file... or a socket. If the assignment really requires it, you could read one byte at a time (put 1 in edx and ignore what the caller tells you). Makes more sense to simply find it in the buffer...
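Finding the linefeed in the buffer after the read could look something like this. The helper l_find_lf and its signature are assumptions made for the example. It takes the buffer and the count that sys_read returned, and gives back the count up to and including the first linefeed, or the original count if there is none.

```nasm
bits 32
section .text
global l_find_lf                 ; hypothetical helper: int l_find_lf(char *buf, int count)

l_find_lf:
    mov ecx, [esp + 4]           ; buf
    mov eax, [esp + 8]           ; count, as returned by sys_read
    xor edx, edx                 ; index into the buffer
.scan:
    cmp edx, eax
    jge .done                    ; ran out of bytes: return count unchanged
    cmp byte [ecx + edx], 0x0A   ; linefeed?
    je .found
    inc edx
    jmp .scan
.found:
    lea eax, [edx + 1]           ; count includes the linefeed itself
.done:
    ret
```

Note that any bytes after the linefeed have already been consumed from the descriptor, which is the problem with files and sockets discussed above.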
Best,
Frank
-
Will this code do what I need it to? It works using the test main and a preset buffer, but if it has to take in a variable amount of text, will it still return the number of bytes read as the exit code, even if a newline character is found before "len" bytes have been read?
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
cmp edx, 0 ; if len is zero or less, exit program
jle .done
cmp ecx, [ecx + eax -1] ; compare current character to new line, exit if true
je .done
; read data onto stack:
mov eax, 3 ; sys call for read, to begin reading bytes- for sys read, ebx= int (fd), ecx= char, edx= size_t (len)
int 0x80
.done:
pop ebx ; restore the caller's ebx
mov esp, ebp ; epilogue- clean up the stack
pop ebp
ret
-
Something like this?
bits 32
%ifdef TESTMAIN
section .data
section .bss
buf resb 80
section .data
filename db `l_gets.asm\0` ; ourself - we know it's there
section .text
global _start
_start:
mov eax, 5 ; sys_open
mov ebx, filename
xor ecx, ecx
xor edx, edx
int 80h
test eax, eax
js exit ; bail out if error
; test a call to it
push 80 ; length
push buf
push eax ; fd
call l_gets
add esp, 4 * 3
; find NL - we want to stop there
xor edx, edx
find:
cmp [ecx + edx], byte 10
je found
inc edx
jmp find
found:
; edx is length to print
;print what we l_getsed
; mov edx, eax ; length read
mov eax, 4 ;sys_write
mov ebx, 1 ; stdout
mov ecx, buf
int 80h
exit:
mov eax, 1 ; sys exit
mov ebx, edx ; return value is number of bytes written (len)
int 0x80
%endif
;-----------------------------
global l_gets
l_gets:
push ebp ; prologue, set up stack frame
mov ebp, esp
; xor eax, eax ; zero eax to prepare for syscall #
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter goes into ebx
mov ecx, [ebp + 12] ; char *buf stored into ecx
mov edx, [ebp + 16] ; len stored into edx
cmp edx, 0 ; if len is zero or less, exit program
jle .done
; read data onto stack:
.buf_loop:
; push esp
; cmp esp, byte 0x0A ; if a newline character is encountered, exit
; je .done
; sub esp, 4 ; will use esp for the pointer to the buffer, the bytes to be read will be pushed onto stack
mov eax, 3 ; sys call for read, to begin reading bytes- for sys read, ebx= int (fd), ecx= char, edx= size_t (len)
int 0x80
; read each character one at a time, increment counter (in eax), when counter matches len, jump out of loop
; inc eax ; advance the counter
; cmp edx, eax
; je .done
; jmp .buf_loop ; continue
.done:
pop ebx ; restore caller's reg
mov esp, ebp
pop ebp
ret
Could put the "find linefeed" part in the function. That may be what the assignment expects?
Best,
Frank
Ahhh, just read your latest. I think your code will work reading from stdin. (ahhh, no it won't - ecx is not likely 0xA - especially before the sys_read!) This tests from a real file. As I expected, it reads past the linefeed. Comment out the "find LF" part to see what it does without it...
-
Frank,
At this point my brain is feeling clobbered and I want to break this simple function down step by step.
First off, I would just like to write the part of the code that pushes each byte onto the stack to be read.
Would I implement this as something like:
sys read
push buf
pop ecx
add [buf], 4
inc counter
if counter == len OR if ecx == 0x0A
jmp done
Is this the correct way? I am having a tough time understanding how the stack can be used to store one byte at a time as a kind of "buffer," but my professor says that it can be done. He isn't too keen on giving us the exact code on how to do it, however.
We're gonna have to pop each byte as it is pushed onto the stack, correct?
-
Here is a stack diagram I drew to try to visualize how this code is working:
bits 32
section .text
l_gets:
push ebp ; prologue
mov ebp, esp
push ebx
mov ebx, [ebp + 8] ; set up registers on stack for args
mov ecx, [ebp + 12]
mov edx, [ebp + 16]
cmp edx, 0
jle .done
.read_loop:
mov eax, 3 ; sys call read
int 0x80
push buf ; push buf onto a stack because this is where each character is going to be stored
pop ecx ; pops the current character in ecx into buf
; this part I am getting hung up on, how to increment to the next character to be read in from whatever file is coming in
cmp edx, char_count ; jump if number of bytes read has reached “len”
je .done
.done:
pop ebx
mov esp, ebp ; epilogue
pop ebp
ret
(http://i.imgur.com/tbed9hM.jpg)
-
I was able to write this functioning l_gets:
bits 32
section .text
global l_gets
l_gets:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter into ebx
mov ecx, [ebp + 12] ; char *buf in ecx
mov edx, [ebp + 16] ; len in edx
cmp edx, 0 ; if len zero or less, exit
jle .done
mov eax, 3 ; sys read
int 0x80
.done:
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
When using your test main though, you move 10 to edx (for 10 characters). Even though I type fewer than 10 characters into stdin, I always get a return value of 10. The program should return only the number of bytes read. This, and figuring out how to stop input after the newline character is read, are all that's left and I'm pretty much done with this one. On to the next...
-
Ahhh... your instructor is apparently not explaining this well. I probably won't explain it well, either. Such is life...
You can indeed put one byte on the stack, but forget about "push" and "pop" to do so. First, let us create a "local" variable on the stack to use as a "buffer":
func:
push ebp
mov ebp, esp
sub esp, 64
...
The amount we subtract from esp should be a multiple of 4, just to keep the stack well aligned. It could be more (or less) but probably shouldn't be more than 4096 (a "page"). So we've got a local variable that we can use as a buffer (or anything else). Its address is ebp - 64. That's the address of the first (lowest) byte, as usual.
I guess what you're trying to do is to read one byte at a time into this buffer - instead of the buffer the caller has provided? If so, we ignore the buffer the caller has provided, and the length. We still need the file descriptor...
func:
push ebp
mov ebp, esp
sub esp, 64
mov ebx, [ebp + 8]
lea ecx, [ebp - 64]
xor esi, esi ; to use as counter
mov edx, 1 ; read just one byte
.top:
mov eax, 3 ; sys_read
int 0x80
cmp [ecx], byte 0xA ; linefeed?
je .done
inc ecx ; next byte in buffer
inc esi ; count it
cmp esi, 64 ; we're out of local buffer
je .done
cmp esi, [ebp + 16] ; caller's idea of length
je .done
jmp .top ; go get another one
.done:
; do something intelligent...
; clean up and go home
We may need more than one ".done:" label here... I don't think I'm too fond of this approach.
I think what I would rather do is a perfectly ordinary sys_read into the buffer the caller provides for the length the caller provides. Then... if the file descriptor is stdin, make sure that we do have that linefeed. If not, read into a dummy buffer until we find it to flush the keyboard buffer. This is complicated by the fact that *nix is fairly tolerant about file descriptors. If we try to read from stdout or even stderr, we still read from the keyboard. I'm not sure what happens with stdaux. If the file descriptor is 4 or greater, we presumably have a disk file or socket. This will read past the linefeed. In the case of a disk file, we could seek back to where the linefeed was found. I don't think this will work on a socket. We may have no choice but to read one byte at a time. I think I'd still read into the buffer the caller provides...
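If it does come down to one byte at a time, a sketch that reads straight into the caller's buffer might look like this. The name l_gets_bytewise is made up, and this is one possible reading of the assignment rather than the expected solution. It stops on a full buffer, on end of file or error, or right after storing a linefeed, and it preserves ebx and esi because cdecl requires the callee to.

```nasm
bits 32
section .text
global l_gets_bytewise           ; hypothetical name

; int l_gets_bytewise(int fd, char *buf, int len)
l_gets_bytewise:
    push ebp
    mov ebp, esp
    push ebx                     ; callee-saved registers we use
    push esi

    xor esi, esi                 ; esi = bytes stored so far
    mov ebx, [ebp + 8]           ; fd
    mov ecx, [ebp + 12]          ; ecx = where the next byte goes
.next:
    cmp esi, [ebp + 16]          ; buffer full? (also covers len <= 0)
    jge .done
    mov edx, 1                   ; ask for exactly one byte
    mov eax, 3                   ; sys_read
    int 0x80
    cmp eax, 1
    jne .done                    ; 0 = end of file, negative = error
    inc esi                      ; count the byte, linefeed included
    cmp byte [ecx], 0x0A
    je .done                     ; stop after storing the linefeed
    inc ecx                      ; advance to the next buffer position
    jmp .next
.done:
    mov eax, esi                 ; return the number of bytes stored
    pop esi
    pop ebx
    mov esp, ebp
    pop ebp
    ret
```

It is slow, since every byte costs a system call, but nothing past the linefeed is consumed, so it behaves the same on stdin, a disk file, or a socket.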
We're gonna have to pop each byte as it is pushed onto the stack, correct?
I don't think so. Why push it at all if you're only going to pop it again? I'm still not sure I understand what you're trying to do here.
Your diagram looks pretty good except that you've left out the return address (below the parameters and above old ebp). Also, esp points at what you pushed last, not below it yet...
push buf
pushes the address of buf - probably not useful. You could push four bytes and ignore all but one. To increment an address, you want to put it in a register and increment that.
; if ecx is caller's buffer...
lea edi, [ebp - 64] ; our buffer on stack
.top:
mov al, [ecx]
mov [edi], al
inc edi ; or use stosb
inc ecx
jmp .top
That will need an exit condition, of course. I've ignored preserving registers for now. Beside ebx (used by system calls) we'll probably need both esi and edi. That can be added - just trying to keep it simple. Doubt if I've succeeded...
Best,
Frank
next post: probably my fault you're getting the bad return value. Check closely what I've done...
-
Thanks, some things make sense now, others still don't.
Here's what I have so far. I'm not spending any time on this one tonight.
global l_gets
l_gets:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter into ebx
mov ecx, [ebp + 12] ; char *buf in ecx
mov edx, [ebp + 16] ; len in edx
xor esi, esi ; counter
cmp edx, 0 ; if len zero or less, exit
jle .done
.char_loop:
mov eax, 3 ; sys read
int 0x80
cmp [ecx], byte 0xA ; test for linefeed
je .done
inc ecx ; advance to next byte
inc esi ; +count
cmp esi, [ebp + 16] ; does read bytes = len?
je .done
jmp .char_loop
.done:
mov eax, esi ; # bytes read into eax
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
-
These additional functions I need to complete:
void l_puts(const char *buf);
write the contents of the null terminated string buf to stdout. The null byte must not be written. If the length of the string is zero, then no bytes are to be written.
int l_write(int fd, char *buf, int len);
write len bytes from buffer buf to file fd. Return the number of bytes actually written or -1 if an error occurs.
int l_open(const char *name, int flags, int mode);
opens the named file with the supplied flags and mode. Returns the integer file descriptor of the newly opened file or -1 if the file can't be opened.
int l_close(int fd);
close the indicated file, returns 0 on success or -1 on failure.
int l_exit(int rc);
terminate the calling program with exit code rc.
Can you maybe help to "guide" as far as how I should set these up so I can work on these this afternoon? Thank you again
-
Well, "puts" is just a sys_write to stdout. The only "different" thing about it is the null-terminated string. You can call your l_strlen and move the length to edx or do:
cmp [ecx + edx], byte 0
; etc.
The rest of 'em are just wrappers around system calls. They'll look a lot like your l_gets. If there's an error, eax will be -ERRNO - you need to change it to -1. The real C library does this and puts the error number in the global variable "errno". You apparently don't need to do that. We were probably supposed to return -1 on error from l_gets, too.
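The -ERRNO to -1 conversion can be sketched on its own, assuming it lives in a small helper that is called right after int 0x80. The name fix_errno is hypothetical, and the helper takes its input in eax rather than on the stack, so it is not a cdecl function itself.

```nasm
bits 32
section .text
global fix_errno            ; hypothetical helper name

; in:  eax = raw return value from int 0x80
; out: eax = -1 if the kernel reported an error, otherwise unchanged
fix_errno:
    test eax, eax           ; set the sign flag from eax
    jns .ok                 ; zero or positive: not an error
    mov eax, -1             ; any -ERRNO collapses to -1
.ok:
    ret
```

A wrapper would load its registers, do int 0x80, then "call fix_errno" before its epilogue. Writing the same three instructions inline in each wrapper works just as well.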
Best,
Frank
-
Does this look good for l_puts:
bits 32
section .text
global l_puts
l_puts:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; const char *buf goes into ebx
.char_loop:
cmp [ebx], byte 0x0 ; look for null terminator
je .done
mov eax, 4 ; sys write
int 0x80
jmp .char_loop
.done:
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
I'm thinking I would need to write each byte, kind of like l_gets reads in each byte. The sys_write takes three args: int, const char*, size_t. Would I need to set up the stack in the .char_loop so that ebx, ecx, and edx hold each of these arguments? I'm a little confused here because the actual call to l_puts only takes in one argument (or are these called parameters?), so not sure how to set this up.
-
Here is what I have so far for l_write:
bits 32
section .text
global l_write
l_write:
push ebp ;prologue
mov ebp, esp
push ebx
mov ebx, [ebp + 8] ; fd stored in ebx
mov ecx, [ebp + 12] ; char *buf stored in ecx
mov edx, [ebp + 16] ; len stored in edx
xor esi, esi ; counter
cmp edx, 0 ; check for error
jle .error
.char_loop:
mov eax, 4 ; sys write
int 0x80
inc ecx
inc esi
cmp esi, [ebp + 16] ; does bytes written = len?
je .done
jmp .char_loop
.error:
mov eax, -1 ; error
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
.done:
mov eax, esi ; return # bytes written
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
-
Well, no. The one and only parameter (you can call it an argument), the address of the string, goes in ecx. We know that the file descriptor wants to be stdout (1) - that goes in ebx. The length, which you need to find either by calling l_strlen or by finding the zero here, goes in edx. You might want to use a loop like this:
; address is in ecx
xor edx, edx
.find:
cmp [ecx + edx], byte 0
jnz .found
inc edx
jmp .find
Or call l_strlen...
Best,
Frank
-
bits 32
section .text
global l_write
l_write:
push ebp ;prologue
mov ebp, esp
push ebx
mov ebx, [ebp + 8] ; fd stored in ebx
mov ecx, [ebp + 12] ; char *buf stored in ecx
mov edx, [ebp + 16] ; len stored in edx
xor esi, esi ; counter
; you don't need a counter, eax does it
; if you do use esi, you need to push/pop it with ebx
cmp edx, 0 ; check for error
; and eax will be negative if error, not edx
; and you need to do this after sys_write, not before
jle .error
.char_loop:
mov eax, 4 ; sys write
int 0x80
inc ecx
inc esi
cmp esi, [ebp + 16] ; does bytes written = len?
je .done
; eax will be edx... even if some of 'em are garbage
; unless there's an error
jmp .char_loop
.error:
mov eax, -1 ; error
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
.done:
mov eax, esi ; return # bytes written
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
Best,
Frank
-
; address is in ecx
xor edx, edx
.find:
cmp [ecx + edx], byte 0
jnz .found
inc edx
jmp .find
Did you mean that to say "je .found" instead of "jnz .found" ?
-
Sure enough. My bad.
Best,
Frank
-
OK so I got a lot going through my head now, back to the l_gets,
can you take a look at the code and let me know where I should go from here?
l_gets:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
mov ebx, [ebp + 8] ; fd parameter into ebx
mov ecx, [ebp + 12] ; char *buf in ecx
mov edx, [ebp + 16] ; len in edx
xor esi, esi ; counter
cmp edx, 0 ; if len zero or less, exit
jle .done
.char_loop:
mov eax, 3 ; sys read
int 0x80
cmp [ecx], byte 0xA ; test for linefeed
je .done
inc ecx ; advance to next byte
inc esi ; +count
cmp esi, [ebp + 16] ; does read bytes = len?
je .done
jmp .char_loop
.done:
mov eax, esi ; # bytes read into eax
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
-
Frank can you provide a little insight as to why
cmp [ecx + edx], byte 0
is used to find a null terminator for "l_puts"
ecx= the pointer to the address of the string, correct? void l_puts(const char *buf)
if edx starts at 0, is the first iteration checking the first character in the string for 0?
Then if not zero, edx increments to 1, does that mean that the second byte in the string (or "character array") is then checked for zero? It's not making sense to me how this is working and seems like we have to take a lot on faith in this programming stuff.
-
code for l_puts:
bits 32
section .text
global l_puts
l_puts:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
push edi ; preserve edi
push esi ; preserve esi
mov ebx, 1 ; 1= stdout
mov ecx, [ebp + 12] ; const char *buf [address of string] goes into ecx
xor edx, edx
.char_loop:
cmp [ecx + edx], byte 0 ; look for null terminator
je .done
mov eax, 4 ; sys write
int 0x80
inc edx
jmp .char_loop
.done:
pop esi ; restore esi
pop edi ; restore edi
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
notice I am now preserving and restoring edi and esi registers in addition to ebx, per instructions by my professor
-
What I got so far for l_write:
bits 32
section .text
global l_write
l_write:
push ebp ;prologue
mov ebp, esp
push ebx ; preserve registers
push edi
push esi
mov ebx, [ebp + 8] ; fd stored in ebx
mov ecx, [ebp + 12] ; char *buf stored in ecx
mov edx, [ebp + 16] ; len stored in edx
xor esi, esi ; counter
cmp edx, 0 ; check for 0 len
jle .done
.char_loop:
mov eax, 4 ; sys write
int 0x80
cmp eax, 0 ; check for error
jle .error
inc ecx ; move to next char
inc esi ; increment counter
cmp esi, [ebp + 16] ; does bytes written = len?
je .done
jmp .char_loop
.error:
mov eax, -1 ; error
pop esi ; restore registers
pop edi
pop ebx
mov esp, ebp ; epilogue
pop ebp
ret
.done:
mov eax, esi ; return # bytes written
pop esi
pop edi
pop ebx ; restore registers
mov esp, ebp ; epilogue
pop ebp
ret
l_write instructions:
int l_write(int fd, char *buf, int len)
write len bytes from buffer buf to file fd. Return the number of bytes actually written or -1 if an error occurs.
-
If you're using esi, you ought to preserve it. Push it right after ebx and pop it right before, or the other way around. That's easy.
Now... s'pose we're reading from stdin, and the user types nothing but the "enter" key. After the sys_read eax will be 1 and that's what we want to return - but esi is still zero, no? You may want to start off with esi = 1. Suppose the length, as provided by the caller, is 4, and the user types 3 characters and "enter". eax will be 4 and I guess that's what esi will be when we put it back into eax. I may have to try that one. If the user types 4 or more characters before "enter", eax will be 4 and that's what we'll return. The linefeed and perhaps some characters will remain in the "keyboard buffer" to mess us up later unless we flush them. The assignment doesn't say anything about that, so I guess we can ignore it. We may regret that.
Suppose we're reading from a disk file, or socket. We'll read edx bytes, regardless of linefeeds. Your code counts up to the linefeed (if any) and returns that, "as if" we had stopped at the linefeed. But we didn't. Another read from that file will start where we left off, edx bytes into the file, not at the linefeed. This may not be satisfactory. The assignment says to stop at the linefeed. The only way I can think of to do that is to read one byte at a time, ugly as that is. I really don't know what to advise you on this. Best to stick to the assignment, I'm afraid...
If I get to it, I'll download your code and try it. As we have discovered, untested code can have misteaks. :)
Best,
Frank
Aw, jeez, three new messages? I'll get back to ya...
-
Frank can you provide a little insight as to why
cmp [ecx + edx], byte 0
is used to find a null terminator for "l_puts"
ecx= the pointer to the address of the string, correct? void l_puts(const char *buf)
if edx starts at 0, is the first iteration checking the first character in the string for 0?
Then if not zero, edx increments to 1, does that mean that the second byte in the string (or "character array") is then checked for zero? It's not making sense to me how this is working and seems like we have to take a lot on faith in this programming stuff.
No faith, just logic. Everything you say is correct and as it should be. If the zero is the first character, the length is zero - we don't want to count the zero as part of the length. If the zero is the second character, the length is 1, etc.
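To watch that loop work by itself, here is a small stand-alone sketch. The string 'hello' and the file name scan.asm are made up for the demo. It exits with the length as the status, so "echo $?" afterwards should print 5.

```nasm
; nasm -f elf32 scan.asm
; ld -o scan scan.o         (add -m elf_i386 on a 64-bit system)
bits 32
section .data
msg db 'hello', 0           ; sample string, made up for this demo

section .text
global _start
_start:
    mov ecx, msg            ; ecx = address of the first character
    xor edx, edx            ; edx = offset into the string, starting at 0
.find:
    cmp byte [ecx + edx], 0 ; look at the byte at address ecx + edx
    je .found               ; zero byte: edx is now the length
    inc edx                 ; otherwise step to the next byte
    jmp .find
.found:
    mov ebx, edx            ; use the length as the exit status
    mov eax, 1              ; sys_exit
    int 0x80
```

ecx never changes. Only the offset in edx moves, which is why edx ends up holding exactly the count that sys_write wants.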
However...
bits 32
section .text
global l_puts
l_puts:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
push edi ; preserve edi
push esi ; preserve esi
mov ebx, 1 ; 1= stdout
; mov ecx, [ebp + 12] ; const char *buf [address of string] goes into ecx
; Yes, but it's at [ebp + 8], being the first and only parameter!
mov ecx, [ebp + 8]
xor edx, edx
.char_loop:
cmp [ecx + edx], byte 0 ; look for null terminator
; je .done
; No, only "found length", not "done"!
je .found_length
inc edx
jmp .char_loop
.found_length:
; now we've got ebx, ecx, and edx where we want 'em
mov eax, 4 ; sys write
int 0x80
test eax, eax ; just to set flags
jns .done ; no error
mov eax, -1
; or eax, -1 ; shorter way to do the same thing
; inc edx
; jmp .char_loop
.done:
pop esi ; restore esi
pop edi ; restore edi
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
Does no harm to preserve registers we don't use.
"l_write" is simpler than you've got it. With the exception of making the error -1, depriving the caller of "what" went wrong, it's just sys_write.
bits 32
section .text
global l_write
l_write:
push ebp ;prologue
mov ebp, esp
push ebx ; preserve registers
push edi
push esi
mov ebx, [ebp + 8] ; fd stored in ebx
mov ecx, [ebp + 12] ; char *buf stored in ecx
mov edx, [ebp + 16] ; len stored in edx
; xor esi, esi ; counter
; we don't need a counter
cmp edx, 0 ; check for 0 len
jle .done
; does no harm to check caller for idiocy
; we don't write anything anyway
; .char_loop:
mov eax, 4 ; sys write
int 0x80
cmp eax, 0 ; check for error
jle .error
; probably should be just "jl"
; strictly speaking , 0 is not an error
; inc ecx ; move to next char
; inc esi ; increment counter
; cmp esi, [ebp + 16] ; does bytes written = len?
; je .done
; jmp .char_loop
; we don't need any of that
jmp .done ; success: skip over .error
.error:
mov eax, -1 ; error
pop esi ; restore registers
pop edi
pop ebx
mov esp, ebp ; epilogue
pop ebp
ret
.done:
; mov eax, esi ; not needed: eax already holds the count from sys_write
pop esi
pop edi
pop ebx ; restore registers
mov esp, ebp ; epilogue
pop ebp
ret
No real need to duplicate the entire "clean up and go home". We just need to make eax -1 if it was negative (depriving the caller of useful information) and leave it alone if no error. Does no harm.
It probably would have been a good idea to make each of these functions a separate "topic". A little late now.
Now... see if I still feel like looking at l_gets...
Best,
Frank
-
ok, nearly final l_write:
bits 32
section .text
global l_write
l_write:
push ebp ;prologue
mov ebp, esp
push ebx ; preserve registers
push edi
push esi
mov ebx, [ebp + 8] ; fd stored in ebx
mov ecx, [ebp + 12] ; char *buf stored in ecx
mov edx, [ebp + 16] ; len stored in edx
cmp edx, 0 ; check for 0 len
jle .done
mov eax, 4 ; sys write
int 0x80
cmp eax, 0 ; check for error (when eax is less than zero)
jl .error
.done:
pop esi
pop edi
pop ebx ; restore registers
mov esp, ebp ; epilogue
pop ebp
ret
.error:
mov eax, -1 ; error
jmp .done
I'm not following how this would return the number of bytes written (which is why I was using esi as a counter, every byte written would increment 1, and then that is moved into eax before returning) unless that part needs to be amended..
-
This is what I've got for l_gets. It's pretty much your code. I moved "inc esi" up to the top so our first try is 1, not 0. If esi==len, we do want to do that read, in case the LF is there. Now we do. I cut back to reading one byte at a time, little as I like it. Fugly! I did not attempt to "flush the buffer". I indicated where we might want to - only if we're reading from stdin!
; nasm -f elf32 l_gets.asm -d TESTMAIN
; ld -o l_gets l_gets.o
bits 32
%ifdef TESTMAIN
section .bss
buf resb 80
fd resd 1
section .data
filename db `l_gets.asm\0` ; ourself - we know it's there
section .text
global _start
_start:
mov eax, 5 ; sys_open
mov ebx, filename
xor ecx, ecx
xor edx, edx
int 80h
test eax, eax
js exit ; bail out if error
mov [fd], eax
; test a call to it
; try multiple calls if we're reading file
; just to make sure we're stopping at LF
; and can continue from there
mov esi, 7
top:
push 80 ; length
push buf
push dword [fd]
call l_gets
add esp, 4 * 3
; print what we l_getsed - l_got?
mov edx, eax ; length read
mov eax, 4 ;sys_write
mov ebx, 1 ; stdout
mov ecx, buf
int 80h
; only if we're doing multiple reads
dec esi
jnz top
exit:
mov ebx, eax ; return value is number of bytes written (len)
mov eax, 1 ; sys exit
int 0x80
%endif
;-----------------------------
section .text
global l_gets
l_gets:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
push esi
mov ebx, [ebp + 8] ; fd parameter into ebx
mov ecx, [ebp + 12] ; char *buf in ecx
mov edx, [ebp + 16] ; len in edx
xor esi, esi ; counter
cmp edx, 0 ; if len zero or less, exit
jle .done
mov edx, 1 ; read one byte at a time. ugh!
.char_loop:
inc esi ; increment count first
mov eax, 3 ; sys read
int 0x80
cmp [ecx], byte 0xA ; test for linefeed
je .done
inc ecx ; advance to next byte
cmp esi, [ebp + 16] ; does read bytes = len?
je .done
; if this happens, we didn't find a LF
; if we're reading stdin, this indicates overflow
; we might want to flush OS's input buffer ("keyboard buffer")
jmp .char_loop
.done:
mov eax, esi ; # bytes read into eax
pop esi
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
;-----------------------------------------------
That's as far as I got. I'm not sure it's "complete". See what you think...
Best,
Frank
Ah, again we're bumping into each other. Your l_write looks good at first glance. It returns the number of bytes written because that's what sys_write does!
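A quick way to see that for yourself, as a sketch: write a known number of bytes and hand whatever sys_write left in eax to sys_exit. The message and its label are made up for the demo. With stdout on a terminal or a file, "echo $?" afterwards should print 6.

```nasm
bits 32
section .data
msg db 'hello', 0xA         ; 5 letters plus a linefeed
msg_len equ $ - msg         ; 6 bytes

section .text
global _start
_start:
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    mov ecx, msg
    mov edx, msg_len
    int 0x80                ; eax = bytes written, or -ERRNO
    mov ebx, eax            ; pass that value on as the exit status
    mov eax, 1              ; sys_exit
    int 0x80
```

No counter register is involved anywhere. The count comes straight back from the kernel.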
-
completed l_puts:
bits 32
section .text
global l_puts
l_puts:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve ebx
push edi ; preserve edi
push esi ; preserve esi
mov ebx, 1 ; 1= stdout
mov ecx, [ebp + 8] ; const char *buf [address of string] goes into ecx (first and only parameter)
xor edx, edx ; set up a counter
.char_loop:
cmp [ecx + edx], byte 0 ; look for null terminator
je .found_len
inc edx ; counter + 1
jmp .char_loop
.found_len:
mov eax, 4 ; sys write
int 0x80
test eax, eax ; set flags
jns .done ; jump if zero or positive (no error)
mov eax, -1 ; set error code if error
.done:
pop esi ; restore esi
pop edi ; restore edi
pop ebx ; restore ebx
mov esp, ebp ; epilogue
pop ebp
ret
-
completed l_open:
bits 32
section .text
global l_open
l_open:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve registers
push edi
push esi
mov ebx, [ebp + 8] ; name of the file (const char * name)
mov ecx, [ebp + 12] ; flags
mov edx, [ebp + 16] ; mode
mov eax, 5 ; open sys call
int 0x80
cmp eax, 0 ; check for error
jl .error
.done:
pop esi ; restore registers
pop edi
pop ebx
mov esp, ebp ; epilogue
pop ebp
ret
.error:
mov eax, -1
jmp .done
int l_open(const char *name, int flags, int mode)
opens the named file with the supplied flags and mode. Returns the integer file descriptor of the newly opened file or -1 if the file can't be opened.
-
completed l_close:
bits 32
section .text
global l_close
l_close:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve registers
push edi
push esi
mov ebx, [ebp + 8] ; file descriptor
mov eax, 6 ; close sys call
int 0x80
cmp eax, 0
jl .error ; check for error
je .success ; check for success
.done:
pop esi ; restore registers
pop edi
pop ebx
mov esp, ebp ; epilogue
pop ebp
ret
.error:
mov eax, -1
jmp .done
.success:
mov eax, 0
jmp .done
int l_close(int fd)
close the indicated file, returns 0 on success or -1 on failure.
-
And finally, the last function, "l_exit"
bits 32
section .text
global l_exit
l_exit:
; are these needed for exit:
; push ebp ; prologue
; mov ebp, esp
; push ebx ; preserve registers
; push edi
; push esi
mov ebx, [esp + 4] ; int rc (no stack frame set up here, so the argument is at [esp + 4], not [ebp + 8])
mov eax, 1 ; exit sys call
int 0x80
int l_exit(int rc)
terminate the calling program with exit code rc.
-
Should I be performing xor on all of these registers at the beginning of all of these functions?
-
Your l_gets seems to work great! However, it looks like it is repeating the same line 7 times, care to elaborate on that one?
The return value 37 is correct, 36 characters + linefeed :D
(http://i.imgur.com/OMZFsOh.png)
*edit aha, never mind, I see in the testmain code that you are repeating 7 times (esi= 7)
Ignore this post!
-
Regarding l_open not working, here are the feedback comments:
Your l_open doesn't work correctly. After sys_open, you don't check the return value. Your l_open should return -1 if you fail to open the file for any reason.
My l_open code:
l_open:
push ebp ; prologue
mov ebp, esp
push ebx ; preserve registers
push edi
push esi
mov ebx, [ebp + 8] ; name of the file (const char * name)
mov ecx, [ebp + 12] ; flags
mov edx, [ebp + 16] ; mode
mov eax, 5 ; open sys call
int 0x80
cmp eax, 0 ; check for error
jl .error
.done:
pop esi ; restore registers
pop edi
pop ebx
mov esp, ebp ; epilogue
pop ebp
ret
.error:
mov eax, -1
jmp .done
So.. how to fix so that it returns -1 for failure to open?
-
That looks correct to me!
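One way to double-check it is a small test driver in the same spirit as the TESTMAIN for l_gets. This is only a sketch: the file names and the "missing" file are made up, and the l_open below is a trimmed copy (it preserves only ebx) so the example assembles on its own. If l_open returns -1, "echo $?" should print 255. A raw -ENOENT (-2) leaking through would show up as 254 instead.

```nasm
; nasm -f elf32 t_open.asm
; ld -o t_open t_open.o     (add -m elf_i386 on a 64-bit system)
bits 32
section .data
missing db 'no_such_file_here', 0   ; made-up name, assumed not to exist

section .text
global _start
_start:
    push 0                  ; mode (ignored without O_CREAT)
    push 0                  ; flags: O_RDONLY
    push missing            ; name
    call l_open
    add esp, 4 * 3          ; caller cleans up (cdecl)
    mov ebx, eax            ; exit status = low byte of the return value
    mov eax, 1              ; sys_exit
    int 0x80

l_open:                     ; trimmed copy of the function under test
    push ebp
    mov ebp, esp
    push ebx
    mov ebx, [ebp + 8]      ; name
    mov ecx, [ebp + 12]     ; flags
    mov edx, [ebp + 16]     ; mode
    mov eax, 5              ; sys_open
    int 0x80
    test eax, eax
    jns .done               ; zero or positive: a valid descriptor
    mov eax, -1             ; any -ERRNO becomes -1
.done:
    pop ebx
    mov esp, ebp
    pop ebp
    ret
```

If this prints 255, the function itself is doing what the assignment asks, and the problem is more likely in how the grader builds or calls it.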