NASM - The Netwide Assembler
Topic started by: annonymous on February 01, 2012, 01:06:23 AM
-
I am "trying" to write a program that takes user input and prints it back. It prompts the user for data, then prints "You entered: " followed by nothing at all, and exits. What am I doing wrong?
section .text
global _start
_start:
mov eax,4
mov ebx,1
mov ecx,msg
mov edx,msg_size
int 80h
mov eax,3
mov ebx,1
xor ecx,ecx
mov ecx,inp_buf
int 80h
mov eax,4
mov ebx,1
mov ecx,msg2
mov edx,msg2_size
int 80h
mov eax,4
mov ebx,1
mov ecx,[inp_buf]
int 80h
mov eax,1
xor ebx,ebx
mov ebx,0
int 80h
section .data
msg db 'Input some data: '
msg_size equ $-msg
inp_buf: resq 100
msg2 db 'You entered: '
msg2_size equ $-msg2
-
"[contents]" of inp_buf instead of address
mov ecx,[inp_buf]
That's mostly it... sys_write wants the address itself in ecx (mov ecx,inp_buf).
You could give some thought to what's in edx. When you read the input, edx has msg_size in it from the first write, so that's all the user can input, even though you've got a buffer of 800 bytes (100 qwords). If you let 'em put in up to 800 bytes, the amount that they actually input will be in eax. You want that value in edx when you write it back... but you write msg2 first, so you'll need to save that value... the stack might be a handy place...
...
mov eax,3
mov ebx,1 ; reading from stdout? will work(!). zero (stdin) better
; xor ecx,ecx ; not useful
mov ecx,inp_buf
mov edx, 800
int 80h
push eax ; save input count
mov eax,4
mov ebx,1
mov ecx,msg2
mov edx,msg2_size
int 80h
mov eax,4
mov ebx,1
mov ecx, inp_buf
pop edx ; get input count back
int 80h
...
That's untested, but I think it'll work. The "count" from the read will include the linefeed (0x0A). If you don't want to print that (you probably do, in this case), subtract 1. By rights, you might want to read only 799 bytes - make sure there's room for it in the buffer...
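A minimal sketch of those last two suggestions, assuming 32-bit Linux and the int 80h interface used in this thread: ask for one byte less than the buffer holds, and leave the trailing linefeed out of the count before echoing. The names inp_buf_size and nl are made up for illustration, and the assumption that the spare byte is meant for something like a terminating zero is mine.

```nasm
; nasm -f elf32 echo1.asm && ld -m elf_i386 -o echo1 echo1.o
global _start

section .bss
inp_buf       resb 800
inp_buf_size  equ $ - inp_buf

section .data
nl  db 10

section .text
_start:
    mov eax, 3                  ; sys_read
    mov ebx, 0                  ; stdin
    mov ecx, inp_buf
    mov edx, inp_buf_size - 1   ; leave one spare byte (e.g. for a terminating 0)
    int 80h
    cmp eax, 0
    jle exit                    ; 0 = end of input, negative = error

    ; if the last byte read is a linefeed, don't count it
    cmp byte [inp_buf + eax - 1], 10
    jne print
    dec eax
print:
    mov edx, eax                ; number of bytes to write
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout
    mov ecx, inp_buf
    int 80h

    mov eax, 4                  ; finish the line ourselves
    mov ebx, 1
    mov ecx, nl
    mov edx, 1
    int 80h
exit:
    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

Writing the linefeed separately is only there to show that the count really was shortened; in a plain echo program you would normally just print the count as returned.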
Best,
Frank
-
Thank you Frank.
I now see what I was doing wrong and made a few modifications. Here they are:
section .text
global _start
_start:
mov eax,4 ;sys_write
mov ebx,1 ;To stdout
mov ecx,msg ;'Input some data: '
mov edx,msg_size
int 80h ;Call kernel
mov eax,3 ;sys_read. Read what user inputs
mov ebx,0 ;From stdin
mov ecx,inp_buf ;Save user input to buffer.
int 80h
push eax
mov eax,4
mov ebx,1
mov ecx,msg2 ;'You entered: '
mov edx,msg2_size
int 80h
mov eax,4
mov ebx,1
mov ecx,inp_buf
pop edx
int 80h
mov eax,1
mov ebx,0
int 80h
section .bss
inp_buf resb 256
section .data
msg: db 'Input some data: '
msg_size: equ $-msg
msg2: db 'You entered: '
msg2_size: equ $-msg2
I appreciate the help!
So I tried to enter a string 19 chars long, but it only prints 17 and then exits. It doesn't matter how big I make the buffer, either.
Here's what happens:
ubuntu@ubuntu:~/Documents$ ./usrinput
Input some data: hello there bro how
You entered: hello there bro hubuntu@ubuntu:~/Documents$ ow
ow: command not found
ubuntu@ubuntu:~/Documents$
What gives?
-
Looks like we need to specify the maximum byte count for the read() system call.
Please try this for the read() system call:
mov eax,3 ;sys_read. Read what user inputs
mov ebx,0 ;From stdin
mov ecx,inp_buf ;Save user input to buffer.
mov edx, 256 ;; No of bytes to read.
int 80h
Reference:
http://leto.net/writing/nasm.php
ssize_t read(int fd, void *buf, size_t count);
NOTE: ssize_t and size_t are just integer types.
The first argument is the file descriptor, followed by the buffer, and then how
many bytes to read in, which should be however long the buffer is. Reading the
RETURN VALUE section, you should see how read() returns the number of bytes it
read, 0 for EOF, and -1 for errors.
So why always 17 chars?
The mov edx,msg_size in the first write() system call (for printing 'Input some data: ') stored 17 in edx, and that value carried over to the read system call, which never set edx itself.
Everything you typed after the 17th character ('ow' and Enter) was left for the shell prompt to read as input.
EDIT: I think Frank corrected that too in his post, and you missed it :)
-
Wow, I can't believe I missed that!
And I now understand why read was only taking in 17 bytes. Thanks Mathi!
-
Hello.
(I will continue this thread because it is directly related, but if it requires a new thread just let me know)
mov edx, 256 ;; No of bytes to read.
What if I don't know the size of the input? Is it safe to give a huge buffer? Or can I write a loop until I find the end of the input?
Regards.
-
It should be "safe" to use a huge buffer, provided you've got a big enough buffer to hold all that you've specified in edx. Allowing edx to exceed the size of the buffer is a grave error! The problem with this method is "how huge is huge enough?"
Processing multiple bufferfuls in a loop "until done" is probably a better approach. Rather than using a hard-coded number like "256", it's probably better to:
%define BUFSIZ 256
...
buffer resb BUFSIZ
...
mov edx, BUFSIZ
That way you can try a small value to test that you haven't got an "off by one" error when going from buffer to buffer (easy to do!), and increase the size for better performance when you've got it working.
After the sys_read, eax will have the number of characters actually read. You may want to pass this value to your "processing". If eax = edx (or BUFSIZ), that's probably an indication that there's more to read, but you should probably check if the last character is a linefeed (10) - if so, we're done... and probably don't want to try to read more. When reading a disk file, sys_read will return zero in eax once we reach the end of the file (it isn't an error to attempt to read past EOF), but stdin will wait for more (I think) and you probably don't want that.
See how it goes, and if you have trouble, ask more.
Best,
Frank
-
Personally, I don't like specifying a static BUFSIZ; I prefer the <name>_size method:
buffer: resb 256
buffer_size equ ($-buffer)
There isn't really a difference other than it fits in with the way structures define sizes.
-
Thank you both, the information sure makes things clearer. I have about 2 weeks with assembly and it has been quite a journey!
I wrote this:
setBuffSize:
inc edx ;Increment the size of edx, I could also use buffer_size or BUFSIZ
cmp edx,eax ;Compare them, should I be using eax * the bytes or bits per char?
jg setBuffSize ;Loop until I find out the correct number
I am not sure if it is working, but say it eventually does, once I find out the number of characters in eax, how do I go back and load those extra characters?
Regards,
Felipe
-
Hi Felipe,
I see I've said something misleading. I said eax will contain the number of characters (bytes) actually read. This is literally true - sys_read won't "read" more than edx bytes, even if the pesky user types more than that. You seem to be treating eax as if it was the number of bytes typed - even if it's more than we had room for in the buffer, and/or what we allowed in edx. This isn't what happens - eax won't be more than edx. Might be less - which is good - or equal, which we'll have to check for. (maybe we've got everything typed, maybe not)
My favorite distraction just showed up, I'll have to get back to this...
Later,
Frank
-
Sorry. Where was I? Oh, yeah, if eax = edx... we might have everything typed in the buffer, or there might be more. As annonymous observed above, if more was typed than will fit in the buffer, it's liable to show up on the command prompt when we exit. We don't want that, so we could flush the buffer - read whatever's left and throw it away. Since what you want to do is process the buffer in a loop (just print it?) until there's no more input, we can do that instead...
Since sys_read from stdin won't return until the user hits "enter", even if more than edx was typed, we can be assured that the last "character typed" is a linefeed (10 decimal or 0xA), whether it's in our buffer or not (it remains in the OS's buffer, if not... until we read it). At first, I was thinking that we'd have to do this immediately after the sys_read, and remember whether it was "last" or not after the sys_write. But I figured out that we can actually check for the linefeed after the sys_write, and "reread" until we find it. This doesn't work if stdin has been redirected ("myprog<somefile") since there can be more than one linefeed in the input, and this will quit after the first one it finds. (seems to me that there's a way to determine if stdin has been redirected, but if I ever knew it, I've forgotten) Seems to work okay for "keyboard input"...
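For what it's worth, one way to tell whether stdin has been redirected (an assumption that this is the method being half-remembered here) is to ask the kernel for the terminal settings of fd 0 with the TCGETS ioctl, which is roughly what C's isatty() does. The call succeeds only if the descriptor refers to a terminal. The sketch below assumes 32-bit Linux on x86, where sys_ioctl is call number 54 and TCGETS is 0x5401; the label names are made up.

```nasm
; nasm -f elf32 istty.asm && ld -m elf_i386 -o istty istty.o
global _start

section .bss
termios_buf  resb 64            ; more than the kernel's termios struct needs

section .data
tty_msg      db 'stdin is a terminal', 10
tty_msg_size equ $ - tty_msg
not_msg      db 'stdin is not a terminal', 10
not_msg_size equ $ - not_msg

section .text
_start:
    mov eax, 54                 ; sys_ioctl
    mov ebx, 0                  ; stdin
    mov ecx, 0x5401             ; TCGETS
    mov edx, termios_buf
    int 80h

    mov ecx, tty_msg
    mov edx, tty_msg_size
    test eax, eax               ; 0 = success, so fd 0 is a terminal
    jz report
    mov ecx, not_msg            ; any error (typically -ENOTTY) = not a terminal
    mov edx, not_msg_size
report:
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout
    int 80h
    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

Running it once as "./istty" and once as "./istty<somefile" shows the two cases.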
Notice that I liked Bryant's way of naming "buffer-size" rather than "BUFSIZ" - the point is to make sure we put a value in edx that matches the size of the buffer. I've used an insanely small buffer size - makes it easy to test without typing all day. :) You'll probably want to increase that (256 is probably good) for actual use...
; nasm -f elf32 myprog.asm
; ld -o myprog myprog.o -m elf_i386
; (a 32-bit ld doesn't need the "-m" switch,
; but a 64-bit ld has to be told to produce 32-bit code)
global _start
section .bss
buffer resb 4 ; absurdly small
buffer_size equ $ - buffer
count resd 1
section .text
_start:
; print prompt
reread:
mov edx, buffer_size
mov ecx, buffer
mov ebx, 0 ; stdin
mov eax, 3 ; sys_read
int 80h
; check for error (unlikely?)
cmp eax, -4096
ja exit
; save count for later
mov [count], eax
; do processing on buffer, if any
; print another prompt, if any
; print what we got from sys_read
mov edx, [count]
mov ecx, buffer
mov ebx, 1 ; stdout
mov eax, 4 ; sys_write
int 80h
; check for error (unlikely?)
cmp eax, -4096
ja exit
; if the last byte we printed is a linefeed, we're done
; if not, there must be more to read
cmp byte [ecx + edx - 1], 10
jne reread
exit:
mov eax, 1
mov ebx, 0
int 80h
Best,
Frank
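The other option mentioned in this post, flushing (read whatever's left and throw it away), might look something like the sketch below. It is not Frank's code: the junk byte and the one-byte-at-a-time loop are assumptions chosen for simplicity, and the same caveat about redirected stdin applies, since it stops at the first linefeed.

```nasm
; nasm -f elf32 flush.asm && ld -m elf_i386 -o flush flush.o
global _start

section .bss
buffer       resb 4             ; absurdly small, to make testing easy
buffer_size  equ $ - buffer
junk         resb 1             ; leftover input is read into here and ignored

section .text
_start:
    mov eax, 3                  ; sys_read
    mov ebx, 0                  ; stdin
    mov ecx, buffer
    mov edx, buffer_size
    int 80h
    cmp eax, 0
    jle exit                    ; end of input or error
    mov esi, eax                ; keep the count; esi survives int 80h

    ; echo what fit in the buffer
    mov edx, esi
    mov ecx, buffer
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 80h

    ; buffer ended with a linefeed? then nothing is left over
    cmp byte [buffer + esi - 1], 10
    je exit

flush:
    mov eax, 3                  ; sys_read, one byte at a time
    mov ebx, 0
    mov ecx, junk
    mov edx, 1
    int 80h
    cmp eax, 1
    jne exit                    ; 0 = EOF, negative = error: stop
    cmp byte [junk], 10         ; keep discarding until the linefeed
    jne flush
exit:
    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

With the 4-byte buffer, typing more than four characters prints only the first four, and nothing is left behind for the shell to pick up. Reading the leftovers one byte per system call is slow; a larger scratch buffer would work the same way.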
-
Well, first of all, thank you for the level of detail and patience!! It really helps a lot.
The code is actually very clear, but I do have a few questions:
In the consecutive passes through the loop, the system does not wait for the user input because it passes whatever is left on the OS's buffer correct?
cmp eax, -4096
What type of error is this? Is there a table with the error codes?
I found a few in errno-base.h, but not really sure how to use them or if there are better resources.
I like the buffer size, it works fine, but I guess a bigger size would eventually reduce the amount of loop passes?
One more, this reads 4 bits at a time (in buffer) and then it prints them, on each pass the previous 4 bits are overridden correct? Is there a way to concatenate them and store the full input?
In case of a file I guess you should look for an EOF, but yes, it may be a problem if you don't know that the input comes from there.
Again thanks.
Best regards,
Felipe
-
> Well, first of all, thank you for the level of detail and patience!! It really helps a lot.
I got a lot of help when I was first learning assembly, and I still do. I'm just trying to "pass it on".
> The code is actually very clear, but I do have a few questions:
> In the consecutive passes through the loop, the system does not wait for the user input because it passes whatever is left on the OS's buffer correct?
Correct.
> cmp eax, -4096
> What type of error is this?
Any kind. The man pages (the system calls we're interested in are in "man 2") tell us that the C functions return -1 (or sometimes 0), and the error number is in "errno". Using int 80h directly, what we actually get is the negative ERRNO (error numbers are defined as positive integers) in eax. We can't just look for negative numbers (although it usually works), since that would limit legitimate return values to 2G. Almost all return values would be in that range, but memory allocation (for example) could return a legitimate value up to 0xFFFFF000 (-4096) - it can't allocate a page above that, so that's where error returns will lie.
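A small sketch of that check, assuming 32-bit Linux: it provokes an error on purpose by reading from an invalid file descriptor, uses the same unsigned compare against -4096, then negates eax to recover the positive error number and reports it as the exit status. The labels and the choice of -1 as the bad descriptor are just for illustration.

```nasm
; nasm -f elf32 errdemo.asm && ld -m elf_i386 -o errdemo errdemo.o
global _start

section .bss
buffer       resb 16
buffer_size  equ $ - buffer

section .text
_start:
    mov eax, 3                  ; sys_read
    mov ebx, -1                 ; deliberately invalid file descriptor
    mov ecx, buffer
    mov edx, buffer_size
    int 80h

    cmp eax, -4096
    ja failed                   ; unsigned compare: 0xFFFFF001..0xFFFFFFFF are errors

    xor ebx, ebx                ; success: exit status 0
    jmp quit
failed:
    neg eax                     ; -ERRNO -> ERRNO
    mov ebx, eax                ; report it as the exit status
quit:
    mov eax, 1                  ; sys_exit
    int 80h
```

Running "./errdemo; echo $?" shows the error number; for the bad descriptor used here that should be 9, which is EBADF in errno-base.h.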
> Is there a table with the error codes?
> I found a few in errno-base.h, but not really sure how to use them or if there are better resources.
"/usr/src/linux/include/asm-i386/errno.h" would be the version of "errno.h" you want, I think. I fairly recently posted my half-baked error reporting code - lemme see if I can find that...
Edit: ah, here 'tis:
http://forum.nasm.us/index.php?topic=1386.0
> I like the buffer size, it works fine, but I guess a bigger size would eventually reduce the amount of loop passes?
Right!
> One more, this reads 4 bits at a time (in buffer) and then it prints them, on each pass the previous 4 bits are overridden correct?
4 bytes, not 4 bits (eight bits to a byte), but yeah, they're overwritten on each pass through the loop.
> Is there a way to concatenate them and store the full input?
Yeah, you can provide a "huge" buffer like you originally asked about, or you could use a "dynamic" buffer - allocating more memory if there's more input than will fit. This is a little more complicated.
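A sketch of the simpler "huge buffer" variant, assuming 32-bit Linux: each sys_read lands right after the previous one, so the pieces end up concatenated in one buffer. CHUNK, big_buf and the 4096-byte size are arbitrary choices for illustration, and the loop stops at the first linefeed, at end of input, or when the buffer is full. The dynamic version would replace the "buffer full" exit with a request for more memory.

```nasm
; nasm -f elf32 concat.asm && ld -m elf_i386 -o concat concat.o
global _start

%define CHUNK 4                 ; bytes requested per sys_read (tiny, for testing)

section .bss
big_buf       resb 4096
big_buf_size  equ $ - big_buf

section .text
_start:
    mov edi, big_buf            ; edi = where the next chunk goes
reread:
    ; edx = min(CHUNK, space left in big_buf)
    mov edx, big_buf + big_buf_size
    sub edx, edi
    jz done                     ; buffer full
    cmp edx, CHUNK
    jbe do_read
    mov edx, CHUNK
do_read:
    mov ecx, edi
    mov ebx, 0                  ; stdin
    mov eax, 3                  ; sys_read
    int 80h
    cmp eax, 0
    jle done                    ; end of input or error
    add edi, eax                ; append: advance past the new bytes
    cmp byte [edi - 1], 10      ; stop once a linefeed has arrived
    jne reread
done:
    ; write everything collected so far in one go
    mov edx, edi
    sub edx, big_buf            ; total byte count
    mov ecx, big_buf
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 80h

    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

Because edx is never allowed to exceed the space that is left, the reads cannot run past the end of big_buf no matter how much is typed.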
> In case of a file I guess you should look for an EOF, but yes, it may be a problem if you don't know that the input comes from there.
Right. Reading from a disk file, or from a socket, sys_read will return zero in eax. But reading from stdin, since it doesn't return until the user hits "enter", will always return at least 1 - the linefeed (10 decimal or 0xA). I can't figure out (at the moment) a good way to handle both/either...
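One possible way to handle both, offered as a sketch and not a definitive answer: ignore linefeeds altogether, treat a return of 0 as "done", and keep looping otherwise, the way cat does. This relies on the terminal driver's default behaviour that Ctrl-D on an empty line makes sys_read return 0, so an interactive user ends input with Ctrl-D instead of "enter". The label names are made up, and 32-bit Linux is assumed.

```nasm
; nasm -f elf32 echoall.asm && ld -m elf_i386 -o echoall echoall.o
global _start

section .bss
buffer       resb 256
buffer_size  equ $ - buffer

section .text
_start:
reread:
    mov eax, 3                  ; sys_read
    mov ebx, 0                  ; stdin (keyboard, file or pipe)
    mov ecx, buffer
    mov edx, buffer_size
    int 80h
    cmp eax, 0
    jle exit                    ; 0 = EOF (file ended, or Ctrl-D at a terminal); negative = error

    mov edx, eax                ; bytes still to be written
    mov ecx, buffer             ; where they start
write_more:
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout
    int 80h
    cmp eax, 0
    jle exit                    ; write error
    add ecx, eax                ; skip what was written
    sub edx, eax                ; anything left?
    jnz write_more
    jmp reread
exit:
    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

The trade-off is that a keyboard user now gets a multi-line echo that only ends on Ctrl-D, which may or may not be what the program wants.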
Best,
Frank
-
Pay it forward! Great way to live, I hope I'll be there at some point, I like helping out when I can too :)
I'll look deep into those error tables. I found the file that you mentioned; I am using openSUSE and everything is in different places from where the tutorials say, but I have found my way there. I'll probably switch to Debian soon, so that problem will probably go away. And I saw your post, looks great, I'll try to use it :)
I'll make a new thread regarding concatenation, because I have a few questions about it that I don't think fit into this post.
I am also working on reading from a file, so I'll most likely start a thread there too. For now I would write two procedures, or a macro with some indicator of whether the input comes from a file or from the user, in order to avoid guessing, but there could be a good way to have it all. :)
The picture is much clearer now.
Thanks again!
Best regards,
Felipe