NASM - The Netwide Assembler
NASM Forum => Programming with NASM => Topic started by: turtle13 on September 15, 2017, 09:18:50 PM
-
Continuing on earlier conversations about networking in NASM:
for the exit code..
exit:
mov ebx, eax
test ebx, ebx
jns exit_good
neg ebx
exit_good:
mov eax, 1
int 0x80
I would like to use my custom l_exit function.
I coded it as:
l_exit:
mov ebx, [ebp + 8] ; int rc
mov eax, 1 ; exit sys call
int 0x80
Can I simply replace the exit and exit_good and call this function as:
push ebx
call l_exit
-
Frank, earlier you provided sample code:
request_len equ $ - sa
sa:
dw 2
dw htons (80) ; port for http
.addr:
dd 0, 0
sa_size equ $ -sa
^ I'm not sure what exactly is going on here?
Next:
section .data
request db "GET / HTTP/1.0", 13, 10
db "Host: www.google.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10
What are the "13, 10" after each line? The first line has 14 characters, not 13.. and what is the 10 for after the 13?
Also, there should be a \r\n after the GET line, so should it look something like:
request db "GET / HTTP1.0\r\n", 15, 10
I am probably way off here..
-
The way that I exit is just so I can see the error code (or not) by typing "echo $?". If I weren't so lazy, I would "issue a diagnostic and exit cleanly" - also known as "scream and die". Your l_exit function should do the same, depending on what you've got in ebx when you push it.
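A minimal sketch of the push/call idea, assuming 32-bit Linux, a cdecl-style argument on the stack, and that l_exit sets up ebp before reading [ebp + 8] (the snippet in the question does not show that part). The -3 "error" and the label .push_it are made up for illustration.

```nasm
; nasm -f elf32 lexit.asm && ld -m elf_i386 -o lexit lexit.o
bits 32
global _start

section .text
_start:
    mov eax, -3             ; pretend a syscall returned -3 (an error)
    mov ebx, eax
    test ebx, ebx
    jns .push_it
    neg ebx                 ; make it positive so "echo $?" shows 3
.push_it:
    push ebx                ; int rc
    call l_exit             ; never returns, so no stack cleanup needed

; void l_exit(int rc)
l_exit:
    push ebp
    mov ebp, esp            ; only now is [ebp + 8] the first argument
    mov ebx, [ebp + 8]      ; int rc
    mov eax, 1              ; sys_exit
    int 0x80
```

Run it and then type "echo $?"; the status should be 3. Without the push ebp / mov ebp, esp pair, [ebp + 8] would read whatever ebp happened to point at.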
request_len equ $ - sa
; wot? should be $ - request, no?
sa:
dw 2
dw htons (80) ; port for http
.addr: ; should be a "dd 0" here, no?
dd 0, 0
sa_size equ $ -sa
"sa" is a structure expected by socket code. The "connect" command wants a pointer to it, and its length, in its arguments.
13 is a carriage return and 10 is a linefeed. It has nothing to do with the length of the line. Nasm will accept:
`foo \r\n`
if you put it between "back quotes" - the character under the ~ on my keyboard. Between double quotes " or single quotes ' - Nasm just sees a literal "\r\n". Do one or the other, not both. I don't know where you got 15, 10. The double 13, 10 at the end is just a blank line which indicates the end of the request. The internet is fussy about that cruft.
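A small sketch of the three spellings, assuming 32-bit Linux and a NASM version that supports backquoted strings. The label names req_a, req_b and req_c are made up. It compares the "13, 10" form against the backquoted form byte by byte, and uses the double-quoted form only to show the length difference.

```nasm
; nasm -f elf32 crlf.asm && ld -m elf_i386 -o crlf crlf.o
bits 32
global _start

section .data
req_a     db "GET / HTTP/1.0", 13, 10      ; CR, LF given as numbers
req_a_len equ $ - req_a                    ; 14 + 2 = 16 bytes
req_b     db `GET / HTTP/1.0\r\n`          ; backquotes: escapes are processed
req_b_len equ $ - req_b                    ; also 16 bytes
req_c     db "GET / HTTP/1.0\r\n"          ; double quotes: backslash, r, backslash, n stay literal
req_c_len equ $ - req_c                    ; 14 + 4 = 18 bytes

section .text
_start:
    mov esi, req_a
    mov edi, req_b
    mov ecx, req_a_len
    cld
    repe cmpsb              ; compare req_a and req_b byte by byte
    mov ebx, 1              ; status 1 = "they differ" (mov leaves the flags alone)
    jne .done
    mov ebx, req_c_len - req_a_len   ; 18 - 16 = 2 extra bytes in req_c
.done:
    mov eax, 1              ; sys_exit
    int 0x80
```

The exit status should be 2: the first two forms are identical, and the double-quoted one is two bytes longer because the escapes were not translated.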
Best,
Frank
-
Thanks that clears a bit up for me.
If I want to be able to input a custom URL on the command line instead of the default "www.somehost.com" as below:
section .data
request db "GET / HTTP/1.0", 13, 10 ; input the carriage return and linefeed after request
db "Host: www.somehost.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10
I would need to parse out the contents of argv[1] correct?
So that if I entered a URL like:
./networkprogram www.anotherhost.com/dir/index.html
it would connect to www.anotherhost.com and not www.somehost.com
If I want to have argv[1] become the URL:
request db "GET / HTTP/1.0", 13, 10 ; input the carriage return and linefeed after request
db "Host: www.somehost.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10
Now I'm stuck
-
Yeah, I guess. I didn't realize that you were supposed to get the URL from the command line. How the command line (and environment variables) are set up depends on whether we start with "_start:" or "main". In either case, the command line argument will be zero-terminated. That should be all set to send off to "resolv". To fit it into the "request", we'll have to concatenate it to "Host:" and replace the zero with CR,LF (13, 10)... and we will need to calculate the length of the request at runtime. "$ - request" isn't going to work.
I do not have an example at hand of doing that.
Best,
Frank
-
Back to basics..
if I want the command line argv[1] to be the URL to get the file from:
[shell]$ ./prog_name http://www.someplace.com/index.html
Would I code this something like:
pop ebx ; pops esp- top of stack (argc) into ebx
pop ebx ; pops argv[0] (program name: ./prog_name) into ebx
pop ebx ; pops argv[1] (URL) into ebx
mov ecx, [ebx] ; store contents of ebx (URL) into ecx for sys write
mov eax, 4 ; write sys call- to write the URL into the HTTP request
int 0x80
(When I compile and run the above, I get a seg fault)
How do I code what needs to be written to? So that the URL can be parsed and placed into a request:
GET /index.html HTTP/1.0\r\n
Host: www.someplace.com\r\n
Connection: close\r\n
\r\n
Using pseudocode below, I am thinking that I would need to declare a variable such as
section .data
request db "GET " + (everything after hostname) + " HTTP/1.0\r\nHost: " + (host name) + "\r\nConnection: close\r\n\r\n"
Because the length of the request will be different each time the program is run.
I can use my previous l_write function to pass the file descriptor (this should be the socket for the web server?), the char *buf, and the len of the amount to write. To determine len for the initial GET request, I will need to determine how many characters (bytes) are in the URL typed at the command line and then add that to the rest of the generic GET request. I'm trying to think more on how I can do this. I would need to declare some variable like request_len
What is a good way to accomplish this, if I have explained myself clearly?
-
Hi turtle13,
To begin with, what we find on the stack depends on how we link the program. I guess I just told you that. Consult "start.asm" (from an earlier assignment) for what linking with the C startup code changes. I'll assume we do not do that here.
_start:
pop ebx ; pops esp- top of stack (argc) into ebx
; might want to check that this is 2 - "usage" otherwise
pop ebx ; pops argv[0] (program name: ./prog_name) into ebx
pop ebx ; pops argv[1] (URL) into ebx
; I'd pop ecx here, or...
mov ecx, ebx ; store contents of ebx (URL) into ecx for sys write
; [ebx] should be "http" or so, thus the segfault
mov eax, 4 ; write sys call- to write the URL into the HTTP request
int 0x80
; this is not going to write into your request
; ebx is gawd knows what
; and we need to find length of zero-terminated string
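A sketch of just the "print argv[1]" part with those comments applied, assuming the program is linked with plain ld (no C startup code), so argc and the argv pointers sit on the stack at _start. The local label names are made up.

```nasm
; nasm -f elf32 showarg.asm && ld -m elf_i386 -o showarg showarg.o
bits 32
global _start

section .text
_start:
    pop eax                 ; argc
    cmp eax, 2
    jne .usage              ; expect exactly one argument
    pop ecx                 ; argv[0], discarded
    pop ecx                 ; argv[1]: address of the zero-terminated string

    mov edx, ecx            ; find the length: scan for the terminating zero
.scan:
    cmp byte [edx], 0
    je .found
    inc edx
    jmp .scan
.found:
    sub edx, ecx            ; edx = length in bytes

    mov ebx, 1              ; fd 1 = stdout
    mov eax, 4              ; sys_write(ebx = fd, ecx = buf, edx = len)
    int 0x80
    xor ebx, ebx            ; exit status 0
    jmp .exit
.usage:
    mov ebx, 1              ; exit status 1: wrong argument count
.exit:
    mov eax, 1              ; sys_exit
    int 0x80
```

The key difference from the segfaulting version is that ecx holds the pointer itself (not the dword it points at), and ebx and edx are set before the int 0x80.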
I think what I would do is to reserve a "big enough" buffer in .bss. I would then copy into it, counting all the way, "GET " (note the space), the "tail" of the command line argument, then " HTTP/1.1", 13, 10 (note the leading space and the slash). Then "Host: ", and the hostname part - we will want to send this off to "resolv", too. That wants zero-terminated, the request wants CR,LF terminated (13, 10). Then the rest of the request.
That is all untested. I'll try to work up an example. You seem to understand what's required, you try it too.
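A rough sketch of the copy-and-count idea, simplified on purpose: it assumes argv[1] is only a host name (no "http://", no path), always asks for "/", uses HTTP/1.0 as in the original question, and writes the finished request to stdout instead of a socket. The names part1, part2 and request are made up, and the 512-byte buffer is not bounds-checked.

```nasm
; nasm -f elf32 mkreq.asm && ld -m elf_i386 -o mkreq mkreq.o
bits 32
global _start

section .data
part1     db "GET / HTTP/1.0", 13, 10, "Host: "
part1_len equ $ - part1
part2     db 13, 10, "Connection: close", 13, 10, 13, 10
part2_len equ $ - part2

section .bss
request   resb 512          ; "big enough" for this sketch only

section .text
_start:
    pop eax                 ; argc
    cmp eax, 2
    jne .usage
    pop esi                 ; argv[0], discarded
    pop ebx                 ; argv[1] = host name, zero-terminated

    cld
    mov edi, request        ; edi always points at the next free byte

    mov esi, part1
    mov ecx, part1_len
    rep movsb               ; copy the fixed first part

    mov esi, ebx
.copy_host:
    lodsb                   ; copy the host name, but not its zero
    test al, al
    jz .host_done
    stosb
    jmp .copy_host
.host_done:
    mov esi, part2
    mov ecx, part2_len
    rep movsb               ; CR, LF, remaining header, blank line

    mov edx, edi
    sub edx, request        ; edx = request length, known only at run time

    mov ecx, request
    mov ebx, 1              ; stdout here; the socket descriptor in the real program
    mov eax, 4              ; sys_write
    int 0x80
    xor ebx, ebx
    jmp .exit
.usage:
    mov ebx, 1
.exit:
    mov eax, 1              ; sys_exit
    int 0x80
```

Because edi advances with every byte stored, the length falls out as edi minus the start of the buffer, which replaces the assemble-time "$ - request".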
Best,
Frank
-
^Frank, looking at my start.asm from last assignment,
bits 32
extern main
global _start
section .text
_start:
lea eax, [esp + 4]
mov ecx, [esp]
lea edx, [eax + ecx * 4 + 4]
push edx
push eax
push ecx
call main
add esp, 12
mov ebx, eax
mov eax, 1
int 0x80
In this case it looks like ecx would store argc, eax argv[0], and edx argv[1] (which would be the URL)
Is this what you were referring to?
Here is some code I coughed up which is not nearly finalized but which in my mind is where I should start to get to writing the GET request:
bits 32
section .bss
%define buf_size 0x10000 ; 0x10000 = 65,536 bytes (10,000 decimal would be 0x2710)
buf resb buf_size ; buf will be written into for setting up HTTP GET request
section .data
hostname times 1000 db 0 ; set up 1000 bytes for hostname ("db 1000" would define a single byte)
pathname times 1000 db 0 ; set up 1000 bytes for URL path
filename times 1000 db 0 ; set up 1000 bytes for filename to retrieve
get_request1 db "GET "
get_request2 db " HTTP/1.0", 13, 10
get_request3 db "Host: "
get_request4 db "Connection: close", 13, 10, 13, 10
section .text
global _start
_start:
; accepting command line arguments:
pop eax ; pops esp- top of stack (argc) into eax
pop ecx ; pops argv[0] (program name: ./assign5) into ecx
pop edx ; pops argv[1] (URL) into edx
; something like this to build the GET request?
push get_request4
push hostname
push get_request3 ; need to also include the \r\n somehow
push get_request2
push filename
push pathname
push get_request1
; ok the GET request is on the stack, now what?
In my code above I pushed the GET request one at a time onto the stack, but now I'm wondering how do I get all of that information into a "single" variable to be used to send in one big chunk to the web server?
-
I've been doing some research on creating a socket and have some questions..
- The syscall for socket is 102. So..
mov eax, 102
mov ebx, "connect"
mov ecx, full_get_request
int 0x80
I'm stuck on how to get it to connect.. what value goes into ebx here?
Looking at the Linux man pages for "socketcall"
int socketcall(int call, unsigned long *args)
I believe that first I need to "bind" so that I have the web server IP address bound with port 80.
Well, then I go into the man page for "bind"
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
Now it looks like there are a bunch of data structures like "struct" involved here and I'm lost as to how I get all of this information connected so that I can just make a connection to the web server and send the GET request.
To bind, would I need to do something like
mov eax, 102
mov ebx, "bind"
mov ecx, "address and port"
int 0x80
How do I get the "code" for "bind" and how to get address and port into ecx, or would address be in ecx and port be in edx? Is this making any sense?
Then, how would I get the new value that was "bound" to be "saved" so that I can use it again..
mov eax, 102
mov ebx, "bind"
mov ecx, "address and port"
int 0x80
mov edi, [ebp - 8] ; get the return value from "bind"
mov [bound_socket], edi ; store the bound socket into a variable called "bound_socket"
I am not allowed to call any C library functions, so I am stuck as to how to implement all of this using assembly.
-
Unless I'm mistaken, "bind" is if you're the server - you don't need it here.
As the client, you need to "socket" and then "connect". Then you could use the "send" command to sys_socketcall, or use sys_write... or maybe your l_write?
Did I post "fbkget.asm" that gets "www.nasm.us"? I think so. I'm ripping it to shreds now to read the command line. I'm about up to the '/' in "www.nasm.us/index"... if the pesky user types that... and trying to figure out if not... It is getting extremely ugly. Won't even assemble at the moment, let alone work. I may rip it up and start over. Here it is, as is:
; nasm -f elf32 httpget.asm
; ld -o httpget httpget.o dns.o (-melf_i386 for 64-bit systems?)
global _start
extern resolv ; from The professor's dns.o
; Convert numbers (constants!) to network byte order
%define hton(x) ((x & 0xFF000000) >> 24) | ((x & 0x00FF0000) >> 8) | ((x & 0x0000FF00) << 8) | ((x & 0x000000FF) << 24)
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8)
section .data
; the real request is named request - this just for reference
msg db "GET / HTTP/1.1", 13, 10
db "Host: www.nasm.us", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10
msg_len equ $ - msg
msg1 db "GET "
msg1len equ $ - msg1
; cltail comes next
msg2 db " HTTP/1.1", 13, 10, "Host: " ; leading space separates it from the path
msg2len equ $ - msg2
; clhostname next
msg3 db 13, 10, "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10
msg3len equ $ - msg3
; todo: correct name and description of this structure
sa:
dw 2 ; family
dw htons (80) ; port
.addr dd 0 ; fill in later
dd 0, 0 ; zero
sa_size equ $ - sa
; first of these wants to be socket descriptor
; we fill it in later...
connect_args dd 0, sa, sa_size
name db "www.nasm.us", 0
section .bss
%define BUFSIZ 4000h
buf resb BUFSIZ
sock_fd resd 1
request resb 100h
clname resb 80 ; this one zero terminated for resolv
clnamelen resd 1
clhostname resb 80 ;
cltail resb 80
section .text
_start:
nop
; get argc
pop eax
cmp eax, 2
je goodargc
;print usage message
jmp exit
goodargc:
pop esi ; throw away program name
pop esi ; our argument?
cmp [esi], dword "http"
je killhttp
cmp [esi], dword "HTTP"
jne nohttp
killhttp:
add esi, 4
cmp [esi], word "\\"
jne nohttp
add esi, 2
mov edi, clname
morename:
lodsb
cmp al, "/"
je foundtail
stosb
cmp al, 0
je endarg
jmp morename
foundtail:
; end clname with 0
mov [edi], byte 0
sub edi, clname
mov [clnamelen], edi
mov edi, cltail
stosb
moretail:
lodsb
stosb
cmp al, 0
je endarg2
push name
call resolv ; returns ip, big endian
add esp, 4
; call showeaxh ; just to see it
mov [sa.addr], eax
; socket
push 0
push 1
push 2
mov ecx, esp
mov ebx, 1
mov eax, 102
int 80h
add esp, 12
test eax, eax
js exit
mov [sock_fd], eax
mov [connect_args], eax
; connect
mov ecx, connect_args
mov ebx, 3 ; connect
mov eax, 102
int 80h
test eax, eax
js exit
; write
mov edx, msg_len
mov ecx, msg
mov ebx, [sock_fd]
mov eax, 4
int 80h
test eax, eax
js exit
; read
xor esi, esi ; total count read
mov edx, BUFSIZ
mov ecx, buf
mov ebx, [sock_fd]
reread:
mov eax, 3
int 80h
test eax, eax
js exit
jz goodread
sub edx, eax
jna goodread ; all we have room for
add ecx, eax
add esi, eax
jmp reread
goodread:
; write to stdout
mov ecx,buf
mov edx, esi
mov ebx, 1
mov eax, 4
int 80h
xor eax, eax
exit:
mov ebx, eax
test ebx, ebx
jns goodexit
neg ebx
goodexit:
mov eax, 1
int 80h
;------------------------------
showeaxh:
push eax
push ebx
push ecx
push edx
sub esp, 10h
mov ecx, esp
xor edx, edx
mov ebx, eax
.top:
rol ebx, 4
mov al, bl
and al, 0Fh
cmp al, 0Ah
sbb al, 69h
das
mov [ecx + edx], al
inc edx
cmp edx, 8
jnz .top
; add a linefeed
mov byte [ecx + edx], 10
inc edx
mov ebx, 1
mov eax, 4
int 80h
add esp, 10h
pop edx
pop ecx
pop ebx
pop eax
ret
;------------------------------
The formerly working parts may help you figure out how "socket" and "connect" and sys_read go together...
Best,
Frank
-
This is something different. It connects to an echo server, not an HTTP server. It has some comments and names of structures that might be helpful.
;-----------------------------
;
; nasm -f elf32 echocli.asm
; ld -o echocli echocli.o
global _start
struc sockaddr_in
.sin_family resw 1
.sin_port resw 1
.sin_addr resd 1
.sin_zero resb 8
endstruc
_ip equ 0x7F000001 ; loopback - 127.0.0.1
;_ip equ 0x48400C6B ; 72.64.12.107 (a remote host, not loopback)
_port equ 2002
; Convert numbers to network byte order
IP equ ((_ip & 0xFF000000) >> 24) | ((_ip & 0x00FF0000) >> 8) | ((_ip & 0x0000FF00) << 8) | ((_ip & 0x000000FF) << 24)
PORT equ ((_port >> 8) & 0xFF) | ((_port & 0xFF) << 8)
AF_INET equ 2
SOCK_STREAM equ 1
BUFLEN equ 0x80
STDIN equ 0
STDOUT equ 1
LF equ 10
EINTR equ 4
__NR_exit equ 1
__NR_read equ 3
__NR_write equ 4
__NR_socketcall equ 102
SYS_SOCKET equ 1
SYS_CONNECT equ 3
section .data
my_sa istruc sockaddr_in
at sockaddr_in.sin_family, dw AF_INET
at sockaddr_in.sin_port, dw PORT
at sockaddr_in.sin_addr, dd IP
at sockaddr_in.sin_zero, dd 0, 0
iend
socket_args dd AF_INET, SOCK_STREAM, 0
; first of these wants to be socket descriptor
; we fill it in later...
connect_args dd 0, my_sa, sockaddr_in_size
section .bss
my_buf resb BUFLEN
sock_desc resd 1
section .text
_start:
; socket(AF_INET, SOCK_STREAM, 0)
mov ecx, socket_args ; address of args structure
mov ebx, SYS_SOCKET ; subfunction or "command"
mov eax, __NR_socketcall ;c.f. /usr/src/linux/net/socket.c
int 80h
cmp eax, -4096
ja exit
mov [sock_desc], eax
; and fill in connect_args
mov [connect_args], eax
; connect(sock, (struct sockaddr *)&sa, sizeof(struct sockaddr))
mov ecx, connect_args
mov ebx, SYS_CONNECT ; subfunction or "command"
mov eax, __NR_socketcall
int 80h
cmp eax, -4096
ja exit
again:
push BUFLEN
push my_buf
push STDIN
call readline
add esp, 12
push eax
push my_buf
push dword [sock_desc]
call writeNbytes
add esp, 12
cmp dword [my_buf], 'quit'
jz goodexit
push BUFLEN
push my_buf
push dword [sock_desc]
call readline
add esp, 12
push eax
push my_buf
push STDOUT
call writeNbytes
add esp, 12
jmp again
goodexit:
xor eax, eax ; success
exit:
mov ebx, eax ; exitcode
neg ebx
mov eax, __NR_exit
int 80h
readline:
push ebp
mov ebp, esp
sub esp, 4
pusha
%define fd ebp + 8
%define buf ebp + 12
%define max ebp + 16
%define result ebp - 4
mov dword [result], 0
mov ebx, [fd]
mov ecx, [buf]
mov edx, [max]
.reread:
mov eax, __NR_read
int 80h
cmp eax, -EINTR
jz .reread
cmp eax, -4096
ja exit
add [result], eax
cmp byte [eax + ecx - 1], LF
jz .done
add ecx, eax
sub edx, eax
jna .done
jmp .reread
.done:
popa
mov eax, [result]
%undef fd
%undef buf
%undef max
%undef result
mov esp, ebp
pop ebp
ret
writeNbytes:
push ebp
mov ebp, esp
sub esp, 4
pusha
%define fd ebp + 8
%define buf ebp + 12
%define Nbytes ebp + 16
%define bytesleft ebp - 4
mov ebx, [fd]
mov ecx, [buf]
mov edx, [Nbytes]
mov [bytesleft], edx
.rewrite:
mov eax, __NR_write
int 80h
cmp eax, -EINTR
jz .rewrite
cmp eax, -4096
ja exit
sub [bytesleft], eax
jz .done
add ecx, eax
sub edx, eax
jna .done
jmp .rewrite
.done:
popa
mov eax, [Nbytes]
%undef fd
%undef buf
%undef Nbytes
%undef bytesleft
mov esp, ebp
pop ebp
ret
I've got an echo server to go with it...
Best,
Frank
-
I'm going to look at it more carefully tomorrow morning but for now I noticed something that sticks out to me in your code:
goodargc:
pop esi ; throw away program name
pop esi ; our argument?
cmp [esi], dword "http"
je killhttp
cmp [esi], dword "HTTP"
jne nohttp
killhttp:
add esi, 4
cmp [esi], word "\\"
jne nohttp
add esi, 2
mov edi, clname
Under killhttp:, after add esi, 4, shouldn't it be cmp [esi], dword "://"? You have two backslashes and forgot the colon, which the user would type as part of the URL. That brings up another issue: "://" is three characters but a dword compares four bytes, so the sizes don't match. How do I solve that?
Also, shouldn't "add esi, 2" be "add esi, 3" to include the colon as well? If the user typed in "http://www.url.com", this is trying to skip the "http://", which is 7 characters; add esi, 4 and add esi, 3 add up to 7.
And while I can get one more thing off my head, you declare "sock_fd resd 1" under section .bss, does this mean that the file descriptor for socket is "1" or is that just reserving 1 double word (4 bytes) for a socket file descriptor which will be determined later on.. and how are socket file descriptors determined?
-
Good Catch! Besides missing the colon, there are probably other errors. I will probably do the colon as a single byte. As you point out, 3 bytes doesn't "fit" well.
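A sketch of one way to do the parse with the colon checked as a single byte, assuming 32-bit Linux, only a lowercase "http://" prefix, and no bounds check on the 256-byte host buffer. The names host, slash, nl, print_z and print are made up. It prints the host name and the path on separate lines so the split can be checked by eye.

```nasm
; nasm -f elf32 urlsplit.asm && ld -m elf_i386 -o urlsplit urlsplit.o
bits 32
global _start

section .data
slash   db "/", 0
nl      db 10

section .bss
host    resb 256            ; not bounds-checked in this sketch

section .text
_start:
    pop eax                 ; argc
    cmp eax, 2
    jne .bad
    pop esi                 ; argv[0], discarded
    pop esi                 ; esi = argv[1]

    cmp dword [esi], "http" ; skip a leading "http://" (4 + 1 + 2 = 7 bytes)
    jne .no_scheme
    cmp byte [esi + 4], ":"
    jne .no_scheme
    cmp word [esi + 5], "//"
    jne .no_scheme
    add esi, 7
.no_scheme:
    cld
    mov edi, host
.copy_host:
    lodsb
    cmp al, "/"
    je .have_path
    test al, al
    jz .no_path
    stosb
    jmp .copy_host
.have_path:
    dec esi                 ; the path starts at the "/" itself
    jmp .finish
.no_path:
    mov esi, slash          ; nothing after the host: ask for "/"
.finish:
    mov byte [edi], 0       ; zero-terminate the host name (what resolv wants)
    mov ecx, host
    call print_z
    mov ecx, nl
    mov edx, 1
    call print
    mov ecx, esi            ; the path, still zero-terminated
    call print_z
    mov ecx, nl
    mov edx, 1
    call print
    xor ebx, ebx
    jmp .exit
.bad:
    mov ebx, 1
.exit:
    mov eax, 1              ; sys_exit
    int 0x80

; print_z: write the zero-terminated string at ecx to stdout
print_z:
    mov edx, ecx
.scan:
    cmp byte [edx], 0
    je .go
    inc edx
    jmp .scan
.go:
    sub edx, ecx            ; edx = length, then fall through into print
; print: write edx bytes at ecx to stdout
print:
    mov ebx, 1
    mov eax, 4              ; sys_write
    int 0x80
    ret
```

For example, "./urlsplit http://www.nasm.us/index.html" should print www.nasm.us on one line and /index.html on the next.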
"resd 1" reserves one dword. .bss is uninitialized, so it doesn't have a value until we put one there. Actually, the OS fills it with zeros, but I don't think we're supposed to "count on" that
Socket descriptors are returned by the "socket" command :)
This is my echo server. I post it only because it has "bind" which you were asking about.
; echo server
; echos lines until "quit"
; "kill" unloads server
; runs on loopback, 127.0.0.1, and port 2002,
; no options on this model!
;
; nasm -f elf echosrv.asm
; ld -o echosrv echosrv.o
global _start
;-------------------
; should probably be in an .inc file
struc sockaddr_in
.sin_family resw 1
.sin_port resw 1
.sin_addr resd 1
.sin_zero resb 8
endstruc
; Convert numbers (constants!) to network byte order
%define hton(x) ((x & 0xFF000000) >> 24) | ((x & 0x00FF0000) >> 8) | ((x & 0x0000FF00) << 8) | ((x & 0x000000FF) << 24)
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8)
AF_INET equ 2
SOCK_STREAM equ 1
INADDR_ANY equ 0 ; /usr/include/linux/in.h
STDIN equ 0
STDOUT equ 1
__NR_exit equ 1
__NR_read equ 3
__NR_write equ 4
__NR_close equ 6
__NR_socketcall equ 102
; commands for sys_socketcall
; /usr/include/linux/net.h
SYS_SOCKET equ 1
SYS_BIND equ 2
SYS_CONNECT equ 3
SYS_LISTEN equ 4
SYS_ACCEPT equ 5
SYS_SEND equ 9
SYS_RECV equ 10
;------------------------
_ip equ 0x7F000001 ; loopback - 127.0.0.1
_port equ 2002
; Convert 'em to network byte order
IP equ hton(_ip)
PORT equ htons(_port)
BACKLOG equ 128 ; for "listen"
BUFLEN equ 1000
section .data
my_sa istruc sockaddr_in
at sockaddr_in.sin_family, dw AF_INET
at sockaddr_in.sin_port, dw PORT
at sockaddr_in.sin_addr, dd INADDR_ANY
at sockaddr_in.sin_zero, dd 0, 0
iend
socket_args dd AF_INET, SOCK_STREAM, 0
; first of these wants to be socket descriptor
; we fill it in later...
bind_args dd 0, my_sa, sockaddr_in_size
listen_args dd 0, BACKLOG
accept_args dd 0, 0, 0
section .bss
my_buf resb BUFLEN
fd_socket resd 1
fd_conn resd 1
section .text
_start:
; socket(AF_INET, SOCK_STREAM, 0)
mov ecx, socket_args ; address of args structure
mov ebx, SYS_SOCKET ; subfunction or "command"
mov eax, __NR_socketcall ;c.f. /usr/src/linux/net/socket.c
int 80h
cmp eax, -4096
ja exit
mov [fd_socket], eax
; and fill in bind_args, etc.
mov [bind_args], eax
mov [listen_args], eax
mov [accept_args], eax
mov ecx, bind_args
mov ebx, SYS_BIND ; subfunction or "command"
mov eax, __NR_socketcall
int 80h
cmp eax, -4096
ja exit
mov ecx, listen_args
mov ebx, SYS_LISTEN ; subfunction or "command"
mov eax, __NR_socketcall
int 80h
cmp eax, -4096
ja exit
again:
mov ecx, accept_args
mov ebx, SYS_ACCEPT ; subfunction or "command"
mov eax, __NR_socketcall
int 80h
cmp eax, -4096
ja exit
mov [fd_conn], eax
readagain:
; read(sock, buf, len)
mov edx, BUFLEN ; arg 3: max count
mov ecx, my_buf ; arg 2: buffer
mov ebx, [fd_conn] ; arg 1: fd
mov eax, __NR_read ; sys_read
int 80h
cmp eax, -4096
ja exit
mov edx, eax ; length read is length to write
mov ecx, my_buf
mov ebx, [fd_conn]
mov eax, __NR_write
int 80h
cmp eax, -4096
ja exit
cmp dword [my_buf], 'quit'
jz closeconn
cmp dword [my_buf], 'kill'
jz killserver
jmp readagain
closeconn:
mov eax, __NR_close
mov ebx, [fd_conn]
int 80h
cmp eax, -4096
ja exit
jmp again
killserver:
mov eax, __NR_close
mov ebx, [fd_conn]
int 80h
mov eax, __NR_close
mov ebx, [fd_socket]
int 80h
goodexit:
xor eax, eax ; success
exit:
mov ebx, eax ; exitcode
neg ebx
mov eax, __NR_exit
int 80h
;-----------------------
A better server would fork off a child to handle the connection, while the parent went back to listening. (I think)
Best,
Frank
-
- Do I need to use "hton" because the dns.o file I was provided with has unsigned int resolv(const char *hostName) function that should already resolve hostnames:
This function uses the cdecl calling convention and returns a network byte ordered IP address of the named host or -1 if the host name can't be resolved.
- doing some research on Unix socket programming I am starting to understand more how this all works and is set up but still have some questions. So how I understand it is a socket needs to be created using the socket syscall. This sys call takes three arguments: domain/ family (AF_INET), type (SOCK_STREAM), and PROTOCOL. Protocol should be 0, but how is "AF_INET" and "SOCK_STREAM" set in assembly? And what registers should each one go in? That is my biggest headache with assembly is that there is really no clear answer as to what registers do what at a given time.
- after this socket sys call, I should receive an integer which will be the file descriptor for the socket? OK, now how do I "store" that to save later on? For something like Python it would be
socket_fd= socket(AF_INET, SOCK_STREAM, 0)
but translating this to assembly is throwing me off.
- OK now after the socket is created (and the socket file descriptor is set), I must use this file descriptor to connect and download something from the web server. Now this is where a "structure" is required? Ex:
struc sockaddr_in
.sin_family: resw 1
.sin_port: resw 1
.sin_addr: resd 1
.sin_pad: resb 8
endstruc
How I understand it:
sockaddr_in is the name of the structure.
.sin_family is the 16-bit code for AF_INET
.sin_port is the 16- bit port # (which would be port 80 for HTTP)
.sin_addr is the 32- bit IP address (that was previously resolved using the dns.o and resolv function)
*Not sure what .sin_pad resb 8 is used for
I feel really stupid because I'm sure it's been explained 1,000 times before but.. how do I get those values to send to the web server now?
To set up the initial socket sys call do I do something like:
mov eax, 102
mov ebx, 3
mov ecx, full_request
Ok now, the resource I was using to help explain this, http://www.linuxhowtos.org/C_C++/socket.htm, which is of course a C/C++ reference because there is no other resource for assembly programming on the planet besides Frank Kotler, but anyway.. it says that I then need to make a "connect" sys call. Well I don't see a sys call for "connect" (using this reference table as a guide: http://faculty.nps.edu/cseagle/assembly/sys_call.html) so, I'm lost as to how to actually do that sys call.
Then after connecting, I make "read" sys calls to read the web page into the buffer, and then "write" sys calls to write the buffer to a local file. Right?
I think I have the basic outline, but getting this all into something that the processor can execute so that it does what I want is consuming my every waking moment at this point.
-
Hi turtle13,
No, you won't need "hton" - what you get from "resolv" is already in "network byte order". That is, "big endian". We normally use "little endian" in x86. You may wish to use "htons" - the "short" (16 bit) version. The "port" - 80 for http - needs to be in "network byte order", too.
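A tiny sketch of what "network byte order" means for the port, assuming 32-bit Linux. The htons macro is the same idea as the one posted earlier, with extra parentheses added; it works on assemble-time constants only. The label port is made up.

```nasm
; nasm -f elf32 port.asm && ld -m elf_i386 -o port port.o
bits 32
global _start

; swap the two bytes of a 16-bit constant (assemble-time only)
%define htons(x) ((((x) >> 8) & 0xFF) | (((x) & 0xFF) << 8))

section .data
port    dw htons(80)        ; value 0x5000, stored in memory as bytes 0x00, 0x50

section .text
_start:
    movzx ebx, byte [port + 1]   ; second byte in memory: 0x50 = 80
    mov eax, 1                   ; sys_exit, status = that byte
    int 0x80
```

"echo $?" should show 80: the high byte of the port (0) comes first in memory and the low byte (80) second, which is the order the network expects.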
"AF_INET" and "SOCK_STREAM" can be found in the echo examples, or you can do it the crude way:
; socket
push 0 ; todo: name these!
push 1
push 2
mov ecx, esp
mov ebx, 1 ; the "socket" command
mov eax, 102 ; sys_socketcall
int 80h
add esp, 12
test eax, eax
js exit
mov [sock_fd], eax
mov [connect_args], eax
Although rather cryptic, this creates a "structure" on the stack. What goes in registers is the address of this "structure" (esp in this case), the "command" (for socket, bind, connect, etc.) and the sys_socketcall number (102). You can also see how the socket descriptor (returned in eax) is stored in variables. The C library pretends that these are separate functions, but under the hood it's the same sys_socketcall with different numbers in ebx. While I've got the socket descriptor in eax, I stuff it into the arguments we'll be using for "connect". The remainder of those arguments are the address and length of that "sockaddr_in" structure... into which we have stuffed the address we got from "resolv". The "padding" may have once been "reserved for future use", but it's just zeros. It has to be there.
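The same two calls as a complete sketch with named constants, assuming 32-bit Linux and a hard-coded 127.0.0.1:80 in place of the address "resolv" would return. Here the argument blocks live in .data rather than on the stack; the names sa, socket_args and connect_args follow the earlier posts.

```nasm
; nasm -f elf32 conn.asm && ld -m elf_i386 -o conn conn.o
bits 32
global _start

%define htons(x) ((((x) >> 8) & 0xFF) | (((x) & 0xFF) << 8))

AF_INET         equ 2
SOCK_STREAM     equ 1
SYS_SOCKET      equ 1       ; socketcall "command" numbers
SYS_CONNECT     equ 3
__NR_exit       equ 1
__NR_socketcall equ 102

section .data
sa:                         ; a sockaddr_in laid out by hand
        dw AF_INET          ; sin_family
        dw htons(80)        ; sin_port, network byte order
.addr:  db 127, 0, 0, 1     ; sin_addr; the result of resolv would be stored here
        dd 0, 0             ; sin_zero padding
sa_size equ $ - sa          ; 16

socket_args  dd AF_INET, SOCK_STREAM, 0
connect_args dd 0, sa, sa_size      ; first dword = descriptor, filled in below

section .text
_start:
    mov ecx, socket_args    ; ecx = address of the argument block
    mov ebx, SYS_SOCKET
    mov eax, __NR_socketcall
    int 0x80
    cmp eax, -4096          ; -1 .. -4095 means -errno
    ja .exit
    mov [connect_args], eax ; "saving" the descriptor is just storing eax

    mov ecx, connect_args
    mov ebx, SYS_CONNECT
    mov eax, __NR_socketcall
    int 0x80
    cmp eax, -4096
    ja .exit
    xor eax, eax            ; connected
.exit:
    mov ebx, eax
    neg ebx                 ; exit status = errno, or 0 on success
    mov eax, __NR_exit
    int 0x80
```

The exit status depends on the machine: 0 if something is listening on local port 80, otherwise an errno value such as 111 (ECONNREFUSED).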
After the "connect", we first write to the socket - our "request". Then we read what it sends us. Besides the page we requested, I find that this might be "bad request" or perhaps "forbidden".
I'm still having a lot of trouble getting from the command line to the request. The command line argument is zero-terminated. We want the "hostname" that we send to "resolv" to be zero terminated, but we don't want the zero in the "request". I think that's where I'm going wrong...
One thing to keep in mind is that Nasm's "struc" keyword is just a "typedef". It does not create an "instance" of your structure or reserve any memory. For that, you need "istruc". See the Friendly Manual or there's an example in the "echo" code. Fairly ugly, IMO. You do not need to use this. If you put a few variables together, one after another, it's a "structure" whether we call it that or not. "struc" and "istruc" just help you keep 'em together.
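A short sketch of the struc / istruc difference, assuming 32-bit Linux. The field names follow the echo examples; the value stored into sin_addr stands in for what resolv would return.

```nasm
; nasm -f elf32 sa.asm && ld -m elf_i386 -o sa sa.o
bits 32
global _start

struc sockaddr_in            ; layout only: defines offsets, reserves nothing
    .sin_family resw 1       ; offset 0
    .sin_port   resw 1       ; offset 2
    .sin_addr   resd 1       ; offset 4
    .sin_zero   resb 8       ; offset 8
endstruc                     ; NASM also defines sockaddr_in_size (= 16)

section .data
my_sa istruc sockaddr_in     ; an actual instance: 16 bytes in .data
    at sockaddr_in.sin_family, dw 2        ; AF_INET
    at sockaddr_in.sin_port,   dw 0x5000   ; port 80 in network byte order
    at sockaddr_in.sin_addr,   dd 0        ; filled in at run time
    at sockaddr_in.sin_zero,   dd 0, 0
iend

section .text
_start:
    ; store an address the way resolv hands it back (bytes 127, 0, 0, 1 in memory)
    mov dword [my_sa + sockaddr_in.sin_addr], 0x0100007F
    mov ebx, sockaddr_in_size    ; 2 + 2 + 4 + 8 = 16
    mov eax, 1                   ; sys_exit
    int 0x80
```

"echo $?" should show 16, the structure size that goes into the connect (or bind) arguments.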
Best,
Frank
-
This man page for Linux "socketcall" lists the various options that can go into "call" type (which would be what is placed in ebx for the syscall)
It looks like I would need to use "socket" (ebx= 1) and connect (ebx=3) , are there any others: send and receive maybe? And would ebx be the number of the socket commands in order (ex. would "send" ebx= 9 because it is the 9th one down the list?)
http://man7.org/linux/man-pages/man2/socketcall.2.html
- I think I may be having a moment of clarity here, below your sample code:
; socket
push 0 ; todo: name these!
push 1
push 2
mov ecx, esp
mov ebx, 1 ; the "socket" command
mov eax, 102 ; sys_socketcall
int 80h
add esp, 12
test eax, eax
js exit
mov [sock_fd], eax
mov [connect_args], eax
So when mov ecx, esp happens, that is actually putting the three args pushed before it (2 [AF_INET], 1 [SOCK_STREAM], 0 [protocol type]) into the ecx register so that those can be used when invoking the socketcall syscall, that is, when eax= 102 and ebx=1 (1= sys_socket command).
If that's true then I think that helps to clear a lot up about where these arguments are actually coming from. In a sense it seems like ecx can actually hold more than one "thing" at a time, if this makes sense. Then after the socketcall, the stack is "erased" three memory slots so the arguments 0, 1, 2 that were put there are now gone?
- At this point I'm trying to figure out how to convert the hostname from the command line using the dns.o "resolv" to an IP address that I can start to use in these socket sys calls. Nothing is easy :-\
-
The code you mention is part of a work-in-progress. It currently looks like:
; socket
push 0 ; protocol (0 = default; INADDR_ANY is also 0 but names an address, not a protocol)
push SOCK_STREAM
push AF_INET
mov ecx, esp ; structure's on stack
mov ebx, SYS_SOCKET ;
mov eax, __NR_socketcall
int 80h
add esp, 12 ; remove structure
test eax, eax
js exit
mov [sock_fd], eax
mov [connect_args], eax
What goes in ecx is not really "several things" but the address of several things - "*args" in C terms.
When we "remove" these items from the stack, they aren't really "gone" but the stack pointer (esp) has been moved above them so they'll be overwritten by any subsequent pushes or calls.
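A small sketch of both points, assuming 32-bit Linux and that nothing else touches the stack in between (no pushes, calls or signal handlers). It makes no system call except exit.

```nasm
; nasm -f elf32 args.asm && ld -m elf_i386 -o args args.o
bits 32
global _start

section .text
_start:
    push 0                  ; third argument
    push 1                  ; second argument
    push 2                  ; first argument, at the lowest address
    mov ecx, esp            ; ecx = address of the first argument, nothing more

    mov ebx, [ecx]          ; 2
    add ebx, [ecx + 4]      ; + 1
    add ebx, [ecx + 8]      ; + 0  -> ebx = 3

    add esp, 12             ; "remove" them: only esp moves
    add ebx, [ecx]          ; the old bytes are still there (2) until overwritten
                            ; -> ebx = 5
    mov eax, 1              ; sys_exit
    int 0x80
```

The exit status should be 5: 3 from reading the three arguments through ecx, plus the 2 that is still readable after esp has been moved past it.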
Your link to the man pages knows a lot more about it than I do, so go by what they say, not what I say.
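On the question of the command numbers: going by the constants in the echo server above and the kernel's linux/net.h, "send" is 9 and "recv" is 10. A sketch that exercises them without a network, assuming 32-bit Linux: it uses socketpair (command 8) with AF_UNIX to get two connected descriptors, sends on one and receives on the other. The names fds, pair_args, send_args and recv_args are made up.

```nasm
; nasm -f elf32 sendrecv.asm && ld -m elf_i386 -o sendrecv sendrecv.o
bits 32
global _start

AF_UNIX         equ 1
SOCK_STREAM     equ 1
SYS_SOCKETPAIR  equ 8       ; socketcall command numbers
SYS_SEND        equ 9
SYS_RECV        equ 10
__NR_socketcall equ 102

section .data
msg       db "hello", 10
msg_len   equ $ - msg
pair_args dd AF_UNIX, SOCK_STREAM, 0, fds
send_args dd 0, msg, msg_len, 0     ; descriptor filled in below; last dword = flags
recv_args dd 0, buf, 64, 0

section .bss
fds       resd 2            ; socketpair stores two descriptors here
buf       resb 64

section .text
_start:
    mov ecx, pair_args
    mov ebx, SYS_SOCKETPAIR
    mov eax, __NR_socketcall
    int 0x80
    cmp eax, -4096
    ja .exit

    mov eax, [fds]
    mov [send_args], eax    ; send on the first descriptor
    mov eax, [fds + 4]
    mov [recv_args], eax    ; receive on the second

    mov ecx, send_args
    mov ebx, SYS_SEND
    mov eax, __NR_socketcall
    int 0x80
    cmp eax, -4096
    ja .exit

    mov ecx, recv_args
    mov ebx, SYS_RECV
    mov eax, __NR_socketcall
    int 0x80
    cmp eax, -4096
    ja .exit

    mov edx, eax            ; number of bytes received
    mov ecx, buf
    mov ebx, 1              ; stdout
    mov eax, 4              ; sys_write
    int 0x80
    xor eax, eax
.exit:
    mov ebx, eax
    neg ebx                 ; exit status = errno, or 0 on success
    mov eax, 1              ; sys_exit
    int 0x80
```

It should print "hello". With a flags value of 0, send and recv behave like write and read on the socket, which is why plain sys_write and sys_read also work in the HTTP client.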
I think that code is finally working. Besides the error you pointed out to me, there were others. I get a 404 when I ask for "/index" from some sites - good(?) content from others. Haven't got a "bad request" or "request timeout" for a while. Change is not always progress, but I think it's coming along...
Best,
Frank
-
Here is what I have so far.. there are lots of gaps so at this point want to get these filled in. Goal is to get this entire thing complete and running by midnight tonight
bits 32
global _start
extern resolv ; resolv is defined in dns.o, so it is imported here, not exported
; creating the initial socket syscall to initiate a socket:
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
buflen equ 1000
section .data
socket_args dd AF_INET, SOCK_STREAM, 0
section .text
_start:
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; ???
; now make a connection with the created socket
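push addr_len ; must be the size of the sockaddr_in structure (16)
push ip_addr ; must be the address of a filled-in sockaddr_in, not a bare IP address
push dword [sock_fd] ; the descriptor's value, not the address of the variable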
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl exit
; now start sending stuff
; now read into the buffer
readwrite:
mov edx, buflen ; edx= sys read size_t buflen= max. 1000 bytes
mov ecx, buffer ; address of the buffer to read into (char *)
mov ebx, [sock_fd] ; int: the connected socket (connect does not return a new descriptor)
mov eax, 3 ; read sys call
int 0x80
cmp eax, 0 ; returns the number of bytes read
jl exit ; exit on error (negative return value)
je done ; 0 bytes read = EOF: the server closed the connection
; now write the length of what was read
mov edx, eax ; the size_t count of bytes read moved into edx
mov ecx, buffer ; the characters to write are stored in "buffer"
mov ebx, 1 ; file descriptor to write to (stdout here, not the socket)
mov eax, 4 ; write sys call
int 0x80
cmp eax, 0 ; return value= bytes written, if less than 0, error occurred
jl exit
jmp readwrite ; continue reading/ writing until EOF
done:
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
mov eax, 1 ; exit sys call
int 0x80
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; reserve 1000 bytes for 'buffer'
A few questions on your previous server/bind code:
You use:
my_sa istruc sockaddr_in
at sockaddr_in.sin_family, dw AF_INET
at sockaddr_in.sin_port, dw PORT
at sockaddr_in.sin_addr, dd INADDR_ANY
at sockaddr_in.sin_zero, dd 0, 0
iend
My professor provided us with sample code:
struc sockaddr_in
.sin_family: resw 1
.sin_port: resw 1
.sin_addr: resd 1
.sin_pad: resb 8
endstruc
Are these the same/interchangeable? If not, what is going on and what is the difference?
What is this for and how does it work:
BACKLOG equ 128 ;for listen
For code below:
mov [fd_socket], eax
; and fill in bind_args, etc.
mov [bind_args], eax
mov [listen_args], eax
mov [accept_args], eax
are fd_socket, bind_args, listen_args, and accept_args all going to have the same value (whatever is in eax) or does the value in eax change each time something is moved?
You declared in section .data:
bind_args dd 0, my_sa, sockaddr_in_size
and then there is a socket call:
mov ecx, bind_args
mov ebx, SYS_BIND ; subfunction or "command"
mov eax, __NR_socketcall
int 80h
What is the value of sockaddr_in_size and what are values of my_sa:
my_sa istruc sockaddr_in
at sockaddr_in.sin_family, dw AF_INET
at sockaddr_in.sin_port, dw PORT
at sockaddr_in.sin_addr, dd INADDR_ANY
at sockaddr_in.sin_zero, dd 0, 0
iend
I'm not making a "connection" as to how those values get filled up or how those are being stored from the previous socket call.
I notice you do this a lot:
cmp eax, -4096
ja exit
Is there any significance to -4096 or does this only pertain to the server side of things?
Are "listen" and "accept" required on the client (my) side?
Can you help explain what's going on here:
section .bss
my_buf resb BUFLEN
and
BUFLEN equ 1000
Why is BUFLEN used sometimes instead of my_buf?
For example:
; read(sock, buf, len)
mov edx, BUFLEN ; arg 3: max count
mov ecx, my_buf ; arg 2: buffer
mov ebx, [fd_conn] ; arg 1: fd
mov eax, __NR_read ; sys_read
int 80h
All the pieces coming together...
-
First some comments on your code-so-far. I don't think "global resolv" is correct. I think you want "extern resolv". "global" tells Nasm to tell the linker, "this function/data is here, if anybody is looking for it". "extern" tells Nasm to tell the linker, "we use this but it isn't here. Please find it for us".
At your comment "now start sending stuff" you want to send the "request". Then you read what the server sends us into a buffer. Then you appear to write it back to the socket. I think you want to write it to stdout, or to a disk file. Other than that, it looks like you're on the right track.
"struc" is a typedef. "istruc" creates an instance of the structure. There's a small error in my code there. At sin.addr I've got INADDR_ANY. We want a zero there - to be filled in with the address we get from "resolv" or "localhost" in the case of "echo". It is not the "protocol". I don't know how I got away with that. Nasm calculates sockaddr_in_size from the "struc" typedef. It is the size of the structure(!). "my_sa" is the address of my instance of the structure.
"BACKLOG" is just an argument to "listen". You don't need it (or "bind" or "accept") for a client.
After the "socket" call, eax contains the descriptor, which is the first argument to several other calls. I'm just filling them in while I've got the value handy. The value in eax does not change.
"-4096" is a Linux thing, not limited to socket programming. A failed system call returns a negated errno in eax, and only the range -4095 to -1 is used for that. I often use the "quick and dirty" js test instead. Strictly speaking, there are negative numbers which are not an error. We might encounter them in allocating memory or... I dunno, perhaps other things. You may notice that I use a negative number with an unsigned condition code (ja). C would complain, but I know what I mean. It's just a more accurate way to identify what's an error. js, or jl after a compare with 0, should work for anything we encounter here.
BUFLEN is "EQU"ated to a constant value, 1000 in this case. "my_buf" is the address of the buffer of that size.
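Coming back to the -4096 test, here is a minimal sketch of it in isolation. The path is made up and assumed not to exist, so the open should fail; nothing here is socket-specific.

```nasm
; nasm -f elf32 errcheck.asm -o errcheck.o
; ld -m elf_i386 errcheck.o -o errcheck
; ./errcheck ; echo $?
bits 32
global _start

section .data
path db "/no/such/file", 0      ; hypothetical path, assumed not to exist

section .text
_start:
    mov eax, 5                  ; sys_open
    mov ebx, path
    xor ecx, ecx                ; O_RDONLY
    xor edx, edx                ; mode is ignored without O_CREAT
    int 0x80
    cmp eax, -4096              ; -4096 is 0xFFFFF000 when treated as unsigned
    ja failed                   ; "above" catches only -4095..-1, the errno range
    xor ebx, ebx                ; success: exit status 0
    jmp quit
failed:
    mov ebx, eax
    neg ebx                     ; turn -errno into a positive number for "echo $?"
quit:
    mov eax, 1                  ; sys_exit
    int 0x80
```

If the path really is missing, "echo $?" would typically show 2 (ENOENT). Any large unsigned return value below 0xFFFFF001, such as a high memory address, is treated as success by this test, which is the point of using ja rather than js.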
If I've missed any questions, ask again...
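Going back to the "struc" versus "istruc" question, a rough sketch may help. The port value and the loopback address are invented for the example; the layout is the one from your professor's sample.

```nasm
; nasm -f elf32 strucdemo.asm -o strucdemo.o
; ld -m elf_i386 strucdemo.o -o strucdemo
; ./strucdemo ; echo $?
bits 32
global _start

AF_INET equ 2
PORT    equ 0x901F              ; hypothetical: port 8080 (0x1F90) with its bytes swapped

; "struc" only describes the layout. It reserves no memory, but Nasm
; defines sockaddr_in_size for us (2 + 2 + 4 + 8 bytes).
struc sockaddr_in
    .sin_family: resw 1
    .sin_port:   resw 1
    .sin_addr:   resd 1
    .sin_zero:   resb 8
endstruc

section .data
; "istruc" emits one actual instance of that layout into .data.
my_sa:
    istruc sockaddr_in
        at sockaddr_in.sin_family, dw AF_INET
        at sockaddr_in.sin_port,   dw PORT
        at sockaddr_in.sin_addr,   dd 0         ; filled in at run time
        at sockaddr_in.sin_zero,   dd 0, 0
    iend

section .text
_start:
    ; fill in the address field, as we would with the value resolv returns
    mov dword [my_sa + sockaddr_in.sin_addr], 0x0100007F  ; 127.0.0.1, network byte order
    mov ebx, sockaddr_in_size   ; exit status = size of the structure
    mov eax, 1                  ; sys_exit
    int 0x80
```

"echo $?" should print 16, which is the value that ends up in the third slot of bind_args or connect_args.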
Best,
Frank
-
Which file descriptor do I use with write to write to a local file?
I imagine the file first has to be "created".. do I use a sys call for create such that
mov eax, 8
mov ebx, pathname
mov ecx, 3 (mode, write)
int 0x80
eax would now equal the file descriptor to write to for subsequent writes
but.. how do we specify a pathname in nasm?
I am going by this linux man page for "open/ create": http://man7.org/linux/man-pages/man2/creat.2.html
-
The descriptor you got from sys_open, I guess.
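As for the pathname, it is just a zero-terminated string somewhere in memory, and sys_open wants its address in ebx. A minimal sketch, assuming a made-up file name "page.html" in the current directory and the usual i386 flag values:

```nasm
; nasm -f elf32 filedemo.asm -o filedemo.o
; ld -m elf_i386 filedemo.o -o filedemo
bits 32
global _start

SYS_EXIT  equ 1
SYS_WRITE equ 4
SYS_OPEN  equ 5
SYS_CLOSE equ 6
O_WRONLY  equ 0x001
O_CREAT   equ 0x040             ; octal 0100
O_TRUNC   equ 0x200             ; octal 01000

section .data
pathname db "page.html", 0      ; hypothetical file name, zero-terminated
text     db "hello, file", 10
text_len equ $ - text

section .bss
file_fd resd 1

section .text
_start:
    mov eax, SYS_OPEN
    mov ebx, pathname           ; address of the zero-terminated name
    mov ecx, O_WRONLY | O_CREAT | O_TRUNC
    mov edx, 0x1A4              ; permissions, octal 0644 (rw-r--r--)
    int 0x80
    test eax, eax
    js exit                     ; negative: error
    mov [file_fd], eax          ; the descriptor comes back in eax

    mov eax, SYS_WRITE
    mov ebx, [file_fd]          ; write to the file, not the socket
    mov ecx, text
    mov edx, text_len
    int 0x80
    test eax, eax
    js exit

    mov eax, SYS_CLOSE
    mov ebx, [file_fd]
    int 0x80
    xor eax, eax                ; report success
exit:
    mov ebx, eax
    neg ebx                     ; 0 on success, positive errno on failure
    mov eax, SYS_EXIT
    int 0x80
```

Note that the second argument to open is the access flags and the third is the permission bits; with sys_creat (8) there are no flags and ecx holds the permissions directly.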
Best,
Frank
-
I notice you declare sizes for multiple things on the same lines:
bind_args dd 0, my_sa, sockaddr_in_size
Does this mean that bind_args is of size "double word" (32- bits/ 4 bytes) and that each "argument" such as 0, my_sa, and sockaddr_in_size are also 32- bits/ 4 bytes each? So bind_args would end up being a total of 12 bytes?
And what does "nop" do right after your start?
-
Yes, each argument is 32 bits. If they were different sizes, we would want to put them on different lines.
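A quick way to convince yourself of the 12 bytes is to let Nasm do the arithmetic. In this sketch my_sa is just 16 zero bytes standing in for the real structure, and bind_args_size is a name invented for the example:

```nasm
; nasm -f elf32 ddsize.asm -o ddsize.o
; ld -m elf_i386 ddsize.o -o ddsize
; ./ddsize ; echo $?
bits 32
global _start

section .data
my_sa:          times 16 db 0           ; placeholder for a sockaddr_in instance
my_sa_size      equ $ - my_sa

bind_args       dd 0, my_sa, my_sa_size ; three dwords, one after another
bind_args_size  equ $ - bind_args       ; 3 * 4 bytes

section .text
_start:
    mov ebx, bind_args_size     ; exit status = size of the argument block
    mov eax, 1                  ; sys_exit
    int 0x80
```

"echo $?" should print 12.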
The "nop" is sometimes commented as "parking place for GDB". GDB seems happier if it has a single-byte instruction to replace with the single-byte single step interrupt opcode (0xCC or whatever it is). It also seems happier if you assemble with -F dwarf. I find GDB "unfriendly" and avoid it if I can. You should learn to use it!
Best,
Frank
-
I'm working at parsing the hostname from the command line, here is what I have going so far:
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 150 ; 150 bytes (characters) reserved for URL typed at command line
cl_name_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
section .text
_start:
; set up and parse from command line for HTTP GET request
; nop?
pop eax ; pops esp (top of stack [argc]) into eax
cmp eax, 2 ; were only 2 args typed in command line?
je good_argc ; if yes, away we go
jmp error_exit ; buh bye
; start parsing the URL
good_argc:
pop esi ; pop argv[0] (program name) throw it away- don't need it
pop esi ; pop argv[1] (URL) into esi- this is what we need
cmp [esi], dword "http" ; does URL start with http?
je x_http ; if so, parse it out
cmp [esi], dword "HTTP" ; same thing but user typed all caps
jne get_hostname ; if none of the above conditions apply, no need to parse out http://
x_http:
add esi, 4 ; jump to next character after http
cmp [esi], byte ":" ; should be :
jne error_exit ; user typed in the url incorrectly
add esi, 1 ; proceed to character after :
cmp [esi], word "//" ; // comes after :
jne error_exit ; user typed in url incorrectly
add esi, 2 ; now we are at the beginning of hostname
get_hostname:
mov cl_hostname, [esi] ; store character in variable 'cl_hostname'
cmp [esi], byte "/" ; check for end of hostname
je found_hostname
inc esi ; inc to next character to be stored into 'cl_hostname'
jmp get_hostname
found_hostname:
; now start parsing the pathname
exit:
mov ebx, eax ; put exit code into ebx
mov eax, 1 ; exit sys call
int 0x80
error_exit:
xor ebx, ebx
mov ebx, -1 ; produce -1 exit error code
mov eax, 1 ; exit sys call
int 0x80
yay or nay?
-
Nasm says nay.
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 150 ; 150 bytes (characters) reserved for URL typed at command line
cl_name_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
section .text
_start:
; set up and parse from command line for HTTP GET request
; nop?
pop eax ; pops esp (top of stack [argc]) into eax
cmp eax, 2 ; were only 2 args typed in command line?
je good_argc ; if yes, away we go
mov eax, -1 ; to produce an error code upon exiting
jmp error_exit ; buh bye
; start parsing the URL
good_argc:
pop esi ; pop argv[0] (program name) throw it away- don't need it
pop esi ; pop argv[1] (URL) into esi- this is what we need
cmp [esi], dword "http" ; does URL start with http?
je x_http ; if so, parse it out
cmp [esi], dword "HTTP" ; same thing but user typed all caps
jne get_hostname ; if none of the above conditions apply, no need to parse out http://
x_http:
add esi, 4 ; jump to next character after http
cmp [esi], byte ":" ; should be :
jne error_exit ; user typed in the url incorrectly
add esi, 1 ; proceed to character after :
cmp [esi], word "//" ; // comes after :
jne error_exit ; user typed in url incorrectly
add esi, 2 ; now we are at the beginning of hostname
get_hostname:
BZZZZZT!
mov cl_hostname, [esi] ; store character in variable 'cl_hostname'
cl_hostname is the address of the variable. It is an immediate number. You're trying to do "mov 6, 9" (reference to an old Jimi Hendrix tune).
[cl_hostname] would be the contents, which we could put something into. But... we can't do a memory to memory move. The movsb instruction is an exception, and might be an option here. Otherwise we have to use a register (8-bit). al will do...
mov al, [esi]
mov [cl_hostname], al
Then inc esi? The load plus the increment is also known as lodsb.
cmp [esi], byte "/" ; check for end of hostname
je found_hostname
inc esi ; inc to next character to be stored into 'cl_hostname'
jmp get_hostname
found_hostname:
; now start parsing the pathname
Nasm has other complaints, but that one's a flat out error.
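Putting those pieces together, a minimal sketch of the copy loop might look like this. It uses a hard-coded URL in place of argv[1], and lodsb/stosb for the load-and-increment and store-and-increment; the labels url and host_len are invented for the example.

```nasm
; nasm -f elf32 hostcopy.asm -o hostcopy.o
; ld -m elf_i386 hostcopy.o -o hostcopy
bits 32
global _start

section .data
url db "www.example.com/index.html", 0  ; stands in for argv[1]

section .bss
cl_hostname resb 50
host_len    resd 1

section .text
_start:
    cld                         ; string instructions count upward
    mov esi, url                ; source: the URL
    mov edi, cl_hostname        ; destination buffer
    mov ecx, 49                 ; leave room for the terminating zero
copy_host:
    lodsb                       ; al = [esi], then esi = esi + 1
    cmp al, "/"                 ; end of hostname?
    je done
    test al, al                 ; end of string (no path given)?
    jz done
    stosb                       ; [edi] = al, then edi = edi + 1
    dec ecx
    jnz copy_host               ; stop if the buffer is full
done:
    mov byte [edi], 0           ; zero-terminate, as resolv expects
    sub edi, cl_hostname        ; edi = number of characters copied
    mov [host_len], edi

    ; print the hostname so we can see the result
    mov edx, edi                ; length
    mov ecx, cl_hostname
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 0x80

    xor ebx, ebx
    mov eax, 1                  ; sys_exit
    int 0x80
```

After the loop esi points just past the "/" (or past the terminating zero), so the path, if any, starts at esi - 1.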
I think I mentioned that I had a heck of a time with this part of the code. I started out thinking I needed two copies of hostname - one zero terminated for resolv and one CR, LF terminated for the request. No, I can put the CR, LF in the "hard coded" part of the request. I was and am parsing the command line by copying parts to two intermediate variables in .bss. It occurs to me that I don't have to move the command line at all. I could just find the start address and length of the hostname and the "tail" if any. Then I could copy them directly into the request. I don't think I'd want to try to write them to the socket directly from there. Just thinking out loud... I'll probably keep the intermediate copy...
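The "find the start address and length" idea can be sketched without any copying at all. This version assumes the user types a bare hostname, optionally followed by a path, with no "http://" in front; it writes the hostname part of argv[1] straight to stdout from where it sits on the stack.

```nasm
; nasm -f elf32 hostlen.asm -o hostlen.o
; ld -m elf_i386 hostlen.o -o hostlen
; ./hostlen www.example.com/index.html
bits 32
global _start

section .text
_start:
    pop eax                     ; argc
    cmp eax, 2
    jne usage
    pop esi                     ; argv[0], not needed
    pop esi                     ; argv[1]: start of the hostname
    mov ecx, esi                ; remember the start address
scan:
    mov al, [esi]
    cmp al, "/"                 ; hostname ends at the first slash...
    je found
    test al, al                 ; ...or at the end of the string
    jz found
    inc esi
    jmp scan
found:
    mov edx, esi
    sub edx, ecx                ; length = end - start
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write(1, start, length)
    int 0x80
    xor ebx, ebx                ; exit status 0
    jmp quit
usage:
    mov ebx, 1                  ; wrong number of arguments
quit:
    mov eax, 1                  ; sys_exit
    int 0x80
```

The catch is that resolv wants a zero-terminated name, so a hostname followed by a path still needs either a copy or a temporary zero poked over the "/".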
My suggestion would be to keep it as simple as you can. I haven't written the "usage" message yet. My thought would be to not tell 'em http at all. I've got http and HTTP covered, but if the pesky user goes Http or https I'm still toast... Since you and I and maybe your prof are probably going to be the only users we can be well behaved, I suppose.
Code on, Turtle13!
Best,
Frank
-
^If I want to use movsb:
get_hostname:
inc cl_hostname_len ; variable for number of characters in hostname
;mov cl_hostname, [esi] ; store character in variable 'cl_hostname'
cmp [esi], byte "/" ; check for end of hostname
je copy_hostname
cmp [esi], 0 ; because user can type in just a host with no path, need to return /index.html
je copy_hostname_only
inc esi ; inc to next character to be stored into 'cl_hostname'
jmp get_hostname
copy_hostname:
; try movsb?
cld ; clear direction flag
mov ecx, [cl_hostname_len] ; the count= number of characters to copy into cl_hostname
mov edi, cl_hostname
mov esi, esi
rep movsb ; move the string
?
To resolve hostname- to- IP (assuming that I already parsed the hostname correctly):
; resolve DNS hostname to IP address
push cl_hostname
call resolv
add esp, 4
mov [ip_hostname], ebx
...
section .bss
ip_hostname resd 1 ; for storing the resolved hostname- to- IP address
?
-
Almost 2am here and going to bed but in case you have nothing better to do on a Friday morning when you wake up before I wake up, here is some of my code for socket call stuff:
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
SYS_SEND equ 9
SYS_RECV equ 10
buflen equ 1000
; structure for sockaddr_in
struc sockaddr_in
.sin_family: resw 1 ; address family AF_INET 16- bits
.sin_port: resw 1 ; port # 16- bits
.sin_addr: resd 1 ; 32- bit IP address of web server
.sin_pad: resb 8 ; 8 bytes of padding
endstruc
section .data
socket_args dd AF_INET, SOCK_STREAM, 0
addr_len dw 1 ; 32- bit IP address
section .text
...
; set up and create a socket file descriptor
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; this will need the socket fd as an argument also
; now make a connection with the created socket
push addr_len ; 32- bit IP address
push ip_hostname ; resolved hostname- to- IP address
push sock_fd ; identifies the socket
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl exit
...
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
mov eax, 1 ; exit sys call
int 0x80
error_exit:
xor ebx, ebx
mov ebx, -1 ; produce -1 exit error code
mov eax, 1 ; exit sys call
int 0x80
...
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 300 ; 300 bytes (characters) reserved for URL typed at command line
cl_path_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL's path
cl_filename_len resd 1 ; 4 bytes to store length of the URL's filename
cl_filename resb 50 ; 50 bytes (characters) reserved for filename to download
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_hostname_len resd 1 ; 4 bytes to store length of the hostname
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
pathname resb 100 ; 100 bytes (characters) for the parsed pathname
ip_hostname resd 1 ; for storing the resolved hostname- to- IP address
-
Ahhh, had a reply typed in and then lost it. Hate it when that happens! [cl_hostname]! And put esi back to where you want movsb to start.
G'night!
Frank
Oh, and resolv returns answer or -1 in eax, not ebx!
-
Usual confusion between address and [contents]!
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
SYS_SEND equ 9
SYS_RECV equ 10
buflen equ 1000
; structure for sockaddr_in
struc sockaddr_in
.sin_family: resw 1 ; address family AF_INET 16- bits
.sin_port: resw 1 ; port # 16- bits
.sin_addr: resd 1 ; 32- bit IP address of web server
.sin_pad: resb 8 ; 8 bytes of padding
endstruc
section .data
socket_args dd AF_INET, SOCK_STREAM, 0
addr_len dw 1 ; 32- bit IP address
section .text
...
; set up and create a socket file descriptor
; here you're pushing constants - that's fine
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; this will need the socket fd as an argument also
; now make a connection with the created socket
; you're pushing addresses of variables here!
; probably want [contents]
; you'll have to say "dword".
push addr_len ; 32- bit IP address
push ip_hostname ; resolved hostname- to- IP address
push sock_fd ; identifies the socket
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl exit
...
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
mov eax, 1 ; exit sys call
int 0x80
error_exit:
xor ebx, ebx
mov ebx, -1 ; produce -1 exit error code
mov eax, 1 ; exit sys call
int 0x80
...
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 300 ; 300 bytes (characters) reserved for URL typed at command line
cl_path_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL's path
cl_filename_len resd 1 ; 4 bytes to store length of the URL's filename
cl_filename resb 50 ; 50 bytes (characters) reserved for filename to download
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_hostname_len resd 1 ; 4 bytes to store length of the hostname
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
pathname resb 100 ; 100 bytes (characters) for the parsed pathname
ip_hostname resd 1 ; for storing the resolved hostname- to- IP address
Past my bedtime...
Best,
Frank
-
Well... another day...
; now make a connection with the created socket
push addr_len ; 32- bit IP address
push ip_hostname ; resolved hostname- to- IP address
push sock_fd ; identifies the socket
What I've got for connect arguments (and it works) is the descriptor ([contents] !), the address of a sockaddr_in structure... this is not just the IP returned from resolv... and the length of that structure. You've got the "typedef" of the structure, but I don't see an instance of it.
This is what I've got:
sa:
dw AF_INET ; family
dw htons (80) ; port
.addr dd 0 ; fill in later
dd 0, 0 ; zero
sa_size equ $ - sa
; first of these wants to be socket descriptor
; we fill it in later...
connect_args dd 0, sa, sa_size
As you can see, I don't use the "struc" or "istruc" keywords, but it generates exactly the same code. Note that "family" and "port" are words, the IP address is a dword, and there are two dwords of padding at the end. "port" needs to be in "network byte order". Apparently "family" does not - I don't know why not. The IP address is already in network byte order. I don't put it on the stack, but that shouldn't matter.
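For what "network byte order" means for the port, here is a small sketch. The htons macro is written out here as an assumption about what the one used in this thread does: it simply swaps the two bytes of a 16-bit constant at assembly time.

```nasm
; nasm -f elf32 byteorder.asm -o byteorder.o
; ld -m elf_i386 byteorder.o -o byteorder
; ./byteorder ; echo $?
bits 32
global _start

; swap the two bytes of a 16-bit constant at assembly time
%define htons(x) ((((x) >> 8) & 0xFF) | (((x) & 0xFF) << 8))

section .data
port dw htons(80)               ; 0x5000 as a word, stored as bytes 0x00, 0x50

section .text
_start:
    ; On the wire the high-order byte goes first. x86 stores the low byte
    ; of a word first, so without the swap the bytes would be 0x50, 0x00.
    movzx ebx, byte [port + 1]  ; second byte in memory
    mov eax, 1                  ; sys_exit
    int 0x80
```

"echo $?" should print 80, showing that the 0x50 is the second byte in memory. As far as I can tell, "family" gets away without the swap because it is only read by the local kernel and is never sent to the other machine.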
If you can try out the socket code with a hard-coded request - separate from parsing the command line - that would be a good thing.
Best,
Frank
-
Trying to hard code a URL to test socket call and it's returning "255" (-1) so at least I know that my error_exit is working (or not)
; nasm -f elf32 -g sockettest.asm -o sockettest.o
; ld -m elf_i386 sockettest.o dns.o -o sockettest
bits 32
global _start
extern resolv
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
SYS_SEND equ 9
SYS_RECV equ 10
buflen equ 1000
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8) ; for converting port number to network byte order
; structure for sockaddr_in
struc sockaddr_in
.sin_family: resw 1 ; address family AF_INET 16- bits
.sin_port: resw 1 ; port # 16- bits
.sin_addr: resd 1 ; 32- bit IP address of web server
.sin_pad: resb 8 ; 8 bytes of padding
endstruc
section .data
socket_args dd AF_INET, SOCK_STREAM, 0 ; each arg is given 4 bytes
addr_len dd 1 ; 32- bit IP address
; instance of sockaddr_in structure
sock_addr_inst:
dw AF_INET ; family
dw htons (80) ; network byte order converted HTTP port number
.addr dd 0 ; fill in later- IP address
dd 0, 0 ; 8 bytes of padding
sock_addr_inst_size equ $ - sock_addr_inst
connect_args dd 0, sock_addr_inst, sock_addr_inst_size
section .text
_start:
push "www.google.com"
call resolv ; resolve to www.google.com IP address
add esp, 4 ; clean up stack
mov [ip_hostname], eax
; set up and create a socket file descriptor
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl error_exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; this will need the socket fd as an argument also
; now make a connection with the created socket
push dword [addr_len] ; 32- bit IP address
push dword [ip_hostname] ; resolved hostname- to- IP address
push dword [sock_fd] ; identifies the socket
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl error_exit
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
mov eax, 1 ; exit sys call
int 0x80
error_exit:
xor ebx, ebx
mov ebx, -1 ; produce -1 exit error code
mov eax, 1 ; exit sys call
int 0x80
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 300 ; 300 bytes (characters) reserved for URL typed at command line
cl_path_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL's path
cl_filename_len resd 1 ; 4 bytes to store length of the URL's filename
cl_filename resb 50 ; 50 bytes (characters) reserved for filename to download
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_hostname_len resd 1 ; 4 bytes to store length of the hostname
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
pathname resb 100 ; 100 bytes (characters) for the parsed pathname
ip_hostname resd 1 ; for storing the resolved hostname- to- IP address
-
Returns 0 now.
; nasm -f elf32 -g sockettest.asm -o sockettest.o
; ld -m elf_i386 sockettest.o dns.o -o sockettest
bits 32
global _start
extern resolv
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
SYS_SEND equ 9
SYS_RECV equ 10
buflen equ 1000
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8) ; for converting port number to network byte order
; structure for sockaddr_in
struc sockaddr_in
.sin_family: resw 1 ; address family AF_INET 16- bits
.sin_port: resw 1 ; port # 16- bits
.sin_addr: resd 1 ; 32- bit IP address of web server
.sin_pad: resb 8 ; 8 bytes of padding
endstruc
section .data
socket_args dd AF_INET, SOCK_STREAM, 0 ; each arg is given 4 bytes
addr_len dd 1 ; 32- bit IP address
; instance of sockaddr_in structure
sock_addr_inst:
dw AF_INET ; family
dw htons (80) ; network byte order converted HTTP port number
.addr dd 0 ; fill in later- IP address
dd 0, 0 ; 8 bytes of padding
sock_addr_inst_size equ $ - sock_addr_inst
connect_args dd 0, sock_addr_inst, sock_addr_inst_size
name db "www.google.com", 0
section .text
_start:
push name ; "www.google.com"
call resolv ; resolve to www.google.com IP address
add esp, 4 ; clean up stack
cmp eax, -1
je exit
mov [ip_hostname], eax
mov [sock_addr_inst.addr], eax
; set up and create a socket file descriptor
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; this will need the socket fd as an argument also
; now make a connection with the created socket
push dword sock_addr_inst_size ; was [addr_len]; connect wants the size of the structure
push dword sock_addr_inst ; was [ip_hostname]; connect wants the address of the structure
push dword [sock_fd] ; identifies the socket
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl exit
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
neg ebx
mov eax, 1 ; exit sys call
int 0x80
error_exit:
xor ebx, ebx
mov ebx, -1 ; produce -1 exit error code
mov eax, 1 ; exit sys call
int 0x80
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 300 ; 300 bytes (characters) reserved for URL typed at command line
cl_path_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL's path
cl_filename_len resd 1 ; 4 bytes to store length of the URL's filename
cl_filename resb 50 ; 50 bytes (characters) reserved for filename to download
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_hostname_len resd 1 ; 4 bytes to store length of the hostname
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
pathname resb 100 ; 100 bytes (characters) for the parsed pathname
ip_hostname resd 1 ; for storing the resolved hostname- to- IP address
As you can see, I've changed quite a lot. Haven't really determined much. I'll paste a request into it and see if I can get it to actually work.
Later,
Frank
-
Well, I pasted a request into it, and it did work. Unfortunately, I screwed up and overwrote it, so I can't post it. It did work...
Best,
Frank
-
Okay, I reconstructed it...
; nasm -f elf32 -g sockettest.asm -o sockettest.o
; ld -m elf_i386 sockettest.o dns.o -o sockettest
bits 32
global _start
extern resolv
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
SYS_SEND equ 9
SYS_RECV equ 10
buflen equ 1000
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8) ; for converting port number to network byte order
; structure for sockaddr_in
struc sockaddr_in
.sin_family: resw 1 ; address family AF_INET 16- bits
.sin_port: resw 1 ; port # 16- bits
.sin_addr: resd 1 ; 32- bit IP address of web server
.sin_pad: resb 8 ; 8 bytes of padding
endstruc
section .data
socket_args dd AF_INET, SOCK_STREAM, 0 ; each arg is given 4 bytes
addr_len dd 1 ; 32- bit IP address
; instance of sockaddr_in structure
sock_addr_inst:
dw AF_INET ; family
dw htons (80) ; network byte order converted HTTP port number
.addr dd 0 ; fill in later- IP address
dd 0, 0 ; 8 bytes of padding
sock_addr_inst_size equ $ - sock_addr_inst
connect_args dd 0, sock_addr_inst, sock_addr_inst_size
name db "www.google.com", 0
msg db "GET / HTTP/1.1", 13, 10
db "Host: www.google.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10
msg_len equ $ - msg
section .text
_start:
push name ; "www.google.com"
call resolv ; resolve to www.google.com IP address
add esp, 4 ; clean up stack
cmp eax, -1
je exit
mov [ip_hostname], eax
mov [sock_addr_inst.addr], eax
; set up and create a socket file descriptor
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; this will need the socket fd as an argument also
; now make a connection with the created socket
push dword sock_addr_inst_size ; size of the structure
push dword sock_addr_inst ; address of the structure
push dword [sock_fd] ; identifies the socket
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl exit
; write the request to socket
mov eax, 4
mov ebx, [sock_fd]
mov ecx, msg
mov edx, msg_len
int 80h
; read the page (if they send it to us)
mov eax, 3
mov ebx, [sock_fd]
mov ecx, buffer
mov edx, buflen
int 80h
; write it to stdout
mov edx, eax ; length is whatever we read
mov eax, 4
mov ebx, 1
mov ecx, buffer
int 80h
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
neg ebx
mov eax, 1 ; exit sys call
int 0x80
; this throws away information about what the problem was
; I don't use it
error_exit:
xor ebx, ebx
mov ebx, -1 ; produce -1 exit error code
mov eax, 1 ; exit sys call
int 0x80
section .bss
sock_fd resd 1 ; socket fd = 32- bits
connect_fd resd 1
buffer resb buflen ; address of 1000 byte 'buflen' (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 300 ; 300 bytes (characters) reserved for URL typed at command line
cl_path_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL's path
cl_filename_len resd 1 ; 4 bytes to store length of the URL's filename
cl_filename resb 50 ; 50 bytes (characters) reserved for filename to download
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_hostname_len resd 1 ; 4 bytes to store length of the hostname
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
pathname resb 100 ; 100 bytes (characters) for the parsed pathname
ip_hostname resd 1 ; for storing the resolved hostname- to- IP address
So if you can parse the accursed command line properly (good luck!), your socket code should work.
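One thing the reconstruction above does not do is loop: it reads from the socket once, so a page bigger than the buffer (or one that arrives in several pieces) gets cut short. The earlier "determine EOF??" question has a simple answer: read returns 0 when the other end has closed the connection. A minimal sketch of the loop follows, using stdin (descriptor 0) in place of [sock_fd] so it can be tried on its own.

```nasm
; nasm -f elf32 readloop.asm -o readloop.o
; ld -m elf_i386 readloop.o -o readloop
; ./readloop < somefile
bits 32
global _start

buflen equ 1000

section .bss
buffer resb buflen

section .text
_start:
read_more:
    mov eax, 3                  ; sys_read
    mov ebx, 0                  ; stdin here; [sock_fd] in the real program
    mov ecx, buffer
    mov edx, buflen
    int 0x80
    test eax, eax
    js exit                     ; negative: error
    jz exit                     ; zero: end of file, we are done

    mov edx, eax                ; write exactly as many bytes as we read
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout (or a file descriptor from sys_open)
    mov ecx, buffer
    int 0x80
    test eax, eax
    js exit
    jmp read_more
exit:
    mov ebx, eax
    neg ebx                     ; 0 on success, positive errno on error
    mov eax, 1                  ; sys_exit
    int 0x80
```

This relies on "Connection: close" in the request, so that the server actually closes the socket when it has sent everything. It also assumes write takes the whole buffer in one go, which is usually true for a terminal or a disk file but is not guaranteed.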
Best,
Frank
-
My last final push. Toward the bottom I am trying to write to a file descriptor so it can be written locally. I'm not sure how to declare a local directory to be written to using the open sys call. I've given up on parsing the command line stuff.
It's been a good run. Tried my hardest and best. At least I learned something. Gonna have lots to drink tonight.
; nasm -f elf32 -g sockettest.asm -o sockettest.o
; ld -m elf_i386 sockettest.o dns.o -o sockettest
bits 32
global _start
extern resolv
AF_INET equ 2 ; sin_family
SOCK_STREAM equ 1 ; socket type (TCP)
PROTOCOL equ 0 ; just because
SYS_SOCKET equ 1
SYS_CONNECT equ 3
SYS_WRITE equ 4
SYS_OPEN equ 5
SYS_SEND equ 9
SYS_RECV equ 10
buflen equ 1000
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8) ; for converting port number to network byte order
; structure for sockaddr_in
struc sockaddr_in
.sin_family: resw 1 ; address family AF_INET 16- bits
.sin_port: resw 1 ; port # 16- bits
.sin_addr: resd 1 ; 32- bit IP address of web server
.sin_pad: resb 8 ; 8 bytes of padding
endstruc
section .data
socket_args dd AF_INET, SOCK_STREAM, 0 ; each arg is given 4 bytes
addr_len dd 1 ; 32- bit IP address
; instance of sockaddr_in structure
sock_addr_inst:
dw AF_INET ; family
dw htons (80) ; network byte order converted HTTP port number
.addr dd 0 ; fill in later- IP address
dd 0, 0 ; 8 bytes of padding
sock_addr_inst_size equ $ - sock_addr_inst
connect_args dd 0, sock_addr_inst, sock_addr_inst_size
name db "www.google.com", 0
msg db "GET / HTTP/1.0", 13, 10
db "Host: www.google.com", 13, 10
db "Connection: close", 13, 10, 13, 10
msg_len equ $ - msg
section .text
_start:
push name ; pointer to URL
call resolv ; resolve to www.google.com IP address
add esp, 4 ; clean up stack
cmp eax, -1
je exit
mov [ip_hostname], eax
mov [sock_addr_inst.addr], eax ; puts ip address into sock addr instance
; set up and create a socket file descriptor
push PROTOCOL
push SOCK_STREAM
push AF_INET
mov ecx, esp ; places socketcall args into ecx for syscall
mov ebx, SYS_SOCKET ; socket function to invoke (1= socket command)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; test for error (eax= socket file descriptor)
jl exit ; jump if negative (error)
mov [sock_fd], eax ; place the newly created socket file descriptor into sock_fd
mov [connect_args], eax ; this will need the socket fd as an argument also
; now make a connection with the created socket
push dword sock_addr_inst_size ; size of the sockaddr_in structure (16 bytes)
push dword sock_addr_inst ; pointer to the sockaddr_in structure (family, port, IP)
push dword [sock_fd] ; identifies the socket
mov ecx, esp ; places connect args into ecx for syscall
mov ebx, SYS_CONNECT ; socket function to invoke (3= socket connect)
mov eax, 102 ; socketcall syscall
int 0x80
add esp, 12 ; clean up stack
cmp eax, 0 ; check for errors
jl exit
; write the request to socket
mov eax, 4
mov ebx, [sock_fd]
mov ecx, msg
mov edx, msg_len
int 0x80
; read the page
mov eax, 3
mov ebx, [sock_fd]
mov ecx, buffer
mov edx, buflen
int 0x80
; open/ create file
mov eax, SYS_OPEN
mov ebx, PATHNAME?
mov ecx, 3 ; read and write access
mov edx, 00700 ; read, write, execute permission
int 0x80
mov [file_fd], ebx ; stores newly created file descriptor to write to
; write it to file_fd
mov edx, eax ; length is whatever we read
mov eax, SYS_WRITE
mov ebx, [file_fd]
mov ecx, buffer
int 0x80
xor eax, eax ; clear eax if no errors occurred
exit:
mov ebx, eax ; put exit code into ebx
neg ebx
mov eax, 1 ; exit sys call
int 0x80
section .bss
sock_fd resd 1 ; socket fd = 32 bits
connect_fd resd 1
buffer resb buflen ; receive buffer of 'buflen' bytes (1,000 characters at a time)
get_request resb 0x1000 ; 0x1000 = 4096 bytes (characters) reserved for the HTTP GET request
cl_name resb 300 ; 300 bytes (characters) reserved for URL typed at command line
cl_path_len resd 1 ; 4 bytes (1 double word) reserved for storing length of typed URL's path
cl_filename_len resd 1 ; 4 bytes to store length of the URL's filename
cl_filename resb 50 ; 50 bytes (characters) reserved for filename to download
cl_hostname resb 50 ; 50 bytes (characters) reserved for the hostname of URL
cl_hostname_len resd 1 ; 4 bytes to store length of the hostname
cl_tail resb 100 ; 100 bytes (characters) reserved for the end of URL (past hostname)
pathname resb 100 ; 100 bytes (characters) for the parsed pathname
ip_hostname resd 1 ; for storing the IP address the hostname resolves to
file_fd resd 1
-
I imagine "myfile.txt", 0 would work. If that's a problem, try "./myfile.txt", 0. Obviously you haven't tried this. Besides sys_open you might look into sys_creat (8) (note: not "create"!).
The "permissions" are usually written in octal. C takes a leading zero as octal, Nasm does not. 700q will probably work better than 700 decimal.
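To make the octal point concrete, here is a small sketch that only uses the preprocessor to compare the two spellings. The names MODE_OCTAL and MODE_DECIMAL are made up for illustration. It should assemble quietly if the arithmetic in the comments is right.

```nasm
; nasm -f elf32 modes.asm -o modes.o
bits 32

%assign MODE_OCTAL   700q     ; octal 700: read, write, execute for the owner only
%assign MODE_DECIMAL 00700    ; leading zeros are ignored by Nasm: this is decimal 700

%if MODE_OCTAL != 448         ; 7 * 64 = 448
  %error "700q should equal 448 decimal"
%endif

%if MODE_DECIMAL != 1274q     ; 700 = 1*512 + 2*64 + 7*8 + 4
  %error "decimal 700 should equal 1274 octal"
%endif

section .data
mode dd MODE_OCTAL            ; the value to load into edx for sys_open
```

So "mov edx, 00700" asks for mode 1274 octal, which is not the owner-only permission set that was intended.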
The file descriptor will be in eax, not ebx. You will certainly want to check for error.
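Putting those points together, a minimal standalone sketch might look like the following. It is not the posted program: the filename, the "sample" text standing in for the received page, and the flag names are assumptions for illustration. The flag values are the usual Linux i386 ones. Note that O_RDWR is 2 there, so 3 is not a normal access mode, and without O_CREAT the open call will not create a missing file.

```nasm
; nasm -f elf32 savefile.asm -o savefile.o
; ld -m elf_i386 savefile.o -o savefile
bits 32
global _start

SYS_EXIT  equ 1
SYS_WRITE equ 4
SYS_OPEN  equ 5
SYS_CLOSE equ 6

O_WRONLY  equ 1q              ; open for writing only
O_CREAT   equ 100q            ; create the file if it does not exist
O_TRUNC   equ 1000q           ; empty the file if it already exists

section .data
filename   db "./myfile.txt", 0   ; hypothetical output name, zero-terminated
sample     db "hello", 10         ; stand-in for the data read from the socket
sample_len equ $ - sample

section .bss
file_fd resd 1

section .text
_start:
    mov eax, SYS_OPEN
    mov ebx, filename
    mov ecx, O_WRONLY | O_CREAT | O_TRUNC
    mov edx, 700q             ; permissions in octal
    int 0x80
    test eax, eax             ; a negative return value is -errno
    js exit
    mov [file_fd], eax        ; the descriptor comes back in eax

    mov eax, SYS_WRITE
    mov ebx, [file_fd]
    mov ecx, sample
    mov edx, sample_len
    int 0x80
    test eax, eax
    js exit

    mov eax, SYS_CLOSE
    mov ebx, [file_fd]
    int 0x80
    xor eax, eax              ; report success
exit:
    mov ebx, eax              ; 0, or the negated errno
    neg ebx
    mov eax, SYS_EXIT
    int 0x80
```

After running it, "echo $?" shows 0 or the errno. In the posted program, the byte count returned by the read would also need to be saved somewhere before the open, because every int 0x80 overwrites eax.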
Don't hurt yourself.
Best,
Frank