Author Topic: Writing networking code in assembly  (Read 22629 times)

Offline turtle13

  • Jr. Member
  • *
  • Posts: 73
Writing networking code in assembly
« on: September 15, 2017, 09:18:50 PM »
Continuing our earlier conversations about networking in NASM:

For the exit code:

Code: [Select]
exit:
        mov ebx, eax
        test ebx, ebx
        jns exit_good
        neg ebx

exit_good:
        mov eax, 1
        int 0x80

I would like to use my custom l_exit function.
I coded it as:

Code: [Select]
l_exit:

        mov ebx, [ebp + 8]      ; int rc
       
        mov eax, 1              ; exit sys call
        int 0x80

Can I simply replace exit and exit_good with a call to this function, like this?

Code: [Select]
push ebx
call l_exit
« Last Edit: September 15, 2017, 09:22:59 PM by turtle13 »

Offline turtle13

Re: Writing networking code in assembly
« Reply #1 on: September 15, 2017, 09:55:27 PM »
Frank, earlier you provided sample code:

Code: [Select]
request_len equ $ - sa

sa:
        dw 2
        dw htons (80)           ; port for http
        .addr:
                dd 0, 0         
        sa_size equ $ -sa

^ I'm not sure what exactly is going on here?

Next:

Code: [Select]
section .data

request db "GET / HTTP/1.0", 13, 10
db "Host: www.google.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10

What are the "13, 10" after each line? The first line has 14 characters, not 13.. and what is the 10 for after the 13?
Also, there should be a \r\n after the GET line, so should it look something like:

Code: [Select]
request db "GET / HTTP1.0\r\n", 15, 10

I am probably way off here..
« Last Edit: September 15, 2017, 10:39:22 PM by turtle13 »

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Writing networking code in assembly
« Reply #2 on: September 15, 2017, 11:35:00 PM »
The way that I exit is just so I can see the error code (or not) by typing "echo $?". If I weren't so lazy, I would "issue a diagnostic and exit cleanly" - also known as "scream and die". Your l_exit function should do the same, depending on what you've got in ebx when you push it.
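
For reference, a minimal sketch of such an l_exit, assuming it is called cdecl-style with the exit code pushed as its only argument. The function sets up its own frame here, because [ebp + 8] only refers to the first argument once ebp has been loaded from esp inside the function. The _start stub and the -42 value are made up for illustration:

```nasm
; nasm -f elf32 lexit.asm && ld -m elf_i386 -o lexit lexit.o
; ./lexit ; echo $?
bits 32
global _start

section .text
_start:
        mov     eax, -42        ; pretend a syscall failed with errno 42
        mov     ebx, eax
        test    ebx, ebx
        jns     .push_rc
        neg     ebx             ; make the error code positive
.push_rc:
        push    ebx             ; int rc
        call    l_exit          ; never returns

; void l_exit(int rc)
l_exit:
        push    ebp
        mov     ebp, esp        ; now [ebp + 4] = return address, [ebp + 8] = rc
        mov     ebx, [ebp + 8]  ; int rc
        mov     eax, 1          ; sys_exit
        int     0x80
```

With this stub, "echo $?" should print 42.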

Code: [Select]
request_len equ $ - sa
; wot? should be $ - request, no?


sa:
        dw 2
        dw htons (80)           ; port for http
        .addr:                  ; should be a "dd 0" here, no?
                dd 0, 0         
        sa_size equ $ -sa

"sa" is a structure expected by socket code. The "connect" command wants a pointer to it, and its length, in its arguments.

13 is a carriage return and 10 is a linefeed. It has nothing to do with the length of the line. Nasm will accept:
Code: [Select]
`foo \r\n`
if you put it between "back quotes" (the character under the ~ on my keyboard). Between double quotes " or single quotes ', Nasm just sees a literal "\r\n". Do one or the other, not both. I don't know where you got 15, 10. The double 13, 10 at the end is just a blank line, which indicates the end of the request headers. The internet is fussy about that cruft.
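
To make the difference concrete, here is a small sketch (the labels line_a, line_b and line_c are invented for this example) that defines the same request line three ways and exits with status 0 if the 13, 10 form and the back-quoted form assemble to identical bytes:

```nasm
; nasm -f elf32 crlf.asm && ld -m elf_i386 -o crlf crlf.o
; ./crlf ; echo $?
bits 32
global _start

section .data
line_a  db "GET / HTTP/1.0", 13, 10     ; explicit CR and LF bytes
len_a   equ $ - line_a                  ; 16
line_b  db `GET / HTTP/1.0\r\n`         ; back quotes: escapes are processed
len_b   equ $ - line_b                  ; 16
line_c  db "GET / HTTP/1.0\r\n"         ; double quotes: four literal characters
len_c   equ $ - line_c                  ; 18, two bytes too many

section .text
_start:
        mov     ebx, 1                  ; assume "different" until proven equal
        mov     eax, len_a
        cmp     eax, len_b
        jne     .done
        mov     esi, line_a
        mov     edi, line_b
        mov     ecx, len_a
        cld
        repe    cmpsb                   ; compare the two byte strings
        jne     .done
        xor     ebx, ebx                ; identical: exit status 0
.done:
        mov     eax, 1                  ; sys_exit
        int     0x80
```

line_c is only there to show the size difference: with double quotes the backslashes stay in the data, so it is 18 bytes instead of 16.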

Best,
Frank


Offline turtle13

Re: Writing networking code in assembly
« Reply #3 on: September 16, 2017, 12:41:00 AM »
Thanks, that clears things up a bit for me.

If I want to be able to input a custom URL on the command line instead of the default "www.somehost.com" as below:

Code: [Select]
section .data

request db "GET / HTTP/1.0", 13, 10             ; input the carriage return and linefeed after request
db "Host: www.somehost.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10

I would need to parse out the contents of argv[1], correct?
So that if I entered a URL like:

Code: [Select]
./networkprogram www.anotherhost.com/dir/index.html
it would connect to www.anotherhost.com and not www.somehost.com

If I want to have argv[1] become the URL:

Code: [Select]
request db "GET / HTTP/1.0", 13, 10             ; input the carriage return and linefeed after request
db "Host: www.somehost.com", 13, 10
db "Connection: close", 13, 10
db "User-Agent: assembly language", 13, 10, 13, 10

Now I'm stuck

Offline Frank Kotler

Re: Writing networking code in assembly
« Reply #4 on: September 16, 2017, 01:20:46 AM »
Yeah, I guess. I didn't realize that you were supposed to get the URL from the command line. How the command line (and environment variables) are set up depends on whether we start with "_start:" or "main". In either case, the command line argument will be zero-terminated. That should be all set to send off to "resolv". To fit it into the "request", we'll have to concatenate it to "Host: " and replace the zero with CR,LF (13, 10)... and we will need to calculate the length of the request at runtime. "$ - request" isn't going to work, since that is evaluated at assembly time.

I do not have an example at hand of doing that.

Best,
Frank


Offline turtle13

Re: Writing networking code in assembly
« Reply #5 on: September 20, 2017, 12:12:35 AM »
Back to basics..

if I want the command line argv[1] to be the URL to get the file from:

Code: [Select]
[shell]$ ./prog_name http://www.someplace.com/index.html

Would I code this something like:

Code: [Select]
pop ebx                 ; pops esp- top of stack (argc) into ebx
pop ebx                 ; pops argv[0] (program name: ./prog_name) into ebx
pop ebx                 ; pops argv[1] (URL) into ebx

mov ecx, [ebx]          ; store contents of ebx (URL) into ecx for sys write
mov eax, 4              ; write sys call- to write the URL into the HTTP request
int 0x80
(When I compile and run the above, I get a seg fault)

How do I code what needs to be written to? So that the URL can be parsed and placed into a request:

Code: [Select]
GET /index.html HTTP/1.0\r\n
Host: www.someplace.com\r\n
Connection: close\r\n
\r\n

Using pseudocode below, I am thinking that I would need to declare a variable such as

Code: [Select]
section .data
request db "GET " + (everything after hostname) + " HTTP/1.0\r\nHost: " + (host name) + "\r\nConnection: close\r\n\r\n"

Because the length of the request will be different each time the program is run.

I can use my previous l_write function, passing it the file descriptor (this should be the socket for the web server?), the char *buf, and the len of the amount to write. To determine len for the initial GET request, I will need to count how many characters (bytes) are in the URL typed at the command line and then add that to the length of the rest of the generic GET request. I'm still thinking about how to do this; I would need to declare some variable like request_len.

What is a good way to accomplish this, if I have explained myself clearly?
« Last Edit: September 20, 2017, 02:22:03 AM by turtle13 »

Offline Frank Kotler

Re: Writing networking code in assembly
« Reply #6 on: September 20, 2017, 03:12:47 AM »
Hi turtle13,

To begin with, what we find on the stack depends on how we link the program. I guess I just told you that. Consult "start.asm" (from an earlier assignment) for what linking with the C startup code changes. I'll assume we do not do that here.

Code: [Select]
_start:

pop ebx                 ; pops esp- top of stack (argc) into ebx
; might want to check that this is 2 - "usage" otherwise

pop ebx                 ; pops argv[0] (program name: ./prog_name) into ebx
pop ebx                 ; pops argv[1] (URL) into ebx
; I'd pop ecx here, or...

mov ecx, ebx          ; store contents of ebx (URL) into ecx for sys write
; [ebx] should be "http" or so, thus the segfault

mov eax, 4              ; write sys call- to write the URL into the HTTP request
int 0x80
; this is not going to write into your request
; ebx is gawd knows what
; and we need to find length of zero-terminated string

I think what I would do is to reserve a "big enough" buffer in .bss. I would then copy into it, counting all the way: "GET " (note the space), the "tail" of the command line argument, then " HTTP/1.1", 13, 10. Then "Host: " and the hostname part - we will want to send this off to "resolv", too. That wants zero-terminated; the request wants CR,LF terminated (13, 10). Then the rest of the request.
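
A rough sketch of that buffer-building idea, under some assumptions: the argument is given without the "http://" prefix, as host/path; everything up to the first '/' is the host; the request uses the HTTP/1.0 format asked for earlier in the thread; and MAXARG, the buffer size and all label names are invented here. It only prints the request to stdout so the result can be inspected. Resolving the host and sending to a socket are left out.

```nasm
; nasm -f elf32 mkreq.asm && ld -m elf_i386 -o mkreq mkreq.o
; ./mkreq www.someplace.com/index.html
bits 32
global _start

MAXARG   equ 200                        ; longest argument accepted (assumption)

section .data
get      db "GET "                      ; note the trailing space
get_len  equ $ - get
ver      db " HTTP/1.0", 13, 10, "Host: "
ver_len  equ $ - ver
tail     db 13, 10, "Connection: close", 13, 10, 13, 10
tail_len equ $ - tail
root     db "/", 0                      ; path used when the argument has none

section .bss
request  resb 512                       ; "big enough" for MAXARG plus the fixed text

section .text
_start:
        pop     eax                     ; argc
        cmp     eax, 2
        jne     fail                    ; usage error
        pop     ebx                     ; argv[0], ignored
        pop     ebx                     ; ebx -> argv[1], zero-terminated
        cld

        ; refuse arguments longer than MAXARG so request cannot overflow
        mov     edi, ebx
        xor     eax, eax
        mov     ecx, MAXARG + 1
        repne   scasb                   ; look for the terminating zero
        jne     fail                    ; none within MAXARG + 1 bytes

        ; the host ends at the first '/' or at the terminating zero
        mov     edx, ebx
.scan:
        mov     al, [edx]
        test    al, al
        jz      .nopath
        cmp     al, '/'
        je      .haspath
        inc     edx
        jmp     .scan
.nopath:
        mov     ebp, root               ; no path given: ask for "/"
        jmp     .split
.haspath:
        mov     ebp, edx                ; ebp -> path, starting at its '/'
.split:
        sub     edx, ebx                ; edx = length of the host part
        jz      fail                    ; empty host name

        mov     edi, request            ; edi = write position, it does the counting
        mov     esi, get
        mov     ecx, get_len
        rep     movsb                   ; "GET "
        mov     esi, ebp
.path:
        lodsb                           ; copy the path up to, not including, its zero
        test    al, al
        jz      .pathdone
        stosb
        jmp     .path
.pathdone:
        mov     esi, ver
        mov     ecx, ver_len
        rep     movsb                   ; " HTTP/1.0", CR, LF, "Host: "
        mov     esi, ebx
        mov     ecx, edx
        rep     movsb                   ; host name, without a terminating zero
        mov     esi, tail
        mov     ecx, tail_len
        rep     movsb                   ; CR, LF, "Connection: close", blank line

        mov     edx, edi
        sub     edx, request            ; request length, known only at run time
        mov     ecx, request
        mov     ebx, 1                  ; stdout
        mov     eax, 4                  ; sys_write
        int     0x80
        xor     ebx, ebx                ; exit status 0
        jmp     quit
fail:
        mov     ebx, 1
quit:
        mov     eax, 1                  ; sys_exit
        int     0x80
```

The length falls out of the final edi minus the start of the buffer, so no separate counter is needed. A zero-terminated copy of the host for "resolv" would still have to be made separately.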

That is all untested. I'll try to work up an example. You seem to understand what's required, you try it too.

Best,
Frank


Offline turtle13

Re: Writing networking code in assembly
« Reply #7 on: September 20, 2017, 04:14:50 AM »
^Frank, looking at my start.asm from last assignment,

Code: [Select]
bits 32
extern main
global _start
section .text
_start:
lea eax, [esp + 4]
mov ecx, [esp]
lea edx, [eax + ecx * 4 + 4]
push edx
push eax
push ecx
call main
add esp, 12
mov ebx, eax
mov eax, 1
int 0x80

In this case it looks like ecx would store argc, eax argv[0], and edx argv[1] (which would be the URL)
Is this what you were referring to?


Here is some code I coughed up which is not nearly finalized but which in my mind is where I should start to get to writing the GET request:

Code: [Select]
bits 32


section .bss

%define buf_size 0x10000        ; 0x10000 = 65,536 bytes (not 10,000)
buf resb buf_size               ; buf will hold the HTTP GET request while it is built


section .data

hostname times 1000 db 0        ; 1000 zeroed bytes for hostname ("db 1000" would emit a single byte)
pathname times 1000 db 0        ; 1000 zeroed bytes for URL path
filename times 1000 db 0        ; 1000 zeroed bytes for filename to retrieve
get_request1 db "GET "
get_request2 db " HTTP/1.0", 13, 10
get_request3 db "Host: "
get_request4 db "Connection: close", 13, 10, 13, 10


section .text

global _start

_start:

; accepting command line arguments:

pop eax                 ; pops esp- top of stack (argc) into eax
pop ecx                 ; pops argv[0] (program name: ./assign5) into ecx
pop edx                 ; pops argv[1] (URL) into edx

; something like this to build the GET request?
push get_request4
push hostname
push get_request3 ; need to also include the \r\n somehow
push get_request2
push filename
push pathname
push get_request1

; ok the GET request is on the stack, now what?

In my code above I pushed the GET request one at a time onto the stack, but now I'm wondering how do I get all of that information into a "single" variable to be used to send in one big chunk to the web server?

Offline turtle13

Re: Writing networking code in assembly
« Reply #8 on: September 20, 2017, 05:06:01 AM »
I've been doing some research on creating a socket and have some questions..

- The syscall for socket is 102. So..
Code: [Select]
mov eax, 102
mov ebx, "connect"
mov ecx, full_get_request
int 0x80

I'm stuck on how to get it to connect.. what value goes into ebx here?

Looking at the Linux man pages for "socketcall"
Code: [Select]
int socketcall(int call, unsigned long *args)
I believe that first I need to "bind" so that I have the web server IP address bound with port 80.

Well, then I go into the man page for "bind"
Code: [Select]
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
Now it looks like there are a bunch of data structures like "struct" involved here and I'm lost as to how I get all of this information connected so that I can just make a connection to the web server and send the GET request.

To bind, would I need to do something like

Code: [Select]
mov eax, 102
mov ebx, "bind"
mov ecx, "address and port"
int 0x80
How do I get the "code" for "bind" and how to get address and port into ecx, or would address be in ecx and port be in edx? Is this making any sense?

Then, how would I get the new value that was "bound" to be "saved" so that I can use it again..
Code: [Select]
mov eax, 102
mov ebx, "bind"
mov ecx, "address and port"
int 0x80

mov edi, [ebp - 8]        ; get the return value from "bind"
mov [bound_socket], edi        ; store the bound socket into a variable called "bound_socket"


I am not allowed to call any C library functions, so I am stuck as to how to implement all of this using assembly.
« Last Edit: September 20, 2017, 05:11:20 AM by turtle13 »

Offline Frank Kotler

Re: Writing networking code in assembly
« Reply #9 on: September 20, 2017, 06:14:24 AM »
Unless I'm mistaken, "bind" is if you're the server - you don't need it here.

As the client,  you need to "socket" and then "connect". Then you could use the "send" command to sys_socketcall, or use sys_write... or maybe your l_write?

Did I post "fbkget.asm" that gets "www.nasm.us"? I think so. I'm ripping it to shreds now to read the command line. I'm about up to the '/' in "www.nasm.us/index"... if the pesky user types that... and trying to figure out if not... It is getting extremely ugly. Won't even assemble at the moment, let alone work. I may rip it up and start over. Here it is, as is:
Code: [Select]
; nasm -f elf32 httpget.asm
; ld -o httpget httpget.o dns.o (-melf_i386 for 64-bit systems?)

global _start
extern resolv ; from The professor's dns.o

; Convert numbers (constants!) to network byte order
%define hton(x) ((x & 0xFF000000) >> 24) | ((x & 0x00FF0000) >>  8) | ((x & 0x0000FF00) <<  8) | ((x & 0x000000FF) << 24)
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8)

section .data

; the real request is named request - this just for reference
    msg db "GET / HTTP/1.1", 13, 10
        db "Host: www.nasm.us", 13, 10
        db "Connection: close", 13, 10
        db "User-Agent: assembly language", 13, 10, 13, 10
    msg_len equ $ - msg

    msg1 db "GET "
    msg1len equ $ - msg1
    ; cltail comes next
    msg2 db "HTTP/1.1", 13, 10, "Host: "
    msg2len equ $ - msg2
    ; clhostname next
    msg3 db 13, 10, "Connection: close", 13, 10
         db "User-Agent: assembly language", 13, 10, 13, 10
    msg3len equ $ - msg3



; todo: correct name and description of this structure
sa:
    dw 2 ; family
    dw htons (80) ; port
.addr    dd 0 ; fill in later
    dd 0, 0 ; zero
sa_size equ $ - sa


; first of these wants to be socket descriptor
; we fill it in later...
connect_args dd 0, sa, sa_size

name db "www.nasm.us", 0


section .bss
%define BUFSIZ 4000h
    buf resb BUFSIZ
    sock_fd resd 1
    request resb 100h
    clname resb 80 ; this one zero terminated for resolv
    clnamelen resd 1
    clhostname resb 80 ;
    cltail resb 80


section .text
_start:

nop

; get argc
    pop eax
    cmp eax, 2
    je goodargc
    ;print usage message
    jmp exit
goodargc:
    pop esi ; throw away program name
    pop esi ; our argument?
    cmp [esi], dword "http"
    je killhttp
    cmp [esi], dword "HTTP"
    jne nohttp
killhttp:
    add esi, 4
    cmp [esi], word "\\"
    jne nohttp
    add esi, 2
    mov edi, clname
morename:
    lodsb
    cmp al, "/"
    je foundtail
    stosb
    cmp al, 0
    je endarg
    jmp morename
foundtail:
; end clname with 0
    mov [edi], byte 0
    sub edi, clname
    mov [clnamelen], edi
    mov edi, cltail
    stosb
moretail:
    lodsb
    stosb
    cmp al, 0
    je endarg2



    push name
call resolv ; returns ip, big endian
add esp, 4
; call showeaxh ; just to see it
mov [sa.addr], eax

; socket
    push 0
    push 1
    push 2
    mov ecx, esp
    mov ebx, 1
    mov eax, 102
    int 80h
    add esp, 12
    test eax, eax
    js exit
    mov [sock_fd], eax
    mov [connect_args], eax
   
; connect
    mov ecx, connect_args
    mov ebx, 3 ; connect
    mov eax, 102
    int 80h
    test eax, eax
    js exit

; write
    mov edx, msg_len
    mov ecx, msg
    mov ebx, [sock_fd]
    mov eax, 4
    int 80h
    test eax, eax
    js exit

; read
    xor esi, esi ; total count read
   
    mov edx, BUFSIZ
    mov ecx, buf
    mov ebx, [sock_fd]
reread:
    mov eax, 3
    int 80h
    test eax, eax
    js exit
    jz goodread
    sub edx, eax
    jna goodread ; all we have room for
    add ecx, eax
    add esi, eax
    jmp reread
goodread:


; write to stdout

mov ecx,buf
mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h
    xor eax, eax
   
exit:
    mov ebx, eax
    test ebx, ebx
    jns goodexit
    neg ebx
goodexit:
    mov eax, 1
    int 80h

;------------------------------
showeaxh:   
    push eax
    push ebx
    push ecx
    push edx

    sub esp, 10h
   
    mov ecx, esp
    xor edx, edx
    mov ebx, eax
.top:   
    rol ebx, 4
    mov al, bl
    and al, 0Fh
    cmp al, 0Ah
    sbb al, 69h
    das
    mov [ecx + edx], al
    inc edx
    cmp edx, 8
    jnz .top

; add a linefeed
    mov byte [ecx + edx], 10
    inc edx

    mov ebx, 1
    mov eax, 4
    int 80h
   
    add esp, 10h
   
    pop edx
    pop ecx
    pop ebx
    pop eax
    ret
;------------------------------   

The formerly working parts may help you figure out how "socket" and "connect" and sys_read go together...

Best,
Frank


Offline Frank Kotler

Re: Writing networking code in assembly
« Reply #10 on: September 20, 2017, 06:57:14 AM »
This is something different. It connects to an echo server, not a http server. It has some comments and names of structures that might be helpful.
Code: [Select]
;-----------------------------   
;
; nasm -f elf32 echocli.asm
; ld -o echocli echocli.o

global _start

struc sockaddr_in
    .sin_family resw 1
    .sin_port resw 1
    .sin_addr resd 1
    .sin_zero resb 8
endstruc

_ip equ 0x7F000001 ; loopback - 127.0.0.1
;_ip equ 0x48400C6B ; 72.64.12.107 - an alternative, non-loopback address

_port equ 2002

; Convert numbers to network byte order

IP equ ((_ip & 0xFF000000) >> 24) | ((_ip & 0x00FF0000) >>  8) | ((_ip & 0x0000FF00) <<  8) | ((_ip & 0x000000FF) << 24)
PORT equ ((_port >> 8) & 0xFF) | ((_port & 0xFF) << 8)

AF_INET        equ 2
SOCK_STREAM    equ 1

BUFLEN         equ  0x80


STDIN          equ 0
STDOUT         equ 1
LF equ 10
EINTR equ 4

__NR_exit equ 1
__NR_read       equ 3
__NR_write      equ 4
__NR_socketcall equ 102

SYS_SOCKET     equ 1
SYS_CONNECT    equ 3

section .data

my_sa istruc sockaddr_in
    at sockaddr_in.sin_family, dw AF_INET
    at sockaddr_in.sin_port, dw PORT
    at sockaddr_in.sin_addr, dd IP
    at sockaddr_in.sin_zero, dd 0, 0
iend

socket_args dd AF_INET, SOCK_STREAM, 0

; first of these wants to be socket descriptor
; we fill it in later...
connect_args dd 0, my_sa, sockaddr_in_size


section .bss
    my_buf resb BUFLEN
    sock_desc resd 1


section .text
 _start:

    ; socket(AF_INET, SOCK_STREAM, 0)

    mov     ecx, socket_args ; address of args structure
    mov     ebx, SYS_SOCKET     ; subfunction or "command"
    mov     eax, __NR_socketcall     ;c.f. /usr/src/linux/net/socket.c
    int     80h

    cmp     eax, -4096
    ja      exit

    mov     [sock_desc], eax
    ; and fill in connect_args
    mov     [connect_args], eax

     ; connect(sock, (struct sockaddr *)&sa, sizeof(struct sockaddr))

    mov     ecx, connect_args
    mov     ebx, SYS_CONNECT ; subfunction or "command"
    mov     eax, __NR_socketcall
    int     80h

    cmp     eax, -4096
    ja      exit

again:
    push BUFLEN
    push my_buf
    push STDIN
    call readline
    add esp, 12

    push eax
    push my_buf
    push dword [sock_desc]
    call writeNbytes
    add esp, 12

    cmp dword [my_buf], 'quit'
    jz goodexit

    push BUFLEN
    push my_buf
    push dword [sock_desc]
    call readline
    add esp, 12

    push eax
    push my_buf
    push STDOUT
    call writeNbytes
    add esp, 12

    jmp again

goodexit:

    xor eax, eax ; success
 exit:
    mov     ebx, eax ; exitcode
    neg ebx
    mov     eax, __NR_exit
    int     80h

readline:
    push ebp
    mov ebp, esp
    sub esp, 4
    pusha
%define fd ebp + 8
%define buf ebp + 12
%define max ebp + 16
%define result ebp - 4

    mov dword [result], 0
    mov ebx, [fd]
    mov ecx, [buf]
    mov edx, [max]
.reread:
    mov eax, __NR_read
    int 80h
    cmp eax, -EINTR
    jz .reread
    cmp eax, -4096
    ja exit
    test eax, eax       ; 0 bytes read means end of file: stop instead of re-reading forever
    jz .done
    add [result], eax
    cmp byte [eax + ecx - 1], LF
    jz .done
    add ecx, eax
    sub edx, eax
    jna .done
    jmp .reread
.done:
    popa
    mov eax, [result]
%undef fd
%undef buf
%undef max
%undef result
    mov esp, ebp
    pop ebp
    ret


writeNbytes:
    push ebp
    mov ebp, esp
    sub esp, 4
    pusha
%define fd ebp + 8
%define buf ebp + 12
%define Nbytes ebp + 16
%define bytesleft ebp - 4

    mov ebx, [fd]
    mov ecx, [buf]
    mov edx, [Nbytes]
    mov [bytesleft], edx
.rewrite:
    mov eax, __NR_write
    int 80h
    cmp eax, -EINTR
    jz .rewrite
    cmp eax, -4096
    ja exit
    sub [bytesleft], eax
    jz .done
    add ecx, eax
    sub edx, eax
    jna .done
    jmp .rewrite
.done:
    popa
    mov eax, [Nbytes]
%undef fd
%undef buf
%undef Nbytes
%undef bytesleft
    mov esp, ebp
    pop ebp
    ret

I've got an echo server to go with it...

Best,
Frank


Offline turtle13

Re: Writing networking code in assembly
« Reply #11 on: September 20, 2017, 07:00:40 AM »
I'm going to look at it more carefully tomorrow morning but for now I noticed something that sticks out to me in your code:

Code: [Select]
goodargc:
    pop esi ; throw away program name
    pop esi ; our argument?
    cmp [esi], dword "http"
    je killhttp
    cmp [esi], dword "HTTP"
    jne nohttp
killhttp:
    add esi, 4
    cmp [esi], word "\\"
    jne nohttp
    add esi, 2
    mov edi, clname

Under killhttp:, after "add esi, 4", shouldn't it be cmp [esi], dword "://"? You have two backslashes, and you left out the colon that the user would type as part of the URL. Which brings up another issue: there are three characters, but a dword makes room for four, so the sizes don't line up. How do I solve that?

Also, shouldn't "add esi, 2" be "add esi, 3" to include the colon as well? If the user typed "http://www.url.com", this is trying to skip over "http://", which is 7 characters; add esi, 4 and add esi, 3 would add up to 7.

One more thing while it's on my mind: you declare "sock_fd resd 1" under section .bss. Does this mean that the file descriptor for the socket is 1, or is that just reserving 1 doubleword (4 bytes) for a socket file descriptor which will be determined later on? And how are socket file descriptors determined?

Offline Frank Kotler

Re: Writing networking code in assembly
« Reply #12 on: September 20, 2017, 07:33:27 AM »
Good Catch! Besides missing the colon, there are probably other errors. I will probably do the colon as a single byte. As you point out, 3 bytes doesn't "fit" well.
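
One way around the three-byte problem, sketched under the assumption that only a lowercase "http://" needs to be recognized: compare byte by byte against the whole 7-byte prefix with repe cmpsb instead of mixing dword and word compares. This is a different technique from the code above, and the labels are invented for the example. It prints the argument with the prefix removed:

```nasm
; nasm -f elf32 skiphttp.asm && ld -m elf_i386 -o skiphttp skiphttp.o
; ./skiphttp http://www.url.com/index.html
bits 32
global _start

section .data
prefix     db "http://"
prefix_len equ $ - prefix               ; 7
newline    db 10

section .text
_start:
        pop     eax                     ; argc
        cmp     eax, 2
        jne     fail
        pop     ebx                     ; argv[0], ignored
        pop     ebx                     ; ebx -> argv[1], zero-terminated
        cld

        ; compare the first 7 bytes with "http://", stopping at the first mismatch
        mov     esi, ebx
        mov     edi, prefix
        mov     ecx, prefix_len
        repe    cmpsb
        jne     .noprefix               ; a short argument mismatches at its zero
        mov     ebx, esi                ; whole prefix matched: esi is 7 bytes further on
.noprefix:
        ; length of what is left, up to the terminating zero
        mov     edi, ebx
        xor     eax, eax
        mov     ecx, -1
        repne   scasb
        not     ecx
        dec     ecx                     ; ecx = length without the zero

        mov     edx, ecx
        mov     ecx, ebx
        mov     ebx, 1                  ; stdout
        mov     eax, 4                  ; sys_write
        int     0x80
        mov     edx, 1
        mov     ecx, newline
        mov     ebx, 1
        mov     eax, 4
        int     0x80
        xor     ebx, ebx                ; exit status 0
        jmp     quit
fail:
        mov     ebx, 1
quit:
        mov     eax, 1                  ; sys_exit
        int     0x80
```

Because the compare stops at the first byte that differs, it never reads past the end of an argument shorter than the prefix.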

"resd 1" reserves one dword. .bss is uninitialized, so it doesn't have a value until we put one there. Actually, the OS fills it with zeros, but I don't think we're supposed to "count on" that

Socket descriptors are returned by the "socket" command :)
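
As an illustration (a sketch with invented labels, using the same sys_socketcall interface as the other examples in this thread), this creates a socket, stores the descriptor in a resd variable and exits with it as the status. The kernel hands out the lowest unused descriptor number, so with only stdin, stdout and stderr open, "echo $?" will typically print 3:

```nasm
; nasm -f elf32 sockfd.asm && ld -m elf_i386 -o sockfd sockfd.o
; ./sockfd ; echo $?
bits 32
global _start

AF_INET         equ 2
SOCK_STREAM     equ 1
SYS_SOCKET      equ 1                   ; sys_socketcall subfunction
__NR_exit       equ 1
__NR_socketcall equ 102

section .data
socket_args dd AF_INET, SOCK_STREAM, 0

section .bss
sock_fd resd 1                          ; room for one dword, no meaningful value yet

section .text
_start:
        mov     ecx, socket_args        ; address of the argument block
        mov     ebx, SYS_SOCKET
        mov     eax, __NR_socketcall
        int     0x80
        cmp     eax, -4096
        ja      .error                  ; -4095..-1 means -errno
        mov     [sock_fd], eax          ; the kernel picked the descriptor; keep it
        mov     ebx, [sock_fd]          ; report it as the exit status
        jmp     .quit
.error:
        mov     ebx, 255
.quit:
        mov     eax, __NR_exit
        int     0x80
```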

This is my echo server. I post it only because it has "bind" which you were asking about.
Code: [Select]
; echo server
; echos lines until "quit"
; "kill" unloads server
; runs on loopback, 127.0.0.1, and port 2002,
; no options on this model!
;
; nasm -f elf echosrv.asm
; ld -o echosrv echosrv.o

global _start

;-------------------
; should probably be in an .inc file

struc sockaddr_in
    .sin_family resw 1
    .sin_port resw 1
    .sin_addr resd 1
    .sin_zero resb 8
endstruc

; Convert numbers (constants!) to network byte order
%define hton(x) ((x & 0xFF000000) >> 24) | ((x & 0x00FF0000) >>  8) | ((x & 0x0000FF00) <<  8) | ((x & 0x000000FF) << 24)
%define htons(x) ((x >> 8) & 0xFF) | ((x & 0xFF) << 8)

AF_INET        equ 2
SOCK_STREAM    equ 1
INADDR_ANY equ 0 ; /usr/include/linux/in.h

STDIN          equ 0
STDOUT         equ 1

__NR_exit equ 1
__NR_read       equ 3
__NR_write      equ 4
__NR_close equ 6
__NR_socketcall equ 102

; commands for sys_socketcall
; /usr/include/linux/net.h
SYS_SOCKET equ 1
SYS_BIND equ 2
SYS_CONNECT equ 3
SYS_LISTEN equ 4
SYS_ACCEPT equ 5
SYS_SEND equ 9
SYS_RECV equ 10
;------------------------

_ip equ 0x7F000001 ; loopback - 127.0.0.1
_port equ 2002

; Convert 'em to network byte order
IP equ hton(_ip)
PORT equ htons(_port)

BACKLOG equ 128 ; for "listen"
BUFLEN         equ  1000

section .data

my_sa istruc sockaddr_in
    at sockaddr_in.sin_family, dw AF_INET
    at sockaddr_in.sin_port, dw PORT
    at sockaddr_in.sin_addr, dd INADDR_ANY
    at sockaddr_in.sin_zero, dd 0, 0
iend

socket_args dd AF_INET, SOCK_STREAM, 0

; first of these wants to be socket descriptor
; we fill it in later...
bind_args   dd 0, my_sa, sockaddr_in_size
listen_args dd 0, BACKLOG
accept_args dd 0, 0, 0

section .bss
    my_buf resb BUFLEN
    fd_socket resd 1
    fd_conn resd 1

section .text
 _start:

    ; socket(AF_INET, SOCK_STREAM, 0)
    mov ecx, socket_args ; address of args structure
    mov ebx, SYS_SOCKET     ; subfunction or "command"
    mov eax, __NR_socketcall     ;c.f. /usr/src/linux/net/socket.c
    int 80h

    cmp eax, -4096
    ja exit

    mov [fd_socket], eax
    ; and fill in bind_args, etc.
    mov [bind_args], eax
    mov [listen_args], eax
    mov [accept_args], eax


    mov ecx, bind_args
    mov ebx, SYS_BIND ; subfunction or "command"
    mov eax, __NR_socketcall
    int 80h

    cmp eax, -4096
    ja exit

    mov ecx, listen_args
    mov ebx, SYS_LISTEN ; subfunction or "command"
    mov eax, __NR_socketcall
    int 80h

    cmp eax, -4096
    ja exit

again:
    mov ecx, accept_args
    mov ebx, SYS_ACCEPT ; subfunction or "command"
    mov eax, __NR_socketcall
    int 80h
    cmp eax, -4096
    ja  exit

    mov [fd_conn], eax


readagain:

    ; read(sock, buf, len)
    mov edx, BUFLEN ; arg 3: max count
    mov ecx, my_buf ; arg 2: buffer
    mov ebx, [fd_conn] ; arg 1: fd
    mov eax, __NR_read ; sys_read
    int 80h

    cmp eax, -4096
    ja exit

    mov edx, eax ; length read is length to write
    mov ecx, my_buf
    mov ebx, [fd_conn]
    mov eax, __NR_write
    int 80h
    cmp eax, -4096
    ja exit

    cmp dword [my_buf], 'quit'
    jz closeconn

    cmp dword [my_buf], 'kill'
    jz killserver

    jmp readagain

closeconn:
    mov eax, __NR_close
    mov ebx, [fd_conn]
    int 80h
    cmp eax, -4096
    ja exit
    jmp again
   
killserver:
    mov eax, __NR_close
    mov ebx, [fd_conn]
    int 80h
    mov eax, __NR_close
    mov ebx, [fd_socket]
    int 80h

goodexit:
    xor eax, eax ; success
 exit:
    mov ebx, eax ; exitcode
    neg ebx
    mov eax, __NR_exit
    int 80h
;-----------------------

A better server would fork off a child to handle the connection, while the parent went back to listening. (I think)

Best,
Frank


Offline turtle13

Re: Writing networking code in assembly
« Reply #13 on: September 21, 2017, 02:14:31 AM »
- Do I need to use "hton"? I ask because the dns.o file I was provided with has an unsigned int resolv(const char *hostName) function that should already resolve hostnames:

This function uses the cdecl calling convention and returns a network byte ordered IP address of the named host or -1 if the host name can't be resolved.

- Doing some research on Unix socket programming, I am starting to understand more about how this all works and is set up, but I still have some questions. As I understand it, a socket needs to be created using the socket syscall. This syscall takes three arguments: domain/family (AF_INET), type (SOCK_STREAM), and protocol. Protocol should be 0, but how are "AF_INET" and "SOCK_STREAM" set in assembly? And what registers should each one go in? My biggest headache with assembly is that there is really no clear answer as to which registers do what at a given time.

- After this socket sys call, I should receive an integer which will be the file descriptor for the socket? OK, now how do I "store" that to use later on? For something like Python it would be
Code: [Select]
socket_fd= socket(AF_INET, SOCK_STREAM, 0)
but translating this to assembly is throwing me off.

- OK now after the socket is created (and the socket file descriptor is set), I must use this file descriptor to connect and download something from the web server. Now this is where a "structure" is required? Ex:
Code: [Select]
struc sockaddr_in
        .sin_family:    resw 1
        .sin_port:    resw 1
        .sin_addr:    resd 1
        .sin_pad:    resb 8
endstruc
How I understand it:
sockaddr_in is the name of the structure.
.sin_family is the 16-bit code for AF_INET
.sin_port is the 16-bit port # (which would be port 80 for HTTP)
.sin_addr is the 32-bit IP address (that was previously resolved using the dns.o and resolv function)
*Not sure what .sin_pad resb 8 is used for

I feel really stupid because I'm sure it's been explained 1,000 times before but.. how do I get those values to send to the web server now?

To set up the initial socket sys call do I do something like:

Code: [Select]
mov eax, 102
mov ebx, 3
mov ecx, full_request

OK, now the resource I was using to help explain this is http://www.linuxhowtos.org/C_C++/socket.htm, which is of course a C/C++ reference, because there is no other resource for assembly programming on the planet besides Frank Kotler. Anyway, it says that I then need to make a "connect" sys call. I don't see a sys call for "connect" (using this reference table as a guide: http://faculty.nps.edu/cseagle/assembly/sys_call.html), so I'm lost as to how to actually make that sys call.

Then after connecting, I make "read" sys calls to read the web page into the buffer, and then "write" sys calls to write the buffer to a local file. Right?

I think I have the basic outline, but getting this all into something that the processor can execute so that it does what I want is consuming my every waking moment at this point.

Offline Frank Kotler

Re: Writing networking code in assembly
« Reply #14 on: September 21, 2017, 05:47:53 AM »
Hi turtle13,

No, you won't need "hton" - what you get from "resolv" is already in "network byte order". That is, "big endian". We normally use "little endian" in x86. You may wish to use "htons" - the "short" (16 bit) version. The "port" - 80 for http - needs to be in "network byte order", too.
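
A small sketch of what htons does to the bytes in memory, using port 2002 from the echo examples. The macro is a fully parenthesized variant of the one posted earlier, and the label port is invented for the example. The program exits with the first byte stored at port, so "echo $?" should print 7 (the high byte of 0x07D2), showing that the high byte comes first in memory:

```nasm
; nasm -f elf32 order.asm && ld -m elf_i386 -o order order.o
; ./order ; echo $?
bits 32
global _start

; swap the two bytes of a 16-bit constant at assembly time
%define htons(x) ((((x) >> 8) & 0xFF) | (((x) & 0xFF) << 8))

section .data
port    dw htons(2002)                  ; 2002 = 0x07D2, stored as bytes 07 D2

section .text
_start:
        movzx   ebx, byte [port]        ; first byte in memory: the high byte, 0x07
        mov     eax, 1                  ; sys_exit
        int     0x80
```

Without the macro, a plain "dw 2002" would be stored as D2 07, which the network side would read as port 53767.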

"AF_INET" and "SOCK_STREAM" can be found in the echo examples, or you can do it the crude way:
Code: [Select]
; socket
    push 0 ; todo: name these!
    push 1
    push 2
    mov ecx, esp
    mov ebx, 1 ; the "socket" command
    mov eax, 102 ; sys_socketcall
    int 80h
    add esp, 12
    test eax, eax
    js exit
    mov [sock_fd], eax
    mov [connect_args], eax
Although rather cryptic, this creates a "structure" on the stack. What goes in registers is the address of this "structure" (esp in this case), the "command" (for socket, bind, connect, etc.) and the sys_socketcall number (102). You can also see how the socket descriptor (returned in eax) is stored in variables. The C library pretends that these are separate functions, but under the hood it's the same sys_socketcall with different numbers in ebx. While I've got the socket descriptor in eax, I stuff it into the arguments we'll be using for "connect". The remainder of those arguments are the address and length of that "sockaddr_in" structure... into which we have stuffed the address we got from "resolv". The "padding" may have once been "reserved for future use", but it's just zeros. It has to be there.

After the "connect", we first write to the socket - our "request". Then we read what it sends us. Besides the page we requested, I find that this might be "bad request" or perhaps "forbidden".

I'm still having a lot of trouble getting from the command line to the request. The command line argument is zero-terminated. We want the "hostname" that we send to "resolv" to be zero terminated, but we don't want the zero in the "request". I think that's where I'm going wrong...

One thing to keep in mind is that Nasm's "struc" keyword is just a "typedef". It does not create an "instance" of your structure or reserve any memory. For that, you need "istruc". See the Friendly Manual or there's an example in the "echo" code. Fairly ugly, IMO. You do not need to use this. If you put a few variables together, one after another, it's a "structure" whether we call it that or not. "struc" and "istruc" just help you keep 'em together.
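
A sketch of that last point (the labels sa_a and sa_b are invented, and 0x5000 is port 80 already byte-swapped by hand): one instance built with istruc and one written as plain dw/dd lines, compared byte for byte. The program exits with status 0 if both are the same 16 bytes:

```nasm
; nasm -f elf32 strucdemo.asm && ld -m elf_i386 -o strucdemo strucdemo.o
; ./strucdemo ; echo $?
bits 32
global _start

struc sockaddr_in                       ; layout only: offsets and sockaddr_in_size
    .sin_family resw 1
    .sin_port   resw 1
    .sin_addr   resd 1
    .sin_zero   resb 8
endstruc

section .data
; an actual instance, built with istruc
sa_a:
    istruc sockaddr_in
        at sockaddr_in.sin_family, dw 2         ; AF_INET
        at sockaddr_in.sin_port,   dw 0x5000    ; port 80 in network byte order
        at sockaddr_in.sin_addr,   dd 0         ; filled in later
        at sockaddr_in.sin_zero,   dd 0, 0
    iend

; the same 16 bytes, written as plain variables one after another
sa_b:
        dw 2
        dw 0x5000
        dd 0
        dd 0, 0
sa_b_size equ $ - sa_b

section .text
_start:
        mov     ebx, 1                  ; assume they differ
        mov     eax, sockaddr_in_size
        cmp     eax, sa_b_size
        jne     .done
        mov     esi, sa_a
        mov     edi, sa_b
        mov     ecx, sockaddr_in_size
        cld
        repe    cmpsb                   ; byte-for-byte comparison
        jne     .done
        xor     ebx, ebx                ; identical: exit status 0
.done:
        mov     eax, 1                  ; sys_exit
        int     0x80
```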

Best,
Frank