NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: turtle13 on August 15, 2017, 04:43:01 AM

Title: Help with converting keyboard input into its hex output
Post by: turtle13 on August 15, 2017, 04:43:01 AM
I have an assignment due Friday night that looks pretty overwhelming to me as a newbie. Any help with how I should go about setting this up would be greatly appreciated. Assembly language programming is not easy for me and I am struggling to learn it.

Assignment:

Part 1

Write an assembly language program to run on 32 bit Linux. Your program is to read a single line of input from the keyboard and output the two digit hexadecimal representation of each character separating each two digit hex value with exactly one space. You can expect no more than 200 characters to be entered on the input line and the line is considered to be terminated when a new line character ('\n', 0x0A) is read. It is up to you whether you choose to read one character at a time or many characters at a time. You will need to use the read, write, and exit system calls. Note that the read system call will not return until the user of your program presses the Enter key at which point you will be able to read every character up to and including the new line. On Linux systems, information regarding these system calls can be found in the man pages for each (under section 2) such as: man 2 read. The man pages describe the operation of each function including the number, type, and sequence of parameters required by each function. The required system call numbers can be found on slide 119 in the course notes or in the file /usr/include/asm/unistd_32.h.

A couple of hints

A) you will be working with byte size data.
B) A character is just a byte, each byte can be written as two hex digits. You need to break each character down into its two hex digits, determine the appropriate ASCII representation of each of those digits and then output both digits. All hex digits fall in the range 0..9A..F. Feel free to use upper or lower case letters at your discretion.
C) You need a loop, consider how to terminate it.
D) Example input and output:
ABCD\n
41 42 43 44 0A

Note that the 0A at the end of the output above represents the new line character at the end of the input line. Your program MUST NOT output anything other than the converted hex values (i.e., DO NOT prompt for user input or display any other helpful messages).

Part 2

Write a Linux assembly language program that prints, one per line, the command line arguments used to invoke your program along with all of the environment variables in your program's execution environment. You may not use ANY system calls. Your program must consist solely of a main function and may use only the printf and exit functions from the C standard library. Your program MUST NOT output anything other than argv and envp values (i.e., DO NOT prompt for user input or display any other helpful messages). Your main function must adhere to the following prototype (keep in mind that main uses the cdecl calling convention):
int main(int argc, char *argv[], char *envp[]);
Example: If your final executable is named assign3_part2, a sample run might look like this:
# ./assign3_part2 hello world
./assign3_part2
hello
world
TERM=xterm
SHELL=/bin/bash
PATH=/bin:/usr/bin:/usr/local/bin
PWD=/home/jones
HOME=/home/jones
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 15, 2017, 07:29:54 AM
Hi again turtle13,

Actually, hex is a lot easier to display than decimal. We don't need to deal with that confusing (and slow) "div" instruction. Each four bits is a digit. We want to print the leftmost four bits first, then the rightmost four bits. The difference between a number and its ascii representation is 30h... or 48 decimal... or '0'. The latter is somewhat "self-documenting". The characters A..F do not immediately follow the characters 0..9. The difference is 7, or if you want lowercase 27h. Try "man ascii" for an ascii table.
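
For readers who find it easier to follow in C first, here is a rough sketch of the arithmetic described above: add '0', then add 7 more if the result went past '9'. The function names nibble_to_hex and byte_to_hex are made up for illustration and are not part of the assignment.

```c
#include <assert.h>

/* Convert a value 0..15 to its ASCII hex digit using only arithmetic. */
char nibble_to_hex(unsigned nibble)
{
    assert(nibble < 16);            /* caller must pass a single nibble */
    char c = (char)(nibble + '0');  /* 0..9 -> '0'..'9', 10..15 -> ':'..'?' */
    if (c > '9')
        c += 7;                     /* skip the 7 characters between '9' and 'A' */
    return c;
}

/* Split a byte into its two hex digits: high nibble first, then low. */
void byte_to_hex(unsigned char byte, char out[2])
{
    out[0] = nibble_to_hex(byte >> 4);
    out[1] = nibble_to_hex(byte & 0x0F);
}
```

Calling byte_to_hex(0x0A, digits) should leave '0' and 'A' in digits, which matches the "0A" shown for the linefeed in the assignment's sample output.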

First, of course, you need to use sys_read to read some characters. Do you know how to do that part? Your instructor kindly tells us that we can expect no more than 200 bytes. The typical user will get bored long before that. sys_read will return the number of bytes typed in eax (includes the linefeed). We can use that, or just watch for the linefeed to terminate our loop.

Supposing we've got our input in "inbuf"...
Code: [Select]
mov ecx, eax ; length read into loop counter
mov esi, inbuf
mov edi, outbuf
.top:
lodsb ; get a byte in al
push eax ; save it - we're gonna need it twice
shr al, 4 ; get high 4 bits into low
add al, '0' ; note: character '0', not 0
cmp al, '9'
jbe .skip
add al, 7
.skip:
stosb
pop eax ; get our byte back
and al, 0Fh ; low 4 bits this time
add al, '0' ; and do it again
cmp al, '9'
jbe .skip2
add al, 7
.skip2:
stosb
mov al, ' ' ; exactly one space
stosb
loop .top
mov al, 0Ah ; linefeed
stosb
... and print "outbuf". Oops, we probably should have counted the length as we went along. You can add that. Or we can figure the length another way...

There are different ways to do this. This is the shortest way I know:
Code: [Select]
; isolate a nibble
cmp al, 0Ah
sbb al, 69h
das
"das" is real slow... and it's hard to figure out! I don't recommend this method unless you're short on memory.

Another method:
Code: [Select]
section .data
hextable db "0123456789ABCDEF"
this might have the "advantage" that it's one of the few uses I know for "xlatb".

Which brings us to a question... Which instructions do you know? What are you using for a textbook? No sense in my showing you "string" instructions if you don't know them. Take a shot at this and show us what you can get. (write the comments first if you don't know where to start)

Fortunately, "part 2" looks like a completely different program - a completely different world! Going from "display hex" to "just call printf"? Why not use printf to display hex? Well, I'm not teaching the course...

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 18, 2017, 04:01:35 AM
This is what I have come up with so far:

Code: [Select]
bits 32

section .text

global _start

_start:

        mov ebx, 0              ; set up stdin for keyboard input
        mov eax, 3              ; sys call for read- returns #bytes typed into eax
        int 0x80                ; kernel trap

        ;mov esi, input_buffer
        ;mov edi, output_buffer

returnkey:                      ; loop that will terminate the input if return is pressed
        mov



done:
        mov ebx, 0
        mov eax, 1
        int 0x80

section .data
hex_table db "0123456789ABCDEF"

section .bss

input_buffer resb 200   ; declare 200 bytes for keyboard input

Does this appear to be on the right track?
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 18, 2017, 04:37:42 AM
Yeah, pretty much...
Code: [Select]
bits 32

section .text

global _start

_start:

        mov ebx, 0              ; set up stdin for keyboard input
        mov eax, 3              ; sys call for read- returns #bytes typed into eax
; need a buffer to read  into in ecx
        mov ecx, input_buffer
; and the (maximum) count
        mov edx, 200
        int 0x80                ; kernel trap

        mov esi, input_buffer
        mov edi, output_buffer
        mov ecx, eax ; count of bytes typed
        mov ebx, hex_table
        xor edx, edx ; counter for output bytes to print

; sys_read does this
;returnkey:                      ; loop that will terminate the input if return is pressed
 ;       mov

top:
; get a byte/character into al
        lodsb
        push eax ; save a copy
; isolate high nibble
        shr al, 4
        xlat ; alias of xlatb
        stosb ; put it in output buffer
        pop eax ; get our byte back
        and al, 0x0F ; isolate low nibble
        xlatb ; alias of xlat - take your pick
        stosb ; put in output buffer
        mov al, ' ' ; a space
        stosb
        add edx, 3
        loop top  ; do 'em all
        mov al, 10 ; linefeed
        stosb
        inc edx

; now print it out
        mov eax, 4 ; sys_write
        mov ebx, 1 ; stdout
        mov ecx, output_buffer
        ; edx should be all set
        int 0x80

done:
        mov ebx, 0
        mov eax, 1
        int 0x80

section .data
hex_table db "0123456789ABCDEF"

section .bss

input_buffer resb 200   ; declare 200 bytes for keyboard input
output_buffer resb 601 ; 3 bytes for each input byte, plus 1 for the final linefeed

That's untested - "should be" pretty close... I'll try it and get back to ya...

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 18, 2017, 04:55:15 AM
Yeah... looks okay...
Code: [Select]
; from turtle13 on the Forum

; since Nasm will do many things, I like to put the command line in the file
; nasm -f elf32 myprog.asm
; ld -o myprog myprog.o -m elf_i386

bits 32

section .text

global _start

_start:

        mov ebx, 0              ; set up stdin for keyboard input
        mov eax, 3              ; sys call for read- returns #bytes typed into eax
; need a buffer to read  into in ecx
        mov ecx, input_buffer
; and the (maximum) count
        mov edx, 200
        int 0x80                ; kernel trap

        mov esi, input_buffer
        mov edi, output_buffer
        mov ecx, eax ; count of bytes typed
        mov ebx, hex_table
        xor edx, edx ; counter for output bytes to print

; sys_read does this
;returnkey:                      ; loop that will terminate the input if return is pressed
 ;       mov

top:
; get a byte/character into al
        lodsb
        push eax ; save a copy
; isolate high nibble
        shr al, 4
        xlat ; alias of xlatb
        stosb ; put it in output buffer
        pop eax ; get our byte back
        and al, 0x0F ; isolate low nibble
        xlatb ; alias of xlat - take your pick
        stosb ; put in output buffer
        mov al, ' ' ; a space
        stosb
        add edx, 3
        loop top  ; do 'em all
        mov al, 10 ; linefeed
        stosb
        inc edx

; now print it out
        mov eax, 4 ; sys_write
        mov ebx, 1 ; stdout
        mov ecx, output_buffer
        ; edx should be all set
        int 0x80

done:
        mov ebx, 0
        mov eax, 1
        int 0x80

section .data
hex_table db "0123456789ABCDEF"

section .bss

input_buffer resb 200   ; declare 200 bytes for keyboard input
output_buffer resb 601 ; 3 bytes for each input byte, plus 1 for the final linefeed

I probably should have left more for you to do, but I wanted to test it.

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 18, 2017, 11:49:30 PM
Ahhh, yeah. I definitely gave you too much here. I suppose what I should have done is completed your code - to make sure what I was telling you was correct - and kept it to myself, dribbling it out to you an instruction or two at a time. My bad. Sorry.

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 19, 2017, 01:19:02 AM
No worries Frank, you can dribble out my next assignment when it comes up. At this point spoon feeding is probably best, because I end up slobbering all over myself  ;D

A few questions for you on your code and what it is supposed to be doing so that I can learn from it:



; from turtle13 on the Forum

; since Nasm will do many things, I like to put the command line in the file
; nasm -f elf32 myprog.asm
; ld -o myprog myprog.o -m elf_i386

bits 32

section .text

global _start

_start:

        mov ebx, 0              ; set up stdin for keyboard input
        mov eax, 3              ; sys call for read- returns #bytes typed into eax
; need a buffer to read  into in ecx
        mov ecx, input_buffer 1. What exactly is going into ecx here? What value does input_buffer have at this point?
; and the (maximum) count
        mov edx, 200 2. Why is 200 declared here, if 200 bytes is already declared for input_buffer?
        int 0x80                ; kernel trap

        mov esi, input_buffer 3. Not too familiar with esi and edi instructions yet.. what are these two lines of code doing?
        mov edi, output_buffer
        mov ecx, eax ; count of bytes typed 4. Where are the values for eax coming from? Because when I look at eax above it holds the value "3" for the sys call read argument
        mov ebx, hex_table
        xor edx, edx ; counter for output bytes to print

; sys_read does this
;returnkey:                      ; loop that will terminate the input if return is pressed
 ;       mov

top:
; get a byte/character into al
        lodsb 5. What does lodsb do in this case?
        push eax ; save a copy
; isolate high nibble
        shr al, 4
        xlat ; alias of xlatb 6. I don't see the xlat command in my instructor's slides, what is it doing here?
        stosb ; put it in output buffer 7. are you referring to the output_buffer variable, if so, how is this instruction doing that?
        pop eax ; get our byte back
        and al, 0x0F ; isolate low nibble
        xlatb ; alias of xlat - take your pick
        stosb ; put in output buffer
        mov al, ' ' ; a space
        stosb
        add edx, 3 8. Why storing 3 in edx?
        loop top  ; do 'em all
        mov al, 10 ; linefeed 9. What is going on here?
        stosb
        inc edx

; now print it out
        mov eax, 4 ; sys_write
        mov ebx, 1 ; stdout
        mov ecx, output_buffer 10. How is the data actually getting passed into output_buffer?
        ; edx should be all set
        int 0x80

done:
        mov ebx, 0
        mov eax, 1
        int 0x80

section .data
hex_table db "0123456789ABCDEF"

section .bss

input_buffer resb 200   ; declare 200 bytes for keyboard input
output_buffer resb 601 ; 3 bytes for each input byte, plus 1 for the final linefeed
Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 19, 2017, 02:07:54 AM
Perhaps you could help to dribble out part 2 of the assignment - I haven't learned much about incorporating the C standard library into NASM code.
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 19, 2017, 04:36:14 AM
Lemme deal with part 1 first. I see I made some assumptions that were apparently not correct...

Quote
; need a buffer to read  into in ecx
        mov ecx, input_buffer 1. What exactly is going into ecx here?
"input_buffer" is the address (the "offset" part of the address - in "flat memory model", which Windows and Linux are, the "base" part of the address is zero) of the buffer, which you had already declared.

Quote
What value does input_buffer have at this point?
The "[contents]" of the buffer are supposedly "uninitialized", but are in fact initialized (by the OS when it loads our program) to zeros - 200 of 'em.

Quote
; and the (maximum) count
        mov edx, 200 2. Why is 200 declared here, if 200 bytes is already declared for input_buffer?
We're telling sys_read how many bytes to read (maximum), that is, how much space is available in the buffer... including the linefeed (Enter key) which terminates sys_read. Although we have asked to reserve 200 bytes, sys_read doesn't know this until we tell it - here. If the pesky user types more than what we've allowed in edx, the excess remains in the OS's input buffer - I think of it as the "keyboard buffer" - and will screw up the next read, ours or the shell's. It is safer to flush this - look for the linefeed and if we don't see it, read more into a "dummy" buffer and throw it away until we do see the linefeed. Since the assignment specified "not more than 200 bytes", I didn't do this. Safer to do it!
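
A minimal sketch of that "flush" idea, written in C around the read() wrapper rather than the raw system call. The name discard_rest_of_line is hypothetical, and reading one byte at a time is only to keep the sketch short. You would call it only when the last byte you read was not a linefeed.

```c
#include <assert.h>
#include <unistd.h>

/* Read and throw away bytes from fd until a linefeed has been consumed
   or the input ends. Returns how many bytes were discarded, or -1 if
   read() reports an error. */
long discard_rest_of_line(int fd)
{
    long discarded = 0;
    char dummy;

    assert(fd >= 0);                /* needs a valid file descriptor */
    for (;;) {
        ssize_t n = read(fd, &dummy, 1);
        if (n < 0)
            return -1;              /* read failed */
        if (n == 0)
            break;                  /* end of input, nothing more to flush */
        discarded++;
        if (dummy == '\n')
            break;                  /* the linefeed ends the line */
    }
    return discarded;
}
```

In the assembly version the same loop would be a sys_read into a one-byte "dummy" buffer, repeated until the byte is 0Ah or eax comes back as zero or negative.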

Quote
        mov esi, input_buffer 3. Not too familiar with esi and edi instructions yet.. what are these two lines of code doing?
        mov edi, output_buffer
I asked if you knew the "string" instructions, and then didn't wait for an answer. The "string" instructions - lodsb, stosb, movsb, scasb, cmpsb, insb, and outsb - and their "w" and "d" and "q" friends (I think that's all of them) all use (e|r)si as a source, and es:(e|r)di as a destination. Since we're in "flat" memory model, ds: and es: refer to the same memory, so you don't need to worry about the "es:" part (in real mode you do). "lodsb" loads al from [esi] and advances esi to point to the next byte. "stosb" stores al in [edi] and advances edi to point to the next byte. "lodsw" loads ax and advances esi by two bytes, etc, etc, etc. Here, we're just setting up the "source index" and "destination index" for future use.
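
If it helps, a lodsb/stosb pair behaves much like the pointer post-increment idiom in C. The sketch below is only a model, assuming the direction flag is clear so the pointers move "up"; the function name and the variable al are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Model of a lodsb/stosb loop: src stands in for esi, dst for edi,
   and the local variable al for the al register. */
size_t copy_like_lodsb_stosb(const unsigned char *src, unsigned char *dst, size_t len)
{
    size_t copied = 0;
    assert(src != NULL && dst != NULL);

    while (copied < len) {
        unsigned char al = *src++;  /* lodsb: load al from [esi], then esi += 1 */
        *dst++ = al;                /* stosb: store al at [edi], then edi += 1 */
        copied++;
    }
    return copied;
}
```

In the hex program, the work on al (shift, mask, translate) happens between the load and the store.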

Quote
        mov ecx, eax ; count of bytes typed 4. Where are the values for eax coming from? Because when I look at eax above it holds the value "3" for the sys call read argument
Right. But when sys_read returns - when it sees the Enter key being hit - eax holds the number of bytes actually read... or an error number. By rights, it is always wise to check for an error! Since reading and writing to stdin and stdout are "unlikely" to encounter an error, I skipped that part and ASSumed no error... thus, bytes read. If you were reading/writing to a disk file, for example, it would be important to check for errors (a negative number between -1 and -4095). "man 2 read" claims that it returns -1 and the actual error number is in "errno", but that's the "C wrapper" - we get the negative of the error number in eax. "errno.h" has the numbers.
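
The error check described above can be sketched in C as a test on the raw value that comes back in eax. The helper names is_syscall_error and syscall_errno are invented for this illustration; the -4095..-1 range is the one mentioned in the post.

```c
#include <assert.h>

/* A raw Linux system call reports failure as a value in -4095..-1
   (the negated error number). Anything else is a normal result. */
int is_syscall_error(long ret)
{
    return ret < 0 && ret >= -4095;
}

/* Turn a raw return value into the positive error number, or 0 on success. */
int syscall_errno(long ret)
{
    if (!is_syscall_error(ret))
        return 0;
    int err = (int)-ret;
    assert(err >= 1 && err <= 4095);    /* sanity check on the range */
    return err;
}
```

In assembly the equivalent is a compare of eax against -4095 after the int 0x80, followed by an unsigned "jae" to an error handler.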

Quote
top:
; get a byte/character into al
        lodsb 5. What does lodsb do in this case?
Loads al from [esi] (in input_buffer) and advances esi to point to the next byte.

Quote
        push eax ; save a copy
; isolate high nibble
        shr al, 4
        xlat ; alias of xlatb 6. I don't see the xlat command in my instructor's slides, what is it doing here?
Essentially, mov al, [ebx + al]. There's no such instruction - the sizes don't match - but that's about what xlat does. If it does not appear in your instructor's slides, perhaps you're not supposed to be using it. There are other ways, but it may be too late!
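
In C terms, xlat is just an array lookup where the index and the result share one variable. A small sketch, with xlat_like and byte_to_hex_via_table as made-up names and hex_table borrowed from the program above:

```c
#include <assert.h>

const char hex_table[] = "0123456789ABCDEF";

/* Roughly what xlat does: al = table[al], with ebx holding the table address. */
unsigned char xlat_like(const char *table, unsigned char al)
{
    return (unsigned char)table[al];
}

/* High nibble first, then low nibble, each translated through the table. */
void byte_to_hex_via_table(unsigned char byte, char out[2])
{
    unsigned char high = byte >> 4;
    unsigned char low = byte & 0x0F;
    assert(high < 16 && low < 16);  /* both indexes stay inside the 16-entry table */
    out[0] = (char)xlat_like(hex_table, high);
    out[1] = (char)xlat_like(hex_table, low);
}
```

If xlat is off limits for the assignment, an indexed load from the table into a byte register, with the index zero-extended into a full register first, would presumably do the same job.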

Quote
        stosb ; put it in output buffer 7. are you referring to the output_buffer variable, if so, how is this instruction doing that?
It's the other one of those "string" instructions that we're using. Moves al - now "translated" to a hex digit - to [edi], which we pointed to "output_buffer", and advances edi to the next byte.

This reminds me of an important point that I haven't mentioned! In the "flags register" - where the zero flag, carry flag, etc. live - there's a "direction flag" whose function is to control the direction of the "string" instructions. If the flag is set - the "std" instruction - the "string" instructions work "down". That is, esi or edi will be decremented to point to the previous byte. It is considered rude to leave this flag pointed "down". It is wise to do "cld" to clear the direction flag so we go "up". I never set it "down" so I ASSumed it was set "up". This is a fairly serious error in the code I posted! Really should do "cld" before using the "string" instructions... unless you want to work "down" which is sometimes useful. My bad!

Quote
        add edx, 3 8. Why storing 3 in edx?
We're not "storing" it, we're adding it. We zeroed edx at the beginning of this loop, and we've added 3 bytes to our output buffer - the two hex digits and the space. When we're done, this will be the number of bytes to print (sys_write).
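
The bookkeeping can be checked with a C sketch of the whole loop: three output bytes per input byte, plus one for the final linefeed, which is where the 601 in the .bss section comes from (3 * 200 + 1). The name hex_dump_line is hypothetical, and count plays the role of edx.

```c
#include <assert.h>
#include <stddef.h>

/* Write "HH " for every input byte, then one linefeed.
   out must have room for 3 * len + 1 bytes. Returns the bytes written. */
size_t hex_dump_line(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t count = 0;                       /* running output length, like edx */

    for (size_t i = 0; i < len; i++) {
        out[count++] = digits[in[i] >> 4];      /* high nibble */
        out[count++] = digits[in[i] & 0x0F];    /* low nibble */
        out[count++] = ' ';                     /* exactly one space */
    }
    out[count++] = '\n';                    /* trailing linefeed */

    assert(count == 3 * len + 1);
    return count;
}
```

For the five bytes of "ABCD\n" this returns 16, the length that would then be handed to sys_write.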
 
Quote
       loop top  ; do 'em all
This is the same as (except that "loop" doesn't touch the flags):
Code: [Select]
dec ecx
jnz top
You probably knew that one...

Quote
       mov al, 10 ; linefeed 9. What is going on here?
        stosb
        inc edx
We're just adding a linefeed to the end of the buffer, and counting it. Just for a neater display - without it the shell prompt would be on the same line as our output.

Quote
; now print it out
        mov eax, 4 ; sys_write
        mov ebx, 1 ; stdout
        mov ecx, output_buffer 10. How is the data actually getting passed into output_buffer?
We put it there with the "stosb"s.

Quote
        ; edx should be all set
... 'cause we counted 'em!

I'll get to "part 2" soon. Remind me if I don't.

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 19, 2017, 08:33:18 AM
Okay, this is about as rudimentary as we can get. We're not really "incorporating" the C library into Nasm, but merely calling the C library from Nasm-written code. There is virtually no advantage to this compared to simply calling it from C. But we can do it.
Code: [Select]
; nasm -f elf32 hwprintf.asm
; gcc hwprintf.o -o hwprintf
; ./hwprintf

global main
extern printf

section .data
    fmtstr db 'Hello, World!',10,0
section .text
    main:

    push dword fmtstr
    call printf
    add esp,4

    ret
   

"global" tells Nasm to tell ld that "main" is here, in case anybody's looking for it. There is in fact some "startup code" - the thing that calls "main" - that will be looking for it.

"extern" tells Nasm that we'll be looking for "printf". ld will find it for us... in the C library, of course.

You will notice that we don't call ld. We call gcc and let gcc call ld for us. We can use ld directly, but we have to tell ld that we want libc, and perhaps where to find it, and we need to tell ld which "interpreter" aka "dynamic linker" we want, and where to find it. For some reason, by default ld looks for an interpreter that's not (usually) there. This produces the interesting error "file not found" when we run our executable, even though we can see it right there! gcc knows where to find all this crap and other stuff to tell ld. Easier to use it, even though there's no C code to compile.

I have exited with "ret", rather than with "exit()". I like to "prove" that I haven't butchered the stack. It is easy to add "exit" (as you're "supposed" to) and it will cover a lot of errors we could make. For now, I'm going with "ret".

Do you know C? Know how to use "printf"? Whether or not, feel free to fool with this. Alter it and see what it does. Try using "exit()". You won't break anything. (probably... my lawyer makes me say that).

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 19, 2017, 09:15:08 AM
This is a second baby step towards finding those command line parameters and environment variables. They're on the stack, of course...

Code: [Select]
; nasm -f elf32 argv.asm
; gcc argv.o -o argv
; ./argv one two three

global main
extern printf

section .text
    main:
    mov eax, [esp + 8] ; find **argv on the stack - **envp is at + 12
    mov eax, [eax + 12] ; + 4 for "one", + 8 for "two"
    push eax

    call printf
    add esp,4

    ret
   

Again, feel free to play with this and see what it does.

Do you even know what the "stack" is? You probably know "push" and "pop". Do you know that "call" and "ret" use the stack?  Do you know how parameters are passed to subroutines on the stack? We can come up with some even simpler examples if need be. I'd like to think that they're teaching you something, not just handing out assignments and letting you flounder around. Well, if you've got questions we can discuss 'em...

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 19, 2017, 08:55:36 PM
thanks Frank,

in reference to your previous two posts, yes I am familiar with C as it was the prerequisite for this assembly class. The way it is being taught requires a bit of outside research to figure out how to complete these assignments which is a pain. When I'm in class learning it seems to make sense, then when I actually have to sit down and write something from scratch all of a sudden it's ancient Sumerian language.

You have included the line "add esp, 4" after the call to printf. If I understand it correctly, that decrements the stack by 4 bytes, but what is the end result in this case? Is it to clean up the stack?

When trying to link the program using your "gcc argv.o -o argv" instruction I get the error:
/usr/bin/ld: i386 architecture of input file `assign3_part2.o' is incompatible with i386:x86-64 output
collect2: error: ld returned 1 exit status

(The VM that we are using for this class is 32 bit Linux Fedora, so I'm guessing there is an option missing in gcc to link it to that?)
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 19, 2017, 10:02:56 PM
Yeah, if using a 64-bit system (I'm not) we have to indicate to gcc that we want a 32-bit program - not a 64-bit program. "-m32" should do it.

The "cdecl" calling convention specifies that the caller cleans up stack...
Code: [Select]
push (some parameter to printf)
call printf
add esp, 4 ; "clean up stack"
If we had more than one parameter to printf (fairly likely), we would "add esp, 4 * 2", or whatever. We could write it as "add esp, 8", of course, but it's easy to change the number of parameters - 4 bytes per parameter (we've got a 32-bit stack) * number of parameters.

An alternative, the "stdcall" convention, has the callee clean up the stack. The subroutine ends with "ret 4" (or some number of bytes - not the number of parameters). Windows uses this. It is not suitable for a function that has a variable number of parameters (like printf). ("vararg")

The "return value" from the function is in eax. printf returns the number of characters printed - not very useful. This is not the same as "ret 4".

You can learn Sumerian - the Sumerians did it!

Best,
Frank



Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 20, 2017, 12:56:03 AM
so far I am able to get the program to print my "Hello World!" variable but not the "envp" stuff:

Code: [Select]
bits 32

global main

extern printf
extern exit

section .data
        test_str db 'Hello World!', 10, 0

section .text
main:
        push test_str
        call printf
        pop eax

        xor eax, eax
        call exit
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 20, 2017, 02:53:03 AM
Okay. I think "exit" should have a parameter pushed on the stack. No need to "clean up stack" after, since we don't return. Doesn't matter too much. We can see the latest return value with "echo $?". You might try "push 42", just to see if you can see it. Traditional to return zero to indicate "no error".

What have you tried to see argv and envp?

My little example was a little too "bare bones" to see anything but one-at-a-time. Since printf uses eax to return "number of characters printed", we probably don't want to use that. C library is also allowed to trash ecx and edx. We can count on ebx, esi, and edi - C expects us to not alter them, also. Using "exit" will probably allow us to get away with that. We can create a "stack frame" with ebp as a "frame pointer"...
Code: [Select]
main:
    push ebp ; save caller's ebp - they're probably using it for the same thing we are
    mov ebp, esp ;pointer to stack frame
    ; sub esp, ??? if we need "local variables"

    push ebx ; C likes us to preserve these registers
    push esi
    push edi

    ; our code

    pop edi
    pop esi
    pop ebx

    mov esp, ebp ; destroy stack frame
    pop ebp
    ret
When we arrived at "main", main's return address was at [esp], argc was at [esp + 4], **argv was at [esp + 8], and **envp was at [esp + 12]. Since we pushed ebp, these values are increased by 4. So **argv should be at [ebp + 12]. Your assignment writes it as "*argv[]". Means the same thing - I like to stress  that it's "pointer to pointer". We have to "dereference" it before it's useful to us.

We ignore argc, although it's often useful. It should always be at least 1 - the program name. We can use it to count command line arguments, or there should be a zero at the end of 'em. I usually do this without the assistance of libc. From the "_start" label, it's not called, so there's no return address - argc is the first thing on the stack. So I'm less familiar with what we're going to find doing it "C style" - I think that zero's still there...
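
For comparison, here is how the "walk until the zero" idea looks in C. It assumes, as C guarantees for argv and as is conventional for envp, that each array of pointers ends with a NULL entry. The helper name print_strings is made up for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Print each string in a NULL-terminated array of pointers, one per line.
   Returns how many strings were printed. */
int print_strings(char **list)
{
    int count = 0;
    assert(list != NULL);
    while (*list != NULL) {         /* the zero at the end stops the loop */
        printf("%s\n", *list);
        list++;                     /* next pointer: 4 bytes further in 32-bit code */
        count++;
    }
    return count;
}
```

A main with the assignment's prototype would call print_strings(argv) and then print_strings(envp). The assembly version would have the same shape: load a pointer, compare it with zero, push it along with a "%s" format, call printf, add 4 to the index register, and repeat.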

Code: [Select]
;...
main:
    push ebp
    mov ebp, esp ; set up the frame pointer before using [ebp + 12]
    push ebx
    push esi
    push edi

    mov ebx, [ebp + 12]
    mov esi, [ebx] ; plus zero should be program name
    push esi
    call printf
    ; etc.

If we only pressed this into a clay tablet with a stylus, it would be "clear as mud".

Best,
Frank


Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 21, 2017, 12:41:48 AM
Instead of mud it's more like quicksand at this point  ;D

Here's my code so far incorporating your teachings. I'm a slow learner for sure. That's why they call me turtle. Actually, only I call myself that.

I have commented out some lines that I think I may or may not need, your comments greatly appreciated.

Also when I try to compile with gcc I get error:

/usr/bin/ld:assign3_part2.asm: file format not recognized; treating as linker script
/usr/bin/ld:assign3_part2.asm:1: syntax error
collect2: error: ld returned 1 exit status


Code: [Select]
bits 32

global main

extern printf
extern exit

section .data
        ;test_str db 'Hello World!', 10, 0

section .text
main:
        push ebp                         ; save caller's frame pointer
        ;mov ebp, esp                    ; set up frame pointer
        push ebx
        push esi
        push edi     

        ; code
        mov ebx, [ebp + 12]
        mov esi, [ebx]
        push esi

        pop edi
        pop esi
        pop edx

        mov esp, ebp                    ; clean up stack frame
        pop ebp
        ret                           
       
        ;mov eax, [esp + 8]              ; this is where argv is on stack           
       
        ;mov eax, [eax + 12]             ; the envp parameters are stored here
       


        ;push test_str
        call printf
        pop eax

        xor eax, eax
        call exit
        push 0
        add esp, 4

The further I go the less I know!
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 21, 2017, 02:43:35 AM
We used to say, "the more you learn, the more you forget. the more you forget, the less you know. why bother?" Or something like that. Actually, I forget what it was we used to say. It was a long time ago... but still proves the point...

Back to our clay tablet: On a 64-bit system, there is a possibility that the 32-bit C library is not installed. That's a game-ender. The solution is something like "apt-get install gcc-multilib" or some such. Due to a poor choice of Linux distro (I think), I never had any luck with that. I repeatedly wound up with a MS EULA, of all things. I said, "yeah yeah I agree" but still no go. It was my daughter's old machine, and shortly died entirely. I have a 64-bit machine, I just need to shovel off a space to put it. I am in no hurry!

However, the error message suggests a different problem. gcc, or ld, should never see the .asm file at all. First, assemble it with Nasm to an .o file, and then pass the .o file to gcc.
Code: [Select]
nasm -f elf32 assign3_part2.asm
gcc -m32 assign3_part2.o -o assign3_part2
Let us hope you have better luck with that.

Now... or whenever... you do need a commented-out line:
Code: [Select]
; told ya it was a good idea to do this...
; nasm -f elf32 assign3_part2.asm
; gcc -m32 assign3_part2.o -o assign3_part2
; ./assign3_part2 one two three

bits 32

global main

extern printf
extern exit

section .data
        ;test_str db 'Hello World!', 10, 0
        ; you don't use it yet, but you probably will want
        ; Nasm accepts C "escape sequences" if we
        ; use "back quotes"
        format db `%s\n\0`
        ; or
        ; format db "%s", 10, 0
section .text
main:
        push ebp                         ; save caller's frame pointer
        ; you do need this
        ; it is part of the normal "prolog"
        mov ebp, esp                    ; set up frame pointer

        push ebx
        push esi
        push edi     

        ; code
        mov ebx, [ebp + 12]
        mov esi, [ebx]
        push esi
        ; now of course you want:
        call printf
        pop eax ; or whatever

        pop edi
        pop esi
        ;pop edx same typo I made - I thought I fixed it
        pop ebx

        mov esp, ebp                    ; clean up stack frame
        pop ebp
        ret                           
        ; if you stop here, you should see the program name

In the part I've snipped, the idea was to push the parameter before calling "exit".

See if that gets you any closer...

Best,
Frank

Edit: tested, and works so far... on my 32-bit system


Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 21, 2017, 06:20:38 AM
Thanks for the gcc help. I was not paying attention and was trying to compile the .asm file instead of the .o file, as you pointed out.

My code so far, there must be something fundamental I am not understanding here. It returns the name of the program but nothing else.

Code: [Select]
bits 32

global main

extern printf
extern exit

section .data

        format db "%s", 10, 0

section .text
main:
        push ebp                        ; save caller's frame pointer
        mov ebp, esp                    ; set up frame pointer
       
        push ebx
        push esi
        push edi     

        mov ebx, [ebp + 12]
        mov esi, [ebx]
        push esi

        call printf
        pop eax

        pop edi
        pop esi
        pop ebx

        mov esp, ebp                    ; clean up stack frame
        pop ebp
        ret                           
       
        mov eax, [esp + 8]              ; this is where argv is on stack           
       
        pop eax

        mov eax, [eax + 12]             ; the envp parameters are stored here
       

        pop eax

        xor eax, eax
        call exit
        push 0
        add esp, 4
        ret
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 21, 2017, 04:28:36 PM
Good progress getting it to compile.

All you ask for is argv[0]. At your first "ret", your program returns from "main". Code after that is never reached. To get the next argument, you'll want to add 4 to ebx. I do it as [ebx + edi] and "add edi, 4", but I think you could just "add ebx, 4". To know when to stop, check if esi is zero... or you could get "argc" off the stack. Don't try to use "loop" - printf trashes ecx! After printing args, get "envp" off the stack at [ebp + 16]. I think you'll find that it's just another 4 up the stack from where we were. The next time esi is zero, you're all done. Then "ret" or "exit" (not both).

On my machine, there are a lot of environment variables - much more than a screenful. You might want to pipe it into "less".

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 21, 2017, 06:44:10 PM
Getting segmentation fault at this point:

Code: [Select]
; nasm -f elf32 -g assign3_part2.asm -o assign3_part2.o
; gcc -m32 assign3_part2.o -o assign3_part2
; Prints argv and envp values for a C library function

bits 32

global main

extern printf
extern exit

section .data

        format db "%s", 10, 0

section .text
main:
        push ebp                        ; save caller's frame pointer
        mov ebp, esp                    ; set up frame pointer
       
        push ebx
        push esi
        push edi     

        mov ebx, [ebp + 12]
        mov esi, [ebx]
        push esi

        call printf

args:
       
        pop eax

        pop edi
        pop esi
        pop ebx

        mov esp, ebp                    ; clean up stack frame
        pop ebp                         
        add ebx, 4                      ; advance to next argv
        cmp esi, 0
        je done                         ; finishes if no more args   
       
        mov eax, [esp + 8]              ; this is where argv is on stack           
       
        pop eax

        mov eax, [eax + 12]             ; the envp parameters are stored here
       

        pop eax
        jmp args

done:

        ;ret       
        xor eax, eax
        call exit
        push 0
        add esp, 4
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 21, 2017, 07:42:56 PM
Yeah, you're popping too much stuff. I haven't been clear...
Code: [Select]
; nasm -f elf32 -g assign3_part2.asm -o assign3_part2.o
; gcc -m32 assign3_part2.o -o assign3_part2
; Prints argv and envp values for a C library function

bits 32

global main

extern printf
extern exit

section .data

        format db "%s", 10, 0

section .text
main:
; prolog
        push ebp                        ; save caller's frame pointer
        mov ebp, esp                    ; set up frame pointer

; save regs that C wants preserved
        push ebx
        push esi
        push edi     

        mov ebx, [ebp + 12]

top:
        mov esi, [ebx]
   
        test esi, esi ; or cmp esi, 0
        jz done_with_args

        push esi
        push format ; might as well use it
        call printf
        add esp, 8 ; or pop two "dummy" regs

        add ebx, 4
        jmp top
done_with_args:

; do envp here - mov ebx, [ebp + 16], etc.
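; One possible sketch of that envp loop (not from the original post).
; It assumes the C startup code passes envp as the third parameter to
; main, at [ebp + 16]. The labels env_top and done_with_env are made up
; for this illustration.
        mov ebx, [ebp + 16]             ; ebx -> envp[0]
env_top:
        mov esi, [ebx]                  ; next environment string pointer
        test esi, esi                   ; zero pointer ends the list
        jz done_with_env

        push esi                        ; string for "%s"
        push format
        call printf
        add esp, 8                      ; remove both parameters

        add ebx, 4                      ; advance to the next pointer
        jmp env_top
done_with_env: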

; get back regs C wants preserved
        pop edi
        pop esi
        pop ebx

; epilog
        mov esp, ebp                    ; clean up stack frame
        pop ebp                         
; "leave" does the same as these two lines

        ret ; or call exit

Some of the comments near the end of your code about where **argv and **envp are located are not correct.

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 22, 2017, 02:16:13 AM
After getting some help today from my professor and doing a little research online, I have rewritten my code from the ground up. Still getting a seg fault, but I'm getting there:

Code: [Select]
bits 32

section .text
        global main

        extern printf


section .data
        format db "%s", 10, 0

main:

        add esp, 8                      ; argv[0] each stack frame 4 bytes (esp + 4 = argc)

argv_loop:

        cld                             ; clear the direction flag to increment
        lodsd                           ; pointer to string at esi is loaded into eax

        cmp eax, 0                      ; have we reached the end of argv[]?
        je envp_loop                    ; jump if yes
        push eax                        ; push parameter for printf
        push dword format               ; push the format for printf
        call printf                     ; print the argv[] parameter
        pop esi
        pop eax       
        add esp, 4
        jmp argv_loop

envp_loop:

        cld
        lodsd   
 
        cmp eax, 0                      ; have we reached end of envp[]?
        je done                         ; jump to done if yes
        push eax
        push dword format       
        call printf                     ; otherwise, print the envp value       
        pop esi
        pop eax       
        add esp, 4
        jmp envp_loop

done:
        mov ebx, 0                      ; exits program
        mov eax, 1
        int 0x80
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 22, 2017, 03:25:21 AM
I'm afraid I've totally lost the plot. I can confirm that it segfaults. It appears to be in libc somewhere - it isn't in your code. Before the segfault, I see one environment variable.

I can see that your entire code is in "section .data". I can see that you use "lodsd" without initializing esi to anything useful. Beyond that, I'll have to study it some more. Honestly, I think your last attempt was a lot closer.

Best,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 22, 2017, 04:42:09 AM
Well, I remain quite lost. After your call to printf, you pop esi. At this point, esi ought to point to your format string. This can't be right, can it? I have commented this out - both times. I have initialized esi to [esp], which "might" be right. Maybe. To my total astonishment, this seems to work. I really don't know why, but if it works it works.

Code: [Select]
bits 32

section .text
        global main

        extern printf


section .data
        format db "%s", 10, 0

section .text

main:

        add esp, 8                      ; argv[0] each stack frame 4 bytes (esp + 4 = argc)

        mov esi, [esp]                  ; point esi at... something

argv_loop:

        cld                             ; clear the direction flag to increment
        lodsd                           ; pointer to string at esi is loaded into eax

        cmp eax, 0                      ; have we reached the end of argv[]?
        je envp_loop                    ; jump if yes
        push eax                        ; push parameter for printf
        push dword format               ; push the format for printf
        call printf                     ; print the argv[] parameter
;        pop esi
        pop eax       
        add esp, 4
        jmp argv_loop

envp_loop:
        cld
        lodsd   
 
        cmp eax, 0                      ; have we reached end of envp[]?
        je done                         ; jump to done if yes
        push eax
        push dword format       
        call printf                     ; otherwise, print the envp value       
;        pop esi
        pop eax       
        add esp, 4
        jmp envp_loop

done:
        mov ebx, 0                      ; exits program
        mov eax, 1
        int 0x80

Baffled,
Frank

Title: Re: Help with converting keyboard input into its hex output
Post by: turtle13 on August 22, 2017, 05:38:35 AM
Thank you mucho, Baffled Frank and Best Frank. After removing the pop esi and loading esi with the argv pointer (at esp + 8), the rest of the code works great. Finally I can turn this SOB in, and now on to the next assignment...
Title: Re: Help with converting keyboard input into its hex output
Post by: Frank Kotler on August 22, 2017, 04:12:43 PM
I don't know why I didn't see this earlier. I think the "add esp, 8" threw me for a loop. You "can't do that". That trashes the return address from "main"! Of course if you're not returning from main, that's not a problem. Throwing away "argc" also - we don't need it. That puts us at "**argv" - right where we want to be. I've used "lodsd" before for much the same purpose.

If we skip the C startup code, from "_start:", there's no return address - we start with "argc" and then the arguments then a zero then the environment variables. You've got almost the same thing. So I shouldn't be surprised that it worked. That really is a better approach than what you were doing originally. Good job!
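For anyone who wants to see that "_start:" layout in action, here is a minimal sketch, assuming 32-bit Linux and no C library at all. The file name stackwalk and the helper print_list are made up for this illustration. It relies only on the layout described above: "argc", then the argument pointers, then a zero, then the environment pointers, then another zero.

```nasm
; nasm -f elf32 stackwalk.asm -o stackwalk.o
; ld -m elf_i386 stackwalk.o -o stackwalk
; ./stackwalk one two three

bits 32

global _start

section .data
        newline db 10

section .text
_start:
        ; at entry: [esp] = argc, [esp + 4] = argv[0], ...
        lea esi, [esp + 4]              ; esi -> argv[0]; argc is not needed
        cld                             ; make lodsd step upward

        call print_list                 ; arguments, up to the zero terminator
        call print_list                 ; esi is now past that zero: environment

        mov eax, 1                      ; sys_exit
        xor ebx, ebx                    ; exit status 0
        int 0x80

; print each string in the zero-terminated pointer list at esi,
; leaving esi just past the terminating zero
print_list:
.next:
        lodsd                           ; eax = next pointer, esi += 4
        test eax, eax
        jz .done
        mov ecx, eax                    ; ecx = start of string (buffer for write)
        xor edx, edx                    ; edx counts the string length
.len:
        cmp byte [ecx + edx], 0
        je .write
        inc edx
        jmp .len
.write:
        mov eax, 4                      ; sys_write
        mov ebx, 1                      ; stdout
        int 0x80
        mov eax, 4                      ; sys_write again, for the newline
        mov ebx, 1
        mov ecx, newline
        mov edx, 1
        int 0x80
        jmp .next
.done:
        ret
```

Since there is no return address at "_start:", the program has to leave through the exit system call rather than "ret". The "call print_list" does push a return address, but esi already holds an absolute address by then, so the walk is not disturbed.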

Best,
Frank