NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: Nicholas Westerhausen on February 01, 2013, 07:06:53 PM

Title: Problem with itoa code
Post by: Nicholas Westerhausen on February 01, 2013, 07:06:53 PM: Hello NASM forums,

( I just tried to post this and the post button wasn't working, so I am retyping it )

Basically, I am writing a program that needs to print integers to STDOUT. I have researched many ways of doing this: some used 16-bit registers, some were written in TASM/MASM, some used 64-bit registers, and some used 32-bit registers but were very opaque. I finally understood what was being done, and wrote a modified version of the itoa code from Bare Metal OS.

For trouble-shooting purposes, I stripped out the itoa code I wrote and put it into a separate .asm to try and figure out where I was going wrong. Here it is:
Code:
section .text
global _start

_start:
    mov eax, 1231
    mov ebx, 10
    xor ecx, ecx

    ;; Now is the main loop of the conversion. It takes each
    ;; digit off of the input starting at the ONES place and
    ;; moving upward until it reaches the end of the number
godivide:
    xor edx, edx        ; clear remainder
    div ebx             ; divide eax by 10
    push edx            ; put edx on stack
    inc ecx             ; increment number of digits
    cmp eax, 0          ; if eax
    jne godivide        ; is not equal to zero, divide again

    ;; Next we have to go through each digit and add 0x30 ('0')
    ;; to each digit so that it will print correctly
    ;; For simplicity, each digit is written as it is converted
nextdigit:
    pop eax             ; pull off the stack
    add eax, 30h        ; add '0' to eax
    push ecx            ; store counter
    xor ecx, ecx
    ;; write the value to STDOUT
    mov edx, 1          ; one byte
    mov ecx, eax        ; char
    mov ebx, 1          ; stdout
    mov eax, 4          ; syswrite
    int 80h
    ;;
    pop ecx             ; retrieve counter
    dec ecx             ; move down through digits
    cmp ecx, 0          ; if ecx
    jg nextdigit        ; is greater than 0, jump nextdigit

    mov eax, 1
    mov ebx, 0
    int 80h

Compiling with
Code:
$ nasm -g -f elf32 itoa-test.asm; ld -m elf_i386 -o itoa-test itoa-test.o

Before the method above, where I'm trying to print each character as I convert it, I was trying to use stosb and lodsb, but I don't understand how they work exactly when I'm not calling a function and getting it back (and if I did that, I don't know how to get the char bytes back). I want to be able to call this as a macro or something so that I can just do a single line in the code for printing the integer to the screen (in line). I have yet to be able to get this to work.

I can watch the registers using gdb and see the correct conversion take place. What I do not see is the printing happening like it should. Does this have to do with forcing i386 architecture? My machine is x86_64 but the code must be 32-bit.

I'm new to NASM (just started last week) and I've spent what feels like ages (but is in reality only like 10-12 hours) researching how to do this. Any advice or tips on this are welcome.

Thanks for your time,
Nick
Title: Re: Problem with itoa code
Post by: Frank Kotler on February 01, 2013, 10:51:10 PM: Don't ya hate it when that happens? I sure do! Lost a long reply over at AsmCommunity just the other day. Thanks for typing it in again!

Here's where things went wrong...
Code:
    ;; write the value to STDOUT
    mov edx, 1          ; one byte
    mov ecx, eax        ; char

For sys_write, ecx needs to point to where the character(s) live (in memory). Just putting the character in ecx won't do it. One place the character could "live" temporarily would be on the stack...
Code:
section .text
global _start

_start:
    mov eax, 1231
    mov ebx, 10
    xor ecx, ecx

    ;; Now is the main loop of the conversion. It takes each
    ;; digit off of the input starting at the ONES place and
    ;; moving upward until it reaches the end of the number
godivide:
    xor edx, edx        ; clear remainder
    div ebx             ; divide eax by 10
    push edx            ; put edx on stack
    inc ecx             ; increment number of digits
    cmp eax, 0          ; if eax
    jne godivide        ; is not equal to zero, divide again

    ;; Next we have to go through each digit and add 0x30 ('0')
    ;; to each digit so that it will print correctly
    ;; For simplicity, each digit is written as it is converted
nextdigit:
    pop eax             ; pull off the stack
    add eax, 30h        ; add '0' to eax
    push ecx            ; store counter
    ; xor ecx, ecx
    ;; write the value to STDOUT
    mov edx, 1          ; one byte
    ; mov ecx, eax      ; char
    ; no, it has to "be" someplace, and ecx point to it
    push eax            ; we can put it on the stack (again!)
    mov ecx, esp        ; and point to it there...
    mov ebx, 1          ; stdout
    mov eax, 4          ; syswrite
    int 80h
    pop eax             ; remove it from the stack (unused)
    ;;
    pop ecx             ; retrieve counter
    dec ecx             ; move down through digits
    cmp ecx, 0          ; if ecx
    jg nextdigit        ; is greater than 0, jump nextdigit

    mov eax, 1
    mov ebx, 0
    int 80h

Obviously, that's got a little more pushin' and poppin' than is ideal, but it works. (I love Linux questions 'cause I can actually try it!) We could perhaps improve the situation somewhat by using some other register besides ecx to count digits. We could provide a buffer for the digit in "section .bss" or so... or all of 'em...
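
As a rough sketch of those two suggestions (not code from the thread): the version below counts digits in esi instead of ecx and keeps the current character in a one-byte buffer in section .bss. The buffer name `digit` and the local labels are made up for illustration, and it assumes the usual 32-bit Linux int 80h behaviour, where only eax is changed by the call.

```nasm
; nasm -f elf32 itoa-bss.asm ; ld -m elf_i386 -o itoa-bss itoa-bss.o
section .bss
    digit resb 1            ; hypothetical one-byte buffer for the current character

section .text
global _start

_start:
    mov eax, 1231
    mov ebx, 10
    xor esi, esi            ; esi counts digits, so ecx stays free for sys_write
.divide:
    xor edx, edx            ; clear remainder
    div ebx                 ; eax = quotient, edx = remainder (one digit)
    push edx                ; digits come out lowest first, so stack them
    inc esi
    test eax, eax
    jnz .divide
.print:
    pop eax                 ; highest remaining digit
    add al, '0'             ; convert to ASCII
    mov [digit], al         ; the character now lives in memory
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    mov ecx, digit          ; ecx = address of the character, not the character
    mov edx, 1              ; one byte
    int 80h
    dec esi
    jnz .print

    mov eax, 1              ; sys_exit
    xor ebx, ebx
    int 80h
```

This drops the push/pop pair around each write, at the cost of still making one system call per digit.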

As you know, "div" gives us the digits in the "wrong order". Pushing them on the stack and popping them off seems "easy" to me, and it's what I usually show to "beginners". There are better ways to do it. We could start at the right end of the buffer (put in a zero terminator... or maybe a linefeed "last" if you want) and work leftwards... until we run out of digits. You could remember this position and start printing there... or you could pad the rest of the buffer with spaces (right-justified numbers look nice in a column). Sys_write will need both the start position and how many characters. "Calling conventions" return only one thing, but there's no reason you can't return both if you want to! It is plausible to use ecx to point into your buffer throughout, and be ready for sys_write. Since "div" uses edx, we have to use some other register to count digits and swap it back to edx for the sys_write.
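
One way the "fill the buffer from the right and return both things" idea might look is sketched below. The routine name `u32toa` and the buffer `numbuf` are invented for this illustration; the routine returns the start address in ecx and the length in edx, which is exactly what sys_write wants, and it clobbers eax, ebx and esi.

```nasm
; nasm -f elf32 u32toa.asm ; ld -m elf_i386 -o u32toa u32toa.o
section .bss
    numbuf resb 11              ; hypothetical buffer: up to 10 digits plus a linefeed

section .text
global _start

; u32toa (hypothetical name): convert the unsigned value in eax to decimal text.
; Returns ecx = address of the first digit, edx = length including the linefeed.
u32toa:
    mov ecx, numbuf + 10        ; start at the last byte of the buffer
    mov byte [ecx], 10          ; linefeed goes in "last"
    mov esi, 1                  ; characters stored so far
    mov ebx, 10
.next:
    xor edx, edx
    div ebx                     ; eax = quotient, edx = remainder
    add dl, '0'                 ; remainder to ASCII
    dec ecx                     ; work leftwards
    mov [ecx], dl
    inc esi
    test eax, eax
    jnz .next
    mov edx, esi                ; "div" is done with edx, so hand it the count
    ret

_start:
    mov eax, 1231
    call u32toa                 ; ecx and edx come back ready for sys_write
    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; stdout
    int 80h

    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

Because the conversion is separate from the printing, the same routine could feed a file descriptor other than stdout, or be padded with spaces for right-justified columns.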

We can discuss examples if you want - I've got a pile of 'em - but let me post the solution to your immediate problem first. TTYL.

Best,
Frank
Title: Re: Problem with itoa code
Post by: Nicholas Westerhausen on February 05, 2013, 02:04:58 AM: So, with your modified code, the itoa algorithm works very well :D, even when I make it a macro at the beginning of the file. The problem is that once I add any more data to the file (example below), it doesn't print the integer; instead it prints some number, and I have no idea where it comes from.

For example. I have it running when this is the itoa.asm:
Code:
section .data
    num dw 13579

%macro iwrite 1
    mov eax, %1
    ...snip, same code from above...
%endmacro

section .text
global _start

_start:
    iwrite [num]
    ...sys_exit call...

But as soon as I add another number up at the top in the data section
Code:
section .data
    num dw 13579
    sum dw 1932

it prints out something that I can't tell where it comes from (I'm not modifying the .text section at all). I'm not sure if this is because the reference point for num somehow gets messed up, or something. But I've fiddled with it a little bit. Posting here even though I'm continuing to mess with it.
Title: Re: Problem with itoa code
Post by: Frank Kotler on February 05, 2013, 03:02:42 AM: The numbers should probably be "dd" not "dw".

Best,
Frank
Title: Re: Problem with itoa code
Post by: Nicholas Westerhausen on February 05, 2013, 04:32:03 AM: That fixed it.

Thanks a lot.

Back to what you were saying about starting at a different place in memory and going in the "correct" order through the digits: I have seen example code like that, but when I tried to implement it myself, it got messy when turning it into a macro because I couldn't reference the address of the argument.
Title: Re: Problem with itoa code
Post by: Frank Kotler on February 05, 2013, 07:35:30 AM: I'm not sure what you mean by "couldn't reference the address of the argument". The buffer?

Here's something I use for a quick and dirty "spit out the number". It creates a buffer on the stack, fills it "backwards", falls into sys_write, and makes it all go away. Intended more for "debugging" than nicely formatted numbers...
Code:
; nasm -f elf32 myprog.asm
; ld -o myprog myprog.o -m elf_i386

global _start

section .text
_start:
    mov eax, 0FFFFFFFFh
    call showeaxd

    mov eax, 1
    xor ebx, ebx
    int 80h

;---------------------------------
showeaxd:
    push eax
    push ebx
    push ecx
    push edx
    push esi

    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi

    ; want linefeed?
    mov byte [ecx], 10
    inc esi
.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz .top

    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h

    add esp, 10h

    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
    ret
;---------------------------------

Not really suitable for a macro, but I guess it could be adapted if you wanted to...

Best,
Frank
Title: Re: Problem with itoa code
Post by: Nicholas Westerhausen on February 05, 2013, 05:21:01 PM: So here's a question, is it common practice to define a subroutine rather than a macro? I lean towards using macros because they are what I see as functions. When I write code, I tend to abstract out the many lines the "doing" of something takes so that the source code looks clean.

When I look at your "debugging itoa" code, I think I understand what is happening. Most of my confusion comes from the "magic" you are doing with esi. I'm also confused about the stack pointer and which direction adding to it moves. A quick search says that adding to esp is like popping from the stack. So it is moving "down" and by subtracting you are moving "up" into non-assigned memory?

Code:
showeaxd:
    ;; push all registers onto the stack
    push eax
    push ebx
    push ecx
    push edx
    push esi
    ;; move the instruction pointer 16 bits left ("down" the stack?)
    sub esp, 10h
    ;; load effective address of one byte on the stack (to the right? "up"?)
    lea ecx, [esp + 12]
    ;; moving 10 to ebx to convert from decimal to ascii
    mov ebx, 10
    ;; clearing the esi register (used for string operations?)
    xor esi, esi
    ; want linefeed?
    mov byte [ecx], 10
    inc esi
.top:
    ;; decrement ecx, clear edx, divide eax by 10
    dec ecx
    xor edx, edx
    div ebx
    ;; add a byte to the lower d register (for ascii conversion of remainder)
    add dl, '0'
    ;; put dl into the memory address stored in ecx
    mov [ecx], dl
    ;; move the string pointer??
    inc esi
    ;; binary 'or' comparison on eax
    or eax, eax
    ;; if eax || eax != 0 (so there is SOMETHING in eax), jump .top
    jnz .top
    ;; printing the string (number converted)
    ;; putting the string length in edx?
    ;; ecx has the address at the beginning of the string
    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h
    ;; move the pointer back to where it was (up 10?)
    add esp, 10h
    ;; put the original values back in all registers
    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
    ;; return
    ret
Title: Re: Problem with itoa code
Post by: Bryant Keller on February 05, 2013, 11:04:12 PM: Quote from: Nicholas Westerhausen on February 05, 2013, 05:21:01 PM
So here's a question, is it common practice to define a subroutine rather than a macro? I lean towards using macros because they are what I see as functions.

Macros should only really be used for small blocks of code that get reused frequently. Subroutines/Procedures are used to reduce and reuse code.

Let's use Frank's as an example:

Code: (showeaxd as Procedure)
global _start

section .text
_start:
    mov eax, 10
    call showeaxd
    mov eax, 11
    call showeaxd
    mov eax, 12
    call showeaxd

    mov eax, 1
    xor ebx, ebx
    int 80h

showeaxd:
    push eax
    push ebx
    push ecx
    push edx
    push esi

    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi

    ; want linefeed?
    mov byte [ecx], 10
    inc esi
.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz .top

    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h

    add esp, 10h

    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
    ret

Code: (showeaxd as Macro)
%macro print_int 1
%push _print_int_
    mov eax, %1
    push eax
    push ebx
    push ecx
    push edx
    push esi

    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi

    ; want linefeed?
    mov byte [ecx], 10
    inc esi
%$top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz %$top

    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h

    add esp, 10h

    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
%pop
%endmacro

global _start

section .text
_start:
    print_int 10
    print_int 11
    print_int 12

    mov eax, 1
    xor ebx, ebx
    int 80h

Although these two might look similar and the second does "resemble" a high level function call, these two are extremely different. In fact, the second is 3 times bigger when assembled than the first.

When we run the first through NASM's preprocessor (nasm -E), we get the following code:

Code:
%line 1+1 main.asm
[global _start]
[section .text]
_start:
    mov eax, 10
    call showeaxd
    mov eax, 11
    call showeaxd
    mov eax, 12
    call showeaxd
    mov eax, 1
    xor ebx, ebx
    int 80h
showeaxd:
    push eax
    push ebx
    push ecx
    push edx
    push esi
    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi
    mov byte [ecx], 10
    inc esi
.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz .top
    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h
    add esp, 10h
    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
    ret

As you can see, it's practically the same as what we started with. However, when we preprocess the second, we get:

Code:
%line 42+1 main.asm
[global _start]
[section .text]
_start:
    mov eax, 10
%line 47+0 main.asm
    push eax
    push ebx
    push ecx
    push edx
    push esi
    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi
    mov byte [ecx], 10
    inc esi
..@3.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz ..@3.top
    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h
    add esp, 10h
    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
%line 48+1 main.asm
    mov eax, 11
%line 48+0 main.asm
    push eax
    push ebx
    push ecx
    push edx
    push esi
    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi
    mov byte [ecx], 10
    inc esi
..@5.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz ..@5.top
    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h
    add esp, 10h
    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
%line 49+1 main.asm
    mov eax, 12
%line 49+0 main.asm
    push eax
    push ebx
    push ecx
    push edx
    push esi
    sub esp, 10h
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi
    mov byte [ecx], 10
    inc esi
..@7.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz ..@7.top
    mov edx, esi
    mov ebx, 1
    mov eax, 4
    int 80h
    add esp, 10h
    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
%line 50+1 main.asm
    mov eax, 1
    xor ebx, ebx
    int 80h

As you can see, the output of this second example is dramatically larger than the first. That is the nature of macros: they don't invoke the code you reference, they place the referenced code directly into your source as though you typed it yourself.
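
A possible middle ground, sketched here as an assumption rather than anything posted in the thread: keep the one-line call site with a thin macro that only loads the argument and calls the shared procedure. This version of `print_int` is a hypothetical wrapper; each use expands to two instructions instead of the whole routine, and it overwrites eax.

```nasm
; nasm -f elf32 wrap.asm ; ld -m elf_i386 -o wrap wrap.o
%macro print_int 1              ; hypothetical wrapper macro
    mov  eax, %1                ; overwrites eax with the argument
    call showeaxd               ; the single shared copy does the work
%endmacro

global _start

section .text
_start:
    print_int 10
    print_int 11
    print_int 12

    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h

showeaxd:                       ; Frank's routine from earlier in the thread
    push eax
    push ebx
    push ecx
    push edx
    push esi

    sub esp, 10h                ; 16 bytes of scratch space on the stack
    lea ecx, [esp + 12]
    mov ebx, 10
    xor esi, esi                ; character count

    mov byte [ecx], 10          ; linefeed at the right end
    inc esi
.top:
    dec ecx
    xor edx, edx
    div ebx
    add dl, '0'
    mov [ecx], dl
    inc esi
    or eax, eax
    jnz .top

    mov edx, esi                ; length for sys_write
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 80h

    add esp, 10h

    pop esi
    pop edx
    pop ecx
    pop ebx
    pop eax
    ret
```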
Title: Re: Problem with itoa code
Post by: Frank Kotler on February 06, 2013, 05:48:25 AM:
Code:
    ;; move the instruction pointer 16 bits left ("down" the stack?)
    sub esp, 10h

"stack pointer", not "instruction pointer"

Code:
    ;; clearing the esi register (used for string operations?)
    xor esi, esi

Naw, just a digit counter. Since sys_write is going to want the "string pointer" in ecx, I use ecx throughout. I'd use edx as the counter (since that's where sys_write wants it), but "div" ties up edx so I had to use something else. I should have commented it better, but you seem to have figured it out...
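
Back on the stack pointer question, a tiny sketch that may help (an illustration only, not part of showeaxd): on x86 the stack grows toward lower addresses, so push and "sub esp" move esp down, while pop and "add esp" move it back up. Note that "sub esp, 10h" reserves 16 bytes, not bits. The program exits with the difference between the starting and final esp, which should be 0 if everything balanced (check with "echo $?").

```nasm
; nasm -f elf32 stackdemo.asm ; ld -m elf_i386 -o stackdemo stackdemo.o
global _start

section .text
_start:
    mov  ebp, esp           ; remember the starting stack pointer
    push dword 'A'          ; esp -= 4; [esp] now holds 'A'
    sub  esp, 10h           ; esp -= 16 bytes: reserve scratch space below 'A'
    lea  ecx, [esp + 12]    ; address 12 bytes above esp, inside the reserved 16
    mov  byte [ecx], 10     ; safe to write: this byte belongs to the scratch space
    add  esp, 10h           ; esp += 16: release the scratch space
    pop  eax                ; eax = 'A'; esp += 4, back at the starting value

    sub  ebp, esp           ; 0 if the pushes/pops and sub/add balanced
    mov  ebx, ebp           ; use that as the exit status
    mov  eax, 1             ; sys_exit
    int  80h
```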

As Bryant has demonstrated, "abstract away the lines" can also sometimes "hide the fact that your program is filled with repetitive blocks of code". That's not the end of the world. It might be exactly what you want to do. I personally would prefer to see that much code as a subroutine, included once and called from multiple places, but it'll work either way. It might run faster "in line", avoiding the call and ret... but you won't notice any difference. "Programmer's choice". It is good to be the programmer! :)

Best,
Frank
Title: Re: Problem with itoa code
Post by: Nicholas Westerhausen on February 07, 2013, 04:00:30 PM: Thank you for the explanation about macros. It is really clear now. Macros should be used when you are going to type the same thing every time (like a write or read). Subroutines should be used when you have many lines that you want to reuse but not retype. Makes sense to me. I'm going to rewrite my code without macros I think and compare file sizes for myself.

Nick