NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: jedi on January 21, 2013, 03:02:12 PM

Title: Hex/Dec Swap
Post by: jedi on January 21, 2013, 03:02:12 PM: I'm looking for an algorithm to convince my program that the digits it has stored as a hexadecimal are really meant to be a decimal.

For example, '6789' is in memory as a hexadecimal even though in reality it is supposed to be a decimal number. I need to convince the computer that it is really 1A85h (which is the hexadecimal equivalent of 6789d).

I haven't figured out an algorithm to convert this. If you divide by 16 then you separate a '9' which is not useful because we need a '5'. If you divide by 10 then you'll get a '5' followed by a '0' ending up in the number '26505' which is the decimal representation of 6789h which is also not what we want.

Any ideas?

Thanks guys!

The following code probably won't help you but it basically takes unicode characters entered via console (assumed to be passed as a decimal) and then translates them out of unicode. At this point the computer has '6789' in memory but it thinks it is a hex.

Code: [Select]
SECTION .data
data: db '6789'                 ;************ for debugging

SECTION .bss
newdata: resq 100

SECTION .text
global uni_hex
extern print#                   ;************for debugging
extern ExitProcess              ;************for debugging

uni_hex:
; ******************** values for debugging
    mov r8, data
    mov rax, 4
    mov r10, newdata
; r8 = A pointer to the unicode data.
; rax = Length of string. Not a pointer.
; r10 = A pointer to the storage of the converted data.

; Housekeeping.
    mov r11, rax                ; make a copy
    dec r11                     ; want to point at last byte
    mov r12, rax                ; make another copy
    mov r13, 0

loop1:
; Convert the unicode to decimal.
    mov byte al, [r8+r11]       ; load last byte of string
    sub rax, 30h                ; convert to 0xd
    mov r9, rax                 ; this fills the lower half-byte of the r9 register.
    dec r11                     ; one less character
    dec r12                     ; one less character
    jz oddfinish                ; if it isn't an even number of characters then it is very important to end now or we will crash.
    mov byte al, [r8+r11]       ; load next to last byte of string
    sub rax, 30h                ; convert to 0xd
    shl rax, 4                  ; make it take the upper half-byte
    add rax, r9                 ; add the upper and lower half-bytes together ...
; ... rax now holds the two digits together -- one correct byte in decimal format ...
; ... although the computer thinks that this proper decimal number is actually hexadecimal ...
; ... and we'll need to fix that and turn it into hexadecimal later.

; Now we store it.
    mov byte [r10+r13], al      ; r13 begins at 0
    dec r11                     ; point at the previous byte next loop
    inc r13
    dec r12                     ; one less digit to analyze
    jnz loop1
    jmp skipoddfinish           ; you do not want to do label 'oddfinish' because it will double the final byte.

oddfinish:
    mov byte [r10+r13], al      ; we store it

skipoddfinish:
Title: Re: Hex/Dec Swap
Post by: Rob Neff on January 21, 2013, 05:11:02 PM: Your code is treating the input as hex - not decimal.
If you are truly expecting decimal only input then you need a loop that will multiply the current value by 10 for each digit read in and then add the current digit.

Assume that we're reading the string from left to right, here's pseudo-code:

Code: [Select]
value = 0                       ; <-- must initialize to zero first
again:
    read next digit
    value = value * 10 + digit
    loop again                  ;<-- if still have chars remaining
Hope that helps.
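
For reference, the pseudo-code above can be sketched in C. This is only an illustration, not code from the thread: the function name parse_decimal, the 64-bit result type and the overflow check are assumptions added here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: parse an unsigned decimal string, reading left to
   right. Returns false on an empty string, a non-digit character, or a
   value that does not fit in 64 bits. */
bool parse_decimal(const char *s, uint64_t *out)
{
    uint64_t value = 0;                 /* must start at zero */

    if (*s == '\0')
        return false;

    for (; *s != '\0'; ++s) {
        if (*s < '0' || *s > '9')
            return false;

        uint64_t digit = (uint64_t)(*s - '0');

        /* Reject the digit if value * 10 + digit would exceed UINT64_MAX. */
        if (value > (UINT64_MAX - digit) / 10)
            return false;

        value = value * 10 + digit;     /* the step from the pseudo-code */
    }

    *out = value;
    return true;
}
```

With a 64-bit accumulator, parse_decimal("6789", &v) leaves v holding 6789 (1A85h), and any input up to 2^64 - 1 is accepted before the overflow check trips.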
Title: Re: Hex/Dec Swap
Post by: jedi on January 21, 2013, 07:20:50 PM: Thank you Rob.

In the past I made a function that does just that. It uses the algorithm you suggested. The limitation is that I could only get it to do four digits before overflow occurs. I read somewhere that there is a way to manipulate it to overcome the overflow but it was complicated and I thought that this idea was simpler and more efficient. It can handle unlimited characters and also does away with the multiplication making it multiple times faster/more efficient.

If I could trick the computer into thinking the hex is a decimal that would be ideal. However, I'm not sure if that is possible.

Jedi
Title: Re: Hex/Dec Swap
Post by: Rob Neff on January 22, 2013, 03:16:54 AM: Even with just using a 32-bit register you could store a value up to 2^32 - 1 ( ie: an input string with a value no larger than '4294967295' ).
I think you should look harder at your decimal algorithm if you're overflowing at only 4 decimal digits. ;)
Title: Re: Hex/Dec Swap
Post by: Mathi on January 22, 2013, 04:25:07 AM: ~~How about this one.~~
This is same as what Rob suggests i guess :)

After all , We need to calculate 9+80+700+6000
This program does that.

we get the last hex digit using "and ecx,0xF"

Code: [Select]
    mov ebx, 0x6789     ;; Our input in hex
    xor edi,edi
    mov eax, 1
    xor ecx,ecx
    test ebx,ebx        ;; If input = 0 , output = 0
    jz EndProcess
StartProcess:
    mov ecx,ebx
    and ecx,0xF         ;; Take the last digit
    push eax
    mul ecx
    add edi,eax
    pop eax
    imul eax, 10        ;; Prepare the next place.
    shr ebx,4
    jz EndProcess
    jmp StartProcess
EndProcess:
    mov eax,edi
0x1A85 is stored in eax. (final output).

I have written it for 32 bit. But you should be able to change this for 64 bit.
by changing eax to rax, ebx to rbx etc.

64 bit (not tested)

Code: [Select]
    mov rbx, 0x6789     ;; Our input in hex
    xor rdi,rdi
    mov rax, 1
    xor rcx,rcx
    test rbx,rbx        ;; If input = 0 , output = 0
    jz EndProcess
StartProcess:
    mov rcx,rbx
    and rcx,0xF         ;; Take the last digit.
    push rax
    mul rcx
    add rdi,rax
    pop rax
    imul rax, 10        ;; Prepare the next place.
    shr rbx,4
    jz EndProcess
    jmp StartProcess
EndProcess:
    mov rax,rdi
Output is stored in rax
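
The place-value loop in the routines above can also be written in C, which may make the arithmetic (9 + 80 + 700 + 6000) easier to follow. This is a sketch under assumptions: the name bcd_to_binary is made up for illustration, and nibbles greater than 9 are not rejected.

```c
#include <stdint.h>

/* Hypothetical helper: treat each 4-bit nibble of `bcd` as one decimal
   digit (so 0x6789 means 6789) and return the ordinary binary value.
   Assumes every nibble is in the range 0..9; this is not checked. */
uint64_t bcd_to_binary(uint64_t bcd)
{
    uint64_t result = 0;
    uint64_t place = 1;                 /* 1, 10, 100, ... */

    while (bcd != 0) {
        result += (bcd & 0xF) * place;  /* take the last digit */
        place *= 10;                    /* prepare the next place */
        bcd >>= 4;                      /* drop the digit just used */
    }
    return result;
}
```

Sixteen nibbles fit in the 64-bit argument, so the largest input this sketch handles is 0x9999999999999999.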

EDIT : Previous method to find the last digit was inefficient. so changed it.

Regards,
Mathi.
Title: Re: Hex/Dec Swap
Post by: Frank Kotler on January 22, 2013, 09:10:31 AM: I'm about as confused as I've ever been... lately, anyway... or maybe I've just now realized it...

What's the "specification" on this program, anyway? Jedi expresses it as "convincing" the computer that something is something it's not. This isn't likely to be possible. The computer doesn't "think" much. The computer works with bit patterns, we're the ones who "think" it's a "number" (signed or unsigned?) or an ascii or unicode "character" or something else (flags register, for example). We provide instructions to manipulate these bits to treat 'em as a "character" or a "number" or whatever. Providing instructions to force the computer to do what we want is going to work better than "convincing" it. :)

Jedi provides:
Code: [Select]
data: db '6789'
What this puts in memory (expressed in hex for our convenience) is 36h, 37h, 38h, 39h.

Mathi starts out with:
Code: [Select]
mov ebx, 0x6789
What gets stored in memory here (in hex again) is 89h, 67h, 00h, 00h. This may be useful for what Jedi wants, I'm not sure. In any case, I changed the input to 0x1001 and I don't seem to be getting the expected results. I think I downloaded it pre-edit, so it might be okay now, but I don't know about this one, Mathi...

Jedi mentions unicode. I don't know much about unicode. I understand it provides a standardized mapping between a number and the glyph it represents. I understand that the number in question can be encoded in different ways - UTF-8, UTF-16, UTF-32... others? ... but a certain number will always refer to the same glyph. The glyphs we use to represent numbers we learned from the Arabs - '0', '1' etc. If you need to be able to handle some language (Mayan calendar?) that used a different set of glyphs, you'd want unicode. I don't see anything in your code that's going to handle unicode. As I recall, in UTF-8 encoding at least, the ascii characters will work without any special treatment. I'd suggest "forget unicode" until/unless you really need it. I plan to! :)

If the "specification" is simply "input a string of characters representing a decimal number and print the hex equivalent", that isn't too difficult. Convert the text to the number it represents using the algorithm Rob shows, and then convert the number into text representing hex... and print it. The only difficulty I see here is that if you tell the pesky user "a decimal number" they'll, sure as shootin', put a decimal point in it. I don't think an attempt to convert it "digit by digit", either characters or numbers, will be easy - the "carry" comes in the wrong places.

Converting hex text <-> number involves multiplying/dividing by 16... which we can do with shifts. We also have "rol"/"ror", which can be handy. Converting decimal text <-> number, we get into that situation where we get the digits in the opposite order than we want to print 'em. For hex, a "rol" by four will put the high bits we want to print first in the low four bits - right where we want 'em to isolate 'em, convert 'em to character, and print 'em first... Just a thought...
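
The "rol by four" idea might look like this in C. It is a minimal sketch, assuming a 32-bit value and a fixed eight-digit output; the function name u32_to_hex and the lookup table are illustrative and do not come from the thread.

```c
#include <stdint.h>

/* Hypothetical helper: write `value` as 8 uppercase hex digits plus a
   terminating NUL into `out`, which must hold at least 9 bytes. */
void u32_to_hex(uint32_t value, char out[9])
{
    static const char digits[] = "0123456789ABCDEF";

    for (int i = 0; i < 8; ++i) {
        /* Rotate left by 4: the current top nibble lands in the low 4 bits. */
        value = (value << 4) | (value >> 28);
        out[i] = digits[value & 0xF];   /* isolate it and convert to a character */
    }
    out[8] = '\0';
}
```

Because the rotate brings the high nibble around first, the digits come out in printing order and no reversal is needed, unlike the decimal case.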

If we need to convert both dec->hex and hex->dec, can we make the user enter a hex value as "0x..." or "...h", assuming decimal otherwise, or do we want a "menu" where they can specify which way to convert? Can ya clarify the "specification" any, Jedi?

Best,
Frank
Title: Re: Hex/Dec Swap
Post by: Mathi on January 22, 2013, 11:02:47 AM: Frank,

One clarification :)

The routine i pasted was not a replacement for Jedi's code. It should be executed on the output of Jedi's routine.

I think his routine actually converts the char bytes ('6789')
From
36h, 37h, 38h, 39h

to 2bytes at [newdata]

89h 67h (which is the little endian representation of 0x6789)

But he wants these two bytes to be changed to (convince the program :) )
85h 1Ah - which is 0x1A85 = 6789 in decimal.
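
The packing step described here can be sketched in C as follows. The function name pack_digits and its signature are hypothetical, and the sketch assumes the input holds only the ASCII characters '0' to '9'.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: walk an ASCII digit string from its last character
   and squash two digits into each output byte (low digit in the low
   nibble). Returns the number of bytes written to `out`. */
size_t pack_digits(const char *s, size_t len, uint8_t *out)
{
    size_t n = 0;

    while (len > 0) {
        uint8_t byte = (uint8_t)(s[--len] - '0');       /* 39h -> 09h */

        if (len > 0)                                    /* a second digit is available */
            byte |= (uint8_t)((s[--len] - '0') << 4);   /* 38h -> 80h */

        out[n++] = byte;                                /* e.g. 89h */
    }
    return n;
}
```

For the string '6789' this writes the two bytes 89h 67h, the little endian form of 0x6789; an odd-length string leaves the high nibble of the last byte at zero.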

Well.. We can be sure only if Jedi replies back :D

My code should actually be.
Code: [Select]
    xor ebx,ebx
    mov bx,word [newdata]   ;; Our input in hex this will load ebx = 0x6789
    xor edi,edi
    mov eax, 1
    xor ecx,ecx
    test ebx,ebx            ;; If input = 0 , output = 0
    jz EndProcess
StartProcess:
    mov ecx,ebx
    and ecx,0xF             ;; Take the last digit
    push eax
    mul ecx
    add edi,eax
    pop eax
    imul eax, 10            ;; Prepare the next place.
    shr ebx,4
    jz EndProcess
    jmp StartProcess
EndProcess:
    mov eax,edi
    mov word[newdata], ax   ;****NOTE****
Even if it had been in big endian format, we have the BSWAP instruction handy. :)

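Still, it cannot be applied to an unlimited number of chars.
MAXLIMIT for the input, I think, is "sixteen 9's" with the 64 bit registers (sixteen 4-bit digits fill a register), and eight 9's with the 32 bit ones.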

Regards,
Mathi.
Title: Re: Hex/Dec Swap
Post by: jedi on January 22, 2013, 12:42:29 PM: Thanks guys. Woke up to so many posts. ;D

Rob is right that my past implementation of his recommended algorithm was bad. I thought I was limited because I read the rule that the processor requires a register or the dx:ax pair to be twice as big as the multiplicands. I was doubling on each iteration of the loop, which quickly ate up my registers. Thank you Rob.

Frank, the reason I'm converting Unicode is because when I get data from console it arrives in Unicode format. So a 9 is represented as 39h. I need a basic function to be able to receive data from console. I'd like it to be able to handle decimal input and not only hex. I know such a function exists in libraries but I'm gaining a lot of experience from creating this from scratch. :)

The algorithm I uploaded changed the 39h to 09h then removed the prefixed 0 and squashed what was two bytes (38h + 39h) into one (89h). The problem is that this is an intended decimal number. Using Rob's algorithm I thought I was limited to measly 4 digits or 8 Unicode bytes. I'm wrong about that. 8)

Thanks so much Mathi. Your algorithm is perfect. Thanks so much for putting it together. I can see now how I could implement Rob's algorithm without running into a 4 digit limit. Thanks man.

Thanks everyone, this is a great community. It's quite enjoyable to program in assembly really. It's cool stuff.

Jedi
Title: Re: Hex/Dec Swap
Post by: Mathi on January 22, 2013, 03:19:17 PM: Jedi,

Code: [Select]
Frank, the reason I'm converting Unicode is because when I get data from console it arrives in Unicode format. So a 9 is represented as 39h.
9 represented as 39h is ASCII encoding.
Unicode text on Windows is normally UTF-16, a 16 bit encoding, so '9' would arrive as the two bytes 39h 00h.

If you are just trying to convert the text read from console to integer or vice versa,
you can check the macros below, which are part of the nagoa macro set.

you can invoke this macro as,

str2int data ;; after execution eax will hold 6789. but data should be null terminated like data db '6789',0

to convert integer to ascii (string)

int2str eax, buffer ;; after execution buffer will point to null terminated ascii string.("6789" in this case)
Code: [Select]
It's quite enjoyable to program in assembly really. It's cool stuff.
All the best. :)
Code: [Select]
;;================================
;; MACRO str2int int2str, written by mastercpp.
;;================================
%macro str2int 1
    push ebx
    push esi
    push edi
    mov ebx, 0
    mov ecx, 0
    xor eax,eax
    mov ebx,0000000Ah
    mov esi,offset %1
%%ConvertLoop:
    movzx ecx,byte [esi]        ;load character.
    test ecx,ecx
    jz short %%ExitConvertLoop  ;0 => Exit
    inc esi
    sub cl,30h                  ;0-9...
    mul ebx                     ;result * 10
    add eax,ecx                 ;+ next digit
    jmp short %%ConvertLoop
%%ExitConvertLoop:
    pop edi
    pop esi
    pop ebx
%endmacro

%macro int2str 2
    push ebx
    push esi
    push edi
%%start:
    mov eax, %1
    xor ecx, ecx
    mov ebx, 000ah
%%DecConvert:
    xor edx, edx
    div ebx
    add edx, 0030h
    push edx
    inc ecx
    or eax, eax
    jnz short %%DecConvert
    mov edi, %2
    mov edx,ecx
%%SortDec:
    pop eax
    stosb
    loop %%SortDec
    mov eax, 0h
    stosb
    pop edi
    pop esi
    pop ebx
%endmacro
;;================================
;; mastercpp end macros
;;================================
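
For comparison, the divide-by-ten approach that int2str takes can be sketched in C. This is an illustration only, assuming a 32-bit unsigned value; the function name u32_to_decimal and the small local array standing in for the machine stack are choices made here, not part of the macro set.

```c
#include <stdint.h>

/* Hypothetical helper: write `value` as decimal text plus a terminating
   NUL into `out` (at least 11 bytes) and return the number of digits. */
int u32_to_decimal(uint32_t value, char out[11])
{
    char stack[10];     /* a 32-bit value has at most 10 decimal digits */
    int count = 0;
    int len = 0;

    /* Dividing by 10 yields the digits last-first, so park them ("push"). */
    do {
        stack[count++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    /* Taking them back off ("pop") restores printing order. */
    while (count > 0)
        out[len++] = stack[--count];

    out[len] = '\0';
    return len;
}
```

The do/while form means a value of zero still produces the single digit "0", which matches the macro's behaviour of converting at least one digit.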
Regards,
Mathi.