Topic: Hex/Dec Swap

Offline jedi

  • Jr. Member
  • *
  • Posts: 9
Hex/Dec Swap
« on: January 21, 2013, 03:02:12 PM »
I'm looking for an algorithm to convince my program that the digits it has stored as a hexadecimal are really meant to be a decimal.

For example, '6789' is in memory as a hexadecimal even though in reality it is supposed to be a decimal number. I need to convince the computer that it is really 1A85h (which is the hexadecimal equivalent of 6789d).

I haven't figured out an algorithm to convert this. If you divide by 16 then you separate a '9' which is not useful because we need a '5'. If you divide by 10 then you'll get a '5' followed by a '0' ending up in the number '26505' which is the decimal representation of 6789h which is also not what we want.

Any ideas?

Thanks guys!


The following code probably won't help you but it basically takes unicode characters entered via console (assumed to be passed as a decimal) and then translates them out of unicode. At this point the computer has '6789' in memory but it thinks it is a hex.

Code: [Select]
SECTION .data
data: db '6789' ;************ for debugging

SECTION .bss
newdata: resq 100

SECTION .text

global uni_hex
extern print# ;************for debugging
extern ExitProcess ;************for debugging

uni_hex:
; ******************** values for debugging
mov r8, data
mov rax, 4
mov r10, newdata
; r8 = A pointer to the unicode data.
; rax = Length of string. Not a pointer.
; r10 = A pointer to the storage of the converted data.

; Housekeeping.
mov r11, rax ; make a copy
dec r11 ; want to point at last byte
mov r12, rax ; make another copy
mov r13, 0

loop1: ; Convert the unicode to decimal.
mov byte al, [r8+r11] ; load last byte of string
sub rax, 30h ; convert to 0xd
mov r9, rax ; this fills the lower half-byte of the r9 register.
dec r11 ; one less character
dec r12 ; one less character
jz oddfinish ; if it isn't an even number of characters then it is very important to end now or we will crash.

mov byte al, [r8+r11] ; load next to last byte of string
sub rax, 30h ; convert to 0xd
shl rax, 4 ; make it take the upper half-byte
add rax, r9 ; add the upper and lower half-bytes together ...
; ... rax now holds the two digits together -- one correct byte in decimal format ...
; ... although the computer thinks that this proper decimal number is actually hexadecimal ...
; ... and we'll need to fix that and turn it into hexadecimal later.

; Now we store it.
mov byte [r10+r13], al ; r13 begins at 0

dec r11 ; point at the previous byte next loop
inc r13
dec r12 ; one less digit to analyze
jnz loop1
jmp skipoddfinish ; you do not want to do label 'oddfinish' because it will double the final byte.
oddfinish:              mov byte [r10+r13], al ; we store it
skipoddfinish:

Offline Rob Neff

  • Forum Moderator
  • Full Member
  • *****
  • Posts: 429
  • Country: us
Re: Hex/Dec Swap
« Reply #1 on: January 21, 2013, 05:11:02 PM »
Your code is treating the input as hex - not decimal.
If you are truly expecting decimal only input then you need a loop that will multiply the current value by 10 for each digit read in and then add the current digit. 

Assume that we're reading the string from left to right, here's pseudo-code:

Code: [Select]
    value = 0    ; <-- must initialize to zero first
again:
    read next digit
    value = value * 10 + digit
    loop again    ;<-- if still have chars remaining
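
For reference, a minimal sketch of that loop in 64-bit NASM. The routine name parse_dec and the register assignments are assumptions made for illustration, not taken from the code posted above, and there is no check for non-digit characters or for overflow. The multiply by 10 is done with two LEA instructions, so no MUL is needed.

```nasm
; Sketch only: parse_dec is a hypothetical name.
; In:  rsi = pointer to ASCII digits, rcx = number of digits
; Out: rax = binary value (wraps silently if the input exceeds 2^64 - 1)
parse_dec:
        xor     eax, eax            ; value = 0 (writing eax clears all of rax)
        test    rcx, rcx
        jz      .done               ; empty string -> 0
.next:
        movzx   edx, byte [rsi]     ; read next character
        sub     edx, '0'            ; '0'..'9' -> 0..9
        lea     rax, [rax + rax*4]  ; value * 5
        lea     rax, [rdx + rax*2]  ; value * 10 + digit
        inc     rsi
        dec     rcx
        jnz     .next
.done:
        ret
```

With rsi pointing at '6789' and rcx = 4, the value goes 6, 67, 678, 6789, so rax ends up holding 1A85h.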

Hope that helps.
   

Offline jedi

Re: Hex/Dec Swap
« Reply #2 on: January 21, 2013, 07:20:50 PM »
Thank you Rob.

In the past I made a function that does just that. It uses the algorithm you suggested. The limitation is that I could only get it to do four digits before overflow occurs. I read somewhere that there is a way to manipulate it to overcome the overflow but it was complicated and I thought that this idea was simpler and more efficient. It can handle unlimited characters and also does away with the multiplication making it multiple times faster/more efficient.

If I could trick the computer into thinking the hex is a decimal that would be ideal. However, I'm not sure if that is possible.

Jedi

Offline Rob Neff

Re: Hex/Dec Swap
« Reply #3 on: January 22, 2013, 03:16:54 AM »
Even with just a 32-bit register you could store a value up to 2^32 - 1 (i.e. an input string with a value no larger than '4294967295').
I think you should look harder at your decimal algorithm if you're overflowing at only 4 decimal digits.  ;)
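
A rough sketch of how a 32-bit accumulator can be used right up to that limit while still catching overflow. The name parse_dec32 and the choice of registers are assumptions for illustration only. The point is that the high half of the product in edx never has to be carried into the next iteration; it only has to be zero, which MUL reports through the carry flag.

```nasm
; Sketch only: parse_dec32 is a hypothetical name.
; In:  rsi = pointer to ASCII digits, ecx = number of digits
; Out: eax = value, CF clear on success, CF set if the value exceeds 32 bits
parse_dec32:
        xor     eax, eax            ; value = 0
        mov     r8d, 10
        test    ecx, ecx
        jz      .ok                 ; empty string -> 0
.next:
        mul     r8d                 ; edx:eax = value * 10, CF set if edx != 0
        jc      .overflow
        movzx   edx, byte [rsi]     ; read next character
        sub     edx, '0'            ; '0'..'9' -> 0..9 (no validation here)
        add     eax, edx            ; value = value * 10 + digit
        jc      .overflow
        inc     rsi
        dec     ecx
        jnz     .next
.ok:
        clc
        ret
.overflow:
        stc
        ret
```

Fed '4294967295' this returns FFFFFFFFh with the carry clear; fed '4294967296' the final add wraps and the routine returns with the carry set.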

Offline Mathi

  • Jr. Member
  • *
  • Posts: 82
  • Country: in
    • Win32NASM
Re: Hex/Dec Swap
« Reply #4 on: January 22, 2013, 04:25:07 AM »
How about this one?
I guess this is the same as what Rob suggests :)

After all, we need to calculate 9 + 80 + 700 + 6000, and this program does that.

We get the last hex digit using "and ecx,0xF".

Code: [Select]
mov ebx, 0x6789  ;; Our input in hex
xor edi,edi
mov eax, 1
xor ecx,ecx
test ebx,ebx       ;; If input = 0 , output = 0
jz EndProcess
StartProcess:
        mov ecx,ebx           
        and ecx,0xF ;; Take the last digit
        push eax
        mul ecx             
        add edi,eax
        pop eax
        imul eax, 10       ;; Prepare the next place.
        shr ebx,4
        jnz StartProcess   ;; Repeat while digits remain.
EndProcess:
mov eax,edi

The final output, 0x1A85, is left in eax.

I have written it for 32-bit, but you should be able to change it for 64-bit by changing eax to rax, ebx to rbx, etc.

64-bit (not tested):

Code: [Select]
mov rbx, 0x6789  ;; Our input in hex
xor rdi,rdi
mov rax, 1
xor rcx,rcx
test rbx,rbx       ;; If input = 0 , output = 0
jz EndProcess
StartProcess:
        mov rcx,rbx           
        and rcx,0xF     ;; Take the last digit.
        push rax
        mul rcx             
        add rdi,rax
        pop rax
        imul rax, 10       ;; Prepare the next place.
        shr rbx,4
        jnz StartProcess   ;; Repeat while digits remain.
EndProcess:
mov rax,rdi

Output is stored in rax

EDIT: The previous method of finding the last digit was inefficient, so I changed it.

Regards,
Mathi.

« Last Edit: January 22, 2013, 05:49:21 AM by Mathi »

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Hex/Dec Swap
« Reply #5 on: January 22, 2013, 09:10:31 AM »
I'm about as confused as I've ever been... lately, anyway... or maybe I've just now realized it...

What's the "specification" on this program, anyway? Jedi expresses it as "convincing" the computer that something is something it's not. This isn't likely to be possible. The computer doesn't "think" much. The computer works with bit patterns, we're the ones who "think" it's a "number" (signed or unsigned?) or an ascii or unicode "character" or something else (flags register, for example). We provide instructions to manipulate these bits to treat 'em as a "character" or a "number" or whatever. Providing instructions to force the computer to do what we want is going to work better than "convincing" it. :)

Jedi provides:
Code: [Select]
    data: db '6789'

What this puts in memory (expressed in hex for our convenience) is 36h, 37h, 38h, 39h.

Mathi starts out with:
Code: [Select]
    mov ebx, 0x6789

What gets stored in memory here (in hex again) is 89h, 67h, 00h, 00h. This may be useful for what Jedi wants, I'm not sure. In any case, I changed the input to 0x1001 and I don't seem to be getting the expected results. I think I downloaded it pre-edit, so it might be okay now, but I don't know about this one, Mathi...

Jedi mentions unicode. I don't know much about unicode. I understand it provides a standardized mapping between a number and the glyph it represents. I understand that the number in question can be encoded in different ways - UTF-8, UTF-16, UTF-32... others? ... but a certain number will always refer to the same glyph. The glyphs we use to represent numbers we learned from the Arabs - '0', '1' etc. If you need to be able to handle some language (Mayan calendar?) that used a different set of glyphs, you'd want unicode. I don't see anything in your code that's going to handle unicode. As I recall, in UTF-8 encoding at least, the ascii characters will work without any special treatment. I'd suggest "forget unicode" until/unless you really need it. I plan to! :)

If the "specification" is simply "input a string of characters representing a decimal number and print the hex equivalent", that isn't too difficult. Convert the text to the number it represents using the algorithm Rob shows, and then convert the number into text representing hex... and print it. The only difficulty I see here is that if you tell the pesky user "a decimal number" they'll, sure as shootin', put a decimal point in it. I don't think an attempt to convert it "digit by digit", either characters or numbers, will be easy - the "carry" comes in the wrong places.

Converting hex text <-> number involves multiplying/dividing by 16... which we can do with shifts. We also have "rol"/"ror", which can be handy. Converting decimal text <-> number, we get into that situation where we get the digits in the opposite order than we want to print 'em. For hex, a "rol" by four will put the high bits we want to print first in the low four bits - right where we want 'em to isolate 'em, convert 'em to character, and print 'em first... Just a thought...
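
A minimal sketch of the "rol by four" idea, for the number-to-hex-text direction only. The name to_hex32, the register assignments, and the 9-byte buffer are assumptions made for illustration.

```nasm
; Sketch only: to_hex32 is a hypothetical helper.
; In:  eax = value, rdi = pointer to a buffer of at least 9 bytes
; Out: buffer holds 8 uppercase hex digits followed by a 0 byte
to_hex32:
        mov     ecx, 8              ; 8 nibbles in 32 bits
.next:
        rol     eax, 4              ; bring the highest nibble down to bits 0..3
        mov     edx, eax
        and     edx, 0Fh            ; isolate it
        cmp     edx, 10
        jb      .digit
        add     edx, 'A' - '0' - 10 ; 10..15 become 'A'..'F' after the add below
.digit:
        add     edx, '0'            ; convert to a character
        mov     [rdi], dl           ; store it, most significant digit first
        inc     rdi
        dec     ecx
        jnz     .next
        mov     byte [rdi], 0       ; terminate the string
        ret
```

Because the digits come out most significant first, nothing has to be reversed or pushed on the stack, and after eight rotations eax holds its original value again. With eax = 1A85h the buffer receives "00001A85".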

If we need to convert both dec->hex and hex->dec, can we make the user enter a hex value as "0x..." or "...h", assuming decimal otherwise, or do we want a "menu" where they can specify which way to convert? Can ya clarify the "specification" any, Jedi?

Best,
Frank


Offline Mathi

Re: Hex/Dec Swap
« Reply #6 on: January 22, 2013, 11:02:47 AM »
Frank,

One clarification :)

The routine I pasted was not a replacement for Jedi's code. It should be executed on the output of Jedi's routine.

I think his routine actually converts the character bytes ('6789') from
36h, 37h, 38h, 39h

to 2 bytes at [newdata]:

89h 67h    (which is the little-endian representation of 0x6789)

But he wants these two bytes to be changed to (convince the program :) )
85h 1Ah     -  which is 0x1A85 = 6789 in decimal.

Well... we can be sure only if Jedi replies back :D

My code should actually be:
Code: [Select]
xor ebx,ebx
mov bx,word [newdata]  ;; Our input in hex this will load ebx = 0x6789
xor edi,edi
mov eax, 1
xor ecx,ecx
test ebx,ebx       ;; If input = 0 , output = 0
jz EndProcess
StartProcess:
        mov ecx,ebx           
        and ecx,0xF ;; Take the last digit
        push eax
        mul ecx             
        add edi,eax
        pop eax
        imul eax, 10       ;; Prepare the next place.
        shr ebx,4
        jnz StartProcess   ;; Repeat while digits remain.
EndProcess:
mov eax,edi
mov word[newdata], ax   ;****NOTE****

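Even if it had been in big-endian format, we have the BSWAP instruction handy. :)
(For a 16-bit value, XCHG al, ah or ROL ax, 8 does the swap, since BSWAP is only defined for 32- and 64-bit registers.)

Still, it cannot be applied to an unlimited number of chars.
The maximum input, I think, is sixteen 9's with the 64-bit version, since a 64-bit register holds sixteen 4-bit digits.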
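
As a variation, here is a sketch that walks the digits from the most significant end instead, so no running power of ten (and no push/pop) is needed. It is the same value = value * 10 + digit recurrence Rob described, applied to packed digits. The name bcd_to_bin and the register choices are assumptions for illustration, and the routine does not check that each nibble is actually 0..9.

```nasm
; Sketch only: bcd_to_bin is a hypothetical name.
; In:  rbx = up to sixteen packed decimal digits (e.g. 0x6789)
; Out: rax = binary value (e.g. 0x1A85); rcx and rdx are clobbered,
;      rbx is rotated a full 64 bits and so ends up unchanged
bcd_to_bin:
        xor     eax, eax            ; value = 0
        mov     ecx, 16             ; sixteen nibbles in a 64-bit register
.next:
        rol     rbx, 4              ; highest remaining nibble -> bits 0..3
        mov     edx, ebx
        and     edx, 0Fh            ; digit
        lea     rax, [rax + rax*4]  ; value * 5
        lea     rax, [rdx + rax*2]  ; value * 10 + digit
        dec     ecx
        jnz     .next
        ret
```

Leading zero nibbles simply keep the value at 0, so short inputs need no special case.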

Regards,
Mathi.

Offline jedi

Re: Hex/Dec Swap
« Reply #7 on: January 22, 2013, 12:42:29 PM »
Thanks guys. Woke up to so many posts. ;D

Rob is right that my past implementation of his recommended algorithm was bad. I thought I was limited because I read the rule that the processor puts the product in a register or dx:ax pair twice as big as the multiplicands. I was doubling the size on each iteration of the loop, which quickly ate up my registers. Thank you Rob.

Frank, the reason I'm converting Unicode is because when I get data from console it arrives in Unicode format. So a 9 is represented as 39h. I need a basic function to be able to receive data from console. I'd like it to be able to handle decimal input and not only hex. I know such a function exists in libraries but I'm gaining a lot of experience from creating this from scratch.  :)

The algorithm I uploaded changed the 39h to 09h then removed the prefixed 0 and squashed what was two bytes (38h + 39h) into one (89h). The problem is that this is an intended decimal number. Using Rob's algorithm I thought I was limited to measly 4 digits or 8 Unicode bytes. I'm wrong about that.  8)

Thanks so much Mathi. Your algorithm is perfect. Thanks so much for putting it together. I can see now how I could implement Rob's algorithm without running into a 4 digit limit. Thanks man.

Thanks everyone, this is a great community. It's quite enjoyable to program in assembly really. It's cool stuff.

Jedi

Offline Mathi

Re: Hex/Dec Swap
« Reply #8 on: January 22, 2013, 03:19:17 PM »
Jedi,

Quote from: jedi
Frank, the reason I'm converting Unicode is because when I get data from console it arrives in Unicode format. So a 9 is represented as 39h.

9 represented as the single byte 39h is ASCII encoding.
What Windows calls Unicode is the 16-bit UTF-16 encoding, where '9' would arrive as two bytes, 39h 00h.

If you are just trying to convert the text read from the console to an integer or vice versa, you can check the macros below, which are part of the nagoa macro set.

You can invoke the macro as:

str2int data  ;; after execution eax will hold 6789, but data must be null terminated, like data db '6789',0

To convert an integer to an ASCII string:

int2str eax, buffer  ;; after execution buffer will hold a null-terminated ASCII string ("6789" in this case)
Quote from: jedi
It's quite enjoyable to program in assembly really. It's cool stuff.

All the best. :)
Code: [Select]
;;================================
;; MACROS  str2int, int2str  written by mastercpp.
;;================================
%macro str2int 1
        push    ebx ;
        push    esi ;
        push    edi ;
        xor eax,eax ; result = 0
        xor ecx,ecx
        mov ebx,0000000Ah ; multiplier 10
        mov esi,%1 ; address of the string (plain NASM needs no "offset")
        %%ConvertLoop:
        movzx ecx,byte [esi] ; load character
        test ecx,ecx
        jz  short %%ExitConvertLoop ; 0 => exit
        inc esi
        sub cl,30h ; '0'..'9' -> 0..9
        mul ebx ; result * 10 (clobbers edx)
        add eax,ecx ; + next digit
        jmp short %%ConvertLoop
        %%ExitConvertLoop:
        pop     edi
        pop     esi
        pop     ebx
%endmacro

%macro int2str 2
        push    ebx ;
        push    esi ;
        push    edi ;
        %%start:
        mov  eax, %1
        xor  ecx, ecx
        mov  ebx, 000ah
        %%DecConvert:
        xor  edx,  edx
        div  ebx
        add  edx,  0030h
        push edx
        inc  ecx
        or   eax,  eax
        jnz  short %%DecConvert
        mov  edi,  %2
        mov edx,ecx
        %%SortDec:
        pop   eax
        stosb
        loop  %%SortDec
        mov eax, 0h
        stosb
        pop     edi
        pop     esi
        pop     ebx
%endmacro
;;================================
;; mastercpp  end macros
;;================================

Regards,
Mathi.