NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: Teol on January 13, 2017, 03:47:58 PM

Title: Getting three bytes.
Post by: Teol on January 13, 2017, 03:47:58 PM
Hello,
I'm trying to read UTF-8 characters starting from a certain memory address until I reach a linefeed character. The problem is that UTF-8 is a variable-length encoding, so I need to read a different number of bytes depending on the first byte. How can I "mov" three bytes? mov eax, word [source] is too few. mov eax, dword [source] is too many bytes and would leave me with the first byte of the next character.
The only solution I know involves bit shifts, but are there other ways that are faster than doing shifts?

Thanks for reading and replies.
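
The "number of bytes depends on the first byte" rule from the question can be sketched as follows. This is a C model rather than NASM, and the function names `utf8_seq_len` and `count_chars_until_lf` are hypothetical helpers invented for illustration. It only inspects the bit pattern of the lead byte and assumes otherwise well-formed input; it does not validate continuation bytes or reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence that starts
   with `lead`, or 0 if `lead` cannot start a sequence. */
int utf8_seq_len(uint8_t lead)
{
    if (lead < 0x80)           return 1;  /* 0xxxxxxx: ASCII, includes linefeed 0x0A */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;                             /* 10xxxxxx continuation byte, or invalid */
}

/* Hypothetical helper: count the characters in s[0..n) that come before
   the first linefeed, stepping by the length of each sequence. */
size_t count_chars_until_lf(const uint8_t *s, size_t n)
{
    size_t i = 0, count = 0;

    assert(s != NULL);
    while (i < n && s[i] != 0x0A) {
        int len = utf8_seq_len(s[i]);
        if (len == 0 || (size_t)len > n - i)
            break;                        /* malformed or truncated input: stop */
        i += (size_t)len;
        count++;
    }
    return count;
}
```

The same chain of mask-and-compare tests maps directly onto `and`/`cmp` pairs in assembly, applied to the single byte loaded from `[source]`.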
Title: Re: Getting three bytes.
Post by: dreamCoder on January 13, 2017, 04:46:44 PM
Code:
mov eax,dword[source]
and eax,0xffffff
Title: Re: Getting three bytes.
Post by: Teol on January 13, 2017, 06:14:20 PM
Thanks, but I don't see how this helps.
For example, if I have a string with these characters: €®€, it is in binary:
11100010 10000010 10101100 11000010 10101110 11100010 10000010 10101100

If I read "three" (actually four) bytes for the euro sign,
I have the following in eax:
11100010 10000010 10101100 11000010


Doing the bitwise AND as you suggested:
Code:
and eax,0xffffff
means the following:

11100010 10000010 10101100 11000010
         11111111 11111111 11111111
-----------------------------------
00000000 10000010 10101100 11000010

Am I right?
Title: Re: Getting three bytes.
Post by: Teol on January 28, 2017, 10:16:18 AM
Code:
mov eax,dword[source]
and eax,0xffffff

Ok, this was the actual answer. I did not realize it reverses the order when moving the bytes from memory address to register. Thank you for the answer. :D
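
The byte-order point can be checked with a small C model of the two instructions. This is a sketch, not NASM: `load_dword_le` and `first_three_bytes` are hypothetical names, and the load is written with explicit shifts so that it mimics x86's little-endian behaviour regardless of the machine it runs on.

```c
#include <assert.h>
#include <stdint.h>

/* Models `mov eax, dword [source]` on x86: the byte at the lowest address
   becomes the least significant byte of the register. */
uint32_t load_dword_le(const uint8_t *p)
{
    assert(p != 0);
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Models `and eax, 0xffffff`: clears the top byte of the register, which
   holds the FOURTH byte from memory, so the first three bytes survive. */
uint32_t first_three_bytes(const uint8_t *p)
{
    return load_dword_le(p) & 0x00FFFFFFu;
}
```

With the bytes E2 82 AC C2 in memory (the euro sign followed by the first byte of the next character), the load gives 0xC2AC82E2 and the mask leaves 0x00AC82E2, so the stray byte C2 is the one that gets cleared. Note that the dword load still touches four bytes, so at least one readable byte must follow the last three-byte character in the buffer.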