Author Topic: Getting three bytes.  (Read 5779 times)

Offline Teol

  • Jr. Member
  • *
  • Posts: 12
Getting three bytes.
« on: January 13, 2017, 03:47:58 PM »
Hello,
I'm trying to read UTF-8 characters starting from a certain memory address until I get a linefeed char. The problem is that since UTF-8 has a variable byte length, I need to read a different number of bytes depending on the first byte. How can I "mov" three bytes? movzx eax, word [source] reads too few bytes. mov eax, dword [source] reads too many bytes and would leave me with the first byte of the next character as well.
The only solution I know involves bit shifts, but are there other ways that are faster than doing shifts?

Thanks for reading and replies.
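
For reference, the "number of bytes depends on the first byte" rule from the question can be sketched in C as below. This is a minimal illustration, not code from the thread: the function name utf8_seq_len is hypothetical, and it only inspects the lead byte without validating the continuation bytes that follow.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes in the UTF-8 sequence
   that starts with `lead`, or 0 if `lead` cannot start a sequence
   (a continuation byte or an invalid lead byte). */
size_t utf8_seq_len(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1; /* 0xxxxxxx: ASCII, includes linefeed */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx */
    return 0;                            /* 10xxxxxx or 11111xxx */
}
```

In assembly the same test is a handful of and/cmp pairs on the first byte, so only the load itself needs the three-byte trick.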

Offline dreamCoder

  • Full Member
  • **
  • Posts: 107
Re: Getting three bytes.
« Reply #1 on: January 13, 2017, 04:46:44 PM »
Code: [Select]
mov eax,dword[source]
and eax,0xffffff
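
A rough C model of these two instructions may help show why the mask keeps the right bytes. It assumes nothing beyond x86 being little-endian; the function names load_le32 and load_three_bytes are made up for this sketch. Note that the dword load still touches four bytes, so at least one readable byte has to follow the three you want.

```c
#include <stdint.h>

/* Models `mov eax, dword [source]`: x86 is little-endian, so the byte at
   the lowest address lands in the lowest 8 bits of the register. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Models `and eax, 0xffffff`: keeps the three low bytes of the register,
   which are the first three bytes in memory, and clears the fourth. */
uint32_t load_three_bytes(const unsigned char *p)
{
    return load_le32(p) & 0x00FFFFFFu;
}
```

With the bytes E2 82 AC C2 in memory (a euro sign followed by the first byte of another character), the load gives 0xC2AC82E2 and the mask leaves 0x00AC82E2, so the stray fourth byte is the one that gets cleared.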

Offline Teol

  • Jr. Member
  • *
  • Posts: 12
Re: Getting three bytes.
« Reply #2 on: January 13, 2017, 06:14:20 PM »
Thanks, but I don't see how this helps.
For example, if I have a string with these characters: €®€ it is in binary:
11100010 10000010 10101100 11000010 10101110 11100010 10000010 10101100

If I read "three" (actually four) bytes for the euro sign,
I have the following in eax:
11100010 10000010 10101100 11000010


Doing the bitwise and as you suggested:
Code: [Select]
and eax,0xffffff
Means the following:

11100010 10000010 10101100 11000010
         11111111 11111111 11111111
-----------------------------------
00000000 10000010 10101100 11000010

Am I right?
« Last Edit: January 13, 2017, 06:16:54 PM by Teol »

Offline Teol

  • Jr. Member
  • *
  • Posts: 12
Re: Getting three bytes.
« Reply #3 on: January 28, 2017, 10:16:18 AM »
Code: [Select]
mov eax,dword[source]
and eax,0xffffff

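Ok, this was the actual answer. I did not realize that the byte order is reversed when the bytes are moved from memory into a register: x86 is little-endian, so the first byte in memory ends up in the low byte of eax and the mask clears the fourth byte, not the first. Thank you for the answer. :D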
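
Putting the thread together, here is one possible C sketch of the whole loop: read characters until a linefeed, load a dword each time, and mask it down according to the lead byte. Everything named here (read_line_chars, len_by_nibble, mask_by_len) is hypothetical, and the lookup tables are just one way to avoid shifts. It assumes at least three readable bytes follow the linefeed, since every step loads a full dword, and it does not validate continuation bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Sequence length indexed by the high nibble of the lead byte.
   0 marks a continuation byte (10xxxxxx), which cannot start a character. */
static const unsigned char len_by_nibble[16] = {
    1, 1, 1, 1, 1, 1, 1, 1,   /* 0xxx: ASCII */
    0, 0, 0, 0,               /* 10xx: continuation byte */
    2, 2,                     /* 110x */
    3,                        /* 1110 */
    4                         /* 1111: treated as 4 bytes; 0xF8..0xFF are not rejected */
};

/* Mask that keeps the low `n` bytes of a little-endian dword load. */
static const uint32_t mask_by_len[5] = {
    0x00000000u, 0x000000FFu, 0x0000FFFFu, 0x00FFFFFFu, 0xFFFFFFFFu
};

/* Stores each character's raw bytes (packed as they would sit in eax)
   into `out` until a linefeed is reached or `out` is full.
   Returns the number of characters stored, or (size_t)-1 if a character
   starts with a continuation byte. */
size_t read_line_chars(const unsigned char *src, uint32_t *out, size_t out_cap)
{
    size_t count = 0;
    while (*src != '\n' && count < out_cap) {
        size_t len = len_by_nibble[*src >> 4];
        if (len == 0)
            return (size_t)-1;
        /* Same result as `mov eax, dword [src]` on x86. */
        uint32_t dword = (uint32_t)src[0] | ((uint32_t)src[1] << 8)
                       | ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
        out[count++] = dword & mask_by_len[len];
        src += len;
    }
    return count;
}
```

For the string €®€ followed by a linefeed, this stores 0x00AC82E2, 0x0000AEC2 and 0x00AC82E2. In assembly the mask table would turn into something like and eax, [masks + ecx*4], which is an untested guess at the equivalent rather than code from this thread.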