C# - Reversing the bytes of a Unicode txt file


I have a file, 1.txt, saved as Unicode. The text of the file is "12345". If I read its bytes into a byte array, I get 12 bytes:

255 254 49 0 50 0 51 0 52 0 53 0

That's fine. What I can't understand is what happens if I reverse the bytes like this:

0 53 0 52 0 51 0 50 0 49 254 255

The C# method Encoding.Unicode.GetString(byteArray) returns 㔀㐀㌀㈀㄀�, and that's correct, but Notepad shows 5 4 3 2 1юя. Why?

You can find the byte reverse method here:

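Your text file is encoded as UTF-16 (little-endian, which is what .NET calls Encoding.Unicode).

The two bytes at the front, 255 254, are the byte order mark (BOM), and they aren't part of the text.

You must not alter them. You should skip the first two bytes and reverse only the remainder of the bytes.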
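As a rough sketch of that first step, the BOM can be separated from the text bytes by comparing the start of the file with Encoding.Unicode.GetPreamble(). The class and method names below (BomSplitter, Split) are made up for illustration and aren't from the question:

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomSplitter
{
    // Hypothetical helper: splits the raw file bytes into the BOM (possibly
    // empty) and the bytes that actually encode the text.
    public static (byte[] Bom, byte[] Body) Split(byte[] fileBytes)
    {
        byte[] preamble = Encoding.Unicode.GetPreamble(); // { 255, 254 } for UTF-16 LE

        bool hasBom = fileBytes.Length >= preamble.Length
            && fileBytes.Take(preamble.Length).SequenceEqual(preamble);

        int offset = hasBom ? preamble.Length : 0;
        byte[] bom = fileBytes.Take(offset).ToArray();
        byte[] body = fileBytes.Skip(offset).ToArray();
        return (bom, body);
    }
}
```

For the 12 bytes in the question, Split returns 255 254 as the BOM and 49 0 50 0 51 0 52 0 53 0 as the body.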

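But even that will give you problems, because you can't just reverse the bytes of UTF-16 code: each character here takes two bytes, so reversing byte by byte gives you the code for a different character, or indeed an invalid code.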
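If the goal is to reverse the text rather than the raw bytes, one possible approach is to decode the bytes after the BOM, reverse the characters, and encode them again. This is only a sketch: the names Utf16Reverser and ReverseText are invented here, and reversing individual char values would still break surrogate pairs (characters outside the Basic Multilingual Plane), which it doesn't try to handle.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf16Reverser
{
    // Hypothetical helper: reverses the characters of a UTF-16 LE file image
    // while leaving the BOM (if present) at the front.
    // Assumption: the text contains no surrogate pairs.
    public static byte[] ReverseText(byte[] fileBytes)
    {
        byte[] bom = Encoding.Unicode.GetPreamble(); // { 255, 254 }
        bool hasBom = fileBytes.Length >= bom.Length
            && fileBytes.Take(bom.Length).SequenceEqual(bom);
        int offset = hasBom ? bom.Length : 0;

        // Decode only the text bytes, then reverse whole characters, not bytes.
        char[] chars = Encoding.Unicode.GetChars(fileBytes, offset, fileBytes.Length - offset);
        Array.Reverse(chars);

        byte[] body = Encoding.Unicode.GetBytes(chars);
        return hasBom ? bom.Concat(body).ToArray() : body;
    }
}
```

For the bytes in the question this gives 255 254 53 0 52 0 51 0 50 0 49 0, which is "54321" with the BOM still in place.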

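Anyway, what's happening is that when you reverse the byte order, you wind up with the BOM stuck at the end, where it forms an invalid UTF-16 code. That is the "?"-like character (�) you're seeing at the end. The reversal also messes up the encoding of the other characters: a pair such as 49 0 (the digit 1, U+0031) becomes 0 49, which decodes as U+3100 (㄀).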
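To see this from code, here is a small sketch that reverses the whole byte array, BOM included, and decodes the result with Encoding.Unicode.GetString. The names ReversedBytesDemo and DecodeReversed are illustrative only:

```csharp
using System;
using System.Text;

public static class ReversedBytesDemo
{
    // Hypothetical helper: reverses every byte (BOM included) and decodes
    // the result as UTF-16 LE, reproducing what the question describes.
    public static string DecodeReversed(byte[] fileBytes)
    {
        byte[] reversed = (byte[])fileBytes.Clone(); // leave the caller's array untouched
        Array.Reverse(reversed);
        return Encoding.Unicode.GetString(reversed);
    }
}
```

With the 12 bytes from the question, the returned string has six characters. The first five are U+3500, U+3400, U+3300, U+3200 and U+3100, and the sixth comes from the reversed BOM bytes 254 255.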

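However, it looks like Notepad is opening the file using an ANSI encoding, i.e. the code page used by your current locale, presumably because the file no longer starts with a BOM.

The text file now contains the bytes 0 53 0 52 0 51 0 50 0 49 254 255. Notepad is displaying each 0 as a space, and the other values less than 0x80 are being converted to their ASCII characters (53 is "5", 52 is "4", and so on), while 254 is converted to ю and 255 to я (which I assume are the values of those characters in the ANSI code page for your current locale).

I'm guessing you're in a Slavic region that uses the Cyrillic script.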
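Notepad's view can be approximated by decoding the reversed bytes with a single-byte code page. The sketch below assumes Windows-1251 (Cyrillic), which is only a guess based on the ю and я in the question; AnsiViewDemo and DecodeAsAnsi are made-up names. On .NET Core and .NET 5+ the code page has to be made available through CodePagesEncodingProvider first.

```csharp
using System;
using System.Text;

public static class AnsiViewDemo
{
    // Hypothetical helper: decodes bytes the way an ANSI-only viewer might.
    // Assumption: code page 1251 stands in for "the current locale's code page".
    public static string DecodeAsAnsi(byte[] bytes, int codePage = 1251)
    {
        // Required on .NET Core / .NET 5+; .NET Framework ships these
        // code pages already, so this line can be dropped there.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding ansi = Encoding.GetEncoding(codePage);

        // Each byte becomes one character; show NUL as a space,
        // as Notepad appears to do in the question.
        return ansi.GetString(bytes).Replace('\0', ' ');
    }
}
```

Passing in 0 53 0 52 0 51 0 50 0 49 254 255 yields " 5 4 3 2 1юя" (with a leading space for the first 0 byte), matching what Notepad shows.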

