How to get a single Arabic letter in a string with its Unicode transformation value in Delphi?
Consider the Arabic word جبل, which is made of 3 letters.
The first letter is جـ, named ǧīm. Its Unicode value is FE9F when it is at the beginning of a word, its basic value is 062C, and its isolated value is FE9D. The last two values render as the same shape, ج.
Now, whenever I try to get the single character (and I have tried many different ways), Delphi returns the basic Unicode value. Well, that makes sense, but where does the character transformation happen? Does it apply to a single char too? It looks like a letter takes its transformed value when it is within a string. Where? How can I extract it? When, and by which process, are these values decided? Again, the main question: how can I get an Arabic letter, or its Unicode value, as it appears within a string?
Just for information: unlike English, which has two letter cases (capital and small), Arabic has 4 cases (isolated, beginning, middle and end), with different rules as well.
I'm not sure I understand the question. If you want to know how to write U+FE9F in Delphi source code, then in a modern Unicode version of Delphi you do it like so:
char($fe9f)
If you want to read the individual characters of جبل, do it like this:
const
  MyWord = 'جبل';
var
  c: Char;
....
c := MyWord[1]; // this is U+062C
Note that the code above is fine for this particular word because each code point can be encoded with a single UTF-16 WideChar character element. If a code point required multiple elements, it would be best to transform to UTF-32 for code point level processing.
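As a rough illustration of code point level processing, the sketch below converts the word to UTF-32 with the RTL function UnicodeStringToUCS4String and prints each code point. It assumes a Unicode-enabled Delphi with namespaced units (XE2 or later); the program name and variable names are made up for the example, and the word is written with character escapes so that the source file encoding does not matter.

```delphi
program CodePointsSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  S: string;
  U: UCS4String;
  I: Integer;
begin
  // The word from the question, written as escapes: U+062C U+0628 U+0644
  S := #$062C#$0628#$0644;

  // One array element per code point (surrogate pairs are combined)
  U := UnicodeStringToUCS4String(S);

  // A UCS4String ends with a zero terminator, so stop one element short
  for I := 0 to Length(U) - 2 do
    Writeln('U+' + IntToHex(Integer(U[I]), 4));
end.
```

For this word the UTF-32 array holds the same three values as the UTF-16 string, because every code point is in the Basic Multilingual Plane.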
Now, let's look at the string that you included in the question. I downloaded the question using wget, and the file came down the wires UTF-8 encoded. I used Notepad++ to convert it to UTF-16 LE and picked out the 3 UTF-16 characters of the string. They are:
U+062C U+0628 U+0644
You stated:
The first letter is جـ, its name is ǧīm, and its Unicode value is U+FE9F.
But that is incorrect. As can be seen above, the actual character that you posted is U+062C. The reason why your attempts to read the first character yield U+062C is that U+062C is the first character of the string.
The bottom line is that nothing in your Delphi code is transforming the character. When you do:
s[1] := char($fe9f);
the compiler performs a simple 2 byte copy. No context-aware transformation occurs. And likewise when reading s[1].
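A minimal console sketch of that point, assuming a Unicode-enabled Delphi (XE2 or later for the namespaced unit): whatever value is stored in a string element is exactly the value read back, whether it is the basic letter or a presentation form. The program and variable names are illustrative only.

```delphi
program NoTransformSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  S: string;
begin
  // The word from the question: U+062C U+0628 U+0644
  S := #$062C#$0628#$0644;

  // Reading returns the stored basic letter, not a positional form
  Writeln(IntToHex(Ord(S[1]), 4));  // 062C

  // Writing a presentation form stores exactly that value
  S[1] := Char($FE9F);
  Writeln(IntToHex(Ord(S[1]), 4));  // FE9F

  // Neighbouring elements are untouched by the assignment
  Writeln(IntToHex(Ord(S[2]), 4));  // 0628
end.
```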
Let's look at how these characters are displayed, using this simple code in a VCL forms application that contains a memo control:
Memo1.Clear;
Memo1.Lines.Add(StringOfChar(Char($FE9F), 2));
Memo1.Lines.Add(StringOfChar(Char($062C), 2));
The output looks like this:
As you can see, the rendering layer knows how to shape the U+062C character when it appears at the beginning of the string.