

You might need to understand combining accents and various other aspects of Telugu. I don't know Telugu at all, so what follows may be inaccurate, but I think it more or less makes sense of what's in the Anu Script Software output:

    UTF-8 bytes       PUA         Telugu           Glyph
    0xEF 0x83 0xA1  = U+F0E1  =>  U+0C2F           య   (three code points for one character)
    0xEF 0x83 0xA7  = U+F0E7  =>  U+0C46 U+0C66    ౦ె  (Note 1)

Note 1: The TELUGU VOWEL SIGN E U+0C46 should combine with TELUGU DIGIT ZERO U+0C66 - if I've identified the characters correctly, which seems improbable.
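As a rough illustration of how such a conversion could be scripted, here is a minimal Python sketch. The names PUA_TO_TELUGU and convert are hypothetical, and the table holds only the two mappings listed above (the second of which is doubtful, per Note 1). A real converter would need the complete Anu Script Software mapping, which I don't have.

```python
# Hypothetical, incomplete mapping: only the two entries identified above.
PUA_TO_TELUGU = {
    "\uf0e1": "\u0c2f",        # U+F0E1 => U+0C2F
    "\uf0e7": "\u0c46\u0c66",  # U+F0E7 => U+0C46 U+0C66 (uncertain, see Note 1)
}


def convert(text: str) -> str:
    """Replace known PUA code points; leave every other character untouched."""
    return "".join(PUA_TO_TELUGU.get(ch, ch) for ch in text)


# The UTF-8 bytes from the table, decoded into PUA code points.
raw = bytes([0xEF, 0x83, 0xA1, 0xEF, 0x83, 0xA7])
decoded = raw.decode("utf-8")
converted = convert(decoded)

print([f"U+{ord(c):04X}" for c in decoded])
print([f"U+{ord(c):04X}" for c in converted])
```

Unmapped characters pass through unchanged, so any PUA code points still present in the output show which table entries are missing.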

Browsers will only understand the Private Use Area code points if they're made aware of where the relevant font is, which gets into intricate details and is probably platform specific. You cannot use these code points except with software that understands what Anu Script Software does. The standard Unicode range for Telugu is U+0C00..U+0C7F. Your best bet is probably to analyze the similarities and differences between the code points used by Anu Script Software and the Unicode standard range for Telugu, and then use the Unicode standard codes.
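One possible starting point for that analysis is to take an inventory of which PUA code points actually occur in the Anu Script Software output, and to check whether a converted string lies entirely in the standard Telugu range. The sketch below assumes nothing about the software itself. The function names and the sample string are made up for illustration.

```python
from collections import Counter

TELUGU_FIRST, TELUGU_LAST = 0x0C00, 0x0C7F  # standard Unicode range for Telugu
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF        # BMP Private Use Area


def pua_inventory(text: str) -> Counter:
    """Count each BMP PUA code point that occurs in the text."""
    return Counter(ch for ch in text if PUA_FIRST <= ord(ch) <= PUA_LAST)


def is_standard_telugu(text: str) -> bool:
    """True if every non-space character lies in U+0C00..U+0C7F."""
    return all(
        ch.isspace() or TELUGU_FIRST <= ord(ch) <= TELUGU_LAST for ch in text
    )


# Made-up sample using the two PUA code points discussed in this answer.
sample = "\uf0e1\uf0e7 \uf0e1"
for ch, count in pua_inventory(sample).most_common():
    print(f"U+{ord(ch):04X} occurs {count} time(s)")
print("standard Telugu only:", is_standard_telugu(sample))
```

The most frequent PUA code points are probably the ones worth identifying first when building a mapping table by hand.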

The character codes that were copied and pasted into the question are Unicode code points in the Unicode BMP (Basic Multilingual Plane) Private Use Area (PUA). If you go to the Unicode Charts page and enter 'F020' as the code, it gives you UE000.pdf to download, which says:

    Private Use Area
    Range: E000-F8FF
    The Private Use Area does not contain any character assignments, consequently no character code charts or names lists are provided for this area.

What this means is that the Anu Script Software is using Unicode code points that have no internationally agreed meaning: the BMP PUA is, by definition, for 'private use', and the parties sharing data using the PUA must agree on what the code points mean and how to display them. They only work with software that understands the convention.
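You can confirm this for yourself with Python's standard unicodedata module. This is a small sketch: is_bmp_pua is a helper name I made up, and the range is the E000-F8FF quoted from the chart. PUA code points have the general category 'Co' (private use) and no character name, whereas a real Telugu letter has both a category and a name.

```python
import unicodedata

BMP_PUA_FIRST, BMP_PUA_LAST = 0xE000, 0xF8FF  # range quoted from the chart


def is_bmp_pua(ch: str) -> bool:
    """True if the character is in the BMP Private Use Area."""
    return BMP_PUA_FIRST <= ord(ch) <= BMP_PUA_LAST


for ch in ("\uf020", "\uf0e1", "\u0c2f"):
    print(
        f"U+{ord(ch):04X}",
        "PUA" if is_bmp_pua(ch) else "assigned",
        unicodedata.category(ch),
        unicodedata.name(ch, "<no name>"),  # PUA code points have no name
    )
```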
