4 byte utf8 and 2 utf16 support required for some Unicode 15.0.0 areas #886

rovasiras · 2022-11-06T17:47:00Z

Unicode Standard 15.0.0 adds two important blocks:
U+10EC0 - U+10EFF Arabic Extended-C
U+1E030 - U+1E08F Cyrillic Extended-D

rovasiras · 2022-11-26T21:32:20Z

@caolanm Supporting this requires the following steps in the u8_u16 function: decode the 4-byte UTF-8 sequence to a UTF-32 code point, then split that code point into two surrogate code units. The u16_u8 function needs the mirrored method. The correct method is described in the Unicode FAQ "UTF-8, UTF-16, UTF-32".
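
For illustration, the two steps described above (and their mirror image) can be sketched as follows. This is a minimal sketch in Python, not the actual Hunspell code: the function names `utf8_4byte_to_surrogates` and `surrogates_to_utf8` are hypothetical, and the sketch handles only a single well-formed 4-byte sequence rather than a whole string.

```python
def utf8_4byte_to_surrogates(data):
    """Decode one 4-byte UTF-8 sequence and split it into a UTF-16 surrogate pair.

    Hypothetical helper for illustration only; not the Hunspell u8_u16 function.
    """
    if len(data) != 4 or (data[0] & 0xF8) != 0xF0:
        raise ValueError("expected a 4-byte UTF-8 sequence (lead byte 11110xxx)")
    if any((b & 0xC0) != 0x80 for b in data[1:]):
        raise ValueError("continuation bytes must have the form 10xxxxxx")

    # Step 1: assemble the UTF-32 code point from 3 + 6 + 6 + 6 payload bits.
    cp = ((data[0] & 0x07) << 18) | ((data[1] & 0x3F) << 12) \
        | ((data[2] & 0x3F) << 6) | (data[3] & 0x3F)
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary planes")

    # Step 2: subtract 0x10000 and split the remaining 20 bits into two halves.
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # upper 10 bits -> high (lead) surrogate
    low = 0xDC00 | (v & 0x3FF)   # lower 10 bits -> low (trail) surrogate
    return high, low


def surrogates_to_utf8(high, low):
    """Mirrored direction: join a surrogate pair and encode it as 4 UTF-8 bytes."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    cp = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

For example, U+10EC0 (the first code point of Arabic Extended-C) is the byte sequence F0 90 BB 80 in UTF-8 and the surrogate pair D803 DEC0 in UTF-16.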

rovasiras · 2022-11-27T11:14:54Z

@laszlonemeth what do you think about it? #886 (comment)

laszlonemeth · 2022-11-29T08:45:48Z

A temporary and back-compatible solution could be to use ICONV and OCONV to convert the non-BMP characters e.g. to user-defined characters.
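
One possible reading of this workaround is sketched below: generate ICONV and OCONV tables for the affix file that map each non-BMP character to a BMP Private Use Area character and back. This is only a sketch under assumptions. The helper name `conv_tables` and the choice of U+E000 as the starting private-use code point are hypothetical, and the dictionary words would presumably have to be stored with the same private-use characters for the mapping to be of any use.

```python
def conv_tables(first, last, pua_start=0xE000):
    """Build ICONV/OCONV lines mapping code points first..last to BMP
    Private Use Area characters (hypothetical helper, for illustration)."""
    count = last - first + 1
    if count <= 0:
        raise ValueError("empty range")
    if pua_start < 0xE000 or pua_start + count - 1 > 0xF8FF:
        raise ValueError("range does not fit in the BMP Private Use Area")

    pairs = [(chr(first + i), chr(pua_start + i)) for i in range(count)]

    # Each table starts with a header line giving the number of definitions,
    # followed by one "pattern replacement" line per character.
    iconv = [f"ICONV {count}"] + [f"ICONV {src} {dst}" for src, dst in pairs]
    oconv = [f"OCONV {count}"] + [f"OCONV {dst} {src}" for src, dst in pairs]
    return iconv, oconv


# Example: tables for the Cyrillic Extended-D block.
iconv_lines, oconv_lines = conv_tables(0x1E030, 0x1E08F)
```

ICONV rewrites the input before it is checked and OCONV rewrites the output, so the two tables here are inverses of each other.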
