A question about Unicode encoding, from Ruan Yifeng's article.

clipboard.png

Original text:
The Unicode code point of 严 (yan) is 4E25 (100111000100101). According to the table above, 4E25 falls within the range of the third row (0000 0800 - 0000 FFFF), so the UTF-8 encoding of 严 requires three bytes, that is, the format is 1110xxxx 10xxxxxx 10xxxxxx. Then, starting from the last binary digit of 严, fill in the x positions in the format from back to front, and pad any remaining x positions with 0. This gives the UTF-8 encoding of 严 as 11100100 10111000 10100101, which in hexadecimal is E4B8A5.

Original address:
http://www.ruanyifeng.com/blo.

Can someone help explain this to me? Thank you!
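
For reference, the steps in the quoted passage can be sketched in Python. This is only an illustration of the three-byte case described above, not code from the article; the function name `utf8_encode_3byte` is made up for this example. It pads the code point to 16 bits (the "pad with 0" step), then splits those bits 4 + 6 + 6 into the x positions of 1110xxxx 10xxxxxx 10xxxxxx.

```python
def utf8_encode_3byte(code_point: int) -> bytes:
    """Encode a code point in the range 0x0800-0xFFFF as three UTF-8 bytes.

    Hypothetical helper for illustration only. It does not reject the
    surrogate range (0xD800-0xDFFF), which a real encoder must refuse.
    """
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only handles the three-byte range")

    # The template has 4 + 6 + 6 = 16 x positions, so left-pad the binary
    # form with zeros to 16 bits. 0x4E25 has 15 bits, so one 0 is added.
    bits = format(code_point, "016b")

    byte1 = "1110" + bits[0:4]    # 1110xxxx
    byte2 = "10" + bits[4:10]     # 10xxxxxx
    byte3 = "10" + bits[10:16]    # 10xxxxxx

    return bytes(int(b, 2) for b in (byte1, byte2, byte3))


encoded = utf8_encode_3byte(0x4E25)
print(encoded.hex().upper())               # E4B8A5
print(encoded == "严".encode("utf-8"))      # True
```

Padding on the left and then slicing from the front gives the same result as filling the x positions from back to front, because the leftover leading positions end up as 0 either way.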

Apr.30,2021
