Why is the value here different, and what is the solution?

package main

import (
    "fmt"
)

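// printBytes prints every byte of s in hexadecimal.
// Indexing a string (s[i]) yields a byte, not a character.
func printBytes(s string) {
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
}

// printChars converts s to a rune slice first, so each element
// is one Unicode code point rather than one byte.
func printChars(s string) {
    runes := []rune(s)
    for i := 0; i < len(runes); i++ {
        fmt.Printf("%c ", runes[i])
    }
}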

func main() {
    runeSlice := []rune{0x0053, 0x0065, 0x00f1, 0x006f, 0x0072}
    str := string(runeSlice)
    fmt.Println(str)
    name := "Señor"
    printBytes(name)
    fmt.Printf("\n")
    printChars(name)
}

Output:
Señor
53 65 c3 b1 6f 72   // 5 characters take up 6 bytes
S e ñ o r

Nov.02,2021

The problem is caused by UTF-8 encoding. UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode (also called the universal code), created by Ken Thompson in 1992. It is now standardized as RFC 3629. The original design used 1 to 6 bytes per character; RFC 3629 limits it to 1 to 4 bytes. The first three ranges are:

<table>
  <thead>
    <tr><th align="center">Unicode range</th><th align="center">Number of bits</th><th>Number of bytes</th></tr>
  </thead>
  <tbody>
    <tr><td align="center">0000 ~ 007F</td><td align="center">0-7</td><td>1</td></tr>
    <tr><td align="center">0080 ~ 07FF</td><td align="center">8-11</td><td>2</td></tr>
    <tr><td align="center">0800 ~ FFFF</td><td align="center">12-16</td><td>3</td></tr>
  </tbody>
</table>
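To check the ranges in the table, here is a minimal sketch using the standard `unicode/utf8` package. The helper name `encodedBytes` and the sample code points (`S`, `ñ`, and the Chinese character `中`) are chosen here for illustration only and are not part of the original question.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedBytes is a hypothetical helper that returns the UTF-8
// encoding of a single code point as a byte slice.
func encodedBytes(r rune) []byte {
	buf := make([]byte, utf8.UTFMax) // UTFMax is 4, the longest encoding
	n := utf8.EncodeRune(buf, r)     // n is the number of bytes written
	return buf[:n]
}

func main() {
	// One sample code point from each range in the table.
	for _, r := range []rune{0x0053, 0x00f1, 0x4e2d} {
		fmt.Printf("U+%04X %c -> % x (%d bytes)\n",
			r, r, encodedBytes(r), utf8.RuneLen(r))
	}
}
```

Run as written, this should show `53` for `S`, `c3 b1` for `ñ`, and `e4 b8 ad` for `中`, matching the 1, 2, and 3 byte rows.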
  1. Your third character, 0x00f1, already falls in the 8-11 bit range, so it is encoded as 2 bytes (c3 b1).
  2. len in Go counts bytes, not characters, so calling len on a Chinese string returns 3 * the number of Chinese characters.
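The difference between byte count and character count can be sketched as follows. This is only an illustration: the helper `counts` and the sample string `"中文"` are assumptions made for this example. `utf8.RuneCountInString` is the standard library call for counting code points, and `range` over a string decodes one rune per iteration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts is a hypothetical helper that returns the byte length
// and the rune (character) count of s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"Señor", "中文"} {
		b, r := counts(s)
		fmt.Printf("%q: %d bytes, %d characters\n", s, b, r)
	}

	// range yields the byte offset and the decoded rune, so the
	// offset jumps from 2 to 4 across the 2-byte ñ.
	for i, r := range "Señor" {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()
}
```

Iterating with `range` avoids the extra allocation of `[]rune(s)` when the characters only need to be visited once.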
