How can I use JS to correctly calculate how long my input will be once it is stored in MySQL?

If I enter

annual salary above 30K

the length property reports 7 (the original input is four Chinese characters plus "30K").

My database encoding is utf8mb4, where the length of a Chinese character is equivalent to 4, so once this field is stored in the database its length is equivalent to 19.

Suppose I enter the content [annual salary above 30K]. How should I use JS to calculate that its length is 19?

May 19, 2022

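function chineseLen(str) {
  const reg = /[\u4E00-\u9FA5]/g;         // matches common Chinese characters
  const matches = str.match(reg) || [];   // match() returns null when nothing matches
  // str.length counts every character once; add 3 more per Chinese
  // character so that each one counts as 4
  return str.length + matches.length * 3;
}
chineseLen('30K')   // 3, because there are no Chinese characters in this string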

I wish you success!


utf8 is a variable-length encoding. utf8mb4 only means that a character takes at most 4 bytes, not that every character takes 4 bytes.

MySQL's original utf8 can only store characters in the Basic Multilingual Plane (BMP). Under utf8mb4, characters in the BMP are encoded exactly as in utf8, so commonly used Chinese characters still occupy only 3 bytes. The only common characters that occupy four bytes are rare characters used in personal names and emoji.
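
To check these byte counts without writing the ranges by hand, here is a minimal sketch using the standard TextEncoder API, which always encodes to UTF-8. The function name utf8ByteLength and the sample strings are my own choices for illustration, not something from the question.

```javascript
// Sketch: number of bytes a string occupies when encoded as UTF-8.
// utf8ByteLength is a hypothetical helper name chosen for this example.
function utf8ByteLength(str) {
  // TextEncoder always produces UTF-8 and returns a Uint8Array.
  return new TextEncoder().encode(str).length;
}

// Sample inputs below are assumptions picked to cover each case.
console.log(utf8ByteLength("30K"));          // 3: ASCII, 1 byte each
console.log(utf8ByteLength("\u5e74\u85aa")); // 6: two BMP Chinese characters, 3 bytes each
console.log(utf8ByteLength("\u{1F600}"));    // 4: an emoji outside the BMP
```

TextEncoder is available as a global in current browsers and in recent Node.js versions, so this should run in either place.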

In MySQL, the length of a VARCHAR field is specified in characters, not bytes, so the 7-character input "annual salary above 30K" fits in a VARCHAR(7) field.
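
Since the limit is counted in characters, a check on the JS side could look roughly like the sketch below. It assumes MySQL counts one character per Unicode code point, which differs from str.length only for characters outside the BMP (str.length counts UTF-16 code units, so an emoji counts as 2). The names charLength and fitsVarchar, and the sample strings, are hypothetical.

```javascript
// Sketch: count characters by Unicode code point rather than UTF-16 code unit.
// charLength and fitsVarchar are hypothetical helper names.
function charLength(str) {
  // Spreading a string iterates by code point, so a surrogate pair counts once.
  return [...str].length;
}

// Assumption: the column limit n of VARCHAR(n) is compared against the code point count.
function fitsVarchar(str, limit) {
  return charLength(str) <= limit;
}

// Assumed sample: four Chinese characters plus "30K", 7 characters in total.
const sample = "\u5e74\u85aa30K\u4ee5\u4e0a";
console.log(charLength(sample));      // 7
console.log(fitsVarchar(sample, 7));  // true
console.log("\u{1F600}".length);      // 2: UTF-16 code units
console.log(charLength("\u{1F600}")); // 1: one code point
```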

You can try it yourself:

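// Counts the UTF-8 byte length of str and logs the characters in each size class.
// replace() is used only to visit each match; its return value is ignored.
function byteLength(str) {
  let len = 0;
  console.log("1-byte characters (U+0000 to U+007F):");
  str.replace(/[\u{00}-\u{7f}]/gu, e => {
    len += 1;
    console.log(e);
  });
  console.log("2-byte characters (U+0080 to U+07FF):");
  str.replace(/[\u{80}-\u{07ff}]/gu, e => {
    len += 2;
    console.log(e);
  });
  console.log("3-byte characters (U+0800 to U+FFFF):");
  str.replace(/[\u{0800}-\u{ffff}]/gu, e => {
    len += 3;
    console.log(e);
  });
  console.log("4-byte characters (U+10000 to U+10FFFF):");
  str.replace(/[\u{010000}-\u{10ffff}]/gu, e => {
    len += 4;
    console.log(e);
  });
  return len;
}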

console.log("Total bytes:", byteLength("English??Español?"));