How can a regular expression start the next match at the next character instead of after the previous match?

When using a regular expression, I find that each new match starts after the text that was just matched. How can I make it try every index of the string in order?

for example: "SegmentFault is a good forum"

I write regular expressions: [X {4e00}-x {9fa5}] {2}

can match: "Yes", "good", "Forum"

but I want to match today: "Yes", "one", "good", "good comment" and "Forum"

What method can be used to achieve this?


With a regular expression alone this is probably hard to handle, because matched content is consumed and will not be matched a second time.
If it is JS, you can write it like this:

var str="SegmentFault";
var regex=/[\u4e00-\u9fa5]{2}/g;
var matchStr=null;
var result=[];
while((matchStr=regex.exec(str))!=null){
    result.push(matchStr[0]);
    regex.lastIndex--;
}


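Recursive approach:

var string = "SegmentFault";
var reg = /[\w]{2}/;
function seg(str) {
  var m = str.match(reg);
  // Stop when the remaining text has no match; otherwise m[0] would throw.
  if (m) {
    console.log(m[0]);
    // Drop the first character and search the rest.
    seg(str.substring(1));
  }
}
seg(string);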
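
One caveat with searching each shorter suffix: match() finds the first match anywhere in the remaining text, so when the text contains characters that do not match, the same pair can be reported several times. A possible variation, sketched here with a hypothetical helper name (matchesAtEachIndex) and the sticky flag, only accepts a match that begins exactly at the current index.

```javascript
// Hypothetical helper: collect the match that starts exactly at each index.
function matchesAtEachIndex(text, source) {
  // The "y" (sticky) flag makes exec() match only at lastIndex.
  const sticky = new RegExp(source, "y");
  const found = [];
  for (let i = 0; i < text.length; i++) {
    sticky.lastIndex = i;
    const m = sticky.exec(text);
    if (m) {
      found.push(m[0]);
    }
  }
  return found;
}

// The gap between "ab" and "cd" does not produce repeated "cd" entries.
const stickyResult = matchesAtEachIndex("ab  cd", "\\w{2}");
console.log(stickyResult);
```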

Different languages handle this differently. In JavaScript, a regular expression with the global flag has a lastIndex property, which sets the position where the next match starts.

const str = "SegmentFault"
const matcher = /[\u4e00-\u9fa5]{2}/g
const result = []
while (true) {
  const m = matcher.exec(str)
  if (!m) { break }
  result.push(m[0])
  matcher.lastIndex = matcher.lastIndex - m[0].length + 1
}
console.log(result)
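
The same lastIndex adjustment can be wrapped in a reusable function. This is only a sketch: the name overlappingMatches is hypothetical, and it assumes the text has no characters outside the Basic Multilingual Plane, since it advances by one UTF-16 code unit at a time.

```javascript
// Hypothetical helper, not from the original answers.
function overlappingMatches(text, pattern) {
  // Copy the pattern with the g flag so lastIndex is honored
  // and the caller's regex object is left untouched.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  const out = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    out.push(m[0]);
    // Resume one position after the START of this match, not after its end.
    re.lastIndex = m.index + 1;
  }
  return out;
}

const overlapResult = overlappingMatches("abcd", /[a-z]{2,3}/);
console.log(overlapResult);
```

Setting lastIndex from m.index rather than from the match length also covers variable-length and empty matches, because the start position always moves forward by one.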