How is keyword highlighting in a search function implemented?

I'm asking in order to learn. I want to build a full-text search feature for my blog, like the ones on Jianshu and Baidu, that searches not only titles but also content, matches keywords, and highlights them. Should this be done by the front end or by the back end?

To keep the questions from getting mixed up, I'll ask them one at a time, and I'd appreciate answers in the same order. Thank you.

question 1:

How do I implement keyword matching? I don't mean simple whole-string matching: for example, when I type "how java implements full-text search", I don't want to match only content that contains that entire string. The query should be split into words, and those words should match even when other words sit between them. The desired effect is as follows:
Baidu:

clipboard.png

Jianshu:

clipboard.png

Of course, achieving exactly this effect probably takes a lot of money and effort. It doesn't have to be identical; something similar is good enough. Is there a framework that does this?


question 2:

How do I highlight the keywords, leaving aside naive whole-string matching? Baidu and Jianshu both seem to split the query into separate words. I can't tell whether a framework provides this or whether their engineers built it in-house. I also see that Baidu wraps each keyword in an "em" tag. So which side does the tagging: does the back end return the raw data and the front end add the highlighting, or does the back end add the tags after fetching the data and return the result, with the front end just styling "em" in red?

Feb.25,2022

An implementation in jQuery:

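// Escape HTML special characters, then regex metacharacters, in the search string
function encode(s) {
    return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/([\\\.\*\[\]\(\)\$\^])/g, "\\$1");
}
// Reverse of encode: strip the regex escapes, then restore the HTML characters
function decode(s) {
    return s.replace(/\\([\\\.\*\[\]\(\)\$\^])/g, "$1").replace(/&gt;/g, ">").replace(/&lt;/g, "<").replace(/&amp;/g, "&");
}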
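// Walk the DOM under obj and mark matches in every text node; returns the match count
function loopSearch(s, obj) {
    var cnt = 0;
    if (obj.nodeType == 3) { // 3 = text node
        cnt = replace(s, obj);
        return cnt;
    }
    for (var i = 0, c; c = obj.childNodes[i]; i++) {
        // Skip elements that are already highlighted
        if (!c.className || c.className != "highlight")
            cnt += loopSearch(s, c);
    }
    return cnt;
}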
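// Wrap every match in a text node with placeholder markers; returns the match count
function replace(s, dest) {
    var r = new RegExp(s, "g");
    var tm = null;
    var t = dest.nodeValue;
    var cnt = 0;
    if (tm = t.match(r)) {
        cnt = tm.length;
        // The markers are turned into real <span> tags later, in highlight()
        t = t.replace(r, "{searchHL}" + decode(s) + "{/searchHL}");
        dest.nodeValue = t;
    }
    return cnt;
}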
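function highlight(s, l) {
    if (s.length == 0) {
        return false;
    }
    // The query is lowercased and the regex has no "i" flag, so only lowercase occurrences match
    s = encode(s).toLowerCase();
    var obj_li = $('#' + l).find('h4'); // the h4 elements inside the element whose id is l
    obj_li.each(function (i, o) {
        // Remove highlight spans left over from a previous search
        var t = o.innerHTML.replace(/<span\s+class=.?highlight.?>([^<>]*)<\/span>/gi, "$1");
        o.innerHTML = t;
        var cnt = loopSearch(s, o);
        t = o.innerHTML;
        // Turn the placeholder markers into real highlight spans
        var r = /{searchHL}(({(?!\/searchHL})|[^{])*){\/searchHL}/g;
        t = t.replace(r, "<span class='highlight'>$1</span>");
        o.innerHTML = t;
    });
}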

Usage:
For example, if your content is displayed in the div #aaaa and the titles of the search results are in h4 tags, call the function above as highlight('hehe', 'aaaa'), and every "hehe" in those h4 titles will be highlighted.
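
The function above matches the query as one whole string. As a rough sketch of the split-word matching asked about in question 1, the query could be split on whitespace and each term wrapped in an "em" tag. The names escapeHtml, escapeRegExp and highlightTerms are made up for this illustration, and whitespace splitting is only an assumption that suits space-separated languages; Chinese text would need a real word segmenter.

```javascript
// Escape HTML special characters so plain text is safe to insert as markup.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Split the query on whitespace and wrap every occurrence of any term in <em>.
// Matching is case-insensitive and keeps the original casing of the text.
function highlightTerms(text, query) {
    var terms = query.split(/\s+/).filter(function (t) { return t.length > 0; });
    if (terms.length === 0) {
        return escapeHtml(text);
    }
    // Longer terms first, so a longer term wins over its own prefix.
    terms.sort(function (a, b) { return b.length - a.length; });
    var pattern = new RegExp(terms.map(escapeRegExp).join("|"), "gi");
    var result = "";
    var last = 0;
    var m;
    while ((m = pattern.exec(text)) !== null) {
        // Escape the unmatched text and the match separately, then add the tag.
        result += escapeHtml(text.slice(last, m.index));
        result += "<em>" + escapeHtml(m[0]) + "</em>";
        last = m.index + m[0].length;
    }
    return result + escapeHtml(text.slice(last));
}
```

Calling highlightTerms("How Java implements full-text search", "java search") wraps "Java" and "search" separately, and the page can then style em in red with CSS.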

This is only a rough starting point; hopefully it draws out better answers.


Keyword-based highlighting needs the support of a search engine. You can pick either of two open-source, Lucene-based search engines to learn: Elasticsearch or Solr.
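
To give an idea of what that support looks like, here is a minimal sketch of an Elasticsearch search request body that asks for highlighted fragments. The function names buildSearchBody and extractHighlights and the field names "title" and "content" are assumptions for illustration; the query and highlight sections follow Elasticsearch's search API.

```javascript
// Build the body of an Elasticsearch _search request that searches the
// title and content fields and asks for highlighted fragments.
// "title" and "content" are assumed field names for a blog index.
function buildSearchBody(keywords) {
    return {
        query: {
            multi_match: {
                query: keywords,
                fields: ["title", "content"]
            }
        },
        highlight: {
            // <em> is also the default tag; it is spelled out here for clarity.
            pre_tags: ["<em>"],
            post_tags: ["</em>"],
            fields: {
                title: {},
                content: {}
            }
        }
    };
}

// Collect the highlighted fragments of each hit from a search response.
function extractHighlights(response) {
    return response.hits.hits.map(function (hit) {
        return { id: hit._id, highlight: hit.highlight || {} };
    });
}
```

The body would be sent as JSON in a POST to the index's _search endpoint. Each hit's highlight field then carries fragments with the tags already inserted, so in this setup the back end does the tagging and the front end only needs CSS to show em in red.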

Judging by search results, Elasticsearch seems to be more popular than Solr, so it should be easier to find help when you run into problems. It is still somewhat complicated and takes a while to get started with. You can look through the official documentation (ide/en/elasticsearch/reference/current/getting-started.html) to get a feel for it.

  • The Chinese documentation only covers version 2.0 and has not been updated for a long time. At present, the latest version is 6.5.
  • The default Elasticsearch analyzer cannot handle Chinese word segmentation well, so you need a Chinese analyzer (such as the IK analyzer).
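
As a rough sketch of the last point, an index could be created with a mapping that assigns the IK analyzers to the text fields. This assumes the elasticsearch-analysis-ik plugin is installed, which provides the ik_max_word and ik_smart analyzers; the function names and the field names "title" and "content" are made up for illustration. The shape below is the typeless mapping used by Elasticsearch 7 and later; on 6.x the properties sit under a mapping type name.

```javascript
// Field definition that uses the IK analyzers (requires the IK plugin).
function ikTextField() {
    return {
        type: "text",
        analyzer: "ik_max_word",     // fine-grained segmentation at index time
        search_analyzer: "ik_smart"  // coarser segmentation at query time
    };
}

// Body for an index-creation request; "title" and "content" are assumed fields.
function buildIndexBody() {
    return {
        mappings: {
            properties: {
                title: ikTextField(),
                content: ikTextField()
            }
        }
    };
}
```

The body would be sent as JSON when creating the index, before any documents are added, so that both indexing and searching split Chinese text into words.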
Addendum:

If the back end uses Java and you just want a simple keyword-based highlighted search that you can get started with quickly, Hibernate Search looks like a good choice.
