How to find a specific node in an html document with 100, 000 nodes

this is an interview question from Tencent. Ask the great god to answer it.

the description of the topic may not be very careful before, so talk about the whole topic in detail:

Job title: web front-end development

question:

there is an html document with 100, 000 nodes:

HTML
    body
        1
            1.2
        2
            2.1
            2.2
            2.3
            ...
        ...
        
        5
            5.2
            5.3
                5.3.1
                5.3.2
       ...
        
  1. how to find nodes 5.3.2 in a html document with 100, 000 nodes
  2. how to quickly insert node 5.3.0 before node 5.3.2

the interviewer said that the focus of the investigation is: the part of algorithm and performance optimization

how should I solve this problem?

Aug.24,2021

1. What does a particular node mean? For example, in one hundred thousand nodes, does it mean to find the thousandth p tag? In that case, just document.getElementByTagName ("p") [999] is fine.
of course, I may not understand the meaning of the topic, do not spray if it is not right.


if the document has 100, 000 nodes, the person who asked him to save it as html to disk is a scum. You should go to nosql at this time.


1
2 insertBefore insertAdjacentHTML


since you did not mention the position and requirements of the interview, it is assumed that it is a front-end interview.

how to find a specific node in an html document with 100, 000 nodes

according to the attributes of the node, select the most accurate match

from the following methods
getElementById
getElementsByName
getElementsByTagName
getElementsByClassName
querySelector
querySelectorAll

Please refer to
https://javascript.info/searc.


my idea is to filter using css's nth-child () selector.

<!--htmlidclass-->

<body>
<div>
  <header>
    <nav>
      <h1>
        <a></a>
      </h1>
      <h3>
        

</h3> </nav> </header> <div> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div> <div> <div> <header> <nav> <h1> <a></a> </h1> <h3>

</h3> </nav> </header> <div> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

target

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> </div> <footer> <span><a></a></span> <span><a></a></span> <span><a></a></span> <span><a></a></span> <span><a></a></span> </footer> </div> </div> </div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> <section> <article> <h2><a></a></h2> <div>

</div>

<span></span>

</article> </section> </div> <footer> <span><a></a></span> <span><a></a></span> <span><a></a></span> <span><a></a></span> <span><a></a></span> </footer> </div> </body>
// js

function isString (target) {
  return typeof target === 'string';
}

function findNode (parent, child) {
  if (!(isString(parent) && isString(child))) {
    return;
  }

  const path = child.split('.');
  let str = `${parent}>*`;
  for (let i = 0, len = path.length; i < len; iPP) {
    str += `:nth-child(${path[i]})${i === len - 1 ? '' : '>*'}`;
  }

  //body>*:nth-child(1)>*:nth-child(2)>*:nth-child(2)>*:nth-child(1)>*:nth-                
  //child(2)>*:nth-child(1)>*:nth-child(1)>*:nth-child(2)>*:nth-child(2)>*:nth-    
  //child(1)>*:nth-child(2)>*:nth-child(1)

  console.log(str);

  return document.querySelector(str);
}

console.log(findNode('body', '1.2.2.1.2.1.1.2.2.1.2.1'));  //

target

this is certainly not the best solution, because traversal is still needed. If the location of the target node is too deeply nested, for example, tens of thousands of nesting, then the input location as above will definitely fail, and the performance estimate is also very poor.

I feel that the binary tree traversal mentioned upstairs is relatively close to the requirements of the topic, and I am not very familiar with the algorithm myself, so I can only think of this method.


from the algorithmic level, suppose the document is a tree whose Root node is a html tag, because only the first node found needs to be returned, so the breadth-first traversal is realized through the queue. If you use querySelector, the whole tree is actually traversed.

question 1:

function isWanted(node) {
    //
}

function findNode(isWanted) {
    const root = document.getElementByTagName('html');
    const queue = [root];
    while(!isWanted(queue[0])) {
      const current = queue[0];
      const children = queue[0].children || [];
      for(i = 0; i < children.length; iPP) {
        // childrenparent, 
        children[i].parent = current;
        queue.push(children[i]);
      }
      queue.unshift();
    }
    return queue[0];
}

question 2:
according to the node, found in the first question, you can then find the parent node, then find all the child nodes through parent, and insert the node corresponding to the index position.

function insertNode(newNode) {
  const newChildren = [];
  const existingNode = findNode(isWanted);
  const parent = existingNode.parent;
  const siblings = parent.children;
//  indexOfspliceO(2n)
  for(i = 0; i < siblings; iPP) {
    if (siblings[i] === existingNode) {
      newChildren.push(newNode);
    }
    newChildren.push(siblings[i]);
  }
  parent.children = newChildren;
  return;
}

this is the algorithm for asking questions. If you have an algorithm, you can try to search deeply and extensively

.
Menu