What algorithm should be used for this search?

data = [
    {type: "point", ...},  
    {type: "point", ...},
    {type: "point", ...},
    {type: "point", ...},
    {type: "point", ...},
    {type: "point", ...},
    {type: "line", ...},
    {type: "line", ...},
    {type: "line", ...},
    {type: "line", ...},
    {type: "area", ...},
    {type: "area", ...},
    {type: "area", ...},
    {type: "area", ...},
    {type: "area", ...}
]

this data set is characterized by the fact that individual data are clustered together according to their own type. Also, the collection whose type is point must precede the line, and similarly, the collection of line must precede the area.

now it"s time to insert a new piece of data, type as" area", insert into the first item of the area collection. Is there any other search algorithm that can be faster than sequential search? Thank you

Jan.19,2022

if the order of type can be determined in advance, using dichotomy to search should be the fastest
idea is to first fetch the index of the type to be found in this sequence

then get the middle object of the target array, take the index of the middle object type and compare it with the lookup index
, then use the second half of the array [less than the first half] to continue this intermediate object matching
until you find the intermediate object that is the same as the lookup type, and use this object index to traverse forward to find the initial position and return.

when there is a large amount of data, it should increase a lot of efficiency, but if not much, it should be directly findIndex

. < hr >

just now I felt that this function might be used later, so I took the time to implement

.
function binaryFindIndex(array, find, {key, sorts}) {
  let index = sorts.indexOf(find);
  if (index > 0) {
    let right = Math.round(array.length / 2),
      i = sorts.indexOf(array[right][key]);
    while (i > -1 && i !== index) {
      right = Math.round(right / 2) + (i < index ? right : 0);
      i = sorts.indexOf(array[right][key]);
    }
    if (i > -1) {
      for (; right > -1; right--) {
        if (array[right][key] !== find) {
          return right + 1;
        }
      }
    }
  }
  return 0;
}

efficiency test

 function test(arrayLength, size) {
  let data = [].concat(
    new Array(arrayLength).fill({type: 'point'}),
    new Array(arrayLength).fill({type: 'line'}),
    new Array(arrayLength).fill({type: 'area'})
  );
  for (let i = 0; i < size; iPP) {
    console.time('binary');
    let index = binaryFindIndex(data, 'area', {
      key: 'type',
      sorts: [
        'point',
        'line',
        'area'
      ]
    });
    console.timeEnd('binary');

    console.time('find');
    let _index = data.findIndex(item => item.type === 'area');
    console.timeEnd('find');

    console.log(` ${data.length}, ${i + 1} `, index, _index);
  }
}

test(10000, 3);
test(100000, 3);
test(1000000, 3);
test(10000000, 3);

result

it can be seen that dichotomy does improve the efficiency to a certain extent under the condition of large amount of data

Menu