MongoDB aggregate query?

I'm a pure front-end developer and have never worked with a back-end database before. I'm now studying MongoDB and have run into the following situation: I set up a blog database with a collection containing all the article information. The mock data is as follows:

{ "id" : 1, "name" : "one", "tags" : [ "a", "c", "e" ] }
{ "id" : 2, "name" : "two", "tags" : [ "e" ] }
{ "id" : 3, "name" : "three", "tags" : [ "d", "e" ] }
{ "id" : 4, "name" : "four", "tags" : [ "g", "c", "e", "h" ] }
{ "id" : 5, "name" : "five", "tags" : [ "a", "c", "d" ] }

tags indicates the tags an article belongs to. I now want to find out how many distinct tags there are and how many documents fall under each tag. How can this be done? My current idea is to split tags with $unwind in an aggregate pipeline and then count with $group, but I'm not sure how to write the actual code. If there is another, more efficient way, please let me know!

Mar.28,2021

Once the data is extracted, this is a purely statistical problem; it has nothing to do with whether the code runs on the front end or the back end, and it is fairly simple to implement:
const list = [
  {"id": 1, "name": "one", "tags": ["a", "c", "e"]},
  {"id": 2, "name": "two", "tags": ["e"]},
  {"id": 3, "name": "three", "tags": ["d", "e"]},
  {"id": 4, "name": "four", "tags": ["g", "c", "e", "h"]},
  {"id": 5, "name": "five", "tags": ["a", "c", "d"]}
];
const result = {
  length: 0 // number of distinct tags
};
list.forEach(item => {
  item.tags.forEach(tag => {
    if (!result[tag]) {
      result[tag] = {
        name: tag,  // the tag itself
        list: []    // ids of the articles under this tag; list.length is the count
      };
      result.length += 1;
    }
    result[tag].list.push(item.id);
  })
});
console.log(result);

This is a classic statistics problem. It is not really specific to MongoDB; the same considerations apply to other databases.
There is nothing wrong with the approach you came up with: you can certainly get the result you want with aggregation's $unwind + $group. The problem is that over time more and more blog posts will participate in the statistics, which means each run will get slower and slower. The crux of the problem is how to limit the number of records scanned on each run.
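The $unwind + $group pipeline the question describes can be sketched like this (run in mongosh; the collection name articles is an assumption, since the question does not name it):

```javascript
// $unwind turns each article into one document per tag;
// $group then counts the documents per tag value.
db.articles.aggregate([
  { $unwind: "$tags" },                             // one doc per (article, tag) pair
  { $group: { _id: "$tags", count: { $sum: 1 } } }, // count articles per tag
  { $sort: { count: -1 } }                          // optional: most-used tags first
])
// For the sample data above this yields:
// { _id: "e", count: 4 }, { _id: "c", count: 3 }, { _id: "a", count: 2 },
// { _id: "d", count: 2 }, { _id: "g", count: 1 }, { _id: "h", count: 1 }
```

The number of distinct tags is simply the number of result documents the pipeline returns.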
From another angle: blog posts are rarely changed once they are finished. Counting which tags all blog posts belong to today will give the same result tomorrow (assuming nothing changed), so recomputing the statistics from scratch every time is wasteful.
Based on these considerations, you can count how many times each tag appears at regular intervals (say, once a day), save those counts, and then re-aggregate the already-aggregated data when needed. This technique is called pre-aggregation.
That is the general direction; you can think through the details yourself.
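As a minimal illustration of the pre-aggregation idea (the per-day data shape here is a hypothetical example, not a prescribed schema): save one tag-count object per interval, then merging the saved aggregates on demand is just a sum per tag.

```javascript
// Hypothetical pre-aggregated results: tag counts saved once per day.
const dailyCounts = [
  { date: "2021-03-27", counts: { a: 2, c: 3, e: 4 } },
  { date: "2021-03-28", counts: { c: 1, d: 2, e: 1 } }
];

// Re-aggregating the saved aggregates: sum the counts for each tag
// across all days instead of rescanning every article.
function mergeCounts(days) {
  const total = {};
  for (const day of days) {
    for (const [tag, n] of Object.entries(day.counts)) {
      total[tag] = (total[tag] || 0) + n;
    }
  }
  return total;
}

console.log(mergeCounts(dailyCounts));
// → { a: 2, c: 4, e: 5, d: 2 }
```

Each day only the articles written that day need to be aggregated, so the cost per run stays bounded no matter how large the blog grows.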
