Database: which representation of a many-to-many relationship is better overall?

background:
I want to add tagging to our products. A product can have multiple tags, and each tag can be applied to multiple products, which is a many-to-many relationship.
specific:
We have 10 million products (Product) and 20 administrator-defined tags (Tag). We want to attach one or more tags to each of these 10 million products, so we created a Product table and a Tag table.
question:
How should this many-to-many relationship be represented? How can it be optimized, and how will it perform?
our first idea:
Intermediate table: a ProductTag table links Product and Tag to form the many-to-many relationship. But every time we view a product we have to look through the ProductTag table and then fetch the tag descriptions from the Tag table. Isn't that inefficient, given that tags are viewed far more often than they are set?
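For reference, here is a minimal sketch of the intermediate-table layout, using Python's built-in sqlite3 module. The table and column names (product, tag, product_tag, description) and the helper functions are illustrative assumptions, not part of our real schema. It suggests that with a composite primary key on (product_id, tag_id), reading one product's tags should be an index lookup rather than a scan of the whole ProductTag table.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: names are assumptions made for this sketch.
    conn.executescript("""
        CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
        CREATE TABLE product_tag (
            product_id INTEGER NOT NULL REFERENCES product(id),
            tag_id     INTEGER NOT NULL REFERENCES tag(id),
            -- The composite key doubles as the index for "tags of a product".
            PRIMARY KEY (product_id, tag_id)
        );
        -- Second index for the reverse direction: "products with this tag".
        CREATE INDEX idx_product_tag_tag ON product_tag (tag_id);
    """)


def tags_for_product(conn, product_id):
    # One join; product_tag is searched by the leading key column.
    return conn.execute(
        """
        SELECT t.id, t.description
        FROM product_tag AS pt
        JOIN tag AS t ON t.id = pt.tag_id
        WHERE pt.product_id = ?
        ORDER BY t.id
        """,
        (product_id,),
    ).fetchall()


def delete_tag(conn, tag_id):
    # Deleting a tag touches only its own link rows, never the product table.
    with conn:
        conn.execute("DELETE FROM product_tag WHERE tag_id = ?", (tag_id,))
        conn.execute("DELETE FROM tag WHERE id = ?", (tag_id,))
```

In this layout, deleting a tag is a delete on product_tag by tag_id followed by a delete on tag, so no Product row has to be rewritten.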
so the second idea:
Cache field: define an array-type field, tags, in Product that stores the ids of the product's Tag objects. When viewing a product, we can read the tag ids directly from tags and then fetch the description of each tag. The problem is that if the administrator one day deletes a tag from the Tag table, we have to traverse every Product and remove that tag's id from its tags field. Oh my God!
so the third idea:
We keep the second idea, but do nothing when the administrator deletes a tag. Instead, when viewing a product's tags, we check whether each id in tags is still valid (by querying the Tag table). But here the problem comes again: isn't that inefficient, given that products are viewed far more often than tags are deleted?
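A rough sketch of this view-time check, in Python. It assumes, beyond what is described above, that the Tag table (only about 20 rows) is loaded into an in-memory dictionary and reloaded whenever the administrator edits tags, so the validity check does not need a database query per view. The names load_live_tags, visible_tags and the tag table columns are hypothetical.

```python
import sqlite3


def load_live_tags(conn):
    # Assumption: the whole tag table is small enough to keep in memory.
    rows = conn.execute("SELECT id, description FROM tag")
    return {tag_id: description for tag_id, description in rows}


def visible_tags(product_tag_ids, live_tags):
    # Keep only ids that still exist; ids of deleted tags are skipped silently.
    return [
        (tag_id, live_tags[tag_id])
        for tag_id in product_tag_ids
        if tag_id in live_tags
    ]
```

Under that assumption the cost per view is a few dictionary lookups, but stale ids stay in the tags field until something cleans them up.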
Is there a better way, or a way to optimize one of these solutions?
Looking for an answer. Thank you ~


You can first extend the "second idea". Product stores the tags field, and naturally Tag can also store a products field holding the products that carry the tag. When a tag is deleted, you take that tag's products field and traverse only those products instead of all of them; in other words, the "relationship" is stored redundantly in both Product and Tag. But this will not hold up for long: the products field becomes very large, so it is bound to be slow in practice.

Then combine it with the "first idea": drop the products field and keep only the tags field, plus an indexed ProductTag intermediate table. Normally you do not use ProductTag at all; only when a tag is deleted do you read the affected products from the ProductTag table and then fix the tags field of each Product one by one.
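A minimal sketch of this combination, using Python's built-in sqlite3 and json modules. Everything named here is an assumption for illustration: the tags field is stored as a JSON text column because SQLite has no array type, and the table, column and function names are hypothetical.

```python
import json
import sqlite3


def create_schema(conn):
    conn.executescript("""
        CREATE TABLE product (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            tags TEXT NOT NULL DEFAULT '[]'  -- cached tag ids as a JSON array
        );
        CREATE TABLE tag (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
        -- Keyed by tag first, because it is only read when a tag is deleted.
        CREATE TABLE product_tag (
            tag_id     INTEGER NOT NULL,
            product_id INTEGER NOT NULL,
            PRIMARY KEY (tag_id, product_id)
        );
    """)


def tag_ids_of(conn, product_id):
    # Normal read path: only the cached field, no join.
    row = conn.execute("SELECT tags FROM product WHERE id = ?", (product_id,)).fetchone()
    return json.loads(row[0])


def add_tag(conn, product_id, tag_id):
    # Write both copies of the relationship in one transaction.
    with conn:
        tags = tag_ids_of(conn, product_id)
        if tag_id not in tags:
            tags.append(tag_id)
            conn.execute("UPDATE product SET tags = ? WHERE id = ?",
                         (json.dumps(tags), product_id))
            conn.execute("INSERT INTO product_tag (tag_id, product_id) VALUES (?, ?)",
                         (tag_id, product_id))


def delete_tag(conn, tag_id):
    with conn:
        # Find only the affected products through the indexed link table.
        affected = [row[0] for row in conn.execute(
            "SELECT product_id FROM product_tag WHERE tag_id = ?", (tag_id,))]
        for product_id in affected:
            remaining = [t for t in tag_ids_of(conn, product_id) if t != tag_id]
            conn.execute("UPDATE product SET tags = ? WHERE id = ?",
                         (json.dumps(remaining), product_id))
        conn.execute("DELETE FROM product_tag WHERE tag_id = ?", (tag_id,))
        conn.execute("DELETE FROM tag WHERE id = ?", (tag_id,))
```

With 10 million products and only 20 tags, a single tag may sit on a very large number of products, so in practice the loop in delete_tag would probably need to run in batches rather than in one transaction.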

I hope this helps.
