If I have two articles, how do I calculate their similarity? What are the specific approaches and established methods?

As the title says: if I have two articles, how can I calculate their similarity? What are the specific approaches and established methods?


A general approach is to compute a vector sum for each of the two articles: represent every word as a vector, add up all the vectors in an article, and then check whether the two resulting vectors are close to each other.
For example, the sentence "I went out to play today" is segmented into the words "I", "today", "go out", "play". The second sentence, "I won't go to the zoo tomorrow", is segmented into "I", "tomorrow", "not going", "zoo". Suppose each word vector is expressed as a (length, angle) pair:
I: (1, 0)
today: (1, 10)
go out: (1, 20)
play: (1, 30)
tomorrow: (1, 15)
not going: (1, 200)
zoo: (1, 5)
Finally, add up the word vectors of each sentence and compare the two resulting sum vectors. With this idea in mind, you should be able to find what you need online.
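To make the idea concrete, here is a minimal Python sketch of the toy example. It rests on a few assumptions that are not in the answer itself: the angles are treated as degrees, the names `WORD_VECTORS`, `to_cartesian`, `sentence_vector`, and `cosine_similarity` are made up for illustration, and cosine similarity is just one common way to decide whether two sum vectors are "not much different".

```python
import math

# Toy (length, angle) vectors from the example; angles are assumed to be degrees.
WORD_VECTORS = {
    "I": (1, 0),
    "today": (1, 10),
    "go out": (1, 20),
    "play": (1, 30),
    "tomorrow": (1, 15),
    "not going": (1, 200),
    "zoo": (1, 5),
}


def to_cartesian(length, angle_deg):
    """Convert a (length, angle) pair into (x, y) so vectors can be added."""
    theta = math.radians(angle_deg)
    return (length * math.cos(theta), length * math.sin(theta))


def sentence_vector(tokens):
    """Sum the vectors of all segmented words in one sentence."""
    x_total, y_total = 0.0, 0.0
    for token in tokens:
        x, y = to_cartesian(*WORD_VECTORS[token])
        x_total += x
        y_total += y
    return (x_total, y_total)


def cosine_similarity(a, b):
    """Cosine of the angle between two 2-D vectors (1.0 means same direction)."""
    dot = a[0] * b[0] + a[1] * b[1]
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


s1 = sentence_vector(["I", "today", "go out", "play"])
s2 = sentence_vector(["I", "tomorrow", "not going", "zoo"])
similarity = cosine_similarity(s1, s2)
print(f"sum 1 = {s1}, sum 2 = {s2}, cosine similarity = {similarity:.4f}")
```

The polar pairs are converted to Cartesian coordinates first because lengths and angles cannot be added component by component. The same summing and comparing steps apply unchanged if the toy vectors are replaced by real word vectors with more dimensions.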
