Zipf’s Law and Heaps’ Law

Heaps’ Law

Heaps’ law describes how vocabulary grows in natural-language text: the vocabulary size grows as a sublinear function of document size, say $N(t) \sim t^{\lambda}$ with $0 < \lambda < 1$, where $t$ denotes the total number of words and $N(t)$ is the number of distinct words.
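As a rough illustration of the growth curve described above, the following sketch counts distinct words after each token and estimates a growth exponent from a log-log least-squares fit. The function names `heaps_curve` and `loglog_slope` and the toy sentence are assumptions made for this example, not part of the original note, and a text this short cannot give a reliable exponent.

```python
import math


def heaps_curve(tokens):
    """Return [N(1), ..., N(T)]: the distinct-word count after each token."""
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)
        curve.append(len(seen))
    return curve


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x): a rough exponent estimate."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Toy input (hypothetical); a real test needs a much longer text.
tokens = "the cat sat on the mat and the dog sat on the log".split()
curve = heaps_curve(tokens)
lam = loglog_slope(range(1, len(curve) + 1), curve)

print(curve)          # N(t) for t = 1..13
print(round(lam, 2))  # estimated exponent for this toy text
```

Fitting a straight line in log-log space is the simplest way to read off an exponent, but it is known to be biased; it is used here only to make the sublinear shape visible.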


Zipf’s Law

Ranking all the words in descending order of occurrence frequency and denoting by $z(r)$ the frequency of the word with rank $r$, Zipf’s law reads $z(r) = z_{\max} \, r^{-\alpha}$, where $z_{\max}$ is the maximal frequency and $\alpha$ is the so-called Zipf’s exponent.

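In its simplest form, Zipf’s law states that the frequency of a word is inversely proportional to its rank when words are sorted by descending frequency, that is, $z(x) \propto 1/x$, where $z$ denotes the frequency and $x$ the rank.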
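The rank-frequency relation can be checked on a small text with the sketch below. It builds a rank table with `collections.Counter` and compares each observed count with a power-law prediction. The names `rank_frequency` and `zipf_prediction`, the parameters `z_max` (count of the top-ranked word) and `alpha` (exponent, where `alpha = 1` gives the inverse-rank form), the alphabetical tie-breaking, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return [(rank, word, count)] sorted by descending count, ranks from 1.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(tokens)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def zipf_prediction(z_max, rank, alpha=1.0):
    """Predicted frequency at a given rank: z_max * rank ** (-alpha)."""
    return z_max * rank ** (-alpha)


# Toy input (hypothetical); real corpora are far larger.
tokens = "the cat sat on the mat and the dog sat on the log".split()
table = rank_frequency(tokens)
z_max = table[0][2]  # count of the most frequent word

for rank, word, count in table:
    predicted = zipf_prediction(z_max, rank)
    print(rank, word, count, round(predicted, 2))
```

On a text this short the agreement is only suggestive; the comparison becomes meaningful on corpora with many thousands of tokens.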