If you check the Wikipedia article on determining the number of clusters in a data set, I think it uses the standard silhouette method: it tries partitioning the data into two, three, four, or five groups, then checks which of these k values yields the best silhouette score across all the groups involved. It chooses the one that's best. It is quite standard, statistically speaking.
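
As a rough illustration of the procedure described here, the sketch below tries k = 2 through 5 on a small toy data set and keeps the k with the highest mean silhouette score. It is not the code of the system being discussed. The data, the function names (`farthest_point_init`, `kmeans`, `mean_silhouette`), and the choice of a plain k-means with a deterministic farthest-point start are all assumptions made for the example.

```python
import math

def farthest_point_init(points, k):
    """Deterministic start (an assumption for this sketch): take the first
    point, then repeatedly add the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=20):
    """Plain Lloyd-style k-means; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members (keep it if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b is the smallest mean distance
    to any other cluster. Singleton clusters contribute 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

# Toy data (hypothetical): three tight groups of five points each.
offsets = [(0, 0), (-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
blob_centers = [(0, 0), (10, 0), (5, 9)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

# Try each candidate k and keep the one with the best mean silhouette.
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: mean silhouette {score:.3f}")
print("chosen k:", best_k)
```

On this toy data the three-group split scores highest, so `best_k` comes out as 3. Real data is rarely this clean, and the scores for neighboring k values can be close.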
