performance - Difference between Hierarchical and Non Hierarchical Clustering? -


i trying see if performance of both can compared based on objective functions work on?

hierarchical : single linkage, complete linkage , average linkage algorithms

non hierarchical : fuzzy c means , k means

don't confuse algorithms , tasks.

there more 1 algorithm k-means. there @ least 2 dozen. use k-d-trees acceleration, example. , heuristics, because finding optimal k-means solution shown np-hard in general, believe.

similarly, there naive o(n^3) runtime , o(n^2) memory approach hierarchical clustering, , there algorithms such slink single-linkage hierarchical clustering , clink complete-linkage hierarchical clustering run in o(n^2) time , o(n) memory.

last not least, if use dbscan , set minpts=2, result same single-link hierarchical clustering when cut @ height of epsilon. yet, appropriate index, dbscan runs in o(n log n) (e.g. in elki - scipy implementation not clever, needs o(n^2) memory , runtime).

so why can have such different runtimes apparently same task? first of all, algorithms (and implementations) can primitive. easy implement , understand, not clever be. secondly, algorithms perform work may or may not interested in. single-linkage hierarchical clustering computes hierarchy; dbscan minpts=2 single cut through hierarchy - much simpler result complete hierarchy. , last not least, there heuristics. of k-means variants fall last category: accepting miss global best solution, can of course become faster if performed exhaustive search.


Comments

Popular posts from this blog

jquery - How can I dynamically add a browser tab? -

node.js - Getting the socket id,user id pair of a logged in user(s) -

keyboard - C++ GetAsyncKeyState alternative -