I have a problem and I'm pretty unsure how to solve it. I am given arrays in the following format:

    $array01 = array(
        0 => array("hallo", "welt", "du", "ich"),
        1 => array("mag", "dich"),
        2 => array("nicht", "haha", "huhu")
    );
    $array02 = array(
        0 => array("haha", "welt", "dich"),
        1 => array("hallo", "mag", "nicht"),
        2 => array("du", "ich", "huhu")
    );

Now I want to calculate some kind of similarity value for these arrays. The arrays are the result of clustering terms according to their meaning. What I want to know is how similarly these terms were clustered by two different users ($array01 = user 1, $array02 = user 2). The keys 0, 1, 2 are the clusters (they don't have the same length).

Edit: I'll try to describe it a little further. Every array is the result of a user clustering the terms (hallo, welt, du, ich...) according to their meaning. Every sub-array is one cluster defined...
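One possible way to put a number on "how similarly were the terms clustered" is the Rand index: look at every pair of terms and count the pairs on which both users agree (either both put the two terms in the same cluster, or both put them in different clusters). The sketch below is only an illustration of that idea, not necessarily the measure the question is after. The function names `clusterLabels` and `randIndex` are made up for this example, it assumes PHP 7 or later, and it assumes that only terms clustered by both users should be compared.

```php
<?php
// Hypothetical helper: map each term to the key of the cluster containing it.
function clusterLabels(array $clusters): array
{
    $labels = array();
    foreach ($clusters as $clusterId => $terms) {
        foreach ($terms as $term) {
            $labels[$term] = $clusterId;
        }
    }
    return $labels;
}

// Rand index: fraction of term pairs on which both clusterings agree.
// 1.0 means identical groupings; lower values mean more disagreement.
function randIndex(array $clustersA, array $clustersB): float
{
    $labelsA = clusterLabels($clustersA);
    $labelsB = clusterLabels($clustersB);

    // Assumption: only terms that appear in both clusterings are compared.
    $terms = array_keys(array_intersect_key($labelsA, $labelsB));
    $n = count($terms);
    if ($n < 2) {
        return 1.0; // no pairs exist, so there is nothing to disagree on
    }

    $agreements = 0;
    $pairs = 0;
    for ($i = 0; $i < $n - 1; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            $sameInA = $labelsA[$terms[$i]] === $labelsA[$terms[$j]];
            $sameInB = $labelsB[$terms[$i]] === $labelsB[$terms[$j]];
            if ($sameInA === $sameInB) {
                $agreements++;
            }
            $pairs++;
        }
    }
    return $agreements / $pairs;
}

$array01 = array(
    0 => array("hallo", "welt", "du", "ich"),
    1 => array("mag", "dich"),
    2 => array("nicht", "haha", "huhu")
);
$array02 = array(
    0 => array("haha", "welt", "dich"),
    1 => array("hallo", "mag", "nicht"),
    2 => array("du", "ich", "huhu")
);

// 9 terms give 36 pairs; the two users agree on 19 of them.
echo round(randIndex($array01, $array02), 3); // 0.528
```

The cluster keys themselves (0, 1, 2) do not matter here, only which terms end up together, so it makes no difference if the two users number their clusters differently or use a different number of clusters. Note that the plain Rand index is not corrected for chance agreement; if that matters, the adjusted Rand index is the usual refinement.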