Subject:  Re: [Haskellcafe] FGL/Haskell and Hierarchical Clustering/dendograms 

From:  Nikolas BorrelJensen 
Date:  Tue, 29 Dec 2009 02:15:26 +0100 
Thank you very much for your reply! I have been looking at the code, and there are two problems, as I can see. First, trying with the example t1 :: Tree (Id, Cost) t1 = Node (4,0)
[Node (3,2) [Node (1,12) []] ,Node (2,3) [Node (5,1) [Node (6,2) [Node (7,2) [] ]]]] printed as (4,0)  + (3,2)  
 ` (1,12)  ` (2,3)  ` (5,1)  ` (6,2)  ` (7,2) your function 'cluster fst snd t1' returns
Many [Many [Many [Many [Many [One (0,[4]),One (1,[5])],One (2,[3])],One (2,[6,7])],One (3,[2])],One (12,[1])] I can't see how this representation is giving the hierarchical clusters. The example above should resolve into
level 1: [[(2,3),(5,1)],[(6,2)],[(7,2)],[(4,0)], [(3,2)], [(1,12)]] level 2: [[(2,3),(5,1),(6,2),(7,2)], [(4,0),(3,2)], [(1,12)]] level 3: [[(2,3),(5,1),(6,2),(7,2),(4,0),(3,2)], [(1,12)]]
level 4 (or (cost) level 12): [[(2,3),(5,1),(6,2),(7,2),(4,0),(3,2),(1,12)]] By doing it this way, we cluster all nodes connected with edges less than or equal x at (cost) level x. Clearly, we can have level 1: [[(1,1),(2,1)],[(3,1),(4,1)],...] if the edges between [(1,1),(2,1)] and [(3,1),(4,1)] are greater than 1.
Second, I don't think it is trivial to treeify the root path tree. I have done the function treeifyMST, which surely isn't efficient, since the list encounteredNodes is traversed as many times as the number of nodes (a binary search tree would be more efficient). But more important, the tree isn't correct, since each path is connected at the root of the tree.
Example (LRTree Int): [ [(1,0)],[(5,1),(1,0) ], [(2,2),(1,0)] , [(3,3),(2,2),(1,0)] , [(4,4),(2,2),(1,0)] ] > [ [(5,1),(1,0) ] , [(3,3),(2,2),(1,0)] , [(4,4),(2,2),(1,0)] ]
In my code, all 3 paths are branching at the root (1,0), but should for the last two paths branch at node (2,2). How should I cope with that in an efficient way? I wonder if if it is easier to implement it from the ground using the approach given at http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html ?
 TO DO: now all paths are connected at the root of the tree. Should be patched at the right places inside the tree. The search in the list encounteredNodes is not efficient.
treeifyMST :: LRTree Int > Tree (Id,Cost) treeifyMST rootpathtree = let (LP rpt:rpts) = rootpathtree
root = head rpt revrootpathtree = reverse rootpathtree in
Node root (constructTree [] revrootpathtree) where constructTree :: [Int] > LRTree Int > [Tree (Id,Cost)]
constructTree encounteredNodes (LP x:[]) = [] constructTree encounteredNodes (LP x:xs) =
let path1 = x !! 0 path2 = x !! 1
id1 = fst path1 id2 = fst path2 in
case (L.find (==id1) encounteredNodes) of  because we have encountered an already processed id, we can skip this sublist
Just _ > constructTree (id2:encounteredNodes) xs  new id, meaning that we have encountered a new path
Nothing > let lenpath = length x revpath = reverse $ take (lenpath1) x
tree = listToNode revpath in tree:constructTree (id2:encounteredNodes) xs
constructTree _ _ = [] listToNode (p:ps:[]) = Node p [Node ps []] listToNode (p:ps) = Node p [listToNode ps]
 Nikolas _______________________________________________ HaskellCafe mailing list HaskellCafe@xxxxxxxxxxx http://www.haskell.org/mailman/listinfo/haskellcafe 
<Prev in Thread]  Current Thread  [Next in Thread> 


Previous by Date:  Re: [Haskellcafe] Upgraded to GHC 6.12 and can't find anything, Tom Hawkins 

Next by Date:  [Haskellcafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve), Will Ness 
Previous by Thread:  Re: [Haskellcafe] FGL/Haskell and Hierarchical Clustering/dendograms, Matt Morrow 
Next by Thread:  [Haskellcafe] Control.Parallel and the Reader monad, Paul Brauner 
Indexes:  [Date] [Thread] [Top] [All Lists] 