Walking a hierarchical tree

I want to be able to "walk" (iterate) through a hierarchical cluster (see figure below and code). What I want is:

  • A function that takes a matrix and a minimum height, say 10 in this example.

    splitme <- function(matrix, minH){
        ##Some code
    }
    
  • Going from the top down to minH, cut whenever there is a new split. This is the first problem: how to detect a new split and get its height h.

  • At this particular h, how many clusters are there? Retrieve the clusters.

    mycl <- cutree(hr, h=x);#x is that found h
    count <- count(mycl)# Bad code
    
  • Save each of the new matrices in its own variable(s). This is another hard one: dynamically creating x new matrices. So perhaps a function that takes the clusters, does what needs to be done (comparisons) and returns a variable?

  • Repeat steps 3 and 4 until minH is reached.

  • Figure

    [Figure: dendrogram produced by the code below]

    Code

    # Generate data
    set.seed(12345)
    desc.1 <- c(rnorm(10, 0, 1), rnorm(20, 10, 4))
    desc.2 <- c(rnorm(5, 20, .5), rnorm(5, 5, 1.5), rnorm(20, 10, 2))
    desc.3 <- c(rnorm(10, 3, .1), rnorm(15, 6, .2), rnorm(5, 5, .3))
    
    data <- cbind(desc.1, desc.2, desc.3)
    
    # Create dendrogram
    d <- dist(data) 
    hc <- as.dendrogram(hclust(d))
    
    # Function to color branches
    colbranches <- function(n, col)
      {
      a <- attributes(n) # Find the attributes of current node
      # Color edges with requested color
      attr(n, "edgePar") <- c(a$edgePar, list(col=col, lwd=2))
      n # Don't forget to return the node!
      }
    
    # Color the first sub-branch of the first branch in red,
    # the second sub-branch in orange and the second branch in blue
    hc[[1]][[1]] = dendrapply(hc[[1]][[1]], colbranches, "red")
    hc[[1]][[2]] = dendrapply(hc[[1]][[2]], colbranches, "orange")
    hc[[2]] = dendrapply(hc[[2]], colbranches, "blue")
    
    # Plot
    plot(hc)
    

I think what you essentially need are the cophenetic distances of the dendrogram (what cophenetic() returns, not to be confused with the cophenetic correlation coefficient). They give you the heights of all splitting points, and from there you can easily walk through the tree. In my attempt below, all submatrices are stored in a nested list called submatrices. The first level has one element per splitting point; the second level holds the submatrices produced at that splitting point. For example, all submatrices from the 1st splitting point (grey and blue clusters) are in submatrices[[1]], and the first submatrix (red cluster) from submatrices[[1]] is submatrices[[1]][[1]].
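Before the full function, here is a quick check of the claim that the cophenetic distances carry the split heights. It is only a sketch: the data set toy and the names split_heights and n_clusters are made up for illustration and are not part of the question.

```r
# Assumption: a small random data set, invented purely for this check
set.seed(1)
toy <- matrix(rnorm(40), ncol = 2)
cl <- hclust(dist(toy))

# Unique cophenetic distances, ordered from the top of the tree down
split_heights <- sort(unique(as.vector(cophenetic(cl))), decreasing = TRUE)

# They should coincide with the merge heights stored in the hclust object
same_heights <- all.equal(split_heights, sort(unique(cl$height), decreasing = TRUE))

# Number of clusters obtained when cutting at each of the four highest heights
n_clusters <- sapply(split_heights[1:4],
                     function(h) length(unique(cutree(cl, h = h))))
print(same_heights)
print(n_clusters)
```

As long as no two merges share the same height, cutting at the i-th height from the top should give i clusters, which is why the walk can start at the second height.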

    splitme <- function(data, minH){
      ## Compute the distance matrix and the hierarchical clustering
      d <- dist(data)
      cl <- hclust(d)
    
      ## Cophenetic distances: each value is the height at which two observations merge.
      ## They are left unrounded so that every cut lands exactly on a merge height.
      coph <- cophenetic(cl)
    
      ## Heights of the splitting points (sps), from the top down
      sps <- sort(unique(as.vector(coph)), decreasing = TRUE)
    
      ## Nested list: submatrices[[n]] holds the distance matrices of the clusters
      ## obtained at the nth splitting point (the 1st one is the top split)
      submatrices <- list()
    
      ## Iterate/walk down the dendrogram
      i <- 2 # Cutting at sps[1] returns the whole tree as one cluster, so start at 2
      while(i <= length(sps) && sps[i] > minH){ # Stop at minH or when the heights run out
        membership <- cutree(cl, h = sps[i]) # Cluster id of every row at this height
        lst <- list() # Distance matrices of the clusters found at this splitting point
        for(j in seq_len(max(membership))){
          member <- which(membership == j) # Rows of data that belong to cluster j
          dm <- dist(data[member, , drop = FALSE]) # Distance matrix of that cluster
          lst <- append(lst, list(dm))
        }
        submatrices <- append(submatrices, list(lst)) # One entry per splitting point
        i <- i + 1
      }
      return(submatrices)
    }
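If the rows of the original data are wanted for each cluster rather than distance matrices, the same walk can be sketched with split(). This is a variation under stated assumptions, not the answer's function: walk_tree is a hypothetical name, and the data are regenerated here from the question's code so that the block runs on its own.

```r
# Hypothetical helper (not from the original answer): for every split height
# above minH, return the rows of `data` belonging to each cluster
walk_tree <- function(data, minH) {
  cl <- hclust(dist(data))
  heights <- sort(unique(cl$height), decreasing = TRUE)
  # Keep heights above minH and drop the root, which gives a single cluster
  heights <- heights[heights > minH][-1]
  lapply(heights, function(h) {
    membership <- cutree(cl, h = h)
    # One matrix of original rows per cluster at this height
    lapply(split(seq_len(nrow(data)), membership),
           function(rows) data[rows, , drop = FALSE])
  })
}

# Example data, regenerated as in the question
set.seed(12345)
desc.1 <- c(rnorm(10, 0, 1), rnorm(20, 10, 4))
desc.2 <- c(rnorm(5, 20, .5), rnorm(5, 5, 1.5), rnorm(20, 10, 2))
desc.3 <- c(rnorm(10, 3, .1), rnorm(15, 6, .2), rnorm(5, 5, .3))
data <- cbind(desc.1, desc.2, desc.3)

res <- walk_tree(data, minH = 10)
print(sapply(res, length))  # number of clusters at each split height
```

Here res[[1]] holds the clusters from the top split and res[[1]][[1]] is the first of them. Returning a nested list avoids having to create a variable number of named variables.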
    