Scatterplot matrix with logarithmic axes in R

I am trying to create a scatterplot matrix from my dataset so that in the resulting matrix:

  • I have two different groups based on
      • Quarter of the year (distinguished by the colours of the points)
      • Day type (shape of the points, indicating whether it is a weekend or a casual day between Monday and Friday)
  • Logarithmically scaled x and y axes.
  • Values on the axis tick labels are not logarithmic, i.e. values should be shown on the axes as integers between 0 and 350, not as their log10 counterparts.
  • Upper panel has correlation values for each quarter.

So far I've tried using the functions:

  • pairs()
  • ggpairs() [from GGally package]
  • scatterplotMatrix()
  • splom()

But I haven't been able to get decent results with these functions; every time, one or more of my requirements is missing.

  • With pairs(), I'm able to create the scatterplot matrix, but the parameter log="xy" somehow removes the variable names from the diagonal of the resulting matrix.
  • ggpairs() doesn't support logarithmic scales directly, but I wrote a function that goes through the diagonal and lower triangle of the scatterplot matrix, based on this answer. Although the logarithmic scaling works on the lower triangle, it messes up the variable labels and the tick values. The function is defined and used as follows:

    ggpairs_logarithmize <- function(a) { # parameter a is a ggpairs sp-matrix
            max_limit <- sqrt(length(a$plots))
            for(row in 1:max_limit) { # index 1 is used to go through the diagonal also
                    for(col in 1:row) { # columns up to and including the diagonal (lower triangle)
                            subsp <- getPlot(a,row,col)
                            subspnew <- subsp + scale_y_log10() + scale_x_log10()
                            subspnew$type <- 'logcontinous'
                            subspnew$subType <- 'logpoints'
                            a <- putPlot(a,subspnew,row,col)
                    }
            }
            return(a)
    }
    scatplot <- ggpairs(...)
    scatplot_log10 <- ggpairs_logarithmize(scatplot)
    scatplot_log10
    
  • scatterplotMatrix() didn't seem to support two groupings. I was able to do this separately for season and day type though, but I need both groups in the same plot.
  • splom() somehow labels the axis tick values also to logarithmic values, and these should be kept as they are (between integers 0 and 350).

Are there any simple solutions for creating a scatterplot matrix with logarithmic axes that meets these requirements?

EDIT (13.7.2012): Example data and output were requested. Here are some code snippets to produce a demo dataset.

Declare the necessary functions:

    logarithmize <- function(a)
    {
            max_limit <- sqrt(length(a$plots))
            for(j in 1:max_limit) {
                    for(i in j:max_limit) {
                            subsp <- getPlot(a,i,j)
                            subspnew <- subsp + scale_y_log10() + scale_x_log10()
                            subspnew$type <- 'logcontinous'
                            subspnew$subType <- 'logpoints'
                            a <- putPlot(a,subspnew,i,j)
                    }
            }
            return(a)
    }
    
    add_quarters <- function(a,datecol,targetcol) {
        for(i in 1:nrow(a)) {
            month <- 1+as.POSIXlt(as.Date(a[i,datecol]))$mon
            if ( month <= 3 ) { a[i,targetcol] <- "Q1" }
            else if (month <= 6 && month > 3) { a[i,targetcol] <- "Q2" }
            else if ( month <= 9 && month > 6 ) { a[i,targetcol] <- "Q3" }
            else if ( month > 9 ) { a[i,targetcol] <- "Q4" }
        }
        return(a)
    }
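
The row-by-row loop in add_quarters can also be written without a loop. The following is only a sketch, assuming base R's quarters() function, which returns "Q1" to "Q4" for Date objects; the name add_quarters_vec is hypothetical and not part of the original post.

```r
# Hypothetical vectorized alternative to add_quarters() (not from the original post).
# quarters() is a base R function that maps dates to "Q1".."Q4".
add_quarters_vec <- function(a, datecol, targetcol) {
  a[[targetcol]] <- quarters(as.Date(a[[datecol]]))
  a
}

# Small usage example on made-up dates
demo <- data.frame(Date = as.Date(c("2010-01-15", "2010-04-01",
                                    "2010-09-30", "2010-12-31")))
demo <- add_quarters_vec(demo, "Date", "Quarter")
print(demo)
```

Because the whole column is assigned at once, this avoids one data frame modification per row, which matters little for roughly 900 rows but becomes noticeable on larger data.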
    

Create the dataset:

    days <- seq.Date(as.Date("2010-01-01"),as.Date("2012-06-06"),"day")
    bananas <- sample(1:350,length(days), replace=T)
    apples <- sample(1:350,length(days), replace=T)
    oranges <- sample(1:350,length(days), replace=T)
    weekdays <- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")
    fruitsales <- data.frame(Date=days,Dayofweek=rep(weekdays,length.out=length(days)),Bananas=bananas,Apples=apples,Oranges=oranges)
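    fruitsales[5:6,"Quarter"] <- NA # creates column 6 (Quarter), filled with NA
    fruitsales[6:7,"Daytype"] <- NA # creates column 7 (Daytype), filled with NA
    fruitsales$Daytype <- factor(fruitsales$Dayofweek) # explicit factor(): data.frame() keeps strings as character in R >= 4.0
    levels(fruitsales$Daytype) # Confirm the (alphabetical) day type levels before assigning new levels
    levels(fruitsales$Daytype) <- c("Casual","Casual","Weekend","Weekend","Casual","Casual","Casual")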
    fruitsales <- add_quarters(fruitsales,1,6)
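
The level reassignment above depends on the alphabetical order of the weekday names. As a hedged alternative, the day type can be derived directly from the date. This is a sketch under the assumption that the real calendar weekday is wanted (the demo data above instead cycles through the weekday names starting from Monday); day_type is a hypothetical helper name.

```r
# Hypothetical helper (not from the original post): classify dates as
# "Casual" (Monday-Friday) or "Weekend" using the calendar weekday.
# as.POSIXlt()$wday is 0 for Sunday and 6 for Saturday, independent of locale.
day_type <- function(dates) {
  wday <- as.POSIXlt(as.Date(dates))$wday
  factor(ifelse(wday %in% c(0, 6), "Weekend", "Casual"),
         levels = c("Casual", "Weekend"))
}

# 2010-01-01 was a Friday, so the next two days are a weekend
week <- seq.Date(as.Date("2010-01-01"), as.Date("2010-01-07"), "day")
daytype <- day_type(week)
print(data.frame(Date = week, Daytype = daytype))
```

Fixing the factor levels explicitly keeps the mapping from day type to plotting symbol stable even if one of the two types is absent from a subset.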
    

Execute (NOTE: Windows/Mac users, replace x11() with the device function for your OS):

    # install.packages("GGally")
    require(GGally)
    x11(); ggpairs(fruitsales,columns=3:5,colour="Quarter",shape="Daytype")
    x11(); logarithmize(ggpairs(fruitsales,columns=3:5,colour="Quarter",shape="Daytype"))
    

The problem with pairs() stems from the use of user coordinates in a log coordinate system. Specifically, when adding the labels on the diagonal, pairs() sets

    par(usr = c(0, 1, 0, 1))
    

However, if you specify a log coordinate system via log = "xy", what you need here is

    par(usr = c(0, 1, 0, 1), xlog = FALSE, ylog = FALSE) 
    

See this post on R help.
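
The effect of these settings can be tried in isolation. The following is a minimal sketch, not taken from the original answer: demo_unit_coords is a hypothetical name, and pdf(NULL) is used only so that the code runs without opening a window.

```r
# Hypothetical demonstration (not from the original answer) of switching a
# log-scaled plot region to plain 0..1 coordinates before adding text.
demo_unit_coords <- function() {
  pdf(NULL)                 # off-screen device, no file is written
  on.exit(dev.off())
  plot(1:100, 1:100, log = "xy")
  log_usr <- par("usr")     # reported in log10 units while the axes are logarithmic
  # Same call as above: unit square plus linear interpretation of coordinates
  par(usr = c(0, 1, 0, 1), xlog = FALSE, ylog = FALSE)
  text(0.5, 0.5, "centred label")  # should now land in the middle of the region
  list(log_usr = log_usr, xlog = par("xlog"), ylog = par("ylog"))
}

res <- demo_unit_coords()
print(res)
```

Without xlog = FALSE and ylog = FALSE, the position (0.5, 0.5) would be interpreted on the logarithmic scale, which is why the label does not end up in the centre of the panel.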

This suggests the following solution (using the data given in the question):

    ## adapted from panel.cor in ?pairs
    panel.cor <- function(x, y, digits=2, cex.cor, quarter, ...)
    {
      usr <- par("usr"); on.exit(par(usr))
      par(usr = c(0, 1, 0, 1), xlog = FALSE, ylog = FALSE)
      r <- rev(tapply(seq_along(quarter), quarter, function(id) cor(x[id], y[id])))
      txt <- format(c(0.123456789, r), digits=digits)[-1]
      txt <- paste(names(txt), txt)
      if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
      text(0.5, c(0.2, 0.4, 0.6, 0.8), txt)
    }
    
    pairs(fruitsales[,3:5], log = "xy", 
          diag.panel = function(x, ...) par(xlog = FALSE, ylog = FALSE),
          label.pos = 0.5,
          col = unclass(factor(fruitsales[,6])), 
          pch = unclass(fruitsales[,7]), upper.panel = panel.cor, 
          quarter = factor(fruitsales[,6]))
    

This produces the following plot:

[Figure: pairs plot drawn on a log coordinate system]
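
The per-quarter correlations in panel.cor come from a tapply() call over row indices. A small sketch on made-up numbers (x, y and quarter here are illustrative values, not the fruit sales data) shows what that call returns:

```r
# Illustrative data (assumed for this sketch): perfectly correlated in Q1,
# perfectly anti-correlated in Q2.
x <- c(1, 2, 3, 4, 10, 20, 30, 40)
y <- c(2, 4, 6, 8, 40, 30, 20, 10)
quarter <- factor(rep(c("Q1", "Q2"), each = 4))

# Split the row indices by quarter and compute one correlation per group
r <- tapply(seq_along(quarter), quarter, function(id) cor(x[id], y[id]))
print(r)
```

panel.cor wraps the same call in rev(), which appears to be what places the first quarter at the top of the panel, since the labels are drawn at y = 0.2, 0.4, 0.6 and 0.8 in that order. The four hard-coded positions also mean the function assumes exactly four groups.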
