Lagging in data.table (R)

Currently I have a utility function that lags a column in a data.table, optionally by group. The function is simple:

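panel_lag <- function(var, k) {
  # k = 0 means no shift; without this check the else branch would return an empty vector
  if (k == 0) return(var)
  if (k > 0) {
    # Lag: bring past values forward k positions, padding the start with NA
    return(c(rep(NA, k), head(var, -k)))
  } else {
    # Lead: bring future values backward -k positions, padding the end with NA
    return(c(tail(var, k), rep(NA, -k)))
  }
}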
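To make the sign convention concrete, here is a small sketch on a plain vector. The vector `v` is made up for illustration, and `panel_lag` is restated in compact form so the snippet runs on its own.

```r
# Compact restatement of panel_lag so this snippet is self-contained
panel_lag <- function(var, k) {
  if (k > 0) c(rep(NA, k), head(var, -k)) else c(tail(var, k), rep(NA, -k))
}

v <- c(10, 20, 30, 40, 50)   # made-up values

lagged <- panel_lag(v, 1)    # positive k: each position receives the previous value
led    <- panel_lag(v, -2)   # negative k: each position receives the value two steps ahead

print(lagged)
print(led)
```

Positive k pads the start with NA and negative k pads the end, so the result always has the same length as the input as long as abs(k) does not exceed it.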

I can then call this from a data.table:

library(data.table)

x = data.table(a=1:10,
               dte=sample(seq.Date(from=as.Date("2012-01-20"),
                                   to=as.Date("2012-01-30"), by=1),
                          10))
x[, L1_a:=panel_lag(a, 1)]  # This won't work correctly as `x` isn't keyed by date
setkey(x, dte)
x[, L1_a:=panel_lag(a, 1)]  # This will

This requires me to check inside panel_lag whether x is keyed. Is there a better way to do lagging? The tables tend to be large, so they really should be keyed anyway. At the moment I just call setkey before I lag, but I would like to make sure I never forget to key them. Is there a standard way people do this?


If you want to make sure that you lag in the order of some other column, you can use the order function in i:

x[order(dte), L1_a := panel_lag(a, 1)]

Though if you're doing a lot of things in date order, it would make sense to key the table by date.
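As a rough sketch of how the order approach behaves with groups, the snippet below lags within each group in date order without keying the table, and compares the result with shift(), the lag/lead helper that newer data.table releases provide. The table contents and the `id` column are made up for illustration, the `L1_shift` column name is hypothetical, and `panel_lag` is restated so the snippet runs on its own.

```r
library(data.table)

# Restated from the question so the snippet is self-contained
panel_lag <- function(var, k) {
  if (k > 0) c(rep(NA, k), head(var, -k)) else c(tail(var, k), rep(NA, -k))
}

# Made-up panel: two groups, rows deliberately not stored in date order
x <- data.table(
  id  = c("u1", "u2", "u1", "u2", "u1", "u2"),
  dte = as.Date(c("2012-01-22", "2012-01-21", "2012-01-20",
                  "2012-01-20", "2012-01-21", "2012-01-22")),
  a   = c(3L, 20L, 1L, 10L, 2L, 30L)
)

# Lag within each id, visiting rows in date order; x itself is not re-sorted
x[order(dte), L1_a := panel_lag(a, 1), by = id]

# Same lag via shift() (assumes a data.table release that includes it)
x[order(dte), L1_shift := shift(a, 1), by = id]

# Display sorted by group and date to make the lag easy to read
print(x[order(id, dte)])
```

If order(dte) is left out of i, both calls lag in stored row order instead, which is exactly the mistake the question is trying to guard against.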
