Drop data frame columns by name

I have a number of columns that I would like to remove from a data frame. I know that we can delete them individually using something like:

df$x <- NULL

But I was hoping to do this with fewer commands.

Also, I know that I could drop columns using integer indexing like this:

df <- df[ -c(1, 3:6, 12) ]

But I am concerned that the relative position of my variables may change.

Given how powerful R is, I figured there might be a better way than dropping each column one by one.


You can use a simple character vector of names:

DF <- data.frame(
  x=1:10,
  y=10:1,
  z=rep(5,10),
  a=11:20
)
drops <- c("x","z")
DF[ , !(names(DF) %in% drops)]
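An equivalent spelling uses setdiff() on the column names. This is only a sketch, assuming the same DF as above; the extra name in drops is hypothetical and is there to show that names absent from DF are silently ignored.

```r
DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)
drops <- c("x", "z", "not_a_column")  # last name is hypothetical, absent from DF

# Keep every column whose name is not listed in drops.
# Single-argument indexing (DF[cols]) always returns a data frame.
remaining <- DF[setdiff(names(DF), drops)]
names(remaining)
```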

Alternatively, you can make a vector of the columns to keep and refer to them by name:

keeps <- c("y", "a")
DF[keeps]
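One caveat with the keep-list approach: indexing by a name that is not in the data frame raises an error. A minimal sketch of guarding against that with intersect(), assuming the same DF as above and a hypothetical missing name:

```r
DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)
keeps <- c("y", "a", "not_a_column")  # last name is hypothetical, absent from DF

# Indexing by a name that does not exist raises an error ...
msg <- tryCatch(DF[keeps], error = function(e) conditionMessage(e))

# ... so restrict keeps to the names that are actually present.
present <- intersect(keeps, names(DF))
kept <- DF[present]
names(kept)
```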

EDIT: For those not yet acquainted with the drop argument of the indexing function, if you want to keep a single column as a data frame, you do:

keeps <- "y"
DF[ , keeps, drop = FALSE]

drop = TRUE (the default when the argument is omitted) drops unnecessary dimensions, and hence returns a vector with the values of column y.
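To make the difference concrete, here is a small sketch comparing the three ways of selecting a single column, assuming the same DF as above (the variable names are only for illustration):

```r
DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)

as_vector <- DF[ , "y"]                # drop = TRUE by default: plain integer vector
as_frame  <- DF[ , "y", drop = FALSE]  # stays a 10 x 1 data frame
as_frame2 <- DF["y"]                   # list-style indexing never drops

class(as_vector)
class(as_frame)
dim(as_frame2)
```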


There's also the subset command, useful if you know which columns you want:

df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))

UPDATED after comment by @hadley: To drop columns a and c instead (starting again from the original df), you could do:

df <- subset(df, select = -c(a, c))
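Inside select, column names stand in for their positions, so a minus sign drops columns and a:b denotes a run of adjacent columns. A short sketch on a fresh copy of the example data (the result names are only for illustration):

```r
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)

# Drop two named columns.
without_ac <- subset(df, select = -c(a, c))

# Drop a range of adjacent columns, from a through b.
without_ab <- subset(df, select = -(a:b))

names(without_ac)
names(without_ab)
```

Because subset() evaluates select non-standardly, this form suits interactive use; when the names are held in a character vector, plain bracket indexing is usually the simpler choice.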

within(df, rm(x))

is probably easiest, or for multiple variables:

within(df, rm(x, y))
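As a self-contained sketch of the within() approach, assuming a small hypothetical data frame with columns x, y and z: within() returns a modified copy, so the result has to be assigned if you want to keep it.

```r
df <- data.frame(x = 1:3, y = 4:6, z = 7:9)

# rm() runs in an environment built from df; the columns removed there
# are missing from the data frame that within() returns.
res <- within(df, rm(x, y))

names(res)  # columns left in the result
names(df)   # the original data frame is unchanged
```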

Or, if you're dealing with data.tables (per "How do you delete a column by name in data.table?"):

dt[, x := NULL]   # deletes column x by reference instantly

dt[, !"x", with=FALSE]   # selects all but x into a new data.table

or, for multiple variables:

dt[, c("x","y") := NULL]

dt[, !c("x", "y"), with=FALSE]

In the development version of data.table, with = FALSE is no longer necessary:

dt[ , !"x"]
dt[ , !c("x", "y")]
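The practical difference between the two data.table forms is that := NULL modifies the table in place, while the ! selection returns a new table. A minimal sketch, assuming the data.table package is installed and using a hypothetical three-column table:

```r
library(data.table)

dt <- data.table(x = 1:3, y = 4:6, z = 7:9)

# Selecting with ! returns a new data.table; dt itself is untouched.
kept <- dt[, !c("x", "y"), with = FALSE]

# := NULL removes columns by reference, so work on a copy() when the
# original table must be preserved.
dt2 <- copy(dt)
dt2[, c("x", "y") := NULL]

names(kept)
names(dt)
names(dt2)
```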