How can I order a dataframe by the second column in R?

Possible Duplicate:
How to sort a dataframe by column(s) in R

I was just wondering if someone could help me out. I have what I thought should be an easy problem to solve.

I have the table below:

SampleID    Cluster
R0132F041p  1
R0132F127   1
R0132F064   1
R0132F068p  1
R0132F015   2
R0132F094   3
R0132F105   1
R0132F013   2
R0132F114   1
R0132F014   2
R0132F039p  3
R0132F137   1
R0132F059   1
R0132F138p  2
R0132F038p  2

and I would like to sort/order it by Cluster to get the results as below:

SampleID    Cluster
R0132F041p  1
R0132F127   1
R0132F064   1
R0132F068p  1
R0132F105   1
R0132F114   1
R0132F137   1
R0132F059   1
R0132F015   2
R0132F013   2
R0132F014   2
R0132F138p  2
R0132F038p  2
R0132F094   3
R0132F039p  3

I have tried the following R code:

data <- read.table('Table.txt', header=TRUE, row.names=1, sep='\t')

data <- data.frame(data)
data <- data[order(data$Cluster),]
write.table(data, file = 'OrderedTable.txt', append = TRUE, quote=FALSE, sep = '\t', na ='NA', dec = '.', row.names = TRUE, col.names = FALSE)

and get the following output:

1   1
2   1
3   1
4   1
5   1
6   1
7   1
8   1
9   2
10  2
11  2
12  2
13  2
14  3
15  3

Why have the SampleIDs been replaced by the numbers 1-15, and what do these numbers represent? I have read the ?order help page, but it seems to explain sort.list better than order(). If anyone could help me out with this, I would be very grateful.


The short answer is that you did the sorting perfectly. You are just having some difficulty with reading and writing the files. Going through your code:

data <- read.table('Table.txt', header=TRUE, row.names=1, sep='\t')

The above line reads in your data fine, but row.names=1 tells read.table() to use the first column as the row names. So your SampleIDs are now row names instead of being their own column. If you type data, head(data) or str(data) immediately after running this line, this should be clear. Just omit the row.names argument and the SampleIDs will be read as an ordinary column.
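To make the effect of row.names=1 concrete, here is a minimal sketch. It assumes a tab-separated file; the string txt is a hypothetical four-row stand-in for Table.txt (passed through the text argument of read.table()), not your actual file.

```r
# Hypothetical stand-in for a few rows of Table.txt (tab-separated)
txt <- "SampleID\tCluster\nR0132F041p\t1\nR0132F015\t2\nR0132F094\t3\nR0132F105\t1"

# With row.names = 1 the first column is consumed as row names
with_rn <- read.table(text = txt, header = TRUE, row.names = 1, sep = "\t")
names(with_rn)      # only "Cluster" is left as a column
rownames(with_rn)   # the SampleIDs live here now

# Without row.names, SampleID stays an ordinary column
no_rn <- read.table(text = txt, header = TRUE, sep = "\t")
names(no_rn)        # "SampleID" "Cluster"
```

In the second version the rows simply get the automatic row names 1, 2, 3, ... and the SampleIDs stay in the data.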

data <- data.frame(data)

You don't need this above line because read.table() produces a dataframe. You can see that with str(data) as well.

data <- data[order(data$Cluster),]

The above line is perfect.

write.table(data, file = 'OrderedTable.txt', append = TRUE,
   quote=FALSE, sep = '\t', na ='NA', dec = '.', row.names = TRUE,
   col.names = FALSE)

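Here you included the argument col.names = FALSE, which is why your file doesn't have column names. You also don't need or want append = TRUE: it adds the rows to the end of any existing OrderedTable.txt instead of overwriting it, so running the script more than once leaves repeated rows in the file. If you look at help(write.table), you will see that append is "only relevant if file is a character string". Here it also seems to make the file write without ending the last line, which would likely cause any later read.table() to complain.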
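As a rough illustration of why append = TRUE is usually not what you want here, the sketch below writes the same small data frame twice. The two-row data frame and the temporary file are made up for the example.

```r
# Hypothetical two-row data frame and a throwaway output file
dat <- data.frame(SampleID = c("R0132F041p", "R0132F015"), Cluster = c(1, 2))
out <- tempfile(fileext = ".txt")

# Writing twice with append = TRUE stacks the rows in the same file
for (i in 1:2) {
  write.table(dat, file = out, append = TRUE, quote = FALSE, sep = "\t",
              row.names = FALSE, col.names = FALSE)
}
n_appended <- length(readLines(out))   # both copies of the rows are in the file

# The default (append = FALSE) overwrites the file and keeps the header
write.table(dat, file = out, quote = FALSE, sep = "\t", row.names = FALSE)
n_fresh <- length(readLines(out))      # header line plus the data rows
```

After the loop the file holds each row twice; after the final call it holds one header line and one copy of the rows.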

The numbers 1-15 in your result look like row numbers. You don't explain how you look at the resulting file, so I cannot be sure. You likely read your file in a way that doesn't parse the row.names and is showing row numbers instead. If you make certain your SampleIDs column does not get assigned to be names of rows, you'll probably be fine.
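Putting the suggestions above together, a minimal sketch of the whole read, sort, write round trip could look like this. The file contents and the tempfile() paths are assumptions made so the example is self-contained; in your case they would be 'Table.txt' and 'OrderedTable.txt'.

```r
# Hypothetical input file with a few of the rows from the question
infile <- tempfile(fileext = ".txt")
writeLines("SampleID\tCluster\nR0132F041p\t1\nR0132F015\t2\nR0132F094\t3\nR0132F105\t1",
           infile)

# Read without row.names so SampleID stays a regular column
dat <- read.table(infile, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Sort by Cluster; rows with equal Cluster keep their original order
dat <- dat[order(dat$Cluster), ]

# Write with a header and without the automatic row numbers
outfile <- tempfile(fileext = ".txt")
write.table(dat, file = outfile, quote = FALSE, sep = "\t", row.names = FALSE)

# Read the result back to check it
check <- read.table(outfile, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
check
```

Using row.names = FALSE in write.table() keeps the row numbers out of the output file, so only SampleID and Cluster are written.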


Take a look at the arrange function in the plyr package.

library(plyr)
data <- arrange(data, Cluster)  # assign the result; arrange() does not modify data in place
write.table(data, "ordered_data.txt")