Create an empty data.frame

I'm trying to initialize a data.frame without any rows. Basically, I want to name each column and specify its data type, but not have any rows created as a result. The best I've been able to do so far is something like:

    df <- data.frame(Date=as.Date("01/01/2000", format="%m/%d/%Y"), File="", User="", stringsAsFactors=FALSE)
    df <- df[-1,]

This creates a data.frame with a single row containing all of the data types and column names I wanted, but it also creates a useless row that then needs to be removed.
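
A minimal sketch of one way to get the zero-row frame directly, assuming base R only: give each column a zero-length vector of the desired type. The column names are taken from the question; this is an illustration, not necessarily the answer the thread settled on.

```r
# Build a zero-row data.frame by giving each column a zero-length vector
# of the desired type; no placeholder row is created or removed.
df <- data.frame(
  Date = as.Date(character()),   # zero-length Date vector
  File = character(),
  User = character(),
  stringsAsFactors = FALSE
)

str(df)   # shows the column types of the empty frame
```

The same pattern extends to other types via numeric(), integer(), logical(), or factor().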

convert data.frame column format from character to factor

I am programming in R. I would like to change the class of some columns of my data.frame object (mydf) from character to factor. I don't want to do this while reading the text file with read.table(). Any help would be appreciated.

Hi, welcome to the world of R.

    mtcars       # look at this built-in data set
    str(mtcars)  # lets you see the classes of the variables (all numeric)
    # one approach is to index with the $ sign and the as.factor function
    mtcars$am <- a
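
A short sketch of the conversion after the data has already been read, assuming base R only. The data frame below is a hypothetical stand-in for the asker's mydf, and the column names are invented for illustration.

```r
# Hypothetical stand-in for the asker's mydf.
mydf <- data.frame(
  id    = 1:4,
  group = c("a", "b", "a", "c"),
  site  = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)

# Convert a single column with $ and as.factor().
mydf$group <- as.factor(mydf$group)

# Convert several named columns in one step.
cols <- c("group", "site")
mydf[cols] <- lapply(mydf[cols], as.factor)

sapply(mydf, class)   # inspect the resulting class of each column
```

Assigning into mydf[cols] keeps the object a data.frame and leaves the other columns untouched.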

Remove an entire column from a data.frame in R

Does anyone know how to remove an entire column from a data.frame in R? For example, if I am given this data.frame:

    > head(data)
       chr       genome region
    1 chr1 hg19_refGene    CDS
    2 chr1 hg19_refGene   exon
    3 chr1 hg19_refGene    CDS
    4 chr1 hg19_refGene   exon
    5 chr1 hg19_refGene    CDS
    6 chr1 hg19_refGene   exon

and I want to remove the 2nd column.

You can set it to NULL.

    > Data$genome <- NULL
    > head(Data)
       chr region
    1 chr1    CDS
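
For comparison, a sketch of the NULL assignment next to two non-destructive alternatives in base R. The data frame is rebuilt by hand from the six rows printed in the question, so its construction is an assumption about how the original object was stored.

```r
# Hypothetical data mirroring the rows shown in the question.
Data <- data.frame(
  chr    = rep("chr1", 6),
  genome = rep("hg19_refGene", 6),
  region = rep(c("CDS", "exon"), 3),
  stringsAsFactors = FALSE
)

# Drop by name, leaving Data itself untouched.
by_name <- Data[, setdiff(names(Data), "genome"), drop = FALSE]

# Drop by position (the 2nd column), also without modifying Data.
by_pos <- Data[, -2, drop = FALSE]

# Drop in place by assigning NULL, as in the answer.
Data$genome <- NULL

names(Data)
```

drop = FALSE keeps the result a data.frame even if only one column remains.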

Remove rows with NAs (missing values) in data.frame

I'd like to remove the lines in this data frame that contain NAs across all columns. Below is my example data frame.

                 gene hsap mmul mmus rnor cfam
    1 ENSG00000208234    0   NA   NA   NA   NA
    2 ENSG00000199674    0    2    2    2    2
    3 ENSG00000221622    0   NA   NA   NA   NA
    4 ENSG00000207604    0   NA   NA    1    2
    5 ENSG00000207431    0   NA   NA   NA   NA
    6 ENSG00000221312    0    1    2    3    2

Basically, I'd like to get a data frame such as the following.
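
The desired output is cut off in this snippet, so the sketch below only shows two plausible readings with base R's complete.cases(): dropping rows that have an NA anywhere, and dropping rows that have an NA in a chosen subset of columns. The choice of rnor and cfam as that subset is an assumption made for illustration.

```r
# The example data frame from the question, typed in by hand.
df <- data.frame(
  gene = c("ENSG00000208234", "ENSG00000199674", "ENSG00000221622",
           "ENSG00000207604", "ENSG00000207431", "ENSG00000221312"),
  hsap = c(0, 0, 0, 0, 0, 0),
  mmul = c(NA, 2, NA, NA, NA, 1),
  mmus = c(NA, 2, NA, NA, NA, 2),
  rnor = c(NA, 2, NA, 1, NA, 3),
  cfam = c(NA, 2, NA, 2, NA, 2),
  stringsAsFactors = FALSE
)

# Keep only rows with no NA in any column.
complete <- df[complete.cases(df), ]

# Keep rows with no NA in selected columns only (assumed: rnor and cfam).
partial <- df[complete.cases(df[, c("rnor", "cfam")]), ]
```

na.omit(df) selects the same rows as the first form and additionally records the dropped rows in an attribute.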

Drop factor levels in a subsetted data frame

I have a data frame containing a factor. When I create a subset of this data frame using subset() or another indexing function, a new data frame is created. However, the factor variable retains all of its original levels, even when they no longer occur in the new data frame. This creates headaches when doing faceted plotting or using functions that rely on factor levels. What is the most succinct way to remove levels from a factor in my new data frame? Here's my example:

    df <- data.frame(letters=letters[1:5], numbers=seq(1:5))
    level
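
A minimal sketch of the problem and one base R remedy, droplevels(). The subset condition is invented for illustration, and stringsAsFactors = TRUE is set explicitly because R 4.0.0 and later no longer convert strings to factors by default.

```r
# Explicit stringsAsFactors = TRUE so 'letters' is a factor on any R version.
df <- data.frame(letters = letters[1:5], numbers = 1:5,
                 stringsAsFactors = TRUE)

sub <- subset(df, numbers <= 3)
levels(sub$letters)        # still lists all five original levels

# droplevels() removes unused levels from every factor in the data frame.
sub <- droplevels(sub)
levels(sub$letters)        # only the levels present in the subset remain

# For a single factor, re-applying factor() has the same effect.
one <- factor(df$letters[df$numbers <= 3])
```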

rearrange data according to pattern

This question already has an answer here: How to sort a dataframe by multiple column(s)? (16 answers)

If there is only one male and one female per household, then you can just do:

    dta <- dta[order(dta$householdid.x, dta$isex), ]

which gives the desired output:

      householdid.x    idno   isex iage
    1        101366 1013661 FEMALE   29
    2        101366 1013662   MALE   36
    4        102481 1024811 FEMALE   29
    3        102481 1024812   MALE   39
    6        103755 103755
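
The ordering above can be tried on a small reconstruction of the data. Only the four complete rows of the printed output are used, and their original (unsorted) row positions are inferred from the row names shown, so the input frame is an assumption.

```r
# Reconstructed input: rows 3 and 4 are in the opposite order to the output.
dta <- data.frame(
  householdid.x = c(101366, 101366, 102481, 102481),
  idno = c(1013661, 1013662, 1024812, 1024811),
  isex = c("FEMALE", "MALE", "MALE", "FEMALE"),
  iage = c(29, 36, 39, 29),
  stringsAsFactors = FALSE
)

# Sort by household first, then by sex within each household.
sorted <- dta[order(dta$householdid.x, dta$isex), ]
sorted
```

order() returns row positions, so the original row names travel with the rows and show how they were rearranged.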

sort a data frame based on multiple columns in R

This question already has an answer here: How to sort a dataframe by multiple column(s)? (16 answers)

To sort in ascending order, use dplyr like this:

    library(dplyr)
    df <- df %>% arrange(type, frequency, word)

Just list the variables in the order you would like to sort by.

To sort in descending order, put a negative sign in front of the variable you want to sort in reverse order, like this:

    df %>% arrange(-type, frequency, word)

Working with text... If you try to sort text in reverse order using the method above, you may get an error
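
The same multi-column sort can be sketched in base R, which avoids the dplyr dependency; this is a swapped-in technique, not the answer's own code. Unary minus is not defined for character vectors, so the sketch negates xtfrm(), which returns a numeric vector that sorts the same way as its input. The data frame and its values are hypothetical.

```r
# Hypothetical data with a character column to sort in reverse.
df <- data.frame(
  type = c("b", "a", "b", "a"),
  frequency = c(2, 5, 1, 3),
  word = c("pear", "fig", "kiwi", "plum"),
  stringsAsFactors = FALSE
)

# Ascending on all three keys.
asc <- df[order(df$type, df$frequency, df$word), ]

# Descending on the character column 'type', ascending on the rest.
# -df$type would fail for text, so negate its numeric sort key instead.
desc_type <- df[order(-xtfrm(df$type), df$frequency, df$word), ]
```

In dplyr itself, desc() serves the same purpose for text columns: arrange(desc(type), frequency, word).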

Sort data in R data frame within subgroups

This question already has an answer here: How to sort a dataframe by multiple column(s)? (16 answers)

NEW UPDATE

Much better now with that ISIN and more ties. I used two auxiliary columns. First, I generate the order by DATE, then group by ISIN and take the minimum value for each group (that gives me the group order). My data.frame is named B.

    ord<-B %>% arrange(DATE) %>% mutate(ord=order(DATE))
    ord2<-ord %>% group_by(ISIN) %>% summarize(min_ord=min(ord))
    ord3<-merge(ord,ord
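
The idea described above (order the groups by their earliest date, then order the rows inside each group) can be sketched in base R with ave(). This is an alternative formulation, not the answer's dplyr code, and the contents of B are hypothetical since the question's data is not shown here.

```r
# Hypothetical stand-in for the data.frame B.
B <- data.frame(
  ISIN = c("X1", "Y2", "X1", "Z3", "Y2"),
  DATE = as.Date(c("2015-03-01", "2015-01-15", "2015-02-01",
                   "2015-02-10", "2015-04-01")),
  stringsAsFactors = FALSE
)

# Earliest date within each ISIN, repeated on every row of that group.
first_date <- ave(as.numeric(B$DATE), B$ISIN, FUN = min)

# Order groups by their earliest date, break ties between groups by ISIN,
# then order rows by date inside each group.
sorted <- B[order(first_date, B$ISIN, B$DATE), ]
sorted
```

Including ISIN as the second key keeps each group contiguous even when two groups share the same earliest date.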

How to order data set in R

This question already has an answer here: How to sort a dataframe by multiple column(s)? (16 answers)

You can use with, $, or [, i.e.

    dataset[with(dataset, order(-COL1)),]

or

    dataset[order(-dataset$COL1),]

or

    dataset[order(-dataset['COL1']),]

or

    library(data.table)
    setorder(setDT(dataset), -COL1)

sorting data by variables in R

This question already has an answer here: How to sort a dataframe by multiple column(s)? (16 answers)

For better readability, I suggest:

    sort.airquality <- airquality[with(airquality, order(Ozone, Wind)),]

You want:

    sort.airquality <- airquality[order(airquality$Ozone, airquality$Wind),]
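
Both forms can be checked against each other on the built-in airquality data set. The sketch below uses base R only; the variable names by_with and by_dollar are introduced here for illustration.

```r
# airquality ships with R in the 'datasets' package.
by_with   <- airquality[with(airquality, order(Ozone, Wind)), ]
by_dollar <- airquality[order(airquality$Ozone, airquality$Wind), ]

identical(by_with, by_dollar)   # compares the two results

# order() places NA values last by default, so rows with a missing Ozone
# reading end up at the bottom of the sorted frame.
tail(by_with$Ozone)
```

The with() form differs only in how the columns are looked up, which is why it is suggested purely for readability.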