Limiting size of hierarchical data for reproducible example

I am trying to come up with a reproducible example (RE) for this question: Errors related to data frame columns during merging. To qualify as having an RE, the question lacks only reproducible data. However, when I tried the fairly standard approach of dput(head(myDataObj)), the output was a 14 MB file. The problem is that my data object is a list of data frames, so the head() limit does not seem to work recursively. I have not found any options for dput() or head() that would let me control the data size recursively for complex objects. Unless I am wrong about the above, what other ...
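One way to bound the size of such an object is to apply head() to each element of the list before calling dput(). The following is a minimal sketch, assuming the object is a flat (non-nested) list of data frames; big_list and small_list are hypothetical stand-ins for myDataObj, built here from data sets that ship with R.

```r
# Hypothetical stand-in for myDataObj: a flat list of data frames
big_list <- list(
  cars   = mtcars,
  irises = iris
)

# head() on a list only limits the number of list elements,
# so apply it to every data frame in the list instead
small_list <- lapply(big_list, head, n = 3)

# Text representation that is small enough to paste into a question
dput(small_list)
```

A more deeply nested object would need a recursive helper rather than a single lapply() call.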

How to create example data set from private data (replacing variable names and levels with uninformative place holders)?

To provide a reproducible example of an approach, a data set must often be provided. Instead of building an example data set, I wish to use some of my own data. However, this data cannot be released. I wish to replace variable (column) names and factor levels with uninformative placeholders (e.g. V1...V5, L1...L5). Is an automated way to do this available? Ideally, this would be done in R, taking in a data frame and producing the anonymized data frame. With such a data set, you would only need to search and replace the variable names in the script, and you would have a publicly releasable reproducible example ...
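The renaming described above could be sketched roughly as follows. The function name anonymize and the example data frame private are hypothetical, not taken from the question. This sketch only renames columns and factor levels; numeric values and character columns are passed through unchanged, so they may still reveal private information.

```r
# Hypothetical helper: rename columns to V1..Vn and factor levels to L1..Lk.
# Numeric and character values are left untouched.
anonymize <- function(df) {
  names(df) <- paste0("V", seq_along(df))
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) {
      levels(col) <- paste0("L", seq_along(levels(col)))
    }
    col
  })
  df
}

# Hypothetical private data
private <- data.frame(
  species = factor(c("cat", "dog", "cat")),
  weight  = c(4.2, 11.5, 3.9)
)

anon <- anonymize(private)
anon
```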

What are the differences between "=" and "<-"?

What are the differences between the assignment operators = and <- in R? I know that the operators are slightly different, as this example shows:

    x <- y <- 5
    x = y = 5
    x = y <- 5
    x <- y = 5
    # Error in (x <- y) = 5 : could not find function "<-<-"

But is this the only difference? The difference in assignment operators is clearer when you use them to set an argument value in a function call. For example:

    median(x = 1:10)
    x
    ## Error: object 'x' not found

In this case, x is declared within the scope of the function, so it does not exist ...
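The argument-versus-assignment distinction can be checked directly. This is a small sketch; the wrapper function demo is hypothetical and exists only so that the check does not depend on whatever variables are already defined in the session.

```r
# Hypothetical wrapper so the check runs in a clean local frame
demo <- function() {
  # "=" inside a call names the argument x; no variable is created
  m1 <- median(x = 1:10)
  has_x <- exists("x", inherits = FALSE)

  # "<-" inside a call assigns y in the calling frame, then passes the value
  m2 <- median(y <- 1:10)
  has_y <- exists("y", inherits = FALSE)

  list(m1 = m1, has_x = has_x, m2 = m2, has_y = has_y)
}

res <- demo()
res$has_x  # FALSE
res$has_y  # TRUE
```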

How to sort a dataframe by multiple column(s)?

I want to sort a data.frame by multiple columns. For example, with the data.frame below I would like to sort by column z (descending) then by column b (ascending):

    dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
          levels = c("Low", "Med", "Hi"), ordered = TRUE),
          x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
          z = c(1, 1, 1, 2))
    dd
        b x y z
    1  Hi A 8 1
    2 Med D 3 1
    3  Hi A 9 1
    4 Low C 9 2

You can use the order() function directly without resorting to ...
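A minimal sketch of the order() approach on the data frame from the question. The name sorted is introduced here for illustration. Negating z gives descending order for that numeric column, while the ordered factor b sorts ascending by its level order.

```r
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"),
  y = c(8, 3, 9, 9),
  z = c(1, 1, 1, 2)
)

# Sort by z descending (note the minus sign), then by b ascending
sorted <- dd[order(-dd$z, dd$b), ]
sorted
```

The minus sign only works for numeric columns; for other column types, order() also accepts a decreasing argument.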

modify the body text of existing function objects

I have some .Rdata files that contain saved functions as defined by approxfun(). Some of the save files pre-date the move of approxfun from package "base" to "stats", so the body has PACKAGE = "base", and the wrong package causes the function to fail. I can fix(myfun) and simply replace "base" with "stats", but I want a neater automatic way. Can I do this somehow with gsub() and body()? I can get the body text and substitute in it with as.character(body(myfun)), but I don't know how to turn that back into a "call" and replace the definition ...
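One possible route is to convert the body to text with deparse(), edit it with gsub(), and convert it back with parse(). The sketch below uses a toy function as a hypothetical stand-in for the saved approxfun() result, since the real saved objects are not available here; it is not tested against actual old .Rdata files.

```r
# Hypothetical stand-in for a saved function whose body mentions "base"
myfun <- function(v) list(value = v, PACKAGE = "base")

# deparse() turns the body into text, gsub() edits the text,
# and parse() turns the text back into a language object
new_text <- gsub('"base"', '"stats"', deparse(body(myfun)), fixed = TRUE)
body(myfun) <- parse(text = new_text)[[1]]

myfun(1)$PACKAGE
```

Assigning with body<- keeps the function's environment by default, which matters for closures that carry their data in that environment.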

Accumulating tailored ggpairs() plot objects into a list object

I am trying to create a list object that contains GGally plots. Each plot is created from two datasets: the main dataset, and a subset of the main dataset that is plotted again in orange. In the MWE below, three plots are created, each comparing two columns from the mtcars data and each containing a different number of subset points to be plotted in orange:

Plot_1: mpg and cyl, 1 orange overlaid point
Plot_2: mpg and disp, 20 orange overlaid points
Plot_3: mpg and hp, 30 orange overlaid points

    library(GGally)
    library(ggplot2)
    data = mtcars
    data$ID = rownames(mtcars)
    ...

How to make a great R reproducible example?

When discussing performance with colleagues, teaching, sending a bug report or searching for guidance on mailing lists and here on SO, a reproducible example is often asked for and always helpful. What are your tips for creating an excellent example? How do you paste data structures from R in a text format? What other information should you include? Are there other tricks in addition to using dput(), dump() or structure()? When should you include library() or require() statements? Which reserved words should one avoid, in addition to c, df, data, etc.? How does one make a reproducible ...
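As an illustration of the dput() route mentioned in the question, the sketch below writes a small data frame out as R code and rebuilds it from that text. The names small, txt and rebuilt are hypothetical, and eval(parse()) is used here only to simulate a reader pasting the dput() output into their own session.

```r
# Small built-in data set, so readers need no external file
small <- head(iris, 3)

# dput() prints R code that recreates the object; capture it as text
txt <- paste(capture.output(dput(small)), collapse = "\n")

# A reader would paste that text after "small <- ";
# eval(parse()) simulates the paste step
rebuilt <- eval(parse(text = txt))

# Compare the original and the rebuilt copy
all.equal(small, rebuilt)
```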