r - Top Programmers

How to plot histograms of raw data on the margins of a plot of interpolated data

I would like to show interpolated data and a histogram of the raw data of each predictor in the same plot. I have seen other threads, like this one, where people explain how to draw marginal histograms of the same data shown in a scatter plot; in this case, however, the histogram is based on other data (the raw data). Suppose we look at how price is related to carat and table in the diamonds dataset: li

2018-06-24 14:35:48

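How to plot histograms of raw data on the margins of a plot of interpolated data

I would like to show interpolated data and a histogram of the raw data of each predictor in the same plot. I have seen other threads where people explain how to draw marginal histograms of the same data shown in a scatter plot; in this case, however, the histogram is based on other data (the raw data). Suppose we look at how price is related to carat and table in the diamonds dataset: library(ggplot2) p = ggplot(diamonds, aes(x = carat, y = table, color = price)) + geom_point() We can add a marginal frequency plot, e.g. with ggMarginal library(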

2018-06-24 14:35:47
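A minimal sketch of one way to place raw-data histograms next to a plot of interpolated values, using gridExtra instead of ggMarginal. The linear model standing in for the interpolation, the 50 x 50 grid, and all object names are assumptions made for illustration; they are not taken from the question.

```r
library(ggplot2)
library(gridExtra)

# Assumption: a linear model stands in for whatever interpolation is used.
fit <- lm(price ~ carat + table, data = diamonds)

# Regular grid over both predictors (grid size chosen arbitrarily).
grid_df <- expand.grid(
  carat = seq(min(diamonds$carat), max(diamonds$carat), length.out = 50),
  table = seq(min(diamonds$table), max(diamonds$table), length.out = 50)
)
grid_df$price <- predict(fit, newdata = grid_df)

# Main panel: interpolated surface.
p_main <- ggplot(grid_df, aes(carat, table, fill = price)) +
  geom_raster() +
  theme(legend.position = "bottom")

# Marginal panels: histograms of the raw data, not of the grid.
p_top <- ggplot(diamonds, aes(carat)) + geom_histogram(bins = 30)
p_right <- ggplot(diamonds, aes(table)) + geom_histogram(bins = 30) + coord_flip()

# 2 x 2 layout: histogram on top, empty corner, main plot, histogram on the right.
combined <- arrangeGrob(
  p_top, grid::nullGrob(),
  p_main, p_right,
  ncol = 2, widths = c(4, 1), heights = c(1, 4)
)
grid::grid.newpage()
grid::grid.draw(combined)
```

In this sketch the panels are only approximately aligned, because axis labels and the legend take up different amounts of space in each plot.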

Subscript a title in a Graph (ggplot2) with label of another file

In my program I have two main files, the first with the data and the second with labels (or titles of my graphics): File total1 (data) 3 10000 3 32039232 1 0.0017290351 2 0.0002781092 3 10001 3 32101193 1 0.0045398899 2 0.0032875689 3 1000 1 60233253 1 0.0022057964 2 6.747e-06 3 10002 3 32108182 1 0.0219913914 2 0.0102120679 3

2018-06-24 14:34:42

Subscript a title in a Graph (ggplot2) with label of another file

In my program I have two main files, the first with the data and the second with labels (or titles of my graphics): File total1 (data) 3 10000 3 32039232 1 0.0017290351 2 0.0002781092 3 10001 3 32101193 1 0.0045398899 2 0.0032875689 3 1000 1 60233253 1 0.0022057964 2 6.747e-06 3 10002 3 32108182 1 0.0219913914 2 0.0102120679 3 10003 3 32133994 1 0.0007025013

2018-06-24 14:34:42

Arrange common plot width with facetted ggplot 2.0.0 & gridExtra

Since I updated to ggplot2 2.0.0, I cannot arrange charts properly using gridExtra. The issue is that the faceted charts get compressed while the others expand. The widths are basically messed up. I want to arrange them similarly to the way these single-facet plots are: left align two graph edges (ggplot) Here is reproducible code: library(grid) # for unit.pmax() library(gridExtra)

2018-06-24 14:33:40

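Arrange common plot width with facetted ggplot 2.0.0 & gridExtra

Since I updated to ggplot2 2.0.0, I cannot arrange charts properly using gridExtra. The issue is that the faceted charts get compressed while the others expand. The widths are basically messed up. I want to arrange them similarly to the way these single-facet plots are: left align two graph edges (ggplot) Here is reproducible code: library(grid) # for unit.pmax() library(gridExtra) plot.iris <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point() + facet_grid(. ~ Species) + stat_sm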

2018-06-24 14:33:40

Arranging GGally plots with gridExtra?

I'd like to arrange my ggpairs plots with arrangeGrob : library(GGally) library(gridExtra) df <- structure(list(var1 = 1:5, var2 = 4:8, var3 = 6:10), .Names = c("var1", "var2", "var3"), row.names = c(NA, -5L), class = "data.frame") p1 <- ggpairs(df, 1:3) p2 <- ggpairs(df, 1:2) p <- arrangeGrob(p1, p2, ncol=2) which results in this error: Error in arrangeGrob(p1, p2, ncol =

2018-06-24 14:32:36

Arranging GGally plots with gridExtra?

I'd like to arrange my ggpairs plots with arrangeGrob: library(GGally) library(gridExtra) df <- structure(list(var1 = 1:5, var2 = 4:8, var3 = 6:10), .Names = c("var1", "var2", "var3"), row.names = c(NA, -5L), class = "data.frame") p1 <- ggpairs(df, 1:3) p2 <- ggpairs(df, 1:2) p <- arrangeGrob(p1, p2, ncol=2) which results in this error: Error in arrangeGrob(p1, p2, ncol = 2) : input must be grobs! Is there

2018-06-24 14:32:35

How to control plot width in gridExtra?

Possible Duplicate: left align two graph edges (ggplot) I am trying to put two plots produced with ggplot on the same page, top and bottom, so that their widths are the same. The data is from the same time series, x axis being time, so it is important that data points with the same time are not shifted horizontally with respect to each other. I tried grid.arrange from package gridExtra: gr

2018-06-24 14:31:33

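How to control plot width in gridExtra?

Possible Duplicate: left align two graph edges (ggplot) I am trying to put two plots produced with ggplot on the same page, top and bottom, so that their widths are the same. The data is from the same time series, x axis being time, so it is important that data points with the same time are not shifted horizontally with respect to each other. I tried grid.arrange from package gridExtra: grid.arrange(p1, p2) but the plots end up with different widths because the y-axis labels have different widths. I looked at this post, which discusses a similar problem, but I could not apply that information to solve my problem. Based on my comments (and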

2018-06-24 14:31:33

When should I use setDT() instead of data.table() to create a data.table?

I am having difficulty grasping the essence of the setDT() function. As I read code on SO, I frequently come across the use of setDT() to create a data.table. Of course the use of data.table() is ubiquitous. I feel like I solidly comprehend the nature of data.table() yet the relevance of setDT() eludes me. ?setDT tells me this: setDT converts lists (both named and unnamed) and data.frames t

2018-06-24 13:26:51

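When should I use setDT() instead of data.table() to create a data.table?

I am having difficulty grasping the essence of the setDT() function. As I read code on SO, I frequently come across the use of setDT() to create a data.table. Of course the use of data.table() is ubiquitous. I feel like I solidly comprehend the nature of data.table() yet the relevance of setDT() eludes me. ?setDT tells me this: setDT converts lists (both named and unnamed) and data.frames to data.tables by reference. And also: in data.table parlance, all set* functions change their input by reference. That is, apart from temporary working memory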

2018-06-24 13:26:51
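The difference described in the ?setDT excerpt can be sketched as follows. The data frame and its column names are made up for illustration; only data.table(), setDT() and is.data.table() come from the data.table package.

```r
library(data.table)

# Hypothetical input; the column names are assumptions.
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# data.table() builds a new object and leaves df as a plain data.frame.
dt_new <- data.table(df)
was_dt_before <- is.data.table(df)

# setDT() converts df itself, by reference, with no assignment needed.
setDT(df)
is_dt_after <- is.data.table(df)
```

In this sketch, setDT() is the option to reach for when the existing object is large and a second copy is unwanted; data.table() is the option when the original data.frame should stay untouched.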

Normalize each row of data.table

This seems like it should be easy, but I can't find an answer :(. I'm trying to normalize each row of a data.table like this: normalize <- function(x) { s = sum(x) if (s>0) { return(x/s) } else { return(0) } } How do I call this function on every row of a data.table and get a normalized data.table back? I can do a for loop, but that's surely not the right way,

2018-06-24 13:25:49

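Normalize each row of data.table

This seems like it should be easy, but I can't find an answer :(. I'm trying to normalize each row of a data.table like this: normalize <- function(x) { s = sum(x) if (s>0) { return(x/s) } else { return(0) } } How do I call this function on every row of a data.table and get a normalized data.table back? I can do a for loop, but that's surely not the right way, and from what I understand, apply(data, 1, normalize) would convert my data.table to a matrix, which would be a big performance hit. Here is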

2018-06-24 13:25:49
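One possible way to normalize rows without a for loop or apply() is to compute the row sums column-wise and divide each column by them. This is a sketch under assumptions: the example table and its column names are invented, and rows whose sum is not positive are mapped to 0, following the normalize() function in the question.

```r
library(data.table)

# Hypothetical example data; the column names are assumptions.
dt <- data.table(a = c(1, 0, 2), b = c(3, 0, 2))

# Row sums as an element-wise sum of the columns, so no matrix is built.
s <- Reduce(`+`, dt)

# Rows with a non-positive sum should become 0; dividing by Inf does that
# for finite values.
s[s <= 0] <- Inf

# Divide every column by the row sums, updating dt by reference.
cols <- names(dt)
dt[, (cols) := lapply(.SD, function(col) col / s)]
```

The update happens in place, so the result is still a data.table and no per-row function call is needed.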

Writing functions (procedures) for data.table objects

In the book Software for Data Analysis: Programming with R, John Chambers emphasizes that functions should generally not be written for their side effects; rather, a function should return a value without modifying any variables in its calling environment. Conversely, writing good scripts using data.table objects should specifically avoid the use of object assignment with <- , typically u

2018-06-24 13:24:47

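Writing functions (procedures) for data.table objects

In the book Software for Data Analysis: Programming with R, John Chambers emphasizes that functions should generally not be written for their side effects; rather, a function should return a value without modifying any variables in its calling environment. Conversely, writing good scripts using data.table objects should specifically avoid the use of object assignment with <- , typically used to store the result of a function. First, a technical question. Imagine an R function called proc1 that accepts a data.table object x as its argument (besides, maybe, other parameters). proc1 returns NULL but modifies x using := . According to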

2018-06-24 13:24:46
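The proc1 scenario from the question can be sketched as below. The column names v and z and the doubling operation are assumptions chosen for illustration; only the shape of proc1 (returns NULL, modifies its argument with :=) comes from the question.

```r
library(data.table)

# A procedure in the sense of the question: called for its side effect only.
proc1 <- function(x) {
  x[, z := v * 2]   # adds column z to the caller's table by reference
  invisible(NULL)   # nothing useful is returned
}

# Hypothetical table; v and z are assumed column names.
DT <- data.table(v = 1:3)
result <- proc1(DT)
```

After the call, DT in the calling environment carries the new column z even though nothing was assigned with <-.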

Add multiple columns to R data.table in one function call?

I have a function that returns two values in a list. Both values need to be added to a data.table in two new columns. Evaluation of the function is costly, so I would like to avoid having to compute the function twice. Here's the example: library(data.table) example(data.table) DT x y v 1: a 1 42 2: a 3 42 3: a 6 42 4: b 1 4 5: b 3 5 6: b 6 6 7: c 1 7 8: c 3 8 9: c 6 9 Here'

2018-06-24 13:23:45

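Add multiple columns to R data.table in one function call?

I have a function that returns two values in a list. Both values need to be added to a data.table in two new columns. Evaluation of the function is costly, so I would like to avoid having to compute the function twice. Here's the example: library(data.table) example(data.table) DT x y v 1: a 1 42 2: a 3 42 3: a 6 42 4: b 1 4 5: b 3 5 6: b 6 6 7: c 1 7 8: c 3 8 9: c 6 9 Here's an example of my function. Bear in mind that I said it is expensive to compute, and on top of that there is no way to deduce one return value from the other given values (as in the example below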

2018-06-24 13:23:44
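A sketch of assigning both list elements in one step, so the function runs only once. The function costly_fun, the new column names, and the values in v are hypothetical stand-ins; the question's own function is not shown in the excerpt.

```r
library(data.table)

# Example table with the same columns as in the question (values simplified).
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

# Hypothetical stand-in for the costly function: returns two vectors in a list.
costly_fun <- function(y, v) {
  list(total = y + v, product = y * v)
}

# A character vector on the left of := takes one list element per new column.
DT[, c("new1", "new2") := costly_fun(y, v)]
```

Here costly_fun() is evaluated a single time, and its two list elements become new1 and new2 in order.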

How to delete a row by reference in data.table?

My question is related to assignment by reference versus copying in data.table . I want to know if one can delete rows by reference, similar to DT[ , someCol := NULL] I want to know about DT[someRow := NULL, ] I guess there's a good reason for why this function doesn't exist, so maybe you could just point out a good alternative to the usual copying approach, as below. In particular,

2018-06-24 13:22:43

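How to delete a row by reference in data.table?

My question is related to assignment by reference versus copying in data.table. I want to know if one can delete rows by reference, similar to DT[ , someCol := NULL] I want to know about DT[someRow := NULL, ] I guess there's a good reason for why this function doesn't exist, so maybe you could just point out a good alternative to the usual copying approach, as below. In particular, going with my favourite from example(data.table), DT = data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9) # x y v # [1,] a 1 1 # [2,] a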

2018-06-24 13:22:42
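For comparison, the usual copying approach that the question refers to can be sketched as a subset followed by reassignment. The choice of which rows to drop (x equal to "b") is an assumption made for illustration.

```r
library(data.table)

# Table from the question.
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

n_before <- nrow(DT)

# Keep every row except those where x is "b". This builds a new table from
# the kept rows and rebinds the name DT to it; it is not a by-reference delete.
DT <- DT[x != "b"]
```

This copies the remaining rows, which is the cost the question is hoping to avoid on large tables.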