Different results using train(), predict() and resamples()

I'm using the caret package to analyse various models, and I'm assessing the results with print() [printing the results of train()], predict(), and resamples(). Why do these results differ in the following example? I'm interested in sensitivity (true positives). Why is J48_fit assessed as having a sensitivity of .71, then .81, then .71 again? The same happens when I run other models: the sensitivity changes depending on the assessment. Note: I've included two models here so I can illustrate the resamples() function, which requires at least two models as input.
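The three numbers come from three different estimators, which may be worth demonstrating. A minimal sketch (assuming caret and its twoClassSim() helper are available; the rpart model and simulated data are stand-ins for illustration, not the poster's J48 setup):

```r
library(caret)

set.seed(1)
dat  <- twoClassSim(200)   # simulated two-class data shipped with caret
ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE, summaryFunction = twoClassSummary)
fit  <- train(Class ~ ., data = dat, method = "rpart",
              metric = "ROC", trControl = ctrl)

print(fit)                                     # resampled (cross-validated) Sens
confusionMatrix(predict(fit, dat), dat$Class)  # re-substitution Sens on data the model saw
resamples(list(m1 = fit, m2 = fit))            # aggregates the same resampled values as print(fit)
```

print(fit) and resamples() both summarise the hold-out folds, so they should agree; predict() on data the model has already seen is typically optimistic, which is consistent with the .71 / .81 / .71 pattern.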

Plot SVM linear model trained by caret package in R

Purpose: I was trying to visualize an svmLinear classification model via plot(). I am using the example code and data provided in the kernlab package, having noticed that caret actually trains the SVM via the ksvm function (referring to the src code here: https://github.com/topepo/caret/blob/master/models/files/svmLinear.R). Problem: when I plot the final model of the caret model object, it does not yield a figure, and I did not find a way out after trying three approaches. Code: require(caret) require(kernlab) # ===== sample codes from ksvm x
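One plausible reason for the empty plot: kernlab's plot() method for ksvm objects only supports binary classification with exactly two predictors. A sketch under that assumption (the two-column simulated data set is illustrative, not the question's):

```r
library(caret)
library(kernlab)

set.seed(1)
d <- twoClassSim(100)[, c("TwoFactor1", "TwoFactor2", "Class")]  # exactly 2 predictors
fit <- train(Class ~ ., data = d, method = "svmLinear",
             trControl = trainControl(method = "none"))

# fit$finalModel is the underlying ksvm object; kernlab's plot method
# needs the training matrix passed back in via `data`
plot(fit$finalModel, data = as.matrix(d[, 1:2]))
```

With more than two predictors, the method has no plane to draw, which would explain the silent failure.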

Training nnet and avNNet models with caret when the output has negatives

My question is about the typical feed-forward single-hidden-layer backprop neural network, as implemented in package nnet and trained with train() in package caret. This is related to this question, but in the context of the nnet and caret packages in R. I demonstrate the problem with a simple regression example where Y = sin(X) + small error. raw Y ~ raw X: predicted outputs are uniformly zero wherever raw Y < 0. scaled Y (to 0-1) ~ raw X: the solution looks great; see the code below. The code follows: library(nnet) X <- t(t(runif(200, -pi, pi)))
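For the record, the usual explanation for zero-floored predictions is that nnet's default output unit is logistic, so it cannot emit negative values; passing linout = TRUE (forwarded by train() through ... to nnet) switches to a linear output unit. A hedged sketch:

```r
library(caret)
library(nnet)

set.seed(1)
X <- runif(200, -pi, pi)
Y <- sin(X) + rnorm(200, sd = 0.1)

# linout = TRUE: linear output unit, so predictions can go below zero
fit <- train(x = data.frame(X = X), y = Y, method = "nnet",
             linout = TRUE, trace = FALSE,
             trControl = trainControl(method = "cv", number = 5))
range(predict(fit, data.frame(X = X)))   # should span negative and positive values
```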

Predict function from caret package gives an error

I am doing just a regular logistic regression using the caret package in R. I have a binomial response variable coded 1 or 0, called SALES_FLAG, and 140 numeric predictor variables that I transformed to dummy variables with the dummyVars function in R. data <- dummyVars(~., data = data_2, fullRank=TRUE, sep="_", levelsOnly = FALSE) dummies <- predict(data, data_2) model_data <- as.data.frame(dummies) This gives me a data frame to work with. All the variables are numeric. Next I split into training and testing: trainIn
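dummyVars() itself usually behaves; errors tend to appear at predict() time when the new data's factor levels differ from those seen when the encoder was built. A self-contained sketch of the pipeline (toy data, not the poster's):

```r
library(caret)

df <- data.frame(SALES_FLAG = factor(c(1, 0, 1, 0)),
                 region     = factor(c("N", "S", "N", "E")),
                 spend      = c(10, 20, 15, 5))

# build the encoder on the predictors only, then apply it
dv         <- dummyVars(~ ., data = df[, -1], fullRank = TRUE, sep = "_")
model_data <- as.data.frame(predict(dv, newdata = df[, -1]))

# predict() on new data will fail if `region` gains or loses levels,
# so keep factor levels identical between the training and scoring sets
```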

ROC in rfe() in caret package for R

I am using the caret package in R to train a radial-basis SVM for classification; in addition, a linear SVM is used for variable selection. With metric="Accuracy" this works fine, but ultimately I am more interested in optimizing metric="ROC". While the ROC is calculated for all models that are fit, there seems to be some problem with aggregating the ROC values. Here is some example code: library(caret) library(mlbench) set.seed(0) data(Sonar) x<-scale(Sonar[,1:60]) y<-as.factor(Sonar[,61]) # Custom summary function to use both # defaultSummary() an
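If it helps frame the question, the usual pattern for optimizing ROC inside rfe() is to give the functions object a summary that actually computes ROC, and to turn on class probabilities; a sketch under those assumptions:

```r
library(caret)
library(mlbench)

data(Sonar)
x <- scale(Sonar[, 1:60])
y <- as.factor(Sonar[, 61])

funcs         <- caretFuncs
funcs$summary <- twoClassSummary   # compute ROC/Sens/Spec per resample

set.seed(0)
out <- rfe(x, y, sizes = c(10, 30),
           rfeControl = rfeControl(functions = funcs, method = "cv", number = 5),
           method = "svmLinear", metric = "ROC",   # passed through to the inner train()
           trControl = trainControl(classProbs = TRUE,
                                    summaryFunction = twoClassSummary))
```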

train() function and rate model (poisson regression with offset) with caret

I fitted a rate model using glm() (a poisson link with an offset, like y ~ offset(log(x1)) + x2 + x3; the response is y/x1 in this case). Then I wanted to do cross-validation using the caret package, so I used the train() function with k-fold CV control. It turns out the two models I get are very different. It seems that train() can't handle the offset: when I change the variable within the offset to offset(log(log(x1))) or offset(log(sqrt(x1))), the model stays the same. Has anyone had this experience, and how did you deal with it? Thanks!
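Since train() appears to drop the offset term, one workaround is to cross-validate the glm() fit by hand. A minimal base-R sketch (simulated data; mean Poisson deviance used as the CV score):

```r
set.seed(1)
n <- 200
d <- data.frame(x1 = runif(n, 1, 10), x2 = rnorm(n))
d$y <- rpois(n, lambda = d$x1 * exp(0.3 * d$x2))   # rate depends on x2, exposure is x1

folds <- sample(rep(1:5, length.out = n))

# manual 5-fold CV keeps the offset intact, sidestepping train()'s handling
cv_dev <- sapply(1:5, function(k) {
  fit <- glm(y ~ offset(log(x1)) + x2, family = poisson, data = d[folds != k, ])
  mu  <- predict(fit, newdata = d[folds == k, ], type = "response")
  yk  <- d$y[folds == k]
  2 * sum(ifelse(yk > 0, yk * log(yk / mu), 0) - (yk - mu))  # Poisson deviance
})
mean(cv_dev)
```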

Why is caret train taking up so much memory?

When I train just using glm, everything works, and I don't even come close to exhausting memory. But when I run train(..., method='glm'), I run out of memory. Is this because train is storing a lot of data for each iteration of the cross-validation (or whatever the trControl procedure is)? I'm looking at trainControl and I can't find how to prevent this... any hints? I only care about the performance summaries and, possibly, the predicted responses. (I know it is not related to storing data from each iteration of the parameter-tuning grid search, because I believe there is no grid for glm.) The problem is twofold: i) trai
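For what it's worth, trainControl has several switches that control what gets stored; a hedged sketch (exact savings depend on the data and caret version):

```r
library(caret)

d <- mtcars
d$am <- factor(d$am, labels = c("auto", "manual"))   # two-class outcome for glm

ctrl <- trainControl(method = "cv", number = 5,
                     returnData      = FALSE,    # don't keep a copy of the training set
                     returnResamp    = "final",  # resample stats for the final model only
                     savePredictions = "none")   # don't keep hold-out predictions

fit <- train(am ~ mpg + wt, data = d, method = "glm", trControl = ctrl)
fit$results   # the performance summary survives the trimming
```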

Automatically split function output (list) into component data.frames

I have a function which yields 2 data frames. As functions can only return one object, I combined these data frames into a list. However, I need to work with both data frames separately. Is there a way to automatically split the list into its component data frames, or to write the function in such a way that both objects are returned separately? The function: install.packages("plyr") require(plyr) fun.docmerge <- function(x, y, z, crit, typ, doc = checkmerge) { mergedat <- paste(deparse(substitute(x)), "+",
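One way to do the splitting automatically is list2env(), which assigns each named list element into an environment; a minimal sketch with a stand-in function (not the poster's fun.docmerge):

```r
# stand-in for the real function, returning two named data frames in a list
make_two <- function() {
  list(first  = data.frame(a = 1:3),
       second = data.frame(b = 4:6))
}

# assign each element of the list as its own object in the global environment
list2env(make_two(), envir = globalenv())
first    # now a standalone data.frame
second
```

Keeping the list and indexing it (res$first, res$second) is usually cleaner; list2env is for when separate top-level objects are genuinely required.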

How to open multiple .RData files and save one of their objects as a data.frame

I have multiple .RData files and I need to save one of their objects as a data frame. For instance, I have 5 RData files in a certain folder and I list them like this: files <- list.files(path="/home/user/data/bumphunter", pattern="*.RData", full.names=TRUE, recursive=FALSE) which shows me: files [1] "/home/R1/Results.alt_ID.RData" [2] "/home/R1/Results.alt.RData" [3] "/home/R1/Results.alt_REL.RData" [4] "/home/R1/Results.DU_ID.RData" [5] "
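A common pattern is to load() each file into a fresh environment and pull the object out by name; a sketch below, where the object name "Results" is a placeholder for whatever the files actually contain:

```r
files <- list.files(path = "/home/user/data/bumphunter",
                    pattern = "\\.RData$", full.names = TRUE)

# load into a private environment so existing workspace objects are not clobbered
grab <- function(f, name) {
  e <- new.env()
  load(f, envir = e)
  as.data.frame(get(name, envir = e))
}

tables <- lapply(files, grab, name = "Results")   # "Results" is a hypothetical name
```

ls(e) inside grab() would reveal the real object names if they are unknown.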

Making a CSV file into an RData file

Please bear with an R newbie here. I'm trying to follow along with a tutorial published on the wonderful flowingdata.com site by using my own data to replace the .RData file included in the tutorial. The RData file, "unisexCnts.RData", contains unisex names and the number of times each was used in different years: head(unisexCnts) 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 Addison 0 0 0 0 0 0 0 0 0 0 0 0 0
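The key point when swapping in your own file is that the object saved inside the .RData must keep the name the tutorial loads (unisexCnts here). A sketch assuming a CSV whose first column holds the names and whose headers are years ("mydata.csv" is a placeholder path):

```r
# check.names = FALSE keeps year headers like "1930" from becoming "X1930"
unisexCnts <- as.matrix(read.csv("mydata.csv", row.names = 1, check.names = FALSE))

# save() stores the object under its variable name, which load() later restores
save(unisexCnts, file = "unisexCnts.RData")
```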