Monte Carlo simulation in R

I am trying to simulate data (Y) from an AR(1) model with rho=0.7. Then I will use this data to run a regression of Y on an intercept ( by so doing the parameter estimate becomes the mean of Y), then test the null hypothesis of the coefficient being less than or equal to zero ( alternative is greater than 0) using robust standard errors. I want to run a Monte Carlo simulation of this hypothesis using 2000 replications for different lag values. the purpose is to show the finite sample performance of the Newey West estimator as the lag changes. so this is how I began

A<-array(0, dim=c(2000,1))
for(i in 1:2000){
  y_new<-arima.sim(model=list(ar=0.7), n=50, mean=0,sd=1)
  reg<-lm(y_new~1)
  ad<-coeftest(reg, alternative="greater", vcov=NeweyWest(reg, lag=1, prewhite=FALSE))
  A[i]<-ad[,3]
}

My question: is the code above the right way of doing this kind of simulation? And if it is, how can I get a code to repeat this process for different lag values in the HAC test. I want to run the test each time increasing the lag by 1, thus I will be doing this 50 times for lags 1,2,3,4......,50, each time storing the 2000 simulated test statistics in a vector with different names. calculate rejection probabilities for the test statistic (sig. level =0,05, using the critical value of 1.645) for each case and plot them(rejection probabilities) against the various lag values. Please help


Because you didn't mention the possible purpose of the simulation, it is hard to tell whether it is the right way.

You save a lot of time by computing 50 test statistics for each simulated sample, instead of repeating the simulation 2000 times for each lag (that is, the number of simulation is 2000*50).

Much better format of doing simulation is

library(AER)
library(dplyr)
lags <- 1:50
nreps <- 2000

sim <- function (){
  ynew <- arima.sim(model = list(ar=0.7), n=50, mean=0, sd=1)
  reg <- lm(ynew ~ 1 )
  s <- rep(NA, 50)
  for(i in lags){    
    ad <- coeftest(reg, alternative="greater", vcov=NeweyWest(reg, lag = i, prewhite=FALSE))
    s[i] <- ad[ ,4]
  }
  s
}

Following code stores simulation results in a data.frame

 result <- lapply(1:nreps, function(i)data.frame(simulation = i, lag = lags, pvalues = sim())) %>%
 rbind_all

From your vague description, I extrapolate what you want looks something like

library(ggplot2)
result %>% 
  group_by(lag) %>% 
  summarize(rejectfreq = mean(pvalues > 0.05)) %>% 
  ggplot(., aes(lag, rejectfreq)) + geom_line()+
  coord_cartesian(ylim = c(0,1)) +
  scale_y_continuous(breaks=seq(0, 1, by=0.1))

Although the figure was created using only 100 simulations, it is evident that the choice of the lags in Newey-West wouldn't matter much when the disturbance terms are iid

链接地址: http://www.djcxy.com/p/57756.html

上一篇: 我应该如何解释斯皮尔曼的等级相关性为零的意义?

下一篇: 蒙特卡罗模拟在R