R: Scatter plot of time series data for multiple points, ggplot?, reshape?

I have data in the following format. Column V1 is the genomic location of interest, and columns V4 and V5 are the minor allele frequencies at two different points in time. I would like to make a simple xy scatter plot with a line connecting the allele frequency (plotted on the y-axis) for each specific location from timepoint 1 to timepoint 2. (Note: I actually have hundreds to thousands of data points.)

   V1    V2      V3          V4          V5
1 153 1/113   1/115 0.008849558 0.008695652
2 390 0/176 150/152 0.000000000 0.986842105
3 445 1/149   1/152 0.006711409 0.006578947
4 507 0/154 144/146 0.000000000 0.986301370
5 619 1/103  99/101 0.009708738 0.980198020
6 649 0/138 120/123 0.000000000 0.975609756

I feel like I should be able to accomplish this with ggplot, but I am not sure how to go about doing so, as I don't know how to specify two y-values for each genomic position, nor specify a column as a category. I suspect the data needs to be reshaped somehow. Any help or suggestions are greatly appreciated!


Update:

Thanks to all who gave me suggestions. I don't think I was very clear about wanting the time points to be my x-axis as opposed to the genomic position - my apologies. Hopefully this picture clarifies that!

I have successfully generated the plot I wished to make with the following code:

ggplot(dat) + geom_segment(aes(x="timepoint 1", y=V4, xend="timepoint 2", yend=V5))

and this is what the plot looks like with more data points...

[Figure: allele frequency trajectories]

I haven't changed the axes titles and played with margins yet, but this is the general idea!


If your example data were in DF, then:

ggplot(DF) +
  geom_segment(aes(x=V4, y="timepoint 1", xend=V5, yend="timepoint 2"))
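
Since the question also asks about reshaping, here is a minimal sketch of that route, with the time points on the x-axis as the update requests. It stacks V4 and V5 into one column and lets ggplot draw one line per location. The names long, location, timepoint and freq are my own choices for illustration, and the data frame is just the six example rows from the question with V2 and V3 left out.

```r
library(ggplot2)

# Example rows copied from the question (V2 and V3 omitted for brevity)
DF <- data.frame(
  V1 = c(153, 390, 445, 507, 619, 649),
  V4 = c(0.008849558, 0.000000000, 0.006711409,
         0.000000000, 0.009708738, 0.000000000),
  V5 = c(0.008695652, 0.986842105, 0.006578947,
         0.986301370, 0.980198020, 0.975609756)
)

# Long format: one row per location per time point
long <- data.frame(
  location  = rep(DF$V1, times = 2),
  timepoint = factor(rep(c("timepoint 1", "timepoint 2"), each = nrow(DF))),
  freq      = c(DF$V4, DF$V5)
)

# group = location tells ggplot which two points belong to the same line
p <- ggplot(long, aes(x = timepoint, y = freq, group = location)) +
  geom_line() +
  geom_point() +
  labs(x = NULL, y = "Minor allele frequency")
print(p)
```

The long format should also extend to more than two time points without changing the plotting code, which geom_segment cannot do as directly.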


It's not completely clear from the question, but I think this is what you're after:

ggplot(d, aes(x=V1, y=V4, ymin=V4, ymax=V5)) +
  geom_linerange() +
  xlab('Genomic location') +
  ylab('Minor allele frequency')

Docs: http://docs.ggplot2.org/current/geom_linerange.html



with(dat, plot(x=V1, y=V5, ylim=c(0,1), type='n',
      xaxt="n", ylab="Allele Frequency", xlab="Genomic Location"))
with(dat, axis(1, V1, V1, cex.axis=0.7))
with(dat, arrows(x0=V1, x1=V1+10, y0=V4, y1=V5))

You can clean up the labeling and tweak colors and arrowhead features:

?arrows
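
As a rough sketch of that kind of tweaking, the version below shrinks the arrowheads and colours the arrows by how much the frequency changed. The 0.5 threshold, the colours and the name arrow_col are arbitrary choices of mine for illustration; dat here holds only the six example rows from the question.

```r
# Example rows copied from the question (V2 and V3 omitted for brevity)
dat <- data.frame(
  V1 = c(153, 390, 445, 507, 619, 649),
  V4 = c(0.008849558, 0.000000000, 0.006711409,
         0.000000000, 0.009708738, 0.000000000),
  V5 = c(0.008695652, 0.986842105, 0.006578947,
         0.986301370, 0.980198020, 0.975609756)
)

# Red where the frequency rose by more than 0.5 (arbitrary cutoff), grey otherwise
arrow_col <- ifelse(dat$V5 - dat$V4 > 0.5, "red", "grey40")

# Empty plot with custom x-axis ticks at the genomic locations
with(dat, plot(x = V1, y = V5, ylim = c(0, 1), type = "n", xaxt = "n",
               ylab = "Allele Frequency", xlab = "Genomic Location"))
with(dat, axis(1, at = V1, labels = V1, cex.axis = 0.7))

# length is the arrowhead size in inches, angle its opening angle in degrees
with(dat, arrows(x0 = V1, x1 = V1 + 10, y0 = V4, y1 = V5,
                 length = 0.08, angle = 20, col = arrow_col, lwd = 1.5))
```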

