R rownames(foo[bar]) prints as null but can be successfully changed

I've written a script that works on a set of gene-expression data. I'll try to separate my post into a short question and a rather lengthy explanation (sorry about the long text block). I hope the short question makes sense by itself. The long explanation is only there to clarify things in case the short question doesn't get the point across.

I am trying to acquire basic R skills, and something occurred that puzzles me; I didn't find any enlightenment via Google. I really don't understand this. I hope that clarifying what is happening here will help me understand R better. That said, I'm not a programmer, so please bear with my bad code.

SHORT QUESTION:

When I have rownames(foo), e.g.

> print(rownames(foo))
"a"     "b"     "c"     "d" 

and I try to access it via print(rownames(foo[bar])), it prints NULL. E.g.

> print(rownames(foo[2]))
NULL

Here in the second answer Richie Cotton explains this as "[...] that where there aren't any names, [...]". This would indicate to me that either rownames(foo) is empty, which is clearly not the case as I can print it with "print(rownames(foo))", or that this method of access fails.

However, when I try to change the value at position bar, I get a warning message that the replacement length doesn't match. The operation nevertheless succeeds, which pretty much proves that this method of access does work. E.g.

> bar = 2
> rownames(foo[bar]) = some.vector(rab)
> print(rownames(foo[bar]))
NULL
> print(rownames(foo))
"a"     "something else"        "c"     "d" 

Why is this working? Obviously the function can't properly access the position of bar in foo, as it prints it as empty. Why the heck does it still replace the value successfully and not fail in a horrific way? Or asked the other way around: When it successfully replaces the value at this position why is the print function not returning the value properly?

LONG BACKGROUND EXPLANATION:

The data source contains the number in the list, the Entrez ID of the gene, the official gene symbol, the Affymetrix probe ID and then the increase or decrease values. It looks something like this:

No  Entrez  Symbol  Probe_id    Sample1_FoldChange  Sample2_FoldChange
1   690244  Sumo2   1367452_at  1.02                0.19

Later, when displaying the data, I want to print out only the gene symbol and the increases. If there is no gene symbol in the data set, it is printed as "n/a", which is obviously of no value to me, as I can't determine which one of many genes it is. So I added a first processing step that, only for these cases, replaces "n/a" with "n/a(12345)", where 12345 is the Entrez ID.

I've written the following script to do this. (Note: as I'm not a programmer and I am new to R, I doubt that it is pretty code. But that's not the point I want to discuss.)

no.symbol.idx <-which(rownames(expr.table) == "n/a")
c1 <- character (length(rownames(expr.table)))
c2 <- c1
for (x in 1:length(c1))
{
    c1[x] <- "n/a ("
}
for (x in 1:length(c2))
{
    c2[x] <- ")"
}

rownames(expr.table)[no.symbol.idx] <- paste(c1, (expr.table[no.symbol.idx , "Entrez"]),c2, sep="")

The script works and does what it should do. However, I get the following warning message.

Warning message:
In rownames(expr.table)[no.symbol.idx] <- paste(c1, (expr.table[no.symbol.idx,  :
  number of items to replace is not a multiple of replacement length

To find out what happens here, I put some text output into the script.

no.symbol.idx <-which(rownames(expr.table) == "n/a")
c1 <- character (length(rownames(expr.table)))
c2 <- c1
for (x in 1:length(c1))
{
    c1[x] <- "n/a ("
}
for (x in 1:length(c2))
{
    c2[x] <- ")"
}
print("print(rownames(expr.table)):")
print(rownames(expr.table))
print("print(no.symbol.idx):")
print(no.symbol.idx)
print("print(rownames(expr.table[no.symbol.idx])):")
print(rownames(expr.table[no.symbol.idx]))
print("print(rownames(expr.table[14])):")
print(rownames(expr.table[14]))
print("print(rownames(expr.table[15])):")
print(rownames(expr.table[15]))

cat("print(expr.table[no.symbol.idx,\"Entrez\"]):\n")
print(expr.table[no.symbol.idx,"Entrez"])

rownames(expr.table)[no.symbol.idx] <- paste(c1, (expr.table[no.symbol.idx , "Entrez"]),c2, sep="")

print("print(rownames(expr.table)):")
print(rownames(expr.table))
print("print(rownames(expr.table[no.symbol.idx])):")
print(rownames(expr.table[no.symbol.idx]))

And I get the following output in the console.

[1] "print(rownames(expr.table)):"
 [1] "Sumo2"  "Cdc37"  "Copb2"  "Vcp"    "Ube2d3" "Becn1"  "Lypla2" "Arf1"   "Gdi2"   "Copb1"  "Capns1" "Phb2"   "Puf60"  "Dad1"   "n/a"   
[1] "print(no.symbol.idx):"
[1] 15
[1] "print(rownames(expr.table[no.symbol.idx])):"
NULL
[1] "print(rownames(expr.table[14])):"
NULL
[1] "print(rownames(expr.table[15])):"
NULL

... (to be continued) So obviously no.symbol.idx holds the right position for the n/a value. When I try to print rownames(expr.table[no.symbol.idx]), however, it claims that the rownames for this position are empty and returns NULL. When I access this position "by hand" with expr.table[15], it also returns NULL. This has nothing to do with the n/a value, though, as the same holds true for the value stored at position 14.

... (the continuation)
print(expr.table[no.symbol.idx,"Entrez"]):
[1] "116727"
[1] "print(rownames(expr.table)):"
 [1] "Sumo2"        "Cdc37"        "Copb2"        "Vcp"          "Ube2d3"       "Becn1"        "Lypla2"       "Arf1"         "Gdi2"        
[10] "Copb1"        "Capns1"       "Phb2"         "Puf60"        "Dad1"         "n/a (116727)"
[1] "print(rownames(expr.table[no.symbol.idx])):"
NULL

And this is the result that surprises me. Despite the NULL output, the replacement works. It claims everything is NULL, but the operation is successful. I don't understand this.

EDIT: Here are the results of the functions you wanted me to run.

str(expr.table)
chr [1:15, 1:17] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "401" "690244" "114562" "60384" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:15] "Sumo2" "Cdc37" "Copb2" "Vcp" ...
  ..$ : chr [1:17] "No" "Entrez" "Symbol" "Probe_id" ...


head(expr.table)


dput(head(expr.table,10))
structure(c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", 
"690244", "114562", "60384", "116643", "81920", "114558", "83510", 
"64310", "29662", "114023", "Sumo2", "Cdc37", "Copb2", "Vcp", 
"Ube2d3", "Becn1", "Lypla2", "Arf1", "Gdi2", "Copb1", "1367452_at", 
"1367453_at", "1367454_at", "1367455_at", "1367456_at", "1367457_at", 
"1367458_at", "1367459_at", "1367460_at", "1367461_at", "1.02000", 
"-1.04000", "1.03000", "-0.12000", "-0.02000", "-0.03000", "0.09000", 
"0.05000", "-0.09000", "0.16000", "0.19000", "0.11000", "-0.00425", 
"0.52000", "0.46000", "0.42000", "0.20000", "0.05000", "0.21000", 
"0.37000", "0.26000", "0.19000", "-0.03000", "0.35000", "0.34000", 
"0.07000", "0.00156", "0.12000", "0.08000", "0.16000", "0.59000", 
"0.20000", "-0.16000", "0.28000", "0.46000", "-0.15000", "0.00168", 
"0.23000", "-0.01000", "0.10000", "0.05000", "0.12000", "-0.00522", 
"0.58000", "0.23000", "0.06000", "0.01000", "0.07000", "-0.11000", 
"0.23000", "-0.03", "0.08", "0.09", "0.08", "0.11", "0.03", "-0.08", 
"0.02", "-0.05", "0.06", "0.03000", "-0.06000", "0.09000", "0.00940", 
"0.11000", "-0.09000", "0.04000", "-0.04000", "-0.09000", "0.01000", 
"0.04000", "-0.02000", "0.21000", "0.27000", "0.08000", "0.12000", 
"0.06000", "0.26000", "0.04000", "0.40000", "0.05000", "0.05000", 
"0.00897", "0.09000", "0.20000", "0.09000", "0.13000", "-0.03000", 
"-0.08000", "-0.01000", "0.050000", "0.020000", "0.050000", "-0.005390", 
"0.020000", "0.008080", "0.060000", "-0.030000", "-0.020000", 
"-0.000406", "0.50", "0.11", "0.06", "0.19", "0.21", "0.32", 
"0.15", "0.17", "0.14", "0.03", "-0.08000", "-0.11000", "-0.07000", 
"0.03000", "-0.04000", "0.02000", "-0.00444", "-0.07000", "-0.13000", 
"-0.11000", "0.25000", "0.15000", "0.22000", "0.74000", "0.39000", 
"0.36000", "-0.08000", "0.18000", "0.00865", "0.43000"), .Dim = c(10L, 
17L), .Dimnames = list(c("Sumo2", "Cdc37", "Copb2", "Vcp", "Ube2d3", 
"Becn1", "Lypla2", "Arf1", "Gdi2", "Copb1"), c("No", "Entrez", 
"Symbol", "Probe_id", "AA_HD_24h_FoldChange", "AAF_HD_24h_FoldChange", 
"APAP_HD_24h_FoldChange", "BBZ_HD_24h_FoldChange", "BCT_HD_24h_FoldChange", 
"BEA_HD_24h_FoldChange", "CBP_HD_24h_FoldChange", "CCL4_HD_24h_FoldChange", 
"CPA_HD_24h_FoldChange", "CSP_HD_24h_FoldChange", "DEN_HD_24h_FoldChange", 
"LS_HD_24h_FoldChange", "PCT_HD_24h_FoldChange")))

And here I added the file I use for debugging. This is the data it reads into expr.table.

No  Entrez  Symbol  Probe_id    AA_HD_24h_FoldChange    AAF_HD_24h_FoldChange   APAP_HD_24h_FoldChange  BBZ_HD_24h_FoldChange   BCT_HD_24h_FoldChange   BEA_HD_24h_FoldChange   CBP_HD_24h_FoldChange   CCL4_HD_24h_FoldChange  CPA_HD_24h_FoldChange   CSP_HD_24h_FoldChange   DEN_HD_24h_FoldChange   LS_HD_24h_FoldChange    PCT_HD_24h_FoldChange
1   690244  Sumo2   1367452_at  1.02    0.19    0.26    0.59    0.05    -0.03   0.03    0.04    0.05    0.05    0.5 -0.08   0.25
2   114562  Cdc37   1367453_at  -1.04   0.11    0.19    0.2 0.12    0.08    -0.06   -0.02   0.05    0.02    0.11    -0.11   0.15
3   60384   Copb2   1367454_at  1.03    -4.25E-003  -0.03   -0.16   -5.22E-003  0.09    0.09    0.21    8.97E-003   0.05    0.06    -0.07   0.22
4   116643  Vcp 1367455_at  -0.12   0.52    0.35    0.28    0.58    0.08    9.40E-003   0.27    0.09    -5.39E-003  0.19    0.03    0.74
5   81920   Ube2d3  1367456_at  -0.02   0.46    0.34    0.46    0.23    0.11    0.11    0.08    0.2 0.02    0.21    -0.04   0.39
6   114558  Becn1   1367457_at  -0.03   0.42    0.07    -0.15   0.06    0.03    -0.09   0.12    0.09    8.08E-003   0.32    0.02    0.36
7   83510   Lypla2  1367458_at  0.09    0.2 1.56E-003   1.68E-003   0.01    -0.08   0.04    0.06    0.13    0.06    0.15    -4.44E-003  -0.08
8   64310   Arf1    1367459_at  0.05    0.05    0.12    0.23    0.07    0.02    -0.04   0.26    -0.03   -0.03   0.17    -0.07   0.18
9   29662   Gdi2    1367460_at  -0.09   0.21    0.08    -0.01   -0.11   -0.05   -0.09   0.04    -0.08   -0.02   0.14    -0.13   8.65E-003
10  114023  Copb1   1367461_at  0.16    0.37    0.16    0.1 0.23    0.06    0.01    0.4 -0.01   -4.06E-004  0.03    -0.11   0.43
11  29156   Capns1  1367462_at  -0.23   0.32    0.11    0.13    -0.38   -0.15   -0.08   0.15    -0.18   0.2 0.13    -0.18   0.09
12  114766  Phb2    1367463_at  1.01E-003   0.29    0.41    0.59    0.05    -0.07   -0.13   -0.18   -0.28   -0.21   -0.22   -0.2    0.39
13  84401   Puf60   1367464_at  -0.05   0.33    0.14    0.3 0.03    0.02    8.96E-003   2.96E-003   -8.63E-003  -0.13   0.07    -0.15   0.44
14  192275  Dad1    1367465_at  0.22    -0.21   -0.19   -0.24   -0.47   -0.01   -0.09   0.68    -0.06   -0.08   0.02    -0.29   -0.25
401 116727  n/a 1367852_s_at    -0.34   -0.12   -0.06   -0.11   0.13    0.03    0.07    -0.18   0.08    -0.2    0.04    -0.04   0.06

The rownames are filled with the gene symbols, e.g. Sumo2 for No 1. What the script should do (and does) is change the name of entry No 401 from n/a to n/a(116727). However, the aforementioned warning occurs and I want to understand what's going on here.
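
The warning can be reproduced on a much smaller object. The following is only a minimal sketch, assuming a made-up three-row character matrix `m` in place of expr.table; the names `m`, `idx`, `prefix` and `replacement` are hypothetical and not part of the original script. It shows that paste() recycles the single Entrez value to the length of the prefix vector, so the right-hand side has one value per row of the table while only one slot is being replaced. R then uses the first value and warns about the length mismatch.

```r
# Made-up stand-in for expr.table: a character matrix with one "n/a" row name
m <- matrix(c("1", "2", "3", "11", "22", "33"), nrow = 3,
            dimnames = list(c("Sumo2", "Cdc37", "n/a"), c("No", "Entrez")))

idx <- which(rownames(m) == "n/a")    # position of the unnamed gene
prefix <- rep("n/a (", nrow(m))       # one entry per row, like c1 in the script

# paste() recycles the single Entrez value to the length of prefix
replacement <- paste(prefix, m[idx, "Entrez"], ")", sep = "")
length(replacement)                   # more values than slots to replace

# Building exactly one label per selected row avoids the mismatch
rownames(m)[idx] <- paste0("n/a (", m[idx, "Entrez"], ")")
rownames(m)
```

Because paste0() is vectorised over the selected Entrez values, the same line would also work if several rows were named "n/a".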


I assume you are using a data.frame called foo. Under the hood, a data.frame is a list of vectors, each of which has the same length.

  • So foo[2] refers to the second column of foo as a dataframe, foo[,2] refers to the second column of foo as a vector. rownames(foo) is a vector and its second term is rownames(foo)[2]

  • If you want the second column of foo as a dataframe then you can use foo[2] or foo[,2,drop=FALSE] and print(rownames(foo[2])) will give you the same result as print(rownames(foo))

  • If you want the second row of foo as a dataframe then you need a comma as in foo[2,] and print(rownames(foo[2,])) will give you the same result as print(rownames(foo)[2])

  • If you want to change the name of the second row of foo in the original foo dataframe then try something like:

    rownames(foo)[2] = "example of new name for row 2"
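
The points above can be tried out on toy data. This is a rough sketch with made-up objects `df` and `m` (hypothetical names, not the asker's data). It also covers the matrix case, because the str() output in the question shows `chr [1:15, 1:17]` with dimnames, which is a character matrix rather than a data.frame. For a matrix, a single index such as `m[2]` picks out one element, which carries no dimnames, so asking for its rownames gives NULL.

```r
# Toy character matrix and the equivalent data.frame (both made up)
m <- matrix(as.character(1:8), nrow = 4,
            dimnames = list(c("a", "b", "c", "d"), c("x", "y")))
df <- as.data.frame(m)

rownames(df[2])                  # second column as a data.frame: all row names
rownames(df[2, ])                # second row as a data.frame: "b"

rownames(m[2])                   # NULL: m[2] is a single element without dimnames
rownames(m)[2]                   # "b": index the vector of row names instead
rownames(m[2, , drop = FALSE])   # "b": one-row matrix that keeps its dimnames

# Replacement works on the full vector of row names, then writes it back to m
rownames(m)[2] <- "something else"
rownames(m)
```

This is why the assignment in the question succeeds even though the print shows NULL: the two expressions are different. `rownames(foo)[bar] <- value` modifies element bar of the full rownames vector, while `rownames(foo[bar])` first extracts a piece of foo and then asks that piece for its rownames.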
    