How to compare two genomic ranges (R)
I have two GRanges objects:
g1<-GRanges(c("chr1:0-14","chr1:15-29"),score=c(20.2,10.4));g1
GRanges object with 2 ranges and 1 Metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
[1] chr1 0-14 * | 20.2
[2] chr1 15-29 * | 10.4
g2<-GRanges(c("chr1:0-9","chr1:10-19","chr1:20-29"),state=c('E1','E2','E1'));g2
GRanges object with 3 ranges and 1 Metadata column:
seqnames ranges strand | state
<Rle> <IRanges> <Rle> | <character>
[1] chr1 0-9 * | E1
[2] chr1 10-19 * | E2
[3] chr1 20-29 * | E1
I want to make them comparable. First I combined them, and then I used disjoin:
g3<-(c(g1,g2)); g3
GRanges object with 5 ranges and 2 Metadata columns:
seqnames ranges strand | score state
<Rle> <IRanges> <Rle> | <numeric> <character>
[1] chr1 0-14 * | 20.2 <NA>
[2] chr1 15-29 * | 10.4 <NA>
[3] chr1 0-9 * | <NA> E1
[4] chr1 10-19 * | <NA> E2
[5] chr1 20-29 * | <NA> E1
disjoin(g3)
GRanges object with 4 ranges and 0 Metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 0-9 *
[2] chr1 10-14 *
[3] chr1 15-19 *
[4] chr1 20-29 *
So disjoin performs the split I want, but unfortunately it does not keep the metadata. Is there a way to keep the metadata and get a GRanges like this?
GRanges object with 4 ranges and 2 Metadata columns:
seqnames ranges strand | score state
<Rle> <IRanges> <Rle> | <numeric> <character>
[1] chr1 0-9 * | 20.2 E1
[2] chr1 10-14 * | 20.2 E2
[3] chr1 15-19 * | 10.4 E2
[4] chr1 20-29 * | 10.4 E1
Thanks
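Solution
I think you will find help here: https://support.bioconductor.org/p/82551/ but note that in your case it is not an exact match, because one range in the output can map to multiple ranges in the input.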
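To illustrate the caveat above, here is a minimal sketch (not part of the original thread) that only inspects the revmap column returned by disjoin(). It reuses g1 and g2 from the question; the variable name rm_idx is made up for illustration.

```r
library(GenomicRanges)

g1 <- GRanges(c("chr1:0-14", "chr1:15-29"), score = c(20.2, 10.4))
g2 <- GRanges(c("chr1:0-9", "chr1:10-19", "chr1:20-29"),
              state = c("E1", "E2", "E1"))

# revmap lists, for every disjoint piece, the rows of the combined
# object that cover it (rows 1-2 come from g1, rows 3-5 from g2).
rm_idx <- disjoin(c(g1, g2), with.revmap = TRUE)$revmap
rm_idx

# Number of input ranges behind each disjoint piece.
lengths(rm_idx)
```

Each piece here is covered by two input ranges (one from g1, one from g2), which is why a simple one-to-one lookup is not enough.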
Yes, with.revmap=TRUE is definitely the solution:
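library(GenomicRanges)
g1 <- GRanges(c("chr1:0-14", "chr1:15-29"), score = c(20.2, 10.4))
g2 <- GRanges(c("chr1:0-9", "chr1:10-19", "chr1:20-29"), state = c("E1", "E2", "E1"))
g3 <- c(g1, g2)                        # combine the two GRanges
g4 <- disjoin(g3, with.revmap = TRUE)  # disjoin, keeping a map back to the rows of g3
l1 <- g4$revmap
# For each disjoint piece, collect the score/state of the ranges that cover it
score <- extractList(mcols(g3)$score, l1)
state <- extractList(mcols(g3)$state, l1)
# Drop the NAs within each list element (renamed so it does not mask stats::na.omit).
# This simplifies to a vector only because exactly one non-NA value remains per piece.
drop_na <- function(l) sapply(l, function(x) x[!is.na(x)])
mcols(g4)$score <- drop_na(score)
mcols(g4)$state <- drop_na(state)
g4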
GRanges object with 4 ranges and 3 metadata columns:
seqnames ranges strand | revmap score state
<Rle> <IRanges> <Rle> | <IntegerList> <numeric> <character>
[1] chr1 0-9 * | 1,3 20.2 E1
[2] chr1 10-14 * | 1,4 20.2 E2
[3] chr1 15-19 * | 2,4 10.4 E2
[4] chr1 20-29 * | 2,5 10.4 E1
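As an alternative that is not from the original thread, the same table can be sketched without the NA-filtering step by looking up each disjoint piece in g1 and g2 separately with findOverlaps(). This assumes, as in the example data, that every piece lies within at most one range of each input; the names pieces, i1 and i2 are made up for illustration.

```r
library(GenomicRanges)

g1 <- GRanges(c("chr1:0-14", "chr1:15-29"), score = c(20.2, 10.4))
g2 <- GRanges(c("chr1:0-9", "chr1:10-19", "chr1:20-29"),
              state = c("E1", "E2", "E1"))

# Disjoint pieces of the combined ranges; granges() drops the metadata
# so only the coordinates are combined.
pieces <- disjoin(c(granges(g1), granges(g2)))

# For each piece, the index of the first range that fully contains it
# (NA if none). Assumption: ranges within g1, and within g2, do not overlap.
i1 <- findOverlaps(pieces, g1, type = "within", select = "first")
i2 <- findOverlaps(pieces, g2, type = "within", select = "first")

pieces$score <- g1$score[i1]
pieces$state <- g2$state[i2]
pieces
```

If the ranges inside one input can overlap each other, select = "first" silently keeps only one hit, so the revmap approach above is the safer choice in that case.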
Now I can easily compare the states with their scores, for example with a boxplot. Thanks, Bastian.
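The boxplot mentioned above could look roughly like this. It is only a sketch: g4 is rebuilt directly from the printed result so the snippet runs on its own, and scores_by_state is a made-up name. With two values per state the plot is a toy, but the same call works on real data.

```r
library(GenomicRanges)

# Rebuild the final result shown above (revmap column left out).
g4 <- GRanges(c("chr1:0-9", "chr1:10-14", "chr1:15-19", "chr1:20-29"),
              score = c(20.2, 20.2, 10.4, 10.4),
              state = c("E1", "E2", "E2", "E1"))

# Group the scores by state, then draw one box per state.
scores_by_state <- split(g4$score, g4$state)
boxplot(scores_by_state, xlab = "state", ylab = "score")
```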