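Genomic Data Processing 21: BWA-SW alignment with the reference split into blocks and indexed per block (reference split into four segments, 250 reads)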

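1. Timing analysis

When the reference is a single chromosome, the first alignment takes roughly 3-5 s; against the combined chr1-4 reference, the first alignment takes about 20 s.

After several consecutive runs, the single-chromosome alignment drops to about 1 s and the chr1-4 alignment to about 2 s.

I am not sure why the first alignment takes longer and later ones get faster. One plausible explanation, consistent with the log, is that the first run has to read the index files from disk while later runs find them in the operating system's page cache: in the slow runs the real time far exceeds the CPU time (2.885 s vs. 1.118 s for chr1, 20.819 s vs. 3.195 s for chr1-4), whereas in the fast runs the two are nearly equal.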
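
One way to probe the timing difference is to compare real time and CPU time for each run: wall-clock time that is not CPU time is mostly spent waiting, for example on disk reads. The sketch below is only an illustration. The helper names `parse_timings` and `non_cpu_seconds` are made up for this example, and the sample lines are copied from the `[main] Real time` lines of the log in this post.

```python
import re

# A few "[main] Real time" lines copied from the bwa log in this post:
# first and repeated chr1 run, first and repeated chr1-4 run.
LOG = """\
[main] Real time: 2.885 sec; cpu: 1.118 sec
[main] Real time: 1.068 sec; cpu: 1.022 sec
[main] Real time: 20.819 sec; cpu: 3.195 sec
[main] Real time: 2.034 sec; cpu: 1.970 sec
"""

TIMING = re.compile(r"Real time: ([\d.]+) sec; cpu: ([\d.]+) sec")


def parse_timings(text):
    """Return a list of (real, cpu) tuples in seconds (hypothetical helper)."""
    return [(float(real), float(cpu)) for real, cpu in TIMING.findall(text)]


def non_cpu_seconds(real, cpu):
    """Wall-clock time not accounted for by CPU time (hypothetical helper)."""
    return round(real - cpu, 3)


for real, cpu in parse_timings(LOG):
    waiting = non_cpu_seconds(real, cpu)
    print(f"real={real:.3f}s cpu={cpu:.3f}s not-on-cpu={waiting:.3f}s")
```

For the first chr1-4 run, about 17.6 of the 20.8 seconds are not CPU time, while in the repeated runs the difference shrinks to a few hundredths of a second. That pattern fits a cold-versus-warm file cache, but it does not prove it; dropping the cache between runs would be a more direct test.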


Commands and output

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 2.885 sec; cpu: 1.118 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.022 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.019 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 2.511 sec; cpu: 1.056 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 0.999 sec; cpu: 0.950 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.017 sec; cpu: 0.964 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.009 sec; cpu: 0.965 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.071 sec; cpu: 1.019 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.072 sec; cpu: 1.015 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.018 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.065 sec; cpu: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.070 sec; cpu: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.050 sec; cpu: 1.009 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.017 sec; cpu: 0.969 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.015 sec; cpu: 0.969 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.023 sec; cpu: 0.966 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.940 sec; cpu: 0.885 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.933 sec; cpu: 0.888 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.915 sec; cpu: 0.872 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.918 sec; cpu: 0.871 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.919 sec; cpu: 0.868 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.889 sec; cpu: 0.853 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 20.819 sec; cpu: 3.195 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 17.380 sec; cpu: 2.803 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 14.140 sec; cpu: 2.454 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 4.305 sec; cpu: 2.166 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.034 sec; cpu: 1.970 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.059 sec; cpu: 1.995 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.079 sec; cpu: 2.000 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ rm SRR003161h1000chr1-4.sam 
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.046 sec; cpu: 1.997 sec

2. Accuracy analysis:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr1.sam 
264 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
105 + 0 mapped (39.77% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr2.sam 
260 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
83 + 0 mapped (31.92% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr3.sam 
256 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
80 + 0 mapped (31.25% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr4.sam 
254 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
58 + 0 mapped (22.83% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr1-4.sam 
264 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
146 + 0 mapped (55.30% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
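
The mapped percentages above can be recomputed from the raw counts, which also makes it easier to compare the per-segment runs with the combined chr1-4 run. The following is a minimal sketch: `parse_flagstat` and `mapped_rate` are hypothetical helpers written for this post, and the counts in `COUNTS` are copied by hand from the flagstat output above.

```python
import re

# First lines of the flagstat output for SRR003161h1000chr1.sam, copied from above.
SAMPLE = """\
264 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
105 + 0 mapped (39.77% : N/A)
"""

# (total records, mapped records) per SAM file, copied from the output above.
COUNTS = {
    "chr1": (264, 105),
    "chr2": (260, 83),
    "chr3": (256, 80),
    "chr4": (254, 58),
    "chr1-4": (264, 146),
}


def parse_flagstat(text):
    """Extract QC-passed (total, mapped) record counts from flagstat text."""
    total = int(re.search(r"^(\d+) \+ \d+ in total", text, re.M).group(1))
    mapped = int(re.search(r"^(\d+) \+ \d+ mapped \(", text, re.M).group(1))
    return total, mapped


def mapped_rate(total, mapped):
    """Mapped records as a percentage of all records."""
    return 100.0 * mapped / total


for name, (total, mapped) in COUNTS.items():
    print(f"{name}: {mapped}/{total} = {mapped_rate(total, mapped):.2f}%")

# Sum of mapped records over the four separate runs.
per_segment_sum = sum(COUNTS[c][1] for c in ("chr1", "chr2", "chr3", "chr4"))
print("sum of per-segment mapped records:", per_segment_sum)
```

Two things stand out. The totals (254 to 264) exceed the 250 input reads, presumably because BWA-SW can write more than one alignment line for a read, so these figures count records rather than reads. The per-segment mapped counts add up to 326, far more than the 146 mapped records of the combined run, which suggests that many reads align to more than one segment when each segment is searched on its own; the per-segment results therefore cannot simply be added together.
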

3. The alignment result files are too long, so they are not pasted here.
