Skipping the header comments when viewing a file on Linux

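On Linux, less is the usual command for viewing files. With a file that begins with comments, however, plain less shows a long comment block first, which makes it tedious to reach the actual content. VCF files are a typical example: their header lines begin with #.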

##fileformat=VCFv4.2
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele not already represented at this location by REF and ALT">
##FILTER=<ID=CNN_1D_INDEL_Tranche_99.40_100.00,Description="INDEL truth resource sensitivity between 99.40 and 100.00 for info key CNN_1D">
##FILTER=<ID=CNN_1D_SNP_Tranche_99.95_100.00,Description="SNP truth resource sensitivity between 99.95 and 100.00 for info key CNN_1D">
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another; will always be heterozygous and is not intended to describe called alleles">
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phasing set (typically the position of the first variant in the set)">
##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded as a phred quality -10*log10 p(genotype call is wrong)">
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
##Funcotator Version=4.1.9.0 | Gencode 34 CANONICAL | ACMGLMMLof 1 | ACMG_recommendation SF_v2.0 | ClinVar_VCF 20180429_hg38 | LMMKnown 20180618

# Only part of the header is shown here; the full header is much longer

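To skip the header and go straight to the records when viewing a VCF, combine less with grep and its -v option (select non-matching lines), which prints only the lines that do not match the pattern. Note that the pattern "^#" matches every line beginning with #, so it also drops the single-# column header line (#CHROM ...) that follows the ## meta lines.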

$ grep -v "^#" annotated.filtered.funcotator.germline.vcf | less

# All header lines beginning with # are now skipped, so less starts directly at the first record
chr1 15903 . G GC 59.28 PASS AC=2;AF=1.00;AN=2;CNN_1D=-0.800;DP=3;ExcessHet=3.0103;FS=0.000;FUNCOTATION=[WASH7P|hg38|chr1|15903|15904|RNA||INS|-|-|C|g.chr1:15903_15904insC|ENST00000488147.1|-|||c.e9-44C>GC|||0.6475|AGCAGAGTGGCCAGCCACCG||||||||||||||||||||||||||||||false||];MLEAC=1;MLEAF=0.500;MQ=30.13;QD=29.64;SOR=2.303 GT:AD:DP:GQ:PL 1/1:0,2:2:6:71,6,0
chr1 16495 . G C 36.65 CNN_1D_SNP_Tranche_99.95_100.00 AC=1;AF=0.500;AN=2;BaseQRankSum=-9.670e-01;CNN_1D=-4.292;DP=3;ExcessHet=3.0103;FS=0.000;FUNCOTATION=[WASH7P|hg38|chr1|16495|16495|RNA||SNP|G|G|C|g.chr1:16495G>C|ENST00000488147.1|-|||c.e8-112C>G|||0.486284289276808|TATTTGAAATGGAAACTATTC||||||||||||||||||||||||||||||false||];MLEAC=1;MLEAF=0.500;MQ=22.00;MQRankSum=0.00;QD=12.22;ReadPosRankSum=0.967;SOR=1.179 GT:AD:DP:GQ:PL 0/1:1,2:3:18:44,0,18
chr1 16734 . TG T 31.60 CNN_1D_INDEL_Tranche_99.40_100.00 AC=1;AF=0.500;AN=2;BaseQRankSum=-6.740e-01;CNN_1D=-6.239;DP=10;ExcessHet=3.0103;FS=0.000;FUNCOTATION=[WASH7P|hg38|chr1|16735|16735|RNA||DEL|G|G|-|g.chr1:16735delG|ENST00000488147.1|-|||c.e8-31CA>A|||0.6309226932668329|TGGGGGCGGTGGGGGTGGTGT||||||||||||||||||||||||||||||false||];MLEAC=1;MLEAF=0.500;MQ=32.17;MQRankSum=-1.400e-01;QD=3.51;ReadPosRankSum=-9.210e-01;SOR=0.132 GT:AD:DP:GQ:PL 0/1:7,2:9:39:39,0,197
chr1 17614 . G A 54.64 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=0.00;CNN_1D=-0.693;DP=5;ExcessHet=3.0103;FS=0.000;FUNCOTATION=[WASH7P|hg38|chr1|17614|17614|RNA||SNP|G|G|A|g.chr1:17614G>A|ENST00000488147.1|-|||c.e5+8C>T|||0.6259351620947631|ACAGGTTCTCGGTGGTGTTGA|MIR6859-1_ENST00000619216.1_FIVE_PRIME_FLANK|||||||||||||||||||||||||||||false||];MLEAC=1;MLEAF=0.500;MQ=47.88;MQRankSum=-1.645e+00;QD=10.93;ReadPosRankSum=1.04;SOR=1.179 GT:AD:DP:GQ:PL 0/1:2,3:5:35:62,0,35
chr1 19190 . GC G 32.60 CNN_1D_INDEL_Tranche_99.40_100.00 AC=1;AF=0.500;AN=2;BaseQRankSum=-6.690e-01;CNN_1D=-7.533;DP=13;ExcessHet=3.0103;FS=0.000;FUNCOTATION=[WASH7P|hg38|chr1|19191|19191|RNA||DEL|C|C|-|g.chr1:19191delC|ENST00000488147.1|-|||c.e3+824GC>C|||0.5985037406483791|CTCCCAGTCGCCCCTGTAGCT|MIR6859-1_ENST00000619216.1_FIVE_PRIME_FLANK|||||||||||||||||||||||||||||false||];MLEAC=1;MLEAF=0.500;MQ=31.81;MQRankSum=-2.120e+00;QD=2.51;ReadPosRankSum=0.099;SOR=0.330 GT:AD:DP:GQ:PL 0/1:11,2:13:40:40,0,330
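If the column header line should stay visible, one option is to match only the `##` meta lines. The sketch below is not from the original workflow: it builds a tiny made-up file (the name `sample.vcf`, the temporary directory, and the shortened records are all assumptions for illustration) and compares the two patterns.

```shell
# Build a tiny hypothetical VCF to try the filters on (file name and records are made up).
tmpdir=$(mktemp -d)
vcf="$tmpdir/sample.vcf"
{
  printf '##fileformat=VCFv4.2\n'
  printf '##FILTER=<ID=LowQual,Description="Low quality">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t15903\t.\tG\tGC\t59.28\tPASS\tDP=3\n'
  printf 'chr1\t16495\t.\tG\tC\t36.65\tPASS\tDP=3\n'
} > "$vcf"

# Pattern '^#': drops every line starting with '#', leaving only the variant records.
body=$(grep -v '^#' "$vcf")

# Pattern '^##': drops only the meta lines, so the '#CHROM' column header is kept.
body_with_header=$(grep -v '^##' "$vcf")

printf '%s\n' "$body_with_header"
```

For interactive viewing, either command can be piped into `less -S`, which chops long lines instead of wrapping them; that tends to help with wide records such as the FUNCOTATION fields shown here.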
  • Author: 括囊无誉
  • Link to this post: Linux/grepbody/
  • Copyright: All posts on this blog are original work; please credit the source when reposting.