VCF manipulation
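The vcftools Perl utilities (e.g. vcf-compare) and bcftools need files compressed with bgzip and indexed with tabix; plain gzip output cannot be indexed by tabix:
requires the htslib package (provides bgzip and tabix)
bgzip .vcf
tabix -p vcf .vcf.gz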
Merge all annotated vcf files into a multi-sample vcf
#bgzip all the original vcf files and tabix-index them so bcftools can use them
cat redo_list2 | while read line; do cd $line/COPII; bgzip COPII.out.hg19_multianno.vcf && tabix -p vcf COPII.out.hg19_multianno.vcf.gz; cd ../..; done
ls *.vcf > vcf_list.txt
cat vcf_list.txt | while read line; do bgzip $line; tabix -p vcf $line.gz; done
#generate list for available annotated vcf files
ls *.out.hg19_multianno.vcf.gz > vcf_list
#merge single-sample vcf files to one multi-sample vcf file
bcftools merge --force-samples -l xaa -O v -o merge_NB_xaa.vcf
NOTE: if there are too many files to merge at once, split the list into smaller lists; otherwise bcftools fails with the error "cannot load the index file" even though the .tbi files exist.
split -l 100 NB_vcf_list2.txt #split the list into chunks of 100 lines (xaa, xab, ...)
Merge each sub-list as above, bgzip and tabix-index the intermediate files, then merge them to generate the final vcf file: merge_NB.vcf
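The split-then-merge procedure can be sketched end to end as below. This is only an illustration under stated assumptions: the demo_merge directory, the toy list of 250 file names, the chunk_ prefix and the DRY_RUN switch are all hypothetical and not part of the original notes. With DRY_RUN=1 (the default here) the bcftools and tabix commands are only printed, so the flow can be checked before running it on real data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory and toy list so the sketch runs anywhere;
# in real use, point "list" at the actual list of bgzipped, indexed VCFs.
work=demo_merge
mkdir -p "$work"
list="$work/NB_vcf_list2.txt"
for i in $(seq 1 250); do
    echo "sample_${i}.out.hg19_multianno.vcf.gz"
done > "$list"

# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 to actually run bcftools and tabix.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: split the list into chunks of 100 lines (chunk_aa, chunk_ab, ...).
split -l 100 "$list" "$work/chunk_"

# Step 2: merge each chunk into a compressed, indexed intermediate VCF.
intermediate="$work/intermediate_list.txt"
: > "$intermediate"
for chunk in "$work"/chunk_??; do
    out="$work/merge_NB_$(basename "$chunk").vcf.gz"
    run bcftools merge --force-samples -l "$chunk" -O z -o "$out"
    run tabix -p vcf "$out"
    echo "$out" >> "$intermediate"
done

# Step 3: merge the intermediates into the final multi-sample VCF.
run bcftools merge --force-samples -l "$intermediate" -O v -o "$work/merge_NB.vcf"
```

Writing the intermediates with -O z and indexing them is what allows the last bcftools merge to read them, since bcftools merge expects bgzipped, indexed inputs.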
If the header needs to be replaced (reheader) because of the sample names:
Generate the header text file
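One possible way to produce the header file from an uncompressed VCF is sketched below. The demo_reheader directory, the toy two-sample VCF and the new sample names NB001 and NB002 are hypothetical placeholders, chosen only so the example runs on its own. The awk command keeps the header lines (those starting with "#") and rewrites the sample columns on the #CHROM line, for example when duplicate names were prefixed by --force-samples.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory; in real use work next to merge_all.vcf.
work=demo_reheader
mkdir -p "$work"

# Toy two-sample VCF standing in for merge_all.vcf (columns are tab-separated).
printf '##fileformat=VCFv4.2\n' > "$work/merge_all.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tCOPII\t2:COPII\n' >> "$work/merge_all.vcf"
printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n' >> "$work/merge_all.vcf"

# Keep only header lines and rename the sample columns (fields 10 and 11)
# on the #CHROM line. NB001 and NB002 are placeholder names.
awk 'BEGIN { FS = OFS = "\t" }
     /^#CHROM/ { $10 = "NB001"; $11 = "NB002" }
     /^#/ { print }' "$work/merge_all.vcf" > "$work/merge_header"

# Show the edited header for a visual check.
cat "$work/merge_header"
```

The resulting merge_header is a complete header, which is what bcftools reheader -h expects. For a bgzipped file, bcftools view -h can extract the header instead of the awk filter, and bcftools reheader -s with a file of sample names is an alternative when only the names change.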
bcftools reheader -h merge_header merge_all.vcf -o ENG_NGS_PAQR_reheader.vcf
Use vcflib (the vcffilter tool) to filter variants based on INFO annotations
vcffilter -f "1000g2015aug_all < 0.01" merge_all_new.vcf > merge_all_new_1000G.vcf
vcffilter -f "esp6500si_all < 0.01" merge_all_new_1000G.vcf > merge_all_MAF0.01.vcf
NOTE: remember to leave a space before and after the operator in the condition (<, >, etc.)
Compare two VCF files:
option 1:
vcf-compare -a HLI-0042-hg38bwa-chr12.recode.vcf.gz HLI-0042.chr12.vcf.recode.vcf.gz > vcf_compare
option 2:
java -jar /cm/shared/apps/GenomeAnalysisTk/3.6/GenomeAnalysisTK.jar -T VariantEval -R ~/lustre/LIBRARY_FILES/GDC/GRCh38.d1.vd1.fa -o output.eval.grp --eval:set1 HLI-0042-hg38bwa-chr12.recode.vcf.gz --eval:set2 HLI-0042.chr12.vcf.recode.vcf.gz
option 3:
java -jar /Users/niy/Desktop/Software/snpEff/SnpSift.jar concordance -v /Volumes/ccg/HLI/YingAnalysis/ComparisonTest/HLI-0042-hg38bwa-chr12.recode.vcf /Volumes/ccg/HLI/YingAnalysis/ComparisonTest/HLI-0042.chr12.vcf.recode.vcf > SnpSift_concordance.txt