get script
1
|
git clone https://github.com/wlz0726/SequenceMapping.git
|
NB: samtools should be v1.* or later
1 2
|
bwa mem -R ref.fa SampleID.1.fq.gz SampleID.2.fq.gz | samtools sort -O bam -o SampleID.sort.bam samtools index SampleID.sort.bam
|
step2: remove PCR duplicates with picard
1 2 3 4 5 6 7
|
java -jar picard.jar MarkDuplicates INPUT=SampleID.sort.bam OUTPUT=SampleID.rmdup.bam METRICS_FILE =SampleID.dup.txt REMOVE_DUPLICATES=true VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true
|
step3: do local realignment of reads to enhance the alignments in the vicinity of indel polymorphisms
NB: GATK v3.6
NB: Note that indel realignment is no longer necessary for variant discovery if you plan to use a variant caller that performs a haplotype assembly step, such as HaplotypeCaller or MuTect2. However it is still required when using legacy callers such as UnifiedGenotyper or the original MuTect.
1 2 3 4 5 6 7 8 9 10 11 12
|
# use the RealignerTargetCreator to identify regions where realignment was needed java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R ref.fa -o SampleID.intervals -I SampleID.rmdup.bam # use IndelRealigner to realign the regions found in the RealignerTargetCreator step java -jar GenomeAnalysisTK.jar -T IndelRealigner -R ref.fa -targetIntervals SampleID.intervals -I SampleID.rmdup.bam -o SampleID.realn.bam
|
近期评论