I have some HiSeq WGS data that was made available to us as BAM files. These are coordinate-sorted and aligned to some variant of b37 by Illumina's Isaac aligner. Consequently, I've reprocessed them to randomly ordered FASTQ in preparation for realignment, using samtools commands that were previously recommended somewhere on the GATK forum:
samtools bamshuf -uOn 128 LP2000728-DNA_E03.bam /ramdisk/tmp | samtools bam2fq - | pigz > LP2000728-DNA_E03.fastq
Here pigz is a parallel implementation of gzip, which I've used in place of gzip (as in the original recommendation). This has given me interleaved FASTQ. Would this sort of input be usable in the MuTect2 tumour/normal pipeline and the germline SNPs+Indels pipeline? Or should I specifically split the input into _1 and _2 files, as is the norm with Illumina sequencing runs? I'm not sure whether the relevant WDL can optionally accept either one or two files without editing the workflow. Alternatively, is there a Picard tool which may be more appropriate?
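In case splitting turns out to be the answer, here is a minimal sketch of how I think the interleaved file could be separated into _1 and _2 files. The function name `deinterleave_fastq` is my own (hypothetical), and it assumes that every record is exactly four lines and that each read is immediately followed by its mate, which I have not verified for this data (unpaired reads in the stream would break it):

```shell
# Hypothetical helper: split interleaved FASTQ read from stdin into two files.
# Assumes 4-line records and strict read1/read2 alternation (8 lines per pair).
deinterleave_fastq() {
    out1="$1"   # destination for the first read of each pair
    out2="$2"   # destination for the second read of each pair
    awk -v o1="$out1" -v o2="$out2" '
        # Lines 1-4 of every 8-line block go to o1, lines 5-8 go to o2.
        { if ((NR - 1) % 8 < 4) print > o1; else print > o2 }
    '
}
```

Since my output file is actually gzip-compressed despite the .fastq name, I would call it as `pigz -dc LP2000728-DNA_E03.fastq | deinterleave_fastq LP2000728-DNA_E03_1.fastq LP2000728-DNA_E03_2.fastq`.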