Channel: Ask the FireCloud Team — GATK-Forum

five-dollar-genome-analysis-pipeline running errors in CheckContamination

I cloned the five-dollar-genome-analysis-pipeline and ran it on my own WGS input data, but the workflow fails at the CheckContamination stage:

Job germline_single_sample_workflow.CheckContamination:NA:1 exited with return code 1

Workflow ID: 16497c6d-be4c-4579-9202-58960bbde32d

I checked CheckContamination-stderr.log and found the following error message:

Traceback (most recent call last):
  File "<stdin>", line 6, in <module>
  File "/usr/local/lib/python3.6/csv.py", line 111, in __next__
    self.fieldnames
  File "/usr/local/lib/python3.6/csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)
  File "/usr/local/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 146: invalid start byte

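I traced this to line 357 of bam_processing.wdl, where the .selfSM file is opened in text mode: with open('${output_prefix}.selfSM') as selfSM:
The file contains a byte (0xa0) that is not valid UTF-8 at that position, so decoding fails as soon as the csv reader reads the header line.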
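
To confirm which byte is the problem, the file can be read in binary mode and decoded by hand. The following is a minimal sketch, not part of the pipeline: find_invalid_utf8 is a hypothetical helper name, and the sample bytes are made up for illustration.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that fails UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first undecodable byte
        return err.start, data[err.start]
    return None

# Made-up sample: a tab-separated header with a stray 0xa0 byte
sample = b"A\tB\xa0\nx\ty\n"
result = find_invalid_utf8(sample)
print(result)

# On a real file (path is an assumption):
# with open("sample.selfSM", "rb") as handle:
#     print(find_invalid_utf8(handle.read()))
```

For the file in this post, the reported offset should match the position given in the traceback (146), assuming the error occurs in the first chunk read from the file.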

I tried two ways to fix this:

1. Change line 354 in bam_processing.wdl from python3 to python2. Python 2 reads the file as raw bytes without decoding it, so the .selfSM file is handled correctly.

2. Change line 357 in bam_processing.wdl from
    with open('${output_prefix}.selfSM') as selfSM:
   to
    with open('${output_prefix}.selfSM', errors='ignore') as selfSM:
   so that the undecodable byte is ignored.
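
The effect of the second change can be reproduced outside the workflow. The sketch below assumes a made-up two-line, tab-separated file with a stray 0xa0 byte in the header; the column names and values are illustrative, not taken from a real .selfSM file, and read_rows is a hypothetical helper. It also shows encoding='latin-1' as a possible alternative that keeps the byte instead of dropping it.

```python
import csv
import os
import tempfile

# Made-up stand-in for a .selfSM file: header plus one row, with a stray 0xa0 byte
raw = b"#SEQ_ID\tFREEMIX\xa0\nsample1\t0.001\n"

def read_rows(path, **open_kwargs):
    """Read a tab-separated file into a list of dicts, passing extra options to open()."""
    with open(path, **open_kwargs) as handle:
        return list(csv.DictReader(handle, delimiter="\t"))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.selfSM")
    with open(path, "wb") as f:
        f.write(raw)

    # Strict UTF-8 decoding fails, as in the traceback
    try:
        read_rows(path, encoding="utf-8")
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # errors='ignore' silently drops the undecodable byte
    ignored = read_rows(path, encoding="utf-8", errors="ignore")

    # latin-1 maps every byte to a character, so 0xa0 becomes a non-breaking space
    latin1 = read_rows(path, encoding="latin-1")

print(strict_failed)
print(ignored[0]["FREEMIX"])
print(list(latin1[0].keys()))
```

Note that errors='ignore' discards data without warning, and with latin-1 the affected column name carries a trailing non-breaking space, so a lookup by the plain column name would miss it. Which trade-off is acceptable depends on where the stray byte sits in the real file.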
