Channel: Ask the FireCloud Team — GATK-Forum

failing to get size of file


The QC workflow that we run on BAMs before running MuTect is failing on a particular dataset. The data files in this set reside in a non-workspace, protected bucket that I have read access to (broad-ibmwatson-broad_private_data-bucket). The failure occurs in the first task of the workflow, which takes as input the sizes of several files, i.e.,

call QC_Prepare_Task {
    input:
        preemptible=preemptible,
        tBamBytes=size(tumorBam),
        tBaiBytes=size(tumorBamIdx),
        nBamBytes=size(normalBam),
        nBaiBytes=size(normalBamIdx),
        regionFileBytes=size(regionFile),
        rgBLBytes=size(readGroupBlackList),
        capNormDBZipBytes=size(captureNormalsDBRCLZip),
        fastaBytes=size(refFasta),
        fastaDictBytes=size(refFastaDict),
        fastaIdxBytes=size(refFastaIdx),
        exomeIntervalsBytes=size(exomeIntervals),
        snpSixBytes=size(SNP6Bed),
        hapMapVCFBytes=size(HapMapVCF),
        hapDBForCCBytes=size(HaplotypeDBForCrossCheck),
        dbSNPVCFBytes=size(DB_SNP_VCF),
        dbSNPVCFIDXBytes=size(DB_SNP_VCF_IDX),
        picardHapMapVCFBytes=size(picardHapMap),
        picardTargetIntervalsBytes=size(picardTargetIntervals),
        picardBaitIntervalsBytes=size(picardBaitIntervals)
}

The files that reside in the private bucket are tumorBam, tumorBamIdx, normalBam, and normalBamIdx.
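For context, these are declared as ordinary workflow-level File inputs, roughly as follows (a minimal sketch of the declarations, not our exact WDL):

workflow QC_Workflow {
    # The four inputs below point at objects in the protected, non-workspace bucket
    File tumorBam
    File tumorBamIdx
    File normalBam
    File normalBamIdx
    # ... remaining reference/resource File inputs ...

    call QC_Prepare_Task { ... }
}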

This prepare task fails with the following message:
message: Couldn't resolve all inputs for QC_Workflow.QC_Prepare_Task at index None.
causedBy:
message: Input evaluation for Call QC_Workflow.QC_Prepare_Task failed.
causedBy:
message: nBamBytes
causedBy:
message: fc-a903aa03-a935-463b-bdff-bf782a05a55a/ac0ebcbc-4a9a-43b1-abdb-d5fc82c89986/QC_Workflow/c5d05e91-cec0-4734-ad04-5b7d45656417/call-QC_Prepare_Task/gs:/broad-ibmwatson-broad_private_data-bucket/seq/picard_aggregation/RP-897/Exome/05246_CCPM_030102_Blood/v5/05246_CCPM_030102_Blood.bam
message: tBaiBytes
causedBy:
message: fc-a903aa03-a935-463b-bdff-bf782a05a55a/ac0ebcbc-4a9a-43b1-abdb-d5fc82c89986/QC_Workflow/c5d05e91-cec0-4734-ad04-5b7d45656417/call-QC_Prepare_Task/gs:/broad-ibmwatson-broad_private_data-bucket/xchip/bloodbiopsy/data/cell_free_DNA/27Feb17SR_cfDNA_IBM_WES/mergedBamFilesV2/FC19270072.markDuplicates.bai
message: nBaiBytes
causedBy:
message: fc-a903aa03-a935-463b-bdff-bf782a05a55a/ac0ebcbc-4a9a-43b1-abdb-d5fc82c89986/QC_Workflow/c5d05e91-cec0-4734-ad04-5b7d45656417/call-QC_Prepare_Task/gs:/broad-ibmwatson-broad_private_data-bucket/seq/picard_aggregation/RP-897/Exome/05246_CCPM_030102_Blood/v5/05246_CCPM_030102_Blood.bai
message: tBamBytes
causedBy:
message: fc-a903aa03-a935-463b-bdff-bf782a05a55a/ac0ebcbc-4a9a-43b1-abdb-d5fc82c89986/QC_Workflow/c5d05e91-cec0-4734-ad04-5b7d45656417/call-QC_Prepare_Task/gs:/broad-ibmwatson-broad_private_data-bucket/xchip/bloodbiopsy/data/cell_free_DNA/27Feb17SR_cfDNA_IBM_WES/mergedBamFilesV2/FC19270072.markDuplicates.bam

However, if I run a simple single-task workflow on the same entity, where the task takes one of these files as an input (the File itself rather than its size), the file is localized successfully (from the task log):

2017/06/12 21:00:26 I: Running command: sudo gsutil -q -m cp gs://broad-ibmwatson-broad_private_data-bucket/xchip/bloodbiopsy/data/cell_free_DNA/27Feb17SR_cfDNA_IBM_WES/mergedBamFilesV2/FC19270072.markDuplicates.bai /mnt/local-disk/gs:/broad-ibmwatson-broad_private_data-bucket/xchip/bloodbiopsy/data/cell_free_DNA/27Feb17SR_cfDNA_IBM_WES/mergedBamFilesV2/FC19270072.markDuplicates.bai
2017/06/12 21:00:28 I: Done copying files.
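For reference, that test workflow looked roughly like the following (a simplified sketch from memory, not the exact WDL; the point is that the file is passed as a File input, so Cromwell localizes it into the task):

task localize_only {
    File inFile
    command {
        ls -l ${inFile}
    }
    output {
        String listing = read_string(stdout())
    }
    runtime {
        docker: "ubuntu:16.04"
    }
}

workflow LocalizeTest {
    File inFile
    call localize_only { input: inFile=inFile }
}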

I believe it is Cromwell that is retrieving the size of the files from the bucket, possibly by making a gsutil stat call or an equivalent object-metadata request. What credentials does Cromwell use to make this request? If it is not using the user's credentials, that might explain the failure.
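To make the suspicion concrete: the object size can be read from bucket metadata under my own credentials with a command like the following (purely illustrative, using one of the paths from the error above):

gsutil stat gs://broad-ibmwatson-broad_private_data-bucket/seq/picard_aggregation/RP-897/Exome/05246_CCPM_030102_Blood/v5/05246_CCPM_030102_Blood.bam

If Cromwell issues an equivalent metadata request using service credentials that do not have read access to this bucket, that could produce this kind of failure.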

Note that the gs:// URLs in the failure message are mangled (gs:/ instead of gs://). This might hint at a bug, or it could simply be a formatting artifact of how the error message is generated. Either way, this is blocking progress on our collaboration project.

