Hi-
I am having some trouble with file delocalization in FireCloud when using my own Docker images.
For the setup, I wrote a Dockerfile that packages all of my source code (an automated build from a GitHub repo) that I want to use for analyses in FireCloud. In my FireCloud method, this Docker image is named in the runtime block of each task, which gives the task access to my source code (basically a bunch of R scripts). In each WDL task, I hard-coded the path to the corresponding script inside the Docker container.
When I try to run my method in FireCloud, I see some odd behavior in where files are being moved/written:
message: Task fullPipe.getarray:NA:1 failed. JES error code 5. Message: 10: Failed to delocalize files: failed to copy the following files: "/mnt/local-disk/getarray-rc.txt -> gs://fc-fa093e72-dbcb-4028-ae82-609a79ced51a/3d32ccf4-28ba-43d8-8704-7c87d8f34be7/fullPipe/ae7b05d4-cc26-451b-8a07-00b5b12d26a8/call-getarray/getarray-rc.txt (cp failed: gsutil -q -m cp -L /var/log/google-genomics/out.log /mnt/local-disk/getarray-rc.txt gs://fc-fa093e72-dbcb-4028-ae82-609a79ced51a/3d32ccf4-28ba-43d8-8704-7c87d8f34be7/fullPipe/ae7b05d4-cc26-451b-8a07-00b5b12d26a8/call-getarray/getarray-rc.txt, command failed: CommandException: No URLs matched: /mnt/local-disk/getarray-rc.txt\nCommandException: 1 file/object could not be transferred.\n)"
From the log file, the task seems to be completing but failing when copying files:
2017/08/29 18:09:53 I: Running command: iptables -I FORWARD -d metadata.google.internal -p tcp --dport 80 -j DROP
2017/08/29 18:09:53 I: Setting these data volumes on the docker container: [-v /tmp/ggp-146399440:/tmp/ggp-146399440 -v /mnt/local-disk:/cromwell_root]
2017/08/29 18:09:53 I: Running command: docker run -v /tmp/ggp-146399440:/tmp/ggp-146399440 -v /mnt/local-disk:/cromwell_root -e fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds500/chunk2.freeze4.chrALL.pass.gtonly.minDP10.genotypes.gds=/cromwell_root/fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds500/chunk2.freeze4.chrALL.pass.gtonly.minDP10.genotypes.gds -e __extra_config_gcs_path=gs://cromwell-auth-amp-t2d-op/ae7b05d4-cc26-451b-8a07-00b5b12d26a8_auth.json -e getarray.gdsfilesin-0=/cromwell_root/fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds500/chunk1.freeze4.chrALL.pass.gtonly.minDP10.genotypes.gds -e getarray.gdsfilesin-1=/cromwell_root/fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds500/chunk2.freeze4.chrALL.pass.gtonly.minDP10.genotypes.gds -e exec=/cromwell_root/exec.sh -e getarray-rc.txt=/cromwell_root/getarray-rc.txt -e fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds500/chunk1.freeze4.chrALL.pass.gtonly.minDP10.genotypes.gds=/cromwell_root/fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds500/chunk1.freeze4.chrALL.pass.gtonly.minDP10.genotypes.gds tmajarian/topmed@sha256:b0b54996d86746d199493a94dbc92751c4a1d9399c7898e58174c84d35fe44fe /tmp/ggp-146399440
2017/08/29 18:09:54 I: Switching to status: delocalizing-files
2017/08/29 18:09:54 I: Calling SetOperationStatus(delocalizing-files)
2017/08/29 18:09:54 I: SetOperationStatus(delocalizing-files) succeeded
2017/08/29 18:09:54 I: Docker file /cromwell_root/getarray-rc.txt maps to host location /mnt/local-disk/getarray-rc.txt.
2017/08/29 18:09:54 I: Running command: sudo gsutil -q -m cp -L /var/log/google-genomics/out.log /mnt/local-disk/getarray-rc.txt gs://fc-fa093e72-dbcb-4028-ae82-609a79ced51a/3d32ccf4-28ba-43d8-8704-7c87d8f34be7/fullPipe/ae7b05d4-cc26-451b-8a07-00b5b12d26a8/call-getarray/getarray-rc.txt
2017/08/29 18:09:55 E: command failed: CommandException: No URLs matched: /mnt/local-disk/getarray-rc.txt
CommandException: 1 file/object could not be transferred.
(exit status 1)
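For anyone reading the log, my understanding is that the path in the error is just the host side of the volume mount shown in the docker run line. Here is a minimal Python sketch of that mapping; the helper name container_to_host is something I made up for illustration and is not part of Cromwell or JES:

```python
from pathlib import PurePosixPath

# Volume mounts copied from the "docker run" line in the log (host dir -> container dir).
MOUNTS = {
    "/mnt/local-disk": "/cromwell_root",
    "/tmp/ggp-146399440": "/tmp/ggp-146399440",
}


def container_to_host(container_path, mounts=MOUNTS):
    """Hypothetical helper: translate a path inside the container to its host location."""
    path = PurePosixPath(container_path)
    for host_dir, container_dir in mounts.items():
        try:
            relative = path.relative_to(container_dir)
        except ValueError:
            continue  # path is not under this mount, try the next one
        return str(PurePosixPath(host_dir) / relative)
    raise ValueError(f"{container_path} is not under any mounted volume")


print(container_to_host("/cromwell_root/getarray-rc.txt"))
# prints /mnt/local-disk/getarray-rc.txt
```

If that reading is right, the "No URLs matched" error means that no file existed at /cromwell_root/getarray-rc.txt inside the container when the copy step ran.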
This problem seems to occur only with the Docker images that I create; the task called above completes when a different Docker image is used (one that was built by someone else). My Docker image is public: tmajarian/topmed. Here is the WDL that I am using:
task getarray {
  Array[File] gdsfilesin
  command {
    ls -lh ${sep = ' ' gdsfilesin}
  }
  output {
    Array[File] gdsfilesout = gdsfilesin
  }
  runtime {
    docker: "tmajarian/topmed@sha256:1b10a60f8ad47316b71e51ea864fa1b68fb0585cc5ac190f827573e6eaa0348e"
  }
}
task common_ID {
  File gds
  File ped
  String idcol
  String label
  command {
    R --vanilla --args ${gds} ${ped} ${idcol} ${label} < /src/workflows/singleVariantFull/commonID.R
  }
  meta {
    author: "jasen jackson"
    email: "jasenjackson97@gmail.com"
  }
  runtime {
    docker: "tmajarian/topmed@sha256:1b10a60f8ad47316b71e51ea864fa1b68fb0585cc5ac190f827573e6eaa0348e"
    disks: "local-disk 100 SSD"
    memory: "3G"
  }
  output {
    File commonIDstxt = "${label}.commonIDs.txt"
    File commonIDsRData = "${label}.commonIDs.RData"
  }
}
task assocTest {
  File gds
  File ped
  File GRM
  File commonIDs
  String label
  String colname
  String outcome
  String outcomeType
  String covariates
  command {
    R --vanilla --args ${gds} ${ped} ${GRM} ${commonIDs} ${colname} ${label} ${outcome} ${outcomeType} ${covariates} < /src/workflows/singleVariantFull/assocSingleVar.R
  }
  meta {
    author: "jasen jackson; Alisa Manning, Tim Majarian"
    email: "jasenjackson97@gmail.com; amanning@broadinstitute.org, tmajaria@braodinstitute.org"
  }
  runtime {
    # docker: "tmajarian/topmed@sha256:1b10a60f8ad47316b71e51ea864fa1b68fb0585cc5ac190f827573e6eaa0348e"
    docker: "tmajarian/topmed:latest"
    disks: "local-disk 100 SSD"
    memory: "30G"
  }
  output {
    File assoc = "${label}.assoc.RData"
  }
}
task summary {
  Array[File] assoc
  String pval
  String label
  String title
  command {
    R --vanilla --args ${pval} ${label} ${title} ${sep = ' ' assoc} < /src/workflows/singleVariantFull/summarySingleVar.R
  }
  runtime {
    docker: "tmajarian/topmed@sha256:1b10a60f8ad47316b71e51ea864fa1b68fb0585cc5ac190f827573e6eaa0348e"
    disks: "local-disk 100 SSD"
    memory: "30G"
  }
  output {
    File mhplot = "${label}.mhplot.png"
    File qqplot = "${label}.qqplot.png"
    File topassoccsv = "${label}.topassoc.csv"
    File allassoccsv = "${label}.assoc.csv"
  }
}
workflow fullPipe {
  Array[File] genFiles
  File this_ped
  File this_kinshipGDS
  String this_label
  String this_colname
  String this_outcome
  String this_outcomeType
  String this_covariates
  String this_pval
  String this_title
  call getarray { input: gdsfilesin=genFiles }
  call common_ID {
    input: gds=getarray.gdsfilesout[0], ped=this_ped, idcol=this_colname, label=this_label
  }
  scatter ( this_genfile in getarray.gdsfilesout ) {
    call assocTest {
      input: gds = this_genfile, ped = this_ped, GRM = this_kinshipGDS, commonIDs = common_ID.commonIDsRData, colname = this_colname, outcome = this_outcome, outcomeType = this_outcomeType, covariates = this_covariates, label=this_label
    }
  }
  call summary {
    input: assoc = assocTest.assoc, pval=this_pval, label=this_label, title=this_title
  }
  output {
    File mhplot=summary.mhplot
    File qqplot=summary.qqplot
    File allassoc=summary.allassoccsv
    File topassoc=summary.topassoccsv
  }
}
Any input would be totally awesome.
-Tim