Channel: Ask the FireCloud Team — GATK-Forum

transient workflow failure


A submission made March 10, 2017, at 5:14 PM, containing 19 entities in its pairset, had 3 of them fail within minutes. This was relatively easy to mitigate because the problem 1) occurred quickly and 2) was easily visible, but it is an area where the system could be made more robust. There was no useful error message, and manually relaunching the 3 failed jobs worked fine.

submission id: 475120fa-2411-4c24-9d7c-de11c956174b

One of the three failing workflows:

workflow id: d5b4ddd2-eb3c-4d3a-afa0-01ddd5114b9e

ID:operations/EI7kqNKrKxjj6YnzjubghakBIP3g3tG1AioPcHJvZHVjdGlvblF1ZXVl

Lines from the workflow log showing the failure and the line before it:

2017-03-10 22:15:13,590 INFO - JesAsyncBackendJobExecutionActor [UUID(d5b4ddd2)pcawg_full_workflow.pcawg_full:NA:1]: JesAsyncBackendJobExecutionActor [UUID(d5b4ddd2):pcawg_full_workflow.pcawg_full:NA:1] Status change from - to Running
2017-03-10 22:15:59,151 INFO - JesAsyncBackendJobExecutionActor [UUID(d5b4ddd2)pcawg_full_workflow.pcawg_full:NA:1]: JesAsyncBackendJobExecutionActor [UUID(d5b4ddd2):pcawg_full_workflow.pcawg_full:NA:1] Status change from Running to Failed
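
For anyone triaging a similar failure, here is a rough sketch of how the status transitions could be pulled out of a workflow log to see how long a job ran before failing. The regular expression assumes the exact line layout of the two lines quoted above, and the function names `status_changes` and `seconds_running_before_failure` are made up for illustration; they are not part of FireCloud or Cromwell.

```python
import re
from datetime import datetime

# Assumption: every status line looks like the two log lines quoted above,
# i.e. "<YYYY-MM-DD HH:MM:SS,mmm> ... Status change from <old> to <new>".
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Status change from (?P<old>\S+) to (?P<new>\S+)\s*$"
)


def status_changes(lines):
    """Return (timestamp, old_status, new_status) for each status-change line."""
    changes = []
    for line in lines:
        match = STATUS_RE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            changes.append((ts, match.group("old"), match.group("new")))
    return changes


def seconds_running_before_failure(lines):
    """Seconds between the first 'Running' and first 'Failed' transition, or None."""
    started = None
    failed = None
    for ts, _old, new in status_changes(lines):
        if new == "Running" and started is None:
            started = ts
        elif new == "Failed" and failed is None:
            failed = ts
    if started is None or failed is None:
        return None
    return (failed - started).total_seconds()
```

Fed the two lines above, this reports a gap of roughly 45.6 seconds between Running and Failed, which matches the "fail within minutes" behavior described in the post.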

