Multiple issues with FC workflow execution & call caching

Hi FC team,

Our team has been unable to get any workflows to successfully complete since mid-afternoon yesterday, Tues, Oct 2. We have experienced these issues across multiple workspaces, workflows, and configs.

No single error recurs consistently, but here is a sampling of the issues we have encountered:

Workflows are dying because of a temporary server error (example: https://portal.firecloud.org/#workspaces/talkowski-sv-gnomad-wgs-v2/SV_Talkowski_GNOMAD_WGS-V2/monitor/43065914-ef9b-4634-81b2-8675b1176ca5/b02e2598-e568-447c-8eac-43ba6bf40185)
Tasks in workflows with call caching disabled are suddenly spending 1-2 hours in a CheckingCacheEntryExistence state (example: task CleanVCF.Clean4.combine_multi_IDs in Call #2 here: https://portal.firecloud.org/#workspaces/talkowski-sv-gnomad-wgs-v2/SV_Talkowski_GNOMAD_WGS-V2/monitor/4678e997-47d2-4fbe-8c4f-4c65120e341b/d9c779da-d6eb-4a31-b5a5-0321b6b46823)
Workflows with call caching enabled have not launched for over 12 hours (example: https://portal.firecloud.org/#workspaces/talkowski-sv-gnomad-wgs-v2/SV_Talkowski_GNOMAD_WGS-V2/monitor/555bda8d-6cdd-45ec-b91d-7b89d7ac83b6/1357a70f-afad-4a77-8dcb-9f6a7804ac29)
Tasks are failing because they cannot find outputs from previous tasks, even though those outputs exist in the gs:// bucket and look correct when downloaded and inspected locally (example: task CleanVCF.cleanvcf5 in Call #2 here: https://portal.firecloud.org/#workspaces/talkowski-sv-gnomad-wgs-v2/SV_Talkowski_GNOMAD_WGS-V2/monitor/4678e997-47d2-4fbe-8c4f-4c65120e341b/d9c779da-d6eb-4a31-b5a5-0321b6b46823)

I suspect these issues could be related to the following two posts from yesterday by @jgould and @Chip:
https://gatkforums.broadinstitute.org/firecloud/discussion/13147/long-wait-times#latest
https://gatkforums.broadinstitute.org/firecloud/discussion/13148/2-hours-per-task-to-check-call-cache#latest

At this point, we are completely stalled on all workspaces and don't want to launch any new workflows, given these unpredictable errors and long queue times.

Any idea what could be going on, or how long we should expect this behavior to persist?

Thanks a lot,
Ryan & the Talkowski lab
