I am trying to submit a large membership TSV file (63K lines). I've written a fissfc command (available in v0.13.1) that batches the update into multiple smaller groups (default 500, but configurable), since submitting large TSV files all at once is a known issue. However, I am still getting 500 errors from the importEntities endpoint:
timdef@wmc82-d41:~/tmp$ fissfc -v
0.13.1
timdef@wmc82-d41:~/tmp$ fissfc -V entity_import -C 2000 -w dev -p broad-firecloud-gdac -f patched_TCGA.2016_11_03__01_00_02.Sample_Set.loadfile.txt
Batching 63807 updates to Firecloud...
Updating sample_set memberships 1-2000, batch 1/32
Traceback (most recent call last):
File "/usr/local/bin/fissfc", line 9, in <module>
load_entry_point('firecloud==0.13.1', 'console_scripts', 'fissfc')()
File "/usr/local/lib/python2.7/site-packages/firecloud/fiss.py", line 1686, in main
sys.exit(args.func(args))
File "/usr/local/lib/python2.7/site-packages/firecloud/fiss.py", line 144, in entity_import
chunk_size, api_url, verbose):
File "/usr/local/lib/python2.7/site-packages/firecloud/fiss.py", line 1143, in _batch_load
fapi._check_response_code(r, 200)
File "/usr/local/lib/python2.7/site-packages/firecloud/api.py", line 55, in _check_response_code
raise FireCloudServerError(response.status_code, response.content)
firecloud.errors.FireCloudServerError: 500: {
"statusCode": 500,
"source": "FireCloud",
"timestamp": 1481308448927,
"causes": [],
"stackTrace": [{
"className": "spray.can.client.HttpHostConnectionSlot$$anonfun$connected$1",
"methodName": "applyOrElse",
"fileName": "HttpHostConnectionSlot.scala",
"lineNumber": 148
}, {
"className": "akka.actor.Actor$class",
"methodName": "aroundReceive",
"fileName": "Actor.scala",
"lineNumber": 480
}, {
"className": "spray.can.client.HttpHostConnectionSlot",
"methodName": "aroundReceive",
"fileName": "HttpHostConnectionSlot.scala",
"lineNumber": 33
}, {
"className": "akka.actor.ActorCell",
"methodName": "receiveMessage",
"fileName": "ActorCell.scala",
"lineNumber": 526
}, {
"className": "akka.actor.ActorCell",
"methodName": "invoke",
"fileName": "ActorCell.scala",
"lineNumber": 495
}, {
"className": "akka.dispatch.Mailbox",
"methodName": "processMailbox",
"fileName": "Mailbox.scala",
"lineNumber": 257
}, {
"className": "akka.dispatch.Mailbox",
"methodName": "run",
"fileName": "Mailbox.scala",
"lineNumber": 224
}, {
"className": "akka.dispatch.Mailbox",
"methodName": "exec",
"fileName": "Mailbox.scala",
"lineNumber": 234
}, {
"className": "scala.concurrent.forkjoin.ForkJoinTask",
"methodName": "doExec",
"fileName": "ForkJoinTask.java",
"lineNumber": 260
}, {
"className": "scala.concurrent.forkjoin.ForkJoinPool$WorkQueue",
"methodName": "runTask",
"fileName": "ForkJoinPool.java",
"lineNumber": 1339
}, {
"className": "scala.concurrent.forkjoin.ForkJoinPool",
"methodName": "runWorker",
"fileName": "ForkJoinPool.java",
"lineNumber": 1979
}, {
"className": "scala.concurrent.forkjoin.ForkJoinWorkerThread",
"methodName": "run",
"fileName": "ForkJoinWorkerThread.java",
"lineNumber": 107
}],
"message": "Service API call failed"
}
timdef@wmc82-d41:~/tmp$ fissfc -V entity_import -C 250 -w dev -p broad-firecloud-gdac -f patched_TCGA.2016_11_03__01_00_02.Sample_Set.loadfile.txt
Batching 63807 updates to Firecloud...
Updating sample_set memberships 1-250, batch 1/256
Updating sample_set memberships 251-500, batch 2/256
Traceback (most recent call last):
File "/usr/local/bin/fissfc", line 9, in <module>
load_entry_point('firecloud==0.13.1', 'console_scripts', 'fissfc')()
File "/usr/local/lib/python2.7/site-packages/firecloud/fiss.py", line 1686, in main
sys.exit(args.func(args))
File "/usr/local/lib/python2.7/site-packages/firecloud/fiss.py", line 144, in entity_import
chunk_size, api_url, verbose):
File "/usr/local/lib/python2.7/site-packages/firecloud/fiss.py", line 1143, in _batch_load
fapi._check_response_code(r, 200)
File "/usr/local/lib/python2.7/site-packages/firecloud/api.py", line 55, in _check_response_code
raise FireCloudServerError(response.status_code, response.content)
firecloud.errors.FireCloudServerError: 500: The server was not able to produce a timely response to your request.
How can I reliably upload an update of this size, and how can I make it run faster? Each API call still takes a dozen seconds or so round trip, even when it succeeds.
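In case it's useful context, the client-side workaround I've been considering is roughly the following: split the loadfile into header-prefixed chunks and retry each importEntities call with exponential backoff when the server returns a transient 500. This is just a sketch; `chunk_entities` and `with_retries` are hypothetical helpers, not part of the FISS API.

```python
import time


def chunk_entities(lines, chunk_size):
    """Split a TSV loadfile (header + data rows) into chunks.

    The header line is re-attached to every chunk so each POST body
    is a complete, standalone loadfile.
    """
    header, rows = lines[0], lines[1:]
    for i in range(0, len(rows), chunk_size):
        yield [header] + rows[i:i + chunk_size]


def with_retries(call, max_attempts=5, base_delay=1.0):
    """Call `call()` and retry on failure with exponential backoff.

    Re-raises the last exception once max_attempts is exhausted.
    In practice the except clause would target FireCloudServerError.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Each chunk would then be uploaded with something like `with_retries(lambda: fapi.upload_entities(project, workspace, "\n".join(chunk)))`, so a transient 500 or timeout on one batch doesn't abort the whole 63K-row import. That still doesn't address the latency per call, though.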