We (like many other groups) batch multiple samples together for sequencing. The output file from the sequencer therefore contains reads from multiple samples, which then need to be split into separate per-sample files. We would like to implement this demultiplexing step as a WDL task. It would take one file as input (e.g. a FASTQ containing all the reads from the sequencing run) and produce multiple output files (e.g. one FASTQ per sample). I know how to glob the outputs to create an array of output files, but that array then becomes attached to a single entity in the data model (i.e. the sequencing run). We would instead like each output sample to become its own sample entity in the data model. Does anyone have advice on how to tackle this?
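For reference, the glob-based approach I mentioned looks roughly like the sketch below. This is only an illustration: `demux_tool`, its flags, the `barcodes` input, and the `demuxed/` output directory are hypothetical placeholders for whatever demultiplexer is actually used.

```wdl
version 1.0

# Sketch only: `demux_tool` and its flags are hypothetical placeholders,
# not a real CLI. Substitute the actual demultiplexing command.
task demultiplex {
  input {
    File multiplexed_fastq   # all reads from the sequencing run
    File barcodes            # hypothetical sample-to-barcode sheet
  }

  command <<<
    mkdir demuxed
    demux_tool --reads ~{multiplexed_fastq} --barcodes ~{barcodes} --out-dir demuxed
  >>>

  output {
    # One FASTQ per sample, collected with glob()
    Array[File] per_sample_fastqs = glob("demuxed/*.fastq")
  }
}

workflow demultiplex_run {
  input {
    File multiplexed_fastq
    File barcodes
  }

  call demultiplex {
    input:
      multiplexed_fastq = multiplexed_fastq,
      barcodes = barcodes
  }

  output {
    # The whole array ends up on the single entity the workflow was run on
    Array[File] per_sample_fastqs = demultiplex.per_sample_fastqs
  }
}
```

The problem is the last output: the whole `Array[File]` lands on the one run entity, rather than each file being attached to a separate sample entity.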