Description of feature
SRA currently mirrors its files on AWS, so a direct-from-AWS download method could be useful, particularly for anyone launching workflows in the cloud.
In my own little pipeline, I've set it up like this:
https://github.com/tanaes/nf-reads-profiler/blob/d5a3ec72cb8aeeff7bd0df1db03e3f583ea21842/modules/data_handling.nf#L1-L24
I just give it a samplesheet with an optional SRA column (see schema here) and, if it finds a valid SRA ID for that sample, it uses `aws s3 cp` to download the file and feeds it to fasterq_dump.
This method has been incredibly fast and reliable for me! It would be great to see it in fetchngs.
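
For anyone who doesn't want to click through, here is a rough sketch of the logic described above, written in Python rather than Nextflow so it can be read on its own. It is not the code from the linked module. The bucket path (`s3://sra-pub-run-odp/sra`), the `--no-sign-request` flag, the accession regex, and the helper names `is_valid_run_id` and `build_commands` are all assumptions made for illustration, so check them against the NCBI and AWS documentation before relying on them.

```python
import re

# Assumption: run accessions look like SRR/ERR/DRR followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")

# Assumption: location of the SRA mirror on AWS; verify before use.
SRA_BUCKET = "s3://sra-pub-run-odp/sra"


def is_valid_run_id(value):
    """Return True if the (optional) samplesheet value looks like a run accession."""
    return bool(value) and RUN_ACCESSION.match(value.strip()) is not None


def build_commands(run_id, outdir="."):
    """Build the download and conversion commands for one run (hypothetical helper).

    Nothing is executed here; the lists can be passed to subprocess.run().
    """
    run_id = (run_id or "").strip()
    if not is_valid_run_id(run_id):
        raise ValueError(f"not a valid SRA run accession: {run_id!r}")
    local_path = f"{outdir}/{run_id}.sra"
    # Copy the .sra object from the public bucket without AWS credentials.
    download = ["aws", "s3", "cp", f"{SRA_BUCKET}/{run_id}/{run_id}",
                local_path, "--no-sign-request"]
    # Convert the downloaded file to FASTQ, one file per read.
    convert = ["fasterq-dump", "--split-files", "--outdir", outdir, local_path]
    return download, convert


if __name__ == "__main__":
    for cmd in build_commands("SRR1234567", outdir="reads"):
        print(" ".join(cmd))
```

Samples whose SRA column is empty or invalid would simply skip this step and fall back to whatever local files the samplesheet provides.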