Resources and limitations

What are the resources that I need and what are the limits?

Type of setup:

There are a number of ways to fine-tune how the SEQuoia Express Toolkit uses resources. The right settings depend on the system you intend to use.

Big data (large data sets of large files): a cluster or cloud computing is a good path forward
Normal data sets (~50M reads): cloud computing or a local machine with a good amount of resources: 8-16 cores and 32-64 GB RAM

A laptop is not recommended for this computational work unless it is only being used to submit the job to a cluster or cloud computing platform and retrieve the results, in which case you just need to make sure you have plenty of hard drive space.

What have you tested?

For development of the SEQuoia Express Toolkit, we used a variety of ways to test the upper limits of the software.

For large single-file data sets, we tested up to 125 million reads. Experiments of this size require large amounts of RAM (approximately 80 GB) for deduplication, which needs it to build the graph. We recommend running samples of this size on a large cluster or cloud platform.

Fine tuning

Each process in the main.nf file is tagged with a resource allocation. If a process is untagged, it will revert to the defaults present in conf/base.config. The tags indicate whether the process requires more than the minimum CPU or RAM. Some processes need a lot of RAM but not a lot of CPU (such as deduplication), some need the reverse, and others need a lot of both. These allocations are already included in the main.nf file but can always be changed to suit your needs or system size.
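As a rough illustration of how such an override could look, here is a minimal sketch of a custom Nextflow config file. It assumes the tags are implemented as Nextflow process labels. The label names (high_memory, high_cpu) and all values except the 80 GB figure quoted above are hypothetical, so check main.nf and conf/base.config for the names and defaults the toolkit actually uses.

```groovy
// custom.config (hypothetical file name)
process {
    // Fallback for processes without a label (illustrative values only)
    cpus = 2
    memory = '8 GB'

    // Hypothetical label for RAM-heavy steps such as deduplication;
    // ~80 GB is the figure reported for a 125 million read data set
    withLabel: 'high_memory' {
        cpus = 2
        memory = '80 GB'
    }

    // Hypothetical label for CPU-heavy steps that need little RAM
    withLabel: 'high_cpu' {
        cpus = 16
        memory = '16 GB'
    }
}
```

A file like this can be passed to Nextflow with the -c option, for example: nextflow run main.nf -c custom.config. Settings in it take precedence over the matching defaults without editing the toolkit's own files.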
