The cloud provides a secure and scalable way to share access to data and tools, facilitate collaboration, increase computational reproducibility, and enhance data security.
The Centers for Common Disease Genomics (CCDG) program has sequenced whole genomes or exomes from over 65,000 participants, representing 900 terabytes of data, with many more planned.
All of Us
The All of Us Research Program is sequencing over 250,000 whole genomes, representing 3.7 petabytes of data, out of a planned cohort of over 1 million participants.
With datasets on the scale of terabytes or more, the traditional model of downloading copies to local computing infrastructure cannot scale, and it imposes unfair barriers to participation for all but the largest and most highly funded research centers.
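As a rough back-of-the-envelope illustration of why downloading does not scale, the sketch below estimates how long it would take to copy each dataset once. It assumes a sustained 1 Gbps link and decimal terabytes; both are assumptions chosen for illustration, not figures from either program, and the function name is hypothetical.

```python
def transfer_days(size_terabytes: float, gigabits_per_second: float) -> float:
    """Days needed to move size_terabytes at a sustained link speed."""
    size_bits = size_terabytes * 1e12 * 8          # decimal terabytes -> bits
    seconds = size_bits / (gigabits_per_second * 1e9)
    return seconds / 86_400                        # seconds per day


# Assumed sustained throughput; real-world links are often slower or shared.
ASSUMED_LINK_GBPS = 1.0

# Dataset sizes quoted in the text, expressed in terabytes.
datasets_tb = {"CCDG": 900, "All of Us": 3_700}

for name, size_tb in datasets_tb.items():
    days = transfer_days(size_tb, ASSUMED_LINK_GBPS)
    print(f"{name}: ~{days:.0f} days")
# CCDG: ~83 days
# All of Us: ~343 days
```

Even under this optimistic assumption, a single copy takes months, before accounting for the storage needed to hold it.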
Thanks to containers, you can bundle all of the software needed for your analysis in a self-contained, portable environment that can be run anywhere, so anyone else can run your code without spending hours trying to install packages and resolve conflicts.
Everyone has access to the same wide range of off-the-shelf hardware, including more special-purpose hardware like GPUs, so you can reproduce someone else’s work without having to buy bespoke hardware.
There are many misconceptions about the security of data in the cloud. In Terra, security is a first-class feature, drawing on best practices in information security, accredited through top US federal certification programs, and compliant with the European GDPR. For more details, see the Security resource page.