
Moving away from multi-regional storage buckets

Adam Mullen, Software Product Manager in the Data Sciences Platform at the Broad Institute, is responsible for Terra Workspaces. In this blog post, he announces a change to the default storage bucket setting for new Terra workspaces that addresses the upcoming Google Cloud pricing changes.


 

Google Cloud recently announced upcoming changes (starting Oct 1, 2022) to the pricing structure of a number of services, including multi-regional storage buckets. Crucially, Google will start charging egress fees on data stored in multi-regional buckets. These are the fees normally charged when data stored in one region needs to be transferred to another region for computing.

Multi-regional buckets provide “geo-redundancy”, meaning that the data they contain is replicated across data centers located in multiple regions. This offers a degree of protection against data unavailability or loss in case of large-scale failure (such as a disaster affecting an entire region), which can be an attractive feature for some use cases.

Until this pricing change, multi-regional buckets also made it possible to use the data in multiple compute regions without incurring egress fees. That is why, until now, all Terra workspaces were created with a multi-regional bucket (unless you specified a region at the time of creation).

Under the new pricing structure, however, it will no longer be advantageous to use multi-regional buckets for Terra workspaces by default. Using a multi-regional storage bucket would lead to egress fees every time you run any analysis on data stored in the bucket. 

 

Going forward: single-region buckets for all new Terra workspaces

To address this emerging issue, we are switching the default bucket setting for new workspaces from “US multi-regional” to “us-central1”, which is the same region Terra uses by default for compute. This will protect you from incurring egress fees related to the region of your workspace buckets that would otherwise result from Google Cloud’s pricing change. In fact, you may notice a slight drop in your storage costs, since single-region buckets are a little cheaper than multi-regional buckets.
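The reasoning above can be expressed as a small check. This is only a minimal sketch of the rule as this post describes it, not Google Cloud's actual billing logic: the function names are hypothetical, dual-region buckets are ignored, and the set of multi-region names (US, EU, ASIA) is assumed to be the full list of Cloud Storage multi-regions.

```python
# Simplified model of the rule described in this post.
# Assumption: this is NOT Google Cloud's billing logic, only an illustration.

# Cloud Storage multi-region location names (dual-regions are ignored here).
MULTI_REGIONS = {"US", "EU", "ASIA"}

# Terra's default compute region, as stated in the post.
DEFAULT_COMPUTE_REGION = "us-central1"


def is_multi_regional(bucket_location: str) -> bool:
    """Return True if the bucket location is a multi-region such as 'US'."""
    return bucket_location.upper() in MULTI_REGIONS


def incurs_egress(bucket_location: str,
                  compute_region: str = DEFAULT_COMPUTE_REGION) -> bool:
    """Return True if reading the bucket from compute_region would be charged
    egress under the new pricing, according to the simplified rule above."""
    if is_multi_regional(bucket_location):
        # Under the new pricing, multi-regional buckets always incur egress.
        return True
    # A single-region bucket is free to read only from its own region.
    return bucket_location.lower() != compute_region.lower()
```

Under these assumptions, a “US” bucket read from us-central1 is flagged as incurring egress, while a us-central1 bucket read from us-central1 is not.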

You will of course still be able to select a different storage region when you create a workspace. The “US multi-regional” option will remain available, and selecting it will display a cautionary notice about the new pricing policy. There’s also a documentation article that details the trade-offs involved.

 

Migrating existing Terra workspaces to single-region storage

There are already over 30,000 Terra workspaces with multi-regional buckets in existence, all of which will be subject to Google Cloud’s new pricing structure starting in October 2022. For those, we are currently developing a migration protocol that will allow us to convert existing workspaces to use single-region storage buckets without breaking any links.

In order to minimize burden on individual users, we envision doing this on an opt-out basis: we will migrate all workspaces UNLESS the workspace owner requests their workspace(s) be excluded from the migration. We have not yet determined the exact timeframe for this; however, we are committed to providing ample notice, and will send out a separate communication announcing the specifics of the migration plan in due time. 

As always, we are undertaking these changes with the goal of serving the best interests of the research community, and we are open to your suggestions, questions, and concerns. Don’t hesitate to reach out to our support team in the Terra community forum or privately through the Helpdesk, whichever you prefer.
